Liu, Jin; Wang, Kai; Ma, Shuangge; Huang, Jian
2013-01-01
Penalized regression methods are becoming increasingly popular in genome-wide association studies (GWAS) for identifying genetic markers associated with disease. However, standard penalized methods such as LASSO do not take into account the possible linkage disequilibrium between adjacent markers. We propose a novel penalized approach for GWAS using a dense set of single nucleotide polymorphisms (SNPs). The proposed method uses the minimax concave penalty (MCP) for marker selection and incorporates linkage disequilibrium (LD) information by penalizing the difference of the genetic effects at adjacent SNPs with high correlation. A coordinate descent algorithm is derived to implement the proposed method. This algorithm is efficient in dealing with a large number of SNPs. A multi-split method is used to calculate the p-values of the selected SNPs for assessing their significance. We refer to the proposed penalty function as the smoothed MCP and the proposed approach as the SMCP method. Performance of the proposed SMCP method and its comparison with LASSO and MCP approaches are evaluated through simulation studies, which demonstrate that the proposed method is more accurate in selecting associated SNPs. Its applicability to real data is illustrated using heterogeneous stock mice data and a rheumatoid arthritis data set. PMID:25258655
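For readers unfamiliar with the MCP mentioned in the abstract above, the following minimal sketch shows the standard MCP value for one coefficient and the closed-form coordinate update that coordinate descent uses for a standardized predictor. It covers only the plain MCP part; the LD-based smoothing term of SMCP is not reproduced, and the function names and the parameters lam and gamma are illustrative assumptions rather than the authors' code.

```python
def mcp_penalty(t, lam, gamma):
    """Standard MCP value for one coefficient t (lam > 0, gamma > 1)."""
    a = abs(t)
    if a <= gamma * lam:
        return lam * a - a * a / (2.0 * gamma)
    # Beyond gamma * lam the penalty is flat, so large effects are not shrunk.
    return 0.5 * gamma * lam * lam


def soft_threshold(z, lam):
    """Lasso-style soft-thresholding operator."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def mcp_update(z, lam, gamma):
    """Minimizer over b of 0.5 * (z - b)**2 + mcp_penalty(b, lam, gamma).

    z is the univariate least-squares solution for a standardized predictor.
    """
    if abs(z) <= gamma * lam:
        return soft_threshold(z, lam) / (1.0 - 1.0 / gamma)
    return z
```

With lam = 1 and gamma = 3, a small input such as z = 0.5 is set to zero, while a large input such as z = 5 is returned unshrunk, which is the behaviour that distinguishes MCP from the LASSO.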
Ding, Xiuhua; Su, Shaoyong; Nandakumar, Kannabiran; Wang, Xiaoling; Fardo, David W
2014-01-01
Large-scale genetic studies are often composed of related participants, and utilizing familial relationships can be cumbersome and computationally challenging. We present an approach to efficiently handle sequencing data from complex pedigrees that incorporates information from rare variants as well as common variants. Our method employs a 2-step procedure that sequentially regresses out correlation from familial relatedness and then uses the resulting phenotypic residuals in a penalized regression framework to test for associations with variants within genetic units. The operating characteristics of this approach are detailed using simulation data based on a large, multigenerational cohort.
Gentry, Amanda Elswick; Jackson-Cook, Colleen K; Lyon, Debra E; Archer, Kellie J
2015-01-01
The pathological description of the stage of a tumor is an important clinical designation and is considered, like many other forms of biomedical data, an ordinal outcome. Currently, statistical methods for predicting an ordinal outcome using clinical, demographic, and high-dimensional correlated features are lacking. In this paper, we propose a method that fits an ordinal response model to predict an ordinal outcome for high-dimensional covariate spaces. Our method penalizes some covariates (high-throughput genomic features) without penalizing others (such as demographic and/or clinical covariates). We demonstrate the application of our method to predict the stage of breast cancer. In our model, breast cancer subtype is a nonpenalized predictor, and CpG site methylation values from the Illumina Human Methylation 450K assay are penalized predictors. The method has been made available in the ordinalgmifs package in the R programming environment. PMID:26052223
Pineda, Silvia; Real, Francisco X; Kogevinas, Manolis; Carrato, Alfredo; Chanock, Stephen J; Malats, Núria; Van Steen, Kristel
2015-12-01
Omics data integration is becoming necessary to investigate the genomic mechanisms involved in complex diseases. During the integration process, many challenges arise such as data heterogeneity, the smaller number of individuals in comparison to the number of parameters, multicollinearity, and interpretation and validation of results due to their complexity and lack of knowledge about biological processes. To overcome some of these issues, innovative statistical approaches are being developed. In this work, we propose a permutation-based method to concomitantly assess significance and correct for multiple testing with the MaxT algorithm. This was applied with penalized regression methods (LASSO and ENET) when exploring relationships between common genetic variants, DNA methylation and gene expression measured in bladder tumor samples. The overall analysis flow consisted of three steps: (1) SNPs/CpGs were selected for each gene probe within a 1 Mb window upstream and downstream of the gene; (2) LASSO and ENET were applied to assess the association between each expression probe and the selected SNPs/CpGs in three multivariable models (SNP, CpG, and Global models, the latter integrating SNPs and CpGs); and (3) the significance of each model was assessed using the permutation-based MaxT method. We identified 48 genes whose expression levels were significantly associated with both SNPs and CpGs. Importantly, 36 (75%) of them were replicated in an independent data set (TCGA), and the performance of the proposed method was checked with a simulation study. We further support our results with a biological interpretation based on an enrichment analysis. The approach we propose allows reducing computational time and is flexible and easy to implement when analyzing several types of omics data. Our results highlight the importance of integrating omics data by applying appropriate statistical strategies to discover new insights into the complex genetic mechanisms involved in disease.
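The permutation-based MaxT idea in the abstract above can be illustrated with a minimal single-step sketch: each observed statistic is compared with the permutation distribution of the maximum statistic over all tests, which yields p-values adjusted for multiple testing. The choice of test statistic, the use of absolute values, and the function name are assumptions made for illustration; the abstract does not specify these details.

```python
def maxt_adjusted_pvalues(observed, permuted):
    """Single-step maxT adjusted p-values.

    observed: list of test statistics, one per model/test.
    permuted: list of rows; each row holds the statistics of all tests
              recomputed on one permutation of the outcome.
    """
    # Null distribution of the largest absolute statistic across tests.
    max_null = [max(abs(t) for t in row) for row in permuted]
    n_perm = len(max_null)
    # Adjusted p-value: share of permutations whose maximum reaches the
    # observed statistic.
    return [sum(1 for m in max_null if m >= abs(t)) / n_perm for t in observed]


if __name__ == "__main__":
    obs = [3.0, 1.0]
    perm = [[0.5, -2.0], [1.5, 0.2], [-3.5, 0.1], [0.3, 0.9]]
    print(maxt_adjusted_pvalues(obs, perm))
```

Because every test is compared with the same maximum-based null distribution, no separate multiple-testing correction step is needed afterwards.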
Yi, Hui; Breheny, Patrick; Imam, Netsanet; Liu, Yongmei; Hoeschele, Ina
2015-01-01
The data from genome-wide association studies (GWAS) in humans are still predominantly analyzed using single-marker association methods. As an alternative to single-marker analysis (SMA), all or subsets of markers can be tested simultaneously. This approach requires a form of penalized regression (PR) as the number of SNPs is much larger than the sample size. Here we review PR methods in the context of GWAS, extend them to perform penalty parameter and SNP selection by false discovery rate (FDR) control, and assess their performance in comparison with SMA. PR methods were compared with SMA, using realistically simulated GWAS data with a continuous phenotype and real data. Based on these comparisons our analytic FDR criterion may currently be the best approach to SNP selection using PR for GWAS. We found that PR with FDR control provides substantially more power than SMA with genome-wide type-I error control but somewhat less power than SMA with Benjamini–Hochberg FDR control (SMA-BH). PR with FDR-based penalty parameter selection controlled the FDR somewhat conservatively while SMA-BH may not achieve FDR control in all situations. Differences among PR methods seem quite small when the focus is on SNP selection with FDR control. Incorporating linkage disequilibrium into the penalization by adapting penalties developed for covariates measured on graphs can improve power but also generate more false positives or wider regions for follow-up. We recommend the elastic net with a mixing weight for the Lasso penalty near 0.5 as the best method. PMID:25354699
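As a small illustration of the recommendation above, the sketch below evaluates an elastic net penalty with a mixing weight alpha on the Lasso part. The parametrization lam * (alpha * L1 + (1 - alpha) / 2 * squared L2) follows a common software convention and is an assumption here; the abstract does not state which parametrization the authors used.

```python
def elastic_net_penalty(beta, lam, alpha=0.5):
    """Elastic net penalty; alpha is the weight on the Lasso (L1) part."""
    l1 = sum(abs(b) for b in beta)
    l2_squared = sum(b * b for b in beta)
    return lam * (alpha * l1 + 0.5 * (1.0 - alpha) * l2_squared)


if __name__ == "__main__":
    beta = [1.0, -2.0]
    for alpha in (0.0, 0.5, 1.0):  # ridge, balanced mix, lasso
        print(alpha, elastic_net_penalty(beta, lam=2.0, alpha=alpha))
```

Setting alpha to 1 recovers the Lasso penalty and alpha to 0 a ridge penalty, so a value near 0.5 mixes sparsity with shrinkage shared among correlated SNPs.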
Advanced colorectal neoplasia risk stratification by penalized logistic regression.
Lin, Yunzhi; Yu, Menggang; Wang, Sijian; Chappell, Richard; Imperiale, Thomas F
2016-08-01
Colorectal cancer is the second leading cause of death from cancer in the United States. To improve the efficiency of colorectal cancer screening, there is a need to stratify risk for colorectal cancer among the 90% of US residents who are considered "average risk." In this article, we investigate such risk stratification rules for advanced colorectal neoplasia (colorectal cancer and advanced, precancerous polyps). We use a recently completed large cohort study of subjects who underwent a first screening colonoscopy. Logistic regression models have been used in the literature to estimate the risk of advanced colorectal neoplasia based on quantifiable risk factors. However, logistic regression may be prone to overfitting and instability in variable selection. Since most of the risk factors in our study have several categories, it was tempting to collapse these categories into fewer risk groups. We propose a penalized logistic regression method that automatically and simultaneously selects variables, groups categories, and estimates their coefficients by penalizing the L1-norm of both the coefficients and their differences. Hence, it encourages sparsity in the categories, i.e. grouping of the categories, and sparsity in the variables, i.e. variable selection. We apply the penalized logistic regression method to our data. The important variables are selected, with close categories simultaneously grouped, by penalized regression models with and without the interaction terms. The models are validated with 10-fold cross-validation. The receiver operating characteristic curves of the penalized regression models dominate the receiver operating characteristic curve of naive logistic regressions, indicating a superior discriminative performance.
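A penalty on both the coefficients and their differences, as described in the abstract above, can be sketched as follows. This sketch takes the norm to be the absolute-value (L1) norm and takes differences between adjacent categories of one ordered risk factor; both choices, along with the function and parameter names, are assumptions of the illustration and not the authors' exact formulation.

```python
def sparse_fused_penalty(beta, lam_coef, lam_diff):
    """Penalty on category coefficients of one ordered risk factor.

    lam_coef penalizes coefficient size (drives whole variables to zero);
    lam_diff penalizes jumps between neighbouring categories (drives
    neighbouring categories to share one coefficient, i.e. to be grouped).
    """
    size = sum(abs(b) for b in beta)
    jumps = sum(abs(b - a) for a, b in zip(beta, beta[1:]))
    return lam_coef * size + lam_diff * jumps


if __name__ == "__main__":
    # Categories 2 and 3 share a coefficient, so they contribute no jump.
    print(sparse_fused_penalty([0.0, 0.5, 0.5, 1.2], lam_coef=1.0, lam_diff=2.0))
```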
Bootstrap Enhanced Penalized Regression for Variable Selection with Neuroimaging Data.
Abram, Samantha V; Helwig, Nathaniel E; Moodie, Craig A; DeYoung, Colin G; MacDonald, Angus W; Waller, Niels G
2016-01-01
Recent advances in fMRI research highlight the use of multivariate methods for examining whole-brain connectivity. Complementary data-driven methods are needed for determining the subset of predictors related to individual differences. Although commonly used for this purpose, ordinary least squares (OLS) regression may not be ideal due to multi-collinearity and over-fitting issues. Penalized regression is a promising and underutilized alternative to OLS regression. In this paper, we propose a nonparametric bootstrap quantile (QNT) approach for variable selection with neuroimaging data. We use real and simulated data, as well as annotated R code, to demonstrate the benefits of our proposed method. Our results illustrate the practical potential of our proposed bootstrap QNT approach. Our real data example demonstrates how our method can be used to relate individual differences in neural network connectivity with an externalizing personality measure. Also, our simulation results reveal that the QNT method is effective under a variety of data conditions. Penalized regression yields more stable estimates and sparser models than OLS regression in situations with large numbers of highly correlated neural predictors. Our results demonstrate that penalized regression is a promising method for examining associations between neural predictors and clinically relevant traits or behaviors. These findings have important implications for the growing field of functional connectivity research, where multivariate methods produce numerous, highly correlated brain networks. PMID:27516732
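The selection step of a bootstrap quantile approach can be sketched as below, assuming the penalized regression has already been refitted on each bootstrap resample. The rule used here (keep a predictor when the interval between two bootstrap quantiles of its coefficient excludes zero) and the default quantile levels are assumptions for illustration; the abstract does not spell out the decision rule, and the authors' annotated R code is not reproduced.

```python
def quantile(sorted_vals, q):
    """Linear-interpolation quantile of an already sorted list (0 <= q <= 1)."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1.0 - frac) + sorted_vals[hi] * frac


def select_by_bootstrap_quantiles(boot_coefs, lower=0.025, upper=0.975):
    """Return indices of predictors whose bootstrap interval excludes zero.

    boot_coefs: list of rows, one row of coefficient estimates per resample.
    """
    selected = []
    for j in range(len(boot_coefs[0])):
        column = sorted(row[j] for row in boot_coefs)
        lo_q = quantile(column, lower)
        hi_q = quantile(column, upper)
        if lo_q > 0 or hi_q < 0:
            selected.append(j)
    return selected
```

In practice many more resamples than the handful used in a toy check would be needed for the extreme quantiles to be meaningful.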
Reduced rank regression via adaptive nuclear norm penalization
Chen, Kun; Dong, Hongbo; Chan, Kung-Sik
2014-01-01
We propose an adaptive nuclear norm penalization approach for low-rank matrix approximation, and use it to develop a new reduced rank estimation method for high-dimensional multivariate regression. The adaptive nuclear norm is defined as the weighted sum of the singular values of the matrix, and it is generally non-convex under the natural restriction that the weight decreases with the singular value. However, we show that the proposed non-convex penalized regression method has a global optimal solution obtained from an adaptively soft-thresholded singular value decomposition. The method is computationally efficient, and the resulting solution path is continuous. The rank consistency of and prediction/estimation performance bounds for the estimator are established for a high-dimensional asymptotic regime. Simulation studies and an application in genetics demonstrate its efficacy. PMID:25045172
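The adaptive soft-thresholding of singular values mentioned above can be sketched on a list of singular values that is assumed to be already computed. The weight d ** (-gamma), which decreases as the singular value grows, is an assumed example of a weight satisfying the restriction stated in the abstract; the function name and defaults are likewise hypothetical.

```python
def adaptive_soft_threshold(singular_values, lam, gamma=2.0):
    """Shrink each singular value d by lam * d ** (-gamma), floored at zero.

    Large singular values receive small weights and are barely shrunk;
    small ones receive large weights and are set to zero, lowering the rank.
    """
    shrunk = []
    for d in singular_values:
        if d <= 0:
            shrunk.append(0.0)
            continue
        weight = d ** (-gamma)
        shrunk.append(max(d - lam * weight, 0.0))
    return shrunk


if __name__ == "__main__":
    values = adaptive_soft_threshold([4.0, 2.0, 0.5], lam=1.0)
    print(values, "estimated rank:", sum(1 for v in values if v > 0))
```

The low-rank estimate would then be rebuilt from the original singular vectors and the shrunk singular values.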
Penalized spline estimation for functional coefficient regression models
Cao, Yanrong; Lin, Haiqun; Wu, Tracy Z.
2011-01-01
The functional coefficient regression models assume that the regression coefficients vary with some “threshold” variable, providing appreciable flexibility in capturing the underlying dynamics in data and avoiding the so-called “curse of dimensionality” in multivariate nonparametric estimation. We first investigate the estimation, inference, and forecasting for the functional coefficient regression models with dependent observations via penalized splines. The P-spline approach, as a direct ridge regression shrinkage type global smoothing method, is computationally efficient and stable. With established fixed-knot asymptotics, inference is readily available. Exact inference can be obtained for fixed smoothing parameter λ, which is most appealing for finite samples. Our penalized spline approach gives an explicit model expression, which also enables multi-step-ahead forecasting via simulations. Furthermore, we examine different methods of choosing the important smoothing parameter λ: modified multi-fold cross-validation (MCV), generalized cross-validation (GCV), and an extension of empirical bias bandwidth selection (EBBS) to P-splines. In addition, we implement smoothing parameter selection using a mixed model framework through restricted maximum likelihood (REML) for P-spline functional coefficient regression models with independent observations. The P-spline approach also easily allows different smoothness for different functional coefficients, which is enabled by assigning a different penalty λ to each coefficient. We demonstrate the proposed approach by both simulation examples and a real data application. PMID:21516260
Sparse brain network using penalized linear regression
NASA Astrophysics Data System (ADS)
Lee, Hyekyoung; Lee, Dong Soo; Kang, Hyejin; Kim, Boong-Nyun; Chung, Moo K.
2011-03-01
Sparse partial correlation is a useful connectivity measure for brain networks when it is difficult to compute the exact partial correlation in the small-n large-p setting. In this paper, we formulate the problem of estimating partial correlation as a sparse linear regression with an l1-norm penalty. The method is applied to a brain network consisting of parcellated regions of interest (ROIs), which are obtained from FDG-PET images of children with autism spectrum disorder (ASD) and pediatric control (PedCon) subjects. To validate the results, we check the reproducibility of the obtained brain networks by leave-one-out cross-validation and compare the clustered structures derived from the brain networks of ASD and PedCon.
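The link between node-wise regression and partial correlation can be sketched with a standard identity: when region i is regressed on all other regions and vice versa, the product of the two coefficients linking i and j equals the squared partial correlation. The helper below applies that identity to a pair of coefficients assumed to come from l1-penalized fits; treating sign disagreements as zero is an assumed convention, and the abstract does not state the exact formula the authors used.

```python
import math


def partial_correlation(beta_ij, beta_ji):
    """Partial correlation of nodes i and j from two regression coefficients.

    beta_ij: coefficient of node j when node i is regressed on the others.
    beta_ji: coefficient of node i when node j is regressed on the others.
    """
    product = beta_ij * beta_ji
    if product <= 0:
        # Either a coefficient was shrunk to zero or the signs disagree:
        # report no edge between the two regions.
        return 0.0
    return math.copysign(math.sqrt(product), beta_ij)
```

Pairs for which either penalized coefficient is exactly zero yield no edge, which is what makes the resulting brain network sparse.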
Classification of microarray data with penalized logistic regression
NASA Astrophysics Data System (ADS)
Eilers, Paul H. C.; Boer, Judith M.; van Ommen, Gert-Jan; van Houwelingen, Hans C.
2001-06-01
Classification of microarray data needs a firm statistical basis. In principle, logistic regression can provide it, modeling the probability of membership of a class with (transforms of) linear combinations of explanatory variables. However, classical logistic regression does not work for microarrays, because generally there will be far more variables than observations. One problem is multicollinearity: estimating equations become singular and have no unique and stable solution. A second problem is over-fitting: a model may fit a data set well but perform badly when used to classify new data. We propose penalized likelihood as a solution to both problems. The values of the regression coefficients are constrained in a similar way as in ridge regression. All variables play an equal role; there is no ad hoc selection of the most relevant or most expressed genes. The dimension of the resulting systems of equations is equal to the number of variables, and generally will be too large for most computers, but it can be reduced dramatically with the singular value decomposition of some matrices. The penalty is optimized with AIC (Akaike's Information Criterion), which essentially is a measure of prediction performance. We find that penalized logistic regression performs well on a public data set (the MIT ALL/AML data).
Greenland, Sander; Mansournia, Mohammad Ali
2015-10-15
Penalization is a very general method of stabilizing or regularizing estimates, which has both frequentist and Bayesian rationales. We consider some questions that arise when considering alternative penalties for logistic regression and related models. The most widely programmed penalty appears to be the Firth small-sample bias-reduction method (albeit with small differences among implementations and the results they provide), which corresponds to using the log density of the Jeffreys invariant prior distribution as a penalty function. The latter representation raises some serious contextual objections to the Firth reduction, which also apply to alternative penalties based on t-distributions (including Cauchy priors). Taking simplicity of implementation and interpretation as our chief criteria, we propose that the log-F(1,1) prior provides a better default penalty than other proposals. Penalization based on more general log-F priors is trivial to implement and facilitates mean-squared error reduction and sensitivity analyses of penalty strength by varying the number of prior degrees of freedom. We caution however against penalization of intercepts, which are unduly sensitive to covariate coding and design idiosyncrasies. PMID:26011599
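To make the log-F penalty above concrete, the sketch below evaluates the log density of a log-F(m, m) prior for one coefficient, up to an additive constant, using the kernel exp(m * beta / 2) / (1 + exp(beta)) ** m. Adding this value to the log-likelihood for each penalized coefficient gives a penalized log-likelihood; m = 1 corresponds to the log-F(1,1) default discussed in the abstract. The function name is hypothetical, and the intercept is left unpenalized in line with the authors' caution.

```python
import math


def log_f_penalty(beta, m=1.0):
    """Log density of a log-F(m, m) prior at beta, up to an additive constant.

    Equals (m / 2) * beta - m * log(1 + exp(beta)), rewritten in a form that
    does not overflow for large |beta|.
    """
    a = abs(beta)
    return -m * (0.5 * a + math.log1p(math.exp(-a)))


if __name__ == "__main__":
    for b in (-3.0, 0.0, 3.0):
        print(b, log_f_penalty(b))
```

The penalty is symmetric around zero and largest at beta = 0, and raising m strengthens the pull toward zero, which is how penalty strength can be varied in a sensitivity analysis.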
PUMA: A Unified Framework for Penalized Multiple Regression Analysis of GWAS Data
Hoffman, Gabriel E.; Logsdon, Benjamin A.; Mezey, Jason G.
2013-01-01
Penalized Multiple Regression (PMR) can be used to discover novel disease associations in GWAS datasets. In practice, proposed PMR methods have not been able to identify well-supported associations in GWAS that are undetectable by standard association tests and thus these methods are not widely applied. Here, we present a combined algorithmic and heuristic framework for PUMA (Penalized Unified Multiple-locus Association) analysis that solves the problems of previously proposed methods including computational speed, poor performance on genome-scale simulated data, and identification of too many associations for real data to be biologically plausible. The framework includes a new minorize-maximization (MM) algorithm for generalized linear models (GLM) combined with heuristic model selection and testing methods for identification of robust associations. The PUMA framework implements the penalized maximum likelihood penalties previously proposed for GWAS analysis (i.e. Lasso, Adaptive Lasso, NEG, MCP), as well as a penalty that has not been previously applied to GWAS (i.e. LOG). Using simulations that closely mirror real GWAS data, we show that our framework has high performance and reliably increases power to detect weak associations, while existing PMR methods can perform worse than single marker testing in overall performance. To demonstrate the empirical value of PUMA, we analyzed GWAS data for type 1 diabetes, Crohn's disease, and rheumatoid arthritis, three autoimmune diseases from the original Wellcome Trust Case Control Consortium. Our analysis replicates known associations for these diseases and we discover novel etiologically relevant susceptibility loci that are invisible to standard single marker tests, including six novel associations implicating genes involved in pancreatic function, insulin pathways and immune-cell function in type 1 diabetes; three novel associations implicating genes in pro- and anti-inflammatory pathways in Crohn's disease; and one
Penalized Regression and Risk Prediction in Genome-Wide Association Studies.
Austin, Erin; Pan, Wei; Shen, Xiaotong
2013-08-01
An important task in personalized medicine is to predict disease risk based on a person's genome, e.g. on a large number of single-nucleotide polymorphisms (SNPs). Genome-wide association studies (GWAS) make SNP and phenotype data available to researchers. A critical question for researchers is how to best predict disease risk. Penalized regression equipped with variable selection, such as LASSO and SCAD, is deemed to be promising in this setting. However, the sparsity assumption taken by the LASSO, SCAD and many other penalized regression techniques may not be applicable here: it is now hypothesized that many common diseases are associated with many SNPs with small to moderate effects. In this article, we use the GWAS data from the Wellcome Trust Case Control Consortium (WTCCC) to investigate the performance of various unpenalized and penalized regression approaches under true sparse or non-sparse models. We find that in general penalized regression outperformed unpenalized regression; SCAD, TLP and LASSO performed best for sparse models, while elastic net regression was the winner, followed by ridge, TLP and LASSO, for non-sparse models.
A Novel Statistic for Global Association Testing Based on Penalized Regression.
Austin, Erin; Shen, Xiaotong; Pan, Wei
2015-09-01
Natural genetic structures like genes may contain multiple variants that work as a group to determine a biologic outcome. The effect of rare variants, mutations occurring in less than 5% of samples, is hypothesized to be explained best as groups collectively associated with a biologic function. Therefore, it is important to develop powerful association tests to identify a true association between an outcome of interest and a group of variants, in particular a group with many rare variants. In this article we first delineate a novel penalized regression-based global test for the association between sets of variants and a disease phenotype. Next, we use Genetic Analysis Workshop 18 (GAW18) data to assess the power of the new global association test to capture a relationship between an aggregated group of variants and a simulated hypertension status. Rare variant only, common variant only, and combined variant groups are studied. The power values are compared to those obtained from eight well-regarded global tests (Score, Sum, SSU, SSUw, UminP, aSPU, aSPUw, and sequence kernel association test (SKAT)) that do not use penalized regression and a set of tests using either the SSU or score statistics and least absolute shrinkage and selection operator penalty (LASSO) logistic regression. Association testing of rare variants with our method was the top performer when there was low linkage disequilibrium (LD) between and within causal variants. This was similarly true when simultaneously testing rare and common variants in low LD scenarios. Finally, our method was able to provide meaningful variant-specific association information.
Polynomial order selection in random regression models via penalizing adaptively the likelihood.
Corrales, J D; Munilla, S; Cantet, R J C
2015-08-01
Orthogonal Legendre polynomials (LP) are used to model the shape of additive genetic and permanent environmental effects in random regression models (RRM). Frequently, the Akaike (AIC) and the Bayesian (BIC) information criteria are employed to select LP order. However, it has been theoretically shown that neither AIC nor BIC is simultaneously optimal in terms of consistency and efficiency. Thus, the goal was to introduce a method, 'penalizing adaptively the likelihood' (PAL), as a criterion to select LP order in RRM. Four simulated data sets and real data (60,513 records, 6675 Colombian Holstein cows) were employed. Nested models were fitted to the data, and AIC, BIC and PAL were calculated for all of them. Results showed that PAL and BIC identified the true LP order for the additive genetic and permanent environmental effects with probability one, but AIC tended to favour overparameterized models. Conversely, when the true model was unknown, PAL selected the best model with higher probability than AIC. In the latter case, BIC never favoured the best model. To summarize, PAL selected a correct model order regardless of whether the 'true' model was within the set of candidates.
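The two classical criteria compared in the abstract above have simple closed forms, sketched below together with a helper that picks the polynomial order with the smallest criterion value. PAL itself is not implemented because the abstract does not give its formula; the log-likelihoods and parameter counts in the example are made-up numbers used only for illustration.

```python
import math


def aic(loglik, k):
    """Akaike information criterion for a model with k parameters."""
    return -2.0 * loglik + 2.0 * k


def bic(loglik, k, n):
    """Bayesian information criterion with n observations."""
    return -2.0 * loglik + k * math.log(n)


def best_order(fits, n, criterion="bic"):
    """fits maps polynomial order -> (log-likelihood, number of parameters)."""
    def score(order):
        loglik, k = fits[order]
        return aic(loglik, k) if criterion == "aic" else bic(loglik, k, n)
    return min(fits, key=score)


if __name__ == "__main__":
    # Hypothetical nested fits: order -> (log-likelihood, parameter count).
    fits = {1: (-120.0, 3), 2: (-110.0, 5), 3: (-107.5, 7)}
    print("AIC picks order", best_order(fits, n=200, criterion="aic"))
    print("BIC picks order", best_order(fits, n=200, criterion="bic"))
```

In this made-up example AIC prefers the larger model while BIC prefers the smaller one, because the BIC charge per parameter grows with log(n).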
Penalized Regression for Genome-Wide Association Screening of Sequence Data
Alexander, D.H.; Sehl, M.E.; Sinsheimer, J.S.; Sobel, E.M.; Lange, K.
2016-01-01
Whole exome and whole genome sequencing are likely to be potent tools in the study of common diseases and complex traits. Despite this promise, some very difficult issues in data management and statistical analysis must be squarely faced. The number of rare variants identified by sequencing is apt to be much larger than the number of common variants encountered in current association studies. The low frequencies of rare variants alone will make association testing difficult. This article extends the penalized regression framework for model selection in genome-wide association data to sequencing data with both common and rare variants. Previous research has shown that lasso penalties discourage irrelevant predictors from entering a model. The Euclidean penalties dealt with here group variants by gene or pathway. Pertinent biological information can be incorporated by calibrating penalties by weights. The current paper examines some of the tradeoffs in using pure lasso penalties, pure group penalties, and mixtures of the two types of penalty. All of the computational and statistical advantages of lasso penalized estimation are retained in this richer setting. The overall strategy is implemented in the free statistical genetics analysis software Mendel and illustrated on both simulated and real data. PMID:21121038
A penalized robust method for identifying gene-environment interactions.
Shi, Xingjie; Liu, Jin; Huang, Jian; Zhou, Yong; Xie, Yang; Ma, Shuangge
2014-04-01
In high-throughput studies, an important objective is to identify gene-environment interactions associated with disease outcomes and phenotypes. Many commonly adopted methods assume specific parametric or semiparametric models, which may be subject to model misspecification. In addition, they usually use significance level as the criterion for selecting important interactions. In this study, we adopt rank-based estimation, which is much less sensitive to model specification than some of the existing methods and includes several commonly encountered data and models as special cases. Penalization is adopted for the identification of gene-environment interactions. It achieves simultaneous estimation and identification and does not rely on significance level. For computational feasibility, a smoothed rank estimation is further proposed. Simulation shows that under certain scenarios, for example, with contaminated or heavy-tailed data, the proposed method can significantly outperform the existing alternatives with more accurate identification. We analyze a lung cancer prognosis study with gene expression measurements under the AFT (accelerated failure time) model. The proposed method identifies interactions different from those using the alternatives. Some of the identified genes have important implications.
Wu, Ying; Cook, Richard J
2015-09-01
Times of disease progression are interval-censored when progression status is only known at a series of assessment times. This situation arises routinely in clinical trials and cohort studies when events of interest are only detectable upon imaging, based on blood tests, or upon careful clinical examination. We consider the problem of selecting important prognostic biomarkers from a large set of candidates when disease progression status is only known at irregularly spaced and individual-specific assessment times. Penalized regression techniques (e.g., LASSO, adaptive LASSO, and SCAD) are adapted to handle interval-censored time of disease progression. An expectation-maximization algorithm is described which is empirically shown to perform well. Application to the motivating study of the development of arthritis mutilans in patients with psoriatic arthritis is given and several important human leukocyte antigen (HLA) variables are identified for further investigation.
Regression methods for spatial data
NASA Technical Reports Server (NTRS)
Yakowitz, S. J.; Szidarovszky, F.
1982-01-01
The kriging approach, a parametric regression method used by hydrologists and mining engineers, among others, also provides an error estimate for the integral of the regression function. The kriging method is explored and some of its statistical characteristics are described. The Watson method and theory are extended so that the kriging features are displayed. Theoretical and computational comparisons of the kriging and Watson approaches are offered.
On the use of a penalized least squares method to process kinematic full-field measurements
NASA Astrophysics Data System (ADS)
Moulart, Raphaël; Rotinat, René
2014-07-01
This work explores the performance of an alternative procedure to smooth and differentiate full-field displacement measurements. After recalling the strategies currently used by the experimental mechanics community, a short overview of the available smoothing algorithms is given, and the requirements that such an algorithm has to fulfil to be applicable to kinematic measurements are listed. A comparative study of the chosen algorithm is performed including the 2D penalized least squares method and two other commonly implemented strategies. The results obtained by penalized least squares are comparable in terms of quality to those produced by the two other algorithms, while the penalized least squares method appears to be the fastest and the most flexible. Unlike both the other considered methods, it is possible with penalized least squares to automatically choose the parameter governing the amount of smoothing to apply. Unfortunately, it appears that this automation is not suitable for the proposed application since it does not lead to optimal strain maps. Finally, it is possible with this technique to perform the differentiation to obtain strain maps before smoothing them (while the smoothing is normally applied to displacement maps before the differentiation), which can lead in some cases to a more effective reconstruction of the strain fields.
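The principle behind penalized least squares smoothing can be sketched in one dimension. This is only an illustration under stated assumptions, not the 2D algorithm evaluated in the abstract: the smoothed signal z is taken to minimize ||y - z||^2 + s ||D z||^2, where D is a second-difference operator and s governs the amount of smoothing. The names `solve`, `pls_smooth` and `s` are hypothetical.

```python
def solve(a, b):
    """Solve the dense linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]  # augmented matrix
    for k in range(n):
        piv = max(range(k, n), key=lambda i: abs(m[i][k]))
        m[k], m[piv] = m[piv], m[k]
        for i in range(k + 1, n):
            f = m[i][k] / m[k][k]
            for j in range(k, n + 1):
                m[i][j] -= f * m[k][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        acc = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - acc) / m[i][i]
    return x


def pls_smooth(y, s):
    """Return z minimizing sum((y - z)^2) + s * sum((second differences of z)^2)."""
    n = len(y)
    # Second-difference operator D with shape (n - 2) x n.
    d = [[0.0] * n for _ in range(n - 2)]
    for i in range(n - 2):
        d[i][i], d[i][i + 1], d[i][i + 2] = 1.0, -2.0, 1.0
    # Normal equations: (I + s * D^T D) z = y.
    a = [[(1.0 if i == j else 0.0)
          + s * sum(d[k][i] * d[k][j] for k in range(n - 2))
          for j in range(n)] for i in range(n)]
    return solve(a, [float(v) for v in y])


def roughness(z):
    """Sum of squared second differences, the quantity being penalized."""
    return sum((z[i] - 2.0 * z[i + 1] + z[i + 2]) ** 2 for i in range(len(z) - 2))
```

With this choice of D, a perfectly linear signal has zero penalty and is returned unchanged, while larger s progressively reduces the roughness of a noisy signal. Differentiating z by finite differences afterwards would correspond to the usual smooth-then-differentiate order mentioned in the abstract.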
PRAMS: a systematic method for evaluating penal institutions under litigation.
Wills, Cheryl D
2007-01-01
Forensic psychiatrists serve as expert witnesses in litigation involving the impact of conditions of confinement, including mental health care delivery, on the emotional well-being of institutionalized persons. Experts review volumes of data before formulating opinions and preparing reports. The author has developed PRAMS, a method for systematically reviewing and presenting data during mental health litigation involving detention and corrections facilities. The PRAMS method divides the examination process into five stages: paper review, real-world view, aggravating circumstances, mitigating circumstances, and supplemental information. PRAMS provides the scaffolding on which a compelling picture of an institution's system of care may be constructed and disseminated in reports and during courtroom testimony. Also, PRAMS enhances the organization, analysis, publication, and presentation of salient findings, thereby coordinating the forensic psychiatrist's efforts to provide expert opinions regarding complex systems of mental health care.
Heinze, Georg
2006-12-30
In logistic regression analysis of small or sparse data sets, results obtained by classical maximum likelihood methods cannot be generally trusted. In such analyses it may even happen that the likelihood meets the convergence criteria while at least one parameter estimate diverges to +/-infinity. This situation has been termed 'separation', and it typically occurs whenever no events are observed in one of the two groups defined by a dichotomous covariate. More generally, separation is caused by a linear combination of continuous or dichotomous covariates that perfectly separates events from non-events. Separation implies infinite or zero maximum likelihood estimates of odds ratios, which are usually considered unrealistic. I provide some examples of separation and near-separation in clinical data sets and discuss some options to analyse such data, including exact logistic regression analysis and a penalized likelihood approach. Both methods supply finite point estimates in case of separation. Profile penalized likelihood confidence intervals for parameters show excellent behaviour in terms of coverage probability and provide higher power than exact confidence intervals. General advantages of the penalized likelihood approach are discussed.
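A small sketch may help show why a penalized likelihood yields finite estimates under separation. The abstract does not spell out the penalty, so the code below assumes a Jeffreys-prior (Firth-type) penalty, which is one common choice. The function name `firth_logistic_2col` and the toy data are hypothetical, and the model is restricted to an intercept plus one covariate so that the 2x2 algebra can be written out by hand.

```python
import math


def firth_logistic_2col(x, y, n_iter=100, tol=1e-10):
    """Firth-type penalized logistic fit for the model logit(p) = b0 + b1 * x."""
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        p = [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x]
        w = [pi * (1.0 - pi) for pi in p]
        # Fisher information X'WX for design rows (1, x_i).
        i00 = sum(w)
        i01 = sum(wi * xi for wi, xi in zip(w, x))
        i11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = i00 * i11 - i01 * i01
        v00, v01, v11 = i11 / det, -i01 / det, i00 / det  # inverse information
        # Hat-matrix diagonal h_i = w_i * x_i' (X'WX)^-1 x_i.
        h = [wi * (v00 + 2.0 * v01 * xi + v11 * xi * xi) for wi, xi in zip(w, x)]
        # Modified score: the usual residual plus the penalty term h_i * (1/2 - p_i).
        a = [yi - pi + hi * (0.5 - pi) for yi, pi, hi in zip(y, p, h)]
        u0 = sum(a)
        u1 = sum(ai * xi for ai, xi in zip(a, x))
        d0 = v00 * u0 + v01 * u1
        d1 = v01 * u0 + v11 * u1
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


# Toy data with separation: no events at all in the group with x = 1.
x_demo = [0] * 10 + [1] * 10
y_demo = [1, 1, 1] + [0] * 7 + [0] * 10
```

On data like `x_demo`, `y_demo` the ordinary maximum likelihood estimate of the slope diverges to minus infinity, whereas the penalized fit stays finite. For a single binary covariate, the fit coincides with adding 0.5 to each cell of the 2x2 table.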
Synthesizing regression results: a factored likelihood method.
Wu, Meng-Jia; Becker, Betsy Jane
2013-06-01
Regression methods are widely used by researchers in many fields, yet methods for synthesizing regression results are scarce. This study proposes using a factored likelihood method, originally developed to handle missing data, to appropriately synthesize regression models involving different predictors. This method uses the correlations reported in the regression studies to calculate synthesized standardized slopes. It uses available correlations to estimate missing ones through a series of regressions, allowing us to synthesize correlations among variables as if each included study contained all the same variables. Great accuracy and stability of this method under fixed-effects models were found through Monte Carlo simulation. An example was provided to demonstrate the steps for calculating the synthesized slopes through sweep operators. By rearranging the predictors in the included regression models or omitting a relatively small number of correlations from those models, we can easily apply the factored likelihood method to many situations involving synthesis of linear models. Limitations and other possible methods for synthesizing more complicated models are discussed. Copyright © 2012 John Wiley & Sons, Ltd. PMID:26053653
Differentiating among penal states.
Lacey, Nicola
2010-12-01
This review article assesses Loïc Wacquant's contribution to debates on penality, focusing on his most recent book, Punishing the Poor: The Neoliberal Government of Social Insecurity (Wacquant 2009), while setting its argument in the context of his earlier Prisons of Poverty (1999). In particular, it draws on both historical and comparative methods to question whether Wacquant's conception of 'the penal state' is adequately differentiated for the purposes of building the explanatory account he proposes; about whether 'neo-liberalism' has, materially, the global influence which he ascribes to it; and about whether, therefore, the process of penal Americanization which he asserts in his recent writings is credible. PMID:21138432
NASA Astrophysics Data System (ADS)
Kasimov, Nurlybek; Brown-Dymkoski, Eric; Vasilyev, Oleg V.
2015-11-01
A novel volume penalization method to enforce immersed boundary conditions in Navier-Stokes and Euler equations is presented. Previously, Brinkman penalization has been used to introduce solid obstacles modeled as porous media, although it is limited to Dirichlet-type conditions on velocity and temperature. This method builds upon Brinkman penalization by allowing Neumann conditions to be applied in a general fashion. Correct boundary conditions are achieved through characteristic propagation into the thin layer inside the obstacle. Inward-pointing characteristics ensure that the nonphysical solution inside the obstacle does not propagate outside into the fluid. Dirichlet boundary conditions are enforced similarly to the Brinkman method. Penalization parameters act on a much faster timescale than the characteristic timescale of the flow. The main advantage of the method is a systematic means of error control. This talk focuses on the progress made towards extending the method to 3D flows around irregular shapes. This work was supported by ONR MURI on Soil Blast Modeling.
NASA Astrophysics Data System (ADS)
Tauriello, Gerardo; Koumoutsakos, Petros
2015-02-01
We present a comparative study of penalization and phase field methods for the solution of the diffusion equation in complex geometries embedded using simple Cartesian meshes. The two methods have been widely employed to solve partial differential equations in complex and moving geometries for applications ranging from solid and fluid mechanics to biology and geophysics. Their popularity is largely due to their discretization on Cartesian meshes thus avoiding the need to create body-fitted grids. At the same time, there are questions regarding their accuracy and it appears that the use of each one is confined by disciplinary boundaries. Here, we compare penalization and phase field methods to handle problems with Neumann and Robin boundary conditions. We discuss extensions for Dirichlet boundary conditions and in turn compare with methods that have been explicitly designed to handle Dirichlet boundary conditions. The accuracy of all methods is analyzed using one and two dimensional benchmark problems such as the flow induced by an oscillating wall and by a cylinder performing rotary oscillations. This comparative study provides information to decide which methods to consider for a given application and their incorporation in broader computational frameworks. We demonstrate that phase field methods are more accurate than penalization methods on problems with Neumann boundary conditions and we present an error analysis explaining this result.
Penalized likelihood methods for estimation of sparse high-dimensional directed acyclic graphs
SHOJAIE, ALI; MICHAILIDIS, GEORGE
2010-01-01
Summary: Directed acyclic graphs are commonly used to represent causal relationships among random variables in graphical models. Applications of these models arise in the study of physical and biological systems where directed edges between nodes represent the influence of components of the system on each other. Estimation of directed graphs from observational data is computationally NP-hard. In addition, directed graphs with the same structure may be indistinguishable based on observations alone. When the nodes exhibit a natural ordering, the problem of estimating directed graphs reduces to the problem of estimating the structure of the network. In this paper, we propose an efficient penalized likelihood method for estimation of the adjacency matrix of directed acyclic graphs, when variables inherit a natural ordering. We study variable selection consistency of lasso and adaptive lasso penalties in high-dimensional sparse settings, and propose an error-based choice for selecting the tuning parameter. We show that although the lasso is only variable selection consistent under stringent conditions, the adaptive lasso can consistently estimate the true graph under the usual regularity assumptions. PMID:22434937
Morales, Jorge A.; Leroy, Matthieu; Bos, Wouter J.T.; Schneider, Kai
2014-10-01
A volume penalization approach to simulate magnetohydrodynamic (MHD) flows in confined domains is presented. Here the incompressible visco-resistive MHD equations are solved using parallel pseudo-spectral solvers in Cartesian geometries. The volume penalization technique is an immersed boundary method which is characterized by a high flexibility for the geometry of the considered flow. In the present case, it allows boundary conditions other than periodic ones to be used in a Fourier pseudo-spectral approach. The numerical method is validated and its convergence is assessed for two- and three-dimensional hydrodynamic (HD) and MHD flows, by comparing the numerical results with results from the literature and analytical solutions. The test cases considered are two-dimensional Taylor–Couette flow, the z-pinch configuration, three-dimensional Orszag–Tang flow, Ohmic decay in a periodic cylinder, three-dimensional Taylor–Couette flow with and without axial magnetic field and three-dimensional Hartmann instabilities in a cylinder with an imposed helical magnetic field. Finally, we present a magnetohydrodynamic flow simulation in toroidal geometry with a non-symmetric cross section and an imposed helical magnetic field to illustrate the potential of the method.
Method for nonlinear exponential regression analysis
NASA Technical Reports Server (NTRS)
Junkin, B. G.
1972-01-01
Two computer programs developed according to two general types of exponential models for conducting nonlinear exponential regression analysis are described. Least squares procedure is used in which the nonlinear problem is linearized by expanding in a Taylor series. Program is written in FORTRAN 5 for the Univac 1108 computer.
A method for nonlinear exponential regression analysis
NASA Technical Reports Server (NTRS)
Junkin, B. G.
1971-01-01
A computer-oriented technique is presented for performing a nonlinear exponential regression analysis on decay-type experimental data. The technique involves the least squares procedure wherein the nonlinear problem is linearized by expansion in a Taylor series. A linear curve fitting procedure for determining the initial nominal estimates for the unknown exponential model parameters is included as an integral part of the technique. A correction matrix was derived and then applied to the nominal estimate to produce an improved set of model parameters. The solution cycle is repeated until some predetermined criterion is satisfied.
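The solution cycle described above (nominal estimates from a linear fit, Taylor-series linearization, correction, repeat) can be sketched as follows. This is a minimal illustration assuming the single-term decay model y = a * exp(b * t) with positive data; the abstract does not state the exact model forms or the stopping rule, so the function names, tolerance and iteration cap are hypothetical.

```python
import math


def initial_estimate(t, y):
    """Nominal estimates of (a, b) from a straight-line fit of ln(y) against t."""
    n = len(t)
    ly = [math.log(v) for v in y]  # requires y > 0
    t_mean = sum(t) / n
    ly_mean = sum(ly) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (li - ly_mean) for ti, li in zip(t, ly))
    b = sxy / sxx
    a = math.exp(ly_mean - b * t_mean)
    return a, b


def fit_exponential(t, y, tol=1e-12, max_iter=50):
    """Least squares fit of y = a * exp(b * t) by repeated linearization (Gauss-Newton)."""
    a, b = initial_estimate(t, y)
    for _ in range(max_iter):
        e = [math.exp(b * ti) for ti in t]
        r = [yi - a * ei for yi, ei in zip(y, e)]   # residuals at the nominal point
        ja = e                                      # partial derivative w.r.t. a
        jb = [a * ti * ei for ti, ei in zip(t, e)]  # partial derivative w.r.t. b
        # Normal equations (J'J) delta = J'r for the linearized problem.
        saa = sum(u * u for u in ja)
        sab = sum(u * v for u, v in zip(ja, jb))
        sbb = sum(v * v for v in jb)
        ga = sum(u * ri for u, ri in zip(ja, r))
        gb = sum(v * ri for v, ri in zip(jb, r))
        det = saa * sbb - sab * sab
        da = (ga * sbb - gb * sab) / det
        db = (saa * gb - sab * ga) / det
        a, b = a + da, b + db                       # apply the correction
        if abs(da) + abs(db) < tol:                 # predetermined stopping criterion
            break
    return a, b
```

The log-linear fit supplies a starting point close enough for the linearized corrections to converge quickly on decay-type data; for data that are not strictly positive a different initial estimate would be needed.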
Relationship between Multiple Regression and Selected Multivariable Methods.
ERIC Educational Resources Information Center
Schumacker, Randall E.
The relationship of multiple linear regression to various multivariate statistical techniques is discussed. The importance of the standardized partial regression coefficient (beta weight) in multiple linear regression as it is applied in path, factor, LISREL, and discriminant analyses is emphasized. The multivariate methods discussed in this paper…
Shrinkage regression-based methods for microarray missing value imputation
2013-01-01
Background: Missing values commonly occur in microarray data, which usually contain more than 5% missing values with up to 90% of genes affected. Inaccurate missing value estimation reduces the power of downstream microarray data analyses. Many types of methods have been developed to estimate missing values. Among them, the regression-based methods are very popular and have been shown to perform better than the other types of methods in many testing microarray datasets. Results: To further improve the performance of the regression-based methods, we propose shrinkage regression-based methods. Our methods take advantage of the correlation structure in the microarray data and select similar genes for the target gene by Pearson correlation coefficients. In addition, our methods incorporate the least squares principle, utilize a shrinkage estimation approach to adjust the coefficients of the regression model, and then use the new coefficients to estimate missing values. Simulation results show that the proposed methods provide more accurate missing value estimation in six testing microarray datasets than the existing regression-based methods do. Conclusions: Imputation of missing values is a very important aspect of microarray data analyses because most of the downstream analyses require a complete dataset. Therefore, exploring accurate and efficient methods for estimating missing values has become an essential issue. Since our proposed shrinkage regression-based methods can provide accurate missing value estimation, they are competitive alternatives to the existing regression-based methods. PMID:24565159
The Precision Efficacy Analysis for Regression Sample Size Method.
ERIC Educational Resources Information Center
Brooks, Gordon P.; Barcikowski, Robert S.
The general purpose of this study was to examine the efficiency of the Precision Efficacy Analysis for Regression (PEAR) method for choosing appropriate sample sizes in regression studies used for precision. The PEAR method, which is based on the algebraic manipulation of an accepted cross-validity formula, essentially uses an effect size to…
Interquantile Shrinkage in Regression Models
Jiang, Liewen; Wang, Huixia Judy; Bondell, Howard D.
2012-01-01
Conventional analysis using quantile regression typically focuses on fitting the regression model at different quantiles separately. However, in situations where the quantile coefficients share some common feature, joint modeling of multiple quantiles to accommodate the commonality often leads to more efficient estimation. One example of common features is that a predictor may have a constant effect over one region of quantile levels but varying effects in other regions. To automatically perform estimation and detection of the interquantile commonality, we develop two penalization methods. When the quantile slope coefficients indeed do not change across quantile levels, the proposed methods will shrink the slopes towards constant and thus improve the estimation efficiency. We establish the oracle properties of the two proposed penalization methods. Through numerical investigations, we demonstrate that the proposed methods lead to estimations with competitive or higher efficiency than the standard quantile regression estimation in finite samples. Supplemental materials for the article are available online. PMID:24363546
Calculation of Solar Radiation by Using Regression Methods
NASA Astrophysics Data System (ADS)
Kızıltan, Ö.; Şahin, M.
2016-04-01
In this study, solar radiation was estimated at 53 locations across Turkey with varying climatic conditions using linear, ridge, lasso, smoother, partial least squares, KNN and Gaussian process regression methods. Data from 2002 and 2003 were used to obtain the regression coefficients of each method. The coefficients were obtained based on the input parameters: month, altitude, latitude, longitude and land surface temperature (LST). The LST values were obtained from data of the National Oceanic and Atmospheric Administration Advanced Very High Resolution Radiometer (NOAA-AVHRR) satellite. Solar radiation for 2004 was then calculated using the fitted coefficients of each regression method. The results were compared statistically. The most successful method was Gaussian process regression and the least successful was lasso regression. For Gaussian process regression, the mean bias error (MBE) was 0.274 MJ/m2, the root mean square error (RMSE) was 2.260 MJ/m2, and the correlation coefficient was 0.941. The statistical results are consistent with the literature. Gaussian process regression is recommended for other studies.
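The three comparison statistics named in the abstract can be computed as in the sketch below. The abstract does not give its formulas, so the standard definitions are assumed here, including the sign convention MBE = mean(predicted - observed). The function names are hypothetical.

```python
import math


def mean_bias_error(observed, predicted):
    """Mean of (predicted - observed); positive values indicate overestimation."""
    return sum(p - o for o, p in zip(observed, predicted)) / len(observed)


def root_mean_square_error(observed, predicted):
    """Square root of the mean squared difference between prediction and observation."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / len(observed))


def correlation(observed, predicted):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(observed)
    mo = sum(observed) / n
    mp = sum(predicted) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    so = math.sqrt(sum((o - mo) ** 2 for o in observed))
    sp = math.sqrt(sum((p - mp) ** 2 for p in predicted))
    return cov / (so * sp)
```

MBE and RMSE carry the units of the target (here MJ/m2), so they are only comparable across methods evaluated on the same test data.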
Birthweight Related Factors in Northwestern Iran: Using Quantile Regression Method
Fallah, Ramazan; Kazemnejad, Anoshirvan; Zayeri, Farid; Shoghli, Alireza
2016-01-01
Introduction: Birthweight is one of the most important predictors of health status in adulthood. Achieving a balanced birthweight is one of the priorities of the health system in most industrial and developed countries. This indicator is used to assess the growth and health status of infants. The aim of this study was to assess the birthweight of neonates by using quantile regression in Zanjan province. Methods: This analytical descriptive study was carried out using pre-registered (March 2010 - March 2012) data of neonates in urban/rural health centers of Zanjan province using multiple-stage cluster sampling. Data were analyzed using multiple linear regression and the quantile regression method with SAS 9.2 statistical software. Results: Of 8456 newborns, 4146 (49%) were female. The mean age of the mothers was 27.1±5.4 years. The mean birthweight of the neonates was 3104 ± 431 grams. Five hundred and seventy-three (6.8%) of the neonates weighed less than 2500 grams. In all quantiles, gestational age of the neonates (p<0.05) and weight and educational level of the mothers (p<0.05) showed a significant linear relationship with the birthweight of the neonates. However, sex and birth rank of the neonates, mother's age, place of residence (urban/rural) and career were not significant in all quantiles (p>0.05). Conclusion: This study revealed that the results of multiple linear regression and quantile regression were not identical. We strongly recommend the use of quantile regression when the response variable is asymmetric or the data contain outliers. PMID:26925889
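The robustness to outliers mentioned in the conclusion follows from the loss function that quantile regression minimizes. The sketch below is only an illustration of that loss in the simplest case (a constant fit with no covariates), not the SAS analysis used in the study. The names `check_loss` and `best_constant` are hypothetical.

```python
def check_loss(u, tau):
    """Quantile (check) loss: weight tau on positive residuals, 1 - tau on negative ones."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def best_constant(y, tau):
    """Sample value q that minimizes the total check loss sum(check_loss(y_i - q, tau))."""
    return min(sorted(y), key=lambda q: sum(check_loss(v - q, tau) for v in y))


def mean(y):
    """Ordinary least squares fit of a constant, shown for comparison."""
    return sum(y) / len(y)
```

For tau = 0.5 the minimizer is the sample median, which a single extreme value barely moves, whereas the least squares fit (the mean) is pulled strongly towards it. Quantile regression applies the same loss to residuals from a linear predictor at each chosen quantile level.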
Numerical study of impeller-driven von Kármán flows via a volume penalization method
NASA Astrophysics Data System (ADS)
Kreuzahler, S.; Schulz, D.; Homann, H.; Ponty, Y.; Grauer, R.
2014-10-01
Studying strongly turbulent flows is still a major challenge in fluid dynamics. It is highly desirable to have comparable experiments to obtain a better understanding of the mechanisms generating turbulence. The von Kármán flow apparatus is one of those experiments that has been used in various turbulence studies by different experimental groups over the last two decades. The von Kármán flow apparatus produces a highly turbulent flow inside a cylindrical vessel driven by two counter-rotating impellers. The studies cover a broad range of physical systems including incompressible flows, especially water and air, magnetohydrodynamic systems using liquid metal for understanding the important topic of the dynamo instability, particle tracking to study Lagrangian type turbulence and recently quantum turbulence in super-fluid helium. Therefore, accompanying numerical studies of the von Kármán flow that quantitatively compare data with those from experiments are of high importance for understanding the mechanism producing the characteristic flow patterns. We present a direct numerical simulation (DNS) version of the von Kármán flow, forced by two rotating impellers. The cylinder geometry and the rotating objects are modelled via a penalization method and implemented in a massively parallel pseudo-spectral Navier-Stokes solver. From the wide range of different impellers used in von Kármán water and sodium experiments we choose a special configuration (TM28), in order to compare our simulations with the corresponding set of well-documented water experiments. Though this configuration is different from the one in the final VKS experiment (TM73), using our method it is quite easy to change the impeller shape to the one actually used in VKS. The decomposition into poloidal and toroidal components and the mean velocity field from our simulations are in good agreement with experimental results. In addition, we analysed the flow structure close to the impeller blades, a region
New Robust Face Recognition Methods Based on Linear Regression
Mi, Jian-Xun; Liu, Jin-Xing; Wen, Jiajun
2012-01-01
Nearest subspace (NS) classification based on linear regression is a straightforward and efficient method for face recognition. A recently developed NS method, linear regression-based classification (LRC), uses downsampled face images as features to perform face recognition. The basic assumption behind this kind of method is that samples from a certain class lie on their own class-specific subspace. Because only a few training samples are available for each class, the small sample size (SSS) problem arises and leads to misclassification by previous NS methods. In this paper, we propose two novel LRC methods using the idea that every class-specific subspace has its unique basis vectors. Thus, we consider each class-specific subspace to be spanned by two kinds of basis vectors: the common basis vectors shared by many classes and the class-specific basis vectors owned by one class only. Based on this concept, two classification methods, namely robust LRC 1 and 2 (RLRC 1 and 2), are given to achieve more robust face recognition. Unlike some previous methods which need to extract class-specific basis vectors, the proposed methods are developed merely based on the existence of the class-specific basis vectors but without actually calculating them. Experiments on three well-known face databases demonstrate very good performance of the new methods compared with other state-of-the-art methods. PMID:22879992
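The baseline idea of nearest subspace classification by linear regression can be sketched as follows. This illustrates only the plain LRC decision rule (assign the class whose training samples reconstruct the probe with the smallest least squares residual), not the RLRC 1 and 2 variants proposed in the paper. The function names and the toy feature vectors are hypothetical, and the residual is obtained by Gram-Schmidt projection rather than any particular solver.

```python
import math


def residual_norm(columns, y):
    """Distance from y to the span of the given vectors, via Gram-Schmidt orthogonalization."""
    basis = []
    for c in columns:
        v = [float(a) for a in c]
        for q in basis:
            coef = sum(a * b for a, b in zip(v, q))
            v = [a - coef * b for a, b in zip(v, q)]
        norm = math.sqrt(sum(a * a for a in v))
        if norm > 1e-12:  # skip vectors that add no new direction
            basis.append([a / norm for a in v])
    r = [float(a) for a in y]
    for q in basis:
        coef = sum(a * b for a, b in zip(r, q))
        r = [a - coef * b for a, b in zip(r, q)]
    return math.sqrt(sum(a * a for a in r))


def lrc_predict(train, y):
    """Return the label whose class-specific subspace lies closest to the probe vector y.

    train maps each class label to a list of training feature vectors.
    """
    return min(train, key=lambda label: residual_norm(train[label], y))


# Toy training set: two classes, two 4-dimensional samples each.
train_demo = {
    "A": [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]],
    "B": [[0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]],
}
```

With few training samples per class, each subspace is poorly determined, which is the small sample size problem the abstract refers to.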
Analysis of regression methods for solar activity forecasting
NASA Technical Reports Server (NTRS)
Lundquist, C. A.; Vaughan, W. W.
1979-01-01
The paper deals with the potential use of the most recent solar data to project trends in the next few years. Assuming that a mode of solar influence on weather can be identified, advantageous use of that knowledge presumably depends on estimating future solar activity. A frequently used technique for solar cycle predictions is a linear regression procedure along the lines formulated by McNish and Lincoln (1949). The paper presents a sensitivity analysis of the behavior of such regression methods relative to the following aspects: cycle minimum, time into cycle, composition of historical data base, and unnormalized vs. normalized solar cycle data. Comparative solar cycle forecasts for several past cycles are presented as to these aspects of the input data. Implications for the current cycle, No. 21, are also given.
Cathodic protection design using the regression and correlation method
Niembro, A.M.; Ortiz, E.L.G.
1997-09-01
A computerized statistical method which calculates the current demand requirement based on potential measurements for cathodic protection systems is introduced. The method uses the regression and correlation analysis of statistical measurements of current and potentials of the piping network. This approach involves four steps: field potential measurements, statistical determination of the current required to achieve full protection, installation of more cathodic protection capacity with distributed anodes around the plant and examination of the protection potentials. The procedure is described and recommendations for the improvement of the existing and new cathodic protection systems are given.
Liu, Xiang; Peng, Yingwei; Tu, Dongsheng; Liang, Hua
2012-10-30
Survival data with a sizable cure fraction are commonly encountered in cancer research. The semiparametric proportional hazards cure model has been recently used to analyze such data. As seen in the analysis of data from a breast cancer study, a variable selection approach is needed to identify important factors in predicting the cure status and risk of breast cancer recurrence. However, no specific variable selection method for the cure model is available. In this paper, we present a variable selection approach with penalized likelihood for the cure model. The estimation can be implemented easily by combining the computational methods for penalized logistic regression and the penalized Cox proportional hazards models with the expectation-maximization algorithm. We illustrate the proposed approach on data from a breast cancer study. We conducted Monte Carlo simulations to evaluate the performance of the proposed method. We used and compared different penalty functions in the simulation studies.
Mapping urban environmental noise: a land use regression method.
Xie, Dan; Liu, Yi; Chen, Jining
2011-09-01
Forecasting and preventing urban noise pollution are major challenges in urban environmental management. Most existing efforts, including experiment-based models, statistical models, and noise mapping, however, have limited capacity to explain the association between urban growth and corresponding noise change. Therefore, these conventional methods can hardly forecast urban noise at a given outlook of development layout. This paper, for the first time, introduces a land use regression method, which has been applied for simulating urban air quality for a decade, to construct an urban noise model (LUNOS) in Dalian Municipality, Northeast China. The LUNOS model describes noise as a dependent variable of surrounding various land areas via a regressive function. The results suggest that a linear model performs better in fitting monitoring data, and there is no significant difference of the LUNOS's outputs when applied to different spatial scales. As the LUNOS facilitates a better understanding of the association between land use and urban environmental noise in comparison to conventional methods, it can be regarded as a promising tool for noise prediction for planning purposes and aid smart decision-making.
Fast nonlinear regression method for CT brain perfusion analysis.
Bennink, Edwin; Oosterbroek, Jaap; Kudo, Kohsuke; Viergever, Max A; Velthuis, Birgitta K; de Jong, Hugo W A M
2016-04-01
Although computed tomography (CT) perfusion (CTP) imaging enables rapid diagnosis and prognosis of ischemic stroke, current CTP analysis methods have several shortcomings. We propose a fast nonlinear regression method with a box-shaped model (boxNLR) that has important advantages over the current state-of-the-art method, block-circulant singular value decomposition (bSVD). These advantages include improved robustness to attenuation curve truncation, extensibility, and unified estimation of perfusion parameters. The method is compared with bSVD and with a commercial SVD-based method. The three methods were quantitatively evaluated by means of a digital perfusion phantom, described by Kudo et al. and qualitatively with the aid of 50 clinical CTP scans. All three methods yielded high Pearson correlation coefficients ([Formula: see text]) with the ground truth in the phantom. The boxNLR perfusion maps of the clinical scans showed higher correlation with bSVD than the perfusion maps from the commercial method. Furthermore, it was shown that boxNLR estimates are robust to noise, truncation, and tracer delay. The proposed method provides a fast and reliable way of estimating perfusion parameters from CTP scans. This suggests it could be a viable alternative to current commercial and academic methods. PMID:27413770
Conventional occlusion versus pharmacologic penalization for amblyopia
Li, Tianjing; Shotton, Kate
2013-01-01
Background Amblyopia is defined as defective visual acuity in one or both eyes without demonstrable abnormality of the visual pathway, and is not immediately resolved by wearing glasses. Objectives To assess the effectiveness and safety of conventional occlusion versus atropine penalization for amblyopia. Search methods We searched CENTRAL, MEDLINE, EMBASE, LILACS, the WHO International Clinical Trials Registry Platform, reference lists, Science Citation Index and ongoing trials up to June 2009. Selection criteria We included randomized/quasi-randomized controlled trials comparing conventional occlusion to atropine penalization for amblyopia. Data collection and analysis Two authors independently screened abstracts and full text articles, abstracted data, and assessed the risk of bias. Main results Three trials with a total of 525 amblyopic eyes were included. Of these three trials, one was assessed as having a low risk of bias and one as having a high risk of bias. Evidence from three trials suggests atropine penalization is as effective as conventional occlusion. One trial found similar improvement in vision at six and 24 months. At six months, visual acuity in the amblyopic eye improved from baseline by 3.16 lines in the occlusion group and 2.84 lines in the atropine group (mean difference 0.034 logMAR; 95% confidence interval (CI) 0.005 to 0.064 logMAR). At 24 months, additional improvement was seen in both groups, but there continued to be no meaningful difference (mean difference 0.01 logMAR; 95% CI −0.02 to 0.04 logMAR). The second trial reported atropine to be more effective than occlusion. At six months, visual acuity improved 1.8 lines in the patching group and 3.4 lines in the atropine penalization group, and was in favor of atropine (mean difference −0.16 logMAR; 95% CI −0.23 to −0.09 logMAR). Different occlusion modalities were used in these two trials. The third trial had inherent methodological flaws and limited inference could
Semiparametric Regression Pursuit.
Huang, Jian; Wei, Fengrong; Ma, Shuangge
2012-10-01
The semiparametric partially linear model allows flexible modeling of covariate effects on the response variable in regression. It combines the flexibility of nonparametric regression and the parsimony of linear regression. Existing estimation methods for this model critically assume that it is known a priori which covariates have a linear effect and which do not. However, in applied work, this is rarely known in advance. We consider the problem of estimation in partially linear models without assuming a priori which covariates have linear effects. We propose a semiparametric regression pursuit method for identifying the covariates with a linear effect. Our proposed method is a penalized regression approach using a group minimax concave penalty. Under suitable conditions we show that the proposed approach is model-pursuit consistent, meaning that it can correctly determine which covariates have a linear effect and which do not with high probability. The performance of the proposed method is evaluated using simulation studies, which support our theoretical results. A real data example is used to illustrate the application of the proposed method. PMID:23559831
Stochastic Approximation Methods for Latent Regression Item Response Models
ERIC Educational Resources Information Center
von Davier, Matthias; Sinharay, Sandip
2010-01-01
This article presents an application of a stochastic approximation expectation maximization (EM) algorithm using a Metropolis-Hastings (MH) sampler to estimate the parameters of an item response latent regression model. Latent regression item response models are extensions of item response theory (IRT) to a latent variable model with covariates…
Cox regression methods for two-stage randomization designs.
Lokhnygina, Yuliya; Helterbrand, Jeffrey D
2007-06-01
Two-stage randomization designs (TSRD) are becoming increasingly common in oncology and AIDS clinical trials as they make more efficient use of study participants to examine therapeutic regimens. In these designs patients are initially randomized to an induction treatment, followed by randomization to a maintenance treatment conditional on their induction response and consent to further study treatment. Broader acceptance of TSRDs in drug development may hinge on the ability to make appropriate intent-to-treat type inference within this design framework as to whether an experimental induction regimen is better than a standard induction regimen when maintenance treatment is fixed. Recently Lunceford, Davidian, and Tsiatis (2002, Biometrics 58, 48-57) introduced an inverse probability weighting based analytical framework for estimating survival distributions and mean restricted survival times, as well as for comparing treatment policies at landmarks in the TSRD setting. In practice Cox regression is widely used and in this article we extend the analytical framework of Lunceford et al. (2002) to derive a consistent estimator for the log hazard in the Cox model and a robust score test to compare treatment policies. Large sample properties of these methods are derived, illustrated via a simulation study, and applied to a TSRD clinical trial. PMID:17425633
Chen, Carla Chia-Ming; Schwender, Holger; Keith, Jonathan; Nunkesser, Robin; Mengersen, Kerrie; Macrossan, Paula
2011-01-01
Due to advancements in computational ability, enhanced technology and a reduction in the price of genotyping, more data are being generated for understanding genetic associations with diseases and disorders. However, with the availability of large data sets comes the inherent challenges of new methods of statistical analysis and modeling. Considering a complex phenotype may be the effect of a combination of multiple loci, various statistical methods have been developed for identifying genetic epistasis effects. Among these methods, logic regression (LR) is an intriguing approach incorporating tree-like structures. Various methods have built on the original LR to improve different aspects of the model. In this study, we review four variations of LR, namely Logic Feature Selection, Monte Carlo Logic Regression, Genetic Programming for Association Studies, and Modified Logic Regression-Gene Expression Programming, and investigate the performance of each method using simulated and real genotype data. We contrast these with another tree-like approach, namely Random Forests, and a Bayesian logistic regression with stochastic search variable selection.
Linear Regression in High Dimension and/or for Correlated Inputs
NASA Astrophysics Data System (ADS)
Jacques, J.; Fraix-Burnet, D.
2014-12-01
Ordinary least squares is the common way to estimate linear regression models. When inputs are correlated or too numerous, regression methods using derived input directions or shrinkage methods can be efficient alternatives. Methods using derived input directions build new uncorrelated variables as linear combinations of the initial inputs, whereas shrinkage methods introduce regularization and variable selection by penalizing the usual least squares criterion. Both kinds of methods are presented and illustrated using the R software on an astronomical dataset.
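The two families of methods contrasted in this abstract can be sketched in a few lines. The following is a minimal illustration in Python with NumPy rather than the R software used by the authors; the function names, the simulated data, and the choice of ridge regression and principal component regression as representatives of "shrinkage" and "derived input directions" are assumptions made here, not details taken from the paper.

```python
import numpy as np

def ols(X, y):
    # Ordinary least squares via a numerically stable least-squares solver.
    return np.linalg.lstsq(X, y, rcond=None)[0]

def ridge(X, y, lam):
    # Shrinkage: minimize ||y - X b||^2 + lam * ||b||^2 (no intercept term).
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def pcr(X, y, k):
    # Derived input directions: regress centered y on the first k
    # principal components of centered X, then map back to the inputs.
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    V_k = Vt[:k].T                 # first k principal directions
    Z = Xc @ V_k                   # uncorrelated derived inputs
    gamma = (Z.T @ yc) / (s[:k] ** 2)
    return V_k @ gamma             # coefficients on the centered inputs

# Hypothetical example: two nearly collinear inputs.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)
X = np.column_stack([x1, x2])
y = x1 + x2 + rng.normal(size=n)

b_ols = ols(X, y)
b_ridge = ridge(X, y, lam=10.0)
b_pcr = pcr(X, y, k=1)
print(b_ols, b_ridge, b_pcr)
```

With strongly correlated inputs the least squares coefficients tend to be unstable, while both alternatives give more stable estimates at the cost of some bias. The penalty `lam` and the number of components `k` would normally be chosen by cross-validation.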
Kwak, Il-Youp; Moore, Candace R; Spalding, Edgar P; Broman, Karl W
2014-08-01
Most statistical methods for quantitative trait loci (QTL) mapping focus on a single phenotype. However, multiple phenotypes are commonly measured, and recent technological advances have greatly simplified the automated acquisition of numerous phenotypes, including function-valued phenotypes, such as growth measured over time. While methods exist for QTL mapping with function-valued phenotypes, they are generally computationally intensive and focus on single-QTL models. We propose two simple, fast methods that maintain high power and precision and are amenable to extensions with multiple-QTL models using a penalized likelihood approach. After identifying multiple QTL by these approaches, we can view the function-valued QTL effects to provide a deeper understanding of the underlying processes. Our methods have been implemented as a package for R, funqtl. PMID:24931408
L1-Penalized N-way PLS for subset of electrodes selection in BCI experiments
NASA Astrophysics Data System (ADS)
Eliseyev, Andrey; Moro, Cecile; Faber, Jean; Wyss, Alexander; Torres, Napoleon; Mestais, Corinne; Benabid, Alim Louis; Aksenova, Tetiana
2012-08-01
Recently, the N-way partial least squares (NPLS) approach was reported as an effective tool for neuronal signal decoding and brain-computer interface (BCI) system calibration. This method simultaneously analyzes data in several domains. It combines the projection of a data tensor to a low dimensional space with linear regression. In this paper the L1-Penalized NPLS is proposed for sparse BCI system calibration, allowing uniting the projection technique with an effective selection of subset of features. The L1-Penalized NPLS was applied for the binary self-paced BCI system calibration, providing selection of electrodes subset. Our BCI system is designed for animal research, in particular for research in non-human primates.
Hypothesis Testing Using Factor Score Regression: A Comparison of Four Methods
ERIC Educational Resources Information Center
Devlieger, Ines; Mayer, Axel; Rosseel, Yves
2016-01-01
In this article, an overview is given of four methods to perform factor score regression (FSR), namely regression FSR, Bartlett FSR, the bias avoiding method of Skrondal and Laake, and the bias correcting method of Croon. The bias correcting method is extended to include a reliable standard error. The four methods are compared with each other and…
Bayes and empirical Bayes methods for reduced rank regression models in matched case-control studies
Zhou, Qin; Lan, Qing; Rothman, Nathaniel; Langseth, Hilde; Engel, Lawrence S.
2015-01-01
Summary Matched case-control studies are popular designs used in epidemiology for assessing the effects of exposures on binary traits. Modern studies increasingly enjoy the ability to examine a large number of exposures in a comprehensive manner. However, several risk factors often tend to be related in a non-trivial way, undermining efforts to identify the risk factors using standard analytic methods due to inflated type I errors and possible masking of effects. Epidemiologists often use data reduction techniques by grouping the prognostic factors using a thematic approach, with themes deriving from biological considerations. We propose shrinkage type estimators based on Bayesian penalization methods to estimate the effects of the risk factors using these themes. The properties of the estimators are examined using extensive simulations. The methodology is illustrated using data from a matched case-control study of polychlorinated biphenyls in relation to the etiology of non-Hodgkin’s lymphoma. PMID:26575519
An Investigation of the Median-Median Method of Linear Regression
ERIC Educational Resources Information Center
Walters, Elizabeth J.; Morrell, Christopher H.; Auer, Richard E.
2006-01-01
Least squares regression is the most common method of fitting a straight line to a set of bivariate data. Another less known method that is available on Texas Instruments graphing calculators is median-median regression. This method is proposed as a simple method that may be used with middle and high school students to motivate the idea of fitting…
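For readers unfamiliar with median-median regression, the following Python sketch shows one common formulation of the resistant line. The grouping rule and the one-third intercept adjustment follow the usual textbook description; whether this matches the calculator implementation discussed in the article in every detail is an assumption, and the function name is hypothetical.

```python
import numpy as np

def median_median_line(x, y):
    """Fit a median-median (resistant) line; returns (slope, intercept)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    if n < 3:
        raise ValueError("need at least three points")

    # Sort the points by x and split them into three groups of nearly
    # equal size: one extra point goes to the middle group, two extra
    # points go to the two outer groups.
    order = np.argsort(x, kind="stable")
    x, y = x[order], y[order]
    base, rem = divmod(n, 3)
    n_outer = base + (1 if rem == 2 else 0)
    left = slice(0, n_outer)
    mid = slice(n_outer, n - n_outer)
    right = slice(n - n_outer, n)

    # Summary point of each group: (median of x, median of y).
    xl, yl = np.median(x[left]), np.median(y[left])
    xm, ym = np.median(x[mid]), np.median(y[mid])
    xr, yr = np.median(x[right]), np.median(y[right])

    # Slope from the outer summary points (undefined if xl == xr);
    # intercept is the average over the three summary points, which moves
    # the line one third of the way toward the middle point.
    slope = (yr - yl) / (xr - xl)
    intercept = ((yl - slope * xl) + (ym - slope * xm) + (yr - slope * xr)) / 3.0
    return slope, intercept

# Hypothetical data on the line y = 2x + 1, with one gross outlier.
xs = list(range(9))
ys = [2 * v + 1 for v in xs]
ys[-1] = 100
print(median_median_line(xs, ys))
```

Because only group medians enter the fit, the single outlier in the example does not change the fitted line, which is the property that makes the method attractive as a teaching contrast to least squares.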
A penalized likelihood approach for mixture cure models.
Corbière, Fabien; Commenges, Daniel; Taylor, Jeremy M G; Joly, Pierre
2009-02-01
Cure models have been developed to analyze failure time data with a cured fraction. For such data, standard survival models are usually not appropriate because they do not account for the possibility of cure. Mixture cure models assume that the studied population is a mixture of susceptible individuals, who may experience the event of interest, and non-susceptible individuals that will never experience it. Important issues in mixture cure models are estimation of the baseline survival function for susceptibles and estimation of the variance of the regression parameters. The aim of this paper is to propose a penalized likelihood approach, which allows for flexible modeling of the hazard function for susceptible individuals using M-splines. This approach also permits direct computation of the variance of parameters using the inverse of the Hessian matrix. Properties and limitations of the proposed method are discussed and an illustration from a cancer study is presented.
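The mixture structure described in this abstract is usually written as follows. The notation (π, S_u, γ, x, z) is assumed here for illustration, and the logistic form for the probability of being susceptible is a common choice rather than something stated in the abstract.

```latex
S_{\mathrm{pop}}(t \mid x, z) = 1 - \pi(z) + \pi(z)\, S_u(t \mid x),
\qquad
\pi(z) = \frac{\exp(\gamma^{\top} z)}{1 + \exp(\gamma^{\top} z)}
```

Here π(z) is the probability that an individual with covariates z is susceptible and S_u is the survival function of susceptible individuals. Since S_u(t | x) tends to 0 as t grows, the population survival function levels off at 1 − π(z), the cured fraction, which is why standard survival models that force survival to 0 are not appropriate for such data.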
Mikhal, Julia; Geurts, Bernard J
2013-12-01
A volume-penalizing immersed boundary method is presented for the simulation of laminar incompressible flow inside geometrically complex blood vessels in the human brain. We concentrate on cerebral aneurysms and compute flow in curved brain vessels with and without spherical aneurysm cavities attached. We approximate blood as an incompressible Newtonian fluid and simulate the flow with the use of a skew-symmetric finite-volume discretization and explicit time-stepping. A key element of the immersed boundary method is the so-called masking function. This is a binary function with which we identify at any location in the domain whether it is 'solid' or 'fluid', allowing to represent objects immersed in a Cartesian grid. We compare three definitions of the masking function for geometries that are non-aligned with the grid. In each case a 'staircase' representation is used in which a grid cell is either 'solid' or 'fluid'. Reliable findings are obtained with our immersed boundary method, even at fairly coarse meshes with about 16 grid cells across a velocity profile. The validation of the immersed boundary method is provided on the basis of classical Poiseuille flow in a cylindrical pipe. We obtain first order convergence for the velocity and the shear stress, reflecting the fact that in our approach the solid-fluid interface is localized with an accuracy on the order of a grid cell. Simulations for curved vessels and aneurysms are done for different flow regimes, characterized by different values of the Reynolds number (Re). The validation is performed for laminar flow at Re = 250, while the flow in more complex geometries is studied at Re = 100 and Re = 250, as suggested by physiological conditions pertaining to flow of blood in the circle of Willis.
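The binary masking function described in this abstract can be illustrated with a small Python sketch. The abstract does not say which three definitions the authors compare, so the rule used here (a cell is 'solid' when its center lies outside a circular vessel cross-section) is only one plausible choice, and the function name, unit-square domain, and parameters are hypothetical.

```python
import numpy as np

def staircase_mask(n, radius, center=(0.5, 0.5)):
    """Binary mask on an n-by-n Cartesian grid over the unit square.

    Returns 1 for 'solid' cells and 0 for 'fluid' cells, where a cell is
    fluid if its center lies inside the circle of the given radius.
    """
    h = 1.0 / n
    centers = (np.arange(n) + 0.5) * h           # cell-center coordinates
    X, Y = np.meshgrid(centers, centers, indexing="ij")
    outside = (X - center[0]) ** 2 + (Y - center[1]) ** 2 > radius ** 2
    return outside.astype(int)

# Hypothetical example: a circular cross-section of diameter 0.5 on a
# 64 x 64 grid, i.e. 32 cells across the vessel.
mask = staircase_mask(64, 0.25)
fluid_area = (mask == 0).sum() / 64 ** 2
print(fluid_area, np.pi * 0.25 ** 2)
```

Because every cell is entirely solid or entirely fluid, the interface is located only to within about one grid cell, which is consistent with the first order convergence reported in the abstract.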
Risk prediction with machine learning and regression methods.
Steyerberg, Ewout W; van der Ploeg, Tjeerd; Van Calster, Ben
2014-07-01
This is a discussion of issues in risk prediction based on the following papers: "Probability estimation with machine learning methods for dichotomous and multicategory outcome: Theory" by Jochen Kruppa, Yufeng Liu, Gérard Biau, Michael Kohler, Inke R. König, James D. Malley, and Andreas Ziegler; and "Probability estimation with machine learning methods for dichotomous and multicategory outcome: Applications" by Jochen Kruppa, Yufeng Liu, Hans-Christian Diener, Theresa Holste, Christian Weimar, Inke R. König, and Andreas Ziegler.
NASA Astrophysics Data System (ADS)
Sykas, Dimitris; Karathanassi, Vassilia
2015-06-01
This paper presents a new method for automatically determining the optimum regression model, which enables the estimation of a parameter. The concept rests on the combination of k spectral pre-processing algorithms (SPPAs) that enhance spectral features correlated to the desired parameter. Initially, a pre-processing algorithm takes a single spectral signature as input and transforms it according to the SPPA function. A k-step combination of SPPAs applies k pre-processing algorithms serially: the result of each SPPA is used as input to the next, and so on until the k desired pre-processed signatures are reached. These signatures are then used as input to three different regression methods: the Normalized band Difference Regression (NDR), the Multiple Linear Regression (MLR) and the Partial Least Squares Regression (PLSR). Three Simple Genetic Algorithms (SGAs) are used, one for each regression method, for the selection of the optimum combination of k SPPAs. The performance of the SGAs is evaluated based on the RMS error of the regression models. The evaluation indicates not only the optimum SPPA combination but also the regression method that produces the optimum prediction model. The proposed method was applied to soil spectral measurements in order to predict Soil Organic Matter (SOM). In this study, the maximum value assigned to k was 3. PLSR yielded the highest accuracy, while NDR's accuracy was satisfactory relative to its complexity. The MLR method showed severe drawbacks due to noise in the form of collinearity among the spectral bands. Most of the regression methods required a 3-step combination of SPPAs to achieve the highest performance. The selected pre-processing algorithms differed for each regression method, since each regression method handles the explanatory variables differently.
Gaussian Process Regression Plus Method for Localization Reliability Improvement.
Liu, Kehan; Meng, Zhaopeng; Own, Chung-Ming
2016-01-01
Location data are among the most widely used context data in context-aware and ubiquitous computing applications. Many systems with distinct deployment costs and positioning accuracies have been developed over the past decade for indoor positioning. The most useful method is focused on the received signal strength and provides a set of signal transmission access points. However, compiling a manual measuring Received Signal Strength (RSS) fingerprint database involves high costs and thus is impractical in an online prediction environment. The system used in this study relied on the Gaussian process method, which is a nonparametric model that can be characterized completely by using the mean function and the covariance matrix. In addition, the Naive Bayes method was used to verify and simplify the computation of precise predictions. The authors conducted several experiments on simulated and real environments at Tianjin University. The experiments examined distinct data size, different kernels, and accuracy. The results showed that the proposed method not only can retain positioning accuracy but also can save computation time in location predictions. PMID:27483276
Local Linear Regression for Data with AR Errors
Li, Runze; Li, Yan
2009-01-01
In many statistical applications, data are collected over time, and they are likely correlated. In this paper, we investigate how to incorporate the correlation information into local linear regression. Under the assumption that the error process is an auto-regressive process, a new estimation procedure is proposed for nonparametric regression using the local linear regression method and profile least squares techniques. We further propose the SCAD penalized profile least squares method to determine the order of the auto-regressive process. Extensive Monte Carlo simulation studies are conducted to examine the finite sample performance of the proposed procedure, and to compare the performance of the proposed procedures with the existing one. From our empirical studies, the newly proposed procedures can dramatically improve the accuracy of naive local linear regression with a working-independent error structure. We illustrate the proposed methodology by an analysis of a real data set. PMID:20161374
ERIC Educational Resources Information Center
Shih, Ching-Lin; Liu, Tien-Hsiang; Wang, Wen-Chung
2014-01-01
The simultaneous item bias test (SIBTEST) method regression procedure and the differential item functioning (DIF)-free-then-DIF strategy are applied to the logistic regression (LR) method simultaneously in this study. These procedures are used to adjust the effects of matching true score on observed score and to better control the Type I error…
[Evaluation of the penal treatment].
Clarke, R V; Sinclair, I
1975-01-01
Evaluative research in the penal field has had two main characteristics: first, there has been a pre-occupation with the objective of treatment to the almost complete exclusion of any other, and second, there has been a concern with the demonstration of effects without a corresponding attempt to understand the nature of the treatment process. This has been because the research has proceeded on an inappropriate "medical" view of penal treatment on which it is assumed that the "cure" of the offender is the major task and that the nature of treatment is relatively easy to understand. The main achievement of this research has been to show that, by and large, penal treatments differ very little in their capacity to reform. The importance of this result should not be underestimated. It has helped to bring about a changed view of delinquency and its treatment which in the long term will have far reaching effects on penal practice. In the shorter term the effects on evaluative research are likely to be two-fold. First it opens the way for evaluation to proceed on a wider front. Instead of needing to pay so much attention to reformative aspects, the researcher will be more free to compare penal measures with respect to such things as their economic and social costs, their capacity for general deterrence, the protection afforded to the public from the activities of known criminals, and the extent to which they satisfy requirements of justice and humanity.
Gan, Wei; Liu, Xuemin; Sun, Jing
2015-02-01
This paper presents a regression evaluation index intelligent filter method (REIFM) for quick optimization of chromatographic separation conditions. The hierarchical chromatography response function was used as the chromatography-optimization index. The regression model was established by orthogonal regression design. The chromatography-optimization index was filtered by the intelligent filter program, and the optimal separation conditions were obtained. The experimental results showed that the average relative deviation between the experimental values and the predicted values was 0.18% at the optimum and that the optimization results were satisfactory.
Interquantile Shrinkage and Variable Selection in Quantile Regression
Jiang, Liewen; Bondell, Howard D.; Wang, Huixia Judy
2014-01-01
Examination of multiple conditional quantile functions provides a comprehensive view of the relationship between the response and covariates. In situations where quantile slope coefficients share some common features, estimation efficiency and model interpretability can be improved by utilizing such commonality across quantiles. Furthermore, elimination of irrelevant predictors will also aid in estimation and interpretation. These motivations lead to the development of two penalization methods, which can identify the interquantile commonality and nonzero quantile coefficients simultaneously. The developed methods are based on a fused penalty that encourages sparsity of both quantile coefficients and interquantile slope differences. The oracle properties of the proposed penalization methods are established. Through numerical investigations, it is demonstrated that the proposed methods lead to simpler model structure and higher estimation efficiency than the traditional quantile regression estimation. PMID:24653545
A Hypothesis Verification Method Using Regression Tree for Semiconductor Yield Analysis
NASA Astrophysics Data System (ADS)
Tsuda, Hidetaka; Shirai, Hidehiro; Terabe, Masahiro; Hashimoto, Kazuo; Shinohara, Ayumi
Several researchers have reported the regression tree analysis for semiconductor yield. However, the scope of these analyses is restricted by the difficulty involved in applying the regression tree analysis to a small number of samples with many attributes. It is often observed that splitting attributes in the root node do not indicate the hypothesized causes of failure. We propose a method for verifying the hypothesized causes of failure, which reduces the number of verification hypotheses. Our method involves selecting sets of analysis data with the same cause of failure, extracting the hypothesis by applying the regression tree analysis separately to each set of analysis data, and merging and sorting attributes according to the t value. The results of an experiment conducted in a real environment show that the proposed method helps in widening the scope of applicability of the regression tree analysis for semiconductor yield.
López Fontán, J L; Costa, J; Ruso, J M; Prieto, G; Sarmiento, F
2004-02-01
The application of a statistical method, the local polynomial regression method (LPRM), based on a nonparametric estimation of the regression function, to determine the critical micelle concentration (cmc) is presented. The method is extremely flexible because it does not impose any parametric model on the underlying structure of the data but rather allows the data to speak for themselves. Good concordance of cmc values with those obtained by other methods was found for systems in which the variation of a measured physical property with concentration showed an abrupt change. When this variation was slow, discrepancies between the values obtained by LPRM and other methods were found.
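The nonparametric estimate underlying local polynomial regression can be sketched as follows for the degree-one (local linear) case. This is a generic Python illustration with a Gaussian kernel and a hand-picked bandwidth; the kernel, the bandwidth, the function name, and the example data are assumptions, and the abstract does not state how the cmc is read off the fitted curve.

```python
import numpy as np

def local_linear(x, y, x0, h):
    """Local linear estimate of the regression function at the point x0."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Gaussian kernel weights: points near x0 count more.
    w = np.exp(-0.5 * ((x - x0) / h) ** 2)
    # Weighted least squares fit of y on (1, x - x0); the fitted intercept
    # is the estimate of the regression function at x0, and the slope
    # coefficient estimates its first derivative there.
    A = np.column_stack([np.ones_like(x), x - x0])
    WA = A * w[:, None]
    coef = np.linalg.solve(A.T @ WA, WA.T @ y)
    return coef[0]

# Hypothetical example: a property whose slope changes at concentration 0.5.
conc = np.linspace(0.0, 1.0, 101)
prop = np.where(conc < 0.5, 2.0 * conc, 1.0 + 0.2 * (conc - 0.5))
fitted = np.array([local_linear(conc, prop, c, h=0.05) for c in conc])
print(fitted[:3])
```

No global parametric form is imposed: each fitted value depends only on the observations near that concentration, which is the flexibility the abstract refers to.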
One-step Sparse Estimates in Nonconcave Penalized Likelihood Models.
Zou, Hui; Li, Runze
2008-08-01
Fan & Li (2001) propose a family of variable selection methods via penalized likelihood using concave penalty functions. The nonconcave penalized likelihood estimators enjoy the oracle properties, but maximizing the penalized likelihood function is computationally challenging, because the objective function is nondifferentiable and nonconcave. In this article we propose a new unified algorithm based on the local linear approximation (LLA) for maximizing the penalized likelihood for a broad class of concave penalty functions. Convergence and other theoretical properties of the LLA algorithm are established. A distinguishing feature of the LLA algorithm is that at each LLA step, the LLA estimator can naturally adopt a sparse representation. Thus we suggest using the one-step LLA estimator from the LLA algorithm as the final estimate. Statistically, we show that if the regularization parameter is appropriately chosen, the one-step LLA estimates enjoy the oracle properties with good initial estimators. Computationally, the one-step LLA estimation methods dramatically reduce the computational cost in maximizing the nonconcave penalized likelihood. We conduct Monte Carlo simulations to assess the finite sample performance of the one-step sparse estimation methods. The results are very encouraging.
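The idea of the one-step LLA estimator can be sketched for the special case of least squares with the SCAD penalty. The Python code below is an illustration under stated assumptions, not the authors' implementation: it uses an ordinary least squares fit as the initial estimator, the conventional value a = 3.7, and a plain coordinate descent solver for the resulting weighted L1 problem; all function names and the simulated data are hypothetical.

```python
import numpy as np

def scad_derivative(t, lam, a=3.7):
    # Derivative of the SCAD penalty at |t|: equal to lam on [0, lam],
    # decaying linearly to 0 at a*lam, and 0 beyond.
    t = np.abs(t)
    return np.where(t <= lam, lam, np.maximum(a * lam - t, 0.0) / (a - 1.0))

def soft_threshold(z, w):
    return np.sign(z) * max(abs(z) - w, 0.0)

def weighted_lasso_cd(X, y, weights, n_iter=200):
    # Coordinate descent for (1/2n)||y - X b||^2 + sum_j weights[j] * |b_j|.
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = np.asarray(y, dtype=float).copy()
    for _ in range(n_iter):
        for j in range(p):
            resid += X[:, j] * beta[j]        # remove coordinate j from the fit
            z = X[:, j] @ resid / n
            beta[j] = soft_threshold(z, weights[j]) / col_sq[j]
            resid -= X[:, j] * beta[j]        # put the updated coordinate back
    return beta

def one_step_lla(X, y, lam, a=3.7):
    # Linearize the penalty at an initial estimate (here: OLS), then solve
    # the resulting weighted L1 problem once.
    beta0 = np.linalg.lstsq(X, y, rcond=None)[0]
    return weighted_lasso_cd(X, y, scad_derivative(beta0, lam, a))

# Hypothetical simulated example with two nonzero coefficients.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
beta_true = np.array([3.0, 0.0, 0.0, 1.5, 0.0])
y = X @ beta_true + rng.normal(size=200)
beta_hat = one_step_lla(X, y, lam=0.3)
print(beta_hat)
```

Coefficients whose initial estimates are large receive a weight of zero and are left unpenalized, while small initial estimates receive the full weight `lam` and are typically set exactly to zero, which is how the one-step estimate obtains a sparse representation.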
An NCME Instructional Module on Data Mining Methods for Classification and Regression
ERIC Educational Resources Information Center
Sinharay, Sandip
2016-01-01
Data mining methods for classification and regression are becoming increasingly popular in various scientific fields. However, these methods have not been explored much in educational measurement. This module first provides a review, which should be accessible to a wide audience in educational measurement, of some of these methods. The module then…
ERIC Educational Resources Information Center
Cohen, Ayala; Nahum-Shani, Inbal; Doveh, Etti
2010-01-01
In their seminal paper, Edwards and Parry (1993) presented polynomial regression as a better alternative to applying difference scores in the study of congruence. Although this method is increasingly applied in congruence research, its complexity relative to other methods for assessing congruence (e.g., difference score methods) was one of the…
Sparling, D.W.; Barzen, J.A.; Lovvorn, J.R.; Serie, J.R.
1992-01-01
Regression equations that use mensural data to estimate body condition have been developed for several water birds. These equations often have been based on data that represent different sexes, age classes, or seasons, without being adequately tested for intergroup differences. We used proximate carcass analysis of 538 adult and juvenile canvasbacks (Aythya valisineria) collected during fall migration, winter, and spring migrations in 1975-76 and 1982-85 to test regression methods for estimating body condition.
The Bland-Altman Method Should Not Be Used in Regression Cross-Validation Studies
ERIC Educational Resources Information Center
O'Connor, Daniel P.; Mahar, Matthew T.; Laughlin, Mitzi S.; Jackson, Andrew S.
2011-01-01
The purpose of this study was to demonstrate the bias in the Bland-Altman (BA) limits of agreement method when it is used to validate regression models. Data from 1,158 men were used to develop three regression equations to estimate maximum oxygen uptake (R[superscript 2] = 0.40, 0.61, and 0.82, respectively). The equations were evaluated in a…
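For readers unfamiliar with the quantity under discussion, the sketch below computes Bland-Altman limits of agreement using the conventional definition (mean difference plus or minus 1.96 standard deviations of the differences). It only shows what the limits are; it does not reproduce the bias demonstration of the study. The function name and the toy numbers are assumptions made for illustration.

```python
from statistics import mean, stdev


def bland_altman_limits(measured, predicted, z=1.96):
    """Return (bias, lower limit, upper limit) of agreement."""
    diffs = [m - p for m, p in zip(measured, predicted)]
    bias = mean(diffs)             # mean difference between the two methods
    spread = z * stdev(diffs)      # z times the sample SD of the differences
    return bias, bias - spread, bias + spread


# Toy values, not data from the study.
bias, lower, upper = bland_altman_limits([10, 12, 14, 16], [9, 13, 13, 17])
print(bias, lower, upper)
```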
A comparison of several methods of solving nonlinear regression groundwater flow problems.
Cooley, R.L.
1985-01-01
Computational efficiency and computer memory requirements for four methods of minimizing functions were compared for four test nonlinear-regression steady state groundwater flow problems. The fastest methods were the Marquardt and quasi-linearization methods, which required almost identical computer times and numbers of iterations; the next fastest was the quasi-Newton method, and last was the Fletcher-Reeves method, which did not converge in 100 iterations for two of the problems.-from Author
[Criminalistic and penal problems with "dyadic deaths"].
Kaliszczak, Paweł; Kunz, Jerzy; Bolechała, Filip
2002-01-01
This paper is a supplement to the article "Medico legal problems of dyadic death" by the same authors and recalls the cases presented there. It is also an attempt to present the basic criminalistic, penal and definitional problems of dyadic death, also called postaggressional suicide. The criminalistic problems of dyadic death are presented in terms of the widely known "rule of seven golden questions": what?, where?, when?, how?, why?, what method? and who? Criminalistic analysis of the cases leads to somewhat different conclusions, but it seemed interesting to bring both the criminalistic and the forensic points of view to bear on the presented material.
ERIC Educational Resources Information Center
Thompson, Russel L.
Homoscedasticity is an important assumption of linear regression. This paper explains what it is and why it is important to the researcher. Graphical and mathematical methods for testing the homoscedasticity assumption are demonstrated. Sources of homoscedasticity and types of homoscedasticity are discussed, and methods for correction are…
A permutation approach for selecting the penalty parameter in penalized model selection.
Sabourin, Jeremy A; Valdar, William; Nobel, Andrew B
2015-12-01
We describe a simple, computationally efficient, permutation-based procedure for selecting the penalty parameter in LASSO-penalized regression. The procedure, permutation selection, is intended for applications where variable selection is the primary focus, and can be applied in a variety of structural settings, including that of generalized linear models. We briefly discuss connections between permutation selection and existing theory for the LASSO. In addition, we present a simulation study and an analysis of real biomedical data sets in which permutation selection is compared with selection based on the following: cross-validation (CV), the Bayesian information criterion (BIC), scaled sparse linear regression, and a selection method based on recently developed testing procedures for the LASSO.
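The abstract does not spell out the procedure, so the following is a sketch under an explicit assumption: that permutation selection permutes the response, records for each permutation the smallest penalty at which the LASSO selects no variable, and takes the median of those values. The objective scaling (1/(2n) times the residual sum of squares plus lambda times the L1 norm, with an unpenalized intercept), the function names and the toy data are all assumptions made for illustration.

```python
import random
from statistics import mean, median


def lambda_max(X, y):
    """Smallest penalty at which the LASSO solution has no active predictor.

    Assumes the objective (1/(2n)) * ||y - b0 - X b||^2 + lam * ||b||_1.
    """
    n = len(y)
    y_bar = mean(y)
    best = 0.0
    for j in range(len(X[0])):
        inner = sum(row[j] * (v - y_bar) for row, v in zip(X, y))
        best = max(best, abs(inner) / n)
    return best


def permutation_lambda(X, y, n_perm=100, seed=0):
    """Median of lambda_max over random permutations of the response."""
    rng = random.Random(seed)      # seeded for reproducibility
    y_perm = list(y)
    values = []
    for _ in range(n_perm):
        rng.shuffle(y_perm)        # breaks any X-y association
        values.append(lambda_max(X, y_perm))
    return median(values)


# Toy data, invented for illustration.
X = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
y = [2.0, 0.0, -2.0, 0.0]
print(lambda_max(X, y), permutation_lambda(X, y, n_perm=25, seed=1))
```

The penalty returned would then be handed to any LASSO solver that uses the same objective scaling; a solver with a different scaling needs the value rescaled accordingly.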
Whole-Genome Regression and Prediction Methods Applied to Plant and Animal Breeding
de los Campos, Gustavo; Hickey, John M.; Pong-Wong, Ricardo; Daetwyler, Hans D.; Calus, Mario P. L.
2013-01-01
Genomic-enabled prediction is becoming increasingly important in animal and plant breeding and is also receiving attention in human genetics. Deriving accurate predictions of complex traits requires implementing whole-genome regression (WGR) models where phenotypes are regressed on thousands of markers concurrently. Methods exist that allow implementing these large-p with small-n regressions, and genome-enabled selection (GS) is being implemented in several plant and animal breeding programs. The list of available methods is long, and the relationships between them have not been fully addressed. In this article we provide an overview of available methods for implementing parametric WGR models, discuss selected topics that emerge in applications, and present a general discussion of lessons learned from simulation and empirical data analysis in the last decade. PMID:22745228
Comparing regression methods for the two-stage clonal expansion model of carcinogenesis.
Kaiser, J C; Heidenreich, W F
2004-11-15
In the statistical analysis of cohort data with risk estimation models, both Poisson and individual likelihood regressions are widely used methods of parameter estimation. In this paper, their performance has been tested with the biologically motivated two-stage clonal expansion (TSCE) model of carcinogenesis. To exclude inevitable uncertainties of existing data, cohorts with simple individual exposure history have been created by Monte Carlo simulation. To generate some similar properties of atomic bomb survivors and radon-exposed mine workers, both acute and protracted exposure patterns have been generated. Then the capacity of the two regression methods has been compared to retrieve a priori known model parameters from the simulated cohort data. For simple models with smooth hazard functions, the parameter estimates from both methods come close to their true values. However, for models with strongly discontinuous functions which are generated by the cell mutation process of transformation, the Poisson regression method fails to produce reliable estimates. This behaviour is explained by the construction of class averages during data stratification. Thereby, some indispensable information on the individual exposure history was destroyed. It could not be repaired by countermeasures such as the refinement of Poisson classes or a more adequate choice of Poisson groups. Although this choice might still exist we were unable to discover it. In contrast to this, the individual likelihood regression technique was found to work reliably for all considered versions of the TSCE model. PMID:15490436
ERIC Educational Resources Information Center
Baker, Bruce D.; Richards, Craig E.
1999-01-01
Applies neural network methods for forecasting 1991-95 per-pupil expenditures in U.S. public elementary and secondary schools. Forecasting models included the National Center for Education Statistics' multivariate regression model and three neural architectures. Regarding prediction accuracy, neural network results were comparable or superior to…
A Simple and Convenient Method of Multiple Linear Regression to Calculate Iodine Molecular Constants
ERIC Educational Resources Information Center
Cooper, Paul D.
2010-01-01
A new procedure using a student-friendly least-squares multiple linear-regression technique utilizing a function within Microsoft Excel is described that enables students to calculate molecular constants from the vibronic spectrum of iodine. This method is advantageous pedagogically as it calculates molecular constants for ground and excited…
Factor Regression Analysis: A New Method for Weighting Predictors. Final Report.
ERIC Educational Resources Information Center
Curtis, Ervin W.
The optimum weighting of variables to predict a dependent-criterion variable is an important problem in nearly all of the social and natural sciences. Although the predominant method, multiple regression analysis (MR), yields optimum weights for the sample at hand, these weights are not generally optimum in the population from which the sample was…
Double Cross-Validation in Multiple Regression: A Method of Estimating the Stability of Results.
ERIC Educational Resources Information Center
Rowell, R. Kevin
In multiple regression analysis, where resulting predictive equation effectiveness is subject to shrinkage, it is especially important to evaluate result replicability. Double cross-validation is an empirical method by which an estimate of invariance or stability can be obtained from research data. A procedure for double cross-validation is…
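The mechanics of double cross-validation can be sketched as follows, assuming the usual reading of the term: split the sample in two, fit an equation on each half, apply each equation to the other half, and correlate predicted with observed scores. A single predictor is used for brevity and the function names are hypothetical; the abstract's own procedure is truncated and may differ in detail.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares intercept and slope for one predictor."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((a - x_bar) ** 2 for a in x)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def pearson_r(u, v):
    """Pearson correlation between two equal-length sequences."""
    u_bar, v_bar = mean(u), mean(v)
    suv = sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))
    suu = sum((a - u_bar) ** 2 for a in u)
    svv = sum((b - v_bar) ** 2 for b in v)
    return suv / (suu * svv) ** 0.5


def double_cross_validate(x, y):
    """Return the two cross-validity correlations (A applied to B, B to A).

    The sample is split by position, so shuffle beforehand if the
    original order is meaningful.
    """
    half = len(x) // 2
    xa, ya, xb, yb = x[:half], y[:half], x[half:], y[half:]
    ia, sa = fit_line(xa, ya)
    ib, sb = fit_line(xb, yb)
    r_a_on_b = pearson_r([ia + sa * v for v in xb], yb)
    r_b_on_a = pearson_r([ib + sb * v for v in xa], ya)
    return r_a_on_b, r_b_on_a


xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2 * v + 1 for v in xs]       # noise-free toy data
print(double_cross_validate(xs, ys))
```

With one predictor the cross-validity coefficient depends on the other half's equation only through the sign of its slope, so this toy shows the steps rather than realistic shrinkage; with several predictors the same steps apply and the two coefficients become informative about stability.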
An Empirical Likelihood Method for Semiparametric Linear Regression with Right Censored Data
Fang, Kai-Tai; Li, Gang; Lu, Xuyang; Qin, Hong
2013-01-01
This paper develops a new empirical likelihood method for semiparametric linear regression with a completely unknown error distribution and right censored survival data. The method is based on the Buckley-James (1979) estimating equation. It inherits some appealing properties of the complete data empirical likelihood method. For example, it does not require variance estimation which is problematic for the Buckley-James estimator. We also extend our method to incorporate auxiliary information. We compare our method with the synthetic data empirical likelihood of Li and Wang (2003) using simulations. We also illustrate our method using Stanford heart transplantation data. PMID:23573169
NASA Technical Reports Server (NTRS)
Sidik, S. M.
1975-01-01
Ridge, Marquardt's generalized inverse, shrunken, and principal components estimators are discussed in terms of the objectives of point estimation of parameters, estimation of the predictive regression function, and hypothesis testing. It is found that as the normal equations approach singularity, more consideration must be given to estimable functions of the parameters as opposed to estimation of the full parameter vector; that biased estimators all introduce constraints on the parameter space; that adoption of mean squared error as a criterion of goodness should be independent of the degree of singularity; and that ordinary least-squares subset regression is the best overall method.
Improved random-starting method for the EM algorithm for finite mixtures of regressions.
Schepers, Jan
2015-03-01
Two methods for generating random starting values for the expectation maximization (EM) algorithm are compared in terms of yielding maximum likelihood parameter estimates in finite mixtures of regressions. One of these methods is ubiquitous in applications of finite mixture regression, whereas the other method is an alternative that appears not to have been used so far. The two methods are compared in two simulation studies and on an illustrative data set. The results show that the alternative method yields solutions with likelihood values at least as high as, and often higher than, those returned by the standard method. Moreover, analyses of the illustrative data set show that the results obtained by the two methods may differ considerably with regard to some of the substantive conclusions. The results reported in this article indicate that in applications of finite mixture regression, consideration should be given to the type of mechanism chosen to generate random starting values for the EM algorithm. In order to facilitate the use of the proposed alternative method, an R function implementing the approach is provided in the Appendix of the article.
Comparison of neural networks and regression-based methods for temperature retrievals.
Motteler, H E; Strow, L L; McMillin, L; Gualtieri, J A
1995-08-20
Two methods for performing clear-air temperature retrievals from simulated radiances for the Atmospheric Infrared Sounder are investigated. Neural networks are compared with a well-known linear method in which regression is performed after a change of bases. With large channel sets, both methods can rapidly perform clear-air retrievals over a variety of climatic conditions with an overall RMS error of less than 1 K. The Jacobian of the neural network is compared with the Jacobian (the regression coefficients) of the linear method, revealing a more fine-scale variation than expected from the underlying physics, particularly for the neural net. Some pragmatic information concerning the application of neural nets to retrieval problems is also included.
Wilcox, Rand R; Keselman, H J
2012-05-01
During the last half century hundreds of papers published in statistical journals have documented general conditions where reliance on least squares regression and Pearson's correlation can result in missing even strong associations between variables. Moreover, highly misleading conclusions can be made, even when the sample size is large. There are, in fact, several fundamental concerns related to non-normality, outliers, heteroscedasticity, and curvature that can result in missing a strong association. Simultaneously, a vast array of new methods have been derived for effectively dealing with these concerns. The paper (1) reviews why least squares regression and classic inferential methods can fail, (2) provides an overview of the many modern strategies for dealing with known problems, including some recent advances, and (3) illustrates that modern robust methods can make a practical difference in our understanding of data. Included are some general recommendations regarding how modern methods might be used.
Neural Network and Regression Methods Demonstrated in the Design Optimization of a Subsonic Aircraft
NASA Technical Reports Server (NTRS)
Hopkins, Dale A.; Lavelle, Thomas M.; Patnaik, Surya
2003-01-01
The neural network and regression methods of NASA Glenn Research Center's COMETBOARDS design optimization testbed were used to generate approximate analysis and design models for a subsonic aircraft operating at Mach 0.85 cruise speed. The analytical model is defined by nine design variables: wing aspect ratio, engine thrust, wing area, sweep angle, chord-thickness ratio, turbine temperature, pressure ratio, bypass ratio, fan pressure; and eight response parameters: weight, landing velocity, takeoff and landing field lengths, approach thrust, overall efficiency, and compressor pressure and temperature. The variables were adjusted to optimally balance the engines to the airframe. The solution strategy included a sensitivity model and the soft analysis model. Researchers generated the sensitivity model by training the approximators to predict an optimum design. The trained neural network predicted all response variables within 5-percent error. This was reduced to 1 percent by the regression method. The soft analysis model was developed to replace aircraft analysis as the reanalyzer in design optimization. Soft models have been generated for a neural network method, a regression method, and a hybrid method obtained by combining the approximators. The performance of the models is graphed for aircraft weight versus thrust as well as for wing area and turbine temperature. The regression method followed the analytical solution with little error. The neural network exhibited 5-percent maximum error over all parameters. Performance of the hybrid method was intermediate in comparison to the individual approximators. Error in the response variable is smaller than that shown in the figure because of a distortion scale factor. The overall performance of the approximators was considered to be satisfactory because aircraft analysis with NASA Langley Research Center's FLOPS (Flight Optimization System) code is a synthesis of diverse disciplines: weight estimation, aerodynamic
Ogaard, B; Ten Bosch, J J
1994-09-01
This article describes a new nondestructive optical method for evaluation of lesion regression in vivo. White spot caries lesions were induced with orthodontic bands in two vital premolars of seven patients. The teeth were banded for 4 weeks with special orthodontic bands that allowed plaque accumulation on the buccal surface. The teeth were left in the dentition for 2 or 4 weeks after debanding. Regular oral hygiene with a nonfluoridated toothpaste was applied during the entire experimental period. The optical scattering coefficient of the banded area was measured before banding and at 1-week intervals thereafter. The scattering coefficient returned to the sound value in an exponential manner, the half-value time being 1.1 weeks for left teeth and 1.8 weeks for right teeth, a significant difference (p = 0.035). At the start of the regression period, the scattering coefficient of left teeth lesions was 2.5 times as high as that of right teeth lesions, the values differing with p = 0.09. It is concluded that regression of initial lesions in the presence of saliva is a relatively rapid process. The new optical method may be of clinical importance for quantitative evaluation of the regression of enamel lesions developed during fixed appliance therapy.
NASA Astrophysics Data System (ADS)
Asavaskulkiet, Krissada
2014-01-01
This paper proposes a novel face super-resolution reconstruction (hallucination) technique for the YCbCr color space. The underlying idea is to learn with an error regression model and multi-linear principal component analysis (MPCA). In the hallucination framework, color face images are represented in YCbCr space. To reduce the time complexity of color face hallucination, the color face images are naturally described as tensors, or multi-linear arrays. In addition, error regression analysis is used to find the error estimate, which can be obtained from the existing LR images in tensor space. The learning process starts from the errors made when reconstructing the face images of the training dataset by MPCA, and then finds the relationship between input and error by regression analysis. The hallucinating process uses the usual back-projection of MPCA, after which the result is corrected with the error estimate. In this contribution we show that our hallucination technique is suitable for color face images in both RGB and YCbCr space. By using the MPCA subspace with the error regression model, we can generate photorealistic color face images. Our approach is demonstrated by extensive experiments with high-quality hallucinated color faces. Comparison with existing algorithms shows the effectiveness of the proposed method.
Assessment of Weighted Quantile Sum Regression for Modeling Chemical Mixtures and Cancer Risk
Czarnota, Jenna; Gennings, Chris; Wheeler, David C
2015-01-01
In evaluation of cancer risk related to environmental chemical exposures, the effect of many chemicals on disease is ultimately of interest. However, because of potentially strong correlations among chemicals that occur together, traditional regression methods suffer from collinearity effects, including regression coefficient sign reversal and variance inflation. In addition, penalized regression methods designed to remediate collinearity may have limitations in selecting the truly bad actors among many correlated components. The recently proposed method of weighted quantile sum (WQS) regression attempts to overcome these problems by estimating a body burden index, which identifies important chemicals in a mixture of correlated environmental chemicals. Our focus was on assessing through simulation studies the accuracy of WQS regression in detecting subsets of chemicals associated with health outcomes (binary and continuous) in site-specific analyses and in non-site-specific analyses. We also evaluated the performance of the penalized regression methods of lasso, adaptive lasso, and elastic net in correctly classifying chemicals as bad actors or unrelated to the outcome. We based the simulation study on data from the National Cancer Institute Surveillance Epidemiology and End Results Program (NCI-SEER) case–control study of non-Hodgkin lymphoma (NHL) to achieve realistic exposure situations. Our results showed that WQS regression had good sensitivity and specificity across a variety of conditions considered in this study. The shrinkage methods had a tendency to incorrectly identify a large number of components, especially in the case of strong association with the outcome. PMID:26005323
Phillips, Kirk T.; Street, W. Nick
2005-01-01
The purpose of this study is to determine the best prediction of heart failure outcomes, resulting from two methods -- standard epidemiologic analysis with logistic regression and knowledge discovery with supervised learning/data mining. Heart failure was chosen for this study as it exhibits higher prevalence and cost of treatment than most other hospitalized diseases. The prevalence of heart failure has exceeded 4 million cases in the U.S. Findings of this study should be useful for the design of quality improvement initiatives, as particular aspects of patient comorbidity and treatment are found to be associated with mortality. This is also a proof of concept study, considering the feasibility of emerging health informatics methods of data mining in conjunction with or in lieu of traditional logistic regression methods of prediction. Findings may also support the design of decision support systems and quality improvement programming for other diseases. PMID:16779367
Semiparametric regression during 2003–2007*
Ruppert, David; Wand, M.P.; Carroll, Raymond J.
2010-01-01
Semiparametric regression is a fusion between parametric regression and nonparametric regression that integrates low-rank penalized splines, mixed model and hierarchical Bayesian methodology – thus allowing more streamlined handling of longitudinal and spatial correlation. We review progress in the field over the five-year period between 2003 and 2007. We find semiparametric regression to be a vibrant field with substantial involvement and activity, continual enhancement and widespread application. PMID:20305800
NASA Astrophysics Data System (ADS)
Zhu, Dazhou; Ji, Baoping; Meng, Chaoying; Shi, Bolin; Tu, Zhenhua; Qing, Zhaoshen
Hybrid linear analysis (HLA), partial least-squares (PLS) regression, and the linear least square support vector machine (LSSVM) were used to determine the soluble solids content (SSC) of apple by Fourier transform near-infrared (FT-NIR) spectroscopy. The performance of these three linear regression methods was compared. Results showed that HLA could be used for the analysis of complex solid samples such as apple. The predictive ability of the SSC model constructed by HLA was comparable to that of PLS. HLA was sensitive to outliers, so outliers should be eliminated before HLA calibration. Linear LSSVM performed better than PLS and HLA. Direct orthogonal signal correction (DOSC) pretreatment was effective for PLS and linear LSSVM, but not suitable for HLA. The combination of DOSC and linear LSSVM had good generalization ability and was not sensitive to outliers, so it is a promising method for linear multivariate calibration.
NASA Astrophysics Data System (ADS)
Zheng, Jun; Shao, Xinyu; Gao, Liang; Jiang, Ping; Qiu, Haobo
2015-06-01
Engineering design, especially for complex engineering systems, is usually a time-consuming process involving computation-intensive computer-based simulation and analysis methods. A difference mapping method using least square support vector regression is developed in this work, as a special metamodelling methodology that includes variable-fidelity data, to replace the computationally expensive computer codes. A general difference mapping framework is proposed where a surrogate base is first created, then the approximation is obtained by mapping the difference between the base and the real high-fidelity response surface. The least square support vector regression is adopted to accomplish the mapping. Two different sampling strategies, nested and non-nested design of experiments, are conducted to explore their respective effects on modelling accuracy. Different sample sizes and three approximation performance measures of accuracy are considered.
Zhang, Li-qing; Wu, Xiao-hua; Tang, Xi; Zhu, Xian-liang; Su, Wen-ting
2002-06-01
The principal component regression (PCR) method is used to analyse five components: acetaminophen, p-aminophenol, caffeine, chlorphenamine maleate and guaifenesin. The basic principle and the analytical steps of the approach are described in detail. The computer program of LHG is based on the VB language. The experimental results show that the PCR method has no systematic error compared with the classical method, and that the average recovery of each component is in the range from 96.43% to 107.14%. Each component gives a satisfactory result without any pre-separation. The approach is simple, rapid and suitable for computer-aided analysis. PMID:12938324
Unification of regression-based methods for the analysis of natural selection.
Morrissey, Michael B; Sakrejda, Krzysztof
2013-07-01
Regression analyses are central to characterization of the form and strength of natural selection in nature. Two common analyses that are currently used to characterize selection are (1) least squares-based approximation of the individual relative fitness surface for the purpose of obtaining quantitatively useful selection gradients, and (2) spline-based estimation of (absolute) fitness functions to obtain flexible inference of the shape of functions by which fitness and phenotype are related. These two sets of methodologies are often implemented in parallel to provide complementary inferences of the form of natural selection. We unify these two analyses, providing a method whereby selection gradients can be obtained for a given observed distribution of phenotype and characterization of a function relating phenotype to fitness. The method allows quantitatively useful selection gradients to be obtained from analyses of selection that adequately model nonnormal distributions of fitness, and provides unification of the two previously separate regression-based fitness analyses. We demonstrate the method by calculating directional and quadratic selection gradients associated with a smooth regression-based generalized additive model of the relationship between neonatal survival and the phenotypic traits of gestation length and birth mass in humans.
Li, Min; Zhou, Tong; Song, Yanan
2016-07-01
A grain size characterization method based on energy attenuation coefficient spectrum and support vector regression (SVR) is proposed. First, the spectra of the first and second back-wall echoes are cut into several frequency bands to calculate the energy attenuation coefficient spectrum. Second, the frequency band that is sensitive to grain size variation is determined. Finally, a statistical model between the energy attenuation coefficient in the sensitive frequency band and average grain size is established through SVR. Experimental verification is conducted on austenitic stainless steel. The average relative error of the predicted grain size is 5.65%, which is better than that of conventional methods.
NASA Astrophysics Data System (ADS)
Khazaei, Ardeshir; Sarmasti, Negin; Seyf, Jaber Yousefi
2016-03-01
Quantitative structure-activity relationship (QSAR) models were used to study a series of curcumin-related compounds with an inhibitory effect on prostate cancer PC-3 cells, pancreas cancer Panc-1 cells, and colon cancer HT-29 cells. The sphere exclusion method was used to split the data set into training and test sets. Multiple linear regression, principal component regression and partial least squares were used as the regression methods. In addition, to investigate the effect of feature selection methods, stepwise selection, a genetic algorithm, and simulated annealing were used. In two cases (PC-3 cells and Panc-1 cells), the best models were generated by a combination of multiple linear regression and stepwise selection (PC-3 cells: r2 = 0.86, q2 = 0.82, pred_r2 = 0.93, and r2m (test) = 0.43; Panc-1 cells: r2 = 0.85, q2 = 0.80, pred_r2 = 0.71, and r2m (test) = 0.68). For the HT-29 cells, principal component regression with stepwise selection (r2 = 0.69, q2 = 0.62, pred_r2 = 0.54, and r2m (test) = 0.41) is the best method. The QSAR study reveals descriptors that have a crucial role in the inhibitory property of curcumin-like compounds. 6ChainCount, T_C_C_1, and T_O_O_7 are the most important descriptors and have the greatest effect. To design and optimize novel, efficient curcumin-related compounds, it is useful to introduce heteroatoms such as nitrogen, oxygen, and sulfur atoms in the chemical structure (reducing the contribution of the T_C_C_1 descriptor) and to increase the contribution of the 6ChainCount and T_O_O_7 descriptors. The models can be useful in the better design of novel curcumin-related compounds that can be used in the treatment of prostate, pancreas, and colon cancers.
Flexible regression models over river networks
O’Donnell, David; Rushworth, Alastair; Bowman, Adrian W; Marian Scott, E; Hallard, Mark
2014-01-01
Many statistical models are available for spatial data but the vast majority of these assume that spatial separation can be measured by Euclidean distance. Data which are collected over river networks constitute a notable and commonly occurring exception, where distance must be measured along complex paths and, in addition, account must be taken of the relative flows of water into and out of confluences. Suitable models for this type of data have been constructed based on covariance functions. The aim of the paper is to place the focus on underlying spatial trends by adopting a regression formulation and using methods which allow smooth but flexible patterns. Specifically, kernel methods and penalized splines are investigated, with the latter proving more suitable from both computational and modelling perspectives. In addition to their use in a purely spatial setting, penalized splines also offer a convenient route to the construction of spatiotemporal models, where data are available over time as well as over space. Models which include main effects and spatiotemporal interactions, as well as seasonal terms and interactions, are constructed for data on nitrate pollution in the River Tweed. The results give valuable insight into the changes in water quality in both space and time. PMID:25653460
Mortality Prediction in ICUs Using A Novel Time-Slicing Cox Regression Method
Wang, Yuan; Chen, Wenlin; Heard, Kevin; Kollef, Marin H.; Bailey, Thomas C.; Cui, Zhicheng; He, Yujie; Lu, Chenyang; Chen, Yixin
2015-01-01
Over the last few decades, machine learning and data mining have been increasingly used for clinical prediction in ICUs. However, there is still a large gap in making full use of the time-series data generated in ICUs. To fill this gap, we propose a novel approach entitled Time Slicing Cox regression (TS-Cox), which extends the classical Cox regression into a classification method on multi-dimensional time-series. Unlike traditional classifiers such as logistic regression and support vector machines, our model not only incorporates the discriminative features derived from the time-series, but also naturally exploits the temporal order of these features based on a Cox-like function. Empirical evaluation on the MIMIC-II database demonstrates the efficacy of the TS-Cox model. Our TS-Cox model outperforms all other baseline models by a good margin in terms of AUC_PR, sensitivity and PPV, which indicates that TS-Cox may be a promising tool for mortality prediction in ICUs. PMID:26958269
A deformation analysis method of stepwise regression for bridge deflection prediction
NASA Astrophysics Data System (ADS)
Shen, Yueqian; Zeng, Ying; Zhu, Lei; Huang, Teng
2015-12-01
Large-scale bridges are among the most important infrastructure, and their safety affects people's daily activities and lives. Monitoring of large-scale bridges is crucial because deformation may occur. Obtaining deformation information and then judging the safety condition are the key and difficult problems in the field of bridge deformation monitoring. Deflection is an important index for evaluating bridge safety. This paper proposes a forecasting model based on stepwise regression analysis. Based on the deflection monitoring data of the Yangtze River Bridge, the main factors influencing deflection deformation are studied. The authors use the monitoring data to forecast the deflection of the bridge at different times from the perspective of non-bridge structure, and compare the results with forecasts from gray relational analysis based on linear regression. The results show that the accuracy and reliability of stepwise regression analysis are high, which provides a scientific basis for bridge operation management. Above all, the ideas of this research provide an effective method for bridge deformation analysis.
Chen, Liang-Hsuan; Hsueh, Chan-Ching
2007-06-01
Fuzzy regression models are useful to investigate the relationship between explanatory and response variables with fuzzy observations. Different from previous studies, this correspondence proposes a mathematical programming method to construct a fuzzy regression model based on a distance criterion. The objective of the mathematical programming is to minimize the sum of distances between the estimated and observed responses on the X axis, such that the fuzzy regression model constructed has the minimal total estimation error in distance. Only several alpha-cuts of fuzzy observations are needed as inputs to the mathematical programming model; therefore, the applications are not restricted to triangular fuzzy numbers. Three examples, adopted in the previous studies, and a larger example, modified from the crisp case, are used to illustrate the performance of the proposed approach. The results indicate that the proposed model has better performance than those in the previous studies based on either distance criterion or Kim and Bishu's criterion. In addition, the efficiency and effectiveness for solving the larger example by the proposed model are also satisfactory.
Statistical method for prediction of gait kinematics with Gaussian process regression.
Yun, Youngmok; Kim, Hyun-Chul; Shin, Sung Yul; Lee, Junwon; Deshpande, Ashish D; Kim, Changhwan
2014-01-01
We propose a novel methodology for predicting human gait pattern kinematics based on a statistical and stochastic approach using a method called Gaussian process regression (GPR). We selected 14 body parameters that significantly affect the gait pattern and 14 joint motions that represent gait kinematics. The body parameter and gait kinematics data were recorded from 113 subjects by anthropometric measurements and a motion capture system. We generated a regression model with GPR for gait pattern prediction and built a stochastic function mapping from body parameters to gait kinematics based on the database and GPR, and validated the model with a cross validation method. The function can not only produce trajectories for the joint motions associated with gait kinematics, but can also estimate the associated uncertainties. Our approach results in a novel, low-cost and subject-specific method for predicting gait kinematics with only the subject's body parameters as the necessary input, and also enables a comprehensive understanding of the correlation and uncertainty between body parameters and gait kinematics. PMID:24211221
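As a minimal sketch of how GPR returns both a prediction and an associated uncertainty, the following pure-Python example fits a one-dimensional, zero-mean Gaussian process. The squared-exponential kernel, the hyperparameters (`length`, `noise`) and the toy data are assumptions made for illustration; they are not taken from the study, which maps 14 body parameters to joint trajectories.

```python
import math

def rbf(a, b, length=1.0):
    """Squared-exponential kernel between two scalar inputs."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def gp_predict(xs, ys, x_new, length=1.0, noise=1e-6):
    """Posterior mean and variance of a zero-mean GP at x_new."""
    K = [[rbf(a, b, length) + (noise if i == j else 0.0)
          for j, b in enumerate(xs)] for i, a in enumerate(xs)]
    k_star = [rbf(a, x_new, length) for a in xs]
    alpha = solve(K, ys)        # (K + noise I)^-1 y
    v = solve(K, k_star)        # (K + noise I)^-1 k*
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = rbf(x_new, x_new, length) - sum(k * w for k, w in zip(k_star, v))
    return mean, var

# Made-up training data: one input, one output.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, 0.8, 0.9, 0.1]
mean_near, var_near = gp_predict(xs, ys, 1.0)    # at a training input
mean_far, var_far = gp_predict(xs, ys, 10.0)     # far from all data
```

Near a training input the posterior variance is small and the mean follows the data; far from the data the mean falls back to the prior (zero here) and the variance returns to the prior variance, which is the behaviour that lets such a model report uncertainty alongside a prediction.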
NASA Astrophysics Data System (ADS)
Bates, Bryson C.; Chandler, Richard E.; Bowman, Adrian W.
2012-08-01
Over recent years, considerable attention has been given to the problem of detecting trends and change points (discontinuities) in climatic series. This has led to the use of a plethora of detection techniques, ranging from the very simple (e.g., linear regression and t-tests) to the relatively complex (e.g., Markov chain Monte Carlo methods). However, many of these techniques are quite restricted in their range of application and care is needed to avoid misinterpretation of their results. In this paper we highlight the availability of modern regression methods that allow for both smooth trends and abrupt changes, and a discontinuity test that enables discrimination between the two. Our framework can accommodate constant mean levels, linear or smooth trends, and can test for genuine change points in an objective and data-driven way. We demonstrate its capabilities using the winter (December-March) North Atlantic Oscillation, an annual mean relative humidity series and a seasonal (June to October) typhoon count series as case studies. We show that the framework is less restrictive than many alternatives in allowing the data to speak for themselves and can give different and more credible results from those of conventional methods. The research findings from such analyses can be used to appropriately inform the design of subsequent studies of temporal changes in underlying physical mechanisms, and the development of policy responses that are appropriate for smoothly varying rather than abrupt climate change (and vice versa).
NASA Astrophysics Data System (ADS)
Melo, Raquel; Vieira, Gonçalo; Caselli, Alberto; Ramos, Miguel
2010-05-01
Field surveying during the austral summer of 2007/08 and the analysis of a QuickBird satellite image resulted in the production of a detailed geomorphological map of the Irizar and Crater Lake area in Deception Island (South Shetlands, Maritime Antarctic - 1:10 000) and allowed the analysis and spatial modelling of the geomorphological phenomena. The present study focuses on the analysis of the spatial distribution and characteristics of hummocky terrains, lag surfaces and nivation hollows, complemented by GIS spatial modelling intended to identify relevant controlling geographical factors. Models of the susceptibility of occurrence of these phenomena were created using two statistical methods: logistic regression, as a multivariate method; and the informative value, as a bivariate method. Success and prediction rate curves were used for model validation. The Area Under the Curve (AUC) was used to quantify the level of performance and prediction of the models and to allow the comparison between the two methods. For the logistic regression method, the AUC showed a success rate of 71% for the lag surfaces, 81% for the hummocky terrains and 78% for the nivation hollows. The prediction rate was 72%, 68% and 71%, respectively. For the informative value method, the success rate was 69% for the lag surfaces, 84% for the hummocky terrains and 78% for the nivation hollows, with corresponding prediction rates of 71%, 66% and 69%. The results were of very good quality and demonstrate the potential of the models to predict the influence of independent variables in the occurrence of the geomorphological phenomena and also the reliability of the data. Key-words: present-day geomorphological dynamics, detailed geomorphological mapping, GIS, spatial modelling, Deception Island, Antarctic.
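To make the AUC figures above concrete, here is a small sketch of the generic ROC-style AUC, computed as the fraction of (present, absent) pairs in which the cell where the landform is present receives the higher susceptibility score. The scores and labels are hypothetical, and the study's success and prediction rate curves may be constructed differently; this only illustrates the general idea of an area-under-curve summary.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical susceptibility scores for six mapped cells (1 = landform present).
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
labels = [1, 1, 1, 0, 0, 0]
example_auc = auc(scores, labels)
```

An AUC of 0.5 corresponds to a model that ranks cells no better than chance, and 1.0 to a perfect ranking, so values such as 0.71 or 0.84 sit between the two.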
A fast nonlinear regression method for estimating permeability in CT perfusion imaging
Bennink, Edwin; Riordan, Alan J; Horsch, Alexander D; Dankbaar, Jan Willem; Velthuis, Birgitta K; de Jong, Hugo W
2013-01-01
Blood–brain barrier damage, which can be quantified by measuring vascular permeability, is a potential predictor for hemorrhagic transformation in acute ischemic stroke. Permeability is commonly estimated by applying Patlak analysis to computed tomography (CT) perfusion data, but this method lacks precision. Applying more elaborate kinetic models by means of nonlinear regression (NLR) may improve precision, but is more time consuming and therefore less appropriate in an acute stroke setting. We propose a simplified NLR method that may be faster and still precise enough for clinical use. The aim of this study is to evaluate the reliability of a total of 12 variations of Patlak analysis and NLR methods, including the simplified NLR method. Confidence intervals for the permeability estimates were evaluated using simulated CT attenuation–time curves with realistic noise, and clinical data from 20 patients. Although fixing the blood volume improved Patlak analysis, the NLR methods yielded significantly more reliable estimates, but took up to 12 × longer to calculate. The simplified NLR method was ∼4 × faster than other NLR methods, while maintaining the same confidence intervals (CIs). In conclusion, the simplified NLR method is a new, reliable way to estimate permeability in stroke, fast enough for clinical application in an acute stroke setting. PMID:23881247
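For orientation, the standard Patlak analysis mentioned above can be sketched as a straight-line fit. Under the usual Patlak model the tissue curve is C_tis(t) = K * integral of C_art from 0 to t + V * C_art(t), so plotting C_tis/C_art against (integral of C_art)/C_art gives slope K (permeability) and intercept V (blood volume). The function names, the trapezoidal integration and the synthetic noise-free curves below are assumptions for illustration; this is not the simplified NLR method proposed in the paper.

```python
def cumulative_trapezoid(t, y):
    """Running integral of y over t by the trapezoidal rule (starts at 0)."""
    out = [0.0]
    for i in range(1, len(t)):
        out.append(out[-1] + 0.5 * (y[i] + y[i - 1]) * (t[i] - t[i - 1]))
    return out

def linear_fit(x, y):
    """Ordinary least squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def patlak(t, c_art, c_tis):
    """Return (permeability K, blood volume V) from a Patlak plot."""
    integral = cumulative_trapezoid(t, c_art)
    x = [integral[i] / c_art[i] for i in range(len(t))]
    y = [c_tis[i] / c_art[i] for i in range(len(t))]
    return linear_fit(x, y)

# Synthetic, noise-free curves generated from known K and V (arbitrary units).
t = [float(i) for i in range(1, 21)]
c_art = [100.0 / (1.0 + 0.1 * ti) for ti in t]
true_k, true_v = 0.002, 0.05
area = cumulative_trapezoid(t, c_art)
c_tis = [true_k * a + true_v * c for a, c in zip(area, c_art)]
k_est, v_est = patlak(t, c_art, c_tis)
```

With noise-free input the fit recovers the generating values; with realistic noise the ratio C_tis/C_art becomes unstable wherever C_art is small, which is one plausible reason the abstract describes the method as lacking precision.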
Kew, William; Mitchell, John B O
2015-09-01
The application of Machine Learning to cheminformatics is a large and active field of research, but there exist few papers which discuss whether ensembles of different Machine Learning methods can improve upon the performance of their component methodologies. Here we investigated a variety of methods, including kernel-based, tree, linear, neural networks, and both greedy and linear ensemble methods. These were all tested against a standardised methodology for regression with data relevant to the pharmaceutical development process. This investigation focused on QSPR problems within drug-like chemical space. We aimed to investigate which methods perform best, and how the 'wisdom of crowds' principle can be applied to ensemble predictors. It was found that no single method performs best for all problems, but that a dynamic, well-structured ensemble predictor would perform very well across the board, usually providing an improvement in performance over the best single method. Its use of weighting factors allows the greedy ensemble to acquire a bigger contribution from the better performing models, and this helps the greedy ensemble generally to outperform the simpler linear ensemble. Choice of data preprocessing methodology was found to be crucial to performance of each method too.
Adaptive wavelet simulation of global ocean dynamics using a new Brinkman volume penalization
NASA Astrophysics Data System (ADS)
Kevlahan, N. K.-R.; Dubos, T.; Aechtner, M.
2015-12-01
In order to easily enforce solid-wall boundary conditions in the presence of complex coastlines, we propose a new mass and energy conserving Brinkman penalization for the rotating shallow water equations. This penalization does not lead to higher wave speeds in the solid region. The error estimates for the penalization are derived analytically and verified numerically for linearized one-dimensional equations. The penalization is implemented in a conservative dynamically adaptive wavelet method for the rotating shallow water equations on the sphere with bathymetry and coastline data from NOAA's ETOPO1 database. This code could form the dynamical core for a future global ocean model. The potential of the dynamically adaptive ocean model is illustrated by using it to simulate the 2004 Indonesian tsunami and wind-driven gyres.
Race Making in a Penal Institution.
Walker, Michael L
2016-01-01
This article provides a ground-level investigation into the lives of penal inmates, linking the literature on race making and penal management to provide an understanding of racial formation processes in a modern penal institution. Drawing on 135 days of ethnographic data collected as an inmate in a Southern California county jail system, the author argues that inmates are subjected to two mutually constitutive racial projects--one institutional and the other microinteractional. Operating in symbiosis within a narrative of risk management, these racial projects increase (rather than decrease) incidents of intraracial violence and the potential for interracial violence. These findings have implications for understanding the process of racialization and evaluating the effectiveness of penal management strategies. PMID:27017706
Feng, Zeny Z; Yang, Xiaojian; Subedi, Sanjeena; McNicholas, Paul D
2012-01-01
Recent work concerning quantitative traits of interest has focused on selecting a small subset of single nucleotide polymorphisms (SNPs) from amongst the SNPs responsible for the phenotypic variation of the trait. When considered as covariates, the large number of variables (SNPs) and their association with those in close proximity pose challenges for variable selection. The features of sparsity and shrinkage of regression coefficients of the least absolute shrinkage and selection operator (LASSO) method appear attractive for SNP selection. Sparse partial least squares (SPLS) is also appealing as it combines the features of sparsity in subset selection and dimension reduction to handle correlations amongst SNPs. In this paper we investigate the application of the LASSO and SPLS methods for selecting SNPs that predict quantitative traits. We evaluate the performance of both methods with different criteria and under different scenarios using simulation studies. Results indicate that these methods can be effective in selecting SNPs that predict quantitative traits but are limited by some conditions. Both methods perform similarly overall, but each exhibits advantages over the other in given situations. Both methods are applied to Canadian Holstein cattle data to compare their performance.
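The sparsity and shrinkage properties of the LASSO can be illustrated with a short coordinate descent sketch. It minimizes (1/2n)||y - Xb||^2 + lam * ||b||_1 without an intercept, using the soft-thresholding operator that sets small coefficients exactly to zero. The toy marker matrix, the penalty values and the function names are made up for illustration and are not from the paper; SPLS is not shown.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam * ||b||_1 (no intercept)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X beta, with beta = 0 initially
    for _ in range(n_iter):
        for j in range(p):
            col = [X[i][j] for i in range(n)]
            z = sum(c * c for c in col) / n
            if z == 0.0:
                continue
            # Correlation of column j with the partial residual.
            rho = sum(c * (r + c * beta[j]) for c, r in zip(col, resid)) / n
            new = soft_threshold(rho, lam) / z
            delta = new - beta[j]
            if delta != 0.0:
                resid = [r - c * delta for r, c in zip(resid, col)]
                beta[j] = new
    return beta

# Toy genotype-like matrix (rows = samples, columns = markers); values are made up.
X = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0],
     [1.0, 1.0, 0.0],
     [2.0, 0.0, 1.0],
     [0.0, 2.0, 0.0],
     [1.0, 2.0, 2.0]]
y = [2.0 * row[0] for row in X]          # only the first marker has an effect
beta_small = lasso_cd(X, y, lam=0.01)    # light penalty: first marker kept
beta_large = lasso_cd(X, y, lam=100.0)   # heavy penalty: everything shrunk to 0
```

With the light penalty only the first coefficient is non-zero and it is shrunk slightly below its true value of 2; with the heavy penalty all coefficients are exactly zero, which shows selection and shrinkage acting together.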
NASA Astrophysics Data System (ADS)
Mandal, Nilrudra; Doloi, Biswanath; Mondal, Biswanath
2016-01-01
In the present study, the Taguchi parameter design method and regression analysis are applied to optimize the cutting conditions for surface finish while machining AISI 4340 steel with newly developed yttria-based Zirconia Toughened Alumina (ZTA) inserts. These inserts are prepared through a wet chemical co-precipitation route followed by a powder metallurgy process. Experiments have been carried out based on an L9 orthogonal array with three parameters (cutting speed, depth of cut and feed rate) at three levels (low, medium and high). Based on the mean response and signal-to-noise ratio (SNR), the optimal cutting condition was found to be A3B1C1, i.e. a cutting speed of 420 m/min, a depth of cut of 0.5 mm and a feed rate of 0.12 m/min, under the smaller-the-better criterion. Analysis of Variance (ANOVA) is applied to find the significance and percentage contribution of each parameter. A mathematical model of surface roughness has been developed using regression analysis as a function of the above-mentioned independent variables. The predicted values from the developed model and the experimental values are found to be very close to each other, supporting the adequacy of the model. A confirmation run has been carried out at the 95 % confidence level to verify the optimized result, and the values obtained are within the prescribed limit.
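The smaller-the-better signal-to-noise ratio used in Taguchi analysis is commonly defined as SNR = -10 log10(mean of y^2), and the preferred level of each factor is the one with the highest average SNR. The sketch below applies this to hypothetical roughness readings; the numbers, the two-repeat layout and the function names are assumptions and do not reproduce the paper's measurements.

```python
import math

def sn_smaller_is_better(values):
    """Taguchi S/N ratio (dB) for a smaller-the-better response."""
    mean_square = sum(v * v for v in values) / len(values)
    return -10.0 * math.log10(mean_square)

def level_means(factor_levels, sn_ratios):
    """Average S/N ratio at each level of one factor."""
    groups = {}
    for level, sn in zip(factor_levels, sn_ratios):
        groups.setdefault(level, []).append(sn)
    return {level: sum(v) / len(v) for level, v in groups.items()}

# Hypothetical roughness readings (micrometres) for nine L9 runs, two repeats each.
runs = [[1.2, 1.3], [1.5, 1.4], [1.9, 2.0],
        [1.0, 1.1], [1.3, 1.3], [1.6, 1.7],
        [0.8, 0.9], [1.1, 1.0], [1.4, 1.5]]
speed_level = [1, 1, 1, 2, 2, 2, 3, 3, 3]   # level of factor A in each run
sn = [sn_smaller_is_better(r) for r in runs]
means = level_means(speed_level, sn)
best_level = max(means, key=means.get)       # highest S/N is preferred
```

Repeating `level_means` for the depth-of-cut and feed-rate columns of the array would give one preferred level per factor, which is how a combination of the form A3B1C1 is read off.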
NASA Technical Reports Server (NTRS)
Tomberlin, T. J.
1985-01-01
Research studies of residents' responses to noise consist of interviews with samples of individuals who are drawn from a number of different compact study areas. The statistical techniques developed provide a basis for those sample design decisions. These techniques are suitable for a wide range of sample survey applications. A sample may consist of a random sample of residents selected from a sample of compact study areas, or in a more complex design, of a sample of residents selected from a sample of larger areas (e.g., cities). The techniques may be applied to estimates of the effects on annoyance of noise level, numbers of noise events, the time-of-day of the events, ambient noise levels, or other factors. Methods are provided for determining, in advance, how accurately these effects can be estimated for different sample sizes and study designs. Using a simple cost function, they also provide for optimum allocation of the sample across the stages of the design for estimating these effects. These techniques are developed via a regression model in which the regression coefficients are assumed to be random, with components of variance associated with the various stages of a multi-stage sample design.
The crux of the method: assumptions in ordinary least squares and logistic regression.
Long, Rebecca G
2008-10-01
Logistic regression has increasingly become the tool of choice when analyzing data with a binary dependent variable. While resources relating to the technique are widely available, clear discussions of why logistic regression should be used in place of ordinary least squares regression are difficult to find. The current paper compares and contrasts the assumptions of ordinary least squares with those of logistic regression and explains why logistic regression's looser assumptions make it adept at handling violations of the more important assumptions in ordinary least squares.
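One practical consequence of the differing assumptions can be shown with a small sketch: fitting ordinary least squares to a binary outcome yields predictions that can fall outside the 0 to 1 range, whereas the logistic model passes the linear predictor through a sigmoid and stays bounded. The data, learning rate and step count below are illustrative assumptions, and the logistic fit uses plain gradient ascent rather than any particular package's algorithm.

```python
import math

def fit_ols(x, y):
    """Closed-form simple linear regression; returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope

def sigmoid(z):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(x, y, lr=0.1, steps=5000):
    """Maximum-likelihood logistic regression by plain gradient ascent."""
    b0, b1 = 0.0, 0.0
    n = len(x)
    for _ in range(steps):
        errs = [yi - sigmoid(b0 + b1 * xi) for xi, yi in zip(x, y)]
        b0 += lr * sum(errs) / n
        b1 += lr * sum(e * xi for e, xi in zip(errs, x)) / n
    return b0, b1

x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [0, 0, 1, 0, 1, 1]                      # made-up binary outcomes

a0, a1 = fit_ols(x, y)
b0, b1 = fit_logistic(x, y)
ols_at_8 = a0 + a1 * 8.0                    # linear prediction, unbounded
logit_at_8 = sigmoid(b0 + b1 * 8.0)         # bounded between 0 and 1
```

Here the least squares line predicts a "probability" above 1 at x = 8, while the logistic prediction remains a valid probability.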
NASA Astrophysics Data System (ADS)
Zhao, Na; Yue, Tianxiang; Zhou, Xun; Zhao, Mingwei; Liu, Yu; Du, Zhengping; Zhang, Lili
2016-03-01
Downscaling precipitation is required in local scale climate impact studies. In this paper, a statistical downscaling scheme is presented that combines a geographically weighted regression (GWR) model with a recently developed method, the high accuracy surface modeling method (HASM). The proposed method was compared with another downscaling method using the Coupled Model Intercomparison Project Phase 5 (CMIP5) database and ground-based data from 732 stations across China for the period 1976-2005. The residual produced by GWR was modified by comparing different interpolators including HASM, Kriging, the inverse distance weighted method (IDW), and Spline. The spatial downscaling from 1° to 1-km grids for the period 1976-2005 and future scenarios was achieved by using the proposed downscaling method. The prediction accuracy was assessed at two separate validation sites throughout China and Jiangxi Province on both annual and seasonal scales, with the root mean square error (RMSE), mean relative error (MRE), and mean absolute error (MAE). The results indicate that the model developed in this study outperforms the method that builds a transfer function using the gauge values. There is a large improvement in the results when using a residual correction with meteorological station observations. In comparison with the other three classical interpolators, HASM shows better performance in modifying the residual produced by the local regression method. The success of the developed technique lies in the effective use of the datasets and the modification process of the residual by using HASM. The results from the future climate scenarios show that precipitation exhibits an overall increasing trend from T1 (2011-2040) to T2 (2041-2070) and T2 to T3 (2071-2100) in the RCP2.6, RCP4.5, and RCP8.5 emission scenarios. The most significant increase occurs in RCP8.5 from T2 to T3, while the lowest increase is found in RCP2.6 from T2 to T3, with increases of 47.11 and 2.12 mm, respectively.
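The residual-correction step and the accuracy measures named above can be sketched with the simplest of the listed interpolators, IDW. In this illustration a regression residual known at stations is interpolated to a target location (and would then be added to the regression prediction there). The station layout, residual values and the exact MRE definition are assumptions; HASM itself is not reproduced here.

```python
import math

def idw(points, values, target, power=2.0):
    """Inverse distance weighted estimate at `target` from scattered points."""
    num, den = 0.0, 0.0
    for (px, py), v in zip(points, values):
        d = math.hypot(px - target[0], py - target[1])
        if d == 0.0:
            return v                      # target coincides with a station
        w = 1.0 / d ** power
        num += w * v
        den += w
    return num / den

def rmse(pred, obs):
    """Root mean square error."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))

def mae(pred, obs):
    """Mean absolute error."""
    return sum(abs(p - o) for p, o in zip(pred, obs)) / len(obs)

def mre(pred, obs):
    """Mean relative error, taken here as mean(|pred - obs| / |obs|)."""
    return sum(abs(p - o) / abs(o) for p, o in zip(pred, obs)) / len(obs)

# Hypothetical regression residuals (mm) at four stations on a unit square.
stations = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
residuals = [10.0, 20.0, 30.0, 40.0]
centre = idw(stations, residuals, (0.5, 0.5))   # equidistant, so a plain mean
```

Swapping `idw` for another interpolator while keeping the same error measures is the kind of comparison the abstract describes between HASM, Kriging, IDW and Spline.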
Eliseyev, Andrey; Aksenova, Tetiana
2016-01-01
In the current paper the decoding algorithms for motor-related BCI systems for continuous upper limb trajectory prediction are considered. Two methods for the smooth prediction, namely Sobolev and Polynomial Penalized Multi-Way Partial Least Squares (PLS) regressions, are proposed. The methods are compared to the Multi-Way Partial Least Squares and Kalman Filter approaches. The comparison demonstrated that the proposed methods combined the prediction accuracy of the algorithms of the PLS family and trajectory smoothness of the Kalman Filter. In addition, the prediction delay is significantly lower for the proposed algorithms than for the Kalman Filter approach. The proposed methods could be applied in a wide range of applications beyond neuroscience. PMID:27196417
Fienen, Michael N.; Selbig, William R.
2012-01-01
A new sample collection system was developed to improve the representation of sediment entrained in urban storm water by integrating water quality samples from the entire water column. The depth-integrated sampler arm (DISA) was able to mitigate sediment stratification bias in storm water, thereby improving the characterization of suspended-sediment concentration and particle size distribution at three independent study locations. Use of the DISA decreased variability, which improved statistical regression to predict particle size distribution using surrogate environmental parameters, such as precipitation depth and intensity. The performance of this statistical modeling technique was compared to results using traditional fixed-point sampling methods and was found to perform better. When environmental parameters can be used to predict particle size distributions, environmental managers have more options when characterizing concentrations, loads, and particle size distributions in urban runoff.
NASA Astrophysics Data System (ADS)
Huang, Fengzhen; Li, Jingzhen; Cao, Jun
2015-02-01
The Temporally and Spatially Modulated Fourier Transform Imaging Spectrometer (TSMFTIS) is a new imaging spectrometer without moving mirrors or slits. In remote sensing, TSMFTIS relies on the push-broom motion of the flying platform to obtain the interferogram of the detected target. If the motion state of the platform changes during imaging, the target interferogram extracted from the remote sensing image sequence deviates from the ideal interferogram, and the recovered target spectrum does not reflect the real characteristics of the ground target. Therefore, to achieve high-precision spectrum recovery, the geometric position of the target point on the TSMFTIS image surface can be calculated with a sub-pixel image registration method, and the actual point interferogram of the target can be obtained by image interpolation. The core idea of interpolation methods (nearest, bilinear, cubic, etc.) is to obtain the grey value of the point to be interpolated by weighting the grey values of the surrounding pixels with a kernel function constructed from the distance between each surrounding pixel and the point to be interpolated. This paper adopts a Gaussian kernel regression model and presents a kernel function that combines grey-level information, expressed through the relative deviation, with distance information; the kernel function is then controlled by the degree of deviation between the grey value of each surrounding pixel and the mean value, so that the weights adjust adaptively. The simulation adopts partial spectrum data obtained by the pushbroom hyperspectral imager (PHI) as the spectrum of the target, obtains the successively push-broomed motion error images in combination with the parameters of an actual aviation platform, then obtains the interferogram of the target point with the above interpolation method; finally, recovers spectrogram with the nonuniform fast
Hwang, Kyu-Baek; Lee, In-Hee; Park, Jin-Ho; Hambuch, Tina; Choi, Yongjoon; Kim, MinHyeok; Lee, Kyungjoon; Song, Taemin; Neu, Matthew B.; Gupta, Neha; Kohane, Isaac S.; Green, Robert C.; Kong, Sek Won
2014-01-01
As whole genome sequencing (WGS) uncovers variants associated with rare and common diseases, an immediate challenge is to minimize false positive findings due to sequencing and variant calling errors. False positives can be reduced by combining results from orthogonal sequencing methods, but this is costly. Here we present variant filtering approaches using logistic regression (LR) and ensemble genotyping to minimize false positives without sacrificing sensitivity. We evaluated the methods using paired WGS datasets of an extended family prepared using two sequencing platforms and a validated set of variants in NA12878. Using LR or ensemble genotyping based filtering, false negative rates were significantly reduced by 1.1- to 17.8-fold at the same levels of false discovery rates (5.4% for heterozygous and 4.5% for homozygous SNVs; 30.0% for heterozygous and 18.7% for homozygous insertions; 25.2% for heterozygous and 16.6% for homozygous deletions) compared to the filtering based on genotype quality scores. Moreover, ensemble genotyping excluded > 98% (105,080 of 107,167) of false positives while retaining > 95% (897 of 937) of true positives in de novo mutation (DNM) discovery, and performed better than a consensus method using two sequencing platforms. Our proposed methods were effective in prioritizing phenotype-associated variants, and ensemble genotyping would be essential to minimize false positive DNM candidates. PMID:24829188
Wang, Molin; Kuchiba, Aya; Ogino, Shuji
2015-08-01
In interdisciplinary biomedical, epidemiologic, and population research, it is increasingly necessary to consider pathogenesis and inherent heterogeneity of any given health condition and outcome. As the unique disease principle implies, no single biomarker can perfectly define disease subtypes. The complex nature of molecular pathology and biology necessitates biostatistical methodologies to simultaneously analyze multiple biomarkers and subtypes. To analyze and test for heterogeneity hypotheses across subtypes defined by multiple categorical and/or ordinal markers, we developed a meta-regression method that can utilize existing statistical software for mixed-model analysis. This method can be used to assess whether the exposure-subtype associations are different across subtypes defined by 1 marker while controlling for other markers and to evaluate whether the difference in exposure-subtype association across subtypes defined by 1 marker depends on any other markers. To illustrate this method in molecular pathological epidemiology research, we examined the associations between smoking status and colorectal cancer subtypes defined by 3 correlated tumor molecular characteristics (CpG island methylator phenotype, microsatellite instability, and the B-Raf protooncogene, serine/threonine kinase (BRAF), mutation) in the Nurses' Health Study (1980-2010) and the Health Professionals Follow-up Study (1986-2010). This method can be widely useful as molecular diagnostics and genomic technologies become routine in clinical medicine and public health.
A New Global Regression Analysis Method for the Prediction of Wind Tunnel Model Weight Corrections
NASA Technical Reports Server (NTRS)
Ulbrich, Norbert Manfred; Bridge, Thomas M.; Amaya, Max A.
2014-01-01
A new global regression analysis method is discussed that predicts wind tunnel model weight corrections for strain-gage balance loads during a wind tunnel test. The method determines corrections by combining "wind-on" model attitude measurements with least squares estimates of the model weight and center of gravity coordinates that are obtained from "wind-off" data points. The method treats the least squares fit of the model weight separately from the fit of the center of gravity coordinates. Therefore, it performs two fits of "wind-off" data points and uses the least squares estimator of the model weight as an input for the fit of the center of gravity coordinates. Explicit equations for the least squares estimators of the weight and center of gravity coordinates are derived that simplify the implementation of the method in the data system software of a wind tunnel. In addition, recommendations for sets of "wind-off" data points are made that take typical model support system constraints into account. Explicit equations of the confidence intervals on the model weight and center of gravity coordinates and two different error analyses of the model weight prediction are also discussed in the appendices of the paper.
Toplak, Marko; Močnik, Rok; Polajnar, Matija; Bosnić, Zoran; Carlsson, Lars; Hasselgren, Catrin; Demšar, Janez; Boyer, Scott; Zupan, Blaž; Stålring, Jonna
2014-02-24
The vastness of chemical space and the relatively small coverage by experimental data recording molecular properties require us to identify subspaces, or domains, for which we can confidently apply QSAR models. The prediction of QSAR models in these domains is reliable, and potential subsequent investigations of such compounds would find that the predictions closely match the experimental values. Standard approaches in QSAR assume that predictions are more reliable for compounds that are "similar" to those in subspaces with denser experimental data. Here, we report on a study of an alternative set of techniques recently proposed in the machine learning community. These methods quantify prediction confidence through estimation of the prediction error at the point of interest. Our study includes 20 public QSAR data sets with continuous response and assesses the quality of 10 reliability scoring methods by observing their correlation with prediction error. We show that these new alternative approaches can outperform standard reliability scores that rely only on similarity to compounds in the training set. The results also indicate that the quality of reliability scoring methods is sensitive to data set characteristics and to the regression method used in QSAR. We demonstrate that at the cost of increased computational complexity these dependencies can be leveraged by integration of scores from various reliability estimation approaches. The reliability estimation techniques described in this paper have been implemented in an open source add-on package ( https://bitbucket.org/biolab/orange-reliability ) to the Orange data mining suite. PMID:24490838
Yap, C W; Li, H; Ji, Z L; Chen, Y Z
2007-11-01
Quantitative structure-activity relationship (QSAR) and quantitative structure-property relationship (QSPR) models have been extensively used for predicting compounds of specific pharmacodynamic, pharmacokinetic, or toxicological property from structure-derived physicochemical and structural features. These models can be developed by using various regression methods including conventional approaches (multiple linear regression and partial least squares) and more recently explored genetic (genetic function approximation) and machine learning (k-nearest neighbour, neural networks, and support vector regression) approaches. This article describes the algorithms of these methods, evaluates their advantages and disadvantages, and discusses the application potential of the recently explored methods. Freely available online and commercial software for these regression methods and the areas of their applications are also presented. PMID:18045213
The cross-validated AUC for MCP-logistic regression with high-dimensional data.
Jiang, Dingfeng; Huang, Jian; Zhang, Ying
2013-10-01
We propose a cross-validated area under the receiver operating characteristic (ROC) curve (CV-AUC) criterion for tuning parameter selection for penalized methods in sparse, high-dimensional logistic regression models. We use this criterion in combination with the minimax concave penalty (MCP) method for variable selection. The CV-AUC criterion is specifically designed for optimizing the classification performance for binary outcome data. To implement the proposed approach, we derive an efficient coordinate descent algorithm to compute the MCP-logistic regression solution surface. Simulation studies are conducted to evaluate the finite sample performance of the proposed method and to compare it with existing methods, including the Akaike information criterion (AIC), Bayesian information criterion (BIC) and Extended BIC (EBIC). The model selected based on the CV-AUC criterion tends to have a larger predictive AUC and smaller classification error than those with tuning parameters selected using the AIC, BIC or EBIC. We illustrate the application of the MCP-logistic regression with the CV-AUC criterion on three microarray datasets from studies that attempt to identify genes related to cancers. Our simulation studies and data examples demonstrate that the CV-AUC is an attractive method for tuning parameter selection for penalized methods in high-dimensional logistic regression models.
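The selection step described in this abstract can be illustrated with a small sketch. It assumes that out-of-fold predicted scores are already available for each candidate tuning parameter, and it averages the per-fold AUC; the paper's exact CV-AUC definition may differ. The function names `auc` and `select_by_cv_auc` are hypothetical, and no MCP fitting is performed here.

```python
from collections import defaultdict


def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 1/2."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def select_by_cv_auc(labels, fold_ids, scores_by_lambda):
    """Return (lambda, mean fold AUC) for the candidate with the largest mean AUC.

    scores_by_lambda maps each candidate tuning parameter to the list of
    out-of-fold predicted scores, one per observation (an assumed input format).
    """
    best_lam, best_auc = None, -1.0
    for lam, scores in scores_by_lambda.items():
        by_fold = defaultdict(lambda: ([], []))
        for y, f, s in zip(labels, fold_ids, scores):
            by_fold[f][0].append(y)
            by_fold[f][1].append(s)
        mean_auc = sum(auc(ys, ss) for ys, ss in by_fold.values()) / len(by_fold)
        if mean_auc > best_auc:
            best_lam, best_auc = lam, mean_auc
    return best_lam, best_auc
```

In practice the scores would come from refitting the penalized model on the training folds for every candidate value; that step is omitted here.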
Methods for Adjusting U.S. Geological Survey Rural Regression Peak Discharges in an Urban Setting
Moglen, Glenn E.; Shivers, Dorianne E.
2006-01-01
A study was conducted of 78 U.S. Geological Survey gaged streams that have been subjected to varying degrees of urbanization over the last three decades. Flood-frequency analysis coupled with nonlinear regression techniques were used to generate a set of equations for converting peak discharge estimates determined from rural regression equations to a set of peak discharge estimates that represent known urbanization. Specifically, urban regression equations for the 2-, 5-, 10-, 25-, 50-, 100-, and 500-year return periods were calibrated as a function of the corresponding rural peak discharge and the percentage of impervious area in a watershed. The results of this study indicate that two sets of equations, one set based on imperviousness and one set based on population density, performed well. Both sets of equations are dependent on rural peak discharges, a measure of development (average percentage of imperviousness or average population density), and a measure of homogeneity of development within a watershed. Average imperviousness was readily determined by using geographic information system methods and commonly available land-cover data. Similarly, average population density was easily determined from census data. Thus, a key advantage to the equations developed in this study is that they do not require field measurements of watershed characteristics as did the U.S. Geological Survey urban equations developed in an earlier investigation. During this study, the U.S. Geological Survey PeakFQ program was used as an integral tool in the calibration of all equations. The scarcity of historical land-use data, however, made exclusive use of flow records necessary for the 30-year period from 1970 to 2000. Such relatively short-duration streamflow time series required a nonstandard treatment of the historical data function of the PeakFQ program in comparison to published guidelines. Thus, the approach used during this investigation does not fully comply with the
Investigating the Accuracy of Three Estimation Methods for Regression Discontinuity Design
ERIC Educational Resources Information Center
Sun, Shuyan; Pan, Wei
2013-01-01
Regression discontinuity design is an alternative to randomized experiments to make causal inference when random assignment is not possible. This article first presents the formal identification and estimation of regression discontinuity treatment effects in the framework of Rubin's causal model, followed by a thorough literature review of…
ERIC Educational Resources Information Center
Wong, Vivian C.; Steiner, Peter M.; Cook, Thomas D.
2013-01-01
In a traditional regression-discontinuity design (RDD), units are assigned to treatment on the basis of a cutoff score and a continuous assignment variable. The treatment effect is measured at a single cutoff location along the assignment variable. This article introduces the multivariate regression-discontinuity design (MRDD), where multiple…
Kernel regression image processing method for optical readout MEMS based uncooled IRFPA
NASA Astrophysics Data System (ADS)
Dong, Liquan; Liu, Xiaohua; Zhao, Yuejin; Hui, Mei; Zhou, Xiaoxiao
2009-11-01
Almost two years after the investors in Sarcon Microsystems pulled the plug, micro-cantilever array based uncooled IR detector technology is again attracting attention because of its low cost and high credibility. An uncooled thermal detector array with low NETD is designed and fabricated using MEMS bimaterial microcantilever structures that bend in response to thermal change. The IR images of objects obtained by these FPAs are read out by an optical method. For these IR images, the problem of fixed pattern noise (FPN) is complicated by the fact that the response of each FPA detector changes due to a variety of factors. We adapt and expand kernel regression ideas for use in image denoising, and the quality of the processed images improves noticeably. The algorithm has been extensively computed and analyzed on simulated data and in applications to real data. The experimental results demonstrate that a better RMSE and a higher peak signal-to-noise ratio (PSNR) can be obtained than with traditional methods. Finally, we discuss the factors that determine the ultimate performance of the FPA, and we indicate that one of the unique advantages of the present approach is its scalability to larger imaging arrays.
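As a rough illustration of the kernel regression idea mentioned in this abstract, the sketch below applies a classic Nadaraya-Watson estimator with a Gaussian kernel to a one-dimensional signal. This is a textbook stand-in, not the authors' adapted image-denoising algorithm; the function name `kernel_smooth` and the bandwidth value are assumptions.

```python
import math


def kernel_smooth(xs, ys, bandwidth):
    """Nadaraya-Watson estimate at each x: a Gaussian-weighted average of all ys."""
    smoothed = []
    for x0 in xs:
        weights = [math.exp(-0.5 * ((x - x0) / bandwidth) ** 2) for x in xs]
        smoothed.append(sum(w * y for w, y in zip(weights, ys)) / sum(weights))
    return smoothed


# A flat signal with a single noisy spike at position 2.
positions = [0.0, 1.0, 2.0, 3.0, 4.0]
signal = [1.0, 1.0, 5.0, 1.0, 1.0]
denoised = kernel_smooth(positions, signal, bandwidth=1.0)
```

A larger bandwidth removes more noise but also blurs genuine features, which is the usual trade-off when such smoothers are applied to images.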
A faster optimization method based on support vector regression for aerodynamic problems
NASA Astrophysics Data System (ADS)
Yang, Xixiang; Zhang, Weihua
2013-09-01
In this paper, a new strategy for the optimal design of complex aerodynamic configurations with a reasonably low computational effort is proposed. In order to solve the formulated aerodynamic optimization problem with heavy computation complexity, two steps are taken: (1) a sequential approximation method based on support vector regression (SVR) and a hybrid cross validation strategy is proposed to predict aerodynamic coefficients, and thus approximates the objective function and constraint conditions of the originally formulated optimization problem with given limited sample points; (2) a sequential optimization algorithm is proposed to ensure that the optimal solution obtained by solving the approximation optimization problem in step (1) is very close to the optimal solution of the originally formulated optimization problem. In the end, we adopt a complex aerodynamic design problem, namely the optimal aerodynamic design of a flight vehicle with grid fins, to demonstrate our proposed optimization methods, and numerical results show that better results can be obtained with a significantly lower computational effort than using classical optimization techniques.
Standard regression-based methods for measuring recovery after sport-related concussion.
McCrea, Michael; Barr, William B; Guskiewicz, Kevin; Randolph, Christopher; Marshall, Stephen W; Cantu, Robert; Onate, James A; Kelly, James P
2005-01-01
Clinical decision making about an athlete's return to competition after concussion is hampered by a lack of systematic methods to measure recovery. We applied standard regression-based methods to statistically measure individual rates of impairment at several time points after concussion in college football players. Postconcussive symptoms, cognitive functioning, and balance were assessed in 94 players with concussion (based on American Academy of Neurology Criteria) and 56 noninjured controls during preseason baseline testing, and immediately, 3 hr, and 1, 2, 3, 5, and 7 days postinjury. Ninety-five percent of injured players exhibited acute concussion symptoms and impairment on cognitive or balance testing immediately after injury, which diminished to 4% who reported elevated symptoms on postinjury day 7. In addition, a small but clinically significant percentage of players who reported being symptom free by day 2 continued to be classified as impaired on the basis of objective balance and cognitive testing. These data suggest that neuropsychological testing may be of incremental utility to subjective symptom checklists in identifying the residual effects of sport-related concussion. The implementation of neuropsychological testing to detect subtle cognitive impairment is most useful once postconcussive symptoms have resolved. This management model is also supported by practical and other methodological considerations.
Eng, K.; Milly, P.C.D.; Tasker, Gary D.
2007-01-01
To facilitate estimation of streamflow characteristics at an ungauged site, hydrologists often define a region of influence containing gauged sites hydrologically similar to the estimation site. This region can be defined either in geographic space or in the space of the variables that are used to predict streamflow (predictor variables). These approaches are complementary, and a combination of the two may be superior to either. Here we propose a hybrid region-of-influence (HRoI) regression method that combines the two approaches. The new method was applied with streamflow records from 1,091 gauges in the southeastern United States to estimate the 50-year peak flow (Q50). The HRoI approach yielded lower root-mean-square estimation errors and produced fewer extreme errors than either the predictor-variable or geographic region-of-influence approaches. It is concluded, for Q50 in the study region, that similarity with respect to the basin characteristics considered (area, slope, and annual precipitation) is important, but incomplete, and that the consideration of geographic proximity of stations provides a useful surrogate for characteristics that are not included in the analysis. © 2007 ASCE.
Cox regression with missing covariate data using a modified partial likelihood method.
Martinussen, Torben; Holst, Klaus K; Scheike, Thomas H
2016-10-01
Missing covariate values is a common problem in survival analysis. In this paper we propose a novel method for the Cox regression model that is close to maximum likelihood but avoids the use of the EM-algorithm. It exploits that the observed hazard function is multiplicative in the baseline hazard function with the idea being to profile out this function before carrying out the estimation of the parameter of interest. In this step one uses a Breslow type estimator to estimate the cumulative baseline hazard function. We focus on the situation where the observed covariates are categorical which allows us to calculate estimators without having to assume anything about the distribution of the covariates. We show that the proposed estimator is consistent and asymptotically normal, and derive a consistent estimator of the variance-covariance matrix that does not involve any choice of a perturbation parameter. Moderate sample size performance of the estimators is investigated via simulation and by application to a real data example. PMID:26493471
Stefanello, C; Vieira, S L; Xue, P; Ajuwon, K M; Adeola, O
2016-07-01
A study was conducted to determine the ileal digestible energy (IDE), ME, and MEn contents of bakery meal using the regression method and to evaluate whether the energy values are age-dependent in broiler chickens from zero to 21 d post hatching. Seven hundred and eighty male Ross 708 chicks were fed 3 experimental diets in which bakery meal was incorporated into a corn-soybean meal-based reference diet at zero, 100, or 200 g/kg by replacing the energy-yielding ingredients. A 3 × 3 factorial arrangement of 3 ages (1, 2, or 3 wk) and 3 dietary bakery meal levels were used. Birds were fed the same experimental diets in these 3 evaluated ages. Birds were grouped by weight into 10 replicates per treatment in a randomized complete block design. Apparent ileal digestibility and total tract retention of DM, N, and energy were calculated. Expression of mucin (MUC2), sodium-dependent phosphate transporter (NaPi-IIb), solute carrier family 7 (cationic amino acid transporter, Y(+) system, SLC7A2), glucose (GLUT2), and sodium-glucose linked transporter (SGLT1) genes were measured at each age in the jejunum by real-time PCR. Addition of bakery meal to the reference diet resulted in a linear decrease in retention of DM, N, and energy, and a quadratic reduction (P < 0.05) in N retention and ME. There was a linear increase in DM, N, and energy as birds' ages increased from 1 to 3 wk. Dietary bakery meal did not affect jejunal gene expression. Expression of genes encoding MUC2, NaPi-IIb, and SLC7A2 linearly increased (P < 0.05) with age. Regression-derived MEn of bakery meal linearly increased (P < 0.05) as the age of birds increased, with values of 2,710, 2,820, and 2,923 kcal/kg DM for 1, 2, and 3 wk, respectively. Based on these results, utilization of energy and nitrogen in the basal diet decreased when bakery meal was included and increased with age of broiler chickens.
Analysis of Genome-Wide Association Studies with Multiple Outcomes Using Penalization
Liu, Jin; Huang, Jian; Ma, Shuangge
2012-01-01
Genome-wide association studies have been extensively conducted, searching for markers for biologically meaningful outcomes and phenotypes. Penalization methods have been adopted in the analysis of the joint effects of a large number of SNPs (single nucleotide polymorphisms) and marker identification. This study is partly motivated by the analysis of heterogeneous stock mice dataset, in which multiple correlated phenotypes and a large number of SNPs are available. Existing penalization methods designed to analyze a single response variable cannot accommodate the correlation among multiple response variables. With multiple response variables sharing the same set of markers, joint modeling is first employed to accommodate the correlation. The group Lasso approach is adopted to select markers associated with all the outcome variables. An efficient computational algorithm is developed. Simulation study and analysis of the heterogeneous stock mice dataset show that the proposed method can outperform existing penalization methods. PMID:23272092
Revisiting the Distance Duality Relation using a non-parametric regression method
NASA Astrophysics Data System (ADS)
Rana, Akshay; Jain, Deepak; Mahajan, Shobhit; Mukherjee, Amitabha
2016-07-01
The interdependence of the luminosity distance, $D_L$, and the angular diameter distance, $D_A$, given by the distance duality relation (DDR), is very significant in observational cosmology. It is closely tied to the temperature-redshift relation of the Cosmic Microwave Background (CMB) radiation. Any deviation from $\eta(z) \equiv D_L / [D_A (1+z)^2] = 1$ indicates a possible emergence of new physics. Our aim in this work is to check the consistency of these relations using a non-parametric regression method, namely LOESS with SIMEX. This technique avoids dependency on the cosmological model and works with a minimal set of assumptions. Further, to analyze the efficiency of the methodology, we simulate a dataset of 020 points of $\eta(z)$ data based on a phenomenological model $\eta(z) = (1+z)^{\epsilon}$. The error on the simulated data points is obtained by using the temperature of CMB radiation at various redshifts. For testing the distance duality relation, we use the JLA SNe Ia data for luminosity distances, while the angular diameter distances are obtained from radio galaxies datasets. Since the DDR is linked with the CMB temperature-redshift relation, we also use the CMB temperature data to reconstruct $\eta(z)$. It is important to note that with CMB data, we are able to study the evolution of the DDR up to a very high redshift, $z = 2.418$. In this analysis, we find no evidence of deviation from $\eta = 1$ within a 1σ region in the entire redshift range used in this analysis ($0 < z \le 2.418$).
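The two quantities named in this abstract, the observed ratio η(z) and the phenomenological model (1+z)^ε, are simple enough to write down directly. The sketch below only evaluates those two expressions; the function names are hypothetical and the input numbers are illustrative, not data from the paper.

```python
def eta_observed(d_l, d_a, z):
    """Distance duality ratio: eta = D_L / (D_A * (1 + z)^2); equals 1 if the DDR holds."""
    return d_l / (d_a * (1.0 + z) ** 2)


def eta_model(z, epsilon):
    """Phenomenological deviation model: eta(z) = (1 + z)^epsilon."""
    return (1.0 + z) ** epsilon


# With epsilon = 0 the model reduces to eta = 1 at every redshift.
no_deviation = eta_model(2.418, 0.0)
```

Any nonzero ε makes η drift away from 1 as the redshift grows, which is the signature the regression is meant to detect.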
A primer on regression methods for decoding cis-regulatory logic
Das, Debopriya; Pellegrini, Matteo; Gray, Joe W.
2009-03-03
The rapidly emerging field of systems biology is helping us to understand the molecular determinants of phenotype on a genomic scale [1]. Cis-regulatory elements are major sequence-based determinants of biological processes in cells and tissues [2]. For instance, during transcriptional regulation, transcription factors (TFs) bind to very specific regions on the promoter DNA [2,3] and recruit the basal transcriptional machinery, which ultimately initiates mRNA transcription (Figure 1A). Learning cis-Regulatory Elements from Omics Data A vast amount of work over the past decade has shown that omics data can be used to learn cis-regulatory logic on a genome-wide scale [4-6]--in particular, by integrating sequence data with mRNA expression profiles. The most popular approach has been to identify over-represented motifs in promoters of genes that are coexpressed [4,7,8]. Though widely used, such an approach can be limiting for a variety of reasons. First, the combinatorial nature of gene regulation is difficult to explicitly model in this framework. Moreover, in many applications of this approach, expression data from multiple conditions are necessary to obtain reliable predictions. This can potentially limit the use of this method to only large data sets [9]. Although these methods can be adapted to analyze mRNA expression data from a pair of biological conditions, such comparisons are often confounded by the fact that primary and secondary response genes are clustered together--whereas only the primary response genes are expected to contain the functional motifs [10]. A set of approaches based on regression has been developed to overcome the above limitations [11-32]. These approaches have their foundations in certain biophysical aspects of gene regulation [26,33-35]. That is, the models are motivated by the expected transcriptional response of genes due to the binding of TFs to their promoters. While such methods have gathered popularity in the computational domain
Huang, Dong; Cabral, Ricardo; De la Torre, Fernando
2016-02-01
Discriminative methods (e.g., kernel regression, SVM) have been extensively used to solve problems such as object recognition, image alignment and pose estimation from images. These methods typically map image features (X) to continuous (e.g., pose) or discrete (e.g., object category) values. A major drawback of existing discriminative methods is that samples are directly projected onto a subspace and hence fail to account for outliers common in realistic training sets due to occlusion, specular reflections or noise. It is important to notice that existing discriminative approaches assume the input variables X to be noise free. Thus, discriminative methods experience significant performance degradation when gross outliers are present. Despite its obvious importance, the problem of robust discriminative learning has been relatively unexplored in computer vision. This paper develops the theory of robust regression (RR) and presents an effective convex approach that uses recent advances on rank minimization. The framework applies to a variety of problems in computer vision including robust linear discriminant analysis, regression with missing data, and multi-label classification. Several synthetic and real examples with applications to head pose estimation from images, image and video classification and facial attribute classification with missing data are used to illustrate the benefits of RR. PMID:26761740
Environmental Conditions in Kentucky's Penal Institutions
ERIC Educational Resources Information Center
Bell, Irving
1974-01-01
A state task force was organized to identify health or environmental deficiencies existing in Kentucky penal institutions. Based on information gained through direct observation and inmate questionnaires, the task force concluded that many hazardous and unsanitary conditions existed, and recommended that immediate action be given to these…
NCAA Penalizes Fewer Teams than Expected
ERIC Educational Resources Information Center
Sander, Libby
2008-01-01
This article reports that the National Collegiate Athletic Association (NCAA) has penalized fewer teams than it expected this year over athletes' poor academic performance. For years, officials with the NCAA have predicted that strikingly high numbers of college sports teams could be at risk of losing scholarships this year because of their…
Sample Size Determination for Regression Models Using Monte Carlo Methods in R
ERIC Educational Resources Information Center
Beaujean, A. Alexander
2014-01-01
A common question asked by researchers using regression models is, What sample size is needed for my study? While there are formulae to estimate sample sizes, their assumptions are often not met in the collected data. A more realistic approach to sample size determination requires more information such as the model of interest, strength of the…
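The Monte Carlo approach to sample size that this abstract describes can be sketched as a small power simulation. The article itself works in R; the version below is a Python illustration under stated assumptions: a simple linear regression with standard normal predictor and errors, and a normal approximation to the slope test instead of the exact t test. The function names and default settings are hypothetical.

```python
import random
from statistics import NormalDist


def slope_significant(x, y, alpha=0.05):
    """Fit y = a + b*x by least squares and test b = 0 (normal approximation)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    a = my - b * mx
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    se = (rss / (n - 2) / sxx) ** 0.5
    return abs(b / se) > NormalDist().inv_cdf(1 - alpha / 2)


def estimated_power(n, beta, reps=500, seed=0):
    """Share of simulated data sets of size n in which the slope is declared significant."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        x = [rng.gauss(0, 1) for _ in range(n)]
        y = [beta * xi + rng.gauss(0, 1) for xi in x]
        hits += slope_significant(x, y)
    return hits / reps
```

Running `estimated_power` over a grid of candidate sample sizes and picking the smallest one that reaches the target power (for example 0.80) gives a simulation-based sample size.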
ERIC Educational Resources Information Center
Wong, Vivian C.; Steiner, Peter M.; Cook, Thomas D.
2012-01-01
In a traditional regression-discontinuity design (RDD), units are assigned to treatment and comparison conditions solely on the basis of a single cutoff score on a continuous assignment variable. The discontinuity in the functional form of the outcome at the cutoff represents the treatment effect, or the average treatment effect at the cutoff.…
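The idea that the treatment effect is the jump in the outcome at the cutoff can be shown with a minimal sharp-RDD sketch. It assumes that units scoring at or above the cutoff are treated and that a straight line on each side is an adequate functional form; both are simplifying assumptions, and the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x; returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - slope * mx, slope


def rdd_effect(scores, outcomes, cutoff):
    """Difference between the two fitted lines evaluated at the cutoff."""
    left = [(s - cutoff, y) for s, y in zip(scores, outcomes) if s < cutoff]
    right = [(s - cutoff, y) for s, y in zip(scores, outcomes) if s >= cutoff]
    left_at_cutoff, _ = fit_line(*zip(*left))
    right_at_cutoff, _ = fit_line(*zip(*right))
    return right_at_cutoff - left_at_cutoff


# Noise-free toy data with a jump of 3 at a cutoff of 0.
scores = list(range(-5, 6))
outcomes = [2 + 0.5 * s + (3 if s >= 0 else 0) for s in scores]
effect = rdd_effect(scores, outcomes, cutoff=0)
```

Centering the assignment variable at the cutoff makes each intercept the fitted outcome at the cutoff, so the effect is just the difference of the two intercepts.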
Estimating R-squared Shrinkage in Multiple Regression: A Comparison of Different Analytical Methods.
ERIC Educational Resources Information Center
Yin, Ping; Fan, Xitao
2001-01-01
Studied the effectiveness of various analytical formulas for estimating "R" squared shrinkage in multiple regression analysis, focusing on estimators of the squared population multiple correlation coefficient and the squared population cross validity coefficient. Simulation results suggest that the most widely used Wherry (R. Wherry, 1931) formula…
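For orientation, the shrinkage correction most readers will know is the adjusted R-squared reported by common statistical packages, which is often attributed to Wherry. The sketch below assumes that this is the formula meant; naming conventions for the Wherry variants differ between sources, and the function name is hypothetical.

```python
def adjusted_r_squared(r_squared, n, p):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1).

    n is the sample size and p the number of predictors (intercept not counted).
    """
    if n - p - 1 <= 0:
        raise ValueError("need n > p + 1")
    return 1.0 - (1.0 - r_squared) * (n - 1) / (n - p - 1)


# Sample R^2 of 0.50 from 21 cases and 5 predictors shrinks to about 0.33.
shrunken = adjusted_r_squared(0.50, n=21, p=5)
```

The correction grows as the number of predictors approaches the sample size, which is when the sample R-squared is most optimistic.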
Correcting Measurement Error in Latent Regression Covariates via the MC-SIMEX Method
ERIC Educational Resources Information Center
Rutkowski, Leslie; Zhou, Yan
2015-01-01
Given the importance of large-scale assessments to educational policy conversations, it is critical that subpopulation achievement is estimated reliably and with sufficient precision. Despite this importance, biased subpopulation estimates have been found to occur when variables in the conditioning model side of a latent regression model contain…
Using regression methods to estimate stream phosphorus loads at the Illinois River, Arkansas
Haggard, B.E.; Soerens, T.S.; Green, W.R.; Richards, R.P.
2003-01-01
The development of total maximum daily loads (TMDLs) requires evaluating existing constituent loads in streams. Accurate estimates of constituent loads are needed to calibrate watershed and reservoir models for TMDL development. The best approach to estimate constituent loads is high frequency sampling, particularly during storm events, and mass integration of constituents passing a point in a stream. Most often, resources are limited and discrete water quality samples are collected on fixed intervals and sometimes supplemented with directed sampling during storm events. When resources are limited, mass integration is not an accurate means to determine constituent loads and other load estimation techniques such as regression models are used. The objective of this work was to determine a minimum number of water-quality samples needed to provide constituent concentration data adequate to estimate constituent loads at a large stream. Twenty sets of water quality samples with and without supplemental storm samples were randomly selected at various fixed intervals from a database at the Illinois River, northwest Arkansas. The random sets were used to estimate total phosphorus (TP) loads using regression models. The regression-based annual TP loads were compared to the integrated annual TP load estimated using all the data. At a minimum, monthly sampling plus supplemental storm samples (six samples per year) was needed to produce a root mean square error of less than 15%. Water quality samples should be collected at least semi-monthly (every 15 days) in studies less than two years if seasonal time factors are to be used in the regression models. Annual TP loads estimated from independently collected discrete water quality samples further demonstrated the utility of using regression models to estimate annual TP loads in this stream system.
Technology Transfer Automated Retrieval System (TEKTRAN)
In multivariate regression analysis of spectroscopy data, spectral preprocessing is often performed to reduce unwanted background information (offsets, sloped baselines) or accentuate absorption features in intrinsically overlapping bands. These procedures, also known as pretreatments, are commonly ...
Penalized Spline: a General Robust Trajectory Model for ZIYUAN-3 Satellite
NASA Astrophysics Data System (ADS)
Pan, H.; Zou, Z.
2016-06-01
Owing to the dynamic imaging system, the trajectory model plays a very important role in the geometric processing of high resolution satellite imagery. However, establishing a trajectory model is difficult when only discrete and noisy data are available. In this manuscript, we propose a general robust trajectory model, the penalized spline model, which can fit trajectory data well and smooth noise. The penalty parameter λ, which controls the trade-off between smoothness and fitting accuracy, can be estimated by generalized cross-validation. Five other trajectory models, including third-order polynomials, Chebyshev polynomials, linear interpolation, Lagrange interpolation and cubic spline, are compared with the penalized spline model. Both the sophisticated ephemeris and on-board ephemeris are used to compare the orbit models. The penalized spline model could smooth part of the noise, and accuracy would decrease as the orbit length increases. The band-to-band misregistration of ZiYuan-3 Dengfeng and Faizabad multispectral images is used to evaluate the proposed method. With the Dengfeng dataset, the third-order polynomials and Chebyshev approximation could not model the oscillation, and introduce misregistration of 0.57 pixels in the across-track direction and 0.33 pixels in the along-track direction. With the Faizabad dataset, the linear interpolation, Lagrange interpolation and cubic spline model suffer from noise, introducing larger misregistration than the approximation models. Experimental results suggest the penalized spline model could model the oscillation and smooth noise.
NASA Astrophysics Data System (ADS)
Yang, Jianhong; Yi, Cancan; Xu, Jinwu; Ma, Xianghong
2015-05-01
A new LIBS quantitative analysis method based on adaptive selection of analytical lines and a Relevance Vector Machine (RVM) regression model is proposed. First, a scheme for adaptively selecting analytical lines is put forward in order to overcome the drawback of high dependency on a priori knowledge. The candidate analytical lines are automatically selected based on the built-in characteristics of spectral lines, such as spectral intensity, wavelength and width at half height. The analytical lines that will be used as input variables of the regression model are determined adaptively according to the samples for both training and testing. Second, an LIBS quantitative analysis method based on RVM is presented. The intensities of analytical lines and the elemental concentrations of certified standard samples are used to train the RVM regression model. The predicted elemental concentrations are given in the form of a confidence interval of a probabilistic distribution, which is helpful for evaluating the uncertainty contained in the measured spectra. Chromium concentration analysis experiments on 23 certified standard high-alloy steel samples have been carried out. The multiple correlation coefficient of the prediction was up to 98.85%, and the average relative error of the prediction was 4.01%. The experimental results showed that the proposed LIBS quantitative analysis method achieved better prediction accuracy and better modeling robustness compared with the methods based on partial least squares regression, artificial neural network and standard support vector machine.
Lee, Soo Min; Lee, Jae-Won
2014-11-01
In this study, the optimal conditions for biomass torrefaction were determined by comparing the gain of energy content to the weight loss of biomass from the final products. Torrefaction experiments were performed at temperatures ranging from 220 to 280°C using 20-80min reaction times. Polynomial regression models ranging from the 1st to the 3rd order were used to determine a relationship between the severity factor (SF) and calorific value or weight loss. The intersection of two regression models for calorific value and weight loss was determined and assumed to be the optimized SF. The optimized SFs on each biomass ranged from 6.056 to 6.372. Optimized torrefaction conditions were determined at various reaction times of 15, 30, and 60min. The average optimized temperature was 248.55°C in the studied biomass when torrefaction was performed for 60min.
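The intersection step described above can be sketched as follows. This is a minimal illustration, assuming first-order (linear) fits and that both responses are expressed on a common percentage scale so that the two curves can cross. The function names `fit_line` and `intersection` and all data values are hypothetical and are not taken from the study.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def intersection(line1, line2):
    """Point (x, y) where two fitted lines a + b*x cross."""
    (a1, b1), (a2, b2) = line1, line2
    if b1 == b2:
        raise ValueError("parallel lines do not intersect")
    x = (a2 - a1) / (b1 - b2)
    return x, a1 + b1 * x


# Hypothetical measurements (illustration only, not data from the study).
severity = [5.0, 5.5, 6.0, 6.5, 7.0]          # severity factor (SF)
energy_gain = [8.0, 9.0, 10.0, 11.0, 12.0]    # gain in energy content, %
weight_loss = [4.0, 7.0, 10.0, 13.0, 16.0]    # weight loss, %

sf_opt, level = intersection(fit_line(severity, energy_gain),
                             fit_line(severity, weight_loss))
print(f"optimized SF = {sf_opt:.3f} at {level:.1f}%")
```

For the second- and third-order models mentioned in the abstract, the crossing point would have to be found with a root finder rather than this closed-form expression.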
NASA Astrophysics Data System (ADS)
Boucher, Thomas F.; Ozanne, Marie V.; Carmosino, Marco L.; Dyar, M. Darby; Mahadevan, Sridhar; Breves, Elly A.; Lepore, Kate H.; Clegg, Samuel M.
2015-05-01
The ChemCam instrument on the Mars Curiosity rover is generating thousands of LIBS spectra and bringing interest in this technique to public attention. The key to interpreting Mars or any other types of LIBS data are calibrations that relate laboratory standards to unknowns examined in other settings and enable predictions of chemical composition. Here, LIBS spectral data are analyzed using linear regression methods including partial least squares (PLS-1 and PLS-2), principal component regression (PCR), least absolute shrinkage and selection operator (lasso), elastic net, and linear support vector regression (SVR-Lin). These were compared against results from nonlinear regression methods including kernel principal component regression (K-PCR), polynomial kernel support vector regression (SVR-Py) and k-nearest neighbor (kNN) regression to discern the most effective models for interpreting chemical abundances from LIBS spectra of geological samples. The results were evaluated for 100 samples analyzed with 50 laser pulses at each of five locations averaged together. Wilcoxon signed-rank tests were employed to evaluate the statistical significance of differences among the nine models using their predicted residual sum of squares (PRESS) to make comparisons. For MgO, SiO2, Fe2O3, CaO, and MnO, the sparse models outperform all the others except for linear SVR, while for Na2O, K2O, TiO2, and P2O5, the sparse methods produce inferior results, likely because their emission lines in this energy range have lower transition probabilities. The strong performance of the sparse methods in this study suggests that use of dimensionality-reduction techniques as a preprocessing step may improve the performance of the linear models. Nonlinear methods tend to overfit the data and predict less accurately, while the linear methods proved to be more generalizable with better predictive performance. These results are attributed to the high dimensionality of the data (6144 channels
Korany, Mohamed A; Maher, Hadir M; Galal, Shereen M; Ragab, Marwa A A
2013-05-01
This manuscript discusses the application and the comparison between three statistical regression methods for handling data: parametric, nonparametric, and weighted regression (WR). These data were obtained from different chemometric methods applied to the high-performance liquid chromatography response data using the internal standard method. This was performed on a model drug Acyclovir which was analyzed in human plasma with the use of ganciclovir as internal standard. In vivo study was also performed. Derivative treatment of chromatographic response ratio data was followed by convolution of the resulting derivative curves using 8-points sin x i polynomials (discrete Fourier functions). This work studies and also compares the application of WR method and Theil's method, a nonparametric regression (NPR) method with the least squares parametric regression (LSPR) method, which is considered the de facto standard method used for regression. When the assumption of homoscedasticity is not met for analytical data, a simple and effective way to counteract the great influence of the high concentrations on the fitted regression line is to use WR method. WR was found to be superior to the method of LSPR as the former assumes that the y-direction error in the calibration curve will increase as x increases. Theil's NPR method was also found to be superior to the method of LSPR as the former assumes that errors could occur in both x- and y-directions and that might not be normally distributed. Most of the results showed a significant improvement in the precision and accuracy on applying WR and NPR methods relative to LSPR.
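The difference between least-squares and Theil-type fitting can be sketched in a few lines. The sketch below assumes the all-pairs variant of Theil's estimator (slope = median of pairwise slopes) and takes the intercept as the median of y - b*x, which is one common convention; the abstract does not say which variant the authors used. The calibration data, the outlier, and the function names are hypothetical.

```python
from statistics import median


def least_squares(xs, ys, weights=None):
    """(Weighted) least-squares line y = a + b*x; returns (a, b).

    With weights=None this is ordinary least squares. A weight vector
    such as 1/x**2 is one possible choice when the error grows with x.
    """
    if weights is None:
        weights = [1.0] * len(xs)
    sw = sum(weights)
    mx = sum(w * x for w, x in zip(weights, xs)) / sw
    my = sum(w * y for w, y in zip(weights, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(weights, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(weights, xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def theil(xs, ys):
    """Theil-type nonparametric line: median of all pairwise slopes."""
    n = len(xs)
    slopes = [(ys[j] - ys[i]) / (xs[j] - xs[i])
              for i in range(n) for j in range(i + 1, n)
              if xs[j] != xs[i]]
    b = median(slopes)
    a = median([y - b * x for x, y in zip(xs, ys)])
    return a, b


# Hypothetical calibration data: true line y = 2x, last point is an outlier.
conc = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
resp = [2.0, 4.0, 6.0, 8.0, 10.0, 30.0]

a_ls, b_ls = least_squares(conc, resp)
a_th, b_th = theil(conc, resp)
print("least squares:", a_ls, b_ls)
print("Theil:", a_th, b_th)
```

In this toy data set the single outlier pulls the least-squares slope well away from 2, while the median-based slope is unaffected.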
A method to determine the necessity for global signal regression in resting-state fMRI studies.
Chen, Gang; Chen, Guangyu; Xie, Chunming; Ward, B Douglas; Li, Wenjun; Antuono, Piero; Li, Shi-Jiang
2012-12-01
In resting-state functional MRI studies, the global signal (operationally defined as the global average of resting-state functional MRI time courses) is often considered a nuisance effect and commonly removed in preprocessing. This global signal regression method can introduce artifacts, such as false anticorrelated resting-state networks in functional connectivity analyses. Therefore, the efficacy of this technique as a correction tool remains questionable. In this article, we establish that the accuracy of the estimated global signal is determined by the level of global noise (i.e., non-neural noise that has a global effect on the resting-state functional MRI signal). When the global noise level is low, the global signal resembles the resting-state functional MRI time courses of the largest cluster, but not those of the global noise. Using real data, we demonstrate that the global signal is strongly correlated with the default mode network components and has biological significance. These results call into question whether or not global signal regression should be applied. We introduce a method to quantify global noise levels. We show that a criterion for global signal regression can be found based on the method. By using the criterion, one can determine whether to include or exclude the global signal regression in minimizing errors in functional connectivity measures.
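Global signal regression itself can be sketched as below: the global signal is the average of all time courses, and each time course is replaced by its residual after a least-squares fit on that average. This is only an illustration of the preprocessing step under discussion, not the noise-quantification method proposed by the authors; the three short "voxel" series and the function names are hypothetical.

```python
def global_signal(voxels):
    """Average across voxels at each time point."""
    n = len(voxels)
    return [sum(column) / n for column in zip(*voxels)]


def regress_out(series, regressor):
    """Residual of `series` after a least-squares fit on an intercept
    plus `regressor`."""
    n = len(series)
    mr = sum(regressor) / n
    ms = sum(series) / n
    srr = sum((r - mr) ** 2 for r in regressor)
    beta = sum((r - mr) * (s - ms) for r, s in zip(regressor, series)) / srr
    return [(s - ms) - beta * (r - mr) for r, s in zip(regressor, series)]


# Hypothetical time courses for three voxels (rows) at five time points.
voxels = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 1.0, 4.0, 3.0, 6.0],
    [0.0, 3.0, 2.0, 5.0, 4.0],
]

g = global_signal(voxels)
cleaned = [regress_out(v, g) for v in voxels]

# After the regression, the residuals sum to zero across voxels at every
# time point, so some voxels are forced to be negatively related to others.
for column in zip(*cleaned):
    print(round(sum(column), 12))
```

The zero-sum property shown by the final loop is the algebraic reason why this step can produce anticorrelations that were not present in the raw data.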
Goodenough, Anne E.; Hart, Adam G.; Stafford, Richard
2012-01-01
Despite recent papers on problems associated with full-model and stepwise regression, their use is still common throughout ecological and environmental disciplines. Alternative approaches, including generating multiple models and comparing them post-hoc using techniques such as Akaike's Information Criterion (AIC), are becoming more popular. However, these are problematic when there are numerous independent variables and interpretation is often difficult when competing models contain many different variables and combinations of variables. Here, we detail a new approach, REVS (Regression with Empirical Variable Selection), which uses all-subsets regression to quantify empirical support for every independent variable. A series of models is created; the first containing the variable with most empirical support, the second containing the first variable and the next most-supported, and so on. The comparatively small number of resultant models (n = the number of predictor variables) means that post-hoc comparison is comparatively quick and easy. When tested on a real dataset – habitat and offspring quality in the great tit (Parus major) – the optimal REVS model explained more variance (higher R2), was more parsimonious (lower AIC), and had greater significance (lower P values), than full, stepwise or all-subsets models; it also had higher predictive accuracy based on split-sample validation. Testing REVS on ten further datasets suggested that this is typical, with R2 values being higher than full or stepwise models (mean improvement = 31% and 7%, respectively). Results are ecologically intuitive as even when there are several competing models, they share a set of “core” variables and differ only in presence/absence of one or two additional variables. We conclude that REVS is useful for analysing complex datasets, including those in ecology and environmental disciplines. PMID:22479605
Regression to fuzziness method for estimation of remaining useful life in power plant components
NASA Astrophysics Data System (ADS)
Alamaniotis, Miltiadis; Grelle, Austin; Tsoukalas, Lefteri H.
2014-10-01
Mitigation of severe accidents in power plants requires the reliable operation of all systems and the on-time replacement of mechanical components. Therefore, the continuous surveillance of power systems is a crucial concern for the overall safety, cost control, and on-time maintenance of a power plant. In this paper a methodology called regression to fuzziness is presented that estimates the remaining useful life (RUL) of power plant components. The RUL is defined as the difference between the time that a measurement was taken and the estimated failure time of that component. The methodology aims to compensate for a potential lack of historical data by modeling an expert's operational experience and expertise applied to the system. It initially identifies critical degradation parameters and their associated value range. Once completed, the operator's experience is modeled through fuzzy sets which span the entire parameter range. This model is then synergistically used with linear regression and a component's failure point to estimate the RUL. The proposed methodology is tested on estimating the RUL of a turbine (the basic electrical generating component of a power plant) in three different cases. Results demonstrate the benefits of the methodology for components for which operational data is not readily available and emphasize the significance of the selection of fuzzy sets and the effect of knowledge representation on the predicted output. To verify the effectiveness of the methodology, it was benchmarked against the data-based simple linear regression model used for predictions, which was shown to perform equally to or worse than the presented methodology. Furthermore, methodology comparison highlighted the improvement in estimation offered by the adoption of appropriate fuzzy sets for parameter representation.
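The RUL definition above can be illustrated with the simple linear regression baseline that the abstract uses as a benchmark. The sketch fits a line to a degradation parameter over time, extrapolates it to an assumed failure level, and subtracts the time of the latest measurement. The fuzzy-set modeling of operator experience is deliberately left out, and the readings, the failure level, and the function names are hypothetical.

```python
def fit_line(ts, ys):
    """Ordinary least-squares fit y = intercept + slope*t."""
    n = len(ts)
    mt = sum(ts) / n
    my = sum(ys) / n
    slope = (sum((t - mt) * (y - my) for t, y in zip(ts, ys))
             / sum((t - mt) ** 2 for t in ts))
    return my - slope * mt, slope


def remaining_useful_life(ts, ys, failure_level):
    """RUL = estimated failure time minus time of the last measurement."""
    intercept, slope = fit_line(ts, ys)
    if slope <= 0:
        raise ValueError("no upward degradation trend; failure time undefined")
    t_fail = (failure_level - intercept) / slope
    return t_fail - ts[-1]


# Hypothetical readings of a degradation parameter (arbitrary units).
hours = [0.0, 1.0, 2.0, 3.0, 4.0]
wear = [1.0, 1.5, 2.0, 2.5, 3.0]

rul = remaining_useful_life(hours, wear, failure_level=5.0)
print(f"estimated RUL: {rul:.1f} hours")
```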
Tanaka, Kenichi; Tateoka, Kunihiko; Asanuma, Osamu; Kamo, Ken-ichi; Sato, Kaori; Takeda, Hiromitsu; Takagi, Masaru; Hareyama, Masato; Takada, Jun
2014-01-01
The post-implantation dosimetry for brachytherapy using Monte Carlo calculation by EGS5 code combined with the source strength regression was investigated with respect to its validity. In this method, the source strength for the EGS5 calculation was adjusted with the regression, so that the calculation would reproduce the dose monitored with the glass rod dosimeters (GRDs) on a water phantom. The experiments were performed, simulating the case where one of two 125I sources of Oncoseed 6711 was lacking strength by 4–48%. As a result, the calculation without regression was in agreement with the GRD measurement within 26–62%. In this case, the shortage in strength of a source was neglected. By the regression, in order to reflect the strength shortage, the agreement was improved up to 17–24%. This agreement was also comparable with accuracy of the dose calculation for single source geometry reported previously. These results suggest the validity of the dosimetry method proposed in this study. PMID:24449715
Quirós, Elia; Felicísimo, Angel M; Cuartero, Aurora
2009-01-01
This work proposes a new method to classify multi-spectral satellite images based on multivariate adaptive regression splines (MARS) and compares this classification system with the more common parallelepiped and maximum likelihood (ML) methods. We apply the classification methods to the land cover classification of a test zone located in southwestern Spain. The basis of the MARS method and its associated procedures are explained in detail, and the area under the ROC curve (AUC) is compared for the three methods. The results show that the MARS method provides better results than the parallelepiped method in all cases, and it provides better results than the maximum likelihood method in 13 cases out of 17. These results demonstrate that the MARS method can be used in isolation or in combination with other methods to improve the accuracy of soil cover classification. The improvement is statistically significant according to the Wilcoxon signed rank test. PMID:22291550
Kolasa-Wiecek, Alicja
2015-04-01
The energy sector in Poland is the source of 81% of greenhouse gas (GHG) emissions. Poland, among other European Union countries, occupies a leading position with regard to coal consumption. The Polish energy sector actively participates in efforts to reduce GHG emissions to the atmosphere, through a gradual decrease of the share of coal in the fuel mix and development of renewable energy sources. All evidence which completes the knowledge about issues related to GHG emissions is a valuable source of information. The article presents the results of modeling of GHG emissions which are generated by the energy sector in Poland. For a better understanding of the quantitative relationship between total consumption of primary energy and greenhouse gas emission, a multiple stepwise regression model was applied. The modeling results of CO2 emissions demonstrate a high relationship (0.97) with the hard coal consumption variable. The adjustment coefficient of the model to actual data is high and equal to 95%. The backward step regression model, in the case of CH4 emission, indicated the presence of hard coal (0.66), peat and fuel wood (0.34), solid waste fuels, as well as other sources (-0.64) as the most important variables. The adjusted coefficient is suitable and equals R2=0.90. For N2O emission modeling the obtained coefficient of determination is low and equal to 43%. A significant variable influencing the amount of N2O emission is the peat and wood fuel consumption.
Analyzing Association Mapping in Pedigree-Based GWAS Using a Penalized Multitrait Mixed Model.
Liu, Jin; Yang, Can; Shi, Xingjie; Li, Cong; Huang, Jian; Zhao, Hongyu; Ma, Shuangge
2016-07-01
Genome-wide association studies (GWAS) have led to the identification of many genetic variants associated with complex diseases in the past 10 years. Penalization methods, with significant numerical and statistical advantages, have been extensively adopted in analyzing GWAS. This study has been partly motivated by the analysis of Genetic Analysis Workshop (GAW) 18 data, which have two notable characteristics. First, the subjects are from a small number of pedigrees and hence related. Second, for each subject, multiple correlated traits have been measured. Most of the existing penalization methods assume independence between subjects and traits and can be suboptimal. There are a few methods in the literature based on mixed modeling that can accommodate correlations. However, they cannot fully accommodate the two types of correlations while conducting effective marker selection. In this study, we develop a penalized multitrait mixed modeling approach. It accommodates the two different types of correlations and includes several existing methods as special cases. Effective penalization is adopted for marker selection. Simulation demonstrates its satisfactory performance. The GAW 18 data are analyzed using the proposed method. PMID:27247027
Asghari, Mehdi Poursheikhali; Hayatshahi, Sayyed Hamed Sadat; Abdolmaleki, Parviz
2012-01-01
From both the structural and functional points of view, β-turns play important biological roles in proteins. In the present study, a novel two-stage hybrid procedure has been developed to identify β-turns in proteins. Binary logistic regression was used, for the first time, to select significant sequence parameters for the identification of β-turns through a re-substitution test procedure. The sequence parameters consisted of 80 amino acid positional occurrences and 20 amino acid percentages in sequence. Among these parameters, the most significant ones selected by the binary logistic regression model were the percentages of Gly and Ser and the occurrence of Asn in position i+2, respectively, in sequence. These significant parameters have the highest effect on the constitution of a β-turn sequence. A neural network model was then constructed and fed with the parameters selected by binary logistic regression to build a hybrid predictor. The networks have been trained and tested on a non-homologous dataset of 565 protein chains. Applying a nine-fold cross-validation test on the dataset, the network reached an overall accuracy (Qtotal) of 74, which is comparable with results of the other β-turn prediction methods. In conclusion, this study proves that the parameter selection ability of binary logistic regression together with the prediction capability of neural networks leads to the development of more precise models for identifying β-turns in proteins.
Anderson, Weston; Guikema, Seth; Zaitchik, Ben; Pan, William
2014-01-01
Obtaining accurate small area estimates of population is essential for policy and health planning but is often difficult in countries with limited data. In lieu of available population data, small area estimate models draw information from previous time periods or from similar areas. This study focuses on model-based methods for estimating population when no direct samples are available in the area of interest. To explore the efficacy of tree-based models for estimating population density, we compare six different model structures including Random Forest and Bayesian Additive Regression Trees. Results demonstrate that without information from prior time periods, non-parametric tree-based models produced more accurate predictions than did conventional regression methods. Improving estimates of population density in non-sampled areas is important for regions with incomplete census data and has implications for economic, health and development policies. PMID:24992657
Doran, Kara S.; Howd, Peter A.; Sallenger, Asbury H., Jr.
2015-01-01
Recent studies, and most of their predecessors, use tide gage data to quantify SL acceleration, ASL(t). In the current study, three techniques were used to calculate acceleration from tide gage data, and of those examined, it was determined that the two techniques based on sliding a regression window through the time series are more robust compared to the technique that fits a single quadratic form to the entire time series, particularly if there is temporal variation in the magnitude of the acceleration. The single-fit quadratic regression method has been the most commonly used technique in determining acceleration in tide gage data. The inability of the single-fit method to account for time-varying acceleration may explain some of the inconsistent findings between investigators. Properly quantifying ASL(t) from field measurements is of particular importance in evaluating numerical models of past, present, and future SLR resulting from anticipated climate change.
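The contrast between the single-fit and sliding-window approaches can be sketched as follows. In both cases the acceleration is taken as twice the quadratic coefficient of a least-squares fit y = c0 + c1*t + c2*t**2; the sliding-window version simply repeats the fit on consecutive sub-series. The synthetic series (no acceleration for the first 20 time steps, constant acceleration of 0.2 afterwards), the window width, and the function names are hypothetical and only serve to show the effect described in the abstract.

```python
def solve(A, b):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def acceleration(ts, ys):
    """Twice the quadratic coefficient of a least-squares quadratic fit."""
    t0 = sum(ts) / len(ts)                 # centre time for conditioning
    xs = [t - t0 for t in ts]
    s = [sum(x ** k for x in xs) for k in range(5)]
    A = [[s[i + j] for j in range(3)] for i in range(3)]
    b = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    c0, c1, c2 = solve(A, b)
    return 2.0 * c2


def sliding_acceleration(ts, ys, width):
    """(window-centre time, acceleration) for every window of `width` points."""
    out = []
    for i in range(len(ts) - width + 1):
        w_t, w_y = ts[i:i + width], ys[i:i + width]
        out.append((w_t[width // 2], acceleration(w_t, w_y)))
    return out


# Synthetic series: flat until t = 20, then constant acceleration 0.2.
years = list(range(41))
level = [0.1 * max(t - 20, 0) ** 2 for t in years]

single = acceleration(years, level)
windowed = sliding_acceleration(years, level, width=11)
print("single quadratic fit:", round(single, 6))
print("first window:", windowed[0], "last window:", windowed[-1])
```

On this synthetic series the single fit reports one intermediate value for the whole record, whereas the windowed fits recover the change from zero to 0.2.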
Penalized maximum-likelihood image reconstruction for lesion detection
NASA Astrophysics Data System (ADS)
Qi, Jinyi; Huesman, Ronald H.
2006-08-01
Detecting cancerous lesions is one major application in emission tomography. In this paper, we study penalized maximum-likelihood image reconstruction for this important clinical task. Compared to analytical reconstruction methods, statistical approaches can improve the image quality by accurately modelling the photon detection process and measurement noise in imaging systems. To explore the full potential of penalized maximum-likelihood image reconstruction for lesion detection, we derived simplified theoretical expressions that allow fast evaluation of the detectability of a random lesion. The theoretical results are used to design the regularization parameters to improve lesion detectability. We conducted computer-based Monte Carlo simulations to compare the proposed penalty function, conventional penalty function, and a penalty function for isotropic point spread function. The lesion detectability is measured by a channelized Hotelling observer. The results show that the proposed penalty function outperforms the other penalty functions for lesion detection. The relative improvement is dependent on the size of the lesion. However, we found that the penalty function optimized for a 5 mm lesion still outperforms the other two penalty functions for detecting a 14 mm lesion. Therefore, it is feasible to use the penalty function designed for small lesions in image reconstruction, because detection of large lesions is relatively easy.
NASA Technical Reports Server (NTRS)
Hopkins, Dale A.
1998-01-01
A key challenge in designing the new High Speed Civil Transport (HSCT) aircraft is determining a good match between the airframe and engine. Multidisciplinary design optimization can be used to solve the problem by adjusting parameters of both the engine and the airframe. Earlier, an example problem was presented of an HSCT aircraft with four mixed-flow turbofan engines and a baseline mission to carry 305 passengers 5000 nautical miles at a cruise speed of Mach 2.4. The problem was solved by coupling NASA Lewis Research Center's design optimization testbed (COMETBOARDS) with NASA Langley Research Center's Flight Optimization System (FLOPS). The computing time expended in solving the problem was substantial, and the instability of the FLOPS analyzer at certain design points caused difficulties. In an attempt to alleviate both of these limitations, we explored the use of two approximation concepts in the design optimization process. The two concepts, which are based on neural network and linear regression approximation, provide the reanalysis capability and design sensitivity analysis information required for the optimization process. The HSCT aircraft optimization problem was solved by using three alternate approaches; that is, the original FLOPS analyzer and two approximate (derived) analyzers. The approximate analyzers were calibrated and used in three different ranges of the design variables; narrow (interpolated), standard, and wide (extrapolated).
Cabi, Cemalettin; Sayman Muslubas, Isil Bahar; Aydin Oral, Ayse Yesim; Dastan, Metin
2014-01-01
AIM To compare the efficacies of patching and penalization therapies for the treatment of amblyopia patients. METHODS The records of 64 eyes of 50 patients 7 to 16y of age who had presented to our clinics with a diagnosis of amblyopia were evaluated retrospectively. Forty eyes of 26 patients who had received patching therapy and 24 eyes of 24 patients who had received penalization therapy were included in this study. The latencies and amplitudes of visual evoked potential (VEP) records and best corrected visual acuities (BCVA) of these two groups were compared before and six months after the treatment. RESULTS In both the patching and the penalization groups, the visual acuities increased significantly following the treatments (P<0.05). The latency measurements of the P100 wave obtained with the 1.0° and 15 arc min patterns decreased significantly in both groups following the 6-month treatment. However, the amplitude measurements increased (P<0.05). CONCLUSION The patching and the penalization methods, which are the main methods used in the treatment of amblyopia, were also effective over the age of 7y, which has been accepted as the critical age for the treatment of amblyopia. PMID:24967195
NASA Astrophysics Data System (ADS)
Saeidi, Omid; Torabi, Seyed Rahman; Ataei, Mohammad
2014-03-01
Rock mass classification systems are one of the most common ways of determining rock mass excavatability and related equipment assessment. However, the strength and weak points of such rating-based classifications have always been questionable. Such classification systems assign quantifiable values to predefined classified geotechnical parameters of rock mass. This causes particular ambiguities, leading to the misuse of such classifications in practical applications. Recently, intelligence system approaches such as artificial neural networks (ANNs) and neuro-fuzzy methods, along with multiple regression models, have been used successfully to overcome such uncertainties. The purpose of the present study is the construction of several models by using an adaptive neuro-fuzzy inference system (ANFIS) method with two data clustering approaches, including fuzzy c-means (FCM) clustering and subtractive clustering, an ANN and non-linear multiple regression to estimate the basic rock mass diggability index. A set of data from several case studies was used to obtain the real rock mass diggability index and compared to the predicted values by the constructed models. In conclusion, it was observed that ANFIS based on the FCM model shows higher accuracy and correlation with actual data compared to that of the ANN and multiple regression. As a result, one can use the assimilation of ANNs with fuzzy clustering-based models to construct such rigorous predictor tools.
Du, Hongying; Hu, Zhide; Bazzoli, Andrea; Zhang, Yang
2011-01-01
The epidermal growth factor receptor (EGFR) protein tyrosine kinase (PTK) is an important protein target for anti-tumor drug discovery. To identify potential EGFR inhibitors, we conducted a quantitative structure–activity relationship (QSAR) study on the inhibitory activity of a series of quinazoline derivatives against EGFR tyrosine kinase. Two 2D-QSAR models were developed based on the best multi-linear regression (BMLR) and grid-search assisted projection pursuit regression (GS-PPR) methods. The results demonstrate that the inhibitory activity of quinazoline derivatives is strongly correlated with their polarizability, activation energy, mass distribution, connectivity, and branching information. Although the present investigation focused on EGFR, the approach provides a general avenue in the structure-based drug development of different protein receptor inhibitors. PMID:21811593
NASA Astrophysics Data System (ADS)
Bitter, Christopher; Mulligan, Gordon F.; Dall'Erba, Sandy
2007-04-01
Hedonic house price models typically impose a constant price structure on housing characteristics throughout an entire market area. However, there is increasing evidence that the marginal prices of many important attributes vary over space, especially within large markets. In this paper, we compare two approaches to examine spatial heterogeneity in housing attribute prices within the Tucson, Arizona housing market: the spatial expansion method and geographically weighted regression (GWR). Our results provide strong evidence that the marginal price of key housing characteristics varies over space. GWR outperforms the spatial expansion method in terms of explanatory power and predictive accuracy.
Chiu, Chuan-Hung; Wen, Tzai-Hung; Chien, Lung-Chang; Yu, Hwa-Lung
2014-01-01
Understanding the spatial characteristics of dengue fever (DF) incidences is crucial for governmental agencies to implement effective disease control strategies. We investigated the associations between environmental and socioeconomic factors and DF geographic distribution, and proposed a probabilistic risk assessment approach that uses threshold-based quantile regression to identify the significant risk factors for DF transmission and estimate the spatial distribution of DF risk regarding full probability distributions. To interpret risk, return period was also included to characterize the frequency pattern of DF geographic occurrences. The study area included old Kaohsiung City and Fongshan District, two areas in Taiwan that have been affected by severe DF infections in recent decades. Results indicated that water-related facilities, including canals and ditches, and various types of residential area, as well as the interactions between them, were significant factors that elevated DF risk. By contrast, the increase of per capita income and its associated interactions with residential areas mitigated the DF risk in the study area. Nonlinear associations between these factors and DF risk were present in various quantiles, implying that water-related factors characterized the underlying spatial patterns of DF, and high-density residential areas indicated the potential for high DF incidence (e.g., clustered infections). The spatial distributions of DF risks were assessed in terms of three distinct map presentations: expected incidence rates, incidence rates in various return periods, and return periods at distinct incidence rates. These probability-based spatial risk maps exhibited distinct DF risks associated with environmental factors, expressed as various DF magnitudes and occurrence probabilities across Kaohsiung, and can serve as a reference for local governmental agencies.
Using LASSO Regression to Predict Rheumatoid Arthritis Treatment Efficacy
Odgers, David J.; Tellis, Natalie; Hall, Heather; Dumontier, Michel
2016-01-01
Rheumatoid arthritis (RA) accounts for one-fifth of the deaths due to arthritis, the leading cause of disability in the United States. Finding effective treatments for managing arthritis symptoms is a major challenge, since the mechanisms of autoimmune disorders are not fully understood and disease presentation differs for each patient. The American College of Rheumatology clinical guidelines for treatment consider the severity of the disease when deciding treatment, but do not include any prediction of drug efficacy. Using Electronic Health Records and Biomedical Linked Open Data (LOD), we demonstrate a method to classify patient outcomes using LASSO penalized regression. We show how Linked Data improves prediction and provides insight into how drug treatment regimes have different treatment outcomes. Applying classifiers like this to decision support in clinical applications could decrease time to successful disease management, lessening a physical and financial burden on patients individually and the healthcare system as a whole. PMID:27570666
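The way a LASSO penalty selects variables can be sketched with cyclic coordinate descent and soft-thresholding. The sketch below minimizes (1/(2n))·||y − Xβ||² + λ·||β||₁ for a squared-error loss with no intercept; this is a simplification, since the abstract describes a classification task and does not specify the loss or solver used. The tiny design matrix, the value of `lam`, and the function names are hypothetical.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(X, y, lam, n_iter=200):
    """LASSO by cyclic coordinate descent (squared-error loss, no intercept).

    Minimizes (1/(2n)) * sum((y - X beta)**2) + lam * sum(|beta_j|).
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)                      # y - X beta, with beta = 0 initially
    for _ in range(n_iter):
        for j in range(p):
            col = [row[j] for row in X]
            zj = sum(v * v for v in col) / n
            if zj == 0.0:
                continue
            # Correlation of column j with the partial residual.
            rho = sum(v * (r + v * beta[j]) for v, r in zip(col, resid)) / n
            new = soft_threshold(rho, lam) / zj
            delta = new - beta[j]
            if delta != 0.0:
                resid = [r - v * delta for v, r in zip(col, resid)]
                beta[j] = new
    return beta


# Hypothetical data: y = 3*x1 + 0.1*x2 on an orthogonal +/-1 design.
X = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
y = [3.1, 2.9, -2.9, -3.1]

beta = lasso_cd(X, y, lam=0.5)
print(beta)
```

With this penalty level the weak second predictor is set exactly to zero while the strong first predictor is kept but shrunk, which is the selection behavior that makes the penalty useful when many candidate features are available.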
Heinze, Georg; Ploner, Meinhard; Beyea, Jan
2013-12-20
In the logistic regression analysis of a small-sized, case-control study on Alzheimer's disease, some of the risk factors exhibited missing values, motivating the use of multiple imputation. Usually, Rubin's rules (RR) for combining point estimates and variances would then be used to estimate (symmetric) confidence intervals (CIs), on the assumption that the regression coefficients were distributed normally. Yet, rarely is this assumption tested, with or without transformation. In analyses of small, sparse, or nearly separated data sets, such symmetric CI may not be reliable. Thus, RR alternatives have been considered, for example, Bayesian sampling methods, but not yet those that combine profile likelihoods, particularly penalized profile likelihoods, which can remove first order biases and guarantee convergence of parameter estimation. To fill the gap, we consider the combination of penalized likelihood profiles (CLIP) by expressing them as posterior cumulative distribution functions (CDFs) obtained via a chi-squared approximation to the penalized likelihood ratio statistic. CDFs from multiple imputations can then easily be averaged into a combined CDF_c, allowing confidence limits for a parameter β at level 1 - α to be identified as those β* and β** that satisfy CDF_c(β*) = α/2 and CDF_c(β**) = 1 - α/2. We demonstrate that the CLIP method outperforms RR in analyzing both simulated data and data from our motivating example. CLIP can also be useful as a confirmatory tool, should it show that the simpler RR are adequate for extended analysis. We also compare the performance of CLIP to Bayesian sampling methods using Markov chain Monte Carlo. CLIP is available in the R package logistf. PMID:23873477
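The CDF-averaging step in this abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that each imputation's CDF is a normal CDF with a hypothetical mean and standard deviation; the actual method derives each CDF from a penalized likelihood profile, and the function names below are not taken from the logistf package.

```python
import math

def normal_cdf(x, mean, sd):
    """Stand-in for one imputation's posterior CDF (an assumption of this sketch)."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def combined_cdf(beta, params):
    """Average the per-imputation CDFs at beta to obtain the combined CDF."""
    return sum(normal_cdf(beta, m, s) for m, s in params) / len(params)

def quantile_of_combined(p, params, lo=-50.0, hi=50.0, tol=1e-10):
    """Solve combined_cdf(beta) = p by bisection; the combined CDF is monotone."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if combined_cdf(mid, params) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def clip_interval(params, alpha=0.05):
    """Confidence limits beta* and beta** at level 1 - alpha."""
    return (quantile_of_combined(alpha / 2.0, params),
            quantile_of_combined(1.0 - alpha / 2.0, params))

# Hypothetical (mean, sd) pairs, one per imputed data set.
imputations = [(0.8, 0.40), (1.0, 0.50), (1.3, 0.45)]
lower, upper = clip_interval(imputations)
print(f"95% limits: ({lower:.3f}, {upper:.3f})")
```

Because the limits are read off the averaged CDF rather than built from a pooled standard error, the resulting interval need not be symmetric around the point estimate.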
ERIC Educational Resources Information Center
Gilstrap, Donald L.
2013-01-01
In addition to qualitative methods presented in chaos and complexity theories in educational research, this article addresses quantitative methods that may show potential for future research studies. Although much in the social and behavioral sciences literature has focused on computer simulations, this article explores current chaos and…
Deng, Zhaohong; Choi, Kup-Sze; Jiang, Yizhang; Wang, Shitong
2014-12-01
Inductive transfer learning has attracted increasing attention for the training of an effective model in the target domain by leveraging the information in the source domain. However, most transfer learning methods are developed for a specific model, such as the commonly used support vector machine, which makes the methods applicable only to the adopted models. In this regard, the generalized hidden-mapping ridge regression (GHRR) method is introduced in order to train various types of classical intelligence models, including neural networks, fuzzy logical systems and kernel methods. Furthermore, the knowledge-leverage based transfer learning mechanism is integrated with GHRR to realize the inductive transfer learning method called transfer GHRR (TGHRR). Since the information from the induced knowledge is much clearer and more concise than that from the data in the source domain, it is more convenient to control and balance the similarity and difference of data distributions between the source and target domains. The proposed GHRR and TGHRR algorithms have been evaluated experimentally by performing regression and classification on synthetic and real world datasets. The results demonstrate that the performance of TGHRR is competitive with or even superior to existing state-of-the-art inductive transfer learning algorithms.
Xiao, Yongling; Abrahamowicz, Michal
2010-03-30
We propose two bootstrap-based methods to correct the standard errors (SEs) from Cox's model for within-cluster correlation of right-censored event times. The cluster-bootstrap method resamples, with replacement, only the clusters, whereas the two-step bootstrap method resamples (i) the clusters, and (ii) individuals within each selected cluster, with replacement. In simulations, we evaluate both methods and compare them with the existing robust variance estimator and the shared gamma frailty model, which are available in statistical software packages. We simulate clustered event time data, with latent cluster-level random effects, which are ignored in the conventional Cox's model. For cluster-level covariates, both proposed bootstrap methods yield accurate SEs, and type I error rates, and acceptable coverage rates, regardless of the true random effects distribution, and avoid serious variance under-estimation by conventional Cox-based standard errors. However, the two-step bootstrap method over-estimates the variance for individual-level covariates. We also apply the proposed bootstrap methods to obtain confidence bands around flexible estimates of time-dependent effects in a real-life analysis of cluster event times.
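The two resampling schemes described in this abstract can be sketched as follows. This is only an illustration of the resampling step under assumed data structures (a dictionary mapping a hypothetical cluster id to a list of `(time, event)` records); refitting the Cox model on each resample and computing the SEs are left out.

```python
import random

def cluster_bootstrap(clusters, rng):
    """Resample whole clusters with replacement; members stay as observed."""
    ids = list(clusters)
    drawn = [rng.choice(ids) for _ in ids]
    return [list(clusters[c]) for c in drawn]

def two_step_bootstrap(clusters, rng):
    """Resample clusters, then resample individuals within each drawn cluster."""
    ids = list(clusters)
    drawn = [rng.choice(ids) for _ in ids]
    return [[rng.choice(clusters[c]) for _ in clusters[c]] for c in drawn]

# Hypothetical clustered event-time data: (time, event indicator).
clusters = {
    "A": [(5.0, 1), (7.5, 0), (2.1, 1)],
    "B": [(3.3, 1), (9.0, 0)],
    "C": [(1.2, 1), (4.4, 1), (6.6, 0), (8.8, 1)],
}
rng = random.Random(42)  # fixed seed so the sketch is reproducible
sample_one = cluster_bootstrap(clusters, rng)
sample_two = two_step_bootstrap(clusters, rng)
print(len(sample_one), len(sample_two))
```

In practice the model would be refitted on many such resamples and the standard deviation of the resulting coefficient estimates used as the corrected SE.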
NASA Astrophysics Data System (ADS)
Borodachev, S. M.
2016-06-01
A simple derivation of the recursive least squares (RLS) method equations is given as a special case of Kalman filter estimation of a constant system state under changing observation conditions. A numerical example illustrates the application of RLS to the multicollinearity problem.
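A minimal sketch of the textbook RLS recursion, written as a Kalman filter update for a constant state, may help make the connection concrete. The variable names, the noiseless straight-line data, and the large initial covariance are assumptions of this sketch and are not taken from the paper.

```python
def rls_update(theta, P, x, y, r=1.0):
    """One RLS step: Kalman update for a constant state theta.

    theta: current parameter estimate, P: its covariance (symmetric),
    x: regressor vector for this observation, y: observed value,
    r: observation noise variance.
    """
    n = len(theta)
    Px = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]
    s = r + sum(x[i] * Px[i] for i in range(n))        # innovation variance
    K = [v / s for v in Px]                            # gain vector
    err = y - sum(x[i] * theta[i] for i in range(n))   # innovation
    theta = [theta[i] + K[i] * err for i in range(n)]
    P = [[P[i][j] - K[i] * Px[j] for j in range(n)] for i in range(n)]
    return theta, P

# Hypothetical data from y = 2 + 3 t, processed one observation at a time.
theta = [0.0, 0.0]
P = [[1e6, 0.0], [0.0, 1e6]]  # large prior covariance: weak prior on theta
for t in range(10):
    theta, P = rls_update(theta, P, [1.0, float(t)], 2.0 + 3.0 * t)
print(theta)
```

With no process noise the state never changes between observations, so the Kalman prediction step is the identity and only the update step remains, which is the RLS recursion.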
Huang, Lei
2015-09-30
To solve the problem in which the conventional ARMA modeling methods for gyro random noise require a large number of samples and converge slowly, an ARMA modeling method using a robust Kalman filtering is developed. The ARMA model parameters are employed as state arguments. Unknown time-varying estimators of observation noise are used to achieve the estimated mean and variance of the observation noise. Using the robust Kalman filtering, the ARMA model parameters are estimated accurately. The developed ARMA modeling method has the advantages of a rapid convergence and high accuracy. Thus, the required sample size is reduced. It can be applied to modeling applications for gyro random noise in which a fast and accurate ARMA modeling method is required.
ERIC Educational Resources Information Center
Coskuntuncel, Orkun
2013-01-01
The purpose of this study is two-fold; the first aim being to show the effect of outliers on the widely used least squares regression estimator in social sciences. The second aim is to compare the classical method of least squares with the robust M-estimator using the "determination of coefficient" (R[superscript 2]). For this purpose,…
Stride, P
2011-09-01
Robert Garrett emigrated from Scotland to Van Diemen's Land (now Tasmania) in 1822. Within a few months of arrival he was posted to the barbaric penal colony in Macquarie Harbour, known as Sarah Island. His descent into alcoholism, medical misadventure and premature death were related to his largely unsupported professional environment and were, in many respects, typical of those subjected to this experience.
Mohammed, Mohammed A; Manktelow, Bradley N; Hofer, Timothy P
2016-04-01
There is interest in deriving case-mix adjusted standardised mortality ratios so that comparisons between healthcare providers, such as hospitals, can be undertaken in the controversial belief that variability in standardised mortality ratios reflects quality of care. Typically standardised mortality ratios are derived using a fixed effects logistic regression model, without a hospital term in the model. This fails to account for the hierarchical structure of the data - patients nested within hospitals - and so a hierarchical logistic regression model is more appropriate. However, four methods have been advocated for deriving standardised mortality ratios from a hierarchical logistic regression model, but their agreement is not known and neither do we know which is to be preferred. We found significant differences between the four types of standardised mortality ratios because they reflect a range of underlying conceptual issues. The most subtle issue is the distinction between asking how an average patient fares in different hospitals versus how patients at a given hospital fare at an average hospital. Since the answers to these questions are not the same and since the choice between these two approaches is not obvious, the extent to which profiling hospitals on mortality can be undertaken safely and reliably, without resolving these methodological issues, remains questionable.
A Solution to Separation and Multicollinearity in Multiple Logistic Regression.
Shen, Jianzhao; Gao, Sujuan
2008-10-01
In dementia screening tests, item selection for shortening an existing screening test can be achieved using multiple logistic regression. However, maximum likelihood estimates for such logistic regression models often suffer from serious bias or may not exist at all because of separation and multicollinearity problems resulting from a large number of highly correlated items. Firth (1993, Biometrika, 80(1), 27-38) proposed a penalized likelihood estimator for generalized linear models, which was shown to reduce bias and the non-existence problems. Ridge regression has been used in logistic regression to stabilize the estimates in cases of multicollinearity. However, neither method solves the problem that the other addresses. In this paper, we propose a double penalized maximum likelihood estimator combining Firth's penalized likelihood equation with a ridge parameter. We present a simulation study evaluating the empirical performance of the double penalized likelihood estimator in small to moderate sample sizes. We demonstrate the proposed approach using current screening data from a community-based dementia study.
The Dutch penal law and homosexual conduct.
Salden, M
The history of changes in Dutch penal law regulating homosexual conduct since the 18th century is traced and their effects on homosexual behavior described. Changes in policies and practices regarding enforcement are reviewed. The article discusses the Dutch criminal code of 1886, the criminalization of homosexual contacts involving minors in 1911, the criminalization of male homosexuality from 1941 to 1945, and the progressive relaxation of the law since World War II, resulting in the decriminalization in 1971 of homosexual contacts involving minors and the draft in 1981 for a bill that would prohibit discrimination against homosexuals.
Boundary integral equation method calculations of surface regression effects in flame spreading
NASA Technical Reports Server (NTRS)
Altenkirch, R. A.; Rezayat, M.; Eichhorn, R.; Rizzo, F. J.
1982-01-01
A solid-phase conduction problem that is a modified version of one that has been treated previously in the literature and is applicable to flame spreading over a pyrolyzing fuel is solved using a boundary integral equation (BIE) method. Results are compared to surface temperature measurements that can be found in the literature. In addition, the heat conducted through the solid forward of the flame, the heat transfer responsible for sustaining the flame, is also computed in terms of the Peclet number based on a heated layer depth using the BIE method and approximate methods based on asymptotic expansions. Agreement between computed and experimental results is quite good as is agreement between the BIE and the approximate results.
Technology Transfer Automated Retrieval System (TEKTRAN)
The beard testing method for measuring cotton fiber length is based on the fibrogram theory. However, in the instrumental implementations, the engineering complexity alters the original fiber length distribution observed by the instrument. This causes challenges in obtaining the entire original le...
Using a Linear Regression Method to Detect Outliers in IRT Common Item Equating
ERIC Educational Resources Information Center
He, Yong; Cui, Zhongmin; Fang, Yu; Chen, Hanwei
2013-01-01
Common test items play an important role in equating alternate test forms under the common item nonequivalent groups design. When the item response theory (IRT) method is applied in equating, inconsistent item parameter estimates among common items can lead to large bias in equated scores. It is prudent to evaluate inconsistency in parameter…
Dai, Huanping; Micheyl, Christophe
2012-11-01
Psychophysical "reverse-correlation" methods allow researchers to gain insight into the perceptual representations and decision weighting strategies of individual subjects in perceptual tasks. Although these methods have gained momentum, until recently their development was limited to experiments involving only two response categories. Recently, two approaches for estimating decision weights in m-alternative experiments have been put forward. One approach extends the two-category correlation method to m > 2 alternatives; the second uses multinomial logistic regression (MLR). In this article, the relative merits of the two methods are discussed, and the issues of convergence and statistical efficiency of the methods are evaluated quantitatively using Monte Carlo simulations. The results indicate that, for a range of values of the number of trials, the estimated weighting patterns are closer to their asymptotic values for the correlation method than for the MLR method. Moreover, for the MLR method, weight estimates for different stimulus components can exhibit strong correlations, making the analysis and interpretation of measured weighting patterns less straightforward than for the correlation method. These and other advantages of the correlation method, which include computational simplicity and a close relationship to other well-established psychophysical reverse-correlation methods, make it an attractive tool to uncover decision strategies in m-alternative experiments.
[Analysis of selected changes in project the penal code].
Berent, Jarosław; Jurczyk, Agnieszka P; Szram, Stefan
2002-01-01
In this paper the authors analyse selected proposed changes in the draft amendments to the penal code. Special attention is given to the problem of the legality of the "comma" in art. 156 of the penal code. A review of court rulings on this matter is also presented.
Stojić, Andreja; Maletić, Dimitrije; Stanišić Stojić, Svetlana; Mijić, Zoran; Šoštarić, Andrej
2015-07-15
In this study, advanced multivariate methods were applied for VOC source apportionment and subsequent short-term forecast of industrial- and vehicle exhaust-related contributions in Belgrade urban area (Serbia). The VOC concentrations were measured using PTR-MS, together with inorganic gaseous pollutants (NOx, NO, NO2, SO2, and CO), PM10, and meteorological parameters. US EPA Positive Matrix Factorization and Unmix receptor models were applied to the obtained dataset both resolving six source profiles. For the purpose of forecasting industrial- and vehicle exhaust-related source contributions, different multivariate methods were employed in two separate cases, relying on meteorological data, and on meteorological data and concentrations of inorganic gaseous pollutants, respectively. The results indicate that Boosted Decision Trees and Multi-Layer Perceptrons were the best performing methods. According to the results, forecasting accuracy was high (lowest relative error of only 6%), in particular when the forecast was based on both meteorological parameters and concentrations of inorganic gaseous pollutants. PMID:25828408
NASA Astrophysics Data System (ADS)
Salonen, J. Sakari; Luoto, Miska; Alenius, Teija; Heikkilä, Maija; Seppä, Heikki; Telford, Richard J.; Birks, H. John B.
2014-03-01
We test and analyse a new calibration method, boosted regression trees (BRTs) in palaeoclimatic reconstructions based on fossil pollen assemblages. We apply BRTs to multiple Holocene and Lateglacial pollen sequences from northern Europe, and compare their performance with two commonly-used calibration methods: weighted averaging regression (WA) and the modern-analogue technique (MAT). Using these calibration methods and fossil pollen data, we present synthetic reconstructions of Holocene summer temperature, winter temperature, and water balance changes in northern Europe. Highly consistent trends are found for summer temperature, with a distinct Holocene thermal maximum at ca 8000-4000 cal. a BP, with a mean Tjja anomaly of ca +0.7 °C at 6 ka compared to 0.5 ka. We were unable to reconstruct reliably winter temperature or water balance, due to the confounding effects of summer temperature and the great between-reconstruction variability. We find BRTs to be a promising tool for quantitative reconstructions from palaeoenvironmental proxy data. BRTs show good performance in cross-validations compared with WA and MAT, can model a variety of taxon response types, find relevant predictors and incorporate interactions between predictors, and show some robustness with non-analogue fossil assemblages.
NASA Astrophysics Data System (ADS)
Dogulu, N.; López López, P.; Solomatine, D. P.; Weerts, A. H.; Shrestha, D. L.
2015-07-01
In operational hydrology, estimation of the predictive uncertainty of hydrological models used for flood modelling is essential for risk-based decision making for flood warning and emergency management. In the literature, there exists a variety of methods analysing and predicting uncertainty. However, studies devoted to comparing the performance of the methods in predicting uncertainty are limited. This paper focuses on the methods predicting model residual uncertainty that differ in methodological complexity: quantile regression (QR) and UNcertainty Estimation based on local Errors and Clustering (UNEEC). The comparison of the methods is aimed at investigating how well a simpler method using fewer input data performs over a more complex method with more predictors. We test these two methods on several catchments from the UK that vary in hydrological characteristics and the models used. Special attention is given to the methods' performance under different hydrological conditions. Furthermore, normality of model residuals in data clusters (identified by UNEEC) is analysed. It is found that basin lag time and forecast lead time have a large impact on the quantification of uncertainty and the presence of normality in model residuals' distribution. In general, it can be said that both methods give similar results. At the same time, it is also shown that the UNEEC method provides better performance than QR for small catchments with the changing hydrological dynamics, i.e. rapid response catchments. It is recommended that more case studies of catchments of distinct hydrologic behaviour, with diverse climatic conditions, and having various hydrological features, be considered.
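As a brief illustration of the idea behind quantile regression, the sketch below uses the standard check (pinball) loss to pick out lower and upper quantiles of a set of model residuals. The residual values are invented, and the sketch fits only a constant quantile with no predictors; it does not reproduce the QR or UNEEC configurations used in the study.

```python
def pinball_loss(residual, tau):
    """Check loss for quantile level tau (0 < tau < 1)."""
    return tau * residual if residual >= 0 else (tau - 1.0) * residual

def empirical_quantile_by_loss(values, tau):
    """Sample point that minimizes the total pinball loss at level tau."""
    return min(values, key=lambda q: sum(pinball_loss(v - q, tau) for v in values))

# Hypothetical residuals (observed minus simulated flow), arbitrary units.
residuals = [-3.0, -1.5, -0.5, 0.0, 0.4, 0.9, 1.2, 2.5, 4.0, 7.0]
lower = empirical_quantile_by_loss(residuals, 0.05)
upper = empirical_quantile_by_loss(residuals, 0.95)
print("uncertainty band:", lower, upper)
```

In a full quantile regression the constant is replaced by a function of predictors such as the simulated flow, and the same loss is minimized over that function's coefficients, once per quantile level.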
Tiedeman, C.R.; Kernodle, J.M.; McAda, D.P.
1998-01-01
This report documents the application of nonlinear-regression methods to a numerical model of ground-water flow in the Albuquerque Basin, New Mexico. In the Albuquerque Basin, ground water is the primary source for most water uses. Ground-water withdrawal has steadily increased since the 1940's, resulting in large declines in water levels in the Albuquerque area. A ground-water flow model was developed in 1994 and revised and updated in 1995 for the purpose of managing basin ground-water resources. In the work presented here, nonlinear-regression methods were applied to a modified version of the previous flow model. Goals of this work were to use regression methods to calibrate the model with each of six different configurations of the basin subsurface and to assess and compare optimal parameter estimates, model fit, and model error among the resulting calibrations. The Albuquerque Basin is one in a series of north trending structural basins within the Rio Grande Rift, a region of Cenozoic crustal extension. Mountains, uplifts, and fault zones bound the basin, and rock units within the basin include pre-Santa Fe Group deposits, Tertiary Santa Fe Group basin fill, and post-Santa Fe Group volcanics and sediments. The Santa Fe Group is greater than 14,000 feet (ft) thick in the central part of the basin. During deposition of the Santa Fe Group, crustal extension resulted in development of north trending normal faults with vertical displacements of as much as 30,000 ft. Ground-water flow in the Albuquerque Basin occurs primarily in the Santa Fe Group and post-Santa Fe Group deposits. Water flows between the ground-water system and surface-water bodies in the inner valley of the basin, where the Rio Grande, a network of interconnected canals and drains, and Cochiti Reservoir are located. Recharge to the ground-water flow system occurs as infiltration of precipitation along mountain fronts and infiltration of stream water along tributaries to the Rio Grande; subsurface
[Legal probation of juvenile offenders after release from penal reformative training].
Urbaniok, Frank; Rossegger, Astrid; Fegert, Jörg; Rubertus, Michael; Endrass, Jérôme
2007-01-01
Over recent years, there has been an increase in adolescent delinquency in Germany and Switzerland. In this context, the episodic character of the majority of adolescent delinquency is usually pointed out; however, numerous studies show high re-offending rates for released adolescents. The goal of this study is to examine the legal probation of juvenile delinquents after release from penal reformative training. In this study, the legal probation of adolescents committed to the AEA Uitikon, in the Canton of Zurich, between 1974 and 1986 was scrutinized by examining extracts from their criminal record as of 2003. The period of catamnesis was thus between 17 and 29 years. Overall, 71% of offenders reoffended, 29% with a violent or sexual offence. Bivariate logistic regression showed that the kind of offence committed had no influence on the probability of recidivism. If commitment to the AEA was due to a single offence (as opposed to serial offences), the risk of recidivism was reduced by 71% (OR=0.29). The results of the study show that young delinquents sentenced and committed to penal reformative training have a high recidivism risk. Furthermore, the results point out the importance of the evaluation of the offense-preventive efficacy of penal measures.
Penal measures for drug offences: perspectives from some Asian countries.
Jayasuriya, D C
1984-01-01
The importance of penal measures in the control of drugs has been recognized by various Asian countries during the last three centuries. The countries of the Asian region referred to in this article have legislation providing for different penal measures against drug offences. Severe punitive sanctions, including the death penalty, have been prescribed for serious drug offences by Iran (Islamic Republic of), Malaysia, the Philippines, Singapore, Sri Lanka and Thailand. Several countries in the region have made legal provisions for the compulsory treatment and rehabilitation of drug dependent persons. There is, however, a paucity of research studies on the efficiency of penal measures and approaches in drug control. Given the long tradition of punitive measures and the wide variety of penal approaches adopted to cope with drug-related problems, various Asian countries can provide interesting cases for criminological research on the effectiveness of penal measures in combating drug problems.
NASA Astrophysics Data System (ADS)
Dogulu, N.; López López, P.; Solomatine, D. P.; Weerts, A. H.; Shrestha, D. L.
2014-09-01
In operational hydrology, estimation of the predictive uncertainty of hydrological models used for flood modelling is essential for risk-based decision making for flood warning and emergency management. In the literature, there exists a variety of methods for analyzing and predicting uncertainty. However, case studies comparing the performance of these methods, most particularly predictive uncertainty methods, are limited. This paper focuses on two predictive uncertainty methods that differ in their methodological complexity: quantile regression (QR) and UNcertainty Estimation based on local Errors and Clustering (UNEEC). The aim is to identify possible advantages and disadvantages of these methods (both of which estimate residual uncertainty) based on their comparative performance. We test these two methods on several catchments from the UK that vary in their hydrological characteristics and models. Special attention is given to the errors for high flow/water level conditions. Furthermore, normality of model residuals is discussed in view of the clustering approach employed within the framework of the UNEEC method. It is found that basin lag time and forecast lead time have a great impact on the quantification of uncertainty (in the form of two quantiles) and the achievement of normality in the distribution of model residuals. In general, uncertainty analysis results from different case studies indicate that both methods give similar results. However, it is also shown that the UNEEC method provides better performance than QR for small catchments with changing hydrological dynamics, i.e. rapid response catchments. We recommend that more case studies of catchments from regions of distinct hydrologic behaviour, with diverse climatic conditions, and having various hydrological features be tested.
The cross politics of Ecuador's penal state.
Garces, Chris
2010-01-01
This essay examines inmate "crucifixion protests" in Ecuador's largest prison during 2003-04. It shows how the preventively incarcerated, of whom there are thousands, managed to effectively denounce their extralegal confinement by embodying the violence of the Christian crucifixion story. This form of protest, I argue, simultaneously clarified and obscured the multiple layers of sovereign power that pressed down on urban crime suspects, who found themselves persecuted and forsaken both outside and within the space of the prison. Police enacting zero-tolerance policies in urban neighborhoods are thus a key part of the penal state, as are the politically threatened family members of the indicted, the sensationalized local media, distrustful neighbors, prison guards, and incarcerated mafia. The essay shows how the politico-theological performance of self-crucifixion responded to these internested forms of sovereign violence, and was briefly effective. The inmates' cross intervention hence provides a window into the way sovereignty works in the Ecuadorean penal state, drawing out how incarceration trends and new urban security measures interlink, and produce an array of victims. PMID:20662147
Three penalized EM-type algorithms for PET image reconstruction.
Teng, Yueyang; Zhang, Tie
2012-06-01
Based on Bayes theory, Green introduced the maximum a posteriori (MAP) algorithm to obtain a smoothing reconstruction for positron emission tomography. This algorithm is flexible and convenient for most penalties, but it is hard to guarantee convergence. With the same goal, Fessler penalized a weighted least squares (WLS) estimator with a quadratic penalty and then solved it with the successive over-relaxation (SOR) algorithm; however, the algorithm was time-consuming and difficult to parallelize. Anderson proposed another WLS estimator for faster convergence, for which few regularization methods have been studied. For the three regularized estimators above, we develop three new expectation maximization (EM) type algorithms to solve them. Unlike MAP and SOR, the proposed algorithms yield update rules by minimizing auxiliary functions constructed from the previous iterations, which ensures that the cost functions decrease monotonically. Experimental results demonstrated the robustness and effectiveness of the proposed algorithms.
NASA Astrophysics Data System (ADS)
Baraldi, Piero; Di Maio, Francesco; Turati, Pietro; Zio, Enrico
2015-08-01
In this work, we propose a modification of the traditional Auto Associative Kernel Regression (AAKR) method which enhances the signal reconstruction robustness, i.e., the capability of reconstructing abnormal signals to the values expected in normal conditions. The modification is based on the definition of a new procedure for the computation of the similarity between the present measurements and the historical patterns used to perform the signal reconstructions. The underlying conjecture for this is that malfunctions causing variations of a small number of signals are more frequent than those causing variations of a large number of signals. The proposed method has been applied to real normal condition data collected in an industrial plant for energy production. Its performance has been verified considering synthetic and real malfunctions. The obtained results show an improvement in the early detection of abnormal conditions and the correct identification of the signals responsible for triggering the detection.
27 CFR 19.245 - Bonds and penal sums of bonds.
Code of Federal Regulations, 2010 CFR
2010-04-01
... 27 Alcohol, Tobacco Products and Firearms 1 2010-04-01 2010-04-01 false Bonds and penal sums of... Bonds and penal sums of bonds. The bonds, and the penal sums thereof, required by this subpart, are as follows: Penal Sum Type of bond Basis Minimum Maximum (a) Operations bond: (1) One plant bond—...
Zhang, L; Liu, X J
2016-01-01
With the rapid development of next-generation high-throughput sequencing technology, RNA-seq has become a standard and important technique for transcriptome analysis. For multi-sample RNA-seq data, the existing expression estimation methods usually deal with each RNA-seq sample separately, and ignore the fact that the read distributions are consistent across multiple samples. In the current study, we propose a structured sparse regression method, SSRSeq, to estimate isoform expression using multi-sample RNA-seq data. SSRSeq uses a nonparametric model to capture the general tendency of the non-uniform read distribution for all genes across multiple samples. Additionally, our method adds a structured sparse regularization, which not only incorporates the sparse specificity between a gene and its corresponding isoform expression levels, but also reduces the effects of noisy reads, especially for lowly expressed genes and isoforms. Four real datasets were used to evaluate our method on isoform expression estimation. Compared with other popular methods, SSRSeq reduced the variance between multiple samples, and produced more accurate isoform expression estimations, and thus more meaningful biological interpretations. PMID:27323111
Overlapping Group Logistic Regression with Applications to Genetic Pathway Selection
Zeng, Yaohui; Breheny, Patrick
2016-01-01
Discovering important genes that account for the phenotype of interest has long been a challenge in genome-wide expression analysis. Analyses such as gene set enrichment analysis (GSEA) that incorporate pathway information have become widespread in hypothesis testing, but pathway-based approaches have been largely absent from regression methods due to the challenges of dealing with overlapping pathways and the resulting lack of available software. The R package grpreg is widely used to fit group lasso and other group-penalized regression models; in this study, we develop an extension, grpregOverlap, to allow for overlapping group structure using a latent variable approach. We compare this approach to the ordinary lasso and to GSEA using both simulated and real data. We find that incorporation of prior pathway information can substantially improve the accuracy of gene expression classifiers, and we shed light on several ways in which hypothesis-testing approaches such as GSEA differ from regression approaches with respect to the analysis of pathway data. PMID:27679461
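A latent variable treatment of overlapping groups is commonly realized by duplicating each predictor once per group it belongs to, so that the groups no longer overlap in the expanded design matrix. The sketch below illustrates that expansion step only. The function names are hypothetical and are not the grpregOverlap API, and the group-penalized fit itself is left out.

```python
import numpy as np

def expand_overlapping_groups(X, groups):
    """Duplicate columns so each group owns its own copy of every member.

    X      : (n, p) design matrix
    groups : list of lists of column indices; a column may occur in several groups
    Returns the expanded matrix, the group label of each expanded column,
    and the original column index of each expanded column.
    """
    X = np.asarray(X)
    cols, labels = [], []
    for g, members in enumerate(groups):
        for j in members:
            cols.append(j)
            labels.append(g)
    return X[:, cols], np.array(labels), np.array(cols)

def collapse_coefficients(gamma, orig_cols, p):
    """Sum the latent coefficients back onto the original p predictors."""
    beta = np.zeros(p)
    np.add.at(beta, orig_cols, gamma)
    return beta
```

After fitting a non-overlapping group penalty on the expanded matrix, `collapse_coefficients` would recover one coefficient per original gene by summing its latent copies.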
Shrinkage Estimation of Varying Covariate Effects Based On Quantile Regression
Peng, Limin; Xu, Jinfeng; Kutner, Nancy
2013-01-01
Varying covariate effects often manifest meaningful heterogeneity in covariate-response associations. In this paper, we adopt a quantile regression model that assumes linearity at a continuous range of quantile levels as a tool to explore such data dynamics. The consideration of potential non-constancy of covariate effects necessitates a new perspective for variable selection, which, under the assumed quantile regression model, is to retain variables that have effects on all quantiles of interest as well as those that influence only part of quantiles considered. Current work on l1-penalized quantile regression either does not concern varying covariate effects or may not produce consistent variable selection in the presence of covariates with partial effects, a practical scenario of interest. In this work, we propose a shrinkage approach by adopting a novel uniform adaptive LASSO penalty. The new approach enjoys easy implementation without requiring smoothing. Moreover, it can consistently identify the true model (uniformly across quantiles) and achieve the oracle estimation efficiency. We further extend the proposed shrinkage method to the case where responses are subject to random right censoring. Numerical studies confirm the theoretical results and support the utility of our proposals. PMID:25332515
A General Semiparametric Hazards Regression Model: Efficient Estimation and Structure Selection
Tong, Xingwei; Zhu, Liang; Leng, Chenlei; Leisenring, Wendy; Robison, Leslie L.
2014-01-01
We consider a general semiparametric hazards regression model that encompasses Cox’s proportional hazards model and the accelerated failure time model for survival analysis. To overcome the nonexistence of the maximum likelihood, we derive a kernel-smoothed profile likelihood function, and prove that the resulting estimates of the regression parameters are consistent and achieve semiparametric efficiency. In addition, we develop penalized structure selection techniques to determine which covariates constitute the accelerated failure time model and which covariates constitute the proportional hazards model. The proposed method is able to estimate the model structure consistently and model parameters efficiently. Furthermore, variance estimation is straightforward. The proposed estimation performs well in simulation studies and is applied to the analysis of a real data set. PMID:23824784
Risser, Dennis W.; Thompson, Ronald E.; Stuckey, Marla H.
2008-01-01
A method was developed for making estimates of long-term, mean annual ground-water recharge from streamflow data at 80 streamflow-gaging stations in Pennsylvania. The method relates mean annual base-flow yield derived from the streamflow data (as a proxy for recharge) to the climatic, geologic, hydrologic, and physiographic characteristics of the basins (basin characteristics) by use of a regression equation. Base-flow yield is the base flow of a stream divided by the drainage area of the basin, expressed in inches of water basinwide. Mean annual base-flow yield was computed for the period of available streamflow record at continuous streamflow-gaging stations by use of the computer program PART, which separates base flow from direct runoff on the streamflow hydrograph. Base flow provides a reasonable estimate of recharge for basins where streamflow is mostly unaffected by upstream regulation, diversion, or mining. Twenty-eight basin characteristics were included in the exploratory regression analysis as possible predictors of base-flow yield. Basin characteristics found to be statistically significant predictors of mean annual base-flow yield during 1971-2000 at the 95-percent confidence level were (1) mean annual precipitation, (2) average maximum daily temperature, (3) percentage of sand in the soil, (4) percentage of carbonate bedrock in the basin, and (5) stream channel slope. The equation for predicting recharge was developed using ordinary least-squares regression. The standard error of prediction for the equation on log-transformed data was 9.7 percent, and the coefficient of determination was 0.80. The equation can be used to predict long-term, mean annual recharge rates for ungaged basins, providing that the explanatory basin characteristics can be determined and that the underlying assumption is accepted that base-flow yield derived from PART is a reasonable estimate of ground-water recharge rates. For example, application of the equation for 370
Yan, Qi; Weeks, Daniel E; Celedón, Juan C; Tiwari, Hemant K; Li, Bingshan; Wang, Xiaojing; Lin, Wan-Yu; Lou, Xiang-Yang; Gao, Guimin; Chen, Wei; Liu, Nianjun
2015-12-01
The recent development of sequencing technology allows identification of association between the whole spectrum of genetic variants and complex diseases. Over the past few years, a number of association tests for rare variants have been developed. Jointly testing for association between genetic variants and multiple correlated phenotypes may increase the power to detect causal genes in family-based studies, but familial correlation needs to be appropriately handled to avoid an inflated type I error rate. Here we propose a novel approach for multivariate family data using kernel machine regression (denoted as MF-KM) that is based on a linear mixed-model framework and can be applied to a large range of studies with different types of traits. In our simulation studies, the usual kernel machine test has inflated type I error rates when applied directly to familial data, while our proposed MF-KM method preserves the expected type I error rates. Moreover, the MF-KM method has increased power compared to methods that either analyze each phenotype separately while considering family structure or use only unrelated founders from the families. Finally, we illustrate our proposed methodology by analyzing whole-genome genotyping data from a lung function study.
Wang, Huifang; Xiao, Bo; Wang, Mingyu; Shao, Ming'an
2013-01-01
Soil water retention parameters are critical to quantify flow and solute transport in vadose zone, while the presence of rock fragments remarkably increases their variability. Therefore a novel method for determining water retention parameters of soil-gravel mixtures is required. The procedure to generate such a model is based firstly on the determination of the quantitative relationship between the content of rock fragments and the effective saturation of soil-gravel mixtures, and then on the integration of this relationship with former analytical equations of water retention curves (WRCs). In order to find such relationships, laboratory experiments were conducted to determine WRCs of soil-gravel mixtures obtained with a clay loam soil mixed with shale clasts or pebbles in three size groups with various gravel contents. Data showed that the effective saturation of the soil-gravel mixtures with the same kind of gravels within one size group had a linear relation with gravel contents, and had a power relation with the bulk density of samples at any pressure head. Revised formulas for water retention properties of the soil-gravel mixtures are proposed to establish the water retention curved surface models of the power-linear functions and power functions. The analysis of the parameters obtained by regression and validation of the empirical models showed that they were acceptable by using either the measured data of separate gravel size group or those of all the three gravel size groups having a large size range. Furthermore, the regression parameters of the curved surfaces for the soil-gravel mixtures with a large range of gravel content could be determined from the water retention data of the soil-gravel mixtures with two representative gravel contents or bulk densities. Such revised water retention models are potentially applicable in regional or large scale field investigations of significantly heterogeneous media, where various gravel sizes and different gravel
ERIC Educational Resources Information Center
Matson, Johnny L.; Kozlowski, Alison M.
2010-01-01
Autistic regression is one of the many mysteries in the developmental course of autism and pervasive developmental disorders not otherwise specified (PDD-NOS). Various definitions of this phenomenon have been used, further clouding the study of the topic. Despite this problem, some efforts at establishing prevalence have been made. The purpose of…
A Guide to Assistance In Penal and Correctional Institutions
ERIC Educational Resources Information Center
Walker, Bailus; Gordon, Theodore
1973-01-01
Lists the more significant federal assistance programs relating to penal and correctional reform which may serve as a guide for environmental health specialists who are beginning to assume much broader responsibilities in institutional environmental quality. (JR)
Xu, A; Zhang, Y; Ran, T; Liu, H; Lu, S; Xu, J; Xiong, X; Jiang, Y; Lu, T; Chen, Y
2015-01-01
Bruton's tyrosine kinase (BTK) plays a crucial role in B-cell activation and development, and has emerged as a new molecular target for the treatment of autoimmune diseases and B-cell malignancies. In this study, two- and three-dimensional quantitative structure-activity relationship (2D and 3D-QSAR) analyses were performed on a series of pyridine and pyrimidine-based BTK inhibitors by means of genetic algorithm optimized multivariate adaptive regression spline (GA-MARS) and comparative molecular similarity index analysis (CoMSIA) methods. Here, we propose a modified MARS algorithm to develop 2D-QSAR models. The top ranked models showed satisfactory statistical results (2D-QSAR: Q² = 0.884, r² = 0.929, r²pred = 0.878; 3D-QSAR: q² = 0.616, r² = 0.987, r²pred = 0.905). Key descriptors selected by 2D-QSAR were in good agreement with the conclusions of 3D-QSAR, and the 3D-CoMSIA contour maps facilitated interpretation of the structure-activity relationship. A new molecular database was generated by molecular fragment replacement (MFR) and further evaluated with GA-MARS and CoMSIA prediction. Twenty-five pyridine and pyrimidine derivatives as novel potential BTK inhibitors were finally selected for further study. These results also demonstrated that our method can be a very efficient tool for the discovery of novel potent BTK inhibitors.
Orthogonal Regression: A Teaching Perspective
ERIC Educational Resources Information Center
Carr, James R.
2012-01-01
A well-known approach to linear least squares regression is that which involves minimizing the sum of squared orthogonal projections of data points onto the best fit line. This form of regression is known as orthogonal regression, and the linear model that it yields is known as the major axis. A similar method, reduced major axis regression, is…
NASA Astrophysics Data System (ADS)
Thomas, G. E.; Bardeen, C.; Benze, S.
2014-12-01
Simulations of Polar Mesospheric Cloud (PMC) brightness and ice water content (IWC) are used to develop a simple robust method for IWC retrieval from UV satellite observations. We compare model simulations of IWC with retrievals from the UV Cloud Imaging and Particle Size (CIPS) experiment on board the satellite mission Aeronomy for Ice in the Mesosphere (AIM). This instrument remotely senses scattered brightness related to the vertically-integrated ice content. Simulations from the Whole Atmosphere Community Climate Model (WACCM), a chemistry climate model, are combined with a sectional microphysics model based on the Community Aerosol and Radiation Model for Atmospheres (CARMA). The model calculates high-resolution three-dimensional size distributions of ice particles. The internal variability is due to geographic and temporal variation of temperature and dynamics, water vapor, and meteoric dust. We examine all simulations from a single model day (we chose northern summer solstice), which contains several thousand model clouds. Accurate vertical integrations of the albedo and IWC are obtained. The ice size distributions are thus based on physical principles, rather than the artificial analytic distributions that are often used in retrieval algorithms from observations. Treating the model clouds as noise-free data, we apply the CIPS algorithm to retrieve cloud particle size and IWC. The inherent "errors" in the retrievals are thus estimated. The linear dependence of IWC on albedo makes possible a method to derive IWC, called the Albedo-Ice regression method, or AIR. This method potentially unifies the variety of data from various UV experiments, with the advantages of (1) removing scattering-angle bias from cloud brightness measurements, (2) providing a physically-useful parameter (IWC), (3) deriving IWC even for faint clouds of small average particle sizes, and (4) estimating the statistical uncertainty as a random error, which bypasses the need to derive particle size.
Cao, M H; Adeola, O
2016-02-01
The energy values of poultry byproduct meal (PBM) and animal-vegetable oil blend (A-V blend) were determined in 2 experiments with 288 broiler chickens from d 19 to 25 post hatching. The birds were fed a starter diet from d 0 to 19 post hatching. In each experiment, 144 birds were grouped by weight into 8 replicates of cages with 6 birds per cage. There were 3 diets in each experiment consisting of one reference diet (RD) and 2 test diets (TD). The TD contained 2 levels of PBM (Exp. 1) or A-V blend (Exp. 2) that replaced the energy sources in the RD at 50 or 100 g/kg (Exp. 1) or 40 or 80 g/kg (Exp. 2) in such a way that the same ratios were maintained for energy ingredients across experimental diets. The ileal digestible energy (IDE), ME, and MEn of PBM and A-V blend were determined by the regression method. Dry matter of PBM and A-V blend were 984 and 999 g/kg; the gross energies were 5,284 and 9,604 kcal/kg of DM, respectively. Addition of PBM to the RD in Exp. 1 linearly decreased (P < 0.05) DM, ileal and total tract of DM, energy and nitrogen digestibilities and utilization. In Exp. 2, addition of A-V blend to the RD linearly increased (P < 0.001) ileal digestibilities and total tract utilization of DM, energy and nitrogen as well as IDE, ME, and MEn. Regressions of PBM-associated IDE, ME, or MEn intake in kcal against PBM intake were: IDE = 3,537x + 4.953, r² = 0.97; ME = 3,805x + 1.279, r² = 0.97; MEn = 3,278x + 0.164, r² = 0.90; and for A-V blend as follows: IDE = 10,616x + 7.350, r² = 0.96; ME = 10,121x + 0.447, r² = 0.99; MEn = 10,124x + 2.425, r² = 0.99. These data indicate the respective IDE, ME, MEn values (kcal/kg of DM) of PBM evaluated to be 3,537, 3,805, and 3,278, and A-V blend evaluated to be 10,616, 10,121, and 10,124. PMID:26628339
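The regression method reported above reads the energy value of the test ingredient off the slope of a straight-line fit of ingredient-associated energy intake (kcal) against ingredient intake. The sketch below illustrates that step with ordinary least squares. The intake values are hypothetical, chosen only for illustration, and the response is generated from the reported PBM equation IDE = 3,537x + 4.953 rather than from the study's raw data; treating x as kg of DM is an assumption.

```python
import numpy as np

def energy_value_by_regression(ingredient_intake, associated_energy_kcal):
    """Fit energy = slope * intake + intercept; the slope is the energy value."""
    slope, intercept = np.polyfit(ingredient_intake, associated_energy_kcal, deg=1)
    return slope, intercept

# Hypothetical intakes (assumed to be kg of DM), not data from the study
x = np.array([0.00, 0.05, 0.10, 0.15])
# Response generated from the reported PBM regression for IDE
y = 3537.0 * x + 4.953

slope, intercept = energy_value_by_regression(x, y)
print(f"estimated energy value: {slope:.0f} kcal/kg of DM")
```

With real data the points would scatter around the line, and the reported r² values describe how tightly they do so.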
Li, J.; Gray, B.R.; Bates, D.M.
2008-01-01
Partitioning the variance of a response by design levels is challenging for binomial and other discrete outcomes. Goldstein (2003) proposed four definitions for variance partitioning coefficients (VPC) under a two-level logistic regression model. In this study, we explicitly derived formulae for the multi-level logistic regression model and subsequently studied the distributional properties of the calculated VPCs. Using simulations and a vegetation dataset, we demonstrated associations between different VPC definitions, the importance of methods for estimating VPCs (by comparing VPCs obtained using Laplace and penalized quasi-likelihood methods), and bivariate dependence between VPCs calculated at different levels. Such an empirical study lends immediate support to wider applications of VPC in scientific data analysis.
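One widely used VPC definition for logistic models is the latent variable form, which treats the level-1 residual as standard logistic with variance π²/3. The sketch below computes that quantity for one or more random-intercept variances. It is an illustration only: the abstract does not spell out its four definitions or its multi-level formulae, so the extension to several levels shown here is an assumption, and the function name is hypothetical.

```python
import math

def vpc_latent(random_intercept_variances):
    """Latent-variable VPC for each higher level of a logistic model.

    random_intercept_variances : variances of the random intercepts,
    one per higher level (assumed independent across levels).
    """
    level1 = math.pi ** 2 / 3.0  # variance of the standard logistic distribution
    total = level1 + sum(random_intercept_variances)
    return [v / total for v in random_intercept_variances]

# Two-level example: a cluster variance equal to pi^2/3 gives a VPC of one half
print(vpc_latent([math.pi ** 2 / 3.0]))
```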
NASA Astrophysics Data System (ADS)
Vozinaki, Anthi Eirini K.; Karatzas, George P.; Sibetheros, Ioannis A.; Varouchakis, Emmanouil A.
2014-05-01
Damage curves are the most significant component of the flood loss estimation models. Their development is quite complex. Two types of damage curves exist, historical and synthetic curves. Historical curves are developed from historical loss data from actual flood events. However, due to the scarcity of historical data, synthetic damage curves can be alternatively developed. Synthetic curves rely on the analysis of expected damage under certain hypothetical flooding conditions. A synthetic approach was developed and presented in this work for the development of damage curves, which are subsequently used as the basic input to a flood loss estimation model. A questionnaire-based survey took place among practicing and research agronomists, in order to generate rural loss data based on the responders' loss estimates, for several flood condition scenarios. In addition, a similar questionnaire-based survey took place among building experts, i.e. civil engineers and architects, in order to generate loss data for the urban sector. By answering the questionnaire, the experts were in essence expressing their opinion on how damage to various crop types or building types is related to a range of values of flood inundation parameters, such as floodwater depth and velocity. However, the loss data compiled from the completed questionnaires were not sufficient for the construction of workable damage curves; to overcome this problem, a Weighted Monte Carlo method was implemented, in order to generate extra synthetic datasets with statistical properties identical to those of the questionnaire-based data. The data generated by the Weighted Monte Carlo method were processed via Logistic Regression techniques in order to develop accurate logistic damage curves for the rural and the urban sectors. A Python-based code was developed, which combines the Weighted Monte Carlo method and the Logistic Regression analysis into a single code (WMCLR Python code). Each WMCLR code execution
Multivariate Regression with Calibration*
Liu, Han; Wang, Lie; Zhao, Tuo
2014-01-01
We propose a new method named calibrated multivariate regression (CMR) for fitting high dimensional multivariate regression models. Compared to existing methods, CMR calibrates the regularization for each regression task with respect to its noise level so that it is simultaneously tuning insensitive and achieves an improved finite-sample performance. Computationally, we develop an efficient smoothed proximal gradient algorithm which has a worst-case iteration complexity O(1/ε), where ε is a pre-specified numerical accuracy. Theoretically, we prove that CMR achieves the optimal rate of convergence in parameter estimation. We illustrate the usefulness of CMR by thorough numerical simulations and show that CMR consistently outperforms other high dimensional multivariate regression methods. We also apply CMR on a brain activity prediction problem and find that CMR is as competitive as the handcrafted model created by human experts. PMID:25620861
Survival prediction and gene identification with penalized global AUC maximization.
Liu, Zhenqiu; Gartenhaus, Ronald B; Chen, Xue-Wen; Howell, Charles D; Tan, Ming
2009-12-01
Identifying genes (biomarkers) and predicting the clinical outcomes with censored survival times are important for cancer prognosis and pathogenesis. In this article, we propose a novel method with L(1) penalized global AUC summary maximization (L(1)GAUCS). The L(1)GAUCS method is developed for simultaneous gene (feature) selection and survival prediction. L(1) penalty shrinks coefficients and produces some coefficients that are exactly zero, and therefore selects a small subset of genes (features). It is a well-known fact that many genes are highly correlated in gene expression data and the highly correlated genes may function together. We, therefore, define a correlation measure to identify those genes such that their expression level may be low but they are highly correlated with the downstream highly expressed genes selected with L(1)GAUCS. Partial pathways associated with the correlated genes are identified with DAVID (http://david.abcc.ncifcrf.gov/). Experimental results with chemotherapy and gene expression data demonstrate that the proposed procedures can be used for identifying important genes and pathways that are related to time to death due to cancer and for building a parsimonious model for predicting the survival of future patients. Software is available upon request from the first author.
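The statement that an L1 penalty shrinks coefficients and sets some exactly to zero can be seen in the soft-thresholding operator, which solves the one-dimensional problem min over b of ½(b − z)² + λ|b|. The sketch below shows that operator for a squared-error loss. It is not the AUC-based objective of L(1)GAUCS and is included only to illustrate how exact zeros arise.

```python
import numpy as np

def soft_threshold(z, lam):
    """Closed-form minimizer of 0.5 * (b - z)**2 + lam * |b|, elementwise."""
    z = np.asarray(z, dtype=float)
    # Shrink every value toward zero by lam; values within lam of zero become 0
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

coefs = soft_threshold([3.0, -0.5, 0.2, -2.0], lam=1.0)
print(coefs)
```

Coefficients whose magnitude is below `lam` are removed from the model entirely, which is the feature-selection effect the abstract describes.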
Ferragina, A; de los Campos, G; Vazquez, A I; Cecchinato, A; Bittante, G
2015-11-01
The aim of this study was to assess the performance of Bayesian models commonly used for genomic selection to predict "difficult-to-predict" dairy traits, such as milk fatty acid (FA) expressed as percentage of total fatty acids, and technological properties, such as fresh cheese yield and protein recovery, using Fourier-transform infrared (FTIR) spectral data. Our main hypothesis was that Bayesian models that can estimate shrinkage and perform variable selection may improve our ability to predict FA traits and technological traits above and beyond what can be achieved using the current calibration models (e.g., partial least squares, PLS). To this end, we assessed a series of Bayesian methods and compared their prediction performance with that of PLS. The comparison between models was done using the same sets of data (i.e., same samples, same variability, same spectral treatment) for each trait. Data consisted of 1,264 individual milk samples collected from Brown Swiss cows for which gas chromatographic FA composition, milk coagulation properties, and cheese-yield traits were available. For each sample, 2 spectra in the infrared region from 5,011 to 925 cm(-1) were available and averaged before data analysis. Three Bayesian models: Bayesian ridge regression (Bayes RR), Bayes A, and Bayes B, and 2 reference models: PLS and modified PLS (MPLS) procedures, were used to calibrate equations for each of the traits. The Bayesian models used were implemented in the R package BGLR (http://cran.r-project.org/web/packages/BGLR/index.html), whereas the PLS and MPLS were those implemented in the WinISI II software (Infrasoft International LLC, State College, PA). Prediction accuracy was estimated for each trait and model using 25 replicates of a training-testing validation procedure. Compared with PLS, which is currently the most widely used calibration method, MPLS and the 3 Bayesian methods showed significantly greater prediction accuracy. Accuracy increased in moving from
pPXF: Penalized Pixel-Fitting stellar kinematics extraction
NASA Astrophysics Data System (ADS)
Cappellari, Michele
2012-10-01
pPXF is an IDL (and free GDL or FL) program which extracts the stellar kinematics or stellar population from absorption-line spectra of galaxies using the Penalized Pixel-Fitting method (pPXF) developed by Cappellari & Emsellem (2004, PASP, 116, 138). Additional features implemented in the pPXF routine include:
Optimal template: Fitted together with the kinematics to minimize template-mismatch errors. Also useful to extract gas kinematics or derive emission-corrected line-strength indices. One can use synthetic templates to study the stellar population of galaxies via "Full Spectral Fitting" instead of using traditional line-strengths.
Regularization of template weights: To reduce the noise in the recovery of the stellar population parameters and attach a physical meaning to the output weights assigned to the templates in terms of the star formation history (SFH) or metallicity distribution of an individual galaxy.
Iterative sigma clipping: To clean the spectra from residual bad pixels or cosmic rays.
Additive/multiplicative polynomials: To correct low-frequency continuum variations. Also useful for calibration purposes.
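The iterative sigma clipping listed above follows a generic pattern: fit, flag pixels whose residuals exceed a multiple of the residual scatter, refit on the remaining pixels, and stop when the set of good pixels no longer changes. The sketch below illustrates that loop with a simple polynomial standing in for the spectral model. It is not pPXF's implementation, and the function name, the 3-sigma default, and the synthetic test spectrum are all assumptions made for illustration.

```python
import numpy as np

def sigma_clip_fit(x, y, degree=1, nsigma=3.0, max_iter=10):
    """Polynomial fit with iterative rejection of outlying pixels."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    good = np.ones(x.size, dtype=bool)
    for _ in range(max_iter):
        # Fit only the pixels currently considered good
        coeffs = np.polyfit(x[good], y[good], degree)
        resid = y - np.polyval(coeffs, x)
        # Scatter is estimated from the good pixels only
        sigma = np.std(resid[good])
        new_good = np.abs(resid) <= nsigma * sigma
        if np.array_equal(new_good, good):
            break  # converged: the mask did not change
        good = new_good
    return coeffs, good

# Synthetic example: a line with small ripples and one "cosmic ray" at index 10
x = np.arange(20.0)
y = 2.0 * x + 1.0 + 0.1 * np.sin(x)
y[10] += 100.0
coeffs, good = sigma_clip_fit(x, y)
```

In this example the spike is rejected after the first pass and the refit recovers the underlying line.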
Metamorphic geodesic regression.
Hong, Yi; Joshi, Sarang; Sanchez, Mar; Styner, Martin; Niethammer, Marc
2012-01-01
We propose a metamorphic geodesic regression approach approximating spatial transformations for image time-series while simultaneously accounting for intensity changes. Such changes occur for example in magnetic resonance imaging (MRI) studies of the developing brain due to myelination. To simplify computations we propose an approximate metamorphic geodesic regression formulation that only requires pairwise computations of image metamorphoses. The approximated solution is an appropriately weighted average of initial momenta. To obtain initial momenta reliably, we develop a shooting method for image metamorphosis.
Bolarinwa, O A; Adeola, O
2016-02-01
Direct or indirect methods can be used to determine the DE and ME of feed ingredients for pigs. In situations when only the indirect approach is suitable, the regression method presents a robust indirect approach. Three experiments were conducted to compare the direct and regression methods for determining the DE and ME values of barley, sorghum, and wheat for pigs. In each experiment, 24 barrows with an average initial BW of 31, 32, and 33 kg were assigned to 4 diets in a randomized complete block design. The 4 diets consisted of 969 g barley, sorghum, or wheat/kg plus minerals and vitamins for the direct method; a corn-soybean meal reference diet (RD); the RD + 300 g barley, sorghum, or wheat/kg; and the RD + 600 g barley, sorghum, or wheat/kg. The 3 corn-soybean meal diets were used for the regression method. Each diet was fed to 6 barrows in individual metabolism crates for a 5-d acclimation followed by a 5-d period of total but separate collection of feces and urine in each experiment. Graded substitution of barley or wheat, but not sorghum, into the RD linearly reduced (P < 0.05) dietary DE and ME. The direct method-derived DE and ME for barley were 3,669 and 3,593 kcal/kg DM, respectively. The regressions of barley contribution to DE and ME in kilocalories against the quantity of barley DMI in kilograms generated 3,746 kcal DE/kg DM and 3,647 kcal ME/kg DM. The DE and ME for sorghum by the direct method were 4,097 and 4,042 kcal/kg DM, respectively; the corresponding regression-derived estimates were 4,145 and 4,066 kcal/kg DM. Using the direct method, energy values for wheat were 3,953 kcal DE/kg DM and 3,889 kcal ME/kg DM. The regressions of wheat contribution to DE and ME in kilocalories against the quantity of wheat DMI in kilograms generated 3,960 kcal DE/kg DM and 3,874 kcal ME/kg DM. The DE and ME of barley using the direct method were not different (0.3 < P < 0.4) from those obtained using the regression method (3,669 vs. 3,746 and 3,593 vs. 3,647 kcal
ERIC Educational Resources Information Center
Guler, Nese; Penfield, Randall D.
2009-01-01
In this study, we investigate the logistic regression (LR), Mantel-Haenszel (MH), and Breslow-Day (BD) procedures for the simultaneous detection of both uniform and nonuniform differential item functioning (DIF). A simulation study was used to assess and compare the Type I error rate and power of a combined decision rule (CDR), which assesses DIF…
ERIC Educational Resources Information Center
Kromrey, Jeffrey D.; Hines, Constance V.
1996-01-01
The accuracy of three analytical formulas for shrinkage estimation and four empirical techniques were investigated in a Monte Carlo study of the coefficient of cross-validity in multiple regression. Substantial statistical bias was evident for all techniques except the formula of M. W. Brown (1975) and multicross-validation. (SLD)
ERIC Educational Resources Information Center
Hick, Thomas L.; Irvine, David J.
To eliminate maturation as a factor in the pretest-posttest design, pretest scores can be converted to anticipate posttest scores using grade equivalent scores from standardized tests. This conversion, known as historical regression, assumes that without specific intervention, growth will continue at the rate (grade equivalents per year of…
Maximum penalized likelihood estimation in semiparametric mark-recapture-recovery models.
Michelot, Théo; Langrock, Roland; Kneib, Thomas; King, Ruth
2016-01-01
We discuss the semiparametric modeling of mark-recapture-recovery data where the temporal and/or individual variation of model parameters is explained via covariates. Typically, in such analyses a fixed (or mixed) effects parametric model is specified for the relationship between the model parameters and the covariates of interest. In this paper, we discuss the modeling of the relationship via the use of penalized splines, to allow for considerably more flexible functional forms. Corresponding models can be fitted via numerical maximum penalized likelihood estimation, employing cross-validation to choose the smoothing parameters in a data-driven way. Our contribution builds on and extends the existing literature, providing a unified inferential framework for semiparametric mark-recapture-recovery models for open populations, where the interest typically lies in the estimation of survival probabilities. The approach is applied to two real datasets, corresponding to gray herons (Ardea cinerea), where we model the survival probability as a function of environmental condition (a time-varying global covariate), and Soay sheep (Ovis aries), where we model the survival probability as a function of individual weight (a time-varying individual-specific covariate). The proposed semiparametric approach is compared to a standard parametric (logistic) regression and new interesting underlying dynamics are observed in both cases.
NASA Astrophysics Data System (ADS)
Demir, Begüm; Bruzzone, Lorenzo
2012-11-01
This paper presents a novel active learning (AL) technique in the context of ɛ-insensitive support vector regression (SVR) to estimate biophysical parameters from remotely sensed images. The proposed AL method aims at selecting the most informative and representative unlabeled samples which have maximum uncertainty, diversity and density assessed according to the SVR estimation rule. This is achieved on the basis of two consecutive steps that rely on kernel k-means clustering. In the first step the most uncertain unlabeled samples are selected by removing the most certain ones from a pool of unlabeled samples. In SVR problems, the most uncertain samples are located outside or on the boundary of the ɛ-tube of SVR, as their target values have the lowest confidence to be correctly estimated. In order to select these samples, kernel k-means clustering is applied to all unlabeled samples together with the training samples that are not SVs, i.e., those that are inside the ɛ-tube (non-SVs). Then, clusters with non-SVs inside are rejected, whereas the unlabeled samples contained in the remaining clusters are selected as the most uncertain samples. In the second step the samples located in the high density regions in the kernel space and as diverse as possible from each other are chosen among the uncertain samples. The density and diversity of the unlabeled samples are evaluated on the basis of their clusters' information. To this end, initially the density of each cluster is measured by the ratio of the number of samples in the cluster to the distance of its two furthest samples. Then, the highest density clusters are chosen and the medoid samples closest to the centers of the selected clusters are chosen as the most informative ones. The diversity of samples is accomplished by selecting only one sample from each selected cluster. Experiments applied to the estimation of single-tree parameters, i.e., tree stem volume and tree stem diameter, show the
NASA Astrophysics Data System (ADS)
Trigila, Alessandro; Iadanza, Carla; Esposito, Carlo; Scarascia-Mugnozza, Gabriele
2015-04-01
first phase of the work aimed to identify the spatial relationships between landslide locations and the 13 related factors using the Frequency Ratio bivariate statistical method. The analysis was then carried out with a multivariate statistical approach, using the Logistic Regression and Random Forests techniques that gave the best results in terms of AUC. The models were run and evaluated with different sample sizes, also taking into account the temporal variation of input variables such as areas burned by wildfire. The most significant outcomes of this work are the relevant influence of the sample size on the model results and the strong importance of some environmental factors (e.g. land use and wildfires) for the identification of the depletion zones of extremely rapid shallow landslides.
NASA Astrophysics Data System (ADS)
Espinoza-Ojeda, O. M.; Santoyo, E.
2016-08-01
A new practical method based on logarithmic transformation regressions was developed for the determination of static formation temperatures (SFTs) in geothermal, petroleum and permafrost bottomhole temperature (BHT) data sets. The new method involves the application of multiple linear and polynomial (from quadratic to eighth-order) regression models to BHT and log-transformation (Tln) shut-in times. Selection of the best regression models was carried out by using four statistical criteria: (i) the coefficient of determination as a fitting quality parameter; (ii) the sum of the normalized squared residuals; (iii) the absolute extrapolation, as a dimensionless statistical parameter that enables the accuracy of each regression model to be evaluated through the extrapolation of the last temperature measured of the data set; and (iv) the deviation percentage between the measured and predicted BHT data. The best regression model was used for reproducing the thermal recovery process of the boreholes, and for the determination of the SFT. The original thermal recovery data (BHT and shut-in time) were used to demonstrate the new method's prediction efficiency. The prediction capability of the new method was additionally evaluated by using synthetic data sets where the true formation temperature (TFT) was known with accuracy. For this purpose, a comprehensive statistical analysis was carried out through the application of the well-known F-test and Student's t-test and the error percentage or statistical differences computed between the SFT estimates and the reported TFT data. After applying the new log-transformation regression method to a wide variety of geothermal, petroleum, and permafrost boreholes, it was found that the polynomial models were generally the best regression models that describe their thermal recovery processes. These fitting results suggested the use of this new method for the reliable estimation of SFT. Finally, the practical use of the new method was
NASA Astrophysics Data System (ADS)
Demuzere, Matthias; van Lipzig, Nicole P. M.
2010-03-01
In order to make projections for future air-quality levels, a robust methodology is needed that succeeds in reconstructing present-day air-quality levels. At present, climate projections for meteorological variables are available from Atmospheric-Ocean Coupled Global Climate Models (AOGCMs) but the temporal and spatial resolution is insufficient for air-quality assessment. Therefore, a variety of methods are tested in this paper in their ability to hindcast maximum 8-hourly levels of O3 and daily mean PM10 from observed meteorological data. The methods are based on a multiple linear regression technique combined with the automated Lamb weather classification. Moreover, we studied whether the above-mentioned multiple regression analysis still holds when driven by operational ECMWF (European Centre for Medium-Range Weather Forecasts) meteorological data. The main results show that a weather type classification prior to the regression analysis is superior to a simple linear regression approach. In contrast to PM10 downscaling, seasonal characteristics should be taken into account during the downscaling of O3 time series. Apart from a lower explained variance due to intrinsic limitations of the regression approach itself, a lower variability of the meteorological predictors (resolution effect) and model deficiencies, this synoptic-regression-based tool is generally able to reproduce the relevant statistical properties of the observed O3 distributions important in terms of European air quality Directives and air quality mitigation strategies. For PM10, the situation is different as the approach using only meteorology data was found to be insufficient to explain the observed PM10 variability using the meteorological variables considered in this study.
Clegg, Samuel M; Barefield, James E; Wiens, Roger C; Dyar, Melinda D; Schafer, Martha W; Tucker, Jonathan M
2008-01-01
The ChemCam instrument on the Mars Science Laboratory (MSL) will include a laser-induced breakdown spectrometer (LIBS) to quantify major and minor elemental compositions. The traditional analytical chemistry approach to calibration curves for these data regresses a single diagnostic peak area against concentration for each element. This approach contrasts with a new multivariate method in which elemental concentrations are predicted by step-wise multiple regression analysis based on areas of a specific set of diagnostic peaks for each element. The method is tested on LIBS data from igneous and metamorphosed rocks. Between 4 and 13 partial regression coefficients are needed to describe each elemental abundance accurately (i.e., with a regression line of R² > 0.9995 for the relationship between predicted and measured elemental concentration) for all major and minor elements studied. Validation plots suggest that the method is limited at present by the small data set, and will work best for prediction of concentration when a wide variety of compositions and rock types has been analyzed.
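As an illustration of the traditional univariate calibration described above, the following minimal sketch fits a straight line of peak area against concentration by ordinary least squares and reports R². The function name `fit_calibration_line` and all numbers are hypothetical; they are not taken from the ChemCam study.

```python
def fit_calibration_line(concentration, peak_area):
    """Ordinary least-squares line: peak_area = intercept + slope * concentration.

    Returns (slope, intercept, r_squared). Inputs are equal-length lists.
    """
    n = len(concentration)
    mean_x = sum(concentration) / n
    mean_y = sum(peak_area) / n
    sxx = sum((x - mean_x) ** 2 for x in concentration)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(concentration, peak_area))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Coefficient of determination from residual and total sums of squares.
    ss_res = sum((y - (intercept + slope * x)) ** 2
                 for x, y in zip(concentration, peak_area))
    ss_tot = sum((y - mean_y) ** 2 for y in peak_area)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


# Hypothetical calibration standards (arbitrary units).
conc = [1.0, 2.0, 3.0, 4.0]
area = [2.1, 3.9, 6.2, 7.8]
slope, intercept, r2 = fit_calibration_line(conc, area)
print(f"slope={slope:.3f} intercept={intercept:.3f} R2={r2:.4f}")
```

The multivariate method in the abstract replaces the single predictor with several peak areas chosen step-wise; that selection step is not shown here.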
Lange, Kenneth; Papp, Jeanette C.; Sinsheimer, Janet S.; Sobel, Eric M.
2014-01-01
Statistical genetics is undergoing the same transition to big data that all branches of applied statistics are experiencing. With the advent of inexpensive DNA sequencing, the transition is only accelerating. This brief review highlights some modern techniques with recent successes in statistical genetics. These include: (a) lasso penalized regression and association mapping, (b) ethnic admixture estimation, (c) matrix completion for genotype and sequence data, (d) the fused lasso and copy number variation, (e) haplotyping, (f) estimation of relatedness, (g) variance components models, and (h) rare variant testing. For more than a century, genetics has been both a driver and beneficiary of statistical theory and practice. This symbiotic relationship will persist for the foreseeable future. PMID:24955378
[Arterial hypertension in females engaged into penal system work].
Tagirova, M M; El'garov, A A; Shogenova, A B; Murtazov, A M
2010-01-01
The authors demonstrated a significant prevalence of arterial hypertension and atherosclerosis risk factors in women working in the penal system; these findings indicate a cardiovascular risk attributable to environmental factors. Teveten and Nebilet proved effective in the examinees with arterial hypertension.
Education--Penal Institutions: U. S. and Europe.
ERIC Educational Resources Information Center
Kerle, Ken
Penal systems of European countries vary in educational programs and humanizing efforts. A high percentage of Soviet prisoners, many incarcerated for ideological/religious beliefs, are confined to labor colonies. All inmates are obligated to learn a trade, one of the qualifications for release being evidence of some trade skill. Swedish…
Indian NGO challenges penal code prohibition of "unnatural offences".
Csete, Joanne
2002-07-01
On 7 December 2001, the Naz Foundation (India) Trust (NFIT), a non-governmental organization based in New Delhi, filed a petition in the Delhi High Court to repeal the "unnatural offences" section of the Indian Penal Code that criminalizes men who have sex with men.
27 CFR 25.93 - Penal sum of bond.
Code of Federal Regulations, 2011 CFR
2011-04-01
... OF THE TREASURY LIQUORS BEER Bonds and Consents of Surety § 25.93 Penal sum of bond. (a)(1) Brewers... calculated at the rates prescribed by law which the brewer will become liable to pay during a calendar year during the period of the bond on beer: (i) Removed for transfer to the brewery from other breweries...
27 CFR 25.93 - Penal sum of bond.
Code of Federal Regulations, 2010 CFR
2010-04-01
... OF THE TREASURY LIQUORS BEER Bonds and Consents of Surety § 25.93 Penal sum of bond. (a)(1) Brewers... calculated at the rates prescribed by law which the brewer will become liable to pay during a calendar year during the period of the bond on beer: (i) Removed for transfer to the brewery from other breweries...
27 CFR 25.93 - Penal sum of bond.
Code of Federal Regulations, 2014 CFR
2014-04-01
... tax at the rates prescribed by law, on the maximum quantity of beer used in the production of... OF THE TREASURY ALCOHOL BEER Bonds and Consents of Surety § 25.93 Penal sum of bond. (a)(1) Brewers... calculated at the rates prescribed by law which the brewer will become liable to pay during a calendar...
27 CFR 25.93 - Penal sum of bond.
Code of Federal Regulations, 2012 CFR
2012-04-01
... OF THE TREASURY LIQUORS BEER Bonds and Consents of Surety § 25.93 Penal sum of bond. (a)(1) Brewers... calculated at the rates prescribed by law which the brewer will become liable to pay during a calendar year during the period of the bond on beer: (i) Removed for transfer to the brewery from other breweries...
27 CFR 25.93 - Penal sum of bond.
Code of Federal Regulations, 2013 CFR
2013-04-01
... tax at the rates prescribed by law, on the maximum quantity of beer used in the production of... OF THE TREASURY ALCOHOL BEER Bonds and Consents of Surety § 25.93 Penal sum of bond. (a)(1) Brewers... calculated at the rates prescribed by law which the brewer will become liable to pay during a calendar...
Chan, Weng Howe; Mohamad, Mohd Saberi; Deris, Safaai; Zaki, Nazar; Kasim, Shahreen; Omatu, Sigeru; Corchado, Juan Manuel; Al Ashwal, Hany
2016-10-01
Incorporation of pathway knowledge into microarray analysis has brought better biological interpretation of the analysis outcome. However, most pathway data are manually curated without specific biological context. Non-informative genes could be included when the pathway data is used for analysis of context specific data like cancer microarray data. Therefore, efficient identification of informative genes is inevitable. Embedded methods like penalized classifiers have been used for microarray analysis due to their embedded gene selection. This paper proposes an improved penalized support vector machine with absolute t-test weighting scheme to identify informative genes and pathways. Experiments are done on four microarray data sets. The results are compared with previous methods using 10-fold cross validation in terms of accuracy, sensitivity, specificity and F-score. Our method shows consistent improvement over the previous methods and biological validation has been done to elucidate the relation of the selected genes and pathway with the phenotype under study. PMID:27522238
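The four evaluation measures named above can all be derived from a binary confusion matrix. A minimal sketch, assuming labels coded 1 (positive) and 0 (negative); the function name `binary_metrics` and the toy labels are illustrative and do not come from the paper, and the F-score is taken to be the usual F1.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity and F1 for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0   # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0   # true negative rate
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f_score = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_score": f_score,
    }


# Toy predictions for one cross-validation fold (hypothetical).
truth = [1, 1, 1, 0, 0, 0, 0, 1]
preds = [1, 1, 0, 0, 0, 1, 0, 1]
print(binary_metrics(truth, preds))
```

In a 10-fold setting these values would be computed on each held-out fold and then averaged.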
Lin, Zhaozhou; Zhang, Qiao; Liu, Ruixin; Gao, Xiaojie; Zhang, Lu; Kang, Bingya; Shi, Junhan; Wu, Zidan; Gui, Xinjing; Li, Xuelin
2016-01-01
To accurately, safely, and efficiently evaluate the bitterness of Traditional Chinese Medicines (TCMs), a robust predictor was developed using the robust partial least squares (RPLS) regression method based on data obtained from an electronic tongue (e-tongue) system. The data quality was verified by Grubbs' test. Moreover, potential outliers were detected based on both the standardized residual and score distance calculated for each sample. The performance of RPLS on the dataset before and after outlier detection was compared to other state-of-the-art methods including multivariate linear regression, least squares support vector machine, and plain partial least squares regression. Both R² and the root-mean-square error (RMSE) of cross-validation (CV) were recorded for each model. With four latent variables, a robust RMSECV value of 0.3916, with bitterness values ranging from 0.63 to 4.78, was obtained for the RPLS model that was constructed based on the dataset including outliers. Meanwhile, the RMSECV calculated using the models constructed by the other methods was larger than that of the RPLS model. After six outliers were excluded, the performance of all benchmark methods markedly improved, but the difference between the RPLS models constructed before and after outlier exclusion was negligible. In conclusion, the bitterness of TCM decoctions can be accurately evaluated with the RPLS model constructed using e-tongue data. PMID:26821026
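Two of the quantities mentioned above are simple enough to sketch directly. The code below computes the test statistic used in Grubbs' test for a single outlier (the largest absolute deviation from the mean divided by the sample standard deviation) and the RMSE between measured and predicted values. The function names and sample values are hypothetical; the critical value needed to complete Grubbs' test, and the RPLS model itself, are not shown.

```python
import math
import statistics


def grubbs_statistic(values):
    """Return (G, index) where G = max |x_i - mean| / s and index locates it.

    s is the sample standard deviation (n - 1 in the denominator).
    Comparing G with a tabulated critical value is left to the reader.
    """
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    index = max(range(len(values)), key=lambda i: abs(values[i] - mean))
    return abs(values[index] - mean) / sd, index


def rmse(measured, predicted):
    """Root-mean-square error between two equal-length sequences."""
    squared = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical sensor readings with one suspicious value at the end.
readings = [1.0, 1.2, 0.9, 1.1, 5.0]
g, suspect = grubbs_statistic(readings)
print(f"G={g:.3f} at index {suspect}")
print(f"RMSE={rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]):.4f}")
```

For an RMSECV, the predicted values would be those produced for each sample while it was held out during cross-validation.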
Raevsky, O A; Polianczyk, D E; Mukhametov, A; Grigorev, V Y
2016-08-01
Assessment of "CNS drugs/CNS candidates" classification abilities of the multi-parametric optimization (CNS MPO) approach was performed by logistic regression. It was found that five of the six separately used physical-chemical properties (topological polar surface area, number of hydrogen-bonded donor atoms, basicity, lipophilicity of compound in neutral form and at pH = 7.4) provided accuracy of recognition below 60%. Only the descriptor of molecular weight (MW) could correctly classify two-thirds of the studied compounds. Aggregation of all six properties in the MPOscore did not improve the classification, which was worse than the classification using only MW. The results of our study demonstrate the imperfection of the CNS MPO approach; in its current form it is not very useful for computer design of new, effective CNS drugs. PMID:27477321
Wild bootstrap for quantile regression.
Feng, Xingdong; He, Xuming; Hu, Jianhua
2011-12-01
The existing theory of the wild bootstrap has focused on linear estimators. In this note, we broaden its validity by providing a class of weight distributions that is asymptotically valid for quantile regression estimators. As most weight distributions in the literature lead to biased variance estimates for nonlinear estimators of linear regression, we propose a modification of the wild bootstrap that admits a broader class of weight distributions for quantile regression. A simulation study on median regression is carried out to compare various bootstrap methods. With a simple finite-sample correction, the wild bootstrap is shown to account for general forms of heteroscedasticity in a regression model with fixed design points.
NASA Astrophysics Data System (ADS)
Ren, Xue; Lee, Soo-Jin
2016-03-01
Patch-based regularization methods, which have proven useful not only for image denoising, but also for tomographic reconstruction, penalize image roughness based on the intensity differences between two nearby patches. However, when two patches are not considered to be similar in the general sense of similarity but still have similar features in a scaled domain after normalizing the two patches, the difference between the two patches in the scaled domain is smaller than the intensity difference measured in the standard method. Standard patch-based methods tend to ignore such similarities due to the large intensity differences between the two patches. In this work, for patch-based penalized likelihood tomographic reconstruction, we propose a new approach to the similarity measure using the normalized patch differences as well as the intensity-based patch differences. A normalized patch difference is obtained by normalizing and scaling the intensity-based patch difference. To selectively take advantage of the standard patch (SP) and normalized patch (NP), we use switching schemes that can select either SP or NP based on the gradient of a reconstructed image. In this case the SP is selected for restoring large-scaled piecewise-smooth regions, while the NP is selected for preserving the contrast of fine details. The numerical experiments using software phantom demonstrate that our proposed methods not only improve overall reconstruction accuracy in terms of the percentage error, but also reveal better recovery of fine details in terms of the contrast recovery coefficient.
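The contrast between the standard and normalized patch differences can be illustrated with a toy example. This is only a sketch under a stated assumption: each patch is normalized by dividing by its own mean intensity, which is one plausible reading of "normalizing and scaling" and is not confirmed by the abstract. The function names and patch values are hypothetical.

```python
def patch_distance(p, q):
    """Sum of squared intensity differences between two flattened patches."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def normalize_patch(p):
    """Scale a patch by its mean intensity (assumed normalization)."""
    mean = sum(p) / len(p)
    return [v / mean for v in p]


# Two hypothetical patches with the same shape but different brightness.
patch_a = [10.0, 20.0, 30.0]
patch_b = [20.0, 40.0, 60.0]

standard = patch_distance(patch_a, patch_b)
normalized = patch_distance(normalize_patch(patch_a), normalize_patch(patch_b))
print(standard, normalized)  # large standard distance, zero normalized distance
```

The two patches look dissimilar under the intensity-based measure yet identical after normalization, which is the kind of similarity the abstract says standard patch-based methods tend to ignore.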
Haghighi, Mona; Johnson, Suzanne Bennett; Qian, Xiaoning; Lynch, Kristian F; Vehik, Kendra; Huang, Shuai
2016-01-01
Regression models are extensively used in many epidemiological studies to understand the linkage between specific outcomes of interest and their risk factors. However, regression models in general examine the average effects of the risk factors and ignore subgroups with different risk profiles. As a result, interventions are often geared towards the average member of the population, without consideration of the special health needs of different subgroups within the population. This paper demonstrates the value of using rule-based analysis methods that can identify subgroups with heterogeneous risk profiles in a population without imposing assumptions on the subgroups or method. The rules define the risk pattern of subsets of individuals by not only considering the interactions between the risk factors but also their ranges. We compared the rule-based analysis results with the results from a logistic regression model in The Environmental Determinants of Diabetes in the Young (TEDDY) study. Both methods detected a similar suite of risk factors, but the rule-based analysis was superior at detecting multiple interactions between the risk factors that characterize the subgroups. A further investigation of the particular characteristics of each subgroup may detect the special health needs of the subgroup and lead to tailored interventions.
Haghighi, Mona; Johnson, Suzanne Bennett; Qian, Xiaoning; Lynch, Kristian F.; Vehik, Kendra; Huang, Shuai; Rewers, Marian; Barriga, Katherine; Baxter, Judith; Eisenbarth, George; Frank, Nicole; Gesualdo, Patricia; Hoffman, Michelle; Norris, Jill; Ide, Lisa; Robinson, Jessie; Waugh, Kathleen; She, Jin-Xiong; Schatz, Desmond; Hopkins, Diane; Steed, Leigh; Choate, Angela; Silvis, Katherine; Shankar, Meena; Huang, Yi-Hua; Yang, Ping; Wang, Hong-Jie; Leggett, Jessica; English, Kim; McIndoe, Richard; Dequesada, Angela; Haller, Michael; Anderson, Stephen W.; Ziegler, Anette G.; Boerschmann, Heike; Bonifacio, Ezio; Bunk, Melanie; Försch, Johannes; Henneberger, Lydia; Hummel, Michael; Hummel, Sandra; Joslowski, Gesa; Kersting, Mathilde; Knopff, Annette; Kocher, Nadja; Koletzko, Sibylle; Krause, Stephanie; Lauber, Claudia; Mollenhauer, Ulrike; Peplow, Claudia; Pflüger, Maren; Pöhlmann, Daniela; Ramminger, Claudia; Rash-Sur, Sargol; Roth, Roswith; Schenkel, Julia; Thümer, Leonore; Voit, Katja; Winkler, Christiane; Zwilling, Marina; Simell, Olli G.; Nanto-Salonen, Kirsti; Ilonen, Jorma; Knip, Mikael; Veijola, Riitta; Simell, Tuula; Hyöty, Heikki; Virtanen, Suvi M.; Kronberg-Kippilä, Carina; Torma, Maija; Simell, Barbara; Ruohonen, Eeva; Romo, Minna; Mantymaki, Elina; Schroderus, Heidi; Nyblom, Mia; Stenius, Aino; Lernmark, Åke; Agardh, Daniel; Almgren, Peter; Andersson, Eva; Andrén-Aronsson, Carin; Ask, Maria; Karlsson, Ulla-Marie; Cilio, Corrado; Bremer, Jenny; Ericson-Hallström, Emilie; Gard, Thomas; Gerardsson, Joanna; Gustavsson, Ulrika; Hansson, Gertie; Hansen, Monica; Hyberg, Susanne; Håkansson, Rasmus; Ivarsson, Sten; Johansen, Fredrik; Larsson, Helena; Lernmark, Barbro; Markan, Maria; Massadakis, Theodosia; Melin, Jessica; Månsson-Martinez, Maria; Nilsson, Anita; Nilsson, Emma; Rahmati, Kobra; Rang, Sara; Järvirova, Monica Sedig; Sibthorpe, Sara; Sjöberg, Birgitta; Törn, Carina; Wallin, Anne; Wimar, Åsa; Hagopian, William A.; Yan, Xiang; Killian, Michael; Crouch, Claire Cowen; Hay, Kristen M.; Ayres, Stephen; Adams, Carissa; Bratrude, Brandi; Fowler, Greer; Franco, Czarina; Hammar, Carla; Heaney, Diana; Marcus, Patrick; Meyer, Arlene; Mulenga, Denise; Scott, Elizabeth; Skidmore, Jennifer; Small, Erin; Stabbert, Joshua; Stepitova, Viktoria; Becker, Dorothy; Franciscus, Margaret; Dalmagro-Elias Smith, MaryEllen; Daftary, Ashi; Krischer, Jeffrey P.; Abbondondolo, Michael; Ballard, Lori; Brown, Rasheedah; Cuthbertson, David; Eberhard, Christopher; Gowda, Veena; Lee, Hye-Seung; Liu, Shu; Malloy, Jamie; McCarthy, Cristina; McLeod, Wendy; Smith, Laura; Smith, Stephen; Smith, Susan; Uusitalo, Ulla; Yang, Jimin; Akolkar, Beena; Briese, Thomas; Erlich, Henry; Oberste, Steve
2016-01-01
Regression models are extensively used in many epidemiological studies to understand the linkage between specific outcomes of interest and their risk factors. However, regression models in general examine the average effects of the risk factors and ignore subgroups with different risk profiles. As a result, interventions are often geared towards the average member of the population, without consideration of the special health needs of different subgroups within the population. This paper demonstrates the value of using rule-based analysis methods that can identify subgroups with heterogeneous risk profiles in a population without imposing assumptions on the subgroups or method. The rules define the risk pattern of subsets of individuals by not only considering the interactions between the risk factors but also their ranges. We compared the rule-based analysis results with the results from a logistic regression model in The Environmental Determinants of Diabetes in the Young (TEDDY) study. Both methods detected a similar suite of risk factors, but the rule-based analysis was superior at detecting multiple interactions between the risk factors that characterize the subgroups. A further investigation of the particular characteristics of each subgroup may detect the special health needs of the subgroup and lead to tailored interventions. PMID:27561809
Farhadian, Maryam; Aliabadi, Mohsen; Darvishi, Ebrahim
2015-01-01
Background: Prediction models are used in a variety of medical domains, and they are frequently built from experience which constitutes data acquired from actual cases. This study aimed to analyze the potential of artificial neural networks and logistic regression techniques for estimation of hearing impairment among industrial workers. Materials and Methods: A total of 210 workers employed in a steel factory (in West of Iran) were selected, and their occupational exposure histories were analyzed. The hearing loss thresholds of the studied workers were determined using a calibrated audiometer. The personal noise exposures were also measured using a noise dosimeter in the workstations. Data obtained from five variables, which can influence the hearing loss, were used as input features, and the hearing loss thresholds were considered as target feature of the prediction methods. Multilayer feedforward neural networks and logistic regression were developed using MATLAB R2011a software. Results: Based on the World Health Organization classification for the grades of hearing loss, 74.2% of the studied workers have normal hearing thresholds, 23.4% have slight hearing loss, and 2.4% have moderate hearing loss. The accuracy and kappa coefficient of the best developed neural networks for prediction of the grades of hearing loss were 88.6 and 66.30, respectively. The accuracy and kappa coefficient of the logistic regression were also 84.28 and 51.30, respectively. Conclusion: Neural networks could provide more accurate predictions of the hearing loss than logistic regression. The prediction method can provide reliable and comprehensible information for occupational health and medicine experts. PMID:26500410
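Accuracy and the kappa coefficient, the two measures reported above, can both be computed from paired true and predicted class labels. A minimal sketch of Cohen's kappa, assuming that is the kappa meant in the abstract; the function name and the toy labels are hypothetical, and the result is a fraction, so multiply by 100 for a percentage.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement between two labelings, corrected for chance."""
    n = len(y_true)
    observed = sum(1 for t, p in zip(y_true, y_pred) if t == p) / n
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: product of marginal frequencies, summed over classes.
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / n ** 2
    if expected == 1.0:
        return 1.0  # both labelings use a single identical class
    return (observed - expected) / (1.0 - expected)


# Hypothetical hearing-loss grades (0 = normal, 1 = slight).
truth = [0, 0, 0, 0, 1, 1, 1, 1]
preds = [0, 0, 0, 1, 1, 1, 1, 0]
print(cohen_kappa(truth, preds))
```

Kappa is lower than raw accuracy whenever some agreement would be expected by chance, which matters when one class dominates, as with the 74.2% of workers in the normal-hearing grade.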
NASA Astrophysics Data System (ADS)
Setiawan; Suhartono; Ahmad, Imam Safawi; Rahmawati, Noorgam Ika
2015-12-01
Bank Indonesia (BI), as the central bank of the Republic of Indonesia, has a single overarching objective: to establish and maintain rupiah stability. This objective can be achieved by monitoring the inflow and outflow of currency. Inflow and outflow are related to the stock and distribution of currency across Indonesian territory, which in turn affect economic activity. Economic activity in Indonesia, as a Muslim-majority country, is closely tied to the Islamic (lunar) calendar, which differs from the Gregorian calendar. This research aims to forecast the currency inflow and outflow of the Representative Office (RO) of BI Semarang, Central Java region. The results of the analysis show that the characteristics of currency inflow and outflow are influenced by calendar variation effects, namely the day of Eid al-Fitr (a Muslim holiday), as well as by seasonal patterns. In addition, the particular week in which Eid al-Fitr falls also affects the increase in currency inflow and outflow. The best model based on the smallest Root Mean Square Error (RMSE) for the inflow data is the ARIMA model, while the best model for predicting the outflow data in the RO of BI Semarang is the ARIMAX model or Time Series Regression, because both yield the same model. The forecasts for 2015 show an increase in currency inflow in August and an increase in currency outflow in July.
NASA Astrophysics Data System (ADS)
Huang, Cong; Liu, Dan-Dan; Wang, Jing-Song
2009-06-01
The 10.7 cm solar radio flux (F10.7), the value of the solar radio emission flux density at a wavelength of 10.7 cm, is a useful index of solar activity as a proxy for solar extreme ultraviolet radiation. It is meaningful and important to predict F10.7 values accurately for both long-term (months-years) and short-term (days) forecasting, which are often used as inputs in space weather models. This study applies a novel neural network technique, support vector regression (SVR), to forecasting daily values of F10.7. The aim of this study is to examine the feasibility of SVR in short-term F10.7 forecasting. The approach, based on SVR, reduces the dimension of feature space in the training process by using a kernel-based learning algorithm. Thus, the complexity of the calculation becomes lower and a small amount of training data will be sufficient. The time series of F10.7 from 2002 to 2006 are employed as the data sets. The performance of the approach is estimated by calculating the normalized mean square error and the mean absolute percentage error. It is shown that our approach can perform well by using fewer training data points than the traditional neural network.
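The two error measures named above (read here as a normalized mean square error and the mean absolute percentage error) can be sketched as follows. The abstract does not give its formulas, so the normalization by the variance of the observations is an assumption, one of several conventions in use; the function names and the sample flux values are hypothetical.

```python
def nmse(observed, predicted):
    """Normalized mean square error (assumed convention): squared error
    divided by the spread of the observations around their mean."""
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    total_spread = sum((o - mean_obs) ** 2 for o in observed)
    return squared_error / total_spread


def mape(observed, predicted):
    """Mean absolute percentage error, in percent. Observations must be nonzero."""
    ratios = [abs((o - p) / o) for o, p in zip(observed, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical daily flux values and forecasts (illustration only).
flux_observed = [100.0, 110.0, 120.0]
flux_forecast = [101.0, 110.0, 119.0]
print(nmse(flux_observed, flux_forecast), mape(flux_observed, flux_forecast))
```

Under this convention an NMSE of 1 corresponds to a forecast no better than always predicting the mean of the observations, while MAPE expresses the typical error relative to the size of the flux itself.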
Boy-Roura, M; Cameron, K C; Di, H J
2016-02-01
This study presents a meta-analysis of 12 experiments that quantify nitrate-N leaching losses from grazed pasture systems in alluvial sedimentary soils in Canterbury (New Zealand). Mean measured nitrate-N leached (kg N/ha × 100 mm drainage) losses were 2.7 when no urine was applied, 8.4 at the urine rate of 300 kg N/ha, 9.8 at 500 kg N/ha, 24.5 at 700 kg N/ha and 51.4 at 1000 kg N/ha. Lismore soils presented significantly higher nitrate-N losses compared to Templeton soils. Moreover, a multiple linear regression (MLR) model was developed to determine the key factors that influence nitrate-N leaching and to predict nitrate-N leaching losses. The MLR analysis was calibrated and validated using 82 average values of nitrate-N leached and 48 explanatory variables representative of nitrogen inputs and outputs, transport, attenuation of nitrogen and farm management practices. The MLR model (R² = 0.81) showed that nitrate-N leaching losses were greater at higher urine application rates and when there was more drainage from rainfall and irrigation. On the other hand, nitrate leaching decreased when nitrification inhibitors (e.g. dicyandiamide (DCD)) were applied. Predicted nitrate-N leaching losses at the paddock scale were calculated using the MLR equation, and they varied widely depending on the urine application rate and urine patch coverage.
ERIC Educational Resources Information Center
Pedrini, D. T.; Pedrini, Bonnie C.
Regression, another mechanism studied by Sigmund Freud, has had much research, e.g., hypnotic regression, frustration regression, schizophrenic regression, and infra-human-animal regression (often directly related to fixation). Many investigators worked with hypnotic age regression, which has a long history, going back to Russian reflexologists.…
High dimensional linear regression models under long memory dependence and measurement error
NASA Astrophysics Data System (ADS)
Kaul, Abhishek
This dissertation consists of three chapters. The first chapter introduces the models under consideration and motivates problems of interest. A brief literature review is also provided in this chapter. The second chapter investigates the properties of Lasso under long range dependent model errors. Lasso is a computationally efficient approach to model selection and estimation, and its properties are well studied when the regression errors are independent and identically distributed. We study the case where the regression errors form a long memory moving average process. We establish a finite sample oracle inequality for the Lasso solution. We then show the asymptotic sign consistency in this setup. These results are established in the high dimensional setup (p > n) where p can be increasing exponentially with n. Finally, we show the n^(1/2-d)-consistency of Lasso, along with the oracle property of adaptive Lasso, in the case where p is fixed. Here d is the memory parameter of the stationary error sequence. The performance of Lasso is also analysed in the present setup with a simulation study. The third chapter proposes and investigates the properties of a penalized quantile based estimator for measurement error models. Standard formulations of prediction problems in high dimension regression models assume the availability of fully observed covariates and sub-Gaussian and homogeneous model errors. This makes these methods inapplicable to measurement error models where covariates are unobservable and observations are possibly non sub-Gaussian and heterogeneous. We propose weighted penalized corrected quantile estimators for the regression parameter vector in linear regression models with additive measurement errors, where unobservable covariates are nonrandom. The proposed estimators forgo the need for the above mentioned model assumptions. We study these estimators in both the fixed dimension and high dimensional sparse setups, in the latter setup, the
Jović, Ozren; Smrečki, Neven; Popović, Zora
2016-04-01
A novel quantitative prediction and variable selection method called interval ridge regression (iRR) is studied in this work. The method is performed on six data sets of FTIR, two data sets of UV-vis and one data set of DSC. The obtained results show that models built with ridge regression on optimal variables selected with iRR significantly outperform models built with ridge regression on all variables in both calibration (6 out of 9 cases) and validation (2 out of 9 cases). In this study, iRR is also compared with interval partial least squares regression (iPLS). iRR outperformed iPLS in validation (insignificantly in 6 out of 9 cases and significantly in one out of 9 cases for p < 0.05). Also, iRR can be a fast alternative to iPLS, especially when the degree of complexity of the analyzed system is unknown, i.e. if the upper limit of the number of latent variables is not easily estimated for iPLS. Adulteration of hempseed (H) oil, a well-known health-beneficial nutrient, is studied in this work by mixing it with cheap and widely used oils such as soybean (So) oil, rapeseed (R) oil and sunflower (Su) oil. Binary mixture sets of hempseed oil with these three oils (HSo, HR and HSu) and a ternary mixture set of H oil, R oil and Su oil (HRSu) were considered. The obtained accuracy indicates that using iRR on FTIR and UV-vis data, each particular oil can be very successfully quantified (in all 8 cases RMSEP < 1.2%). This means that FTIR-ATR coupled with iRR can very rapidly and effectively determine the level of adulteration in the adulterated hempseed oil (R² > 0.99).
Balabin, Roman M; Lomakina, Ekaterina I
2011-04-21
In this study, we make a general comparison of the accuracy and robustness of five multivariate calibration models: partial least squares (PLS) regression or projection to latent structures, polynomial partial least squares (Poly-PLS) regression, artificial neural networks (ANNs), and two novel techniques based on support vector machines (SVMs) for multivariate data analysis: support vector regression (SVR) and least-squares support vector machines (LS-SVMs). The comparison is based on fourteen (14) different datasets: seven sets of gasoline data (density, benzene content, and fractional composition/boiling points), two sets of ethanol gasoline fuel data (density and ethanol content), one set of diesel fuel data (total sulfur content), three sets of petroleum (crude oil) macromolecules data (weight percentages of asphaltenes, resins, and paraffins), and one set of petroleum resins data (resins content). Vibrational (near-infrared, NIR) spectroscopic data are used to predict the properties and quality coefficients of gasoline, biofuel/biodiesel, diesel fuel, and other samples of interest. The four systems presented here range greatly in composition, properties, strength of intermolecular interactions (e.g., van der Waals forces, H-bonds), colloid structure, and phase behavior. Due to the high diversity of chemical systems studied, general conclusions about SVM regression methods can be made. We try to answer the following question: to what extent can SVM-based techniques replace ANN-based approaches in real-world (industrial/scientific) applications? The results show that both SVR and LS-SVM methods are comparable to ANNs in accuracy. Due to the much higher robustness of the former, the SVM-based approaches are recommended for practical (industrial) application. This has been shown to be especially true for complicated, highly nonlinear objects. PMID:21350755
Penal managerialism from within: implications for theory and research.
Cheliotis, Leonidas K
2006-01-01
Unlike the bulk of penological scholarship dealing with managerialist reforms, this article calls for greater theoretical and research attention to the often pernicious impact of managerialism on criminal justice professionals. Much in an ideal-typical fashion, light is shed on: the reasons why contemporary penal bureaucracies endeavor systematically to strip criminal justice work of its inherently affective nature; the structural forces that ensure control over officials; the processes by which those forces come into effect; and the human consequences of submission to totalitarian bureaucratic milieus. It is suggested that the heavy preoccupation of present-day penality with the predictability and calculability of outcomes entails the atomization of professionals and the dehumanization of their work. This is achieved through a kaleidoscope of direct and indirect mechanisms that naturalize and/or legitimate acquiescence.
Zhu, Ying; Tan, Tuck Lee
2016-04-15
An effective and simple analytical method using Fourier transform infrared (FTIR) spectroscopy to distinguish wild-grown high-quality Ganoderma lucidum (G. lucidum) from the cultivated one is of essential importance for its quality assurance and medicinal value estimation. Commonly used chemical and analytical methods using the full spectrum are not so effective for the detection and interpretation due to the complex system of the herbal medicine. In this study, two penalized discriminant analysis models, penalized linear discriminant analysis (PLDA) and elastic net (Elnet), using FTIR spectroscopy have been explored for the purpose of discrimination and interpretation. The classification performances of the two penalized models have been compared with two widely used multivariate methods, principal component discriminant analysis (PCDA) and partial least squares discriminant analysis (PLSDA). The Elnet model involving a combination of L1 and L2 norm penalties enabled an automatic selection of a small number of informative spectral absorption bands and gave an excellent classification accuracy of 99% for discrimination between spectra of wild-grown and cultivated G. lucidum. Its classification performance was superior to that of the PLDA model in a pure L1 setting and outperformed the PCDA and PLSDA models using the full wavelength range. The well-performed selection of informative spectral features leads to substantial reduction in model complexity and improvement of classification accuracy, and it is particularly helpful for the quantitative interpretations of the major chemical constituents of G. lucidum regarding its anti-cancer effects. PMID:26827180
NASA Astrophysics Data System (ADS)
Zhu, Ying; Tan, Tuck Lee
2016-04-01
An effective and simple analytical method using Fourier transform infrared (FTIR) spectroscopy to distinguish wild-grown high-quality Ganoderma lucidum (G. lucidum) from the cultivated one is of essential importance for its quality assurance and medicinal value estimation. Commonly used chemical and analytical methods using the full spectrum are not so effective for the detection and interpretation due to the complex system of the herbal medicine. In this study, two penalized discriminant analysis models, penalized linear discriminant analysis (PLDA) and elastic net (Elnet), using FTIR spectroscopy have been explored for the purpose of discrimination and interpretation. The classification performances of the two penalized models have been compared with two widely used multivariate methods, principal component discriminant analysis (PCDA) and partial least squares discriminant analysis (PLSDA). The Elnet model involving a combination of L1 and L2 norm penalties enabled an automatic selection of a small number of informative spectral absorption bands and gave an excellent classification accuracy of 99% for discrimination between spectra of wild-grown and cultivated G. lucidum. Its classification performance was superior to that of the PLDA model in a pure L1 setting and outperformed the PCDA and PLSDA models using the full wavelength range. The well-performed selection of informative spectral features leads to substantial reduction in model complexity and improvement of classification accuracy, and it is particularly helpful for the quantitative interpretations of the major chemical constituents of G. lucidum regarding its anti-cancer effects.
Liu, Song; Su, Bo-min; Li, Qing-hui; Gan, Fu-xi
2015-01-01
The authors sought a method for quantitative analysis using pXRF without solid bulk stone/jade reference samples. Twenty-four nephrite samples were selected: 17 were used as calibration samples and the other 7 as test samples. All the nephrite samples were analyzed quantitatively by proton-induced X-ray emission spectroscopy (PIXE). Based on the PIXE results of the calibration samples, calibration curves were created for the components/elements of interest and used to analyze the test samples quantitatively; then, the qualitative spectra of all nephrite samples were obtained by pXRF. Using the PIXE results and qualitative spectra of the calibration samples, the partial least squares (PLS) method was applied for quantitative analysis of the test samples. Finally, the results for the test samples obtained by the calibration curve method, the PLS method and PIXE were compared with each other, and the accuracy of the calibration curve method and the PLS method was estimated. The results indicate that the PLS method is an alternative method for quantitative analysis of stone/jade samples.
A Penalized Likelihood Approach for Investigating Gene-Drug Interactions in Pharmacogenetic Studies
Neely, Megan L.; Bondell, Howard D.; Tzeng, Jung-Ying
2015-01-01
Summary: Pharmacogenetics investigates the relationship between heritable genetic variation and the variation in how individuals respond to drug therapies. Often, gene-drug interactions play a primary role in this response, and identifying these effects can aid in the development of individualized treatment regimes. Haplotypes can hold key information in understanding the association between genetic variation and drug response. However, the standard approach for haplotype-based association analysis does not directly address the research questions dictated by individualized medicine. A complementary post-hoc analysis is required, and this post-hoc analysis is usually underpowered after adjusting for multiple comparisons and may lead to seemingly contradictory conclusions. In this work, we propose a penalized likelihood approach that is able to overcome the drawbacks of the standard approach and yield the desired personalized output. We demonstrate the utility of our method by applying it to the Scottish Randomized Trial in Ovarian Cancer. We also conducted simulation studies and showed that the proposed penalized method has comparable or more power than the standard approach and maintains low Type I error rates for both binary and quantitative drug responses. The largest performance gains are seen when the haplotype frequency is low, the difference in effect sizes is small, or the true relationship among the drugs is more complex. PMID:25604216
Ibrahim, Joseph G.
2014-01-01
Multiple Imputation, Maximum Likelihood and Fully Bayesian methods are the three most commonly used model-based approaches in missing data problems. Although it is easy to show that when the responses are missing at random (MAR), the complete case analysis is unbiased and efficient, the aforementioned methods are still commonly used in practice for this setting. To examine the performance of and relationships between these three methods in this setting, we derive and investigate small sample and asymptotic expressions of the estimates and standard errors, and fully examine how these estimates are related for the three approaches in the linear regression model when the responses are MAR. We show that when the responses are MAR in the linear model, the estimates of the regression coefficients using these three methods are asymptotically equivalent to the complete case estimates under general conditions. One simulation and a real data set from a liver cancer clinical trial are given to compare the properties of these methods when the responses are MAR. PMID:25309677
[Guideline 'Medicinal care for drug addicts in penal institutions'].
Westra, Michel; de Haan, Hein A; Arends, Marleen T; van Everdingen, Jannes J E; Klazinga, Niek S
2009-01-01
In the Netherlands, the policy on care for prisoners who are addicted to opiates is still heterogeneous. The recent guidelines entitled 'Medicinal care for drug addicts in penal institutions' should contribute towards unambiguous and more evidence-based treatment for this group. In addition, it should improve and bring the care pathways within judicial institutions and mainstream healthcare more into line with one another. Each rational course of medicinal treatment will initially be continued in the penal institution. In penal institutions the help on offer is mainly focused on abstinence from illegal drugs while at the same time limiting the damage caused to the health of the individual user. Methadone is regarded as the first choice for maintenance therapy. For patient safety, this is best given in liquid form in sealed cups of 5 mg/ml once daily in the morning. Recently a combination preparation containing buprenorphine and naloxone - a complete opiate antagonist - has become available. On discontinuation of opiate maintenance treatment intensive follow-up care is necessary. During this period there is considerable risk of a potentially lethal overdose. Detoxification should be coupled with psychosocial or medicinal intervention aimed at preventing relapse. Naltrexone is currently the only available opiate antagonist for preventing relapse. In those addicted to opiates, who also take benzodiazepines without any indication, it is strongly recommended that these be reduced and discontinued. This can be achieved by converting the regular dosage into the equivalent in diazepam and then reducing this dosage by a maximum of 25% a week.
Pereira, L F P; Adeola, O
2016-09-01
The energy and phosphorus values of sunflower meal (SFM) and rice bran (RB) were determined in 2 experiments with Ross 708 broiler chickens from 15 to 22 d of age. In Exp.1, the diets consisted of a corn-soybean meal reference diet (RD) and 4 test diets (TD). The TD consisted of SFM and RB that partly replaced the energy sources in the RD at 100 or 200 g/kg and 75 or 150 g/kg, respectively, such that equal ratios were maintained for all energy-containing ingredients across all experimental diets. In Exp.2, a cornstarch-soybean meal diet was the RD and the TD consisted of SFM and RB that partly replaced cornstarch in the RD at 100 or 200 g/kg and 60 or 120 g/kg, respectively. Addition of SFM and RB to the RD in Exp.1 linearly decreased (P < 0.01) the digestibility coefficients of DM, energy, ileal digestible energy (IDE), metabolizability coefficients of DM, nitrogen (N), energy, N-corrected energy, metabolizable energy (ME), and nitrogen-corrected ME. Except for RB, the increased levels of the test ingredients in RD did affect the metabolizability coefficients of N. The IDE values (kcal/kg DM) were 1,953 for SFM and 2,498 for RB; ME values (kcal/kg DM) were 1,893 for SFM and 2,683 for RB; and MEn values (kcal/kg DM) were 1,614 for SFM and 2,476 for RB. In Exp.2, there was a linear relationship between phosphorus (P) intake and ileal P output for diets with increased levels of SFM and RB. In addition, there was a linear relationship between P intake and P digestibility and retention for diets with increased levels of SFM. There was a quadratic effect (P < 0.01) and a tendency toward a quadratic effect (P = 0.07) for digestible P and total tract P retained, respectively, in the RB diets. The P digestibility and total tract P retention from regression analyses for SFM were 46% and 38%, respectively. PMID:26976902
Korany, Mohamed A; Maher, Hadir M; Galal, Shereen M; Fahmy, Ossama T; Ragab, Marwa A A
2010-11-15
This manuscript discusses the application of chemometrics to the handling of HPLC response data using the internal standard method (ISM). This was performed on a model mixture containing terbutaline sulphate, guaiphenesin, bromhexine HCl, sodium benzoate and propylparaben as an internal standard. Derivative treatment of chromatographic response data of analyte and internal standard was followed by convolution of the resulting derivative curves using 8-points sin x(i) polynomials (discrete Fourier functions). The response of each analyte signal, its corresponding derivative and convoluted derivative data were divided by that of the internal standard to obtain the corresponding ratio data. This was found beneficial in eliminating different types of interferences. It was successfully applied to handle some of the most common chromatographic problems and non-ideal conditions, namely: overlapping chromatographic peaks and very low analyte concentrations. For example, a significant change in the correlation coefficient of sodium benzoate, in case of overlapping peaks, went from 0.9975 to 0.9998 on applying normal conventional peak area and first derivative under Fourier functions methods, respectively. Also a significant improvement in the precision and accuracy for the determination of synthetic mixtures and dosage forms in non-ideal cases was achieved. For example, in the case of overlapping peaks guaiphenesin mean recovery% and RSD% went from 91.57, 9.83 to 100.04, 0.78 on applying normal conventional peak area and first derivative under Fourier functions methods, respectively. This work also compares the application of Theil's method, a non-parametric regression method, in handling the response ratio data, with the least squares parametric regression method, which is considered the de facto standard method used for regression. Theil's method was found to be superior to the method of least squares as it assumes that errors could occur in both x- and y-directions and
Evaluating differential effects using regression interactions and regression mixture models
Van Horn, M. Lee; Jaki, Thomas; Masyn, Katherine; Howe, George; Feaster, Daniel J.; Lamont, Andrea E.; George, Melissa R. W.; Kim, Minjung
2015-01-01
Research increasingly emphasizes understanding differential effects. This paper focuses on understanding regression mixture models, a relatively new statistical method for assessing differential effects, by comparing results to those obtained using an interaction term in linear regression. The research questions which each model answers, their formulation, and their assumptions are compared using Monte Carlo simulations and real data analysis. The capabilities of regression mixture models are described and specific issues to be addressed when conducting regression mixtures are proposed. The paper aims to clarify the role that regression mixtures can take in the estimation of differential effects and increase awareness of the benefits and potential pitfalls of this approach. Regression mixture models are shown to be a potentially effective exploratory method for finding differential effects when these effects can be defined by a small number of classes of respondents who share a typical relationship between a predictor and an outcome. It is also shown that the comparison between regression mixture models and interactions becomes substantially more complex as the number of classes increases. It is argued that regression interactions are well suited for direct tests of specific hypotheses about differential effects and regression mixtures provide a useful approach for exploring effect heterogeneity given adequate samples and study design. PMID:26556903
Evaluating Differential Effects Using Regression Interactions and Regression Mixture Models
ERIC Educational Resources Information Center
Van Horn, M. Lee; Jaki, Thomas; Masyn, Katherine; Howe, George; Feaster, Daniel J.; Lamont, Andrea E.; George, Melissa R. W.; Kim, Minjung
2015-01-01
Research increasingly emphasizes understanding differential effects. This article focuses on understanding regression mixture models, which are relatively new statistical methods for assessing differential effects by comparing results to using an interactive term in linear regression. The research questions which each model answers, their…
Basis Selection for Wavelet Regression
NASA Technical Reports Server (NTRS)
Wheeler, Kevin R.; Lau, Sonie (Technical Monitor)
1998-01-01
A wavelet basis selection procedure is presented for wavelet regression. Both the basis and the threshold are selected using cross-validation. The method includes the capability of incorporating prior knowledge on the smoothness (or shape of the basis functions) into the basis selection procedure. The results of the method are demonstrated on sampled functions widely used in the wavelet regression literature. The results of the method are contrasted with other published methods.
Sparse linear regression with elastic net regularization for brain-computer interfaces.
Kelly, John W; Degenhart, Alan D; Siewiorek, Daniel P; Smailagic, Asim; Wang, Wei
2012-01-01
This paper demonstrates the feasibility of decoding neuronal population signals using a sparse linear regression model with an elastic net penalty. In offline analysis of real electrocorticographic (ECoG) neural data the elastic net achieved a timepoint decoding accuracy of 95% for classifying hand grasps vs. rest, and 82% for moving a cursor in 1-D space towards a target. These results were superior to those obtained using ℓ2-penalized and unpenalized linear regression, and marginally better than ℓ1-penalized regression. Elastic net and the ℓ1-penalty also produced sparse feature sets, but the elastic net did not eliminate correlated features, which could result in a more stable decoder for brain-computer interfaces.
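The contrast drawn above between the ℓ1 penalty and the elastic net on correlated features can be illustrated with a minimal coordinate-descent sketch. It is not the authors' decoder: the objective scaling (squared error divided by 2n, penalty lam * (alpha * |b| + (1 - alpha) / 2 * b^2)), the function names, and the toy data are all assumptions chosen for illustration.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def elastic_net(X, y, lam, alpha, n_sweeps=200):
    """Coordinate descent for an elastic net linear model without intercept.

    alpha = 1.0 gives the pure l1 (lasso) penalty; smaller alpha mixes in
    a squared l2 term. X is a list of rows, y a list of targets.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X @ beta while beta is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            # Correlation of feature j with the residual, with feature j's
            # own current contribution added back in.
            rho = sum(X[i][j] * residual[i] for i in range(n)) / n + col_sq[j] * beta[j]
            denom = col_sq[j] + lam * (1.0 - alpha)
            new_beta = soft_threshold(rho, lam * alpha) / denom if denom > 0.0 else 0.0
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_beta
    return beta


# Hypothetical toy data: two identical (perfectly correlated) features.
X = [[1.0, 1.0], [-1.0, -1.0], [1.0, 1.0], [-1.0, -1.0]]
y = [2.0, -2.0, 2.0, -2.0]
print("l1 only    :", elastic_net(X, y, lam=0.5, alpha=1.0))
print("elastic net:", elastic_net(X, y, lam=0.5, alpha=0.5))
```

On this toy input the ℓ1-only fit keeps the first feature and zeroes its duplicate, while the elastic net spreads weight across both, which is the behaviour the abstract associates with a more stable decoder.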
Penalized likelihood PET image reconstruction using patch-based edge-preserving regularization.
Wang, Guobao; Qi, Jinyi
2012-12-01
Iterative image reconstruction for positron emission tomography (PET) can improve image quality by using spatial regularization that penalizes image intensity differences between neighboring pixels. The most commonly used quadratic penalty often oversmooths edges and fine features in reconstructed images. Nonquadratic penalties can preserve edges but often introduce piecewise-constant blocky artifacts, and the results are also sensitive to the hyper-parameter that controls the shape of the penalty function. This paper presents a patch-based regularization for iterative image reconstruction that uses neighborhood patches instead of individual pixels in computing the nonquadratic penalty. The new regularization is more robust than the conventional pixel-based regularization in differentiating sharp edges from random fluctuations due to noise. An optimization transfer algorithm is developed for the penalized maximum likelihood estimation. Each iteration of the algorithm can be implemented in three simple steps: an EM-like image update, an image smoothing, and a pixel-by-pixel image fusion. Computer simulations show that the proposed patch-based regularization can achieve higher contrast recovery for small objects without increasing background variation compared with the quadratic regularization. The reconstruction is also more robust to the hyper-parameter than conventional pixel-based nonquadratic regularizations. The proposed regularization method has been applied to real 3-D PET data.
A penalization technique to model plasma facing components in a tokamak with temperature variations
Paredes, A.; Bufferand, H.; Ciraolo, G.; Schwander, F.; Serre, E.; Ghendrih, P.; Tamain, P.
2014-10-01
To properly address turbulent transport in the edge plasma region of a tokamak, it is mandatory to describe the particle and heat outflow on wall components, using an accurate representation of the wall geometry. This is challenging for many plasma transport codes, which use a structured mesh with one coordinate aligned with magnetic surfaces. We propose here a penalization technique that allows modeling of particle and heat transport using such a structured mesh, while also accounting for geometrically complex plasma-facing components. Solid obstacles are considered as particle and momentum sinks, whereas ionic and electronic temperature gradients are imposed on both sides of the obstacles along the magnetic field direction using delta functions (Dirac). Solutions exhibit plasma velocities (M=1) and temperature fluxes at the plasma–wall boundaries that match the boundary conditions usually implemented in fluid codes. Grid convergence and error estimates are found to be in agreement with theoretical results obtained for neutral fluid conservation equations. The capability of the penalization technique is illustrated by introducing the non-collisional plasma region expected by the kinetic theory in the immediate vicinity of the interface, which is impossible when considering fluid boundary conditions. Axisymmetric numerical simulations show the efficiency of the method to investigate the large-scale transport at the plasma edge including the separatrix and in realistic complex geometries while keeping a simple structured grid.
Donnelly, Aoife; Misstear, Bruce; Broderick, Brian
2011-02-15
Background concentrations of nitrogen dioxide (NO(2)) are not constant but vary temporally and spatially. The current paper presents a powerful tool for the quantification of the effects of wind direction and wind speed on background NO(2) concentrations, particularly in cases where monitoring data are limited. In contrast to previous studies which applied similar methods to sites directly affected by local pollution sources, the current study focuses on background sites with the aim of improving methods for predicting background concentrations adopted in air quality modelling studies. The relationship between measured NO(2) concentration in air at three such sites in Ireland and locally measured wind direction has been quantified using nonparametric regression methods. The major aim was to analyse a method for quantifying the effects of local wind direction on background levels of NO(2) in Ireland. The method was expanded to include wind speed as an added predictor variable. A Gaussian kernel function is used in the analysis and circular statistics employed for the wind direction variable. Wind direction and wind speed were both found to have a statistically significant effect on background levels of NO(2) at all three sites. Frequently environmental impact assessments are based on short term baseline monitoring producing a limited dataset. The presented non-parametric regression methods, in contrast to the frequently used methods such as binning of the data, allow concentrations for missing data pairs to be estimated and distinction between spurious and true peaks in concentrations to be made. The methods were found to provide a realistic estimation of long term concentration variation with wind direction and speed, even for cases where the data set is limited. Accurate identification of the actual variation at each location and causative factors could be made, thus supporting the improved definition of background concentrations for use in air quality modelling
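A kernel regression on wind direction has to treat 350° and 10° as close neighbours. The sketch below is a Nadaraya-Watson estimator with a Gaussian kernel applied to the circular angular difference; it is an illustrative assumption about how such a smoother can be written, not the authors' exact estimator, and the function names, bandwidth, and sample values are hypothetical.

```python
import math


def angular_difference(a_deg, b_deg):
    """Smallest absolute difference between two directions, in degrees (0 to 180)."""
    diff = abs(a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)


def kernel_estimate(target_deg, directions_deg, concentrations, bandwidth_deg):
    """Gaussian-kernel weighted mean concentration at a target wind direction."""
    weights = [
        math.exp(-0.5 * (angular_difference(target_deg, d) / bandwidth_deg) ** 2)
        for d in directions_deg
    ]
    total = sum(weights)
    return sum(w * c for w, c in zip(weights, concentrations)) / total


# Hypothetical observations: wind direction (degrees) and NO2 concentration.
directions = [350.0, 10.0, 180.0]
no2 = [10.0, 20.0, 100.0]
estimate_at_north = kernel_estimate(0.0, directions, no2, bandwidth_deg=15.0)
```

Because the estimate is a smooth weighted mean, it can be evaluated at directions with no observations, which is the advantage over binning that the abstract points to.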
Linear regression in astronomy. II
NASA Technical Reports Server (NTRS)
Feigelson, Eric D.; Babu, Gutti J.
1992-01-01
A wide variety of least-squares linear regression procedures used in observational astronomy, particularly investigations of the cosmic distance scale, are presented and discussed. The classes of linear models considered are (1) unweighted regression lines, with bootstrap and jackknife resampling; (2) regression solutions when measurement error, in one or both variables, dominates the scatter; (3) methods to apply a calibration line to new data; (4) truncated regression models, which apply to flux-limited data sets; and (5) censored regression models, which apply when nondetections are present. For the calibration problem we develop two new procedures: a formula for the intercept offset between two parallel data sets, which propagates slope errors from one regression to the other; and a generalization of the Working-Hotelling confidence bands to nonstandard least-squares lines. They can provide improved error analysis for Faber-Jackson, Tully-Fisher, and similar cosmic distance scale relations.
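As a small illustration of item (1), an unweighted regression line with jackknife resampling, the sketch below computes an ordinary least-squares slope and its delete-one jackknife standard error. The helper names and data are assumptions for illustration; the paper's own procedures cover many more cases (measurement error, truncation, censoring).

```python
import math


def ols_slope(x, y):
    """Ordinary least-squares slope of y on x (x and y are lists)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def jackknife_slope_se(x, y):
    """Delete-one jackknife standard error of the OLS slope."""
    n = len(x)
    slopes = [ols_slope(x[:i] + x[i + 1:], y[:i] + y[i + 1:]) for i in range(n)]
    mean_slope = sum(slopes) / n
    variance = (n - 1) / n * sum((s - mean_slope) ** 2 for s in slopes)
    return math.sqrt(variance)


# Hypothetical data lying exactly on y = 2x + 1, so the jackknife error vanishes.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 5.0, 7.0, 9.0, 11.0]
slope = ols_slope(x, y)
slope_se = jackknife_slope_se(x, y)
```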
Quantile regression for climate data
NASA Astrophysics Data System (ADS)
Marasinghe, Dilhani Shalika
Quantile regression is a developing statistical tool used to explain the relationship between response and predictor variables. This thesis describes two climatological examples that use quantile regression. Our main goal is to estimate derivatives of a conditional mean and/or conditional quantile function. We introduce a method to handle autocorrelation in the framework of quantile regression and apply it to temperature data. We also explain some properties of tornado data, which are non-normally distributed. Even though quantile regression provides a more comprehensive view, when the residuals satisfy the normality and constant-variance assumptions, we prefer least squares regression for our temperature analysis. When the normality and constant-variance assumptions do not hold, quantile regression is a better candidate for estimating the derivative.
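Quantile regression replaces the squared-error loss of least squares with an asymmetric absolute loss, often called the check or pinball loss. The sketch below shows that minimizing this loss over a constant recovers a sample quantile; it is a generic illustration with hypothetical names and data, not the thesis's estimator for derivatives or autocorrelated data.

```python
def pinball_loss(residual, tau):
    """Check loss: weight tau on positive residuals, (1 - tau) on negative ones."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def sample_quantile_by_loss(values, tau):
    """Return the observed value that minimizes the total pinball loss."""
    return min(values, key=lambda c: sum(pinball_loss(v - c, tau) for v in values))


# Hypothetical sample with one extreme value; the median is insensitive to it.
sample = [1.0, 2.0, 3.0, 4.0, 100.0]
median = sample_quantile_by_loss(sample, tau=0.5)
```

With tau = 0.5 the loss is half the absolute error, so the minimizer is the median rather than the mean, which is why the method is robust to non-normal, heavy-tailed data.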
Fungible weights in logistic regression.
Jones, Jeff A; Waller, Niels G
2016-06-01
In this article we develop methods for assessing parameter sensitivity in logistic regression models. To set the stage for this work, we first review Waller's (2008) equations for computing fungible weights in linear regression. Next, we describe 2 methods for computing fungible weights in logistic regression. To demonstrate the utility of these methods, we compute fungible logistic regression weights using data from the Centers for Disease Control and Prevention's (2010) Youth Risk Behavior Surveillance Survey, and we illustrate how these alternate weights can be used to evaluate parameter sensitivity. To make our work accessible to the research community, we provide R code (R Core Team, 2015) that will generate both kinds of fungible logistic regression weights. (PsycINFO Database Record
NASA Astrophysics Data System (ADS)
Han, Hao; Zhang, Hao; Wei, Xinzhou; Moore, William; Liang, Zhengrong
2016-03-01
In this paper, we proposed a low-dose computed tomography (LdCT) image reconstruction method with the help of prior knowledge learning from previous high-quality or normal-dose CT (NdCT) scans. The well-established statistical penalized weighted least squares (PWLS) algorithm was adopted for image reconstruction, where the penalty term was formulated by a texture-based Gaussian Markov random field (gMRF) model. The NdCT scan was firstly segmented into different tissue types by a feature vector quantization (FVQ) approach. Then for each tissue type, a set of tissue-specific coefficients for the gMRF penalty was statistically learnt from the NdCT image via multiple-linear regression analysis. We also proposed a scheme to adaptively select the order of gMRF model for coefficients prediction. The tissue-specific gMRF patterns learnt from the NdCT image were finally used to form an adaptive MRF penalty for the PWLS reconstruction of LdCT image. The proposed texture-adaptive PWLS image reconstruction algorithm was shown to be more effective to preserve image textures than the conventional PWLS image reconstruction algorithm, and we further demonstrated the gain of high-order MRF modeling for texture-preserved LdCT PWLS image reconstruction.
[Qualification of persons taking part in psychiatric opinion-giving in a penal trial].
Zgryzek, K
1998-01-01
The introduction of a new Penal Code by Parliament makes it necessary to analyze particular legal solutions in the code in detail. The authors present an analysis of selected issues in the Penal Code concerning evidence from the opinions of psychiatric experts, particularly the professional qualifications of persons appointed by the court in a penal trial to assess the mental health of specific persons (a witness, a victim, the perpetrator). It was accepted that the only persons authorized to conduct a psychiatric examination in a penal trial are those with at least a first-degree specialization in psychiatry.
Precision Efficacy Analysis for Regression.
ERIC Educational Resources Information Center
Brooks, Gordon P.
When multiple linear regression is used to develop a prediction model, sample size must be large enough to ensure stable coefficients. If the derivation sample size is inadequate, the model may not predict well for future subjects. The precision efficacy analysis for regression (PEAR) method uses a cross-validity approach to select sample sizes…
NASA Astrophysics Data System (ADS)
Ozdemir, Adnan
2011-07-01
Summary: The purpose of this study is to produce a groundwater spring potential map of the Sultan Mountains in central Turkey, based on a logistic regression method within a Geographic Information System (GIS) environment. Using field surveys, the locations of the springs (440 springs) were determined in the study area. In this study, 17 spring-related factors were used in the analysis: geology, relative permeability, land use/land cover, precipitation, elevation, slope, aspect, total curvature, plan curvature, profile curvature, wetness index, stream power index, sediment transport capacity index, distance to drainage, distance to fault, drainage density, and fault density map. The coefficients of the predictor variables were estimated using binary logistic regression analysis and were used to calculate the groundwater spring potential for the entire study area. The accuracy of the final spring potential map was evaluated based on the observed springs. The accuracy of the model was evaluated by calculating the relative operating characteristics. The area value of the relative operating characteristic curve model was found to be 0.82. These results indicate that the model is a good estimator of the spring potential in the study area. The spring potential map shows that the areas of very low, low, moderate and high groundwater spring potential classes are 105.586 km² (28.99%), 74.271 km² (19.906%), 101.203 km² (27.14%), and 90.05 km² (24.671%), respectively. The interpretations of the potential map showed that stream power index, relative permeability of lithologies, geology, elevation, aspect, wetness index, plan curvature, and drainage density play major roles in spring occurrence and distribution in the Sultan Mountains. The logistic regression approach has not yet been used to delineate groundwater potential zones. In this study, the logistic regression method was used to locate potential zones for groundwater springs in the Sultan Mountains. The evolved model
NASA Astrophysics Data System (ADS)
Ozdemir, Adnan
2011-12-01
Summary: In this study, groundwater spring potential maps produced by three different methods, frequency ratio, weights of evidence, and logistic regression, were evaluated using validation data sets and compared to each other. Groundwater spring occurrence potential maps in the Sultan Mountains (Konya, Turkey) were constructed using the relationship between groundwater spring locations and their causative factors. Groundwater spring locations were identified in the study area from a topographic map. Different thematic maps of the study area, such as geology, topography, geomorphology, hydrology, and land use/cover, have been used to identify groundwater potential zones. Seventeen spring-related parameter layers of the entire study area were used to generate groundwater spring potential maps. These are geology (lithology), fault density, distance to fault, relative permeability of lithologies, elevation, slope aspect, slope steepness, curvature, plan curvature, profile curvature, topographic wetness index, stream power index, sediment transport capacity index, drainage density, distance to drainage, land use/cover, and precipitation. The predictive capability of each model was determined by the area under the relative operating characteristic curve. The areas under the curve for frequency ratio, weights of evidence and logistic regression methods were calculated as 0.903, 0.880, and 0.840, respectively. These results indicate that frequency ratio and weights of evidence models are relatively good estimators, whereas logistic regression is a relatively poor estimator of groundwater spring potential mapping in the study area. The frequency ratio model is simple; the process of input, calculation and output can be readily understood. The produced groundwater spring potential maps can serve planners and engineers in groundwater development plans and land-use planning.
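The models above are ranked by the area under the relative operating characteristic (ROC) curve. One way to compute that area, sketched here under the assumption that each map cell has a potential score and a known spring/no-spring label, is the pairwise-comparison form: the fraction of (spring, non-spring) pairs in which the spring cell receives the higher score. The function name and the scores are hypothetical.

```python
def roc_auc(scores_positive, scores_negative):
    """Area under the ROC curve as the probability that a positive outranks a negative.

    Ties count as half a win. Runs in O(P * N) time, which is fine for a sketch.
    """
    wins = 0.0
    for p in scores_positive:
        for q in scores_negative:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(scores_positive) * len(scores_negative))


# Hypothetical potential scores for cells with and without observed springs.
spring_scores = [0.9, 0.4]
non_spring_scores = [0.5, 0.1]
auc = roc_auc(spring_scores, non_spring_scores)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives a scale for reading the reported 0.903, 0.880, and 0.840.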
NASA Astrophysics Data System (ADS)
Hegazy, Maha A.; Lotfy, Hayam M.; Rezk, Mamdouh R.; Omran, Yasmin Rostom
2015-04-01
Smart and novel spectrophotometric and chemometric methods have been developed and validated for the simultaneous determination of a binary mixture of chloramphenicol (CPL) and dexamethasone sodium phosphate (DSP) in the presence of interfering substances without prior separation. The first method depends upon derivative subtraction coupled with constant multiplication. The second is the ratio difference method at optimum wavelengths, which were selected after applying the derivative transformation method via multiplying by a decoding spectrum in order to cancel the contribution of non-labeled interfering substances. The third method relies on partial least squares with regression model updating. The methods are so simple that they do not require any preliminary separation steps. Accuracy, precision and linearity ranges of these methods were determined. Moreover, specificity was assessed by analyzing synthetic mixtures of both drugs. The proposed methods were successfully applied for analysis of both drugs in their pharmaceutical formulation. The obtained results were statistically compared with those of an official spectrophotometric method, leading to the conclusion that there is no significant difference between the proposed methods and the official one with respect to accuracy and precision.
Eriksson, Lennart; Jaworska, Joanna; Worth, Andrew P; Cronin, Mark T D; McDowell, Robert M; Gramatica, Paola
2003-01-01
This article provides an overview of methods for reliability assessment of quantitative structure-activity relationship (QSAR) models in the context of regulatory acceptance of human health and environmental QSARs. Useful diagnostic tools and data analytical approaches are highlighted and exemplified. Particular emphasis is given to the question of how to define the applicability borders of a QSAR and how to estimate parameter and prediction uncertainty. The article ends with a discussion regarding QSAR acceptability criteria. This discussion contains a list of recommended acceptability criteria, and we give reference values for important QSAR performance statistics. Finally, we emphasize that rigorous and independent validation of QSARs is an essential step toward their regulatory acceptance and implementation. PMID:12896860
NASA Astrophysics Data System (ADS)
Adamowski, Jan; Fung Chan, Hiu; Prasher, Shiv O.; Ozga-Zielinski, Bogdan; Sliusarieva, Anna
2012-01-01
Daily water demand forecasts are an important component of cost-effective and sustainable management and optimization of urban water supply systems. In this study, a method based on coupling discrete wavelet transforms (WA) and artificial neural networks (ANNs) for urban water demand forecasting applications is proposed and tested. Multiple linear regression (MLR), multiple nonlinear regression (MNLR), autoregressive integrated moving average (ARIMA), ANN and WA-ANN models for urban water demand forecasting at lead times of one day for the summer months (May to August) were developed, and their relative performance was compared using the coefficient of determination, root mean square error, relative root mean square error, and efficiency index. The key variables used to develop and validate the models were daily total precipitation, daily maximum temperature, and daily water demand data from 2001 to 2009 in the city of Montreal, Canada. The WA-ANN models were found to provide more accurate urban water demand forecasts than the MLR, MNLR, ARIMA, and ANN models. The results of this study indicate that coupled wavelet-neural network models are a potentially promising new method of urban water demand forecasting that merit further study.
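In a coupled wavelet-neural network model, the demand series is first decomposed into approximation and detail sub-series, which are then fed to the network as inputs. The abstract does not name the wavelet used, so the sketch below assumes the Haar wavelet purely for illustration and shows a single decomposition level with its inverse; the function names and the sample series are hypothetical.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform (signal length must be even)."""
    scale = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Rebuild the original signal from one level of Haar coefficients."""
    scale = math.sqrt(2.0)
    signal = []
    for a, d in zip(approx, detail):
        signal.append((a + d) / scale)
        signal.append((a - d) / scale)
    return signal


# Hypothetical daily demand values.
demand = [4.0, 6.0, 10.0, 12.0]
approx, detail = haar_step(demand)
rebuilt = haar_inverse(approx, detail)
```

The approximation coefficients carry the slow trend and the detail coefficients the day-to-day fluctuation, so a forecasting model can weight the two components separately.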
NASA Astrophysics Data System (ADS)
Ozdemir, Adnan; Altural, Tolga
2013-03-01
This study evaluated and compared landslide susceptibility maps produced with three different methods, frequency ratio, weights of evidence, and logistic regression, by using validation datasets. The field surveys performed as part of this investigation mapped the locations of 90 landslides that had been identified in the Sultan Mountains of south-western Turkey. The landslide influence parameters used for this study are geology, relative permeability, land use/land cover, precipitation, elevation, slope, aspect, total curvature, plan curvature, profile curvature, wetness index, stream power index, sediment transportation capacity index, distance to drainage, distance to fault, drainage density, fault density, and spring density maps. The relationships between landslide distributions and these parameters were analysed using the three methods, and the results of these methods were then used to calculate the landslide susceptibility of the entire study area. The accuracy of the final landslide susceptibility maps was evaluated based on the landslides observed during the fieldwork, and the accuracy of the models was evaluated by calculating each model's relative operating characteristic curve. The predictive capability of each model was determined from the area under the relative operating characteristic curve and the areas under the curves obtained using the frequency ratio, logistic regression, and weights of evidence methods are 0.976, 0.952, and 0.937, respectively. These results indicate that the frequency ratio and weights of evidence models are relatively good estimators of landslide susceptibility in the study area. Specifically, the results of the correlation analysis show a high correlation between the frequency ratio and weights of evidence results, and the frequency ratio and logistic regression methods exhibit correlation coefficients of 0.771 and 0.727, respectively. The frequency ratio model is simple, and its input, calculation and output processes are
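The frequency ratio method is described as simple, and a short sketch makes that concrete. For each class of a factor map, the ratio is taken here in its usual form: the share of landslide cells falling in the class divided by the share of all cells in the class. The function name, class labels, and counts below are hypothetical and are not data from the study.

```python
def frequency_ratio(event_counts, cell_counts):
    """Frequency ratio per class: (share of events in class) / (share of area in class).

    event_counts maps class -> number of landslide cells in that class.
    cell_counts maps class -> total number of cells in that class.
    """
    total_events = sum(event_counts.values())
    total_cells = sum(cell_counts.values())
    ratios = {}
    for cls, cells in cell_counts.items():
        event_share = event_counts.get(cls, 0) / total_events
        area_share = cells / total_cells
        ratios[cls] = event_share / area_share
    return ratios


# Hypothetical slope classes: most landslides occur on a small steep area.
landslide_cells = {"steep": 8, "gentle": 2}
all_cells = {"steep": 20, "gentle": 80}
ratios = frequency_ratio(landslide_cells, all_cells)
```

A ratio above 1 marks a class where landslides are over-represented relative to its area; summing the ratios of the classes a cell belongs to across all factor maps is one common way to form a susceptibility index.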
Speech, language and hearing disorders in an adult penal institution.
Bountress, N; Richards, J
1979-08-01
It has been speculated that the prevalence of communicative disorders among prison inmates is considerably higher than that found in the general population, but research regarding inmate speech and hearing disorders is limited. This study investigated the nature and extent of communicative disorders in an inmate population of a medium-security penal institution in southeastern Virginia. The results of the screening indicated a slightly lower prevalence of stuttering, higher prevalences of articulation, voice, and hearing disorders, and more deficient receptive vocabulary skills than found in the general population. Some dialectal variations among black inmates are noted and the possible influence of linguistic-cultural interference on the results is discussed.
Naguib, Ibrahim A; Abdelaleem, Eglal A; Zaazaa, Hala E; Hussein, Essraa A
2016-07-01
Two multivariate chemometric models, namely, partial least-squares regression (PLSR) and linear support vector regression (SVR), are presented for the analysis of amoxicillin trihydrate and dicloxacillin sodium in the presence of their common impurity (6-aminopenicillanic acid) in raw materials and in pharmaceutical dosage form via handling UV spectral data and making a modest comparison between the two models, highlighting the advantages and limitations of each. For optimum analysis, a three-factor, four-level experimental design was established, resulting in a training set of 16 mixtures containing different ratios of interfering species. To validate the prediction ability of the suggested models, an independent test set consisting of eight mixtures was used. The presented results show the ability of the two proposed models to determine the two drugs simultaneously in the presence of small levels of the common impurity with high accuracy and selectivity. The analysis results of the dosage form were statistically compared to a reported HPLC method, with no significant difference regarding accuracy and precision, indicating the ability of the suggested multivariate calibration models to be reliable and suitable for routine analysis of the drug product. Compared to the PLSR model, the SVR model gives more accurate results with a lower prediction error, as well as high generalization ability; however, the PLSR model is easy to handle and fast to optimize. PMID:27305461
Vasiliu, Daniel; Clamons, Samuel; McDonough, Molly; Rabe, Brian; Saha, Margaret
2015-01-01
Global gene expression analysis using microarrays and, more recently, RNA-seq, has allowed investigators to understand biological processes at a system level. However, the identification of differentially expressed genes in experiments with small sample size, high dimensionality, and high variance remains challenging, limiting the usability of these tens of thousands of publicly available, and possibly many more unpublished, gene expression datasets. We propose a novel variable selection algorithm for ultra-low-n microarray studies using generalized linear model-based variable selection with a penalized binomial regression algorithm called penalized Euclidean distance (PED). Our method uses PED to build a classifier on the experimental data to rank genes by importance. In place of cross-validation, which is required by most similar methods but not reliable for experiments with small sample size, we use a simulation-based approach to additively build a list of differentially expressed genes from the rank-ordered list. Our simulation-based approach maintains a low false discovery rate while maximizing the number of differentially expressed genes identified, a feature critical for downstream pathway analysis. We apply our method to microarray data from an experiment perturbing the Notch signaling pathway in Xenopus laevis embryos. This dataset was chosen because it showed very little differential expression according to limma, a powerful and widely-used method for microarray analysis. Our method was able to detect a significant number of differentially expressed genes in this dataset and suggest future directions for investigation. Our method is easily adaptable for analysis of data from RNA-seq and other global expression experiments with low sample size and high dimensionality.
49 CFR 26.47 - Can recipients be penalized for failing to meet overall goals?
Code of Federal Regulations, 2010 CFR
2010-10-01
... overall goals? 26.47 Section 26.47 Transportation Office of the Secretary of Transportation PARTICIPATION... Goals, Good Faith Efforts, and Counting § 26.47 Can recipients be penalized for failing to meet overall goals? (a) You cannot be penalized, or treated by the Department as being in noncompliance with...
27 CFR 19.957 - Instructions to compute bond penal sum.
Code of Federal Regulations, 2010 CFR
2010-04-01
... 27 Alcohol, Tobacco Products and Firearms 1 2010-04-01 2010-04-01 false Instructions to compute bond penal sum. 19.957 Section 19.957 Alcohol, Tobacco Products and Firearms ALCOHOL AND TOBACCO TAX... Fuel Use Bonds § 19.957 Instructions to compute bond penal sum. (a) Medium plants. To find the...
Wild bootstrap for quantile regression.
Feng, Xingdong; He, Xuming; Hu, Jianhua
2011-12-01
The existing theory of the wild bootstrap has focused on linear estimators. In this note, we broaden its validity by providing a class of weight distributions that is asymptotically valid for quantile regression estimators. As most weight distributions in the literature lead to biased variance estimates for nonlinear estimators of linear regression, we propose a modification of the wild bootstrap that admits a broader class of weight distributions for quantile regression. A simulation study on median regression is carried out to compare various bootstrap methods. With a simple finite-sample correction, the wild bootstrap is shown to account for general forms of heteroscedasticity in a regression model with fixed design points. PMID:23049133
Kramer, S.
1996-12-31
In many real-world domains the task of machine learning algorithms is to learn a theory for predicting numerical values. In particular several standard test domains used in Inductive Logic Programming (ILP) are concerned with predicting numerical values from examples and relational and mostly non-determinate background knowledge. However, so far no ILP algorithm except one can predict numbers and cope with nondeterminate background knowledge. (The only exception is a covering algorithm called FORS.) In this paper we present Structural Regression Trees (SRT), a new algorithm which can be applied to the above class of problems. SRT integrates the statistical method of regression trees into ILP. It constructs a tree containing a literal (an atomic formula or its negation) or a conjunction of literals in each node, and assigns a numerical value to each leaf. SRT provides more comprehensible results than purely statistical methods, and can be applied to a class of problems most other ILP systems cannot handle. Experiments in several real-world domains demonstrate that the approach is competitive with existing methods, indicating that the advantages are not at the expense of predictive accuracy.
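The regression-tree component that SRT builds on chooses, at each node, the test that makes the numerical values in the resulting branches as homogeneous as possible. The sketch below shows that idea for a plain numeric attribute with a threshold test and a sum-of-squared-errors criterion. This is an assumption made for illustration: SRT itself places literals or conjunctions of literals in the nodes, and its exact selection criterion is not given in the abstract. All names and data are hypothetical.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(xs, ys):
    """Return (threshold, total_sse) for the best test of the form x <= threshold.

    Returns None when xs has fewer than two distinct values.
    """
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        score = sse(left) + sse(right)
        if best is None or score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical examples: the target jumps from 1 to 5 between x = 3 and x = 10.
xs = [1, 2, 3, 10, 11, 12]
ys = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]
split = best_split(xs, ys)
```

Each leaf would then predict the mean of the target values that reach it, which is the numerical value assigned to the leaf.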
Middle Miocene sandstone reservoirs of the Penal/Barrackpore field
Dyer, B.L.
1991-03-01
The Penal/Barrackpore field was discovered in 1938 and is located in the southern subbasin of onshore Trinidad. The accumulation is one of a series of northeast-southwest trending en echelon middle Miocene anticlinal structures that was later accentuated by late Pliocene transpressional folding. Relative movement of the South American and Caribbean plates climaxed in the middle Miocene compressive tectonic event and produced an imbricate pattern of southward-facing basement-involved thrusts. Further compressive interaction between the plates in the late Pliocene produced a transpressive tectonic episode forming northwest-southeast oriented transcurrent faults, tear faults, basement thrust faults, listric normal faults, and detached simple folds with infrequent diapiric cores. The middle Miocene Herrera and Karamat turbiditic sandstones are the primary reservoir rock in the subsurface anticline of the Penal/Barrackpore field. These turbidites were sourced from the north and deposited within the marls and clays of the Cipero Formation. Miocene and Pliocene deltaics and turbidites succeed the Cipero Formation vertically, lapping into preexisting Miocene highs. The late Pliocene transpression also coincides with the onset of oil migration along faults, diapirs, and unconformities from the Cretaceous Naparima Hill source. The Lengua Formation and the upper Forest clays are considered effective seals. Hydrocarbon trapping is structurally and stratigraphically controlled, with structure being the dominant trapping mechanism. Ultimate recoverable reserves for the field are estimated at 127.9 MMBo and 628.8 bcf. The field is presently owned and operated by the Trinidad and Tobago Oil Company Limited (TRINTOC).
Hwang, Jae Joon; Kim, Kee-Deog; Park, Hyok; Park, Chang Seo; Jeong, Ho-Gul
2014-01-01
Superimposition has been used in dentistry to evaluate the changes produced by orthodontic or orthopedic treatment. With the introduction of cone beam CT (CBCT), superimposition made it possible to evaluate three-dimensional (3D) changes after treatment. Four-point plane orientation is one of the simplest ways to superimpose 3D images. To find factors influencing the superimposition error of cephalometric landmarks under the four-point plane orientation method, and to evaluate the reproducibility of cephalometric landmarks for analyzing superimposition error, 20 patients were analyzed who had normal skeletal and occlusal relationships and underwent CBCT for diagnosis of temporomandibular disorder. The nasion, sella turcica, basion, and the midpoint between the left and right most posterior points of the lesser wing of the sphenoid bone were used to define a 3D anatomical reference coordinate system. Another 15 reference cephalometric points were also determined three times in the same image. The reorientation error of each landmark could be explained substantially (23%) by a linear regression model consisting of 3 factors that describe the position of each landmark relative to the reference axes and the locating error. The four-point plane orientation system may produce a reorientation error that varies with the perpendicular distance between the landmark and the x-axis; the reorientation error also increases as the locating error and the shift of the reference axes viewed from each landmark increase. Therefore, to reduce the reorientation error, the accuracy of all landmarks, including the reference points, is important. Construction of the regression model using reference points of greater precision is required for the clinical application of this model. PMID:25372707
Phase retrieval from noisy data based on minimization of penalized I-divergence.
Choi, Kerkil; Lanterman, Aaron D
2007-01-01
We study noise artifacts in phase retrieval based on minimization of an information-theoretic discrepancy measure called Csiszár's I-divergence. We specifically focus on adding Poisson noise to either the autocorrelation of the true image (as in astronomical imaging through turbulence) or the squared Fourier magnitudes of the true image (as in x-ray crystallography). Noise effects are quantified via various error metrics as signal-to-noise ratios vary. We propose penalized minimum I-divergence methods to suppress the observed noise artifacts. To avoid computational difficulties arising from the introduction of a penalty, we adapt Green's one-step-late approach for use in our minimum I-divergence framework. PMID:17164841
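For readers unfamiliar with the discrepancy measure named above, the following sketch evaluates Csiszár's I-divergence between two nonnegative sequences, I(p||q) = sum_i [p_i log(p_i / q_i) - p_i + q_i]. The function name and the sample values are hypothetical, and the penalty term and one-step-late iteration discussed in the abstract are not reproduced here.

```python
import math

def i_divergence(p, q):
    """Csiszar's I-divergence between nonnegative sequences p (data) and q (model).

    Terms with p_i == 0 contribute only q_i, using the convention 0 * log(0) = 0.
    Entries of q must be strictly positive wherever p_i > 0.
    """
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i > 0:
            total += p_i * math.log(p_i / q_i)
        total += q_i - p_i
    return total

# Hypothetical values: the divergence vanishes only when the sequences agree.
same = i_divergence([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
different = i_divergence([2.0], [1.0])
```

Unlike the Kullback-Leibler divergence between probability vectors, this form does not require the sequences to sum to one, which suits count data such as Poisson-distributed measurements.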
Building Regression Models: The Importance of Graphics.
ERIC Educational Resources Information Center
Dunn, Richard
1989-01-01
Points out reasons for using graphical methods to teach simple and multiple regression analysis. Argues that a graphically oriented approach has considerable pedagogic advantages in the exposition of simple and multiple regression. Shows that graphical methods may play a central role in the process of building regression models. (Author/LS)
Garcia-Magariños, Manuel; Antoniadis, Anestis; Cao, Ricardo; González-Manteiga, Wenceslao
2010-01-01
Statistical methods that generate sparse models are of great value in the gene expression field, where the number of covariates (genes) under study is in the thousands while sample sizes seldom reach a hundred individuals. For phenotype classification, we propose different lasso logistic regression approaches with specific penalizations for each gene. These methods are based on a generalized soft-threshold (GSoft) estimator. We also show that a recent algorithm for convex optimization, namely, the cyclic coordinate descent (CCD) algorithm, provides a way to solve the optimization problem significantly faster than other competing methods. Viewing GSoft as an iterative thresholding procedure allows us to obtain the asymptotic properties of the resulting estimates in a straightforward manner. Results are obtained for simulated and real data. The leukemia and colon datasets are commonly used to evaluate new statistical approaches, so they are useful for establishing comparisons with similar methods. Furthermore, biological meaning is extracted from the leukemia results and compared with previous studies. In summary, the approaches presented here give rise to sparse, interpretable models that are competitive with similar methods developed in the field.
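The two building blocks named in this abstract, soft-thresholding and cyclic coordinate descent, can be sketched for the plain least-squares lasso, minimising (1/2n)||y - Xb||^2 + lam * ||b||_1. This is a swapped-in, simpler setting chosen for illustration: it is not the authors' logistic GSoft estimator and uses a single penalty rather than gene-specific ones. All names and data are hypothetical.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam, returning exactly 0.0 when |z| <= lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_ccd(X, y, lam, n_sweeps=100):
    """Cyclic coordinate descent for the least-squares lasso (X as a list of rows)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)                      # residual y - X @ beta, with beta = 0 initially
    for _ in range(n_sweeps):
        for j in range(p):               # visit coordinates in a fixed cyclic order
            col = [row[j] for row in X]
            scale = sum(c * c for c in col) / n
            if scale == 0.0:
                continue
            # Correlation of column j with the residual that excludes coordinate j.
            rho = sum(c * (r + c * beta[j]) for c, r in zip(col, resid)) / n
            new = soft_threshold(rho, lam) / scale
            delta = new - beta[j]
            if delta != 0.0:
                resid = [r - c * delta for r, c in zip(resid, col)]
                beta[j] = new
    return beta

# Hypothetical data: the second covariate is weak and is set exactly to zero.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
y = [3.0, 0.1, 3.0, 0.1]
beta = lasso_ccd(X, y, lam=0.2)
```

Each coordinate update has a closed form, which is why the method scales to problems where covariates far outnumber samples.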
Kong, Changsu; Adeola, Olayiwola
2016-01-01
The present study was conducted to determine ileal digestible energy (IDE), metabolizable energy (ME), and nitrogen-corrected ME (MEn) contents of expeller- (EECM) and solvent-extracted canola meal (SECM) for broiler chickens using the regression method. Dietary treatments consisted of a corn-soybean meal reference diet and four assay diets prepared by supplementing the reference diet with each of canola meals (EECM or SECM) at 100 or 200 g/kg, respectively, to partly replace the energy yielding sources in the reference diet. Birds received a standard starter diet from day 0 to 14 and the assay diets from day 14 to 21. On day 14, a total of 240 birds were grouped into eight blocks by body weight and randomly allocated to five dietary treatments in each block with six birds per cage in a randomized complete block design. Excreta samples were collected from day 18 to 20 and ileal digesta were collected on day 21. The IDE, ME, and MEn (kcal/kg DM) of EECM or SECM were derived from the regression of EECM- or SECM-associated IDE, ME and MEn intake (Y, kcal) against the intake of EECM or SECM (X, kg DM), respectively. Regression equations of IDE, ME and MEn for the EECM-substituted diet were Y = -21.2 + 3035X (r(2) = 0.946), Y = -1.0 + 2807X (r(2) = 0.884) and Y = -2.0 + 2679X (r(2) = 0.902), respectively. The respective equations for the SECM diet were Y = 20.7 + 2881X (r(2) = 0.962), Y = 27.2 + 2077X (r(2) = 0.875) and Y = 24.7 + 2013X (r(2) = 0.901). The slope for IDE did not differ between the EECM and SECM whereas the slopes for ME and MEn were greater (P < 0.05) for the EECM than for the SECM. These results indicate that the EECM might be a superior energy source for broiler chickens compared with the SECM when both canola meals are used to reduce the cost of feeding. PMID:27350926
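The regression method described above reads the energy value of the test ingredient off the slope of a straight-line fit of ingredient-associated energy intake (Y, kcal) on ingredient intake (X, kg DM). The sketch below shows that calculation with an ordinary least-squares fit. The intake values are hypothetical and the responses are generated without noise from the reported EECM IDE equation, so the fit simply recovers its coefficients; it is not the study's data.

```python
def ols_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical canola-meal intakes (kg DM) and noise-free responses (kcal)
# generated from the reported equation Y = -21.2 + 3035X.
intake_kg = [0.00, 0.02, 0.04, 0.06, 0.08]
energy_kcal = [-21.2 + 3035 * v for v in intake_kg]
intercept, slope = ols_line(intake_kg, energy_kcal)
# slope is the estimated energy value in kcal per kg DM of the ingredient.
```

Comparing ingredients then amounts to comparing slopes, as the abstract does for EECM and SECM.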
Yuan, Haibo; Liu, Xiaowei; Xiang, Maosheng; Huang, Yang; Zhang, Huihua; Chen, Bingqiu
2015-02-01
In this paper we propose a spectroscopy-based stellar color regression (SCR) method to perform accurate color calibration for modern imaging surveys, taking advantage of millions of stellar spectra now available. The method is straightforward, insensitive to systematic errors in the spectroscopically determined stellar atmospheric parameters, applicable to regions that are effectively covered by spectroscopic surveys, and capable of delivering an accuracy of a few millimagnitudes for color calibration. As an illustration, we have applied the method to the Sloan Digital Sky Survey (SDSS) Stripe 82 data. With a total number of 23,759 spectroscopically targeted stars, we have mapped out the small but strongly correlated color zero-point errors present in the photometric catalog of Stripe 82, and we improve the color calibration by a factor of two to three. Our study also reveals some small but significant magnitude dependence errors in the z band for some charge-coupled devices (CCDs). Such errors are likely to be present in all the SDSS photometric data. Our results are compared with those from a completely independent test based on the intrinsic colors of red galaxies presented by Ivezić et al. The comparison, as well as other tests, shows that the SCR method has achieved a color calibration internally consistent at a level of about 5 mmag in u – g, 3 mmag in g – r, and 2 mmag in r – i and i – z. Given the power of the SCR method, we discuss briefly the potential benefits by applying the method to existing, ongoing, and upcoming imaging surveys.
Unitary Response Regression Models
ERIC Educational Resources Information Center
Lipovetsky, S.
2007-01-01
The dependent variable in a regular linear regression is a numerical variable, and in a logistic regression it is a binary or categorical variable. In these models the dependent variable has varying values. However, there are problems yielding an identity output of a constant value which can also be modelled in a linear or logistic regression with…
Tharrington, Arnold N.
2015-09-09
The NCCS Regression Test Harness is a software package that provides a framework to perform regression and acceptance testing on NCCS High Performance Computers. The package is written in Python and has only the dependency of a Subversion repository to store the regression tests.
Organic Act No. 3/1989 updating the Penal Code, 21 June 1989. [Selected provisions].
1989-01-01
Spain's Organic Act No. 3/1989 updating the Penal Code, June 21, 1989, prohibits public officials from soliciting sex from a person who, independently or through her spouse, spouse-like partner, ascendant, descendant, brother, or person related in the same degree, has business involving that official. A prison official who solicits sex from a person in his custody or any relative of that person, shall be punished with minor imprisonment. He who habitually and for any reason perpetrates physical violence against his spouse, child, or persons similarly situated to him, shall be punished with grand imprisonment. Free and express consent given by competent adults exempts parties from penal liability for organ transplants, sterilizations, or transsexual surgery. Sterilization of an incompetent person shall not be punished when the incompetent person suffers from serious mental deficiency and a judge has authorized the sterilization under specified conditions. A person commits the crime of rape, punishable with minor imprisonment, if he had carnal access to another person by vaginal, anal, or oral means using force or intimidation, or when the person lacks full faculties or is in an impaired condition, or when the person is under 12 of age, despite the absence of the above circumstances. Sexual aggression accomplished under the above circumstances shall be punished with minor imprisonment and a fine. If the aggression includes the use of objects or brutal, degrading, or harassing means, methods or instruments, the punishment shall be grand imprisonment. Whoever fails to fulfill support obligations shall be punished with grand imprisonment and a fine when there is malicious abandonment of the marital home or disorderly conduct; the Court may terminate parental authority. Those allowing the use of minors under age 16 in the practice of beggary shall be punished with grand imprisonment. Failure to perform parental duties or comply with court orders shall be punished.
Sampling and handling artifacts can bias filter-based measurements of particulate organic carbon (OC). Several measurement-based methods for OC artifact reduction and/or estimation are currently used in research-grade field studies. OC frequently is not artifact-corrected in larg...
Larson, Nicholas B; McDonnell, Shannon; Albright, Lisa Cannon; Teerlink, Craig; Stanford, Janet; Ostrander, Elaine A; Isaacs, William B; Xu, Jianfeng; Cooney, Kathleen A; Lange, Ethan; Schleutker, Johanna; Carpten, John D; Powell, Isaac; Bailey-Wilson, Joan; Cussenot, Olivier; Cancel-Tassin, Geraldine; Giles, Graham; MacInnis, Robert; Maier, Christiane; Whittemore, Alice S; Hsieh, Chih-Lin; Wiklund, Fredrik; Catalona, William J; Foulkes, William; Mandal, Diptasri; Eeles, Rosalind; Kote-Jarai, Zsofia; Ackerman, Michael J; Olson, Timothy M; Klein, Christopher J; Thibodeau, Stephen N; Schaid, Daniel J
2016-09-01
Rare variants (RVs) have been shown to be significant contributors to complex disease risk. By definition, these variants have very low minor allele frequencies and traditional single-marker methods for statistical analysis are underpowered for typical sequencing study sample sizes. Multimarker burden-type approaches attempt to identify aggregation of RVs across case-control status by analyzing relatively small partitions of the genome, such as genes. However, it is generally the case that the aggregative measure would be a mixture of causal and neutral variants, and these omnibus tests do not directly provide any indication of which RVs may be driving a given association. Recently, Bayesian variable selection approaches have been proposed to identify RV associations from a large set of RVs under consideration. Although these approaches have been shown to be powerful at detecting associations at the RV level, there are often computational limitations on the total quantity of RVs under consideration and compromises are necessary for large-scale application. Here, we propose a computationally efficient alternative formulation of this method using a probit regression approach specifically capable of simultaneously analyzing hundreds to thousands of RVs. We evaluate our approach to detect causal variation on simulated data and examine sensitivity and specificity in instances of high RV dimensionality as well as apply it to pathway-level RV analysis results from a prostate cancer (PC) risk case-control sequencing study. Finally, we discuss potential extensions and future directions of this work. PMID:27312771
Ehrsam, Eric; Kallini, Joseph R.; Lebas, Damien; Modiano, Philippe; Cotten, Hervé
2016-01-01
Fully regressive melanoma is a phenomenon in which the primary cutaneous melanoma becomes completely replaced by fibrotic components as a result of host immune response. Although 10 to 35 percent of cases of cutaneous melanomas may partially regress, fully regressive melanoma is very rare; only 47 cases have been reported in the literature to date. All of the cases of fully regressive melanoma reported in the literature were diagnosed in conjunction with metastasis in the patient. The authors describe a case of fully regressive melanoma without any metastases at the time of its diagnosis. Characteristic findings on dermoscopy, as well as the absence of melanoma on final biopsy, confirmed the diagnosis. PMID:27672418
Sazykina, I G
2008-01-01
This study examines the effect of the special contingents of the penal system on tuberculosis epidemiological indicators in a particular region. The findings indicate that, under the conditions existing in the Orenburgskaya oblast, the negative effect of the contingents of the sentence execution service of the Russian Federation on tuberculosis epidemiological indicators is rather strong and is significantly greater than the average values typical for the Russian Federation. This kind of epidemiological analysis would be useful to the organizational and methodological services of other regional tuberculosis dispensaries for developing well-grounded additional managerial decisions.
Statutory disclosure in article 280 of the Turkish Penal Code.
Büken, Erhan; Sahinoğlu, Serap; Büken, Nüket Ornek
2006-11-01
A new Turkish Penal Code came into effect on 1 June 2005. Article 280 concerns health care workers' failure to report a crime. This article removes the responsibility from health care workers to maintain confidentiality, but also removes patients' right to confidentiality. It provides for up to one year of imprisonment for a health care worker who, while on duty, finds an indication that a crime might have been committed by a patient and who does not inform the responsible authorities about it. This forces the health care worker to divulge the patient's confidential information. A patient who thinks he or she may be accused of a crime may therefore not seek medical help, which is the universal right of every person. The article is therefore contrary to medical ethics, oaths taken by physicians and nurses, and the understanding of patient confidentiality.
Penal harm medicine: state tort remedies for delaying and denying health care to prisoners.
Vaughn, M S
1999-01-01
In prison and jail subcultures, custodial personnel are committed to the penal harm movement, which seeks to inflict pain on prisoners. Conversely, correctional medical personnel are sworn to the Hippocratic Oath and are committed to alleviating prisoners' suffering. The Hippocratic Oath is violated when correctional medical workers adopt penal harm mandates and inflict pain on prisoners. By analyzing lawsuits filed by prisoners under state tort law, this article shows how the penal harm movement co-opts some correctional medical employees into abandoning their treatment and healing mission, thus causing denial or delay of medical treatment to prisoners.
[Some consequences of the application of the new Swiss penal code on legal psychiatry].
Gasser, Jacques; Gravier, Bruno
2007-09-19
The new text of the Swiss penal code, which entered into effect at the beginning of 2007, has many implications for the practice of psychiatrists who carry out expert assessments in the penal field or who are engaged in applying legal measures that impose treatment. The most notable consequences of this text are, on the one hand, a new definition of the concept of penal irresponsibility, which is no longer necessarily tied to a psychiatric diagnosis, and, on the other hand, a new definition of the legal constraints that the courts can impose to prevent new punishable acts, which appreciably modifies the place of psychiatrists in questions linking psychiatric care and social control.
[Between law and psychiatry: homosexuality in the project of the Swiss penal code (1918)].
Delessert, Thierry
2005-01-01
In 1942 the Swiss penal code decriminalised homosexual acts between consenting adults under some conditions. The genesis of the penal article shows that it was constructed before the First World War and bears the marks of the forensic theories of the turn of the century. Through both direct contacts and the authority of its eminent figures, Swiss psychiatry exerted an unquestionable influence on the decriminalisation. The conceptualisation of homosexuality was also strongly influenced by German psychiatric theories and discussed in reference to Germanic law. Through the penal article, Swiss lawyers and psychiatrists linked the homosexual question with the determination of the irresponsibility of criminal mental patients and with degeneracy.
Regression Analysis by Example. 5th Edition
ERIC Educational Resources Information Center
Chatterjee, Samprit; Hadi, Ali S.
2012-01-01
Regression analysis is a conceptually simple method for investigating relationships among variables. Carrying out a successful application of regression analysis, however, requires a balance of theoretical results, empirical rules, and subjective judgment. "Regression Analysis by Example, Fifth Edition" has been expanded and thoroughly…
NASA Astrophysics Data System (ADS)
Ochoa Gutierrez, L. H.; Vargas Jimenez, C. A.; Niño Vasquez, L. F.
2011-12-01
The "Sabana de Bogota" (Bogota Savannah) is the most important social and economic center of Colombia. Almost a third of the country's population is concentrated in this region, which generates about 40% of Colombia's gross domestic product (GDP). The zone is therefore highly vulnerable should a highly destructive seismic event occur. Historical evidence shows that high-magnitude events took place in the past, causing great damage to the city, and indicates that such events are likely to occur in the coming years. For this reason we are working on an early warning generation system that uses the first few seconds of a seismic signal recorded by three-component broadband seismometers. Such a system can be implemented using Computational Intelligence tools, designed and calibrated for the particular geological, structural, and environmental conditions of the region. The methods developed are expected to work in real time, so suitable software and electronic tools need to be developed. We used Support Vector Machine Regression (SVMR) methods trained and tested with historical seismic events recorded by the "EL ROSAL" station, located near Bogotá, calculating descriptors or attributes from the first 6 seconds of signal as the model input. With this algorithm, we obtained a mean absolute error of less than 10% and correlation coefficients greater than 85% in hypocentral distance and magnitude estimation. Given these results, we consider that the method can be improved to achieve better accuracy with less signal time, and that it can be a very useful model to implement directly in seismological stations to generate a fast characterization of the event, broadcasting not only the raw signal but also pre-processed information that can be very useful for accurate early warning generation.
An, Yongkai; Lu, Wenxi; Cheng, Weiguo
2015-08-01
This paper introduces a surrogate model to identify an optimal exploitation scheme, while the western Jilin province was selected as the study area. A numerical simulation model of groundwater flow was established first, and four exploitation wells were set in the Tongyu county and Qian Gorlos county respectively so as to supply water to Daan county. Second, the Latin Hypercube Sampling (LHS) method was used to collect data in the feasible region for input variables. A surrogate model of the numerical simulation model of groundwater flow was developed using the regression kriging method. An optimization model was established to search an optimal groundwater exploitation scheme using the minimum average drawdown of groundwater table and the minimum cost of groundwater exploitation as multi-objective functions. Finally, the surrogate model was invoked by the optimization model in the process of solving the optimization problem. Results show that the relative error and root mean square error of the groundwater table drawdown between the simulation model and the surrogate model for 10 validation samples are both lower than 5%, which is a high approximation accuracy. The contrast between the surrogate-based simulation optimization model and the conventional simulation optimization model for solving the same optimization problem, shows the former only needs 5.5 hours, and the latter needs 25 days. The above results indicate that the surrogate model developed in this study could not only considerably reduce the computational burden of the simulation optimization process, but also maintain high computational accuracy. This can thus provide an effective method for identifying an optimal groundwater exploitation scheme quickly and accurately. PMID:26264008
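The sampling step mentioned above can be illustrated with a small Latin Hypercube Sampling routine: each input range is cut into as many equal-width strata as there are samples, one point is drawn in every stratum, and the strata are paired at random across dimensions. The pumping-rate bounds below are hypothetical placeholders, not values from the study, and the regression kriging surrogate itself is not sketched here.

```python
import random

def latin_hypercube(n_samples, bounds, seed=0):
    """Draw n_samples points so every dimension has one point per stratum.

    bounds: list of (low, high) pairs, one per input variable.
    Returns a list of n_samples rows, each with len(bounds) values.
    """
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        # One uniform draw inside each of the n equal-width strata of [0, 1).
        unit = [(k + rng.random()) / n_samples for k in range(n_samples)]
        rng.shuffle(unit)                # random pairing of strata across dimensions
        columns.append([low + u * (high - low) for u in unit])
    return [list(row) for row in zip(*columns)]

# Hypothetical bounds for two well pumping rates.
designs = latin_hypercube(10, [(0.0, 1.0), (100.0, 200.0)])
```

Each row would then be run through the numerical simulation model to produce the training outputs for the surrogate.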
NASA Astrophysics Data System (ADS)
He, Yaqian; Bo, Yanchen; Chai, Leilei; Liu, Xiaolong; Li, Aihua
2016-08-01
Leaf Area Index (LAI) is an important parameter of vegetation structure. A number of moderate resolution LAI products have been produced to meet the urgent need for large scale vegetation monitoring. High resolution LAI reference maps are necessary to validate these LAI products. This study used a geostatistical regression (GR) method to estimate LAI reference maps by linking in situ LAI and Landsat TM/ETM+ and SPOT-HRV data over two cropland and two grassland sites. To explore the discrepancies arising from employing different vegetation indices (VIs) in estimating LAI reference maps, this study established GR models for different VIs, including the difference vegetation index (DVI), normalized difference vegetation index (NDVI), and ratio vegetation index (RVI). To further assess the performance of the GR model, the results from the GR and Reduced Major Axis (RMA) models were compared. The results show that the performance of the GR model varies between the cropland and grassland sites. At the cropland sites, the GR model based on DVI provides the best estimation, while at the grassland sites, the GR model based on DVI performs poorly. Compared to the RMA model, the GR model improves the accuracy of reference LAI maps in terms of root mean square error (RMSE) and bias.
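The three vegetation indices named in the abstract above have standard textbook definitions in terms of near-infrared (NIR) and red reflectance. The sketch below computes them for a single pixel; the function name and the sample reflectance values are illustrative assumptions, and the abstract does not state which band combinations or preprocessing the authors used.

```python
def vegetation_indices(nir, red):
    """Return DVI, NDVI and RVI for one pixel's NIR and red reflectance.

    Standard definitions:
      DVI  = NIR - Red
      NDVI = (NIR - Red) / (NIR + Red)
      RVI  = NIR / Red
    Raises ZeroDivisionError if red (or nir + red) is zero.
    """
    return {
        "DVI": nir - red,
        "NDVI": (nir - red) / (nir + red),
        "RVI": nir / red,
    }

# Illustrative reflectances for a vegetated pixel (not data from the study).
vi = vegetation_indices(nir=0.5, red=0.1)
```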
NASA Astrophysics Data System (ADS)
Li, Yusheng
2011-02-01
Iterative reconstruction algorithms have been widely used in PET and SPECT emission tomography. Accurate modeling of photon noise propagation is crucial for quantitative tomography applications. Iteration-based noise propagation methods have been developed for only a few algorithms that have explicit multiplicative update equations. And there are discrepancies between the iteration-based methods and Fessler's fixed-point method because of improper approximations. In this paper, we present a unified theoretical prediction of noise propagation for any penalized expectation maximization (EM) algorithm where the EM approach incorporates a penalty term. The proposed method does not require an explicit update equation. The update equation is assumed to be implicitly defined by a differential equation of a surrogate function. We derive the expressions using the implicit function theorem, Taylor series and the chain rule from vector calculus. We also derive the fixed-point expressions when iterative algorithms converge and show the consistency between the proposed method and the fixed-point method. These expressions are solely defined in terms of the partial derivatives of the surrogate function and the Fisher information matrices. We also apply the theoretical noise predictions for iterative reconstruction algorithms in emission tomography. Finally, we validate the theoretical predictions for MAP-EM and OSEM algorithms using Monte Carlo simulations with Jaszczak-like and XCAT phantoms, respectively.
Peres, Maria Fernanda Tourinho; Nery Filho, Antônio
2002-01-01
Psychiatric information and practice are closely related to the field of criminal law and call into question classical premises of penal law, such as responsibility and free will. We have analyzed the articles related to mental health in Brazilian penal laws, starting with the Código Criminal do Império do Brazil (Criminal Code of the Brazilian Empire) of 1830. Our objective is to describe the structuring of a legal status for the mentally ill in Brazil, as well as the model of penal intervention in the lives of those considered 'dangerous' and 'irresponsible'. To do so, we have analyzed not only specific articles of penal law, but also texts by specialized analysts. In addition, we have discussed the concepts that keep mentally ill criminals in a rather ambiguous situation, i.e. legal irresponsibility, potential aggressiveness and safety policies.
Recent changes in Criminal Procedure Code and Indian Penal Code relevant to medical profession.
Agarwal, Swapnil S; Kumar, Lavlesh; Mestri, S C
2010-02-01
Some sections of the Criminal Procedure Code and Indian Penal Code are directly binding on medical practitioners. With changing times, a few of them have been revised, and these changes are presented in this article.
Non-Asymptotic Oracle Inequalities for the High-Dimensional Cox Regression via Lasso
Kong, Shengchun; Nan, Bin
2013-01-01
We consider finite sample properties of the regularized high-dimensional Cox regression via lasso. Existing literature focuses on linear models or generalized linear models with Lipschitz loss functions, where the empirical risk functions are the summations of independent and identically distributed (iid) losses. The summands in the negative log partial likelihood function for censored survival data, however, are neither iid nor Lipschitz. We first approximate the negative log partial likelihood function by a sum of iid non-Lipschitz terms, then derive the non-asymptotic oracle inequalities for the lasso penalized Cox regression using pointwise arguments to tackle the difficulties caused by lacking iid Lipschitz losses. PMID:24516328
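To make the objective in the abstract above concrete, here is a minimal sketch of the lasso-penalized negative log partial likelihood, assuming no tied event times and a 1/n scaling of the likelihood term. The function name, the scaling and the toy data are assumptions for illustration and are not taken from the paper.

```python
import math

def lasso_cox_objective(beta, times, events, X, lam):
    """(1/n) * negative log partial likelihood + lam * ||beta||_1.

    times  : observed times (event or censoring)
    events : 1 if the event was observed, 0 if censored
    X      : list of covariate rows, one per subject
    """
    n = len(times)
    eta = [sum(b * x for b, x in zip(beta, row)) for row in X]
    nll = 0.0
    for i in range(n):
        if events[i]:
            # The risk set of subject i contains everyone still under
            # observation at times[i], so each summand involves other subjects.
            risk = sum(math.exp(eta[j]) for j in range(n) if times[j] >= times[i])
            nll -= eta[i] - math.log(risk)
    return nll / n + lam * sum(abs(b) for b in beta)

# Toy data (purely illustrative): three subjects, two covariates.
times = [1.0, 2.0, 3.0]
events = [1, 0, 1]
X = [[0.5, 1.0], [-1.0, 0.0], [2.0, 0.3]]
value_at_zero = lasso_cox_objective([0.0, 0.0], times, events, X, lam=0.1)
```

Because every event's term sums over its whole risk set, the summands share subjects and are not independent, which is the difficulty the abstract points to.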
Improved Regression Calibration
ERIC Educational Resources Information Center
Skrondal, Anders; Kuha, Jouni
2012-01-01
The likelihood for generalized linear models with covariate measurement error cannot in general be expressed in closed form, which makes maximum likelihood estimation taxing. A popular alternative is regression calibration which is computationally efficient at the cost of inconsistent estimation. We propose an improved regression calibration…
Prediction in Multiple Regression.
ERIC Educational Resources Information Center
Osborne, Jason W.
2000-01-01
Presents the concept of prediction via multiple regression (MR) and discusses the assumptions underlying multiple regression analyses. Also discusses shrinkage, cross-validation, and double cross-validation of prediction equations and describes how to calculate confidence intervals around individual predictions. (SLD)
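As a small illustration of shrinkage, one common textbook correction is the adjusted R², which deflates the sample R² according to the number of observations and predictors. Whether the article uses this particular formula is an assumption; the function name and the numbers below are illustrative only.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R-squared: 1 - (1 - R^2) * (n - 1) / (n - p - 1).

    r2 : R-squared from the fitted sample
    n  : number of observations
    p  : number of predictors (intercept not counted)
    """
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

# Illustrative values: R^2 = 0.50 from 21 cases and 4 predictors.
shrunken = adjusted_r2(0.50, n=21, p=4)
```

The gap between the sample R² and the adjusted value grows as the number of predictors approaches the number of cases, which is why cross-validation on fresh data is the more direct check.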
Gerber, Samuel; Rubel, Oliver; Bremer, Peer -Timo; Pascucci, Valerio; Whitaker, Ross T.
2012-01-19
This paper introduces a novel partition-based regression approach that incorporates topological information. Partition-based regression typically introduces a quality-of-fit-driven decomposition of the domain. The emphasis in this work is on a topologically meaningful segmentation. Thus, the proposed regression approach is based on a segmentation induced by a discrete approximation of the Morse–Smale complex. This yields a segmentation with partitions corresponding to regions of the function with a single minimum and maximum that are often well approximated by a linear model. This approach yields regression models that are amenable to interpretation and have good predictive capacity. Typically, regression estimates are quantified by their geometrical accuracy. For the proposed regression, an important aspect is the quality of the segmentation itself. Thus, this article introduces a new criterion that measures the topological accuracy of the estimate. The topological accuracy provides a complementary measure to the classical geometrical error measures and is very sensitive to overfitting. The Morse–Smale regression is compared to state-of-the-art approaches in terms of geometry and topology and yields comparable or improved fits in many cases. Finally, a detailed study on climate-simulation data demonstrates the application of the Morse–Smale regression. Supplementary Materials are available online and contain an implementation of the proposed approach in the R package msr, an analysis and simulations on the stability of the Morse–Smale complex approximation, and additional tables for the climate-simulation study.
2013-01-01
Background: Previous studies on informal patient payments have mostly focused on the magnitude and determinants of these payments while the attitudes of health care actors towards these payments are less well known. This study aims to reveal the attitudes of Hungarian health care consumers towards informal payments to provide a better understanding of this phenomenon. Methods: For the analysis, we use data from a survey carried out in 2010 in Hungary involving a representative sample of 1037 respondents. We use cluster analysis to identify the main attitude groups related to informal payments based on the respondents’ perception of and behavior related to informal payments. Multinomial logistic regression is applied to examine the differences between these groups in terms of socio-demographic characteristics, as well as past utilization and informal payments paid for health care services. Results: We identified three main different attitudes towards informal payments: accepting informal payments, doubting about informal payments and opposing informal payments. Those who accept informal payments (mostly young or elderly people, living in the capital) consider these payments as an expression of gratitude and perceive them as inevitable due to the low funding of the health care system. Those who doubt about informal payments (mostly respondents outside the capital, with higher education and higher household income) are not certain whether these payments are inevitable, perceive them as similar to corruption rather than gratitude, and would rather use private services to avoid these payments. We find that the opposition to informal payments (mostly among men from small households and low income households) can be explained by their lower ability and willingness to pay. Conclusions: A large share of Hungarian health care consumers has a rather positive attitude towards informal payments, perceiving them as “inevitable due to the low funding of the health care system
NASA Astrophysics Data System (ADS)
Llacer, Jorge; Solberg, Timothy D.; Promberger, Claus
2001-10-01
This paper presents a description of tests carried out to compare the behaviour of five algorithms in inverse radiation therapy planning: (1) The Dynamically Penalized Likelihood (DPL), an algorithm based on statistical estimation theory; (2) an accelerated version of the same algorithm; (3) a new fast adaptive simulated annealing (ASA) algorithm; (4) a conjugate gradient method; and (5) a Newton gradient method. A three-dimensional mathematical phantom and two clinical cases have been studied in detail. The phantom consisted of a U-shaped tumour with a partially enclosed 'spinal cord'. The clinical examples were a cavernous sinus meningioma and a prostate case. The algorithms have been tested in carefully selected and controlled conditions so as to ensure fairness in the assessment of results. It has been found that all five methods can yield relatively similar optimizations, except when a very demanding optimization is carried out. For the easier cases, the differences are principally in robustness, ease of use and optimization speed. In the more demanding case, there are significant differences in the resulting dose distributions. The accelerated DPL emerges as possibly the algorithm of choice for clinical practice. An appendix describes the differences in behaviour between the new ASA method and the one based on a patent by the Nomos Corporation.
Efficient Drug-Pathway Association Analysis via Integrative Penalized Matrix Decomposition.
Li, Cong; Yang, Can; Hather, Greg; Liu, Ray; Zhao, Hongyu
2016-01-01
Traditional drug discovery practice usually follows the "one drug - one target" approach, seeking to identify drug molecules that act on individual targets, which ignores the systemic nature of human diseases. Pathway-based drug discovery recently emerged as an appealing approach to overcome this limitation. An important first step of such pathway-based drug discovery is to identify associations between drug molecules and biological pathways. This task has been made feasible by the accumulating data from high-throughput transcription and drug sensitivity profiling. In this paper, we developed "iPaD", an integrative Penalized Matrix Decomposition method to identify drug-pathway associations through joint modeling of such high-throughput transcription and drug sensitivity data. A scalable bi-convex optimization algorithm was implemented and gave iPaD a tremendous advantage in computational efficiency over the current state-of-the-art method, which allows it to handle the ever-growing large-scale data sets that the current method cannot handle. On two widely used real data sets, iPaD also significantly outperformed the current method in terms of the number of validated drug-pathway associations that were identified. The Matlab code of our algorithm is publicly available at http://licong-jason.github.io/iPaD/. PMID:27295636
Regression problems for magnitudes
NASA Astrophysics Data System (ADS)
Castellaro, S.; Mulargia, F.; Kagan, Y. Y.
2006-06-01
Least-squares linear regression is so popular that it is sometimes applied without checking whether its basic requirements are satisfied. In particular, in studying earthquake phenomena, the conditions (a) that the uncertainty on the independent variable is at least one order of magnitude smaller than the one on the dependent variable, (b) that both data and uncertainties are normally distributed and (c) that residuals are constant are at times disregarded. This may easily lead to wrong results. As an alternative to least squares, when the ratio between errors on the independent and the dependent variable can be estimated, orthogonal regression can be applied. We test the performance of orthogonal regression in its general form against Gaussian and non-Gaussian data and error distributions and compare it with standard least-square regression. General orthogonal regression is found to be superior or equal to the standard least squares in all the cases investigated and its use is recommended. We also compare the performance of orthogonal regression versus standard regression when, as often happens in the literature, the ratio between errors on the independent and the dependent variables cannot be estimated and is arbitrarily set to 1. We apply these results to magnitude scale conversion, which is a common problem in seismology, with important implications in seismic hazard evaluation, and analyse it through specific tests. Our analysis concludes that the commonly used standard regression may induce systematic errors in magnitude conversion as high as 0.3-0.4, and, even more importantly, this can introduce apparent catalogue incompleteness, as well as a heavy bias in estimates of the slope of the frequency-magnitude distributions. All this can be avoided by using the general orthogonal regression in magnitude conversions.
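The general orthogonal regression discussed in the abstract above can be sketched with the classical closed-form errors-in-variables (Deming) estimator, which takes the ratio of error variances as an input. The parameterization below, with `delta` equal to the error variance of y divided by the error variance of x, and the sample magnitudes are assumptions for illustration; the paper's own notation may differ.

```python
import math

def general_orthogonal_regression(x, y, delta=1.0):
    """Fit y = intercept + slope * x when both variables carry error.

    delta = var(error in y) / var(error in x). delta = 1 gives ordinary
    orthogonal regression; a very large delta approaches least squares
    of y on x. Requires a non-zero sample covariance.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x) / (n - 1)
    syy = sum((b - my) ** 2 for b in y) / (n - 1)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    diff = syy - delta * sxx
    slope = (diff + math.sqrt(diff ** 2 + 4.0 * delta * sxy ** 2)) / (2.0 * sxy)
    return slope, my - slope * mx

# Hypothetical noise-free magnitude pairs lying on y = 2x + 1.
slope, intercept = general_orthogonal_regression([1, 2, 3, 4], [3, 5, 7, 9])
```

Setting `delta` to 1 when the true ratio is unknown corresponds to the arbitrary choice the abstract mentions.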
Geodesic least squares regression on information manifolds
Verdoolaege, Geert
2014-12-05
We present a novel regression method targeted at situations with significant uncertainty on both the dependent and independent variables or with non-Gaussian distribution models. Unlike the classic regression model, the conditional distribution of the response variable suggested by the data need not be the same as the modeled distribution. Instead they are matched by minimizing the Rao geodesic distance between them. This yields a more flexible regression method that is less constrained by the assumptions imposed through the regression model. As an example, we demonstrate the improved resistance of our method against some flawed model assumptions and we apply this to scaling laws in magnetic confinement fusion.
Tang Shaojie; Tang Xiangyang
2012-09-15
Purposes: The suppression of noise in x-ray computed tomography (CT) imaging is of clinical relevance for diagnostic image quality and the potential for radiation dose saving. Toward this purpose, statistical noise reduction methods in either the image or projection domain have been proposed, which employ a multiscale decomposition to enhance the performance of noise suppression while maintaining image sharpness. Recognizing the advantages of noise suppression in the projection domain, the authors propose a projection domain multiscale penalized weighted least squares (PWLS) method, in which the angular sampling rate is explicitly taken into consideration to account for the possible variation of interview sampling rate in advanced clinical or preclinical applications. Methods: The projection domain multiscale PWLS method is derived by converting an isotropic diffusion partial differential equation in the image domain into the projection domain, wherein a multiscale decomposition is carried out. With adoption of the Markov random field or soft thresholding objective function, the projection domain multiscale PWLS method deals with noise at each scale. To compensate for the degradation in image sharpness caused by the projection domain multiscale PWLS method, an edge enhancement is carried out following the noise reduction. The performance of the proposed method is experimentally evaluated and verified using the projection data simulated by computer and acquired by a CT scanner. Results: The preliminary results show that the proposed projection domain multiscale PWLS method outperforms the projection domain single-scale PWLS method and the image domain multiscale anisotropic diffusion method in noise reduction. In addition, the proposed method can preserve image sharpness very well while the occurrence of 'salt-and-pepper' noise and mosaic artifacts can be avoided. Conclusions: Since the interview sampling rate is taken into account in the projection domain
Conditional bias-penalized optimal estimation for QPE
NASA Astrophysics Data System (ADS)
Seo, D.
2011-12-01
Most precipitation estimation techniques employ some form of optimal estimation, which usually targets unbiasedness and minimum error variance. Because these properties generally hold only in the unconditional sense, the resulting estimates are subject to conditional biases that may be unacceptably large. A prime example is precipitation analysis using rain gauge data for which, e.g., kriging may significantly underestimate heavy precipitation and, albeit less consequentially, overestimate very light precipitation. In this presentation, we introduce an extremely simple extension to the widely used optimal estimation techniques of simple and ordinary kriging, referred to herein as conditional bias-penalized kriging (CBPK), which explicitly minimizes conditional bias in addition to unconditional error variance. To understand the properties and performance characteristics of CBPK, we carried out numerical experiments in which normal and lognormal random fields of varying spatial correlation scale and rain gauge network density are synthetically generated, and the estimates are cross-validated; the results are summarized in this presentation. Also presented are a generalization of CBPK in the framework of classical optimal linear estimation theory and a discussion of how it may be used in multisensor QPE.
Act No. 62, Penal Code, 29 December 1987.
1988-01-01
This document contains various provisions of the 1987 Cuban Penal Code. Chapter 6 of Title 8 (crimes against life and bodily integrity) outlaws abortion and sets prison terms for its performance under various circumstances. Chapter 7 sets a penalty of five to 12 years imprisonment for performing a sterilization procedure. Chapter 8 outlines the penalties for abandonment of minors and incompetent or helpless people. Under Title 9 (crimes against individual rights), Chapter 8 renders it illegal to discriminate on the grounds of sex, race, color, or national origin. Chapter 1 of Title 11 deals with crimes against the normal development of sexual relations, setting penalties for rape, pederasty with violence, and lascivious abuse. Chapter 2 covers crimes against the normal development of the family such as incest, sexual relations with a minor, bigamy, illegal marriage, and substitution of one child for another. Chapter 3 places penalties for crimes against the normal development of childhood and youth, such as the corruption of minors, the neglect of minors, and the failure to support minors. PMID:12289530
Laplace regression with censored data.
Bottai, Matteo; Zhang, Jiajia
2010-08-01
We consider a regression model where the error term is assumed to follow a type of asymmetric Laplace distribution. We explore its use in the estimation of conditional quantiles of a continuous outcome variable given a set of covariates in the presence of random censoring. Censoring may depend on covariates. Estimation of the regression coefficients is carried out by maximizing a non-differentiable likelihood function. In the scenarios considered in a simulation study, the Laplace estimator showed correct coverage and shorter computation time than the alternative methods considered, some of which occasionally failed to converge. We illustrate the use of Laplace regression with an application to survival time in patients with small cell lung cancer.
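The link between the asymmetric Laplace likelihood and quantile estimation can be seen in the uncensored, intercept-only case: maximizing that likelihood in the location parameter is equivalent to minimizing the asymmetric absolute ("check") loss. The sketch below shows only this simplified case; it ignores censoring and covariates, and the function names and data are illustrative assumptions.

```python
def check_loss(residual, p):
    """Asymmetric absolute loss rho_p(u) = u * (p - 1[u < 0])."""
    return residual * (p - (1.0 if residual < 0 else 0.0))

def best_constant(y, p):
    """Constant minimizing the total check loss; a minimizer always
    exists among the observed values, so only those are searched."""
    return min(y, key=lambda c: sum(check_loss(v - c, p) for v in y))

# Illustrative outcomes with one extreme value.
y = [1.0, 2.0, 3.0, 4.0, 100.0]
median_fit = best_constant(y, 0.5)   # p = 0.5 targets the median
upper_fit = best_constant(y, 0.9)    # p = 0.9 targets an upper quantile
```

The loss is not differentiable at zero residual, which matches the abstract's remark that the likelihood being maximized is non-differentiable.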
Revzin, Ella; Majumdar, Dibyen; Bassett, Gilbert W
2014-12-20
Tumor growth curves provide a simple way to understand how tumors change over time. The traditional approach to fitting such curves to empirical data has been to estimate conditional mean regression functions, which describe the average effect of covariates on growth. However, this method ignores the possibility that tumor growth dynamics are different for different quantiles of the possible distribution of growth patterns. Furthermore, typical individual preclinical cancer drug study designs have very small sample sizes and can have lower power to detect a statistically significant difference in tumor volume between treatment groups. In our work, we begin to address these issues by combining several independent small sample studies of an experimental cancer treatment with differing study designs to construct quantile tumor growth curves. For modeling, we use a Penalized Fixed Effects Quantile Regression with added study effects to control for study differences. We demonstrate this approach using data from a series of small sample studies that investigated the effect of a naturally derived biological peptide, P28, on tumor volumes in mice grafted with human melanoma cells. We find a statistically significant quantile treatment effect on tumor volume trajectories and baseline values. In particular, the experimental treatment and a corresponding conventional chemotherapy had different effects on tumor growth by quantile. The conventional treatment, Dacarbazine (DTIC), tended to inhibit growth for smaller quantiles, while the experimental treatment P28 produced slower rates of growth in the upper quantiles, especially in the 95th quantile. PMID:25231497
NASA Astrophysics Data System (ADS)
Amin, Mohd Zaki M.; Islam, Tanvir; Ishak, Asnor M.
2014-10-01
The authors have applied an automated regression-based statistical method, namely, the automated statistical downscaling (ASD) model, to downscale and project the precipitation climatology in an equatorial climate region (Peninsular Malaysia). Five precipitation indices are, principally, downscaled and projected: mean monthly values of precipitation (Mean), standard deviation (STD), 90th percentile of rain day amount, percentage of wet days (Wet-day), and maximum number of consecutive dry days (CDD). The predictors, National Centers for Environmental Prediction (NCEP) products, are taken from the daily series reanalysis data, while the global climate model (GCM) outputs are from the Hadley Centre Coupled Model, version 3 (HadCM3) in A2/B2 emission scenarios and the Third-Generation Coupled Global Climate Model (CGCM3) in the A2 emission scenario. Meanwhile, the predictand data are taken from the arithmetically averaged rain gauge information and used as baseline data for the evaluation. The results from the calibration and validation periods, which together span 40 years (1961-2000), reveal that the ASD model is capable of downscaling the precipitation with reasonable accuracy. Overall, during the validation period, the model simulations with the NCEP predictors produce mean monthly precipitation of 6.18-6.20 mm/day (root mean squared error 0.78 and 0.82 mm/day), interpolated, respectively, on HadCM3 and CGCM3 grids, in contrast to 6.00 mm/day as observation. Nevertheless, the model struggles to perform reasonably well during extreme precipitation and in summer, more specifically in generating the CDD and STD indices. The future projections of precipitation (2011-2099) show that there would be an increase in the precipitation amount and frequency in most months. Taking the 1961-2000 timeline as the base period, overall, the annual mean precipitation would indicate a surplus projection by nearly 14-18% under both GCM output cases (HadCM3 A2/B2 scenarios and
Dealing with Outliers: Robust, Resistant Regression
ERIC Educational Resources Information Center
Glasser, Leslie
2007-01-01
Least-squares linear regression is the best of statistics and it is the worst of statistics. The reasons for this paradoxical claim, arising from possible inapplicability of the method and the excessive influence of "outliers", are discussed and substitute regression methods based on median selection, which is both robust and resistant, are…
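One well-known regression method built on median selection is the Theil-Sen estimator, which takes the median of all pairwise slopes. Whether this is the specific method the article describes is an assumption; the sketch below, with an illustrative data set containing one gross outlier, only shows why median-based fits resist such points.

```python
from itertools import combinations
from statistics import median

def theil_sen(x, y):
    """Median of pairwise slopes, then median residual as the intercept."""
    slopes = [(y2 - y1) / (x2 - x1)
              for (x1, y1), (x2, y2) in combinations(zip(x, y), 2)
              if x2 != x1]  # skip pairs with identical x
    slope = median(slopes)
    intercept = median(yi - slope * xi for xi, yi in zip(x, y))
    return slope, intercept

# Five points on y = 2x + 1 plus one gross outlier (illustrative data).
x = [0, 1, 2, 3, 4, 5]
y = [1, 3, 5, 7, 9, 100]
slope, intercept = theil_sen(x, y)
```

A least-squares fit to the same data would be pulled strongly toward the outlier, whereas the median of the slopes is unchanged by it.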
Penalized maximum-likelihood sinogram restoration for dual focal spot computed tomography.
Forthmann, P; Köhler, T; Begemann, P G C; Defrise, M
2007-08-01
Due to various system non-idealities, the raw data generated by a computed tomography (CT) machine are not readily usable for reconstruction. Although the deterministic nature of corruption effects such as crosstalk and afterglow permits correction by deconvolution, there is a drawback because deconvolution usually amplifies noise. Methods that perform raw data correction combined with noise suppression are commonly termed sinogram restoration methods. The need for sinogram restoration arises, for example, when photon counts are low and non-statistical reconstruction algorithms such as filtered backprojection are used. Many modern CT machines offer a dual focal spot (DFS) mode, which serves the goal of increased radial sampling by alternating the focal spot between two positions on the anode plate during the scan. Although the focal spot mode does not play a role with respect to how the data are affected by the above-mentioned corruption effects, it needs to be taken into account if regularized sinogram restoration is to be applied to the data. This work points out the subtle difference in processing that sinogram restoration for DFS requires, how it is correctly employed within the penalized maximum-likelihood sinogram restoration algorithm and what impact it has on image quality.
Technology Transfer Automated Retrieval System (TEKTRAN)
In precision agriculture, regression has been used widely to quantify the relationship between soil attributes and other environmental variables. However, spatial correlation existing in soil samples usually makes the regression model suboptimal. In this study, a regression-kriging method was attemp...
Quantile Regression With Measurement Error
Wei, Ying; Carroll, Raymond J.
2010-01-01
Regression quantiles can be substantially biased when the covariates are measured with error. In this paper we propose a new method that produces consistent linear quantile estimation in the presence of covariate measurement error. The method corrects the measurement error induced bias by constructing joint estimating equations that simultaneously hold for all the quantile levels. An iterative EM-type estimation algorithm to obtain the solutions to such joint estimation equations is provided. The finite sample performance of the proposed method is investigated in a simulation study, and compared to the standard regression calibration approach. Finally, we apply our methodology to part of the National Collaborative Perinatal Project growth data, a longitudinal study with an unusual measurement error structure. PMID:20305802
Hybrid fuzzy regression with trapezoidal fuzzy data
NASA Astrophysics Data System (ADS)
Razzaghnia, T.; Danesh, S.; Maleki, A.
2011-12-01
This research presents a method for hybrid fuzzy least-squares regression. The extension of symmetric triangular fuzzy coefficients to asymmetric trapezoidal fuzzy coefficients is considered as an effective measure for removing unnecessary fuzziness from the linear fuzzy model. First, a trapezoidal fuzzy variable is applied to derive a bivariate regression model. Next, normal equations are formulated to solve for the four parts of the hybrid regression coefficients. The model is also extended to multiple regression analysis. Finally, the method is compared with Y-H.O. Chang's model.
Gang, Grace J.; Stayman, J. Webster; Zbijewski, Wojciech; Siewerdsen, Jeffrey H.
2014-08-15
Purpose: Nonstationarity is an important aspect of imaging performance in CT and cone-beam CT (CBCT), especially for systems employing iterative reconstruction. This work presents a theoretical framework for both filtered-backprojection (FBP) and penalized-likelihood (PL) reconstruction that includes explicit descriptions of nonstationary noise, spatial resolution, and task-based detectability index. Potential utility of the model was demonstrated in the optimal selection of regularization parameters in PL reconstruction. Methods: Analytical models for local modulation transfer function (MTF) and noise-power spectrum (NPS) were investigated for both FBP and PL reconstruction, including explicit dependence on the object and spatial location. For FBP, a cascaded systems analysis framework was adapted to account for nonstationarity by separately calculating fluence and system gains for each ray passing through any given voxel. For PL, the point-spread function and covariance were derived using the implicit function theorem and first-order Taylor expansion according to Fessler [“Mean and variance of implicitly defined biased estimators (such as penalized maximum likelihood): Applications to tomography,” IEEE Trans. Image Process. 5(3), 493–506 (1996)]. Detectability index was calculated for a variety of simple tasks. The model for PL was used in selecting the regularization strength parameter to optimize task-based performance, with both a constant and a spatially varying regularization map. Results: Theoretical models of FBP and PL were validated in 2D simulated fan-beam data and found to yield accurate predictions of local MTF and NPS as a function of the object and the spatial location. The NPS for both FBP and PL exhibit similar anisotropic nature depending on the pathlength (and therefore, the object and spatial location within the object) traversed by each ray, with the PL NPS experiencing greater smoothing along directions with higher noise. The MTF of FBP
[Penal deterrents and benefits of clemency: marginal notes on research].
Tartaglione, G
1978-01-01
The article refers to a volume recently issued by the Criminology Department of the National Centre for Prevention and Social Protection, which illustrates the results of some research carried out in order to verify the effects, if any, of the application of the benefits of amnesty and free pardon (not infrequent in the history of Italy) and of the application of the benefits of mercy, on recidivism. Although the research is focused on the formal figures of sentences of punishment, an attempt has been made to reach indicative conclusions based on the behaviour of those to whom these benefits were granted, in order to discover whether their application has reinforced or weakened the crimino-resistance of the subjects. The indications found in the examination lead to the conclusion that those who are predisposed to a certain type of delinquency, greater or lesser (for example, towards crimes against property, especially if recidivists), continued to commit crimes at the same rate, or even in some cases at a greater rate, while those who may have fallen only rarely into crime (particularly women) tended to relapse less into crime. This is the case also with pardon, although in this case the benefit is individualized and conceded generally as a consequence of favourable prognostic evaluations. It is interesting to note that each time a general measure of amnesty or pardon is issued, the percentages of criminality are increased. The article brings these results to the attention of the reader, as material for reflection on the efficacy of reinforcement of the measures of pardon, with respect to the penal system.
[Extramural research funds and penal law--status of legislation].
Ulsenheimer, Klaus
2005-04-01
After decades of smooth functioning, the cooperation of physicians and hospitals with industry (much desired by the government in the interest of clinical research) has fallen into legal discredit due to increasingly frequent criminal inquiries and proceedings for undue privileges, corruption, and embezzlement. The discredit is so severe that industry funding for clinical research is increasingly diverted abroad. The legal elements of embezzlement assume the intentional misuse of entrusted funds against the interest of the customer. Undue privileges occur when an official requests an advantage in exchange for a service (or is promised one or takes one) in his or somebody else's interest. The elements of corruption are given when the receiver of the undue privilege provides an illegal service or takes a discretionary decision under the influence of the gratuity. The tension between the prohibition of undue privileges (as regulated by the penal law) and the granting of extramural funds (as regulated by the administrative law in academic institutions) can be reduced through a high degree of transparency and the introduction of control possibilities (public announcement and authorization by the officials), as well as through exact documentation and observance of the principles of separation of interests and moderation. With the anti-corruption law of 1997, it is also possible to charge physicians employed in private institutions with corruption. In contrast, physicians in private practice are not covered by the above criminal offences. They can only be charged with a misdemeanor, or called to answer to the professional board, on the basis of the law that regulates advertising for medicinal products (Heilmittelwerbegesetz).
Penalized differential pathway analysis of integrative oncogenomics studies.
van Wieringen, Wessel N; van de Wiel, Mark A
2014-04-01
Through integration of genomic data from multiple sources, we may obtain a more accurate and complete picture of the molecular mechanisms underlying tumorigenesis. We discuss the integration of DNA copy number and mRNA gene expression data from an observational integrative genomics study involving cancer patients. The two molecular levels involved are linked through the central dogma of molecular biology. DNA copy number aberrations abound in the cancer cell. Here we investigate how these aberrations affect gene expression levels within a pathway using observational integrative genomics data of cancer patients. In particular, we aim to identify differential edges between regulatory networks of two groups involving these molecular levels. Motivated by the rate equations, the regulatory mechanism between DNA copy number aberrations and gene expression levels within a pathway is modeled by a simultaneous-equations model, for the one- and two-group case. The latter facilitates the identification of differential interactions between the two groups. Model parameters are estimated by penalized least squares using the lasso (L1) penalty to obtain a sparse pathway topology. Simulations show that the inclusion of DNA copy number data benefits the discovery of gene-gene interactions. In addition, the simulations reveal that cis-effects tend to be over-estimated in a univariate (single gene) analysis. In the application to real data from integrative oncogenomic studies we show that inclusion of prior information on the regulatory network architecture benefits the reproducibility of all edges. Furthermore, analyses of the TP53 and TGFb signaling pathways between ER+ and ER- samples from an integrative genomics breast cancer study identify reproducible differential regulatory patterns that corroborate with existing literature.
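As an illustration of why an L1 penalty yields a sparse topology, the sketch below shows the soft-thresholding operator that solves the one-dimensional lasso problem, minimizing 0.5*(z - b)^2 + lam*|b| over b. It is a generic textbook building block, not the estimation procedure of the paper; the function name and the example values are assumptions chosen for illustration.

```python
def soft_threshold(z, lam):
    """Minimizer of 0.5*(z - b)**2 + lam*abs(b) over b (lam >= 0).

    Inputs with |z| <= lam are set exactly to zero, which is the
    mechanism by which the lasso removes weak edges from a model.
    """
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


# Hypothetical unpenalized edge weights, shrunk with lam = 1.0.
raw_weights = [3.0, -0.5, 0.2, -3.0]
sparse_weights = [soft_threshold(w, 1.0) for w in raw_weights]
```

Weights whose magnitude does not exceed the penalty are dropped entirely, while larger ones are shrunk toward zero by the penalty amount.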
Tarpey, Thaddeus; Petkova, Eva
2010-07-01
Finite mixture models have come to play a very prominent role in modelling data. The finite mixture model is predicated on the assumption that distinct latent groups exist in the population. The finite mixture model therefore is based on a categorical latent variable that distinguishes the different groups. Often in practice distinct sub-populations do not actually exist. For example, disease severity (e.g. depression) may vary continuously and therefore, a distinction of diseased and not-diseased may not be based on the existence of distinct sub-populations. Thus, what is needed is a generalization of the finite mixture's discrete latent predictor to a continuous latent predictor. We cast the finite mixture model as a regression model with a latent Bernoulli predictor. A latent regression model is proposed by replacing the discrete Bernoulli predictor by a continuous latent predictor with a beta distribution. Motivation for the latent regression model arises from applications where distinct latent classes do not exist, but instead individuals vary according to a continuous latent variable. The shapes of the beta density are very flexible and can approximate the discrete Bernoulli distribution. Examples and a simulation are provided to illustrate the latent regression model. In particular, the latent regression model is used to model placebo effect among drug treated subjects in a depression study. PMID:20625443
[Understanding logistic regression].
El Sanharawi, M; Naudet, F
2013-10-01
Logistic regression is one of the most common multivariate analysis models utilized in epidemiology. It allows the measurement of the association between the occurrence of an event (qualitative dependent variable) and factors susceptible to influence it (explicative variables). The choice of explicative variables that should be included in the logistic regression model is based on prior knowledge of the disease physiopathology and the statistical association between the variable and the event, as measured by the odds ratio. The main steps for the procedure, the conditions of application, and the essential tools for its interpretation are discussed concisely. We also discuss the importance of the choice of variables that must be included and retained in the regression model in order to avoid the omission of important confounding factors. Finally, by way of illustration, we provide an example from the literature, which should help the reader test his or her knowledge.
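The link between the odds ratio and a logistic regression coefficient can be made concrete with a small sketch. For a single binary explanatory variable, the fitted coefficient equals the difference in log-odds between the two groups, so its exponential is the odds ratio of the 2x2 table. The counts below are hypothetical and the function names are assumptions for illustration only.

```python
import math


def odds_ratio(a, b, c, d):
    """Odds ratio of a 2x2 table.

    a: exposed with event,    b: exposed without event
    c: unexposed with event,  d: unexposed without event
    """
    return (a * d) / (b * c)


def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


# Hypothetical counts, not taken from any study.
a, b, c, d = 30, 70, 10, 90
table_or = odds_ratio(a, b, c, d)

# Coefficient of a single binary predictor in a logistic model:
# the difference in log-odds between exposed and unexposed groups.
beta = logit(a / (a + b)) - logit(c / (c + d))
```

With several explanatory variables the coefficients are estimated jointly and each exponentiated coefficient is interpreted as an adjusted odds ratio.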
A scalable projective scaling algorithm for l(p) loss with convex penalizations.
Zhou, Hongbo; Cheng, Qiang
2015-02-01
This paper presents an accurate, efficient, and scalable algorithm for minimizing a special family of convex functions, which have an lp loss function as an additive component. For this problem, well-known learning algorithms often have well-established results on accuracy and efficiency, but there are rarely any reports of explicit linear scalability with respect to the problem size. The proposed approach starts by developing a second-order learning procedure with iterative descent for general convex penalization functions, and then builds efficient algorithms for a restricted family of functions that satisfy Karmarkar's projective scaling condition. Under this condition, a lightweight, scalable message passing algorithm (MPA) is further developed by constructing a series of simpler equivalent problems. The proposed MPA is intrinsically scalable because it only involves matrix-vector multiplication and avoids matrix inversion operations. The MPA is proven to be globally convergent for convex formulations; for nonconvex situations, it converges to a stationary point. The accuracy, efficiency, scalability, and applicability of the proposed method are verified through extensive experiments on sparse signal recovery, face image classification, and over-complete dictionary learning problems. PMID:25608289
NASA Astrophysics Data System (ADS)
Yang, Li; Zhou, Jian; Ferrero, Andrea; Badawi, Ramsey D.; Qi, Jinyi
2014-01-01
Detecting cancerous lesions is a major clinical application in emission tomography. In previous work, we have studied penalized maximum-likelihood (PML) image reconstruction for the detection task and proposed a method to design a shift-invariant quadratic penalty function to maximize detectability of a lesion at a known location in a two dimensional image. Here we extend the regularization design to maximize detectability of lesions at unknown locations in fully 3D PET. We used a multiview channelized Hotelling observer (mvCHO) to assess the lesion detectability in 3D images to mimic the condition where a human observer examines three orthogonal views of a 3D image for lesion detection. We derived simplified theoretical expressions that allow fast prediction of the detectability of a 3D lesion. The theoretical results were used to design the regularization in PML reconstruction to improve lesion detectability. We conducted computer-based Monte Carlo simulations to compare the optimized penalty with the conventional penalty for detecting lesions of various sizes. Only true coincidence events were simulated. Lesion detectability was also assessed by two human observers, whose performances agree well with that of the mvCHO. Both the numerical observer and human observer results showed a statistically significant improvement in lesion detection by using the proposed penalty function compared to using the conventional penalty function.
An example of neutronic penalizations in reactivity transient analysis using 3D coupled chain HEMERA
Dubois, F.; Normand, B.; Sargeni, A.
2012-07-01
HEMERA (Highly Evolutionary Methods for Extensive Reactor Analyses) is a fully coupled 3D computational chain developed jointly by IRSN and CEA. It is composed of CRONOS2 (core neutronics, cross sections library from APOLLO2), FLICA4 (core thermal-hydraulics) and the system code CATHARE. Multi-level and multi-dimensional models are developed to account for neutronics, core thermal-hydraulics, fuel thermal analysis and system thermal-hydraulics, dedicated to best-estimate, conservative simulations and sensitivity analysis. At IRSN, the HEMERA chain is widely used to study several types of reactivity accidents and for sensitivity studies. As an example of the HEMERA possibilities, we present here two types of neutronic penalizations and their impact on a power transient due to a REA (Rod Ejection Accident): in the first one, we studied a burn-up distribution modification and in the second one, a delayed-neutron fraction modification. Both modifications are applied to the whole core or localized in a few assemblies. Results show that it is possible to use global or local changes but 1) in the case of burn-up modification, the total core power can increase when the assembly peak power decreases, so care has to be taken if the goal is to maximize a local power peak and 2) for the delayed-neutron fraction, a local modification can have the same effect as one on the whole core, provided that it is large enough. (authors)
Ellimoottil, Chandy; Ryan, Andrew M; Hou, Hechuan; Dupree, James; Hallstrom, Brian; Miller, David C
2016-09-01
In an effort to reduce episode payment variation for joint replacement at US hospitals, the Centers for Medicare and Medicaid Services (CMS) recently implemented the Comprehensive Care for Joint Replacement bundled payment program. Some stakeholders are concerned that the program may unintentionally penalize hospitals because it lacks a mechanism (such as risk adjustment) to sufficiently account for patients' medical complexity. Using Medicare claims for patients in Michigan who underwent lower extremity joint replacement in the period 2011-13, we applied payment methods analogous to those CMS intends to use in determining annual bonuses or penalties (reconciliation payments) to hospitals. We calculated the net difference in reconciliation payments with and without risk adjustment. We found that reconciliation payments were reduced by $827 per episode for each standard-deviation increase in a hospital's patient complexity. Moreover, we found that risk adjustment could increase reconciliation payments to some hospitals by as much as $114,184 annually. Our findings suggest that CMS should include risk adjustment in the Comprehensive Care for Joint Replacement program and in future bundled payment programs. PMID:27605647
Practical Session: Logistic Regression
NASA Astrophysics Data System (ADS)
Clausel, M.; Grégoire, G.
2014-12-01
An exercise is proposed to illustrate logistic regression. One investigates the different risk factors in the onset of coronary heart disease. It has been proposed in Chapter 5 of the book of D.G. Kleinbaum and M. Klein, "Logistic Regression", Statistics for Biology and Health, Springer Science Business Media, LLC (2010) and also by D. Chessel and A.B. Dufour in Lyon 1 (see Sect. 6 of http://pbil.univ-lyon1.fr/R/pdf/tdr341.pdf). This example is based on data given in the file evans.txt coming from http://www.sph.emory.edu/dkleinb/logreg3.htm#data.
From moral theory to penal attitudes and back: a theoretically integrated modeling approach.
de Keijser, Jan W; van der Leeden, Rien; Jackson, Janet L
2002-01-01
From a moral standpoint, we would expect the practice of punishment to reflect a solid and commonly shared legitimizing framework. Several moral legal theories explicitly aim to provide such frameworks. Based on the theories of Retributivism, Utilitarianism, and Restorative Justice, this article first sets out to develop a theoretically integrated model of penal attitudes and then explores the extent to which Dutch judges' attitudes to punishment fit the model. Results indicate that penal attitudes can be measured in a meaningful way that is consistent with an integrated approach to moral theory. The general structure of penal attitudes among Dutch judges suggests a streamlined and pragmatic approach to legal punishment that is identifiably founded on the separate concepts central to moral theories of punishment. While Restorative Justice is frequently presented as an alternative paradigm, results show it to be smoothly incorporated within the streamlined approach.
Explorations in Statistics: Regression
ERIC Educational Resources Information Center
Curran-Everett, Douglas
2011-01-01
Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This seventh installment of "Explorations in Statistics" explores regression, a technique that estimates the nature of the relationship between two things for which we may only surmise a mechanistic or predictive connection.…
Modern Regression Discontinuity Analysis
ERIC Educational Resources Information Center
Bloom, Howard S.
2012-01-01
This article provides a detailed discussion of the theory and practice of modern regression discontinuity (RD) analysis for estimating the effects of interventions or treatments. Part 1 briefly chronicles the history of RD analysis and summarizes its past applications. Part 2 explains how in theory an RD analysis can identify an average effect of…
Multiple linear regression analysis
NASA Technical Reports Server (NTRS)
Edwards, T. R.
1980-01-01
Program rapidly selects best-suited set of coefficients. User supplies only vectors of independent and dependent data and specifies confidence level required. Program uses stepwise statistical procedure for relating minimal set of variables to set of observations; final regression contains only most statistically significant coefficients. Program is written in FORTRAN IV for batch execution and has been implemented on NOVA 1200.
Mechanisms of neuroblastoma regression
Brodeur, Garrett M.; Bagatell, Rochelle
2014-01-01
Recent genomic and biological studies of neuroblastoma have shed light on the dramatic heterogeneity in the clinical behaviour of this disease, which spans from spontaneous regression or differentiation in some patients, to relentless disease progression in others, despite intensive multimodality therapy. This evidence also suggests several possible mechanisms to explain the phenomena of spontaneous regression in neuroblastomas, including neurotrophin deprivation, humoral or cellular immunity, loss of telomerase activity and alterations in epigenetic regulation. A better understanding of the mechanisms of spontaneous regression might help to identify optimal therapeutic approaches for patients with these tumours. Currently, the most druggable mechanism is the delayed activation of developmentally programmed cell death regulated by the tropomyosin receptor kinase A pathway. Indeed, targeted therapy aimed at inhibiting neurotrophin receptors might be used in lieu of conventional chemotherapy or radiation in infants with biologically favourable tumours that require treatment. Alternative approaches consist of breaking immune tolerance to tumour antigens or activating neurotrophin receptor pathways to induce neuronal differentiation. These approaches are likely to be most effective against biologically favourable tumours, but they might also provide insights into treatment of biologically unfavourable tumours. We describe the different mechanisms of spontaneous neuroblastoma regression and the consequent therapeutic approaches. PMID:25331179
40 CFR 33.410 - Can a recipient be penalized for failing to meet its fair share objectives?
Code of Federal Regulations, 2010 CFR
2010-07-01
... failing to meet its fair share objectives? 33.410 Section 33.410 Protection of Environment ENVIRONMENTAL... UNITED STATES ENVIRONMENTAL PROTECTION AGENCY PROGRAMS Fair Share Objectives § 33.410 Can a recipient be penalized for failing to meet its fair share objectives? A recipient cannot be penalized, or treated by...
40 CFR 33.410 - Can a recipient be penalized for failing to meet its fair share objectives?
Code of Federal Regulations, 2011 CFR
2011-07-01
... failing to meet its fair share objectives? 33.410 Section 33.410 Protection of Environment ENVIRONMENTAL... UNITED STATES ENVIRONMENTAL PROTECTION AGENCY PROGRAMS Fair Share Objectives § 33.410 Can a recipient be penalized for failing to meet its fair share objectives? A recipient cannot be penalized, or treated by...
Logistic regression: a brief primer.
Stoltzfus, Jill C
2011-10-01
Regression techniques are versatile in their application to medical research because they can measure associations, predict outcomes, and control for confounding variable effects. As one such technique, logistic regression is an efficient and powerful way to analyze the effect of a group of independent variables on a binary outcome by quantifying each independent variable's unique contribution. Using components of linear regression reflected in the logit scale, logistic regression iteratively identifies the strongest linear combination of variables with the greatest probability of detecting the observed outcome. Important considerations when conducting logistic regression include selecting independent variables, ensuring that relevant assumptions are met, and choosing an appropriate model building strategy. For independent variable selection, one should be guided by such factors as accepted theory, previous empirical investigations, clinical considerations, and univariate statistical analyses, with acknowledgement of potential confounding variables that should be accounted for. Basic assumptions that must be met for logistic regression include independence of errors, linearity in the logit for continuous variables, absence of multicollinearity, and lack of strongly influential outliers. Additionally, there should be an adequate number of events per independent variable to avoid an overfit model, with commonly recommended minimum "rules of thumb" ranging from 10 to 20 events per covariate. Regarding model building strategies, the three general types are direct/standard, sequential/hierarchical, and stepwise/statistical, with each having a different emphasis and purpose. Before reaching definitive conclusions from the results of any of these methods, one should formally quantify the model's internal validity (i.e., replicability within the same data set) and external validity (i.e., generalizability beyond the current sample). The resulting logistic regression model
Regression Analysis: Legal Applications in Institutional Research
ERIC Educational Resources Information Center
Frizell, Julie A.; Shippen, Benjamin S., Jr.; Luna, Andrew L.
2008-01-01
This article reviews multiple regression analysis, describes how its results should be interpreted, and instructs institutional researchers on how to conduct such analyses using an example focused on faculty pay equity between men and women. The use of multiple regression analysis will be presented as a method with which to compare salaries of…
[Art. 192 Polish Penal Code--lawyers comments and medical practice].
Swiatek, Barbara
2005-01-01
Art. 192 was enacted in the Polish Penal Code in 1997. Performance of a "medical intervention" without the patient's consent has been penalized. The binding norm is not generally acclaimed in medical circles, but lawyers seem very interested in it. In essence the regulation is clear in medical circles, but legal doctrine differs on it. Inconsistent interpretations concern the object of legal protection, the extent of the disposition of the legal norm and other determining factors. The literature does not permit an unambiguous and effective interpretation of this regulation. Urgent amendment of this law is necessary.
The Regression Trunk Approach to Discover Treatment Covariate Interaction
ERIC Educational Resources Information Center
Dusseldorp, Elise; Meulman, Jacqueline J.
2004-01-01
The regression trunk approach (RTA) is an integration of regression trees and multiple linear regression analysis. In this paper RTA is used to discover treatment covariate interactions, in the regression of one continuous variable on a treatment variable with "multiple" covariates. The performance of RTA is compared to the classical method of…
Precision and Recall for Regression
NASA Astrophysics Data System (ADS)
Torgo, Luis; Ribeiro, Rita
Cost sensitive prediction is a key task in many real world applications. Most existing research in this area deals with classification problems. This paper addresses a related regression problem: the prediction of rare extreme values of a continuous variable. These values are often regarded as outliers and removed from posterior analysis. However, for many applications (e.g. in finance, meteorology, biology, etc.) these are the key values that we want to accurately predict. Any learning method obtains models by optimizing some preference criteria. In this paper we propose new evaluation criteria that are more adequate for these applications. We describe a generalization for regression of the concepts of precision and recall often used in classification. Using these new evaluation metrics we are able to focus the evaluation of predictive models on the cases that really matter for these applications. Our experiments indicate the advantages of the use of these new measures when comparing predictive models in the context of our target applications.
NASA Astrophysics Data System (ADS)
He, Anhua; Singh, Ramesh P.; Sun, Zhaohua; Ye, Qing; Zhao, Gang
2016-07-01
Water well levels are affected by the earth tide, atmospheric pressure, precipitation and earthquakes; earthquakes in particular have a strong impact, and anomalous co-seismic changes in ground water levels have been observed. In this paper, we have used four different models, simple linear regression (SLR), multiple linear regression (MLR), principal component analysis (PCA) and partial least squares (PLS), to compute the atmospheric pressure and earth tidal effects on water level. Furthermore, we have used the Akaike information criterion (AIC) to study the performance of the various models. Based on the lowest AIC and sum of squares for error values, the best estimate of the effects of atmospheric pressure and earth tide on water level is found using the MLR model. However, the MLR model does not account for multicollinearity between inputs; as a result, the atmospheric pressure and earth tidal response coefficients fail to reflect the mechanisms associated with the groundwater level fluctuations. Once the serious multicollinearity of the inputs is addressed, the PLS model shows the minimum AIC value. The atmospheric pressure and earth tidal response coefficients agree closely with the observations under the PLS model. The atmospheric pressure and the earth tidal response coefficients are found to be sensitive to the stress-strain state using the observed data for the period 1 April-8 June 2008 of Chuan 03# well. The transient enhancement of the porosity of the rock mass around Chuan 03# well associated with the Wenchuan earthquake (Mw = 7.9 of 12 May 2008), which returned to its original pre-seismic level after 13 days, indicates that the co-seismic sharp rise of the water well level could be induced by static stress change, rather than by the development of new fractures.
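Model comparison by AIC and sum of squares for error can be sketched as follows. The abstract does not state which form of the AIC was used; the sketch assumes the common least-squares form AIC = n*ln(SSE/n) + 2k (Gaussian errors, additive constants dropped), and the model names, error sums and parameter counts are hypothetical.

```python
import math


def aic_least_squares(n, sse, k):
    """AIC for a least-squares fit under Gaussian errors.

    n: number of observations, sse: sum of squared errors,
    k: number of fitted parameters. Additive constants are dropped,
    so only differences between models on the same data are meaningful.
    """
    return n * math.log(sse / n) + 2 * k


# Hypothetical fits of two candidate models to the same 100 observations.
candidates = {
    "simple": aic_least_squares(n=100, sse=52.0, k=2),
    "multiple": aic_least_squares(n=100, sse=40.0, k=4),
}

# The preferred model is the one with the smallest AIC.
best = min(candidates, key=candidates.get)
```

The 2k term penalizes extra parameters, so a model with a lower SSE is preferred only if the improvement in fit outweighs its added complexity.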
Multitask Coupled Logistic Regression and its Fast Implementation for Large Multitask Datasets.
Gu, Xin; Chung, Fu-Lai; Ishibuchi, Hisao; Wang, Shitong
2015-09-01
When facing multitask-learning problems, it is desirable that the learning method could find the correct input-output features and share the commonality among multiple domains and also scale-up for large multitask datasets. We introduce the multitask coupled logistic regression (LR) framework called LR-based multitask classification learning algorithm (MTC-LR), which is a new method for generating each classifier for each task, capable of sharing the commonality among multitask domains. The basic idea of MTC-LR is to use all individual LR based classifiers, each one appropriate for each task domain, but in contrast to other support vector machine (SVM)-based proposals, learning all the parameter vectors of all individual classifiers by using the conjugate gradient method, in a global way and without the use of kernel trick, and being easily extended into its scaled version. We theoretically show that the addition of a new term in the cost function of the set of LRs (that penalizes the diversity among multiple tasks) produces a coupling of multiple tasks that allows MTC-LR to improve the learning performance in a LR way. This finding can make us easily integrate it with a state-of-the-art fast LR algorithm called dual coordinate descent method (CDdual) to develop its fast version MTC-LR-CDdual for large multitask datasets. The proposed algorithm MTC-LR-CDdual is also theoretically analyzed. Our experimental results on artificial and real-datasets indicate the effectiveness of the proposed algorithm MTC-LR-CDdual in classification accuracy, speed, and robustness. PMID:25423663
Ridge Regression Signal Processing
NASA Technical Reports Server (NTRS)
Kuhl, Mark R.
1990-01-01
The introduction of the Global Positioning System (GPS) into the National Airspace System (NAS) necessitates the development of Receiver Autonomous Integrity Monitoring (RAIM) techniques. In order to guarantee a certain level of integrity, a thorough understanding of modern estimation techniques applied to navigational problems is required. The extended Kalman filter (EKF) is derived and analyzed under poor geometry conditions. It was found that the performance of the EKF is difficult to predict, since the EKF is designed for a Gaussian environment. A novel approach is implemented which incorporates ridge regression to explain the behavior of an EKF in the presence of dynamics under poor geometry conditions. The basic principles of ridge regression theory are presented, followed by the derivation of a linearized recursive ridge estimator. Computer simulations are performed to confirm the underlying theory and to provide a comparative analysis of the EKF and the recursive ridge estimator.
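The basic ridge principle can be illustrated with a small batch example. The sketch below computes the ordinary ridge estimate for two predictors by solving (X'X + lam*I) b = X'y in closed form; it is not the linearized recursive ridge estimator derived in the report, and the function name and data are assumptions for illustration. Nearly collinear columns stand in for poor geometry, where X'X is close to singular.

```python
def ridge_2d(X, y, lam):
    """Ridge estimate for two predictors without intercept.

    Solves (X'X + lam*I) b = X'y using the explicit 2x2 inverse.
    X is a list of [x1, x2] rows, y a list of responses, lam >= 0.
    """
    s11 = sum(r[0] * r[0] for r in X) + lam
    s12 = sum(r[0] * r[1] for r in X)
    s22 = sum(r[1] * r[1] for r in X) + lam
    t1 = sum(r[0] * yi for r, yi in zip(X, y))
    t2 = sum(r[1] * yi for r, yi in zip(X, y))
    det = s11 * s22 - s12 * s12
    return ((s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det)


# Hypothetical, nearly collinear design: X'X is ill-conditioned.
X = [[1.0, 1.00], [1.0, 1.01], [1.0, 0.99]]
y = [2.0, 2.1, 1.9]

b_small_penalty = ridge_2d(X, y, lam=1e-6)
b_ridge = ridge_2d(X, y, lam=0.1)
```

Adding lam to the diagonal increases the determinant of the system, which stabilizes the solution at the cost of some bias in the estimate.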
C-arm perfusion imaging with a fast penalized maximum-likelihood approach
NASA Astrophysics Data System (ADS)
Frysch, Robert; Pfeiffer, Tim; Bannasch, Sebastian; Serowy, Steffen; Gugel, Sebastian; Skalej, Martin; Rose, Georg
2014-03-01
Perfusion imaging is an essential method for stroke diagnostics. One of the most important factors for a successful therapy is to get the diagnosis as fast as possible. Therefore our approach aims at perfusion imaging (PI) with a cone beam C-arm system providing perfusion information directly in the interventional suite. For PI the imaging system has to provide excellent soft tissue contrast resolution in order to allow the detection of small attenuation enhancement due to contrast agent in the capillary vessels. The limited dynamic range of flat panel detectors as well as the sparse sampling of the slowly rotating C-arm in combination with standard reconstruction methods results in limited soft tissue contrast. We choose a penalized maximum-likelihood reconstruction method to get suitable results. To minimize the computational load, the 4D reconstruction task is reduced to several static 3D reconstructions. We also include an ordered subset technique with transitioning to a small number of subsets, which adds sharpness to the image with fewer iterations while also suppressing the noise. Instead of the standard multiplicative EM correction, we apply a Newton-based optimization to further accelerate the reconstruction algorithm. The latter optimization reduces the computation time by up to 70%. Further acceleration is provided by a multi-GPU implementation of the forward and backward projection, which fulfills the demands of cone beam geometry. In this preliminary study we evaluate this procedure on clinical data. Perfusion maps are computed and compared with reference images from magnetic resonance scans. We found a high correlation between both images.
Regression Calibration with Heteroscedastic Error Variance
Spiegelman, Donna; Logan, Roger; Grove, Douglas
2011-01-01
The problem of covariate measurement error with heteroscedastic measurement error variance is considered. Standard regression calibration assumes that the measurement error has a homoscedastic measurement error variance. An estimator is proposed to correct regression coefficients for covariate measurement error with heteroscedastic variance. Point and interval estimates are derived. Validation data containing the gold standard must be available. This estimator is a closed-form correction of the uncorrected primary regression coefficients, which may be of logistic or Cox proportional hazards model form, and is closely related to the version of regression calibration developed by Rosner et al. (1990). The primary regression model can include multiple covariates measured without error. The use of these estimators is illustrated in two data sets, one taken from occupational epidemiology (the ACE study) and one taken from nutritional epidemiology (the Nurses’ Health Study). In both cases, although there was evidence of moderate heteroscedasticity, there was little difference in estimation or inference using this new procedure compared to standard regression calibration. It is shown theoretically that unless the relative risk is large or measurement error severe, standard regression calibration approximations will typically be adequate, even with moderate heteroscedasticity in the measurement error model variance. In a detailed simulation study, standard regression calibration performed either as well as or better than the new estimator. When the disease is rare and the errors normally distributed, or when measurement error is moderate, standard regression calibration remains the method of choice. PMID:22848187
Vargas, Mateo; van Eeuwijk, Fred A; Crossa, Jose; Ribaut, Jean-Marcel
2006-04-01
The study of QTL x environment interaction (QEI) is important for understanding genotype x environment interaction (GEI) in many quantitative traits. For modeling GEI and QEI, factorial regression (FR) models form a powerful class of models. In FR models, covariables (contrasts) defined on the levels of the genotypic and/or environmental factor(s) are used to describe main effects and interactions. In FR models for QTL expression, considerable numbers of genotypic covariables can occur, as an additional covariable needs to be introduced for each putative QTL. For large numbers of genotypic and/or environmental covariables, least squares estimation breaks down and partial least squares (PLS) estimation procedures become an attractive alternative. In this paper we develop methodology for analyzing QEI by FR for estimating effects and locations of QTLs and QEI and interpreting QEI in terms of environmental variables. A randomization test for the main effects of QTLs and QEI is presented. A population of F2 derived F3 families was evaluated in eight environments differing in drought stress and soil nitrogen content, and the traits yield and anthesis-silking interval (ASI) were measured. For grain yield, chromosomes 1 and 10 showed significant QEI, whereas in chromosomes 3 and 8 only main effect QTLs were observed. For ASI, QTL main effects were observed on chromosomes 1, 2, 6, 8, and 10, whereas QEI was observed only on chromosome 8. The assessment of the QEI at chromosome 1 for grain yield showed that the QTL main effect explained 35.8% of the QTL + QEI variability, while QEI explained 64.2%. Minimum temperature during flowering time explained 77.6% of the QEI. The QEI analysis at chromosome 10 showed that the QTL main effect explained 59.8% of the QTL + QEI variability, while QEI explained 40.2%. Maximum temperature during flowering time explained 23.8% of the QEI. Results of this study show the possibilities of using FR for mapping QTL and for dissecting QEI in terms
Liu, J. B.; Adeola, O.
2016-01-01
Forty-eight barrows with an average initial body weight of 25.5±0.3 kg were assigned to 6 dietary treatments arranged in a 3×2 factorial of 3 graded levels of P at 1.42, 2.07, or 2.72 g/kg, and 2 levels of casein at 0 or 50 g/kg to compare the estimates of true total tract digestibility (TTTD) of P in soybean meal (SBM) for pigs fed diets with or without casein supplementation. SBM was the only source of P in diets without casein, and in the diets with added casein, 1.0 to 2.4 g/kg of total dietary P was supplied by SBM as the dietary level of SBM increased. The experiment consisted of a 5-d adjustment period and a 5-d total collection period with ferric oxide as a marker to indicate the initiation and termination of fecal collection. There were interactive effects of casein supplementation and total dietary P level on the apparent total tract digestibility (ATTD) and retention of P (p<0.05). Dietary P intake, fecal P output, digested P and retained P increased linearly with graded increasing levels of SBM in diets regardless of casein addition (p<0.01). Compared with diets without casein, there was a reduction in fecal P in the casein-supplemented diets, which led to increases in digested P, retained P, ATTD, and retention of P (p<0.01). Digested N, ATTD of N, retained N, and N retention were affected by the interaction of casein supplementation and dietary P level (p<0.05). Fecal N output, urinary N output, digested N, and retained N increased linearly with graded increasing levels of SBM for each type of diet (p<0.01). The estimates of TTTD of P in SBM, derived from the regression of daily digested P against daily P intake, for pigs fed diets without casein and with casein were calculated to be 37.3% and 38.6%, respectively. Regressing daily digested N against daily N intake, the TTTD of N in SBM were determined at 94.3% and 94.4% for diets without casein and with added casein, respectively. There was no difference in determined values of TTTD of P or N in
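The regression step named in the abstract above (daily digested P regressed against daily P intake) can be sketched as follows. This is a minimal illustration, not the authors' analysis: the helper name `ols_slope_intercept` and the four data points are hypothetical, and reading the slope as the TTTD estimate and the negative intercept as an endogenous loss term is the usual interpretation of the regression method, assumed here rather than stated in the abstract.

```python
def ols_slope_intercept(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical daily P intake and digested P (g/d); NOT data from the study.
p_intake = [1.0, 2.0, 3.0, 4.0]
p_digested = [0.2, 0.6, 1.0, 1.4]

slope, intercept = ols_slope_intercept(p_intake, p_digested)

# Assumed interpretation: the slope is the digestibility estimate,
# expressed here as a percentage.
tttd_percent = 100 * slope
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}, TTTD = {tttd_percent:.1f}%")
```

Fitting the same regression separately for each diet type (with and without casein) would give one slope per diet, which is how two estimates such as those reported in the abstract could be compared.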
Using Time-Series Regression to Predict Academic Library Circulations.
ERIC Educational Resources Information Center
Brooks, Terrence A.
1984-01-01
Four methods were used to forecast monthly circulation totals in 15 midwestern academic libraries: dummy time-series regression, lagged time-series regression, simple average (straight-line forecasting), monthly average (naive forecasting). In tests of forecasting accuracy, dummy regression method and monthly mean method exhibited smallest average…
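Two of the four forecasting methods listed above can be sketched in a few lines. The abstract does not define them precisely, so the definitions below are assumptions: the simple average is taken as the mean of all past monthly totals, and the monthly average as the mean of past totals for the same calendar month. The function names and figures are hypothetical.

```python
from collections import defaultdict

def simple_average_forecast(history):
    """Forecast any future month as the mean of all past monthly totals."""
    totals = [total for _, total in history]
    return sum(totals) / len(totals)

def monthly_average_forecast(history, month):
    """Forecast a calendar month as the mean of past totals for that month."""
    by_month = defaultdict(list)
    for m, total in history:
        by_month[m].append(total)
    return sum(by_month[month]) / len(by_month[month])

# Hypothetical (calendar month, circulation total) pairs from two years;
# only January (1) and February (2) are shown to keep the example short.
history = [(1, 100), (2, 300), (1, 120), (2, 320)]

flat_forecast = simple_average_forecast(history)
january_forecast = monthly_average_forecast(history, 1)
print(flat_forecast, january_forecast)
```

A regression on month indicator (dummy) variables alone reproduces these monthly means as its fitted values, which is one reason the two approaches might behave similarly in accuracy tests.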
Testing Different Model Building Procedures Using Multiple Regression.
ERIC Educational Resources Information Center
Thayer, Jerome D.
The stepwise regression method of selecting predictors for computer assisted multiple regression analysis was compared with forward, backward, and best subsets regression, using 16 data sets. The results indicated the stepwise method was preferred because of its practical nature, when the models chosen by different selection methods were similar…
Regression Verification Using Impact Summaries
NASA Technical Reports Server (NTRS)
Backes, John; Person, Suzette J.; Rungta, Neha; Thachuk, Oksana
2013-01-01
versions [19]. These techniques compare two programs with a large degree of syntactic similarity to prove that portions of one program version are equivalent to the other. Regression verification can be used for guaranteeing backward compatibility, and for showing behavioral equivalence in programs with syntactic differences, e.g., when a program is refactored to improve its performance, maintainability, or readability. Existing regression verification techniques leverage similarities between program versions by using abstraction and decomposition techniques to improve scalability of the analysis [10, 12, 19]. The abstractions and decomposition in these techniques, e.g., summaries of unchanged code [12] or semantically equivalent methods [19], compute an over-approximation of the program behaviors. The equivalence checking results of these techniques are sound but not complete; they may characterize programs as not functionally equivalent when, in fact, they are equivalent. In this work we describe a novel approach that leverages the impact of the differences between two programs for scaling regression verification. We partition program behaviors of each version into (a) behaviors impacted by the changes and (b) behaviors not impacted (unimpacted) by the changes. Only the impacted program behaviors are used during equivalence checking. We then prove that checking equivalence of the impacted program behaviors is equivalent to checking equivalence of all program behaviors for a given depth bound. In this work we use symbolic execution to generate the program behaviors and leverage control- and data-dependence information to facilitate the partitioning of program behaviors. The impacted program behaviors are termed impact summaries. The dependence analyses that facilitate the generation of the impact summaries, we believe, could be used in conjunction with other abstraction and decomposition based approaches [10, 12] as a complementary reduction technique. An
The Role of the Environmental Health Specialist in the Penal and Correctional System
ERIC Educational Resources Information Center
Walker, Bailus, Jr.; Gordon, Theodore J.
1976-01-01
Implementing a health and hygiene program in penal systems necessitates coordinating the entire staff. Health specialists could participate in facility planning and management, policy formation, and evaluation of medical care, housekeeping, and food services. They could also serve as liaisons between correctional staff and governmental or…
36 CFR 1200.16 - Will I be penalized for misusing the official seals and logos?
Code of Federal Regulations, 2012 CFR
2012-07-01
... misusing the official seals and logos? 1200.16 Section 1200.16 Parks, Forests, and Public Property NATIONAL ARCHIVES AND RECORDS ADMINISTRATION GENERAL RULES OFFICIAL SEALS Penalties for Misuse of NARA Seals and Logos § 1200.16 Will I be penalized for misusing the official seals and logos? (a) Seals. (1) If...
36 CFR 1200.16 - Will I be penalized for misusing the official seals and logos?
Code of Federal Regulations, 2011 CFR
2011-07-01
... misusing the official seals and logos? 1200.16 Section 1200.16 Parks, Forests, and Public Property NATIONAL ARCHIVES AND RECORDS ADMINISTRATION GENERAL RULES OFFICIAL SEALS Penalties for Misuse of NARA Seals and Logos § 1200.16 Will I be penalized for misusing the official seals and logos? (a) Seals. (1) If...
36 CFR 1200.16 - Will I be penalized for misusing the official seals and logos?
Code of Federal Regulations, 2010 CFR
2010-07-01
... misusing the official seals and logos? 1200.16 Section 1200.16 Parks, Forests, and Public Property NATIONAL ARCHIVES AND RECORDS ADMINISTRATION GENERAL RULES OFFICIAL SEALS Penalties for Misuse of NARA Seals and Logos § 1200.16 Will I be penalized for misusing the official seals and logos? (a) Seals. (1) If...
49 CFR 26.47 - Can recipients be penalized for failing to meet overall goals?
Code of Federal Regulations, 2012 CFR
2012-10-01
... 49 Transportation 1 2012-10-01 2012-10-01 false Can recipients be penalized for failing to meet overall goals? 26.47 Section 26.47 Transportation Office of the Secretary of Transportation PARTICIPATION BY DISADVANTAGED BUSINESS ENTERPRISES IN DEPARTMENT OF TRANSPORTATION FINANCIAL ASSISTANCE...
49 CFR 26.47 - Can recipients be penalized for failing to meet overall goals?
Code of Federal Regulations, 2013 CFR
2013-10-01
... 49 Transportation 1 2013-10-01 2013-10-01 false Can recipients be penalized for failing to meet overall goals? 26.47 Section 26.47 Transportation Office of the Secretary of Transportation PARTICIPATION BY DISADVANTAGED BUSINESS ENTERPRISES IN DEPARTMENT OF TRANSPORTATION FINANCIAL ASSISTANCE...
49 CFR 26.47 - Can recipients be penalized for failing to meet overall goals?
Code of Federal Regulations, 2011 CFR
2011-10-01
... 49 Transportation 1 2011-10-01 2011-10-01 false Can recipients be penalized for failing to meet overall goals? 26.47 Section 26.47 Transportation Office of the Secretary of Transportation PARTICIPATION BY DISADVANTAGED BUSINESS ENTERPRISES IN DEPARTMENT OF TRANSPORTATION FINANCIAL ASSISTANCE...
49 CFR 26.47 - Can recipients be penalized for failing to meet overall goals?
Code of Federal Regulations, 2014 CFR
2014-10-01
... 49 Transportation 1 2014-10-01 2014-10-01 false Can recipients be penalized for failing to meet overall goals? 26.47 Section 26.47 Transportation Office of the Secretary of Transportation PARTICIPATION BY DISADVANTAGED BUSINESS ENTERPRISES IN DEPARTMENT OF TRANSPORTATION FINANCIAL ASSISTANCE...
Regression modeling of ground-water flow
Cooley, R.L.; Naff, R.L.
1985-01-01
Nonlinear multiple regression methods are developed to model and analyze groundwater flow systems. Complete descriptions of regression methodology as applied to groundwater flow models allow scientists and engineers engaged in flow modeling to apply the methods to a wide range of problems. Organization of the text proceeds from an introduction that discusses the general topic of groundwater flow modeling, to a review of basic statistics necessary to properly apply regression techniques, and then to the main topic: exposition and use of linear and nonlinear regression to model groundwater flow. Statistical procedures are given to analyze and use the regression models. A number of exercises and answers are included to exercise the student on nearly all the methods that are presented for modeling and statistical analysis. Three computer programs implement the more complex methods. These three are a general two-dimensional, steady-state regression model for flow in an anisotropic, heterogeneous porous medium, a program to calculate a measure of model nonlinearity with respect to the regression parameters, and a program to analyze model errors in computed dependent variables such as hydraulic head. (USGS)
Post-processing through linear regression
NASA Astrophysics Data System (ADS)
van Schaeybroeck, B.; Vannitsem, S.
2011-03-01
Various post-processing techniques are compared for both deterministic and ensemble forecasts, all based on linear regression between forecast data and observations. In order to evaluate the quality of the regression methods, three criteria are proposed, related to the effective correction of forecast error, the optimal variability of the corrected forecast and multicollinearity. The regression schemes under consideration include the ordinary least-squares (OLS) method, a new time-dependent Tikhonov regularization (TDTR) method, the total least-squares method, a new geometric-mean regression (GM), a recently introduced error-in-variables (EVMOS) method and, finally, a "best member" OLS method. The advantages and drawbacks of each method are clarified. These techniques are applied in the context of the 63 Lorenz system, whose model version is affected by both initial condition and model errors. For short forecast lead times, the number and choice of predictors play an important role. Contrary to the other techniques, GM degrades when the number of predictors increases. At intermediate lead times, linear regression is unable to provide corrections to the forecast and can sometimes degrade the performance (GM and the best member OLS with noise). At long lead times the regression schemes (EVMOS, TDTR) which yield the correct variability and the largest correlation between ensemble error and spread should be preferred.
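The "optimal variability" criterion mentioned above can be illustrated with a single predictor. The sketch below compares an OLS correction with the classical one-predictor geometric-mean slope, sign(r)·σ_obs/σ_fc. It is an assumption that this classical form is a fair stand-in for the paper's GM scheme, which the abstract describes as new; the data are hypothetical.

```python
import math

def mean(values):
    return sum(values) / len(values)

def variance(values):
    """Population variance (divide by n)."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)

# Hypothetical forecast/observation pairs; NOT data from the study.
forecasts = [1.0, 2.0, 3.0, 4.0]
observations = [1.0, 3.0, 2.0, 4.0]

mean_fc, mean_obs = mean(forecasts), mean(observations)
cov = sum((f - mean_fc) * (o - mean_obs)
          for f, o in zip(forecasts, observations)) / len(forecasts)
r = cov / math.sqrt(variance(forecasts) * variance(observations))

# OLS slope minimizes squared error of the corrected forecast.
ols_slope = cov / variance(forecasts)
# Classical geometric-mean slope matches the observed standard deviation.
gm_slope = math.copysign(
    math.sqrt(variance(observations) / variance(forecasts)), r)

ols_corrected = [mean_obs + ols_slope * (f - mean_fc) for f in forecasts]
gm_corrected = [mean_obs + gm_slope * (f - mean_fc) for f in forecasts]

var_obs = variance(observations)
print("OLS variance ratio:", variance(ols_corrected) / var_obs)  # equals r**2
print("GM variance ratio:", variance(gm_corrected) / var_obs)    # equals 1
```

The OLS-corrected forecast has variance r² times that of the observations, so its variability shrinks as the correlation decays with lead time, whereas the geometric-mean correction keeps the observed variability by construction.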
Chen, Y
2015-06-15
Purpose: To improve the quality of kV X-ray cone beam CT (CBCT) for use in radiotherapy delivery assessment and re-planning by using penalized likelihood (PL) iterative reconstruction and auto-segmentation accuracy of the resulting CBCTs as an image quality metric. Methods: Present filtered backprojection (FBP) CBCT reconstructions can be improved upon by PL reconstruction with image formation models and appropriate regularization constraints. We use two constraints: 1) image smoothing via an edge preserving filter, and 2) a constraint minimizing the differences between the reconstruction and a registered prior image. Reconstructions of prostate therapy CBCTs were computed with constraint 1 alone and with both constraints. The prior images were planning CTs (pCT) deformably registered to the FBP reconstructions. Anatomy segmentations were done using atlas-based auto-segmentation (Elekta ADMIRE). Results: We observed small but consistent improvements in the Dice similarity coefficients of PL reconstructions over the FBP results, and additional small improvements with the added prior image constraint. For a CBCT with anatomy very similar in appearance to the pCT, we observed these changes in the Dice metric: +2.9% (prostate), +8.6% (rectum), −1.9% (bladder). For a second CBCT with a very different rectum configuration, we observed +0.8% (prostate), +8.9% (rectum), −1.2% (bladder). For a third case with significant lateral truncation of the field of view, we observed: +0.8% (prostate), +8.9% (rectum), −1.2% (bladder). Adding the prior image constraint raised Dice measures by about 1%. Conclusion: Efficient and practical adaptive radiotherapy requires accurate deformable registration and accurate anatomy delineation. We show here small and consistent patterns of improved contour accuracy using PL iterative reconstruction compared with FBP reconstruction. However, the modest extent of these results and the pattern of differences across CBCT cases suggest that
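The Dice similarity coefficient used as the image quality metric above is 2|A∩B| / (|A| + |B|) for two segmentations A and B. A minimal sketch follows, with segmentations represented as sets of voxel coordinates; the function name and the toy masks are hypothetical and unrelated to the study's contours.

```python
def dice_coefficient(segmentation_a, segmentation_b):
    """Dice similarity: 2 * |A & B| / (|A| + |B|), in the range [0, 1]."""
    a, b = set(segmentation_a), set(segmentation_b)
    if not a and not b:
        return 1.0  # convention chosen here: two empty masks agree fully
    return 2 * len(a & b) / (len(a) + len(b))

# Toy 2D masks given as (row, column) voxel coordinates.
auto_contour = {(0, 0), (0, 1), (1, 0), (1, 1)}
reference_contour = {(0, 1), (1, 1), (1, 2)}

score = dice_coefficient(auto_contour, reference_contour)
print(f"Dice = {score:.3f}")
```

A change such as the percentages quoted in the abstract would then be the difference between this score computed on a PL reconstruction and on the FBP reconstruction of the same scan.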
Dose reduction in digital breast tomosynthesis using a penalized maximum likelihood reconstruction
NASA Astrophysics Data System (ADS)
Das, Mini; Gifford, Howard; O'Connor, Michael; Glick, Stephen J.
2009-02-01
Digital breast tomosynthesis (DBT) is a 3D imaging modality with limited angle projection data. The ability of tomosynthesis systems to accurately detect smaller microcalcifications is debatable. This is because of the higher noise in the projection data (lower average dose per projection), which is then propagated through the reconstructed image. Reconstruction methods that minimize the propagation of quantum noise have potential to improve microcalcification detectability using DBT. In this paper we show that penalized maximum likelihood (PML) reconstruction in DBT yields images with an improved resolution/noise tradeoff as compared to conventional filtered backprojection (FBP). Signal to noise ratio (SNR) using PML was observed to be higher than that obtained using the standard FBP algorithm. Our results indicate that for microcalcifications, using the PML algorithm, reconstructions obtained with a mean glandular dose (MGD) of 1.5 mGy yielded better SNR than those obtained with FBP using a 4 mGy total dose. Thus perhaps total dose could be reduced to one-third or lower with the same microcalcification detectability, if PML reconstruction is used instead of FBP. Visibility of low contrast masses with various contrast levels was studied using a contrast-detail phantom in a breast shape structure with an average breast density. Images generated using various dose levels indicate that visibility of low contrast masses in PML reconstructions is significantly better than in those generated using FBP. SNR measurements in the low-contrast study did not appear to correlate with the visual subjective analysis of the reconstruction, indicating that SNR is not a good figure of merit to be used.
Southard, Rodney E.
2013-01-01
The weather and precipitation patterns in Missouri vary considerably from year to year. In 2008, the statewide average rainfall was 57.34 inches and in 2012, the statewide average rainfall was 30.64 inches. This variability in precipitation and resulting streamflow in Missouri underlies the necessity for water managers and users to have reliable streamflow statistics and a means to compute select statistics at ungaged locations for a better understanding of water availability. Knowledge of surface-water availability is dependent on the streamflow data that have been collected and analyzed by the U.S. Geological Survey for more than 100 years at approximately 350 streamgages throughout Missouri. The U.S. Geological Survey, in cooperation with the Missouri Department of Natural Resources, computed streamflow statistics at streamgages through the 2010 water year, defined periods of drought and defined methods to estimate streamflow statistics at ungaged locations, and developed regional regression equations to compute selected streamflow statistics at ungaged locations. Streamflow statistics and flow durations were computed for 532 streamgages in Missouri and in neighboring States of Missouri. For streamgages with more than 10 years of record, Kendall’s tau was computed to evaluate for trends in streamflow data. If trends were detected, the variable length method was used to define the period of no trend. Water years were removed from the dataset from the beginning of the record for a streamgage until no trend was detected. Low-flow frequency statistics were then computed for the entire period of record and for the period of no trend if 10 or more years of record were available for each analysis. Three methods are presented for computing selected streamflow statistics at ungaged locations. The first method uses power curve equations developed for 28 selected streams in Missouri and neighboring States that have multiple streamgages on the same streams. Statistical
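The trend screening described above (Kendall's tau, then dropping water years from the start of the record until no trend is detected) can be sketched as follows. Several details are assumptions, not taken from the report: a two-sided test at the 0.05 level, the normal approximation with continuity correction and no tie correction, and a 10-year minimum record. The function names and the flow series are hypothetical.

```python
import math

def kendall_trend_p_value(values):
    """Two-sided p-value for a monotonic trend (normal approximation, no ties)."""
    n = len(values)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)  # +1 concordant, -1 discordant
    if s == 0:
        return 1.0
    var_s = n * (n - 1) * (2 * n + 5) / 18
    z = (abs(s) - 1) / math.sqrt(var_s)  # continuity-corrected statistic
    return math.erfc(z / math.sqrt(2))   # equals 2 * (1 - Phi(z))

def first_year_without_trend(values, alpha=0.05, min_years=10):
    """Index of the first retained year, or None if no trend-free period
    of at least min_years remains."""
    start = 0
    while len(values) - start >= min_years:
        if kendall_trend_p_value(values[start:]) >= alpha:
            return start
        start += 1
    return None

# Hypothetical annual low flows: five rising early years, then ten years
# that fluctuate without a trend. NOT data from the report.
annual_flows = [0.1, 0.2, 0.3, 0.4, 0.5, 1, 10, 2, 9, 3, 8, 4, 7, 5, 6]

first_kept = first_year_without_trend(annual_flows)
print("first retained index:", first_kept)
```

Statistics would then be computed twice, once on the full record and once on `annual_flows[first_kept:]`, mirroring the two periods named in the abstract.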
CSWS-related autistic regression versus autistic regression without CSWS.
Tuchman, Roberto
2009-08-01
Continuous spike-waves during slow-wave sleep (CSWS) and Landau-Kleffner syndrome (LKS) are two clinical epileptic syndromes that are associated with the electroencephalography (EEG) pattern of electrical status epilepticus during slow wave sleep (ESES). Autistic regression occurs in approximately 30% of children with autism and is associated with an epileptiform EEG in approximately 20%. The behavioral phenotypes of CSWS, LKS, and autistic regression overlap. However, the differences in age of regression, degree and type of regression, and frequency of epilepsy and EEG abnormalities suggest that these are distinct phenotypes. CSWS with autistic regression is rare, as is autistic regression associated with ESES. The pathophysiology and as such the treatment implications for children with CSWS and autistic regression are distinct from those with autistic regression without CSWS.
Multiatlas Segmentation as Nonparametric Regression
Awate, Suyash P.; Whitaker, Ross T.
2015-01-01
This paper proposes a novel theoretical framework to model and analyze the statistical characteristics of a wide range of segmentation methods that incorporate a database of label maps or atlases; such methods are termed as label fusion or multiatlas segmentation. We model these multiatlas segmentation problems as nonparametric regression problems in the high-dimensional space of image patches. We analyze the nonparametric estimator’s convergence behavior that characterizes expected segmentation error as a function of the size of the multiatlas database. We show that this error has an analytic form involving several parameters that are fundamental to the specific segmentation problem (determined by the chosen anatomical structure, imaging modality, registration algorithm, and label-fusion algorithm). We describe how to estimate these parameters and show that several human anatomical structures exhibit the trends modeled analytically. We use these parameter estimates to optimize the regression estimator. We show that the expected error for large database sizes is well predicted by models learned on small databases. Thus, a few expert segmentations can help predict the database sizes required to keep the expected error below a specified tolerance level. Such cost-benefit analysis is crucial for deploying clinical multiatlas segmentation systems. PMID:24802528
Nodule Regression in Adults With Nodular Gastritis
Kim, Ji Wan; Lee, Sun-Young; Kim, Jeong Hwan; Sung, In-Kyung; Park, Hyung Seok; Shim, Chan-Sup; Han, Hye Seung
2015-01-01
Background: Nodular gastritis (NG) is associated with the presence of Helicobacter pylori infection, but there are controversies on nodule regression in adults. The aim of this study was to analyze the factors that are related to nodule regression in adults diagnosed with NG. Methods: Adults who were diagnosed with NG with H. pylori infection during esophagogastroduodenoscopy (EGD) at our center were included. Changes in the size and location of the nodules, status of H. pylori infection, upper gastrointestinal (UGI) symptom, EGD and pathology findings were analyzed between the initial and follow-up tests. Results: Of the 117 NG patients, 66.7% (12/18) of the eradicated NG patients showed nodule regression after H. pylori eradication, whereas 9.9% (9/99) of the non-eradicated NG patients showed spontaneous nodule regression without H. pylori eradication (P < 0.001). Nodule regression was more frequent in NG patients with antral nodule location (P = 0.010), small-sized nodules (P = 0.029), H. pylori eradication (P < 0.001), UGI symptom (P = 0.007), and a long-term follow-up period (P = 0.030). On the logistic regression analysis, nodule regression was inversely correlated with persistent H. pylori infection on the follow-up test (odds ratio (OR): 0.020, 95% confidence interval (CI): 0.003 - 0.137, P < 0.001) and a short-term follow-up period < 30.5 months (OR: 0.140, 95% CI: 0.028 - 0.700, P = 0.017). Conclusions: In adults with NG, H. pylori eradication is the most significant factor associated with nodule regression. A long-term follow-up period is also correlated with nodule regression, but is less significant than H. pylori eradication. Our findings suggest that H. pylori eradication should be considered to promote nodule regression in NG patients with H. pylori infection.
Evaluating Geographically Weighted Regression Models for Environmental Chemical Risk Analysis
Czarnota, Jenna; Wheeler, David C; Gennings, Chris
2015-01-01
In the evaluation of cancer risk related to environmental chemical exposures, the effect of many correlated chemicals on disease is often of interest. The relationship between correlated environmental chemicals and health effects is not always constant across a study area, as exposure levels may change spatially due to various environmental factors. Geographically weighted regression (GWR) has been proposed to model spatially varying effects. However, concerns about collinearity effects, including regression coefficient sign reversal (i.e., reversal paradox), may limit the applicability of GWR for environmental chemical risk analysis. A penalized version of GWR, the geographically weighted lasso (GWL), has been proposed to remediate the collinearity effects in GWR models. Our focus in this study was on assessing through a simulation study the ability of GWR and GWL to correctly identify spatially varying chemical effects for a mixture of correlated chemicals within a study area. Our results showed that GWR suffered from the reversal paradox, while GWL overpenalized the effects for the chemical most strongly related to the outcome. PMID:25983546
Bailey-Wilson, Joan E.; Brennan, Jennifer S.; Bull, Shelley B; Culverhouse, Robert; Kim, Yoonhee; Jiang, Yuan; Jung, Jeesun; Li, Qing; Lamina, Claudia; Liu, Ying; Mägi, Reedik; Niu, Yue S.; Simpson, Claire L.; Wang, Libo; Yilmaz, Yildiz E.; Zhang, Heping; Zhang, Zhaogong
2012-01-01
Group 14 of Genetic Analysis Workshop 17 examined several issues related to analysis of complex traits using DNA sequence data. These issues included novel methods for analyzing rare genetic variants in an aggregated manner (often termed collapsing rare variants), evaluation of various study designs to increase power to detect effects of rare variants, and the use of machine learning approaches to model highly complex heterogeneous traits. Various published and novel methods for analyzing traits with extreme locus and allelic heterogeneity were applied to the simulated quantitative and disease phenotypes. Overall, we conclude that power is (as expected) dependent on locus-specific heritability or contribution to disease risk, large samples will be required to detect rare causal variants with small effect sizes, extreme phenotype sampling designs may increase power for smaller laboratory costs, methods that allow joint analysis of multiple variants per gene or pathway are more powerful in general than analyses of individual rare variants, population-specific analyses can be optimal when different subpopulations harbor private causal mutations, and machine learning methods may be useful for selecting subsets of predictors for follow-up in the presence of extreme locus heterogeneity and large numbers of potential predictors. PMID:22128066
A New Sample Size Formula for Regression.
ERIC Educational Resources Information Center
Brooks, Gordon P.; Barcikowski, Robert S.
The focus of this research was to determine the efficacy of a new method of selecting sample sizes for multiple linear regression. A Monte Carlo simulation was used to study both empirical predictive power rates and empirical statistical power rates of the new method and seven other methods: those of C. N. Park and A. L. Dudycha (1974); J. Cohen…
NASA Astrophysics Data System (ADS)
Polat, Esra; Gunay, Suleyman
2013-10-01
One of the problems encountered in Multiple Linear Regression (MLR) is multicollinearity, which causes overestimation of the regression parameters and inflates their variance. Hence, when multicollinearity is present, biased estimation procedures such as classical Principal Component Regression (CPCR) and Partial Least Squares Regression (PLSR) are performed. The SIMPLS algorithm is the leading PLSR algorithm because of its speed and efficiency, and because its results are easier to interpret. However, both CPCR and SIMPLS yield very unreliable results when the data set contains outlying observations. Therefore, Hubert and Vanden Branden (2003) presented a robust PCR (RPCR) method and a robust PLSR (RPLSR) method called RSIMPLS. In RPCR, a robust Principal Component Analysis (PCA) method for high-dimensional data is first applied to the independent variables; then the dependent variables are regressed on the scores using a robust regression method. RSIMPLS is constructed from a robust covariance matrix for high-dimensional data and robust linear regression. The purpose of this study is to show the usage of the RPCR and RSIMPLS methods on an econometric data set and to compare the two methods on an inflation model of Turkey. The considered methods are compared in terms of predictive ability and goodness of fit by using a robust Root Mean Squared Error of Cross-validation (R-RMSECV), a robust R2 value and the Robust Component Selection (RCS) statistic.
"They're Here to Follow Orders, Not to Think:" Some Notes on Academic Freedom in Penal Institutions.
ERIC Educational Resources Information Center
Arcard, Thomas E.; Watts-LaFontaine, Phyllis
1983-01-01
Addresses basic issues that arise when an instructor who teaches college-level courses to inmates in a penal institution reflects upon that teaching experience. Presents questions that must be examined by those teaching prison inmates. (JOW)
Afanador, N L; Tran, T N; Buydens, L M C
2013-03-20
Bio-pharmaceutical manufacturing is a multifaceted and complex process wherein, during the manufacture of a single batch, hundreds of processing variables and raw materials are monitored. In these processes, identifying the candidate variables responsible for any changes in process performance can prove to be extremely challenging. Within this context, partial least squares (PLS) has proven to be an important tool in helping determine the root cause for changes in biological performance, such as cellular growth or viral propagation. In spite of the positive impact PLS has had in helping understand bio-pharmaceutical process data, the high variability in measured response (Y) and predictor variables (X), and the weak relationship between X and Y, have at times made root cause determination for process changes difficult. Our goal is to demonstrate how the use of bootstrapping, in conjunction with permutation tests, can provide avenues for improving the selection of variables responsible for manufacturing process changes via the variable importance in the projection (PLS-VIP) statistic. Although applied only to the PLS-VIP in this article, the aforementioned methods are general and can be used to improve other variable selection methods, in addition to increasing confidence around other estimates obtained from a PLS model. PMID:23473249
Variable Selection in ROC Regression
2013-01-01
Regression models are introduced into the receiver operating characteristic (ROC) analysis to accommodate effects of covariates, such as genes. If many covariates are available, the variable selection issue arises. The traditional induced methodology separately models outcomes of diseased and nondiseased groups; thus, separate application of variable selections to two models will bring barriers in interpretation, due to differences in selected models. Furthermore, in the ROC regression, the accuracy of area under the curve (AUC) should be the focus instead of aiming at the consistency of model selection or the good prediction performance. In this paper, we obtain one single objective function with the group SCAD to select grouped variables, which adapts to popular criteria of model selection, and propose a two-stage framework to apply the focused information criterion (FIC). Some asymptotic properties of the proposed methods are derived. Simulation studies show that the grouped variable selection is superior to separate model selections. Furthermore, the FIC improves the accuracy of the estimated AUC compared with other criteria. PMID:24312135
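Since the abstract above treats the accuracy of the AUC as the quantity of interest, a minimal sketch of the empirical AUC may help fix ideas. It covers only the AUC itself, not the group SCAD or FIC steps; the scores and the function name `auc` are hypothetical.

```python
def auc(diseased, nondiseased):
    """Empirical AUC: share of (diseased, nondiseased) pairs ranked correctly.

    Ties between a diseased and a nondiseased score count as one half.
    """
    wins = 0.0
    for d in diseased:
        for h in nondiseased:
            if d > h:
                wins += 1.0
            elif d == h:
                wins += 0.5
    return wins / (len(diseased) * len(nondiseased))


# Hypothetical marker scores for the two groups.
diseased_scores = [0.9, 0.8, 0.4]
nondiseased_scores = [0.5, 0.3, 0.2]
area = auc(diseased_scores, nondiseased_scores)  # 8 of the 9 pairs are ordered correctly
```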
[Is it possible to avoid a penal offence in carrying out ambulatory anesthesia?].
Forceville, X; Oxeda, C; Leloup, E; Bouju, P; Amiot, J F; Dupouey, B; Arnaud, F
1991-01-01
French jurisprudence on outpatient anaesthesia is resolutely unfavorable. It is principally based on the June 22nd 1972 decision of the Court of Cassation, the highest court of justice in France. Preoperative non-hospitalisation was considered a fault of negligence or carelessness on the part of the practitioners. It resulted in their penal conviction for involuntary injuries and in compensation for the harm. This decision is linked with the evolution of the notion of fault and of the sharing of responsibility between surgeon and anaesthetist. The post-operative phase seems to involve the "loss of chance" theory (causal link or the detriment in itself), excluding a penal conviction but not partial compensation. Though some new legal considerations could be put forward, a written contract between physicians and patients is necessary in outpatient surgery, whereas the medical files and the organisation of the unit can prove the quality of medical care.
Law No. 91, Amendment to the Penal Code, 5 September 1987.
1989-01-01
This Law replaces Article 398 of the Iraq Penal Code with the following language: "If a sound contract of marriage has been made between a perpetrator of one of the crimes mentioned in this chapter and the victim, it shall be a legal extenuating excuse for the purpose of implementing the provisions of Articles (130 and 131) of the Penal Code. If the marriage contract has been terminated by a divorce issued by the husband without a legitimate reason, or by a divorce passed by the court for such reasons related [to] a mistake or a misconduct of the husband, three years before the expiry of the sentence of the action, then, the punishment shall be reconsidered with a view to intensifying it due to a request from the public prosecution, the victim herself, or any interested person." Among the crimes mentioned in the chapter referred to in Article 398 is rape.
[Development of a questionnaire to assess user satisfaction of a penal mediation program (CSM-P)].
Manzano Blanquez, Juan; Soria Verde, Miguel Angel; Armadans Tremolosa, Inmaculada
2008-08-01
The aim of the present study is to elaborate an instrument (CSM-P), valid for victims and aggressors, to assess satisfaction of individuals participating in a penal mediation program (VOM). The instrument was administered to a sample of 213 subjects, randomly chosen from the pool of participants in a VOM program of Catalonian Justice Department. Data analysis of the questionnaire shows an internal consistency of .88 (Cronbach's alpha). The dimensionality of the questionnaire is structured in a single factor that accounts for 61.45% of the variance. The instrument has proven its utility for assessing the satisfaction of the participants in a penal mediation program. Validation of the instrument in similar populations should be performed and it should be adapted to other contexts where assessing user satisfaction in a mediation program is necessary.
[Changes in legal status in the field of civil and penal doctor's liability].
Dukiet-Nagórska, T
1999-01-01
The author discusses the effects of the organizational changes in health care that took place in Poland on the penal and civil liability of doctors. Her opinion also refers to the amendment of the penal code. The author emphasizes the wider than before possibility of a doctor's liability for harm caused to a person or for infringing a patient's rights. She also draws attention to the description of offences pertaining to doctors contained in the new penal code. The author proposes working out standards of procedure with the patient (based on the model of treatment standards) according to the specific nature of particular sections of medicine or clinics. She expresses her conviction about the need to initiate changes in the legal status through the joint actions of doctors and lawyers.
An empirical evaluation of spatial regression models
NASA Astrophysics Data System (ADS)
Gao, Xiaolu; Asami, Yasushi; Chung, Chang-Jo F.
2006-10-01
Conventional statistical methods are often ineffective for evaluating spatial regression models. One reason is that spatial regression models usually have more parameters or smaller sample sizes than a simple model, so their degrees of freedom are reduced. Thus, it is often impractical to evaluate them with traditional tests. Another reason, which is theoretically associated with statistical methods, is that statistical criteria are crucially dependent on such assumptions as normality, independence, and homogeneity. This may create problems because the assumptions are themselves open to testing. In view of these problems, this paper proposes an alternative empirical evaluation method. To illustrate the idea, a few hedonic regression models for a house and land price data set are evaluated, including a simple, ordinary linear regression model and three spatial models. Their performance as to how well the price of the house and land can be predicted is examined. With a cross-validation technique, the price at each sample point is predicted with a model estimated from the samples excluding the one concerned. Then, empirical criteria are established whereby the predicted prices are compared with the real, observed prices. The proposed method provides objective guidance for the selection of a suitable model specification for a data set. Moreover, the method is seen as an alternative way to test the significance of the spatial relationships in spatial regression models.
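The leave-one-out scheme described in the abstract above can be sketched as follows. This is a minimal illustration with an ordinary one-predictor linear model only; the spatial models, the floor areas, the prices, and the names `fit_line` and `loocv_rmse` are assumptions for the example and do not come from the paper.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - slope * mx, slope


def loocv_rmse(xs, ys):
    """Predict each point from a model fitted without it; return the RMSE."""
    squared_errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        intercept, slope = fit_line(train_x, train_y)
        prediction = intercept + slope * xs[i]
        squared_errors.append((ys[i] - prediction) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical floor areas and prices (arbitrary units).
areas = [50.0, 65.0, 80.0, 95.0, 110.0, 125.0]
prices = [150.0, 190.0, 245.0, 280.0, 335.0, 370.0]
rmse = loocv_rmse(areas, prices)
```

Competing model specifications can then be ranked by this out-of-sample error instead of by in-sample test statistics.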
Su, Yongbo; She, Yue; Huang, Qiang; Shi, Chuanxin; Li, Zhongchao; Huang, Chengfei; Piao, Xiangshu; Li, Defa
2015-01-01
This experiment was conducted to determine the effects of inclusion level of soybean oil (SO) and palm oil (PO) on their digestible and metabolizable energy (DE and ME) contents when fed to growing pigs, using the difference and regression methods. Sixty-six crossbred growing barrows (Duroc×Landrace×Yorkshire, weighing 38.1±2.4 kg) were randomly allotted to a 2×5 factorial arrangement involving 2 lipid sources (SO and PO) and 5 levels of lipid (2%, 4%, 6%, 8%, and 10%), as well as a basal diet composed of corn and soybean meal. The barrows were housed in individual metabolism crates to facilitate separate collection of feces and urine, and were fed the assigned test diets at 4% of initial body weight per day. A 5-d total collection of feces and urine followed a 7-d diet adaptation period. The results showed that the DE and ME contents of SO and PO determined by the difference method were not affected by inclusion level. The DE and ME determined by the regression method for SO were greater than the corresponding values for PO (DE: 37.07, ME: 36.79 MJ/kg for SO; DE: 34.11, ME: 33.84 MJ/kg for PO). These values were close to the DE and ME values determined by the difference method at the 10% inclusion level (DE: 37.31, ME: 36.83 MJ/kg for SO; DE: 34.62, ME: 33.47 MJ/kg for PO). A similar response for the apparent total tract digestibility of acid-hydrolyzed ether extract (AEE) in lipids was observed. The true total tract digestibility of AEE in SO was significantly (p<0.05) greater than that for PO (97.5% and 91.1%, respectively). In conclusion, the DE and ME contents of lipids were not affected by inclusion level. The difference method can substitute for the regression method to determine the DE and ME contents in lipids when the inclusion level is 10%. PMID:26580443
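The abstract above does not spell out the difference method. A form commonly used in digestibility work, assumed here, attributes to the lipid whatever energy the test diet contains beyond the contribution of its basal-diet share. The function name and the diet energy values below are hypothetical and are not data from the study.

```python
def energy_by_difference(e_test_diet, e_basal_diet, inclusion):
    """Energy of the added ingredient by the difference method (assumed form).

    e_test_diet  : energy of the test diet (MJ/kg)
    e_basal_diet : energy of the basal diet (MJ/kg)
    inclusion    : fraction of the test diet made up by the ingredient
    """
    if not 0.0 < inclusion <= 1.0:
        raise ValueError("inclusion must be a fraction in (0, 1]")
    return (e_test_diet - (1.0 - inclusion) * e_basal_diet) / inclusion


# Hypothetical diets: basal at 14.0 MJ/kg, test diet with 10% oil at 16.3 MJ/kg.
de_oil = energy_by_difference(16.3, 14.0, 0.10)  # close to 37 MJ/kg
```

Because the numerator is divided by the inclusion level, any error in the measured diet energies is magnified by 1/inclusion, which is largest at low inclusion levels.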
[Civil procedure or penal procedure in the event of medical fault].
du Jardin, J
2004-01-01
The author specifies the rules and the vocabulary of the civil and penal procedures. He points out the characteristics of the medical act and the failures that a medical practitioner can be blamed for. He defines the notions of the duty of best efforts and the duty to achieve a specific result. The role of the expert is touched upon. The article is supplemented by significant case-law decisions and a list of recent textbooks.
Applications of statistics to medical science, III. Correlation and regression.
Watanabe, Hiroshi
2012-01-01
In this third part of a series surveying medical statistics, the concepts of correlation and regression are reviewed. In particular, methods of linear regression and logistic regression are discussed. Arguments related to survival analysis will be made in a subsequent paper.
Vaeth, Michael; Skovlund, Eva
2004-06-15
For a given regression problem it is possible to identify a suitably defined equivalent two-sample problem such that the power or sample size obtained for the two-sample problem also applies to the regression problem. For a standard linear regression model the equivalent two-sample problem is easily identified, but for generalized linear models and for Cox regression models the situation is more complicated. An approximately equivalent two-sample problem may, however, also be identified here. In particular, we show that for logistic regression and Cox regression models the equivalent two-sample problem is obtained by selecting two equally sized samples for which the parameters differ by a value equal to the slope times twice the standard deviation of the independent variable and further requiring that the overall expected number of events is unchanged. In a simulation study we examine the validity of this approach to power calculations in logistic regression and Cox regression models. Several different covariate distributions are considered for selected values of the overall response probability and a range of alternatives. For the Cox regression model we consider both constant and non-constant hazard rates. The results show that in general the approach is remarkably accurate even in relatively small samples. Some discrepancies are, however, found in small samples with few events and a highly skewed covariate distribution. Comparison with results based on alternative methods for logistic regression models with a single continuous covariate indicates that the proposed method is at least as good as its competitors. The method is easy to implement and therefore provides a simple way to extend the range of problems that can be covered by the usual formulas for power and sample size determination.
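For logistic regression, the recipe in the abstract above can be sketched as follows: find two event probabilities whose log-odds differ by the slope times twice the standard deviation of the covariate while their average equals the overall response probability, then apply a usual two-sample formula. The function names, the bisection search, and the particular two-proportion sample size formula are assumptions of this sketch, not details taken from the paper.

```python
import math
from statistics import NormalDist


def expit(z):
    return 1.0 / (1.0 + math.exp(-z))


def equivalent_two_sample(beta, sd_x, p_bar):
    """Two probabilities with log-odds gap 2*beta*sd_x and mean p_bar."""
    delta = 2.0 * beta * sd_x
    lo, hi = -30.0, 30.0
    for _ in range(200):  # bisection on the midpoint of the two logits
        mid = 0.5 * (lo + hi)
        mean_p = 0.5 * (expit(mid - delta / 2) + expit(mid + delta / 2))
        if mean_p < p_bar:
            lo = mid
        else:
            hi = mid
    centre = 0.5 * (lo + hi)
    return expit(centre - delta / 2), expit(centre + delta / 2)


def total_sample_size(p1, p2, alpha=0.05, power=0.80):
    """Total n for comparing two proportions in equally sized groups."""
    z_a = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_b = NormalDist().inv_cdf(power)
    p_bar = 0.5 * (p1 + p2)
    numerator = (z_a * math.sqrt(2.0 * p_bar * (1.0 - p_bar))
                 + z_b * math.sqrt(p1 * (1.0 - p1) + p2 * (1.0 - p2))) ** 2
    return 2 * math.ceil(numerator / (p1 - p2) ** 2)


# Hypothetical setting: slope 0.5 per unit, covariate SD 1, overall event rate 0.3.
p1, p2 = equivalent_two_sample(beta=0.5, sd_x=1.0, p_bar=0.3)
n_total = total_sample_size(p1, p2)
```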
Yang, Li; Wang, Guobao; Qi, Jinyi
2016-04-01
Detecting cancerous lesions is a major clinical application of emission tomography. In a previous work, we studied penalized maximum-likelihood (PML) image reconstruction for lesion detection in static PET. Here we extend our theoretical analysis of static PET reconstruction to dynamic PET. We study both the conventional indirect reconstruction and direct reconstruction for Patlak parametric image estimation. In indirect reconstruction, Patlak parametric images are generated by first reconstructing a sequence of dynamic PET images, and then performing Patlak analysis on the time activity curves (TACs) pixel-by-pixel. In direct reconstruction, Patlak parametric images are estimated directly from raw sinogram data by incorporating the Patlak model into the image reconstruction procedure. PML reconstruction is used in both the indirect and direct reconstruction methods. We use a channelized Hotelling observer (CHO) to assess lesion detectability in Patlak parametric images. Simplified expressions for evaluating the lesion detectability have been derived and applied to the selection of the regularization parameter value to maximize detection performance. The proposed method is validated using computer-based Monte Carlo simulations. Good agreements between the theoretical predictions and the Monte Carlo results are observed. Both theoretical predictions and Monte Carlo simulation results show the benefit of the indirect and direct methods under optimized regularization parameters in dynamic PET reconstruction for lesion detection, when compared with the conventional static PET reconstruction.
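The indirect route described in the abstract above ends with a Patlak fit on each pixel's time activity curve. In the standard Patlak plot, the ratio of tissue to plasma activity is regressed on the ratio of the running plasma integral to plasma activity; the slope is the influx rate and the intercept the apparent distribution volume. The sketch below fits a single noiseless synthetic curve; the plasma input, the parameter values, and the function names are hypothetical, and the penalized reconstruction itself is not shown.

```python
import math


def cumulative_trapezoid(ts, values):
    """Running trapezoidal integral of values over ts, starting at zero."""
    total, out = 0.0, [0.0]
    for i in range(1, len(ts)):
        total += 0.5 * (values[i] + values[i - 1]) * (ts[i] - ts[i - 1])
        out.append(total)
    return out


def patlak_fit(ts, plasma, tissue, t_star=0.0):
    """Return (slope Ki, intercept V) of the Patlak plot for times >= t_star."""
    integral = cumulative_trapezoid(ts, plasma)
    points = [(integral[i] / plasma[i], tissue[i] / plasma[i])
              for i in range(len(ts)) if ts[i] >= t_star]
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    slope = (sum((x - mx) * (y - my) for x, y in points)
             / sum((x - mx) ** 2 for x, _ in points))
    return slope, my - slope * mx


# Synthetic, noiseless curves built from assumed parameters.
ts = [float(t) for t in range(0, 61, 5)]                  # minutes
plasma = [10.0 * math.exp(-0.05 * t) + 1.0 for t in ts]   # hypothetical input function
true_ki, true_v = 0.05, 0.3
running = cumulative_trapezoid(ts, plasma)
tissue = [true_ki * running[i] + true_v * plasma[i] for i in range(len(ts))]
ki, v = patlak_fit(ts, plasma, tissue, t_star=10.0)
```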
Tong, Xuming; Chen, Jinghang; Miao, Hongyu; Li, Tingting; Zhang, Le
2015-01-01
Agent-based models (ABM) and differential equations (DE) are two commonly used methods for immune system simulation. However, it is difficult for ABM to estimate key parameters of the model by incorporating experimental data, whereas the differential equation model is incapable of describing the complicated immune system in detail. To overcome these problems, we developed an integrated ABM regression model (IABMR). It combines the advantages of ABM and DE by employing ABM to mimic the multi-scale immune system with various phenotypes and types of cells, and by using the input and output of ABM to build a Loess regression for key parameter estimation. Next, we employed the greedy algorithm to estimate the key parameters of the ABM with respect to the same experimental data set and used ABM to describe a 3D immune system similar to previous studies that employed the DE model. These results indicate that IABMR not only has the potential to simulate the immune system at various scales, phenotypes and cell types, but can also accurately infer the key parameters, as the DE model does. Therefore, this study developed a complex-system modeling mechanism that can simulate the complicated immune system in detail, like ABM, and validate the reliability and efficiency of the model by fitting experimental data, like DE.
Evans, Joshua D.; Politte, David G.; Whiting, Bruce R.; O'Sullivan, Joseph A.; Williamson, Jeffrey F.
2011-03-15
Purpose: In comparison with conventional filtered backprojection (FBP) algorithms for x-ray computed tomography (CT) image reconstruction, statistical algorithms directly incorporate the random nature of the data and do not assume CT data are linear, noiseless functions of the attenuation line integral. Thus, it has been hypothesized that statistical image reconstruction may support a more favorable tradeoff than FBP between image noise and spatial resolution in dose-limited applications. The purpose of this study is to evaluate the noise-resolution tradeoff for the alternating minimization (AM) algorithm regularized using a nonquadratic penalty function. Methods: Idealized monoenergetic CT projection data with Poisson noise were simulated for two phantoms with inserts of varying contrast (7%-238%) and distance from the field-of-view (FOV) center (2-6.5 cm). Images were reconstructed for the simulated projection data by the FBP algorithm and two penalty function parameter values of the penalized AM algorithm. Each algorithm was run with a range of smoothing strengths to allow quantification of the noise-resolution tradeoff curve. Image noise is quantified as the standard deviation in the water background around each contrast insert. Modulation transfer functions (MTFs) were calculated from six-parameter model fits to oversampled edge-spread functions defined by the circular contrast-insert edges as a metric of local resolution. The integral of the MTF up to 0.5 lp/mm was adopted as a single-parameter measure of local spatial resolution. Results: The penalized AM algorithm noise-resolution tradeoff curve was always more favorable than that of the FBP algorithm. While resolution and noise are found to vary as a function of distance from the FOV center differently for the two algorithms, the ratio of noises when matching the resolution metric is relatively uniform over the image. The ratio of AM-to-FBP image variances, a predictor of dose-reduction potential, was
Reconstruction of difference in sequential CT studies using penalized likelihood estimation
Pourmorteza, A; Dang, H; Siewerdsen, J H; Stayman, J W
2016-01-01
Characterization of anatomical change and other differences is important in sequential computed tomography (CT) imaging, where a high-fidelity patient-specific prior image is typically present, but is not used, in the reconstruction of subsequent anatomical states. Here, we introduce a penalized likelihood (PL) method called reconstruction of difference (RoD) to directly reconstruct a difference image volume using both the current projection data and the (unregistered) prior image integrated into the forward model for the measurement data. The algorithm utilizes an alternating minimization to find both the registration and reconstruction estimates. This formulation allows direct control over the image properties of the difference image, permitting regularization strategies that inhibit noise and structural differences due to inconsistencies between the prior image and the current data. Additionally, if the change is known to be local, RoD allows local acquisition and reconstruction, as opposed to traditional model-based approaches that require a full support field of view (or other modifications). We compared the performance of RoD to a standard PL algorithm, in simulation studies and using test-bench cone-beam CT data. The performances of local and global RoD approaches were similar, with local RoD providing a significant computational speedup. In comparison across a range of data with differing fidelity, the local RoD approach consistently showed lower error (with respect to a truth image) than PL in both noisy data and sparsely sampled projection scenarios. In a study of the prior image registration performance of RoD, clinically reasonable capture ranges were demonstrated. Lastly, the registration algorithm had a broad capture range and the error for reconstruction of CT data was 35% and 20% less than filtered back-projection for RoD and PL, respectively. The RoD has potential for delivering high-quality difference images in a range of sequential clinical
Retro-regression--another important multivariate regression improvement.
Randić, M
2001-01-01
We review the serious problem associated with instabilities of the coefficients of regression equations, referred to as the MRA (multivariate regression analysis) "nightmare of the first kind". This is manifested when in a stepwise regression a descriptor is included or excluded from a regression. The consequence is an unpredictable change of the coefficients of the descriptors that remain in the regression equation. We follow with consideration of an even more serious problem, referred to as the MRA "nightmare of the second kind", arising when optimal descriptors are selected from a large pool of descriptors. This process typically causes at different steps of the stepwise regression a replacement of several previously used descriptors by new ones. We describe a procedure that resolves these difficulties. The approach is illustrated on boiling points of nonanes which are considered (1) by using an ordered connectivity basis; (2) by using an ordering resulting from application of greedy algorithm; and (3) by using an ordering derived from an exhaustive search for optimal descriptors. A novel variant of multiple regression analysis, called retro-regression (RR), is outlined showing how it resolves the ambiguities associated with both "nightmares" of the first and the second kind of MRA. PMID:11410035
Ternès, Nils; Rotolo, Federico; Michiels, Stefan
2016-07-10
Correct selection of prognostic biomarkers among multiple candidates is becoming increasingly challenging as the dimensionality of biological data becomes higher. Therefore, minimizing the false discovery rate (FDR) is of primary importance, while a low false negative rate (FNR) is a complementary measure. The lasso is a popular selection method in Cox regression, but its results depend heavily on the penalty parameter λ. Usually, λ is chosen using maximum cross-validated log-likelihood (max-cvl). However, this method often has a very high FDR. We review methods for a more conservative choice of λ. We propose an empirical extension of the cvl by adding a penalization term, which trades off between the goodness-of-fit and the parsimony of the model, leading to the selection of fewer biomarkers and, as we show, to the reduction of the FDR without a large increase in FNR. We conducted a simulation study considering null and moderately sparse alternative scenarios and compared our approach with the standard lasso and 10 other competitors: Akaike information criterion (AIC), corrected AIC, Bayesian information criterion (BIC), extended BIC, Hannan and Quinn information criterion (HQIC), risk information criterion (RIC), one-standard-error rule, adaptive lasso, stability selection, and percentile lasso. Our extension achieved the best compromise across all the scenarios between a reduction of the FDR and a limited increase of the FNR, followed by the AIC, the RIC, and the adaptive lasso, which performed well in some settings. We illustrate the methods using gene expression data of 523 breast cancer patients. In conclusion, we propose to apply our extension to the lasso whenever a stringent FDR with a limited FNR is targeted. Copyright © 2016 John Wiley & Sons, Ltd.
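The abstract above does not give the form of the proposed penalization term, so it is not reproduced here. As an illustration of a more conservative choice of λ, the sketch below implements one of the listed competitors, the one-standard-error rule: among all λ values whose cross-validated log-likelihood lies within one standard error of the best, take the largest. The grid, the scores, and the function name are hypothetical.

```python
def one_standard_error_rule(lambdas, cv_scores, cv_ses):
    """Return (lambda at max score, largest lambda within one SE of it).

    cv_scores are cross-validated log-likelihoods, so higher is better.
    """
    best = max(range(len(lambdas)), key=lambda i: cv_scores[i])
    threshold = cv_scores[best] - cv_ses[best]
    eligible = [lam for lam, score in zip(lambdas, cv_scores) if score >= threshold]
    return lambdas[best], max(eligible)


# Hypothetical cross-validation results on a small penalty grid.
lambdas = [0.01, 0.02, 0.05, 0.10, 0.20]
cvl = [-310.0, -305.0, -306.5, -309.0, -320.0]
se = [4.0, 4.5, 4.2, 4.1, 5.0]
lam_max_cvl, lam_1se = one_standard_error_rule(lambdas, cvl, se)
# lam_max_cvl is 0.02; lam_1se is 0.10, a stronger penalty that selects fewer variables
```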
Mental chronometry with simple linear regression.
Chen, J Y
1997-10-01
Typically, mental chronometry is performed by means of introducing an independent variable postulated to affect selectively some stage of a presumed multistage process. However, the effect could be a global one that spreads proportionally over all stages of the process. Currently, there is no method to test this possibility although simple linear regression might serve the purpose. In the present study, the regression approach was tested with tasks (memory scanning and mental rotation) that involved a selective effect and with a task (word superiority effect) that involved a global effect, by the dominant theories. The results indicate (1) the manipulation of the size of a memory set or of angular disparity affects the intercept of the regression function that relates the times for memory scanning with different set sizes or for mental rotation with different angular disparities and (2) the manipulation of context affects the slope of the regression function that relates the times for detecting a target character under word and nonword conditions. These ratify the regression approach as a useful method for doing mental chronometry. PMID:9347535
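A small sketch may make the intercept-versus-slope logic of the abstract above concrete: response times under one condition are regressed on response times under another. An effect confined to one stage adds a constant and shows up in the intercept, whereas a global effect that scales every stage shows up in the slope. The times, the 50 ms delay, and the 1.2 scaling factor are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - slope * mx, slope


# Hypothetical baseline response times (ms) across five task conditions.
baseline = [400.0, 450.0, 500.0, 550.0, 600.0]

# Selective effect: one stage takes 50 ms longer, whatever the condition.
selective = [t + 50.0 for t in baseline]
# Global effect: every stage is slowed by the same proportion.
global_slowing = [1.2 * t for t in baseline]

sel_intercept, sel_slope = fit_line(baseline, selective)         # slope near 1, intercept near 50
glo_intercept, glo_slope = fit_line(baseline, global_slowing)    # slope near 1.2, intercept near 0
```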
Can luteal regression be reversed?
Telleria, Carlos M
2006-01-01
The corpus luteum is an endocrine gland whose limited lifespan is hormonally programmed. This debate article summarizes findings of our research group that challenge the principle that the end of function of the corpus luteum or luteal regression, once triggered, cannot be reversed. Overturning luteal regression by pharmacological manipulations may be of critical significance in designing strategies to improve fertility efficacy. PMID:17074090
Logistic Regression: Concept and Application
ERIC Educational Resources Information Center
Cokluk, Omay
2010-01-01
The main focus of logistic regression analysis is classification of individuals in different groups. The aim of the present study is to explain basic concepts and processes of binary logistic regression analysis intended to determine the combination of independent variables which best explain the membership in certain groups called dichotomous…
Consumer Education. An Introductory Unit for Inmates in Penal Institutions.
ERIC Educational Resources Information Center
Schmoele, Henry H.; And Others
This introductory consumer education curriculum outline contains materials designed to help soon-to-be-released prisoners to develop an awareness of consumer concerns and to better manage their family lives. Each of the four units provided includes lesson objectives, suggested contents, suggested teaching methods, handouts, and tests. The unit on…
Modeling confounding by half-sibling regression.
Schölkopf, Bernhard; Hogg, David W; Wang, Dun; Foreman-Mackey, Daniel; Janzing, Dominik; Simon-Gabriel, Carl-Johann; Peters, Jonas
2016-07-01
We describe a method for removing the effect of confounders to reconstruct a latent quantity of interest. The method, referred to as "half-sibling regression," is inspired by recent work in causal inference using additive noise models. We provide a theoretical justification, discussing both independent and identically distributed as well as time series data, respectively, and illustrate the potential of the method in a challenging astronomy application. PMID:27382154
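The core idea of the abstract above can be sketched with a toy linear example: an observation Y mixes a latent signal Q with a confounder, a second observation X is affected by the same confounder but not by Q, and Q is estimated as the part of Y that X cannot predict. The linear model, the noise levels, and all names below are assumptions for illustration; the astronomy application is not reproduced.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - slope * mx, slope


def mean_squared_error(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v)) / len(u)


rng = random.Random(0)
n = 2000
confounder = [rng.gauss(0.0, 1.0) for _ in range(n)]        # shared systematic effect
signal = [rng.gauss(0.0, 1.0) for _ in range(n)]            # latent quantity of interest Q
y = [q + 3.0 * c for q, c in zip(signal, confounder)]       # observed target
x = [c + 0.1 * rng.gauss(0.0, 1.0) for c in confounder]     # "half-sibling" observation

# Estimate Q as the residual of Y after regressing it on X.
intercept, slope = fit_line(x, y)
q_hat = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

# The residual has zero mean, so compare against centred quantities.
mean_q = sum(signal) / n
mean_y = sum(y) / n
centred_q = [q - mean_q for q in signal]
err_before = mean_squared_error([yi - mean_y for yi in y], centred_q)
err_after = mean_squared_error(q_hat, centred_q)
```

In this toy setting the residual recovers Q only up to an additive constant, which is why the comparison uses centred values.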
[Regression grading in gastrointestinal tumors].
Tischoff, I; Tannapfel, A
2012-02-01
Preoperative neoadjuvant chemoradiation therapy is a well-established and essential part of the interdisciplinary treatment of gastrointestinal tumors. Neoadjuvant treatment leads to regressive changes in tumors. To evaluate the histological tumor response, different scoring systems describing regressive changes are used, known as tumor regression grading. Tumor regression grading is usually based on the presence of residual vital tumor cells in proportion to the total tumor size. Currently, no nationally or internationally accepted grading systems exist. In general, common guidelines should be used in the pathohistological diagnostics of tumors after neoadjuvant therapy. In particular, the standard tumor grading will be replaced by tumor regression grading. Furthermore, tumors after neoadjuvant treatment are marked with the prefix "y" in the TNM classification. PMID:22293790
Prediction of dynamical systems by symbolic regression.
Quade, Markus; Abel, Markus; Shafi, Kamran; Niven, Robert K; Noack, Bernd R
2016-07-01
We study the modeling and prediction of dynamical systems based on conventional models derived from measurements. Such algorithms are highly desirable in situations where the underlying dynamics are hard to model from physical principles or simplified models need to be found. We focus on symbolic regression methods as a part of machine learning. These algorithms are capable of learning an analytically tractable model from data, a highly valuable property. Symbolic regression methods can be considered as generalized regression methods. We investigate two particular algorithms, the so-called fast function extraction which is a generalized linear regression algorithm, and genetic programming which is a very general method. Both are able to combine functions in a certain way such that a good model for the prediction of the temporal evolution of a dynamical system can be identified. We illustrate the algorithms by finding a prediction for the evolution of a harmonic oscillator based on measurements, by detecting an arriving front in an excitable system, and as a real-world application, the prediction of solar power production based on energy production observations at a given site together with the weather forecast. PMID:27575130
Prediction of dynamical systems by symbolic regression
NASA Astrophysics Data System (ADS)
Quade, Markus; Abel, Markus; Shafi, Kamran; Niven, Robert K.; Noack, Bernd R.
2016-07-01
We study the modeling and prediction of dynamical systems based on conventional models derived from measurements. Such algorithms are highly desirable in situations where the underlying dynamics are hard to model from physical principles or simplified models need to be found. We focus on symbolic regression methods as a part of machine learning. These algorithms are capable of learning an analytically tractable model from data, a highly valuable property. Symbolic regression methods can be considered as generalized regression methods. We investigate two particular algorithms, the so-called fast function extraction which is a generalized linear regression algorithm, and genetic programming which is a very general method. Both are able to combine functions in a certain way such that a good model for the prediction of the temporal evolution of a dynamical system can be identified. We illustrate the algorithms by finding a prediction for the evolution of a harmonic oscillator based on measurements, by detecting an arriving front in an excitable system, and as a real-world application, the prediction of solar power production based on energy production observations at a given site together with the weather forecast.
A two-phase procedure for QTL mapping with regression models.
Chen, Zehua; Cui, Wenquan
2010-07-01
It is typical in QTL mapping experiments that the number of markers under investigation is large. This poses a challenge to commonly used regression models, since the number of feature variables is usually much larger than the sample size, especially when epistasis effects are to be considered. The greedy nature of the conventional stepwise procedures is well known and is even more conspicuous in such cases. In this article, we propose a two-phase procedure based on penalized likelihood techniques and the extended Bayes information criterion (EBIC) for QTL mapping. The procedure consists of a screening phase and a selection phase. In the screening phase, the main and interaction features are alternately screened by a penalized likelihood mechanism. In the selection phase, a low-dimensional approach using EBIC is applied to the features retained in the screening phase to identify QTL. The two-phase procedure has the asymptotic property that its positive detection rate (PDR) and false discovery rate (FDR) converge to 1 and 0, respectively, as sample size goes to infinity. The two-phase procedure is compared with both traditional and recently developed approaches by simulation studies. A real data analysis is presented to demonstrate the application of the two-phase procedure.
Uncertainty quantification in DIC with Kriging regression
NASA Astrophysics Data System (ADS)
Wang, Dezhi; DiazDelaO, F. A.; Wang, Weizhuo; Lin, Xiaoshan; Patterson, Eann A.; Mottershead, John E.
2016-03-01
A Kriging regression model is developed as a post-processing technique for the treatment of measurement uncertainty in classical subset-based Digital Image Correlation (DIC). Regression is achieved by regularising the sample-point correlation matrix using a local, subset-based, assessment of the measurement error with assumed statistical normality and based on the Sum of Squared Differences (SSD) criterion. This leads to a Kriging-regression model in the form of a Gaussian process representing uncertainty on the Kriging estimate of the measured displacement field. The method is demonstrated using numerical and experimental examples. Kriging estimates of displacement fields are shown to be in excellent agreement with 'true' values for the numerical cases and in the experimental example uncertainty quantification is carried out using the Gaussian random process that forms part of the Kriging model. The root mean square error (RMSE) on the estimated displacements is produced and standard deviations on local strain estimates are determined.
Response-adaptive regression for longitudinal data.
Wu, Shuang; Müller, Hans-Georg
2011-09-01
We propose a response-adaptive model for functional linear regression, which is adapted to sparsely sampled longitudinal responses. Our method aims at predicting response trajectories and models the regression relationship by directly conditioning the sparse and irregular observations of the response on the predictor, which can be of scalar, vector, or functional type. This obviates the need to model the response trajectories, a task that is challenging for sparse longitudinal data and was previously required for functional regression implementations for longitudinal data. The proposed approach turns out to be superior to previous functional regression approaches in terms of prediction error. It encompasses a variety of regression settings that are relevant for the functional modeling of longitudinal data in the life sciences. The improved prediction of response trajectories with the proposed response-adaptive approach is illustrated for a longitudinal study of Kiwi weight growth and by an analysis of the dynamic relationship between viral load and CD4 cell counts observed in AIDS clinical trials. PMID:21133880
Committee on the forfeiture of assets in criminal offences of the Howard League of Penal Reform.
Nicol, A
1983-01-01
The profits of illicit drug trafficking have attracted world-wide attention. In 1980, a decision of the House of Lords in the United Kingdom ruled that the English criminal courts had no power to forfeit such profits even though earned by convicted defendants. This lacuna in the law aroused considerable criticism and the Howard League of Penal Reform established an independent committee to propose reforms. Its report will be published shortly. This article discusses some of the issues which the Committee faced, particularly the delicate balance between the interest of the State in ensuring that crime does not pay and a commitment to due process before financial punishments are imposed.
[Social training of alcoholic traffic offenders completing a penal term--a general practice report].
Walter, J
1989-05-01
This report describes the practical introduction of a training program for alcoholic traffic offenders in the liberal Sachsenheim branch of the Pforzheim penal institution. There, prisoners have the opportunity to participate in retraining by traffic psychologists of the "TUV" (Technical Surveillance Organization) in order to improve their aptitude to drive motor vehicles. Experience with the first 10 courses is said to be good; however, it is not possible to give an opinion on the efficiency of these measures.
Godoy, Roberto L M
2009-01-01
This essay opposes the bipartite thesis of the capacity for penal culpability ("to be able to understand the criminality of the act or to be able to direct the actions") with a unitary thesis, according to which it seems biopsychologically impossible to direct behaviour towards an object that has not previously been understood, or to divorce action completely from understanding (as follows from the maximal integration of the intellective, volitive and affective spheres of a dynamic psyche).
[The concept of insanity in Danish penal and mental health codes].
Brandt-Christensen, Mette; Bertelsen, Aksel
2010-04-26
The medical terms insanity and psychosis are used synonymously to describe a condition with substantial changes in the "total" personality and loss of realism. In the 10th revision of the diagnostic classification system ICD-10 (1994) the intention was to replace the term "psychosis" with "psychotic" to indicate the presence of hallucinations and delusions. However, in Danish legislation, most importantly in the penal code and the Mental Health Act, the term "insanity" is still in use. This difference has led to diagnostic uncertainty, especially in clinical and forensic psychiatric practice.
Practical Session: Simple Linear Regression
NASA Astrophysics Data System (ADS)
Clausel, M.; Grégoire, G.
2014-12-01
Two exercises are proposed to illustrate simple linear regression. The first one is based on Galton's famous data set on heredity. We use the lm R command and obtain coefficient estimates, the standard error of the error, R2, residuals, and so on. In the second example, devoted to data on the vapor tension of mercury, we fit a simple linear regression, predict values, and look ahead to multiple linear regression. This practical session is an excerpt from practical exercises proposed by A. Dalalyan at EPNC (see Exercises 1 and 2 of http://certis.enpc.fr/~dalalyan/Download/TP_ENPC_4.pdf).
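The session itself uses R's lm command; as a rough companion, the quantities it lists (coefficient estimates, residual standard error, R2, residuals) can be computed by hand. The sketch below uses Python with NumPy and a small made-up data set, not Galton's data; the function name `simple_linear_regression` is hypothetical.

```python
import numpy as np

def simple_linear_regression(x, y):
    """Least-squares fit of y = intercept + slope * x with basic diagnostics."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    sxx = np.sum((x - x.mean()) ** 2)
    slope = np.sum((x - x.mean()) * (y - y.mean())) / sxx
    intercept = y.mean() - slope * x.mean()
    residuals = y - (intercept + slope * x)
    rss = np.sum(residuals ** 2)
    sigma = np.sqrt(rss / (n - 2))                   # residual standard error
    r2 = 1.0 - rss / np.sum((y - y.mean()) ** 2)     # coefficient of determination
    return {"intercept": intercept, "slope": slope,
            "sigma": sigma, "r2": r2, "residuals": residuals}

# Illustrative data only (not from the exercises).
fit = simple_linear_regression([1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 7.8, 10.1])
prediction_at_6 = fit["intercept"] + fit["slope"] * 6.0
```

The residual standard error divides by n - 2 because two parameters (intercept and slope) are estimated from the data.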
Independent motion detection with a rival penalized adaptive particle filter
NASA Astrophysics Data System (ADS)
Becker, Stefan; Hübner, Wolfgang; Arens, Michael
2014-10-01
Aggregation of pixel-based motion detection into regions of interest, which include views of single moving objects in a scene, is an essential pre-processing step in many vision systems. Motion events of this type provide significant information about the object type or build the basis for action recognition. Further, motion is an essential saliency measure, which is able to effectively support high-level image analysis. When applied to static cameras, background subtraction methods achieve good results. On the other hand, motion aggregation on freely moving cameras is still a widely unsolved problem. The image flow measured on a freely moving camera results from two major motion types: first, the ego-motion of the camera, and second, object motion that is independent of the camera motion. When capturing a scene with a camera, these two motion types are adversely blended together. In this paper, we propose an approach to detect multiple moving objects from a mobile monocular camera system in an outdoor environment. The overall processing pipeline consists of a fast ego-motion compensation algorithm in the preprocessing stage. Real-time performance is achieved by using a sparse optical flow algorithm as an initial processing stage and a densely applied probabilistic filter in the post-processing stage. Thereby, we follow the idea proposed by Jung and Sukhatme. Normalized intensity differences originating from a sequence of ego-motion compensated difference images represent the probability of moving objects. Noise and registration artefacts are filtered out using a Bayesian formulation. The resulting a posteriori distribution is located on image regions showing strong amplitudes in the difference image which are in accordance with the motion prediction. In order to effectively estimate the a posteriori distribution, a particle filter is used. In addition to the fast ego-motion compensation, the main contribution of this paper is the design of the probabilistic
Multiple Regression and Its Discontents
ERIC Educational Resources Information Center
Snell, Joel C.; Marsh, Mitchell
2012-01-01
Multiple regression is part of a larger statistical strategy originated by Gauss. The authors raise questions about the theory and suggest some changes that would make room for Mandelbrot and Serendipity.
ERIC Educational Resources Information Center
California State Office of the Attorney General, Sacramento.
This handbook was prepared to ensure that, as required by section 626.1 of the California Penal Code in 1984, "students, parents, and all school officials and employees have access to a concise, easily understandable summary of California penal and civil law pertaining to crimes committed against persons or property on school grounds." The…
ERIC Educational Resources Information Center
Torrence, John Thomas
Excluding military installations, training programs in state and federal penal institutions were surveyed, through a mailed checklist, to test the hypotheses that (1) training programs in penal institutions were not related to the unfilled job openings by major occupations in the United States, and (2) that training programs reported would have a…
Wrong Signs in Regression Coefficients
NASA Technical Reports Server (NTRS)
McGee, Holly
1999-01-01
When using parametric cost estimation, it is important to note the possibility of the regression coefficients having the wrong sign. A wrong sign is defined as a sign on the regression coefficient opposite to the researcher's intuition and experience. Some possible causes for the wrong sign discussed in this paper are a small range of x's, leverage points, missing variables, multicollinearity, and computational error. Additionally, techniques for determining the cause of the wrong sign are given.
Regression Discontinuity Designs in Epidemiology
Moscoe, Ellen; Mutevedzi, Portia; Newell, Marie-Louise; Bärnighausen, Till
2014-01-01
When patients receive an intervention based on whether they score below or above some threshold value on a continuously measured random variable, the intervention will be randomly assigned for patients close to the threshold. The regression discontinuity design exploits this fact to estimate causal treatment effects. In spite of its recent proliferation in economics, the regression discontinuity design has not been widely adopted in epidemiology. We describe regression discontinuity, its implementation, and the assumptions required for causal inference. We show that regression discontinuity is generalizable to the survival and nonlinear models that are mainstays of epidemiologic analysis. We then present an application of regression discontinuity to the much-debated epidemiologic question of when to start HIV patients on antiretroviral therapy. Using data from a large South African cohort (2007–2011), we estimate the causal effect of early versus deferred treatment eligibility on mortality. Patients whose first CD4 count was just below the 200 cells/μL CD4 count threshold had a 35% lower hazard of death (hazard ratio = 0.65 [95% confidence interval = 0.45–0.94]) than patients presenting with CD4 counts just above the threshold. We close by discussing the strengths and limitations of regression discontinuity designs for epidemiology. PMID:25061922
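To make the core idea concrete, here is a minimal sketch of a sharp regression discontinuity estimate for a continuous outcome: a separate linear trend is fitted on each side of the threshold within a bandwidth, and the jump at the threshold is read off as the treatment effect. This is a simplification, not the survival-model analysis of the paper; the function name `rd_estimate`, the bandwidth, and the simulated scores and outcomes are assumptions made for illustration.

```python
import numpy as np

def rd_estimate(running, outcome, cutoff, bandwidth):
    """Local linear sharp-RD estimate of the jump for units below the cutoff."""
    r = np.asarray(running, dtype=float) - cutoff    # center at the threshold
    y = np.asarray(outcome, dtype=float)
    keep = np.abs(r) <= bandwidth                    # use only units near the cutoff
    r, y = r[keep], y[keep]
    below = (r < 0).astype(float)                    # treated if score is below cutoff
    # Columns: intercept, treatment jump, slope, slope change on treated side.
    design = np.column_stack([np.ones_like(r), below, r, below * r])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]

# Simulated data (hypothetical): true jump of 0.5 for scores below 200.
rng = np.random.default_rng(1)
score = rng.uniform(50, 350, size=5000)
treated = score < 200
y = 1.0 + 0.002 * score + 0.5 * treated + rng.normal(scale=0.2, size=5000)

effect = rd_estimate(score, y, cutoff=200, bandwidth=50)
```

The bandwidth trades bias against variance: a narrower window makes the comparison more local to the threshold but leaves fewer observations for estimation.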
[First aid responsibility of the physician in Austria according to paragraph 95 of the penal code].
Bauer, G
1986-02-28
With paragraph 95 of its new Penal Code, Austria has joined the ranks of those countries which, by statutory provision, lay down a general duty, applying to everyone, to render aid in a case of misadventure. This provision is intended to enhance the principle of rendering aid and assistance to a fellow human being in need, and to obligate everyone to exercise at least a minimum of solidarity with fellow-men and to undertake to save another person in danger of life or limb. The conception in legal policy and juristic application is shown by examples drawn from the literature. A case is reported in detail to demonstrate that the provision is also applicable to the doctor's activity in the hospital, starting out from the relatively wide scope of the concept of "misadventure". Paragraph 95 of the Austrian Penal Code defines a typical crime by omission, its applicability therefore not being bound to the proof of an injurious outcome. The paper points out the particular difficulty in the differential diagnosis between alcohol intoxication and cranial trauma.
Sex offender punishment and the persistence of penal harm in the U.S.
Leon, Chrysanthi S
2011-01-01
The U.S. has dramatically revised its approach to punishment in the last several decades. In particular, people convicted of sex crimes have experienced a remarkable expansion in social control through a wide range of post-conviction interventions. While this expansion may be largely explained by general punishment trends, there appear to be unique factors that have prevented other penal reforms from similarly modulating sex offender punishment. In part, this continuation of a "penal harm" approach to sex offenders relates to the past under-valuing of sexual victimization. In the "bad old days," the law and its agents sent mixed messages about sexual violence and sexual offending. Some sexual offending was mere nuisance, some was treatable, and a fraction "deserved" punishment equivalent to other serious criminal offending. In contrast, today's sex offender punishment schemes rarely distinguish formally among gradations of harm or dangerousness. After examining incarceration trends, this article explores the historical context of the current broad-brush approach and reviews the unintended consequences. Altogether, this article reinforces the need to return to differentiation among sex offenders, but differentiation based on science and on the experience-based, guided discretion of experts in law enforcement, corrections, and treatment.
A tutorial on Bayesian Normal linear regression
NASA Astrophysics Data System (ADS)
Klauenberg, Katy; Wübbeler, Gerd; Mickan, Bodo; Harris, Peter; Elster, Clemens
2015-12-01
Regression is a common task in metrology and often applied to calibrate instruments, evaluate inter-laboratory comparisons or determine fundamental constants, for example. Yet, a regression model cannot be uniquely formulated as a measurement function, and consequently the Guide to the Expression of Uncertainty in Measurement (GUM) and its supplements are not applicable directly. Bayesian inference, however, is well suited to regression tasks, and has the advantage of accounting for additional a priori information, which typically robustifies analyses. Furthermore, it is anticipated that future revisions of the GUM shall also embrace the Bayesian view. Guidance on Bayesian inference for regression tasks is largely lacking in metrology. For linear regression models with Gaussian measurement errors this tutorial gives explicit guidance. Divided into three steps, the tutorial first illustrates how a priori knowledge, which is available from previous experiments, can be translated into prior distributions from a specific class. These prior distributions have the advantage of yielding analytical, closed form results, thus avoiding the need to apply numerical methods such as Markov Chain Monte Carlo. Secondly, formulas for the posterior results are given, explained and illustrated, and software implementations are provided. In the third step, Bayesian tools are used to assess the assumptions behind the suggested approach. These three steps (prior elicitation, posterior calculation, and robustness to prior uncertainty and model adequacy) are critical to Bayesian inference. The general guidance given here for Normal linear regression tasks is accompanied by a simple, but real-world, metrological example. The calibration of a flow device serves as a running example and illustrates the three steps. It is shown that prior knowledge from previous calibrations of the same sonic nozzle enables robust predictions even for extrapolations.
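As an illustration of the kind of closed-form result the tutorial refers to, the sketch below computes the posterior for a linear model with Gaussian errors under a Gaussian prior on the coefficients. It assumes a known noise variance, which is a simplification chosen here for brevity and may differ from the prior class used in the tutorial; the function name and the tiny data set are hypothetical.

```python
import numpy as np

def gaussian_posterior(X, y, prior_mean, prior_cov, noise_var):
    """Posterior mean and covariance of beta for y = X @ beta + eps,
    eps ~ N(0, noise_var * I), with prior beta ~ N(prior_mean, prior_cov).
    Assumes noise_var is known (simplifying assumption)."""
    prior_prec = np.linalg.inv(prior_cov)
    post_cov = np.linalg.inv(prior_prec + X.T @ X / noise_var)
    post_mean = post_cov @ (prior_prec @ prior_mean + X.T @ y / noise_var)
    return post_mean, post_cov

# Illustrative data lying exactly on y = 1 + 2 * x.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([1.0, 3.0, 5.0])

# A very vague prior should reproduce the least-squares solution.
vague_mean, vague_cov = gaussian_posterior(
    X, y, prior_mean=np.zeros(2), prior_cov=1e8 * np.eye(2), noise_var=0.25)

# An informative prior centered at zero pulls the estimate towards zero.
tight_mean, tight_cov = gaussian_posterior(
    X, y, prior_mean=np.zeros(2), prior_cov=0.01 * np.eye(2), noise_var=0.25)
```

Because the posterior is available analytically, no sampling method such as Markov Chain Monte Carlo is needed in this conjugate setting.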
Yang, Xue; Lauzon, Carolyn B.; Crainiceanu, Ciprian; Caffo, Brian; Resnick, Susan M.; Landman, Bennett A.
2012-01-01
Massively univariate regression and inference in the form of statistical parametric mapping have transformed the way in which multi-dimensional imaging data are studied. In functional and structural neuroimaging, the de facto standard “design matrix”-based general linear regression model and its multi-level cousins have enabled investigation of the biological basis of the human brain. With modern study designs, it is possible to acquire multi-modal three-dimensional assessments of the same individuals — e.g., structural, functional and quantitative magnetic resonance imaging, alongside functional and ligand binding maps with positron emission tomography. Largely, current statistical methods in the imaging community assume that the regressors are non-random. For more realistic multi-parametric assessment (e.g., voxel-wise modeling), distributional consideration of all observations is appropriate. Herein, we discuss two unified regression and inference approaches, model II regression and regression calibration, for use in massively univariate inference with imaging data. These methods use the design matrix paradigm and account for both random and non-random imaging regressors. We characterize these methods in simulation and illustrate their use on an empirical dataset. Both methods have been made readily available as a toolbox plug-in for the SPM software. PMID:22609453
Transfer Learning Based on Logistic Regression
NASA Astrophysics Data System (ADS)
Paul, A.; Rottensteiner, F.; Heipke, C.
2015-08-01
In this paper we address the problem of classification of remote sensing images in the framework of transfer learning with a focus on domain adaptation. The main novel contribution is a method for transductive transfer learning in remote sensing on the basis of logistic regression. Logistic regression is a discriminative probabilistic classifier of low computational complexity, which can deal with multiclass problems. This research area deals with methods that solve problems in which labelled training data sets are assumed to be available only for a source domain, while classification is needed in the target domain with different, yet related characteristics. Classification takes place with a model of weight coefficients for hyperplanes which separate features in the transformed feature space. In terms of logistic regression, our domain adaptation method adjusts the model parameters by iterative labelling of the target test data set. These labelled data features are iteratively added to the current training set which, at the beginning, only contains source features and, simultaneously, a number of source features are deleted from the current training set. Experimental results based on a test series with synthetic and real data constitute a first proof-of-concept of the proposed method.
Interpretation of Standardized Regression Coefficients in Multiple Regression.
ERIC Educational Resources Information Center
Thayer, Jerome D.
The extent to which standardized regression coefficients (beta values) can be used to determine the importance of a variable in an equation was explored. The beta value and the part correlation coefficient--also called the semi-partial correlation coefficient and reported in squared form as the incremental "r squared"--were compared for variables…
Regressive Evolution in Astyanax Cavefish
Jeffery, William R.
2013-01-01
A diverse group of animals, including members of most major phyla, have adapted to life in the perpetual darkness of caves. These animals are united by the convergence of two regressive phenotypes, loss of eyes and pigmentation. The mechanisms of regressive evolution are poorly understood. The teleost Astyanax mexicanus is of special significance in studies of regressive evolution in cave animals. This species includes an ancestral surface-dwelling form and many conspecific cave-dwelling forms, some of which have evolved their regressive phenotypes independently. Recent advances in Astyanax development and genetics have provided new information about how eyes and pigment are lost during cavefish evolution; namely, they have revealed some of the molecular and cellular mechanisms involved in trait modification, the number and identity of the underlying genes and mutations, the molecular basis of parallel evolution, and the evolutionary forces driving adaptation to the cave environment. PMID:19640230
Survival Data and Regression Models
NASA Astrophysics Data System (ADS)
Grégoire, G.
2014-12-01
We start this chapter by introducing some basic elements for the analysis of censored survival data. Then we focus on right-censored data and develop two types of regression models. The first one concerns the so-called accelerated failure time (AFT) models, which are parametric models where a function of a parameter depends linearly on the covariables. The second one is a semiparametric model, where the covariables enter in a multiplicative form in the expression of the hazard rate function. The main statistical tool for analysing these regression models is the maximum likelihood methodology and, although we recall some essential results about the ML theory, we refer to the chapter "Logistic Regression" for a more detailed presentation.
[Is regression of atherosclerosis possible?].
Thomas, D; Richard, J L; Emmerich, J; Bruckert, E; Delahaye, F
1992-10-01
Experimental studies have shown the regression of atherosclerosis in animals given a cholesterol-rich diet and then given a normal diet or hypolipidemic therapy. Despite favourable results of clinical trials of primary prevention modifying the lipid profile, the concept of atherosclerosis regression in man remains very controversial. The methodological approach is difficult: it is based on angiographic data and requires strict standardisation of angiographic views and reliable quantitative techniques of analysis, which are available with image processing. Several methodologically acceptable clinical coronary studies have shown not only stabilisation but also regression of atherosclerotic lesions with reductions of about 25% in total cholesterol levels and of about 40% in LDL cholesterol levels. These reductions were obtained either by drugs as in CLAS (Cholesterol Lowering Atherosclerosis Study), FATS (Familial Atherosclerosis Treatment Study) and SCOR (Specialized Center of Research Intervention Trial), by profound modifications in dietary habits as in the Lifestyle Heart Trial, or by surgery (ileo-caecal bypass) as in POSCH (Program On the Surgical Control of the Hyperlipidemias). On the other hand, trials with non-lipid-lowering drugs such as the calcium antagonists (INTACT, MHIS) have not shown significant regression of existing atherosclerotic lesions but only a decrease in the number of new lesions. The clinical benefits of these regression studies are difficult to demonstrate given the limited period of observation, relatively small population numbers and the fact that in some cases the subjects were asymptomatic. The decrease in the number of cardiovascular events therefore seems relatively modest and concerns essentially subjects who were symptomatic initially. The clinical repercussion of studies of prevention involving a single lipid factor is probably partially due to the reduction in progression and anatomical regression of the atherosclerotic plaque
Regression Segmentation for M³ Spinal Images.
Wang, Zhijie; Zhen, Xiantong; Tay, KengYeow; Osman, Said; Romano, Walter; Li, Shuo
2015-08-01
Clinical routine often requires analyzing spinal images of multiple anatomic structures in multiple anatomic planes from multiple imaging modalities (M³). Unfortunately, existing methods for segmenting spinal images are still limited to one specific structure, in one specific plane or from one specific modality (S³). In this paper, we propose a novel approach, Regression Segmentation, that is for the first time able to segment M³ spinal images in one single unified framework. This approach formulates the segmentation task innovatively as a boundary regression problem: modeling a highly nonlinear mapping function from substantially diverse M³ images directly to desired object boundaries. Leveraging the advancement of sparse kernel machines, regression segmentation is fulfilled by a multi-dimensional support vector regressor (MSVR) which operates in an implicit, high-dimensional feature space where M³ diversity and specificity can be systematically categorized, extracted, and handled. The proposed regression segmentation approach was thoroughly tested on images from 113 clinical subjects including both disc and vertebral structures, in both sagittal and axial planes, and from both MRI and CT modalities. The overall result reaches a high dice similarity index (DSI) of 0.912 and a low boundary distance (BD) of 0.928 mm. With our unified and expandable framework, an efficient clinical tool for M³ spinal image segmentation can be easily achieved, and will substantially benefit the diagnosis and treatment of spinal diseases.
Genetic Programming Transforms in Linear Regression Situations
NASA Astrophysics Data System (ADS)
Castillo, Flor; Kordon, Arthur; Villa, Carlos
The chapter summarizes the use of Genetic Programming (GP) in Multiple Linear Regression (MLR) to address multicollinearity and Lack of Fit (LOF). The basis of the proposed method is applying appropriate input transforms (model respecification) that deal with these issues while preserving the information content of the original variables. The transforms are selected from symbolic regression models with optimal trade-off between accuracy of prediction and expressional complexity, generated by multiobjective Pareto-front GP. The chapter includes a comparative study of the GP-generated transforms with Ridge Regression, a variant of ordinary Multiple Linear Regression, which has been a useful and commonly employed approach for reducing multicollinearity. The advantages of GP-generated model respecification are clearly defined and demonstrated. Some recommendations for transform selection are given as well. The application benefits of the proposed approach are illustrated with a real industrial application in one of the broadest empirical modeling areas in manufacturing: robust inferential sensors. The chapter contributes to increasing the awareness of the potential of GP in statistical model building by MLR.
2011-01-01
Background: Dementia and cognitive impairment associated with aging are a major medical and social concern. Neuropsychological testing is a key element in the diagnostic procedures of Mild Cognitive Impairment (MCI), but presently has limited value in the prediction of progression to dementia. We advance the hypothesis that newer statistical classification methods derived from data mining and machine learning, like Neural Networks, Support Vector Machines and Random Forests, can improve accuracy, sensitivity and specificity of predictions obtained from neuropsychological testing. Seven nonparametric classifiers derived from data mining methods (Multilayer Perceptrons Neural Networks, Radial Basis Function Neural Networks, Support Vector Machines, CART, CHAID and QUEST Classification Trees and Random Forests) were compared to three traditional classifiers (Linear Discriminant Analysis, Quadratic Discriminant Analysis and Logistic Regression) in terms of overall classification accuracy, specificity, sensitivity, area under the ROC curve and Press' Q. Model predictors were 10 neuropsychological tests currently used in the diagnosis of dementia. Statistical distributions of classification parameters obtained from a 5-fold cross-validation were compared using Friedman's nonparametric test. Results: Press' Q test showed that all classifiers performed better than chance alone (p < 0.05). Support Vector Machines showed the largest overall classification accuracy (Median (Me) = 0.76) and area under the ROC (Me = 0.90). However, this method showed high specificity (Me = 1.0) but low sensitivity (Me = 0.3). Random Forest ranked second in overall accuracy (Me = 0.73), with high area under the ROC (Me = 0.73), specificity (Me = 0.73) and sensitivity (Me = 0.64). Linear Discriminant Analysis also showed acceptable overall accuracy (Me = 0.66), with acceptable area under the ROC (Me = 0.72), specificity (Me = 0.66) and sensitivity (Me = 0.64). The remaining classifiers showed
Competing risks regression for clustered data.
Zhou, Bingqing; Fine, Jason; Latouche, Aurelien; Labopin, Myriam
2012-07-01
A population average regression model is proposed to assess the marginal effects of covariates on the cumulative incidence function when there is dependence across individuals within a cluster in the competing risks setting. This method extends the Fine-Gray proportional hazards model for the subdistribution to situations where individuals within a cluster may be correlated due to unobserved shared factors. Estimators of the regression parameters in the marginal model are developed under an independence working assumption where the correlation across individuals within a cluster is completely unspecified. The estimators are consistent and asymptotically normal, and variance estimation may be achieved without specifying the form of the dependence across individuals. A simulation study shows that the inferential procedures perform well with realistic sample sizes. The practical utility of the methods is illustrated with data from the European Bone Marrow Transplant Registry.
NASA Astrophysics Data System (ADS)
Dyar, M. D.; Carmosino, M. L.; Breves, E. A.; Ozanne, M. V.; Clegg, S. M.; Wiens, R. C.
2012-04-01
A remote laser-induced breakdown spectrometer (LIBS) designed to simulate the ChemCam instrument on the Mars Science Laboratory Rover Curiosity was used to probe 100 geologic samples at a 9-m standoff distance. ChemCam consists of an integrated remote LIBS instrument that will probe samples up to 7 m from the mast of the rover and a remote micro-imager (RMI) that will record context images. The elemental compositions of 100 igneous and highly-metamorphosed rocks are determined with LIBS using three variations of multivariate analysis, with a goal of improving the analytical accuracy. Two forms of partial least squares (PLS) regression are employed with finely-tuned parameters: PLS-1 regresses a single response variable (elemental concentration) against the observation variables (spectra, or intensity at each of 6144 spectrometer channels), while PLS-2 simultaneously regresses multiple response variables (concentrations of the ten major elements in rocks) against the observation predictor variables, taking advantage of natural correlations between elements. Those results are contrasted with those from the multivariate regression technique of the least absolute shrinkage and selection operator (lasso), which is a penalized shrunken regression method that selects the specific channels for each element that explain the most variance in the concentration of that element. To make this comparison, we use results of cross-validation and of held-out testing, and employ unscaled and uncentered spectral intensity data because all of the input variables are already in the same units. Results demonstrate that the lasso, PLS-1, and PLS-2 all yield comparable results in terms of accuracy for this dataset. However, the interpretability of these methods differs greatly in terms of fundamental understanding of LIBS emissions. PLS techniques generate principal components, linear combinations of intensities at any number of spectrometer channels, which explain as much variance in the
Kernel Partial Least Squares for Nonlinear Regression and Discrimination
NASA Technical Reports Server (NTRS)
Rosipal, Roman; Clancy, Daniel (Technical Monitor)
2002-01-01
This paper summarizes recent results on applying the method of partial least squares (PLS) in a reproducing kernel Hilbert space (RKHS). A previously proposed kernel PLS regression model was proven to be competitive with other regularized regression methods in RKHS. The family of nonlinear kernel-based PLS models is extended by considering the kernel PLS method for discrimination. Theoretical and experimental results on a two-class discrimination problem indicate usefulness of the method.
Combining regression trees and radial basis function networks.
Orr, M; Hallam, J; Takezawa, K; Murra, A; Ninomiya, S; Oide, M; Leonard, T
2000-12-01
We describe a method for non-parametric regression which combines regression trees with radial basis function networks. The method is similar to that of Kubat, who was first to suggest such a combination, but has some significant improvements. We demonstrate the features of the new method, compare its performance with other methods on DELVE data sets and apply it to a real world problem involving the classification of soybean plants from digital images.
ERIC Educational Resources Information Center
Semmens, Bob, Ed.; Cook, Sandy, Ed.
This document contains 19 papers presented at an international forum on education in penal systems. The following papers are included: "Burning" (Craig W.J. Minogue); "The Acquisition of Cognitive Skills as a Means of Recidivism Reduction: A Former Prisoner's Perspective" (Trevor Darryl Doherty); "CEA (Correctional Education Association)…
ERIC Educational Resources Information Center
Klofas, John; Duffee, David E.
1981-01-01
Reexamines the assumptions of the change grid regarding the channeling of masses of clients into change strategies programs. Penal organizations specifically select and place clients so that programs remain stable, rather than sequence programs to meet the needs of clients. (Author)
45 CFR 261.13 - May an individual be penalized for not following an individual responsibility plan?
Code of Federal Regulations, 2011 CFR
2011-10-01
... an individual responsibility plan? 261.13 Section 261.13 Public Welfare Regulations Relating to... Addressing Individual Responsibility? § 261.13 May an individual be penalized for not following an individual responsibility plan? Yes. If an individual fails without good cause to comply with an individual...
45 CFR 261.13 - May an individual be penalized for not following an individual responsibility plan?
Code of Federal Regulations, 2010 CFR
2010-10-01
... an individual responsibility plan? 261.13 Section 261.13 Public Welfare Regulations Relating to... Addressing Individual Responsibility? § 261.13 May an individual be penalized for not following an individual responsibility plan? Yes. If an individual fails without good cause to comply with an individual...
Model selection for logistic regression models
NASA Astrophysics Data System (ADS)
Duller, Christine
2012-09-01
Model selection for logistic regression models decides which of some given potential regressors have an effect and hence should be included in the final model. The second interesting question is whether a certain factor is heterogeneous among some subsets, i.e. whether the model should include a random intercept or not. In this paper these questions will be answered with classical as well as with Bayesian methods. The applications show some results of recent research projects in medicine and business administration.
Differential correction schemes in nonlinear regression
NASA Technical Reports Server (NTRS)
Decell, H. P., Jr.; Speed, F. M.
1972-01-01
Classical iterative methods in nonlinear regression are reviewed and improved upon. This is accomplished by discussion of the geometrical and theoretical motivation for introducing modifications using generalized matrix inversion. Examples having inherent pitfalls are presented and compared in terms of results obtained using classical and modified techniques. The modification is shown to be useful alone or in conjunction with other modifications appearing in the literature.
Realization of Ridge Regression in MATLAB
NASA Astrophysics Data System (ADS)
Dimitrov, S.; Kovacheva, S.; Prodanova, K.
2008-10-01
The least squares estimator (LSE) of the coefficients in the classical linear regression model is unbiased. In the case of multicollinearity among the columns of the design matrix, the LSE has very large variance, i.e., the estimator is unstable. A more stable (but biased) estimator can be constructed using the ridge estimator (RE). In this paper the basic methods of obtaining ridge estimators and numerical procedures for their realization in MATLAB are considered. An application to a pharmacokinetics problem is considered.
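The instability described above can be seen in a small sketch of the standard ridge estimator, (X'X + kI)^(-1) X'y, for two nearly collinear predictors. The paper itself works in MATLAB; this Python version, the function name, the data, and the choice k = 1 are all illustrative assumptions rather than the authors' procedure.

```python
def ridge_two_predictors(x1, x2, y, k):
    """Ridge estimate (X'X + kI)^{-1} X'y for two centered predictors, no intercept."""
    # Entries of X'X + kI.
    a = sum(u * u for u in x1) + k
    b = sum(u * v for u, v in zip(x1, x2))
    d = sum(v * v for v in x2) + k
    # Entries of X'y.
    r1 = sum(u * t for u, t in zip(x1, y))
    r2 = sum(v * t for v, t in zip(x2, y))
    # Solve the 2x2 system by Cramer's rule.
    det = a * d - b * b
    return ((d * r1 - b * r2) / det, (a * r2 - b * r1) / det)


# Hypothetical centered data: x2 is almost a copy of x1, y is roughly x1 + x2.
x1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
x2 = [-2.1, -0.9, 0.0, 1.1, 1.9]
y = [-4.0, -2.1, 0.1, 1.9, 4.1]

ols = ridge_two_predictors(x1, x2, y, 0.0)    # k = 0 reduces to least squares
ridge = ridge_two_predictors(x1, x2, y, 1.0)  # k > 0 shrinks the coefficients
print("least squares:", ols)
print("ridge (k=1):  ", ridge)
```

With k = 0 the two coefficients take opposite signs even though both predictors move with y; a positive k pulls them toward similar, smaller values at the cost of some bias.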
Face Alignment via Regressing Local Binary Features.
Ren, Shaoqing; Cao, Xudong; Wei, Yichen; Sun, Jian
2016-03-01
This paper presents a highly efficient and accurate regression approach for face alignment. Our approach has two novel components: 1) a set of local binary features and 2) a locality principle for learning those features. The locality principle guides us to learn a set of highly discriminative local binary features for each facial landmark independently. The obtained local binary features are used to jointly learn a linear regression for the final output. This approach achieves state-of-the-art results when tested on the most challenging benchmarks to date. Furthermore, because extracting and regressing local binary features are computationally very cheap, our system is much faster than previous methods. It achieves over 3000 frames per second (FPS) on a desktop or 300 FPS on a mobile phone for locating a few dozen landmarks. We also study a key issue that is important but has received little attention in previous research: the face detector used to initialize alignment. We investigate several face detectors and perform a quantitative evaluation of how they affect alignment accuracy. We find that an alignment-friendly detector can further greatly boost the accuracy of our alignment method, reducing the error by up to 16% in relative terms. To facilitate practical usage of face detection/alignment methods, we also propose a convenient metric to measure how good a detector is for alignment initialization.
General Regression and Representation Model for Classification
Qian, Jianjun; Yang, Jian; Xu, Yong
2014-01-01
Recently, regularized coding-based classification methods (e.g. SRC and CRC) have shown great potential for pattern classification. However, most existing coding methods assume that the representation residuals are uncorrelated. In real-world applications, this assumption does not hold. In this paper, we take account of the correlations of the representation residuals and develop a general regression and representation model (GRR) for classification. GRR not only has the advantages of CRC, but also makes full use of the prior information (e.g. the correlations between representation residuals and representation coefficients) and the specific information (weight matrix of image pixels) to enhance the classification performance. GRR uses the generalized Tikhonov regularization and K Nearest Neighbors to learn the prior information from the training data. Meanwhile, the specific information is obtained by using an iterative algorithm to update the feature (or image pixel) weights of the test sample. With the proposed model as a platform, we design two classifiers: basic general regression and representation classifier (B-GRR) and robust general regression and representation classifier (R-GRR). The experimental results demonstrate the performance advantages of the proposed methods over state-of-the-art algorithms. PMID:25531882
Incidence of AIDS cases in Spanish penal facilities through the capture-recapture method, 2000.
Acin, E; Gómez, P; Hernando, P; Corella, I
2003-09-01
Three available sources of information used in the surveillance of AIDS in Spanish prisons were used to carry out a capture-recapture study. Results showed the register of AIDS cases (RCS) considerably underestimates the incidence of this disease in prisons as it covers only 50% of cases. This study highlights the need to use additional sources to the RCS to evaluate the real incidence of AIDS in prisons in Spain.
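The idea behind capture-recapture can be sketched with the simpler two-source case. The study above combines three sources, which generally calls for more elaborate models, so the Chapman estimator below is a simplification, and the counts are hypothetical, not the study's data.

```python
def chapman_estimate(n1, n2, m):
    """Chapman's nearly unbiased two-source estimate of the total number of cases.

    n1: cases found by source 1, n2: cases found by source 2,
    m: cases found by both sources.
    """
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1


# Hypothetical counts: a register and a second source, with 19 cases in both.
register_cases = 50
second_source_cases = 40
in_both = 19

total = chapman_estimate(register_cases, second_source_cases, in_both)
coverage = register_cases / total  # share of estimated cases the register captures
print(round(total, 2), round(coverage, 2))
```

A small overlap between sources relative to their sizes implies many cases missed by both, which is how a register's coverage can be estimated without a complete list of cases.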
Correlation Weights in Multiple Regression
ERIC Educational Resources Information Center
Waller, Niels G.; Jones, Jeff A.
2010-01-01
A general theory on the use of correlation weights in linear prediction has yet to be proposed. In this paper we take initial steps in developing such a theory by describing the conditions under which correlation weights perform well in population regression models. Using OLS weights as a comparison, we define cases in which the two weighting…
Weighting Regressions by Propensity Scores
ERIC Educational Resources Information Center
Freedman, David A.; Berk, Richard A.
2008-01-01
Regressions can be weighted by propensity scores in order to reduce bias. However, weighting is likely to increase random error in the estimates, and to bias the estimated standard errors downward, even when selection mechanisms are well understood. Moreover, in some cases, weighting will increase the bias in estimated causal parameters. If…
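As a rough illustration of weighting by propensity scores, the sketch below computes an inverse-probability-weighted difference in means, which coincides with the slope from a weighted least-squares regression of the outcome on a treatment indicator alone. The data and the propensity scores are hypothetical and assumed known; in practice the scores are estimated, which is part of what the abstract cautions about.

```python
def ipw_difference(treated, outcome, propensity):
    """Weighted mean outcome of treated units minus that of control units.

    Treated units get weight 1/e and controls 1/(1 - e), where e is the
    (assumed known) probability of treatment for that unit.
    """
    num_t = den_t = num_c = den_c = 0.0
    for t, y, e in zip(treated, outcome, propensity):
        if t:
            w = 1.0 / e
            num_t += w * y
            den_t += w
        else:
            w = 1.0 / (1.0 - e)
            num_c += w * y
            den_c += w
    return num_t / den_t - num_c / den_c


# Hypothetical data: two treated units and two controls.
treated = [1, 1, 0, 0]
outcome = [5.0, 7.0, 2.0, 4.0]
propensity = [0.5, 0.25, 0.5, 0.2]

effect = ipw_difference(treated, outcome, propensity)
print(round(effect, 4))
```

The treated unit with propensity 0.25 receives weight 4, twice that of the other treated unit; a few such heavily weighted observations can dominate the estimate, which is one way weighting can inflate random error.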
Multiple Regression: A Leisurely Primer.
ERIC Educational Resources Information Center
Daniel, Larry G.; Onwuegbuzie, Anthony J.
Multiple regression is a useful statistical technique when the researcher is considering situations in which variables of interest are theorized to be multiply caused. It may also be useful in those situations in which the researcher is interested in studies of predictability of phenomena of interest. This paper provides an introduction to…
Cactus: An Introduction to Regression
ERIC Educational Resources Information Center
Hyde, Hartley
2008-01-01
When the author first used "VisiCalc," the author thought it a very useful tool when he had the formulas. But how could he design a spreadsheet if there was no known formula for the quantities he was trying to predict? A few months later, the author relates he learned to use multiple linear regression software and suddenly it all clicked into…
Ridge Regression for Interactive Models.
ERIC Educational Resources Information Center
Tate, Richard L.
1988-01-01
An exploratory study of the value of ridge regression for interactive models is reported. Assuming that the linear terms in a simple interactive model are centered to eliminate non-essential multicollinearity, a variety of common models, representing both ordinal and disordinal interactions, are shown to have "orientations" that are favorable to…
Quantile Regression with Censored Data
ERIC Educational Resources Information Center
Lin, Guixian
2009-01-01
The Cox proportional hazards model and the accelerated failure time model are frequently used in survival data analysis. They are powerful, yet have limitations due to their model assumptions. Quantile regression offers a semiparametric approach to model data with possible heterogeneity. It is particularly powerful for censored responses, where the…
NASA Astrophysics Data System (ADS)
Wu, Chunhung
2016-04-01
Few studies have discussed the applicability of statistical landslide susceptibility (LS) models to extreme rainfall-induced landslide events. This research focuses on the comparison and applicability of LS models based on four methods, including landslide ratio-based logistic regression (LRBLR), frequency ratio (FR), weight of evidence (WOE), and instability index (II) methods, in an extreme rainfall-induced landslide case. The landslide inventory in the Chishan river watershed, Southwestern Taiwan, after 2009 Typhoon Morakot is the main material in this research. The Chishan river watershed is a tributary watershed of the Kaoping river watershed, which is a landslide- and erosion-prone watershed with an annual average suspended load of 3.6×10^7 MT/yr (ranks 11th in the world). Typhoon Morakot struck Southern Taiwan from Aug. 6-10 in 2009 and dumped nearly 2,000 mm of rainfall in the Chishan river watershed. The 24-hour, 48-hour, and 72-hour accumulated rainfall in the Chishan river watershed exceeded the 200-year return period accumulated rainfall. 2,389 landslide polygons in the Chishan river watershed were extracted from SPOT 5 images after 2009 Typhoon Morakot. The total landslide area is around 33.5 km², equal to a landslide ratio of 4.1%. The main landslide types based on Varnes' (1978) classification are rotational and translational slides. The two characteristics of an extreme rainfall-induced landslide event are dense landslide distribution and large occupation of downslope landslide areas owing to headward erosion and bank erosion in the flooding processes. The area of downslope landslides in the Chishan river watershed after 2009 Typhoon Morakot is 3.2 times higher than that of upslope landslide areas. The prediction accuracy of LS models based on LRBLR, FR, WOE, and II methods has been proven to be over 70%. The model performance and applicability of four models in a landslide-prone watershed with dense distribution of rainfall
Contrasting OLS and Quantile Regression Approaches to Student "Growth" Percentiles
ERIC Educational Resources Information Center
Castellano, Katherine Elizabeth; Ho, Andrew Dean
2013-01-01
Regression methods can locate student test scores in a conditional distribution, given past scores. This article contrasts and clarifies two approaches to describing these locations in terms of readily interpretable percentile ranks or "conditional status percentile ranks." The first is Betebenner's quantile regression approach that results in…
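The contrast between least squares and quantile regression comes down to the loss being minimized. The sketch below uses the intercept-only case to show that squared error is minimized by the mean, while the check (pinball) loss is minimized by a sample quantile. The scores and function names are illustrative assumptions and are not drawn from the article.

```python
def pinball_loss(residual, tau):
    """Check loss used by quantile regression for quantile level tau in (0, 1)."""
    return tau * residual if residual >= 0 else (tau - 1.0) * residual


def best_constant(values, tau):
    """Constant minimizing total pinball loss, searched over the data points.

    A minimizer of this piecewise-linear loss is always attained at a data
    point, so checking the observed values is sufficient.
    """
    return min(values, key=lambda c: sum(pinball_loss(v - c, tau) for v in values))


# Hypothetical scores with one extreme value.
scores = [1, 2, 3, 4, 100]

mean_fit = sum(scores) / len(scores)       # least-squares fit of a constant
median_fit = best_constant(scores, 0.5)    # tau = 0.5 gives the median
lower_fit = best_constant(scores, 0.25)    # tau = 0.25 gives a lower quantile
print(mean_fit, median_fit, lower_fit)
```

The single extreme score drags the least-squares fit far above most of the data, whereas the quantile fits stay among the typical values.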
An Importance Sampling EM Algorithm for Latent Regression Models
ERIC Educational Resources Information Center
von Davier, Matthias; Sinharay, Sandip
2007-01-01
Reporting methods used in large-scale assessments such as the National Assessment of Educational Progress (NAEP) rely on latent regression models. To fit the latent regression model using the maximum likelihood estimation technique, multivariate integrals must be evaluated. In the computer program MGROUP used by the Educational Testing Service for…
Semisupervised Clustering by Iterative Partition and Regression with Neuroscience Applications
Qian, Guoqi; Wu, Yuehua; Ferrari, Davide; Qiao, Puxue; Hollande, Frédéric
2016-01-01
Regression clustering is a statistical learning and data mining method that mixes unsupervised and supervised learning and is found in a wide range of applications including artificial intelligence and neuroscience. It performs unsupervised learning when it clusters the data according to their respective unobserved regression hyperplanes. The method also performs supervised learning when it fits regression hyperplanes to the corresponding data clusters. Applying regression clustering in practice requires means of determining the underlying number of clusters in the data, finding the cluster label of each data point, and estimating the regression coefficients of the model. In this paper, we review the estimation and selection issues in regression clustering with regard to the least squares and robust statistical methods. We also provide a model selection based technique to determine the number of regression clusters underlying the data. We further develop a computing procedure for regression clustering estimation and selection. Finally, simulation studies are presented for assessing the procedure, together with analyzing a real data set on RGB cell marking in neuroscience to illustrate and interpret the method. PMID:27212939
Tutorial on Using Regression Models with Count Outcomes Using R
ERIC Educational Resources Information Center
Beaujean, A. Alexander; Morgan, Grant B.
2016-01-01
Education researchers often study count variables, such as times a student reached a goal, discipline referrals, and absences. Most researchers that study these variables use typical regression methods (i.e., ordinary least-squares) either with or without transforming the count variables. In either case, using typical regression for count data can…
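One common alternative to ordinary least squares for count outcomes is Poisson regression with a log link. The tutorial itself uses R; the sketch below is a Python illustration under assumed data, fitting a single-predictor model by Newton-Raphson. The function name, the data, and the fixed iteration count are hypothetical choices, not the tutorial's code.

```python
import math


def fit_poisson(x, y, iterations=25):
    """Fit log(E[y]) = b0 + b1 * x by Newton-Raphson on the Poisson log-likelihood."""
    # Start from the intercept-only fit.
    b0, b1 = math.log(sum(y) / len(y)), 0.0
    for _ in range(iterations):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Gradient of the log-likelihood with respect to (b0, b1).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Entries of the information matrix (negative Hessian).
        h00 = sum(mu)
        h01 = sum(mi * xi for xi, mi in zip(x, mu))
        h11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        # Newton step: solve the 2x2 system by Cramer's rule.
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical counts that double with each unit increase in x.
x = [0, 1, 2, 3, 4]
y = [1, 2, 4, 8, 16]
b0, b1 = fit_poisson(x, y)
print(round(b0, 4), round(b1, 4), round(math.exp(b1), 4))
```

The exponentiated slope is read as a rate ratio: the multiplicative change in the expected count for a one-unit increase in the predictor.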