Science.gov

Sample records for additive regression trees

  1. Nonparametric survival analysis using Bayesian Additive Regression Trees (BART).

    PubMed

    Sparapani, Rodney A; Logan, Brent R; McCulloch, Robert E; Laud, Purushottam W

    2016-07-20

    Bayesian additive regression trees (BART) provide a framework for flexible nonparametric modeling of relationships of covariates to outcomes. Recently, BART models have been shown to provide excellent predictive performance, for both continuous and binary outcomes, and exceeding that of its competitors. Software is also readily available for such outcomes. In this article, we introduce modeling that extends the usefulness of BART in medical applications by addressing needs arising in survival analysis. Simulation studies of one-sample and two-sample scenarios, in comparison with long-standing traditional methods, establish face validity of the new approach. We then demonstrate the model's ability to accommodate data from complex regression models with a simulation study of a nonproportional hazards scenario with crossing survival functions and survival function estimation in a scenario where hazards are multiplicatively modified by a highly nonlinear function of the covariates. Using data from a recently published study of patients undergoing hematopoietic stem cell transplantation, we illustrate the use and some advantages of the proposed method in medical investigations. Copyright © 2016 John Wiley & Sons, Ltd. PMID:26854022

  2. A multiple additive regression tree analysis of three exposure measures during Hurricane Katrina.

    PubMed

    Curtis, Andrew; Li, Bin; Marx, Brian D; Mills, Jacqueline W; Pine, John

    2011-01-01

    This paper analyses structural and personal exposure to Hurricane Katrina. Structural exposure is measured by flood height and building damage; personal exposure is measured by the locations of 911 calls made during the response. Using these variables, this paper characterises the geography of exposure and also demonstrates the utility of a robust analytical approach in understanding health-related challenges to disadvantaged populations during recovery. Analysis is conducted using a contemporary statistical approach, a multiple additive regression tree (MART), which displays considerable improvement over traditional regression analysis. By using MART, the percentage of improvement in R-squares over standard multiple linear regression ranges from about 62 to more than 100 per cent. The most revealing finding is the modelled verification that African Americans experienced disproportionate exposure in both structural and personal contexts. Given the impact of exposure to health outcomes, this finding has implications for understanding the long-term health challenges facing this population.

  3. Structural regression trees

    SciTech Connect

    Kramer, S.

    1996-12-31

    In many real-world domains the task of machine learning algorithms is to learn a theory for predicting numerical values. In particular several standard test domains used in Inductive Logic Programming (ILP) are concerned with predicting numerical values from examples and relational and mostly non-determinate background knowledge. However, so far no ILP algorithm except one can predict numbers and cope with nondeterminate background knowledge. (The only exception is a covering algorithm called FORS.) In this paper we present Structural Regression Trees (SRT), a new algorithm which can be applied to the above class of problems. SRT integrates the statistical method of regression trees into ILP. It constructs a tree containing a literal (an atomic formula or its negation) or a conjunction of literals in each node, and assigns a numerical value to each leaf. SRT provides more comprehensible results than purely statistical methods, and can be applied to a class of problems most other ILP systems cannot handle. Experiments in several real-world domains demonstrate that the approach is competitive with existing methods, indicating that the advantages are not at the expense of predictive accuracy.

  4. Combining regression trees and radial basis function networks.

    PubMed

    Orr, M; Hallam, J; Takezawa, K; Murra, A; Ninomiya, S; Oide, M; Leonard, T

    2000-12-01

    We describe a method for non-parametric regression which combines regression trees with radial basis function networks. The method is similar to that of Kubat, who was first to suggest such a combination, but has some significant improvements. We demonstrate the features of the new method, compare its performance with other methods on DELVE data sets and apply it to a real world problem involving the classification of soybean plants from digital images.

  5. Growth in Mathematics Achievement: Analysis with Classification and Regression Trees

    ERIC Educational Resources Information Center

    Ma, Xin

    2005-01-01

    A recently developed statistical technique, often referred to as classification and regression trees (CART), holds great potential for researchers to discover how student-level (and school-level) characteristics interactively affect growth in mathematics achievement. CART is a host of advanced statistical methods that statistically cluster…

  6. Comparative study of biodegradability prediction of chemicals using decision trees, functional trees, and logistic regression.

    PubMed

    Chen, Guangchao; Li, Xuehua; Chen, Jingwen; Zhang, Ya-Nan; Peijnenburg, Willie J G M

    2014-12-01

    Biodegradation is the principal environmental dissipation process of chemicals. As such, it is a dominant factor determining the persistence and fate of organic chemicals in the environment, and is therefore of critical importance to chemical management and regulation. In the present study, the authors developed in silico methods assessing biodegradability based on a large heterogeneous set of 825 organic compounds, using the techniques of the C4.5 decision tree, the functional inner regression tree, and logistic regression. External validation was subsequently carried out by 2 independent test sets of 777 and 27 chemicals. As a result, the functional inner regression tree exhibited the best predictability with predictive accuracies of 81.5% and 81.0%, respectively, on the training set (825 chemicals) and test set I (777 chemicals). Performance of the developed models on the 2 test sets was subsequently compared with that of the Estimation Program Interface (EPI) Suite Biowin 5 and Biowin 6 models, which also showed a better predictability of the functional inner regression tree model. The model built in the present study exhibits a reasonable predictability compared with existing models while possessing a transparent algorithm. Interpretation of the mechanisms of biodegradation was also carried out based on the models developed.

  7. Predicting Fusarium head blight epidemics with boosted regression trees.

    PubMed

    Shah, D A; De Wolf, E D; Paul, P A; Madden, L V

    2014-07-01

    Predicting major Fusarium head blight (FHB) epidemics allows for the judicious use of fungicides in suppressing disease development. Our objectives were to investigate the utility of boosted regression trees (BRTs) for predictive modeling of FHB epidemics in the United States, and to compare the predictive performances of the BRT models with those of logistic regression models we had developed previously. The data included 527 FHB observations from 15 states over 26 years. BRTs were fit to a training data set of 369 FHB observations, in which FHB epidemics were classified as either major (severity ≥ 10%) or non-major (severity < 10%), linked to a predictor matrix consisting of 350 weather-based variables and categorical variables for wheat type (spring or winter), presence or absence of corn residue, and cultivar resistance. Predictive performance was estimated on a test (holdout) data set consisting of the remaining 158 observations. BRTs had a misclassification rate of 0.23 on the test data, which was 31% lower than the average misclassification rate over 15 logistic regression models we had presented earlier. The strongest predictors were generally one of mean daily relative humidity, mean daily temperature, and the number of hours in which the temperature was between 9 and 30°C and relative humidity ≥ 90% simultaneously. Moreover, the predicted risk of major epidemics increased substantially when mean daily relative humidity rose above 70%, which is a lower threshold than previously modeled for most plant pathosystems. BRTs led to novel insights into the weather-epidemic relationship.

  8. Capacitance Regression Modelling Analysis on Latex from Selected Rubber Tree Clones

    NASA Astrophysics Data System (ADS)

    Rosli, A. D.; Hashim, H.; Khairuzzaman, N. A.; Mohd Sampian, A. F.; Baharudin, R.; Abdullah, N. E.; Sulaiman, M. S.; Kamaru'zzaman, M.

    2015-11-01

    This paper investigates the capacitance regression modelling performance of latex for various rubber tree clones, namely clone 2002, 2008, 2014 and 3001. Conventionally, the rubber tree clones identification are based on observation towards tree features such as shape of leaf, trunk, branching habit and pattern of seeds texture. The former method requires expert persons and very time-consuming. Currently, there is no sensing device based on electrical properties that can be employed to measure different clones from latex samples. Hence, with a hypothesis that the dielectric constant of each clone varies, this paper discusses the development of a capacitance sensor via Capacitance Comparison Bridge (known as capacitance sensor) to measure an output voltage of different latex samples. The proposed sensor is initially tested with 30ml of latex sample prior to gradually addition of dilution water. The output voltage and capacitance obtained from the test are recorded and analyzed using Simple Linear Regression (SLR) model. This work outcome infers that latex clone of 2002 has produced the highest and reliable linear regression line with determination coefficient of 91.24%. In addition, the study also found that the capacitive elements in latex samples deteriorate if it is diluted with higher volume of water.

  9. Comparing Methodologies for Developing an Early Warning System: Classification and Regression Tree Model versus Logistic Regression. REL 2015-077

    ERIC Educational Resources Information Center

    Koon, Sharon; Petscher, Yaacov

    2015-01-01

    The purpose of this report was to explicate the use of logistic regression and classification and regression tree (CART) analysis in the development of early warning systems. It was motivated by state education leaders' interest in maintaining high classification accuracy while simultaneously improving practitioner understanding of the rules…

  10. Estimation of adjusted rate differences using additive negative binomial regression.

    PubMed

    Donoghoe, Mark W; Marschner, Ian C

    2016-08-15

    Rate differences are an important effect measure in biostatistics and provide an alternative perspective to rate ratios. When the data are event counts observed during an exposure period, adjusted rate differences may be estimated using an identity-link Poisson generalised linear model, also known as additive Poisson regression. A problem with this approach is that the assumption of equality of mean and variance rarely holds in real data, which often show overdispersion. An additive negative binomial model is the natural alternative to account for this; however, standard model-fitting methods are often unable to cope with the constrained parameter space arising from the non-negativity restrictions of the additive model. In this paper, we propose a novel solution to this problem using a variant of the expectation-conditional maximisation-either algorithm. Our method provides a reliable way to fit an additive negative binomial regression model and also permits flexible generalisations using semi-parametric regression functions. We illustrate the method using a placebo-controlled clinical trial of fenofibrate treatment in patients with type II diabetes, where the outcome is the number of laser therapy courses administered to treat diabetic retinopathy. An R package is available that implements the proposed method. Copyright © 2016 John Wiley & Sons, Ltd. PMID:27073156

  11. A partially linear tree-based regression model for assessing complex joint gene-gene and gene-environment effects.

    PubMed

    Chen, Jinbo; Yu, Kai; Hsing, Ann; Therneau, Terry M

    2007-04-01

    The success of genetic dissection of complex diseases may greatly benefit from judicious exploration of joint gene effects, which, in turn, critically depends on the power of statistical tools. Standard regression models are convenient for assessing main effects and low-order gene-gene interactions but not for exploring complex higher-order interactions. Tree-based methodology is an attractive alternative for disentangling possible interactions, but it has difficulty in modeling additive main effects. This work proposes a new class of semiparametric regression models, termed partially linear tree-based regression (PLTR) models, which exhibit the advantages of both generalized linear regression and tree models. A PLTR model quantifies joint effects of genes and other risk factors by a combination of linear main effects and a non-parametric tree -structure. We propose an iterative algorithm to fit the PLTR model, and a unified resampling approach for identifying and testing the significance of the optimal "pruned" tree nested within the tree resultant from the fitting algorithm. Simulation studies showed that the resampling procedure maintained the correct type I error rate. We applied the PLTR model to assess the association between biliary stone risk and 53 single nucleotide polymorphisms (SNPs) in the inflammation pathway in a population-based case-control study. The analysis yielded an interesting parsimonious summary of the joint effect of all SNPs. The proposed model is also useful for exploring gene-environment interactions and has broad implications for applying the tree methodology to genetic epidemiology research.

  12. The process and utility of classification and regression tree methodology in nursing research

    PubMed Central

    Kuhn, Lisa; Page, Karen; Ward, John; Worrall-Carter, Linda

    2014-01-01

    Aim This paper presents a discussion of classification and regression tree analysis and its utility in nursing research. Background Classification and regression tree analysis is an exploratory research method used to illustrate associations between variables not suited to traditional regression analysis. Complex interactions are demonstrated between covariates and variables of interest in inverted tree diagrams. Design Discussion paper. Data sources English language literature was sourced from eBooks, Medline Complete and CINAHL Plus databases, Google and Google Scholar, hard copy research texts and retrieved reference lists for terms including classification and regression tree* and derivatives and recursive partitioning from 1984–2013. Discussion Classification and regression tree analysis is an important method used to identify previously unknown patterns amongst data. Whilst there are several reasons to embrace this method as a means of exploratory quantitative research, issues regarding quality of data as well as the usefulness and validity of the findings should be considered. Implications for Nursing Research Classification and regression tree analysis is a valuable tool to guide nurses to reduce gaps in the application of evidence to practice. With the ever-expanding availability of data, it is important that nurses understand the utility and limitations of the research method. Conclusion Classification and regression tree analysis is an easily interpreted method for modelling interactions between health-related variables that would otherwise remain obscured. Knowledge is presented graphically, providing insightful understanding of complex and hierarchical relationships in an accessible and useful way to nursing and other health professions. PMID:24237048

  13. Building optimal regression tree by ant colony system-genetic algorithm: application to modeling of melting points.

    PubMed

    Hemmateenejad, Bahram; Shamsipur, Mojtaba; Zare-Shahabadi, Vali; Akhond, Morteza

    2011-10-17

    The classification and regression trees (CART) possess the advantage of being able to handle large data sets and yield readily interpretable models. A conventional method of building a regression tree is recursive partitioning, which results in a good but not optimal tree. Ant colony system (ACS), which is a meta-heuristic algorithm and derived from the observation of real ants, can be used to overcome this problem. The purpose of this study was to explore the use of CART and its combination with ACS for modeling of melting points of a large variety of chemical compounds. Genetic algorithm (GA) operators (e.g., cross averring and mutation operators) were combined with ACS algorithm to select the best solution model. In addition, at each terminal node of the resulted tree, variable selection was done by ACS-GA algorithm to build an appropriate partial least squares (PLS) model. To test the ability of the resulted tree, a set of approximately 4173 structures and their melting points were used (3000 compounds as training set and 1173 as validation set). Further, an external test set containing of 277 drugs was used to validate the prediction ability of the tree. Comparison of the results obtained from both trees showed that the tree constructed by ACS-GA algorithm performs better than that produced by recursive partitioning procedure.

  14. Building optimal regression tree by ant colony system-genetic algorithm: application to modeling of melting points.

    PubMed

    Hemmateenejad, Bahram; Shamsipur, Mojtaba; Zare-Shahabadi, Vali; Akhond, Morteza

    2011-10-17

    The classification and regression trees (CART) possess the advantage of being able to handle large data sets and yield readily interpretable models. A conventional method of building a regression tree is recursive partitioning, which results in a good but not optimal tree. Ant colony system (ACS), which is a meta-heuristic algorithm and derived from the observation of real ants, can be used to overcome this problem. The purpose of this study was to explore the use of CART and its combination with ACS for modeling of melting points of a large variety of chemical compounds. Genetic algorithm (GA) operators (e.g., cross averring and mutation operators) were combined with ACS algorithm to select the best solution model. In addition, at each terminal node of the resulted tree, variable selection was done by ACS-GA algorithm to build an appropriate partial least squares (PLS) model. To test the ability of the resulted tree, a set of approximately 4173 structures and their melting points were used (3000 compounds as training set and 1173 as validation set). Further, an external test set containing of 277 drugs was used to validate the prediction ability of the tree. Comparison of the results obtained from both trees showed that the tree constructed by ACS-GA algorithm performs better than that produced by recursive partitioning procedure. PMID:21907021

  15. Response of the regression tree model to high resolution remote sensing data for predicting percent tree cover in a Mediterranean ecosystem.

    PubMed

    Donmez, Cenk; Berberoglu, Suha; Erdogan, Mehmet Akif; Tanriover, Anil Akin; Cilek, Ahmet

    2015-02-01

    Percent tree cover is the percentage of the ground surface area covered by a vertical projection of the outermost perimeter of the plants. It is an important indicator to reveal the condition of forest systems and has a significant importance for ecosystem models as a main input. The aim of this study is to estimate the percent tree cover of various forest stands in a Mediterranean environment based on an empirical relationship between tree coverage and remotely sensed data in Goksu Watershed located at the Eastern Mediterranean coast of Turkey. A regression tree algorithm was used to simulate spatial fractions of Pinus nigra, Cedrus libani, Pinus brutia, Juniperus excelsa and Quercus cerris using multi-temporal LANDSAT TM/ETM data as predictor variables and land cover information. Two scenes of high resolution GeoEye-1 images were employed for training and testing the model. The predictor variables were incorporated in addition to biophysical variables estimated from the LANDSAT TM/ETM data. Additionally, normalised difference vegetation index (NDVI) was incorporated to LANDSAT TM/ETM band settings as a biophysical variable. Stepwise linear regression (SLR) was applied for selecting the relevant bands to employ in regression tree process. SLR-selected variables produced accurate results in the model with a high correlation coefficient of 0.80. The output values ranged from 0 to 100 %. The different tree species were mapped in 30 m resolution in respect to elevation. Percent tree cover map as a final output was derived using LANDSAT TM/ETM image over Goksu Watershed and the biophysical variables. The results were tested using high spatial resolution GeoEye-1 images. Thus, the combination of the RT algorithm and higher resolution data for percent tree cover mapping were tested and examined in a complex Mediterranean environment.

  16. A Hypothesis Verification Method Using Regression Tree for Semiconductor Yield Analysis

    NASA Astrophysics Data System (ADS)

    Tsuda, Hidetaka; Shirai, Hidehiro; Terabe, Masahiro; Hashimoto, Kazuo; Shinohara, Ayumi

    Several researchers have reported the regression tree analysis for semiconductor yield. However, the scope of these analyses is restricted by the difficulty involved in applying the regression tree analysis to a small number of samples with many attributes. It is often observed that splitting attributes in the route node do not indicate the hypothesized causes of failure. We propose a method for verifying the hypothesized causes of failure, which reduces the number of verification hypotheses. Our method involves selecting sets of analysis data with the same cause of failure, extracting the hypothesis by applying the regression tree analysis separately to each set of analysis data, and merging and sorting attributes according to the t value. The results of an experiment conducted in a real environment show that the proposed method helps in widening the scope of applicability of the regression tree analysis for semiconductor yield.

  17. Logistic Regression-Based Trichotomous Classification Tree and Its Application in Medical Diagnosis.

    PubMed

    Zhu, Yanke; Fang, Jiqian

    2016-11-01

    The classification tree is a valuable methodology for predictive modeling and data mining. However, the current existing classification trees ignore the fact that there might be a subset of individuals who cannot be well classified based on the information of the given set of predictor variables and who might be classified with a higher error rate; most of the current existing classification trees do not use the combination of variables in each step. An algorithm of a logistic regression-based trichotomous classification tree (LRTCT) is proposed that employs the trichotomous tree structure and the linear combination of predictor variables in the recursive partitioning process. Compared with the widely used classification and regression tree through the applications on a series of simulated data and 2 real data sets, the LRTCT performed better in several aspects and does not require excessive complicated calculations.

  18. Application of Boosting Regression Trees to Preliminary Cost Estimation in Building Construction Projects.

    PubMed

    Shin, Yoonseok

    2015-01-01

    Among the recent data mining techniques available, the boosting approach has attracted a great deal of attention because of its effective learning algorithm and strong boundaries in terms of its generalization performance. However, the boosting approach has yet to be used in regression problems within the construction domain, including cost estimations, but has been actively utilized in other domains. Therefore, a boosting regression tree (BRT) is applied to cost estimations at the early stage of a construction project to examine the applicability of the boosting approach to a regression problem within the construction domain. To evaluate the performance of the BRT model, its performance was compared with that of a neural network (NN) model, which has been proven to have a high performance in cost estimation domains. The BRT model has shown results similar to those of NN model using 234 actual cost datasets of a building construction project. In addition, the BRT model can provide additional information such as the importance plot and structure model, which can support estimators in comprehending the decision making process. Consequently, the boosting approach has potential applicability in preliminary cost estimations in a building construction project. PMID:26339227

  19. Application of Boosting Regression Trees to Preliminary Cost Estimation in Building Construction Projects.

    PubMed

    Shin, Yoonseok

    2015-01-01

    Among the recent data mining techniques available, the boosting approach has attracted a great deal of attention because of its effective learning algorithm and strong boundaries in terms of its generalization performance. However, the boosting approach has yet to be used in regression problems within the construction domain, including cost estimations, but has been actively utilized in other domains. Therefore, a boosting regression tree (BRT) is applied to cost estimations at the early stage of a construction project to examine the applicability of the boosting approach to a regression problem within the construction domain. To evaluate the performance of the BRT model, its performance was compared with that of a neural network (NN) model, which has been proven to have a high performance in cost estimation domains. The BRT model has shown results similar to those of NN model using 234 actual cost datasets of a building construction project. In addition, the BRT model can provide additional information such as the importance plot and structure model, which can support estimators in comprehending the decision making process. Consequently, the boosting approach has potential applicability in preliminary cost estimations in a building construction project.

  20. A hybrid approach of stepwise regression, logistic regression, support vector machine, and decision tree for forecasting fraudulent financial statements.

    PubMed

    Chen, Suduan; Goo, Yeong-Jia James; Shen, Zone-De

    2014-01-01

    As the fraudulent financial statement of an enterprise is increasingly serious with each passing day, establishing a valid forecasting fraudulent financial statement model of an enterprise has become an important question for academic research and financial practice. After screening the important variables using the stepwise regression, the study also matches the logistic regression, support vector machine, and decision tree to construct the classification models to make a comparison. The study adopts financial and nonfinancial variables to assist in establishment of the forecasting fraudulent financial statement model. Research objects are the companies to which the fraudulent and nonfraudulent financial statement happened between years 1998 to 2012. The findings are that financial and nonfinancial information are effectively used to distinguish the fraudulent financial statement, and decision tree C5.0 has the best classification effect 85.71%.

  1. A Hybrid Approach of Stepwise Regression, Logistic Regression, Support Vector Machine, and Decision Tree for Forecasting Fraudulent Financial Statements

    PubMed Central

    Goo, Yeong-Jia James; Shen, Zone-De

    2014-01-01

    As the fraudulent financial statement of an enterprise is increasingly serious with each passing day, establishing a valid forecasting fraudulent financial statement model of an enterprise has become an important question for academic research and financial practice. After screening the important variables using the stepwise regression, the study also matches the logistic regression, support vector machine, and decision tree to construct the classification models to make a comparison. The study adopts financial and nonfinancial variables to assist in establishment of the forecasting fraudulent financial statement model. Research objects are the companies to which the fraudulent and nonfraudulent financial statement happened between years 1998 to 2012. The findings are that financial and nonfinancial information are effectively used to distinguish the fraudulent financial statement, and decision tree C5.0 has the best classification effect 85.71%. PMID:25302338

  2. A hybrid approach of stepwise regression, logistic regression, support vector machine, and decision tree for forecasting fraudulent financial statements.

    PubMed

    Chen, Suduan; Goo, Yeong-Jia James; Shen, Zone-De

    2014-01-01

    As the fraudulent financial statement of an enterprise is increasingly serious with each passing day, establishing a valid forecasting fraudulent financial statement model of an enterprise has become an important question for academic research and financial practice. After screening the important variables using the stepwise regression, the study also matches the logistic regression, support vector machine, and decision tree to construct the classification models to make a comparison. The study adopts financial and nonfinancial variables to assist in establishment of the forecasting fraudulent financial statement model. Research objects are the companies to which the fraudulent and nonfraudulent financial statement happened between years 1998 to 2012. The findings are that financial and nonfinancial information are effectively used to distinguish the fraudulent financial statement, and decision tree C5.0 has the best classification effect 85.71%. PMID:25302338

  3. Methods for Estimating Population Density in Data-Limited Areas: Evaluating Regression and Tree-Based Models in Peru

    PubMed Central

    Anderson, Weston; Guikema, Seth; Zaitchik, Ben; Pan, William

    2014-01-01

    Obtaining accurate small area estimates of population is essential for policy and health planning but is often difficult in countries with limited data. In lieu of available population data, small area estimate models draw information from previous time periods or from similar areas. This study focuses on model-based methods for estimating population when no direct samples are available in the area of interest. To explore the efficacy of tree-based models for estimating population density, we compare six different model structures including Random Forest and Bayesian Additive Regression Trees. Results demonstrate that without information from prior time periods, non-parametric tree-based models produced more accurate predictions than did conventional regression methods. Improving estimates of population density in non-sampled areas is important for regions with incomplete census data and has implications for economic, health and development policies. PMID:24992657

  4. Methods for estimating population density in data-limited areas: evaluating regression and tree-based models in Peru.

    PubMed

    Anderson, Weston; Guikema, Seth; Zaitchik, Ben; Pan, William

    2014-01-01

    Obtaining accurate small area estimates of population is essential for policy and health planning but is often difficult in countries with limited data. In lieu of available population data, small area estimate models draw information from previous time periods or from similar areas. This study focuses on model-based methods for estimating population when no direct samples are available in the area of interest. To explore the efficacy of tree-based models for estimating population density, we compare six different model structures including Random Forest and Bayesian Additive Regression Trees. Results demonstrate that without information from prior time periods, non-parametric tree-based models produced more accurate predictions than did conventional regression methods. Improving estimates of population density in non-sampled areas is important for regions with incomplete census data and has implications for economic, health and development policies.

  5. Which Phylogenetic Networks are Merely Trees with Additional Arcs?

    PubMed Central

    Francis, Andrew R.; Steel, Mike

    2015-01-01

    A binary phylogenetic network may or may not be obtainable from a tree by the addition of directed edges (arcs) between tree arcs. Here, we establish a precise and easily tested criterion (based on “2-SAT”) that efficiently determines whether or not any given network can be realized in this way. Moreover, the proof provides a polynomial-time algorithm for finding one or more trees (when they exist) on which the network can be based. A number of interesting consequences are presented as corollaries; these lead to some further relevant questions and observations, which we outline in the conclusion. PMID:26070685

  6. Which Phylogenetic Networks are Merely Trees with Additional Arcs?

    PubMed

    Francis, Andrew R; Steel, Mike

    2015-09-01

    A binary phylogenetic network may or may not be obtainable from a tree by the addition of directed edges (arcs) between tree arcs. Here, we establish a precise and easily tested criterion (based on "2-SAT") that efficiently determines whether or not any given network can be realized in this way. Moreover, the proof provides a polynomial-time algorithm for finding one or more trees (when they exist) on which the network can be based. A number of interesting consequences are presented as corollaries; these lead to some further relevant questions and observations, which we outline in the conclusion.

  7. A stepwise regression tree for nonlinear approximation: applications to estimating subpixel land cover

    USGS Publications Warehouse

    Huang, C.; Townshend, J.R.G.

    2003-01-01

    A stepwise regression tree (SRT) algorithm was developed for approximating complex nonlinear relationships. Based on the regression tree of Breiman et al . (BRT) and a stepwise linear regression (SLR) method, this algorithm represents an improvement over SLR in that it can approximate nonlinear relationships and over BRT in that it gives more realistic predictions. The applicability of this method to estimating subpixel forest was demonstrated using three test data sets, on all of which it gave more accurate predictions than SLR and BRT. SRT also generated more compact trees and performed better than or at least as well as BRT at all 10 equal forest proportion interval ranging from 0 to 100%. This method is appealing to estimating subpixel land cover over large areas.

  8. Nitrogen Addition Enhances Drought Sensitivity of Young Deciduous Tree Species.

    PubMed

    Dziedek, Christoph; Härdtle, Werner; von Oheimb, Goddert; Fichtner, Andreas

    2016-01-01

    Understanding how trees respond to global change drivers is central to predict changes in forest structure and functions. Although there is evidence on the mode of nitrogen (N) and drought (D) effects on tree growth, our understanding of the interplay of these factors is still limited. Simultaneously, as mixtures are expected to be less sensitive to global change as compared to monocultures, we aimed to investigate the combined effects of N addition and D on the productivity of three tree species (Fagus sylvatica, Quercus petraea, Pseudotsuga menziesii) in relation to functional diverse species mixtures using data from a 4-year field experiment in Northwest Germany. Here we show that species mixing can mitigate the negative effects of combined N fertilization and D events, but the community response is mainly driven by the combination of certain traits rather than the tree species richness of a community. For beech, we found that negative effects of D on growth rates were amplified by N fertilization (i.e., combined treatment effects were non-additive), while for oak and fir, the simultaneous effects of N and D were additive. Beech and oak were identified as most sensitive to combined N+D effects with a strong size-dependency observed for beech, suggesting that the negative impact of N+D becomes stronger with time as beech grows larger. As a consequence, the net biodiversity effect declined at the community level, which can be mainly assigned to a distinct loss of complementarity in beech-oak mixtures. This pattern, however, was not evident in the other species-mixtures, indicating that neighborhood composition (i.e., trait combination), but not tree species richness mediated the relationship between tree diversity and treatment effects on tree growth. Our findings point to the importance of the qualitative role ('trait portfolio') that biodiversity play in determining resistance of diverse tree communities to environmental changes. As such, they provide further

  9. Reconstructing missing daily precipitation data using regression trees and artificial neural networks

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Incomplete meteorological data has been a problem in environmental modeling studies. The objective of this work was to develop a technique to reconstruct missing daily precipitation data in the central part of Chesapeake Bay Watershed using regression trees (RT) and artificial neural networks (ANN)....

  10. Reconstructing missing daily precipitation data using regression trees and artificial neural networks

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Missing meteorological data have to be estimated for agricultural and environmental modeling. The objective of this work was to develop a technique to reconstruct the missing daily precipitation data in the central part of the Chesapeake Bay Watershed using regression trees (RT) and artificial neura...

  11. Using the PDD Behavior Inventory as a Level 2 Screener: A Classification and Regression Trees Analysis

    ERIC Educational Resources Information Center

    Cohen, Ira L.; Liu, Xudong; Hudson, Melissa; Gillis, Jennifer; Cavalari, Rachel N. S.; Romanczyk, Raymond G.; Karmel, Bernard Z.; Gardner, Judith M.

    2016-01-01

    In order to improve discrimination accuracy between Autism Spectrum Disorder (ASD) and similar neurodevelopmental disorders, a data mining procedure, Classification and Regression Trees (CART), was used on a large multi-site sample of PDD Behavior Inventory (PDDBI) forms on children with and without ASD. Discrimination accuracy exceeded 80%,…

  12. What Satisfies Students?: Mining Student-Opinion Data with Regression and Decision Tree Analysis

    ERIC Educational Resources Information Center

    Thomas, Emily H.; Galambos, Nora

    2004-01-01

    To investigate how students' characteristics and experiences affect satisfaction, this study uses regression and decision tree analysis with the CHAID algorithm to analyze student-opinion data. A data mining approach identifies the specific aspects of students' university experience that most influence three measures of general satisfaction. The…

  13. Risk Factors of Falls in Community-Dwelling Older Adults: Logistic Regression Tree Analysis

    ERIC Educational Resources Information Center

    Yamashita, Takashi; Noe, Douglas A.; Bailer, A. John

    2012-01-01

    Purpose of the Study: A novel logistic regression tree-based method was applied to identify fall risk factors and possible interaction effects of those risk factors. Design and Methods: A nationally representative sample of American older adults aged 65 years and older (N = 9,592) in the Health and Retirement Study 2004 and 2006 modules was used.…

  14. Nitrogen Addition Enhances Drought Sensitivity of Young Deciduous Tree Species

    PubMed Central

    Dziedek, Christoph; Härdtle, Werner; von Oheimb, Goddert; Fichtner, Andreas

    2016-01-01

    Understanding how trees respond to global change drivers is central to predict changes in forest structure and functions. Although there is evidence on the mode of nitrogen (N) and drought (D) effects on tree growth, our understanding of the interplay of these factors is still limited. Simultaneously, as mixtures are expected to be less sensitive to global change as compared to monocultures, we aimed to investigate the combined effects of N addition and D on the productivity of three tree species (Fagus sylvatica, Quercus petraea, Pseudotsuga menziesii) in relation to functional diverse species mixtures using data from a 4-year field experiment in Northwest Germany. Here we show that species mixing can mitigate the negative effects of combined N fertilization and D events, but the community response is mainly driven by the combination of certain traits rather than the tree species richness of a community. For beech, we found that negative effects of D on growth rates were amplified by N fertilization (i.e., combined treatment effects were non-additive), while for oak and fir, the simultaneous effects of N and D were additive. Beech and oak were identified as most sensitive to combined N+D effects with a strong size-dependency observed for beech, suggesting that the negative impact of N+D becomes stronger with time as beech grows larger. As a consequence, the net biodiversity effect declined at the community level, which can be mainly assigned to a distinct loss of complementarity in beech-oak mixtures. This pattern, however, was not evident in the other species-mixtures, indicating that neighborhood composition (i.e., trait combination), but not tree species richness mediated the relationship between tree diversity and treatment effects on tree growth. Our findings point to the importance of the qualitative role (‘trait portfolio’) that biodiversity play in determining resistance of diverse tree communities to environmental changes. As such, they provide

  15. Nitrogen Addition Enhances Drought Sensitivity of Young Deciduous Tree Species.

    PubMed

    Dziedek, Christoph; Härdtle, Werner; von Oheimb, Goddert; Fichtner, Andreas

    2016-01-01

    Understanding how trees respond to global change drivers is central to predict changes in forest structure and functions. Although there is evidence on the mode of nitrogen (N) and drought (D) effects on tree growth, our understanding of the interplay of these factors is still limited. Simultaneously, as mixtures are expected to be less sensitive to global change as compared to monocultures, we aimed to investigate the combined effects of N addition and D on the productivity of three tree species (Fagus sylvatica, Quercus petraea, Pseudotsuga menziesii) in relation to functional diverse species mixtures using data from a 4-year field experiment in Northwest Germany. Here we show that species mixing can mitigate the negative effects of combined N fertilization and D events, but the community response is mainly driven by the combination of certain traits rather than the tree species richness of a community. For beech, we found that negative effects of D on growth rates were amplified by N fertilization (i.e., combined treatment effects were non-additive), while for oak and fir, the simultaneous effects of N and D were additive. Beech and oak were identified as most sensitive to combined N+D effects with a strong size-dependency observed for beech, suggesting that the negative impact of N+D becomes stronger with time as beech grows larger. As a consequence, the net biodiversity effect declined at the community level, which can be mainly assigned to a distinct loss of complementarity in beech-oak mixtures. This pattern, however, was not evident in the other species-mixtures, indicating that neighborhood composition (i.e., trait combination), but not tree species richness mediated the relationship between tree diversity and treatment effects on tree growth. Our findings point to the importance of the qualitative role ('trait portfolio') that biodiversity play in determining resistance of diverse tree communities to environmental changes. As such, they provide further

  16. GIS-based groundwater potential mapping using boosted regression tree, classification and regression tree, and random forest machine learning models in Iran.

    PubMed

    Naghibi, Seyed Amir; Pourghasemi, Hamid Reza; Dixon, Barnali

    2016-01-01

    Groundwater is considered one of the most valuable fresh water resources. The main objective of this study was to produce groundwater spring potential maps in the Koohrang Watershed, Chaharmahal-e-Bakhtiari Province, Iran, using three machine learning models: boosted regression tree (BRT), classification and regression tree (CART), and random forest (RF). Thirteen hydrological-geological-physiographical (HGP) factors that influence locations of springs were considered in this research. These factors include slope degree, slope aspect, altitude, topographic wetness index (TWI), slope length (LS), plan curvature, profile curvature, distance to rivers, distance to faults, lithology, land use, drainage density, and fault density. Subsequently, groundwater spring potential was modeled and mapped using CART, RF, and BRT algorithms. The predicted results from the three models were validated using the receiver operating characteristics curve (ROC). From 864 springs identified, 605 (≈70 %) locations were used for the spring potential mapping, while the remaining 259 (≈30 %) springs were used for the model validation. The area under the curve (AUC) for the BRT model was calculated as 0.8103 and for CART and RF the AUC were 0.7870 and 0.7119, respectively. Therefore, it was concluded that the BRT model produced the best prediction results while predicting locations of springs followed by CART and RF models, respectively. Geospatially integrated BRT, CART, and RF methods proved to be useful in generating the spring potential map (SPM) with reasonable accuracy. PMID:26687087

  17. GIS-based groundwater potential mapping using boosted regression tree, classification and regression tree, and random forest machine learning models in Iran.

    PubMed

    Naghibi, Seyed Amir; Pourghasemi, Hamid Reza; Dixon, Barnali

    2016-01-01

    Groundwater is considered one of the most valuable fresh water resources. The main objective of this study was to produce groundwater spring potential maps in the Koohrang Watershed, Chaharmahal-e-Bakhtiari Province, Iran, using three machine learning models: boosted regression tree (BRT), classification and regression tree (CART), and random forest (RF). Thirteen hydrological-geological-physiographical (HGP) factors that influence locations of springs were considered in this research. These factors include slope degree, slope aspect, altitude, topographic wetness index (TWI), slope length (LS), plan curvature, profile curvature, distance to rivers, distance to faults, lithology, land use, drainage density, and fault density. Subsequently, groundwater spring potential was modeled and mapped using CART, RF, and BRT algorithms. The predicted results from the three models were validated using the receiver operating characteristics curve (ROC). From 864 springs identified, 605 (≈70 %) locations were used for the spring potential mapping, while the remaining 259 (≈30 %) springs were used for the model validation. The area under the curve (AUC) for the BRT model was calculated as 0.8103 and for CART and RF the AUC were 0.7870 and 0.7119, respectively. Therefore, it was concluded that the BRT model produced the best prediction results while predicting locations of springs followed by CART and RF models, respectively. Geospatially integrated BRT, CART, and RF methods proved to be useful in generating the spring potential map (SPM) with reasonable accuracy.

  18. Hyperspectral Analysis of Soil Nitrogen, Carbon, Carbonate, and Organic Matter Using Regression Trees

    PubMed Central

    Gmur, Stephan; Vogt, Daniel; Zabowski, Darlene; Moskal, L. Monika

    2012-01-01

    The characterization of soil attributes using hyperspectral sensors has revealed patterns in soil spectra that are known to respond to mineral composition, organic matter, soil moisture and particle size distribution. Soil samples from different soil horizons of replicated soil series from sites located within Washington and Oregon were analyzed with the FieldSpec Spectroradiometer to measure their spectral signatures across the electromagnetic range of 400 to 1,000 nm. Similarity rankings of individual soil samples reveal differences between replicate series as well as samples within the same replicate series. Using classification and regression tree statistical methods, regression trees were fitted to each spectral response using concentrations of nitrogen, carbon, carbonate and organic matter as the response variables. Statistics resulting from fitted trees were: nitrogen R2 0.91 (p < 0.01) at 403, 470, 687, and 846 nm spectral band widths, carbonate R2 0.95 (p < 0.01) at 531 and 898 nm band widths, total carbon R2 0.93 (p < 0.01) at 400, 409, 441 and 907 nm band widths, and organic matter R2 0.98 (p < 0.01) at 300, 400, 441, 832 and 907 nm band widths. Use of the 400 to 1,000 nm electromagnetic range utilizing regression trees provided a powerful, rapid and inexpensive method for assessing nitrogen, carbon, carbonate and organic matter for upper soil horizons in a nondestructive method. PMID:23112620

  19. Regression trees for modeling geochemical data-An application to Late Jurassic carbonates (Ammonitico Rosso)

    NASA Astrophysics Data System (ADS)

    Coimbra, Rute; Rodriguez-Galiano, Victor; Olóriz, Federico; Chica-Olmo, Mario

    2014-12-01

    Research based on ancient carbonate geochemical records is often assisted by multivariate statistical analysis, among others, used for data mining. This contribution reports a complementary approach that can be applied to paleoenvironmental research. The choice to use a machine learning method, here regression trees (RT), relied in the ability to learn complex patterns, integrating multiple types of data with different statistical distributions to obtain a knowledge model of geochemical behavior along a paleo-platform. The Late Jurassic epioceanic deposits under scope are represented by six stratigraphic sections located in SE Spain and on the Majorca Island. The used database comprises a total of 1960 data points corresponding to eight variables (stable C and O isotopes, the elements Ca, Mg, Sr, Fe, Mn and skeletal content). This study uses RT models in which the predictive variables are the geochemical proxies, whilst skeletal content is used as a target variable. The resulting model is data driven, explaining variations in the target variable and providing additional information on the relative importance of each variable to each prediction, as well as its corresponding threshold values. The obtained RT revealed a structured distribution of samples, organized either by stratigraphic section or sets of nearby sections. Averaged estimated skeletal abundance confirmed the initial observations of higher skeletal content for the most distal sections with estimated values from 18% to 27%. In contrast, lower skeletal abundance from 5% to 15% is proposed for the remaining sections. The geochemical variable that best discriminates this major trend is δ18O, at a threshold value of -0.2‰, interpreted as evidence for separation of water-mass properties across the studied areas. Other four variables were considered relevant by the obtained decision tree: C isotopes, Ca, Sr and Mn, providing new insights for further differentiation between sets of samples.

  20. Prioritizing Highway Safety Manual's crash prediction variables using boosted regression trees.

    PubMed

    Saha, Dibakar; Alluri, Priyanka; Gan, Albert

    2015-06-01

    The Highway Safety Manual (HSM) recommends using the empirical Bayes (EB) method with locally derived calibration factors to predict an agency's safety performance. However, the data needs for deriving these local calibration factors are significant, requiring very detailed roadway characteristics information. Many of the data variables identified in the HSM are currently unavailable in the states' databases. Moreover, the process of collecting and maintaining all the HSM data variables is cost-prohibitive. Prioritization of the variables based on their impact on crash predictions would, therefore, help to identify influential variables for which data could be collected and maintained for continued updates. This study aims to determine the impact of each independent variable identified in the HSM on crash predictions. A relatively recent data mining approach called boosted regression trees (BRT) is used to investigate the association between the variables and crash predictions. The BRT method can effectively handle different types of predictor variables, identify very complex and non-linear association among variables, and compute variable importance. Five years of crash data from 2008 to 2012 on two urban and suburban facility types, two-lane undivided arterials and four-lane divided arterials, were analyzed for estimating the influence of variables on crash predictions. Variables were found to exhibit non-linear and sometimes complex relationship to predicted crash counts. In addition, only a few variables were found to explain most of the variation in the crash data.

  1. Feature Extraction of One-step Ahead Daily Maximum Load with Regression Tree

    NASA Astrophysics Data System (ADS)

    Mori, Hiroyuki; Sakatani, Yoshinori; Fujino, Tatsurou; Numa, Kazuyuki

    In this paper, a new efficient feature extraction method is proposed to handle the one-step ahead daily maximum load forecasting. In recent years, power systems become more complicated under the deregulated and competitive environment. As a result, it is not easy to understand the cause and effect of short-term load forecasting with a bunch of data. This paper analyzes load data from a standpoint of data mining. By it we mean a technique that finds out rules or knowledge through large database. As a data mining method for load forecasting, this paper focuses on the regression tree that handles continuous variables and expresses a knowledge rule as if-then rules. Investigating the variable importance of the regression tree gives information on the transition of the load forecasting models. This paper proposes a feature extraction method for examining the variable importance. The proposed method allows to classify the transition of the variable importance through actual data.

  2. [Hyperspectral Estimation of Apple Tree Canopy LAI Based on SVM and RF Regression].

    PubMed

    Han, Zhao-ying; Zhu, Xi-cun; Fang, Xian-yi; Wang, Zhuo-yuan; Wang, Ling; Zhao, Geng-Xing; Jiang, Yuan-mao

    2016-03-01

    Leaf area index (LAI) is the dynamic index of crop population size. Hyperspectral technology can be used to estimate apple canopy LAI rapidly and nondestructively. It can be provide a reference for monitoring the tree growing and yield estimation. The Red Fuji apple trees of full bearing fruit are the researching objects. Ninety apple trees canopies spectral reflectance and LAI values were measured by the ASD Fieldspec3 spectrometer and LAI-2200 in thirty orchards in constant two years in Qixia research area of Shandong Province. The optimal vegetation indices were selected by the method of correlation analysis of the original spectral reflectance and vegetation indices. The models of predicting the LAI were built with the multivariate regression analysis method of support vector machine (SVM) and random forest (RF). The new vegetation indices, GNDVI527, ND-VI676, RVI682, FD-NVI656 and GRVI517 and the previous two main vegetation indices, NDVI670 and NDVI705, are in accordance with LAI. In the RF regression model, the calibration set decision coefficient C-R2 of 0.920 and validation set decision coefficient V-R2 of 0.889 are higher than the SVM regression model by 0.045 and 0.033 respectively. The root mean square error of calibration set C-RMSE of 0.249, the root mean square error validation set V-RMSE of 0.236 are lower than that of the SVM regression model by 0.054 and 0.058 respectively. Relative analysis of calibrating error C-RPD and relative analysis of validation set V-RPD reached 3.363 and 2.520, 0.598 and 0.262, respectively, which were higher than the SVM regression model. The measured and predicted the scatterplot trend line slope of the calibration set and validation set C-S and V-S are close to 1. The estimation result of RF regression model is better than that of the SVM. RF regression model can be used to estimate the LAI of red Fuji apple trees in full fruit period. PMID:27400527

  3. [Hyperspectral Estimation of Apple Tree Canopy LAI Based on SVM and RF Regression].

    PubMed

    Han, Zhao-ying; Zhu, Xi-cun; Fang, Xian-yi; Wang, Zhuo-yuan; Wang, Ling; Zhao, Geng-Xing; Jiang, Yuan-mao

    2016-03-01

    Leaf area index (LAI) is the dynamic index of crop population size. Hyperspectral technology can be used to estimate apple canopy LAI rapidly and nondestructively. It can be provide a reference for monitoring the tree growing and yield estimation. The Red Fuji apple trees of full bearing fruit are the researching objects. Ninety apple trees canopies spectral reflectance and LAI values were measured by the ASD Fieldspec3 spectrometer and LAI-2200 in thirty orchards in constant two years in Qixia research area of Shandong Province. The optimal vegetation indices were selected by the method of correlation analysis of the original spectral reflectance and vegetation indices. The models of predicting the LAI were built with the multivariate regression analysis method of support vector machine (SVM) and random forest (RF). The new vegetation indices, GNDVI527, ND-VI676, RVI682, FD-NVI656 and GRVI517 and the previous two main vegetation indices, NDVI670 and NDVI705, are in accordance with LAI. In the RF regression model, the calibration set decision coefficient C-R2 of 0.920 and validation set decision coefficient V-R2 of 0.889 are higher than the SVM regression model by 0.045 and 0.033 respectively. The root mean square error of calibration set C-RMSE of 0.249, the root mean square error validation set V-RMSE of 0.236 are lower than that of the SVM regression model by 0.054 and 0.058 respectively. Relative analysis of calibrating error C-RPD and relative analysis of validation set V-RPD reached 3.363 and 2.520, 0.598 and 0.262, respectively, which were higher than the SVM regression model. The measured and predicted the scatterplot trend line slope of the calibration set and validation set C-S and V-S are close to 1. The estimation result of RF regression model is better than that of the SVM. RF regression model can be used to estimate the LAI of red Fuji apple trees in full fruit period.

  4. Stressor-response modeling using the 2D water quality model and regression trees to predict chlorophyll-a in a reservoir system

    NASA Astrophysics Data System (ADS)

    Park, Yongeun; Pachepsky, Yakov A.; Cho, Kyung Hwa; Jeon, Dong Jin; Kim, Joon Ha

    2015-10-01

    To control algal blooms, the stressor-response relationships between water quality metrics, environmental variables, and algal growth need to be better understood and modeled. Machine-learning methods have been suggested as means to express the stressor-response relationships that are found when applying mechanistic water quality models. The objective of this work was to evaluate the efficiency of regression trees in the development of a stressor-response model for chlorophyll-a (Chl-a) concentrations, using the results from site-specific mechanistic water quality modeling. The 2-dimensional hydrodynamic and water quality model (CE-QUAL-W2) model was applied to simulate water quality using four-year observational data and additional scenarios of air temperature increases for the Yeongsan Reservoir in South Korea. Regression tree modeling was applied to the results of these simulations. Given the well-expressed seasonality in the simulated Chl-a dynamics, separate regression trees were developed for months from May to September. The regression trees provided a reasonably accurate representation of the stressor-response dependence generated by the CE-QUAL-W2 model. Different stressors were then selected as split variables for different months, and, in most cases, splits by the same stressor variable yielded the same correlation sign between the variable and the Chl-a concentration. Compared to physical variables, nutrient content appeared to better predict Chl-a responses. The highest Chl-a temperature sensitivities were found for May and June. Regression tree splits based on ammonium concentration resulted in a consistent trend of greater sensitivity in the groups of samples with higher ammonium concentrations. Regression tree models provided a transparent visual representation of the stressor-response relationships for Chl-a and its sensitivity. Overall, the representation of relationships using classification and regression tools can be considered a useful

  5. Industrial and occupational ergonomics in the petrochemical process industry: a regression trees approach.

    PubMed

    Bevilacqua, M; Ciarapica, F E; Giacchetta, G

    2008-07-01

    This work is an attempt to apply classification tree methods to data regarding accidents in a medium-sized refinery, so as to identify the important relationships between the variables, which can be considered as decision-making rules when adopting any measures for improvement. The results obtained using the CART (Classification And Regression Trees) method proved to be the most precise and, in general, they are encouraging concerning the use of tree diagrams as preliminary explorative techniques for the assessment of the ergonomic, management and operational parameters which influence high accident risk situations. The Occupational Injury analysis carried out in this paper was planned as a dynamic process and can be repeated systematically. The CART technique, which considers a very wide set of objective and predictive variables, shows new cause-effect correlations in occupational safety which had never been previously described, highlighting possible injury risk groups and supporting decision-making in these areas. The use of classification trees must not, however, be seen as an attempt to supplant other techniques, but as a complementary method which can be integrated into traditional types of analysis.

  6. Delaware River Streamflow Reconstruction using Tree Rings: Exploration of Hierarchical Bayesian Regression

    NASA Astrophysics Data System (ADS)

    Devineni, N.; Lall, U.; Cook, E.; Pederson, N.

    2011-12-01

    We present the application of a linear model in a Hierarchical Bayesian Regression (HBR) framework for reconstructing the summer seasonal averaged streamflow at five stations in the Delaware River Basin using eight newly developed regional tree ring chronologies. This technique directly provides estimates of the posterior probability distribution of each reconstructed streamflow value, considering model parameter uncertainty. The methodology also allows us to shrink the model parameters towards a common mean to incorporate the predictive ability of each tree chronology on multiple stations. We present the results from HBR analysis along with the results from traditional Point by Point Regression (PPR) analysis to demonstrate the benefits of developing the reconstructions under a Bayesian modeling framework. Further, we also present the comparative results of the model validation using various performance evaluation metrics such as reduction in error (RE) and coefficient of efficiency (CE). The reconstructed streamflow at various stations can be utilized to examine the frequency and recurrence attributes of extreme droughts in the region and their potential connections to known low frequency climate modes.

  7. Hatching timing, oxygen availability, and external gill regression in the tree frog, Agalychnis callidryas.

    PubMed

    Warkentin, Karen M

    2002-01-01

    The physiological role of the embryonic external gills in anurans is equivocal. In some species, diffusion alone is clearly sufficient to supply oxygen throughout the embryonic period. In others, morphological elaboration and environmental regulation of the external gills suggest functional importance. Since oxygen stress is a common trigger of hatching, I examined the relationships among hatching timing, oxygen stress, and external gill loss. I worked with the red-eyed tree frog, Agalychnis callidryas, a species with arboreal eggs and aquatic tadpoles in which gill regression is associated with hatching, and hatching timing affects posthatching survival with aquatic predators. Both exposure to a hypoxic gas mixture and submergence in water, a natural context in which hypoxic stress can occur, induced early hatching. Exposure to hyperoxic gas mixtures induced regression of external gills, and subsequent exposure to air induced early hatching. Prostaglandin-induced external gill regression also induced hatching, and this effect was partially ameliorated by exposure to hyperoxic gas. Together, these results suggest that external gills enhance the oxygen uptake of embryos and are necessary to extend embryonic development past the onset of hatching competence. PMID:12024291

  8. Regression tree modeling of forest NPP using site conditions and climate variables across eastern USA

    NASA Astrophysics Data System (ADS)

    Kwon, Y.

    2013-12-01

    As evidence of global warming continue to increase, being able to predict forest response to climate changes, such as expected rise of temperature and precipitation, will be vital for maintaining the sustainability and productivity of forests. To map forest species redistribution by climate change scenario has been successful, however, most species redistribution maps lack mechanistic understanding to explain why trees grow under the novel conditions of chaining climate. Distributional map is only capable of predicting under the equilibrium assumption that the communities would exist following a prolonged period under the new climate. In this context, forest NPP as a surrogate for growth rate, the most important facet that determines stand dynamics, can lead to valid prediction on the transition stage to new vegetation-climate equilibrium as it represents changes in structure of forest reflecting site conditions and climate factors. The objective of this study is to develop forest growth map using regression tree analysis by extracting large-scale non-linear structures from both field-based FIA and remotely sensed MODIS data set. The major issue addressed in this approach is non-linear spatial patterns of forest attributes. Forest inventory data showed complex spatial patterns that reflect environmental states and processes that originate at different spatial scales. At broad scales, non-linear spatial trends in forest attributes and mixture of continuous and discrete types of environmental variables make traditional statistical (multivariate regression) and geostatistical (kriging) models inefficient. It calls into question some traditional underlying assumptions of spatial trends that uncritically accepted in forest data. To solve the controversy surrounding the suitability of forest data, regression tree analysis are performed using Software See5 and Cubist. Four publicly available data sets were obtained: First, field-based Forest Inventory and Analysis (USDA

  9. Selection of orthogonal reversed-phase HPLC systems by univariate and auto-associative multivariate regression trees.

    PubMed

    Put, R; Van Gyseghem, E; Coomans, D; Vander Heyden, Y

    2005-11-25

    In order to select chromatographic starting conditions to be optimized during further method development of the separation of a given mixture, so-called generic orthogonal chromatographic systems could be explored in parallel. In this paper the use of univariate and multivariate regression trees (MRT) was studied to define the most orthogonal subset from a given set of chromatographic systems. Two data sets were considered, which contain the retention data of 68 structurally diversive drugs on sets of 32 and 38 chromatographic systems, respectively. For both the univariate and multivariate approaches no other data but the measured retention factors are needed to build the decision trees. Since multivariate regression trees are used in an unsupervised way, they are called auto-associative multivariate regression trees (AAMRT). For all decision trees used, a variable importance list of the predictor variables can be derived. It was concluded that based on these ranked lists, both for univariate and multivariate regression trees, a selection of the most orthogonal systems from a given set of systems can be obtained in a user-friendly and fast way.

  10. An efficiency data envelopment analysis model reinforced by classification and regression tree for hospital performance evaluation.

    PubMed

    Chuang, Chun-Ling; Chang, Peng-Chan; Lin, Rong-Ho

    2011-10-01

    As changes in the medical environment and policies on national health insurance coverage have triggered tremendous impacts on the business performance and financial management of medical institutions, effective management becomes increasingly crucial for hospitals to enhance competitiveness and to strive for sustainable development. The study accordingly aims at evaluating hospital operational efficiency for better resource allocation and cost effectiveness. Several data envelopment analysis (DEA)-based models were first compared, and the DEA-artificial neural network (ANN) model was identified as more capable than the DEA and DEA-assurance region (AR) models of measuring operational efficiency and recognizing the best-performing hospital. The classification and regression tree (CART) efficiency model was then utilized to extract rules for improving resource allocation of medical institutions. PMID:20878210

  11. Using the PDD Behavior Inventory as a Level 2 Screener: A Classification and Regression Trees Analysis.

    PubMed

    Cohen, Ira L; Liu, Xudong; Hudson, Melissa; Gillis, Jennifer; Cavalari, Rachel N S; Romanczyk, Raymond G; Karmel, Bernard Z; Gardner, Judith M

    2016-09-01

    In order to improve discrimination accuracy between Autism Spectrum Disorder (ASD) and similar neurodevelopmental disorders, a data mining procedure, Classification and Regression Trees (CART), was used on a large multi-site sample of PDD Behavior Inventory (PDDBI) forms on children with and without ASD. Discrimination accuracy exceeded 80 %, generalized to an independent validation set, and generalized across age groups and sites, and agreed well with ADOS classifications. Parent PDDBIs yielded better results than teacher PDDBIs but, when CART predictions agreed across informants, sensitivity increased. Results also revealed three subtypes of ASD: minimally verbal, verbal, and atypical; and two, relatively common subtypes of non-ASD children: social pragmatic problems and good social skills. These subgroups corresponded to differences in behavior profiles and associated bio-medical findings. PMID:27318809

  12. Prediction of Wind Speeds Based on Digital Elevation Models Using Boosted Regression Trees

    NASA Astrophysics Data System (ADS)

    Fischer, P.; Etienne, C.; Tian, J.; Krauß, T.

    2015-12-01

    In this paper a new approach is presented to predict maximum wind speeds using Gradient Boosted Regression Trees (GBRT). GBRT are a non-parametric regression technique used in various applications, suitable to make predictions without having an in-depth a-priori knowledge about the functional dependancies between the predictors and the response variables. Our aim is to predict maximum wind speeds based on predictors, which are derived from a digital elevation model (DEM). The predictors describe the orography of the Area-of-Interest (AoI) by various means like first and second order derivatives of the DEM, but also higher sophisticated classifications describing exposure and shelterness of the terrain to wind flux. In order to take the different scales into account which probably influence the streams and turbulences of wind flow over complex terrain, the predictors are computed on different spatial resolutions ranging from 30 m up to 2000 m. The geographic area used for examination of the approach is Switzerland, a mountainious region in the heart of europe, dominated by the alps, but also covering large valleys. The full workflow is described in this paper, which consists of data preparation using image processing techniques, model training using a state-of-the-art machine learning algorithm, in-depth analysis of the trained model, validation of the model and application of the model to generate a wind speed map.

  13. Complementing boosted regression trees models of SOC stocks distributions with geostatistical approaches

    NASA Astrophysics Data System (ADS)

    martin, manuel; Lacarce, Eva; Meersmans, Jeroen; Orton, Thomas; Saby, Nicolas; Paroissien, Jean-Baptiste; Jolivet, Claudy; Boulonne, Line; Arrouays, Dominique

    2013-04-01

    Soil organic carbon (SOC) plays a major role in the global carbon budget. It can act as a source or a sink of atmospheric carbon, thereby possibly influencing the course of climate change. Improving the tools that model the spatial distributions of SOC stocks at national scales is a priority, both for monitoring changes in SOC and as an input for global carbon cycles studies. In this paper, first, we considered several increasingly complex boosted regression trees (BRT), a convenient and efficient multiple regression model from the statistical learning field. Further, we considered and a robust geostatistical approach coupled to the BRT models. Testing the different approaches was performed on the dataset from the French Soil Monitoring Network, with a consistent cross-validation procedure. We showed that the BRT models, given its ease of use and its predictive performance, could be preferred to geostatistical models for SOC mapping at the national scale, and if possible be joined with geostatistical models. This conclusion is valid provided that care is exercised in model fitting and validating, that the dataset does not allow for modeling local spatial autocorrelations, as it is the case for many national systematic sampling schemes, and when good quality data about SOC drivers included in the models is available.

  14. Nitrogen and phosphorus additions negatively affect tree species diversity in tropical forest regrowth trajectories.

    PubMed

    Siddique, Ilyas; Vieira, Ima Célia Guimarães; Schmidt, Susanne; Lamb, David; Carvalho, Cláudio José Reis; Figueiredo, Ricardo de Oliveira; Blomberg, Simon; Davidson, Eric A

    2010-07-01

    Nutrient enrichment is increasingly affecting many tropical ecosystems, but there is no information on how this affects tree biodiversity. To examine dynamics in vegetation structure and tree species biomass and diversity, we annually remeasured tree species before and for six years after repeated additions of nitrogen (N) and phosphorus (P) in permanent plots of abandoned pasture in Amazonia. Nitrogen and, to a lesser extent, phosphorus addition shifted growth among woody species. Nitrogen stimulated growth of two common pioneer tree species and one common tree species adaptable to both high- and low-light environments, while P stimulated growth only of the dominant pioneer tree Rollinia exsucca (Annonaceae). Overall, N or P addition reduced tree assemblage evenness and delayed tree species accrual over time, likely due to competitive monopolization of other resources by the few tree species responding to nutrient enrichment with enhanced establishment and/or growth rates. Absolute tree growth rates were elevated for two years after nutrient addition. However, nutrient-induced shifts in relative tree species growth and reduced assemblage evenness persisted for more than three years after nutrient addition, favoring two nutrient-responsive pioneers and one early-secondary tree species. Surprisingly, N + P effects on tree biomass and species diversity were consistently weaker than N-only and P-only effects, because grass biomass increased dramatically in response to N + P addition. The resulting intensified competition probably prevented an expected positive N + P synergy in the tree assemblage. Thus, N or P enrichment may favor unknown tree functional response types, reduce the diversity of coexisting species, and delay species accrual during structurally and functionally complex tropical rainforest secondary succession. PMID:20715634

  15. Predictive model of biliocystic communication in liver hydatid cysts using classification and regression tree analysis

    PubMed Central

    2010-01-01

    Background Incidence of liver hydatid cyst (LHC) rupture ranged 15%-40% of all cases and most of them concern the bile duct tree. Patients with biliocystic communication (BCC) had specific clinic and therapeutic aspect. The purpose of this study was to determine witch patients with LHC may develop BCC using classification and regression tree (CART) analysis Methods A retrospective study of 672 patients with liver hydatid cyst treated at the surgery department "A" at Ibn Sina University Hospital, Rabat Morocco. Four-teen risk factors for BCC occurrence were entered into CART analysis to build an algorithm that can predict at the best way the occurrence of BCC. Results Incidence of BCC was 24.5%. Subgroups with high risk were patients with jaundice and thick pericyst risk at 73.2% and patients with thick pericyst, with no jaundice 36.5 years and younger with no past history of LHC risk at 40.5%. Our developed CART model has sensitivity at 39.6%, specificity at 93.3%, positive predictive value at 65.6%, a negative predictive value at 82.6% and accuracy of good classification at 80.1%. Discriminating ability of the model was good 82%. Conclusion we developed a simple classification tool to identify LHC patients with high risk BCC during a routine clinic visit (only on clinical history and examination followed by an ultrasonography). Predictive factors were based on pericyst aspect, jaundice, age, past history of liver hydatidosis and morphological Gharbi cyst aspect. We think that this classification can be useful with efficacy to direct patients at appropriated medical struct's. PMID:20398342

  16. Weighing risk factors associated with bee colony collapse disorder by classification and regression tree analysis.

    PubMed

    VanEngelsdorp, Dennis; Speybroeck, Niko; Evans, Jay D; Nguyen, Bach Kim; Mullin, Chris; Frazier, Maryann; Frazier, Jim; Cox-Foster, Diana; Chen, Yanping; Tarpy, David R; Haubruge, Eric; Pettis, Jeffrey S; Saegerman, Claude

    2010-10-01

    Colony collapse disorder (CCD), a syndrome whose defining trait is the rapid loss of adult worker honey bees, Apis mellifera L., is thought to be responsible for a minority of the large overwintering losses experienced by U.S. beekeepers since the winter 2006-2007. Using the same data set developed to perform a monofactorial analysis (PloS ONE 4: e6481, 2009), we conducted a classification and regression tree (CART) analysis in an attempt to better understand the relative importance and interrelations among different risk variables in explaining CCD. Fifty-five exploratory variables were used to construct two CART models: one model with and one model without a cost of misclassifying a CCD-diagnosed colony as a non-CCD colony. The resulting model tree that permitted for misclassification had a sensitivity and specificity of 85 and 74%, respectively. Although factors measuring colony stress (e.g., adult bee physiological measures, such as fluctuating asymmetry or mass of head) were important discriminating values, six of the 19 variables having the greatest discriminatory value were pesticide levels in different hive matrices. Notably, coumaphos levels in brood (a miticide commonly used by beekeepers) had the highest discriminatory value and were highest in control (healthy) colonies. Our CART analysis provides evidence that CCD is probably the result of several factors acting in concert, making afflicted colonies more susceptible to disease. This analysis highlights several areas that warrant further attention, including the effect of sublethal pesticide exposure on pathogen prevalence and the role of variability in bee tolerance to pesticides on colony survivorship.

  17. Using boosted regression trees to predict the near-saturated hydraulic conductivity of undisturbed soils

    NASA Astrophysics Data System (ADS)

    Koestel, John; Bechtold, Michel; Jorda, Helena; Jarvis, Nicholas

    2015-04-01

    The saturated and near-saturated hydraulic conductivity of soil is of key importance for modelling water and solute fluxes in the vadose zone. Hydraulic conductivity measurements are cumbersome at the Darcy scale and practically impossible at larger scales where water and solute transport models are mostly applied. Hydraulic conductivity must therefore be estimated from proxy variables. Such pedotransfer functions are known to work decently well for e.g. water retention curves but rather poorly for near-saturated and saturated hydraulic conductivities. Recently, Weynants et al. (2009, Revisiting Vereecken pedotransfer functions: Introducing a closed-form hydraulic model. Vadose Zone Journal, 8, 86-95) reported a coefficients of determination of 0.25 (validation with an independent data set) for the saturated hydraulic conductivity from lab-measurements of Belgian soil samples. In our study, we trained boosted regression trees on a global meta-database containing tension-disk infiltrometer data (see Jarvis et al. 2013. Influence of soil, land use and climatic factors on the hydraulic conductivity of soil. Hydrology & Earth System Sciences, 17, 5185-5195) to predict the saturated hydraulic conductivity (Ks) and the conductivity at a tension of 10 cm (K10). We found coefficients of determination of 0.39 and 0.62 under a simple 10-fold cross-validation for Ks and K10. When carrying out the validation folded over the data-sources, i.e. the source publications, we found that the corresponding coefficients of determination reduced to 0.15 and 0.36, respectively. We conclude that the stricter source-wise cross-validation should be applied in future pedotransfer studies to prevent overly optimistic validation results. The boosted regression trees also allowed for an investigation of relevant predictors for estimating the near-saturated hydraulic conductivity. We found that land use and bulk density were most important to predict Ks. We also observed that Ks is large in fine

  18. What Satisfies Students? Mining Student-Opinion Data with Regression and Decision-Tree Analysis. AIR 2002 Forum Paper.

    ERIC Educational Resources Information Center

    Thomas, Emily H.; Galambos, Nora

    To investigate how students' characteristics and experiences affect satisfaction, this study used regression and decision-tree analysis with the CHAID algorithm to analyze student opinion data from a sample of 1,783 college students. A data-mining approach identifies the specific aspects of students' university experience that most influence three…

  19. Identification of Sexually Abused Female Adolescents at Risk for Suicidal Ideations: A Classification and Regression Tree Analysis

    ERIC Educational Resources Information Center

    Brabant, Marie-Eve; Hebert, Martine; Chagnon, Francois

    2013-01-01

    This study explored the clinical profiles of 77 female teenager survivors of sexual abuse and examined the association of abuse-related and personal variables with suicidal ideations. Analyses revealed that 64% of participants experienced suicidal ideations. Findings from classification and regression tree analysis indicated that depression,…

  20. Matching in Vitro Bioaccessibility of Polyphenols and Antioxidant Capacity of Soluble Coffee by Boosted Regression Trees.

    PubMed

    Podio, Natalia S; López-Froilán, Rebeca; Ramirez-Moreno, Esther; Bertrand, Lidwina; Baroni, María V; Pérez-Rodríguez, María L; Sánchez-Mata, María-Cortes; Wunderlin, Daniel A

    2015-11-01

    The aim of this study was to evaluate changes in polyphenol profile and antioxidant capacity of five soluble coffees throughout a simulated gastro-intestinal digestion, including absorption through a dialysis membrane. Our results demonstrate that both polyphenol content and antioxidant capacity were characteristic for each type of studied coffee, showing a drop after dialysis. Twenty-seven compounds were identified in coffee by HPLC-MS, while only 14 of them were found after dialysis. Green+roasted coffee blend and chicory+coffee blend showed the highest and lowest content of polyphenols and antioxidant capacity before in vitro digestion and after dialysis, respectively. Canonical correlation analysis showed significant correlation between the antioxidant capacity and the polyphenol profile before digestion and after dialysis. Furthermore, boosted regression trees analysis (BRT) showed that only four polyphenol compounds (5-p-coumaroylquinic acid, quinic acid, coumaroyl tryptophan conjugated, and 5-O-caffeoylquinic acid) appear to be the most relevant to explain the antioxidant capacity after dialysis, these compounds being the most bioaccessible after dialysis. To our knowledge, this is the first report matching the antioxidant capacity of foods with the polyphenol profile by BRT, which opens an interesting method of analysis for future reports on the antioxidant capacity of foods.

  1. Matching in Vitro Bioaccessibility of Polyphenols and Antioxidant Capacity of Soluble Coffee by Boosted Regression Trees.

    PubMed

    Podio, Natalia S; López-Froilán, Rebeca; Ramirez-Moreno, Esther; Bertrand, Lidwina; Baroni, María V; Pérez-Rodríguez, María L; Sánchez-Mata, María-Cortes; Wunderlin, Daniel A

    2015-11-01

    The aim of this study was to evaluate changes in polyphenol profile and antioxidant capacity of five soluble coffees throughout a simulated gastro-intestinal digestion, including absorption through a dialysis membrane. Our results demonstrate that both polyphenol content and antioxidant capacity were characteristic for each type of studied coffee, showing a drop after dialysis. Twenty-seven compounds were identified in coffee by HPLC-MS, while only 14 of them were found after dialysis. Green+roasted coffee blend and chicory+coffee blend showed the highest and lowest content of polyphenols and antioxidant capacity before in vitro digestion and after dialysis, respectively. Canonical correlation analysis showed significant correlation between the antioxidant capacity and the polyphenol profile before digestion and after dialysis. Furthermore, boosted regression trees analysis (BRT) showed that only four polyphenol compounds (5-p-coumaroylquinic acid, quinic acid, coumaroyl tryptophan conjugated, and 5-O-caffeoylquinic acid) appear to be the most relevant to explain the antioxidant capacity after dialysis, these compounds being the most bioaccessible after dialysis. To our knowledge, this is the first report matching the antioxidant capacity of foods with the polyphenol profile by BRT, which opens an interesting method of analysis for future reports on the antioxidant capacity of foods. PMID:26457815

  2. Prediction of cadmium enrichment in reclaimed coastal soils by classification and regression tree

    NASA Astrophysics Data System (ADS)

    Ru, Feng; Yin, Aijing; Jin, Jiaxin; Zhang, Xiuying; Yang, Xiaohui; Zhang, Ming; Gao, Chao

    2016-08-01

    Reclamation of coastal land is one of the most common ways to obtain land resources in China. However, it has long been acknowledged that the artificial interference with coastal land has disadvantageous effects, such as heavy metal contamination. This study aimed to develop a prediction model for cadmium enrichment levels and assess the importance of affecting factors in typical reclaimed land in Eastern China (DFCL: Dafeng Coastal Land). Two hundred and twenty seven surficial soil/sediment samples were collected and analyzed to identify the enrichment levels of cadmium and the possible affecting factors in soils and sediments. The classification and regression tree (CART) model was applied in this study to predict cadmium enrichment levels. The prediction results showed that cadmium enrichment levels assessed by the CART model had an accuracy of 78.0%. The CART model could extract more information on factors affecting the environmental behavior of cadmium than correlation analysis. The integration of correlation analysis and the CART model showed that fertilizer application and organic carbon accumulation were the most important factors affecting soil/sediment cadmium enrichment levels, followed by particle size effects (Al2O3, TFe2O3 and SiO2), contents of Cl and S, surrounding construction areas and reclamation history.

  3. Improving Automatic English Writing Assessment Using Regression Trees and Error-Weighting

    NASA Astrophysics Data System (ADS)

    Lee, Kong-Joo; Kim, Jee-Eun

    The proposed automated scoring system for English writing tests provides an assessment result including a score and diagnostic feedback to test-takers without human's efforts. The system analyzes an input sentence and detects errors related to spelling, syntax and content similarity. The scoring model has adopted one of the statistical approaches, a regression tree. A scoring model in general calculates a score based on the count and the types of automatically detected errors. Accordingly, a system with higher accuracy in detecting errors raises the accuracy in scoring a test. The accuracy of the system, however, cannot be fully guaranteed for several reasons, such as parsing failure, incompleteness of knowledge bases, and ambiguous nature of natural language. In this paper, we introduce an error-weighting technique, which is similar to term-weighting widely used in information retrieval. The error-weighting technique is applied to judge reliability of the errors detected by the system. The score calculated with the technique is proven to be more accurate than the score without it.

  4. Multicenter study on caries risk assessment in adults using survival Classification and Regression Trees.

    PubMed

    Arino, Masumi; Ito, Ataru; Fujiki, Shozo; Sugiyama, Seiichi; Hayashi, Mikako

    2016-01-01

    Dental caries is an important public health problem worldwide. This study aims to prove how preventive therapies reduce the onset of caries in adult patients, and to identify patients with high or low risk of caries by using Classification and Regression Trees based survival analysis (survival CART). A clinical data set of 732 patients aged 20 to 64 years in nine Japanese general practices was analyzed with the following parameters: age, DMFT, number of mutans streptococci (SM) and Lactobacilli (LB), secretion rate and buffer capacity of saliva, and compliance with a preventive program. Results showed the incidence of primary carious lesion was affected by SM, LB and compliance with a preventive program; secondary carious lesion was affected by DMFT, SM and LB. Survival CART identified high-risk patients for primary carious lesion according to their poor compliance with a preventive program and SM (≥10(6) CFU/ml) with a hazard ratio of 3.66 (p = 0.0002). In the case of secondary caries, patients with LB (≥10(5) CFU/ml) and DMFT (>15) were identified as high risk with a hazard ratio of 3.50 (p < 0.0001). We conclude that preventive programs can be effective in limiting the incidence of primary carious lesion. PMID:27381750

  5. Multicenter study on caries risk assessment in adults using survival Classification and Regression Trees

    PubMed Central

    Arino, Masumi; Ito, Ataru; Fujiki, Shozo; Sugiyama, Seiichi; Hayashi, Mikako

    2016-01-01

    Dental caries is an important public health problem worldwide. This study aims to prove how preventive therapies reduce the onset of caries in adult patients, and to identify patients with high or low risk of caries by using Classification and Regression Trees based survival analysis (survival CART). A clinical data set of 732 patients aged 20 to 64 years in nine Japanese general practices was analyzed with the following parameters: age, DMFT, number of mutans streptococci (SM) and Lactobacilli (LB), secretion rate and buffer capacity of saliva, and compliance with a preventive program. Results showed the incidence of primary carious lesion was affected by SM, LB and compliance with a preventive program; secondary carious lesion was affected by DMFT, SM and LB. Survival CART identified high-risk patients for primary carious lesion according to their poor compliance with a preventive program and SM (≥106 CFU/ml) with a hazard ratio of 3.66 (p = 0.0002). In the case of secondary caries, patients with LB (≥105 CFU/ml) and DMFT (>15) were identified as high risk with a hazard ratio of 3.50 (p < 0.0001). We conclude that preventive programs can be effective in limiting the incidence of primary carious lesion. PMID:27381750

  6. Exploring the link between drought indicators and impacts through data visualization and regression trees

    NASA Astrophysics Data System (ADS)

    Bachmair, Sophie; Stahl, Kerstin; Blauhut, Veit; Kohn, Irene

    2014-05-01

    impact occurrence. The applied data visualization and regression tree approach proved to be a valuable methodology for exploring the link between indicators and impacts. Nevertheless, the results are influenced by the uncertainty of identifying and quantifying drought impacts and vulnerability factors at a suitable spatial and temporal scale. This calls for more research on methodological issues of drought impact and vulnerability assessment, as well as for further developing impact inventories and exploiting the link between drought indicators and impacts.

  7. Availability and Capacity of Substance Abuse Programs in Correctional Settings: A Classification and Regression Tree Analysis

    PubMed Central

    Kitsantas, Panagiota

    2009-01-01

    Objective to be addressed The purpose of this study was to investigate the structural and organizational factors that contribute to the availability and increased capacity for substance abuse treatment programs in correctional settings. We used Classification and Regression Tree statistical procedures to identify how multi-level data can explain the variability in availability and capacity of substance abuse treatment programs in jails and probation/parole offices. Methods The data for this study combined the National Criminal Justice Treatment Practices survey (NCJTP) and the 2000 Census. The NCJTP survey was a nationally representative sample of correctional administrators for jails and probation/parole agencies. The sample size included 295 substance abuse treatment programs that were classified according to the intensity of their services: high, medium, and low. The independent variables included jurisdictional-level structural variables, attributes of the correctional administrators, and program and service delivery characteristics of the correctional agency. Results The two most important variables in predicting the availability of all three types of services were stronger working relationships with other organizations and the adoption of a standardized substance abuse screening tool by correctional agencies. For high and medium intensive programs, the capacity increased when an organizational learning strategy was used by administrators and the organization used a substance abuse screening tool. Implications on advancing treatment practices in correctional settings are discussed, including further work to test theories on how to better understand access to intensive treatment services. This study presents the first phase of understanding capacity-related issues regarding treatment programs offered in correctional settings. PMID:19395204

  8. Regression Tree-Based Methodology for Customizing Building Energy Benchmarks to Individual Commercial Buildings

    NASA Astrophysics Data System (ADS)

    Kaskhedikar, Apoorva Prakash

    According to the U.S. Energy Information Administration, commercial buildings represent about 40% of the United State's energy consumption of which office buildings consume a major portion. Gauging the extent to which an individual building consumes energy in excess of its peers is the first step in initiating energy efficiency improvement. Energy Benchmarking offers initial building energy performance assessment without rigorous evaluation. Energy benchmarking tools based on the Commercial Buildings Energy Consumption Survey (CBECS) database are investigated in this thesis. This study proposes a new benchmarking methodology based on decision trees, where a relationship between the energy use intensities (EUI) and building parameters (continuous and categorical) is developed for different building types. This methodology was applied to medium office and school building types contained in the CBECS database. The Random Forest technique was used to find the most influential parameters that impact building energy use intensities. Subsequently, correlations which were significant were identified between EUIs and CBECS variables. Other than floor area, some of the important variables were number of workers, location, number of PCs and main cooling equipment. The coefficient of variation was used to evaluate the effectiveness of the new model. The customization technique proposed in this thesis was compared with another benchmarking model that is widely used by building owners and designers namely, the ENERGY STAR's Portfolio Manager. This tool relies on the standard Linear Regression methods which is only able to handle continuous variables. The model proposed uses data mining technique and was found to perform slightly better than the Portfolio Manager. The broader impacts of the new benchmarking methodology proposed is that it allows for identifying important categorical variables, and then incorporating them in a local, as against a global, model framework for EUI

  9. Simulating California Reservoir Operation Using the Classification and Regression Tree Algorithm Combined with a Shuffled Cross-Validation Scheme

    NASA Astrophysics Data System (ADS)

    Yang, T.; Gao, X.; Sorooshian, S.; Li, X.

    2015-12-01

    The controlled outflows from a reservoir or dam are highly dependent on the decisions made by the reservoir operators, instead of a natural hydrological process. Difference exists between the natural upstream inflows to reservoirs, and the controlled outflows from reservoirs that supply the downstream users. With the decision maker's awareness of changing climate, reservoir management requires adaptable means to incorporate more information into decision making, such as the consideration of policy and regulation, environmental constraints, dry/wet conditions, etc. In this paper, a reservoir outflow simulation model is presented, which incorporates one of the well-developed data-mining models (Classification and Regression Tree) to predict the complicated human-controlled reservoir outflows and extract the reservoir operation patterns. A shuffled cross-validation approach is further implemented to improve model's predictive performance. An application study of 9 major reservoirs in California is carried out and the simulated results from different decision tree approaches are compared with observation, including original CART and Random Forest. The statistical measurements show that CART combined with the shuffled cross-validation scheme gives a better predictive performance over the other two methods, especially in simulating the peak flows. The results for simulated controlled outflow, storage changes and storage trajectories also show that the proposed model is able to consistently and reasonably predict the human's reservoir operation decisions. In addition, we found that the operation in the Trinity Lake, Oroville Lake and Shasta Lake are greatly influenced by policy and regulation, while low elevation reservoirs are more sensitive to inflow amount than others.

  10. Bacillary dysentery and meteorological factors in northeastern China: a historical review based on classification and regression trees.

    PubMed

    Guan, Peng; Huang, Desheng; Guo, Junqiao; Wang, Ping; Zhou, Baosen

    2008-09-01

    The relationship between the incidence of bacillary dysentery and meteorological factors was investigated. Data on bacillary dysentery incidence in Shenyang from 1990 to 1996 were obtained from Liaoning Provincial Center for Disease Control and Prevention, and meteorological data such as atmospheric pressure, air temperature, precipitation, evaporation, wind speed, and the amount of solar radiation were obtained from Shenyang Meteorological Bureau. Kendall and Spearman correlations were used to analyze the relationship between bacillary dysentery and meteorological factors. The incidence of bacillary dysentery was treated as a response variable, and meteorological factors were treated as predictable variables. Software R 2.3.1 was used to execute the classification and regression trees (CART). The model improved the accuracy of the fitting results. The residual sum square error of the regression tree model was 53.9, while the residual sum square error of the multivariate linear regression model was 107.2. Among all the meteorological indexes, relative humidity, minimum temperature, and pressure one month prior were statistically influential factors in the multivariate regression tree model. CART may be a useful tool for dealing with heterogeneous data, as it can serve as a decision support tool and is notable for its simplicity and ease.

  11. [Error structure and additivity of individual tree biomass model for four natural conifer species in Northeast China].

    PubMed

    Dogn, Li-hu; Li, Feng-ri; Song, Yu-wen

    2015-03-01

    Based on the biomass data of 276 sampling trees of Pinus koraiensis, Abies nephrolepis, Picea koraiensis and Larix gmelinii, the mono-element and dual-element additive system of biomass equations for the four conifer species was developed. The model error structure (additive vs. multiplicative) of the allometric equation was evaluated using the likelihood analysis, while nonlinear seemly unrelated regression was used to estimate the parameters in the additive system of biomass equations. The results indicated that the assumption of multiplicative error structure was strongly supported for the biomass equations of total and tree components for the four conifer species. Thus, the additive system of log-transformed biomass equations was developed. The adjusted coefficient of determination (Ra 2) of the additive system of biomass equations for the four conifer species was 0.85-0.99, the mean relative error was between -7.7% and 5.5%, and the mean absolute relative error was less than 30.5%. Adding total tree height in the additive systems of biomass equations could significantly improve model fitting performance and predicting precision, and the biomass equations of total, aboveground and stem were better than biomass equations of root, branch, foliage and crown. The precision of each biomass equation in the additive system varied from 77.0% to 99.7% with a mean value of 92.3% that would be suitable for predicting the biomass of the four natural conifer species.

  12. Boosted Regression Trees for Early Yield Estimation: A Case Study on Winter Wheat in the North China Plain

    NASA Astrophysics Data System (ADS)

    Heremans, Stien; Dong, Qinghan; Bydekerke, Lieven; Van Orshoven, Jos

    2014-11-01

    Crop yield estimates should be both early and accurate to be useful in the context of food security. The combination of remotely sensed vegetation indices and measured climatic variables (temperature, rainfall,…) allows timely, cost efficient and spatially explicit yield estimation. Machine learning methods like boosted regression trees (BoRTs) have proven to be accurate in predicting a diverse range of biophysical parameters. Until now, they have however rarely been applied in crop yield estimation.

  13. Classification of the PALMS single particle mass spectral data from Atlanta by regression tree analysis

    NASA Astrophysics Data System (ADS)

    Middlebrook, A. M.; Murphy, D. M.; Lee, S.; Lee, S.; Lee, S.; Thomson, D. S.; Thomson, D. S.

    2001-12-01

    During the Atlanta Supersites project in August 1999, the PALMS (Particle Analysis by Laser Mass Spectrometry) instrument collected over 500,000 individual particle spectra. The Atlanta data were originally analyzed by examining combinations of peaks and relative peak areas [Lee et al., 2001a,b], and a wide range of particle components such as sulfate, nitrate, mineral species, metals, organic species, and elemental carbon were detected. To further study the dataset, a classification program using regression tree analysis was developed and applied. Spectral data were compressed into a lower resolution spectrum (every 0.25 mass units) of the raw data and a list of peak areas (every mass unit). Each spectrum started as a normalized classification vector by itself. If the dot product of two classification vectors was within a certain threshold, they were combined into a new classification. The new classification vector was a normalized running average of the classifications being combined. In subsequent steps, the threshold for combining classifications was continuously lowered until a reasonable number of classifications remained. After the final iteration, each spectrum was compared individually with the entire set of classification vectors. Classifications were also combined manually. The classification results from the Atlanta data are generally consistent with those determined by peak identification. However, the classification program identified specific patterns in the mass spectra that were not found by peak identification and generated new particle types. Furthermore, rare particle types that may affect human health were studied in more detail. A description of the classification program as well as the results for the Atlanta data will be presented. Lee, S.-H., D. M. Murphy, D. S. Thomson, and A. M. Middlebrook, Chemical components of single particles measured with particle analysis by laser mass spectrometry (PALMS) during the Atlanta Supersites Project

  14. Analysis of the impact of recreational trail usage for prioritising management decisions: a regression tree approach

    NASA Astrophysics Data System (ADS)

    Tomczyk, Aleksandra; Ewertowski, Marek; White, Piran; Kasprzak, Leszek

    2016-04-01

    The dual role of many Protected Natural Areas in providing benefits for both conservation and recreation poses challenges for management. Although recreation-based damage to ecosystems can occur very quickly, restoration can take many years. The protection of conservation interests at the same as providing for recreation requires decisions to be made about how to prioritise and direct management actions. Trails are commonly used to divert visitors from the most important areas of a site, but high visitor pressure can lead to increases in trail width and a concomitant increase in soil erosion. Here we use detailed field data on condition of recreational trails in Gorce National Park, Poland, as the basis for a regression tree analysis to determine the factors influencing trail deterioration, and link specific trail impacts with environmental, use related and managerial factors. We distinguished 12 types of trails, characterised by four levels of degradation: (1) trails with an acceptable level of degradation; (2) threatened trails; (3) damaged trails; and (4) heavily damaged trails. Damaged trails were the most vulnerable of all trails and should be prioritised for appropriate conservation and restoration. We also proposed five types of monitoring of recreational trail conditions: (1) rapid inventory of negative impacts; (2) monitoring visitor numbers and variation in type of use; (3) change-oriented monitoring focusing on sections of trail which were subjected to changes in type or level of use or subjected to extreme weather events; (4) monitoring of dynamics of trail conditions; and (5) full assessment of trail conditions, to be carried out every 10-15 years. The application of the proposed framework can enhance the ability of Park managers to prioritise their trail management activities, enhancing trail conditions and visitor safety, while minimising adverse impacts on the conservation value of the ecosystem. A.M.T. was supported by the Polish Ministry of

  15. Variances in the projections, resulting from CLIMEX, Boosted Regression Trees and Random Forests techniques

    NASA Astrophysics Data System (ADS)

    Shabani, Farzin; Kumar, Lalit; Solhjouy-fard, Samaneh

    2016-05-01

    The aim of this study was to have a comparative investigation and evaluation of the capabilities of correlative and mechanistic modeling processes, applied to the projection of future distributions of date palm in novel environments and to establish a method of minimizing uncertainty in the projections of differing techniques. The location of this study on a global scale is in Middle Eastern Countries. We compared the mechanistic model CLIMEX (CL) with the correlative models MaxEnt (MX), Boosted Regression Trees (BRT), and Random Forests (RF) to project current and future distributions of date palm (Phoenix dactylifera L.). The Global Climate Model (GCM), the CSIRO-Mk3.0 (CS) using the A2 emissions scenario, was selected for making projections. Both indigenous and alien distribution data of the species were utilized in the modeling process. The common areas predicted by MX, BRT, RF, and CL from the CS GCM were extracted and compared to ascertain projection uncertainty levels of each individual technique. The common areas identified by all four modeling techniques were used to produce a map indicating suitable and unsuitable areas for date palm cultivation for Middle Eastern countries, for the present and the year 2100. The four different modeling approaches predict fairly different distributions. Projections from CL were more conservative than from MX. The BRT and RF were the most conservative methods in terms of projections for the current time. The combination of the final CL and MX projections for the present and 2100 provide higher certainty concerning those areas that will become highly suitable for future date palm cultivation. According to the four models, cold, hot, and wet stress, with differences on a regional basis, appears to be the major restrictions on future date palm distribution. The results demonstrate variances in the projections, resulting from different techniques. The assessment and interpretation of model projections requires reservations

  16. Determinants of reproductive success in dominant pairs of clownfish: a boosted regression tree analysis.

    PubMed

    Buston, Peter M; Elith, Jane

    2011-05-01

    1. Central questions of behavioural and evolutionary ecology are what factors influence the reproductive success of dominant breeders and subordinate nonbreeders within animal societies? A complete understanding of any society requires that these questions be answered for all individuals. 2. The clown anemonefish, Amphiprion percula, forms simple societies that live in close association with sea anemones, Heteractis magnifica. Here, we use data from a well-studied population of A. percula to determine the major predictors of reproductive success of dominant pairs in this species. 3. We analyse the effect of multiple predictors on four components of reproductive success, using a relatively new technique from the field of statistical learning: boosted regression trees (BRTs). BRTs have the potential to model complex relationships in ways that give powerful insight. 4. We show that the reproductive success of dominant pairs is unrelated to the presence, number or phenotype of nonbreeders. This is consistent with the observation that nonbreeders do not help or hinder breeders in any way, confirming and extending the results of a previous study. 5. Primarily, reproductive success is negatively related to male growth and positively related to breeding experience. It is likely that these effects are interrelated because males that grow a lot have little breeding experience. These effects are indicative of a trade-off between male growth and parental investment. 6. Secondarily, reproductive success is positively related to female growth and size. In this population, female size is positively related to group size and anemone size, also. These positive correlations among traits likely are caused by variation in site quality and are suggestive of a silver-spoon effect. 7. Noteworthily, whereas reproductive success is positively related to female size, it is unrelated to male size. This observation provides support for the size advantage hypothesis for sex change: both

  17. Tree Biomass Allocation and Its Model Additivity for Casuarina equisetifolia in a Tropical Forest of Hainan Island, China

    PubMed Central

    Xue, Yang; Yang, Zhongyang; Wang, Xiaoyan; Lin, Zhipan; Li, Dunxi; Su, Shaofeng

    2016-01-01

    Casuarina equisetifolia is commonly planted and used in the construction of coastal shelterbelt protection in Hainan Island. Thus, it is critical to accurately estimate the tree biomass of Casuarina equisetifolia L. for forest managers to evaluate the biomass stock in Hainan. The data for this work consisted of 72 trees, which were divided into three age groups: young forest, middle-aged forest, and mature forest. The proportion of biomass from the trunk significantly increased with age (P<0.05). However, the biomass of the branch and leaf decreased, and the biomass of the root did not change. To test whether the crown radius (CR) can improve biomass estimates of C. equisetifolia, we introduced CR into the biomass models. Here, six models were used to estimate the biomass of each component, including the trunk, the branch, the leaf, and the root. In each group, we selected one model among these six models for each component. The results showed that including the CR greatly improved the model performance and reduced the error, especially for the young and mature forests. In addition, to ensure biomass additivity, the selected equation for each component was fitted as a system of equations using seemingly unrelated regression (SUR). The SUR method not only gave efficient and accurate estimates but also achieved the logical additivity. The results in this study provide a robust estimation of tree biomass components and total biomass over three groups of C. equisetifolia. PMID:27002822

  18. Tree Biomass Allocation and Its Model Additivity for Casuarina equisetifolia in a Tropical Forest of Hainan Island, China.

    PubMed

    Xue, Yang; Yang, Zhongyang; Wang, Xiaoyan; Lin, Zhipan; Li, Dunxi; Su, Shaofeng

    2016-01-01

    Casuarina equisetifolia is commonly planted and used in the construction of coastal shelterbelt protection in Hainan Island. Thus, it is critical to accurately estimate the tree biomass of Casuarina equisetifolia L. for forest managers to evaluate the biomass stock in Hainan. The data for this work consisted of 72 trees, which were divided into three age groups: young forest, middle-aged forest, and mature forest. The proportion of biomass from the trunk significantly increased with age (P<0.05). However, the biomass of the branch and leaf decreased, and the biomass of the root did not change. To test whether the crown radius (CR) can improve biomass estimates of C. equisetifolia, we introduced CR into the biomass models. Here, six models were used to estimate the biomass of each component, including the trunk, the branch, the leaf, and the root. In each group, we selected one model among these six models for each component. The results showed that including the CR greatly improved the model performance and reduced the error, especially for the young and mature forests. In addition, to ensure biomass additivity, the selected equation for each component was fitted as a system of equations using seemingly unrelated regression (SUR). The SUR method not only gave efficient and accurate estimates but also achieved the logical additivity. The results in this study provide a robust estimation of tree biomass components and total biomass over three groups of C. equisetifolia.

  19. A Comparison of Logistic Regression, Neural Networks, and Classification Trees Predicting Success of Actuarial Students

    ERIC Educational Resources Information Center

    Schumacher, Phyllis; Olinsky, Alan; Quinn, John; Smith, Richard

    2010-01-01

    The authors extended previous research by 2 of the authors who conducted a study designed to predict the successful completion of students enrolled in an actuarial program. They used logistic regression to determine the probability of an actuarial student graduating in the major or dropping out. They compared the results of this study with those…

  20. Cost-Sensitive Boosting: Fitting an Additive Asymmetric Logistic Regression Model

    NASA Astrophysics Data System (ADS)

    Li, Qiu-Jie; Mao, Yao-Bin; Wang, Zhi-Quan; Xiang, Wen-Bo

    Conventional machine learning algorithms like boosting tend to equally treat misclassification errors that are not adequate to process certain cost-sensitive classification problems such as object detection. Although many cost-sensitive extensions of boosting by directly modifying the weighting strategy of correspond original algorithms have been proposed and reported, they are heuristic in nature and only proved effective by empirical results but lack sound theoretical analysis. This paper develops a framework from a statistical insight that can embody almost all existing cost-sensitive boosting algorithms: fitting an additive asymmetric logistic regression model by stage-wise optimization of certain criterions. Four cost-sensitive versions of boosting algorithms are derived, namely CSDA, CSRA, CSGA and CSLB which respectively correspond to Discrete AdaBoost, Real AdaBoost, Gentle AdaBoost and LogitBoost. Experimental results on the application of face detection have shown the effectiveness of the proposed learning framework in the reduction of the cumulative misclassification cost.

  1. Factors linked to outcomes in sexually abused girls: a regression tree analysis.

    PubMed

    Hébert, Martine; Collin-Vézina, Delphine; Daigneault, Isabelle; Parent, Nathalie; Tremblay, Caroline

    2006-01-01

    Children who report sexual abuse (SA) have been found to display a range of internalizing and externalizing behavior problems. In the present study, a tree-based analysis was used to derive models predicting the variability of internalizing and externalizing behavior problems as well as dissociation symptoms in SA girls. Participants were 150 girls aged 4 to 12 years referred to a specialized pediatric clinic after disclosure of SA. The potential predictors taken into account included sociodemographic and abuse-related variables as well as maternal and family characteristics. The models obtained point to prior abuse as a salient variable in predicting outcomes of SA girls. Implications for the treatment for children disclosing SA are discussed.

  2. Which sociodemographic factors are important on smoking behaviour of high school students? The contribution of classification and regression tree methodology in a broad epidemiological survey

    PubMed Central

    Özge, C; Toros, F; Bayramkaya, E; Çamdeviren, H; Şaşmaz, T

    2006-01-01

    Background The purpose of this study is to evaluate the most important sociodemographic factors on smoking status of high school students using a broad randomised epidemiological survey. Methods Using in‐class, self administered questionnaire about their sociodemographic variables and smoking behaviour, a representative sample of total 3304 students of preparatory, 9th, 10th, and 11th grades, from 22 randomly selected schools of Mersin, were evaluated and discriminative factors have been determined using appropriate statistics. In addition to binary logistic regression analysis, the study evaluated combined effects of these factors using classification and regression tree methodology, as a new statistical method. Results The data showed that 38% of the students reported lifetime smoking and 16.9% of them reported current smoking with a male predominancy and increasing prevalence by age. Second hand smoking was reported at a 74.3% frequency with father predominance (56.6%). The significantly important factors that affect current smoking in these age groups were increased by household size, late birth rank, certain school types, low academic performance, increased second hand smoking, and stress (especially reported as separation from a close friend or because of violence at home). Classification and regression tree methodology showed the importance of some neglected sociodemographic factors with a good classification capacity. Conclusions It was concluded that, as closely related with sociocultural factors, smoking was a common problem in this young population, generating important academic and social burden in youth life and with increasing data about this behaviour and using new statistical methods, effective coping strategies could be composed. PMID:16891446

  3. A Comparative Assessment of the Influences of Human Impacts on Soil Cd Concentrations Based on Stepwise Linear Regression, Classification and Regression Tree, and Random Forest Models

    PubMed Central

    Qiu, Lefeng; Wang, Kai; Long, Wenli; Wang, Ke; Hu, Wei; Amable, Gabriel S.

    2016-01-01

    Soil cadmium (Cd) contamination has attracted a great deal of attention because of its detrimental effects on animals and humans. This study aimed to develop and compare the performances of stepwise linear regression (SLR), classification and regression tree (CART) and random forest (RF) models in the prediction and mapping of the spatial distribution of soil Cd and to identify likely sources of Cd accumulation in Fuyang County, eastern China. Soil Cd data from 276 topsoil (0–20 cm) samples were collected and randomly divided into calibration (222 samples) and validation datasets (54 samples). Auxiliary data, including detailed land use information, soil organic matter, soil pH, and topographic data, were incorporated into the models to simulate the soil Cd concentrations and further identify the main factors influencing soil Cd variation. The predictive models for soil Cd concentration exhibited acceptable overall accuracies (72.22% for SLR, 70.37% for CART, and 75.93% for RF). The SLR model exhibited the largest predicted deviation, with a mean error (ME) of 0.074 mg/kg, a mean absolute error (MAE) of 0.160 mg/kg, and a root mean squared error (RMSE) of 0.274 mg/kg, and the RF model produced the results closest to the observed values, with an ME of 0.002 mg/kg, an MAE of 0.132 mg/kg, and an RMSE of 0.198 mg/kg. The RF model also exhibited the greatest R2 value (0.772). The CART model predictions closely followed, with ME, MAE, RMSE, and R2 values of 0.013 mg/kg, 0.154 mg/kg, 0.230 mg/kg and 0.644, respectively. The three prediction maps generally exhibited similar and realistic spatial patterns of soil Cd contamination. The heavily Cd-affected areas were primarily located in the alluvial valley plain of the Fuchun River and its tributaries because of the dramatic industrialization and urbanization processes that have occurred there. The most important variable for explaining high levels of soil Cd accumulation was the presence of metal smelting industries. The

  4. Further Insight and Additional Inference Methods for Polynomial Regression Applied to the Analysis of Congruence

    ERIC Educational Resources Information Center

    Cohen, Ayala; Nahum-Shani, Inbal; Doveh, Etti

    2010-01-01

    In their seminal paper, Edwards and Parry (1993) presented the polynomial regression as a better alternative to applying difference score in the study of congruence. Although this method is increasingly applied in congruence research, its complexity relative to other methods for assessing congruence (e.g., difference score methods) was one of the…

  5. An Additional Measure of Overall Effect Size for Logistic Regression Models

    ERIC Educational Resources Information Center

    Allen, Jeff; Le, Huy

    2008-01-01

    Users of logistic regression models often need to describe the overall predictive strength, or effect size, of the model's predictors. Analogs of R[superscript 2] have been developed, but none of these measures are interpretable on the same scale as effects of individual predictors. Furthermore, R[superscript 2] analogs are not invariant to the…

  6. Reconstructing palaeoclimatic variables from fossil pollen using boosted regression trees: comparison and synthesis with other quantitative reconstruction methods

    NASA Astrophysics Data System (ADS)

    Salonen, J. Sakari; Luoto, Miska; Alenius, Teija; Heikkilä, Maija; Seppä, Heikki; Telford, Richard J.; Birks, H. John B.

    2014-03-01

    We test and analyse a new calibration method, boosted regression trees (BRTs) in palaeoclimatic reconstructions based on fossil pollen assemblages. We apply BRTs to multiple Holocene and Lateglacial pollen sequences from northern Europe, and compare their performance with two commonly-used calibration methods: weighted averaging regression (WA) and the modern-analogue technique (MAT). Using these calibration methods and fossil pollen data, we present synthetic reconstructions of Holocene summer temperature, winter temperature, and water balance changes in northern Europe. Highly consistent trends are found for summer temperature, with a distinct Holocene thermal maximum at ca 8000-4000 cal. a BP, with a mean Tjja anomaly of ca +0.7 °C at 6 ka compared to 0.5 ka. We were unable to reconstruct reliably winter temperature or water balance, due to the confounding effects of summer temperature and the great between-reconstruction variability. We find BRTs to be a promising tool for quantitative reconstructions from palaeoenvironmental proxy data. BRTs show good performance in cross-validations compared with WA and MAT, can model a variety of taxon response types, find relevant predictors and incorporate interactions between predictors, and show some robustness with non-analogue fossil assemblages.

  7. Mineral elements of subtropical tree seedlings in response to elevated carbon dioxide and nitrogen addition.

    PubMed

    Huang, Wenjuan; Zhou, Guoyi; Liu, Juxiu; Zhang, Deqiang; Liu, Shizhong; Chu, Guowei; Fang, Xiong

    2015-01-01

    Mineral elements in plants have been strongly affected by increased atmospheric carbon dioxide (CO2) concentrations and nitrogen (N) deposition due to human activities. However, such understanding is largely limited to N and phosphorus in grassland. Using open-top chambers, we examined the concentrations of potassium (K), calcium (Ca), magnesium (Mg), aluminum (Al), copper (Cu) and manganese (Mn) in the leaves and roots of the seedlings of five subtropical tree species in response to elevated CO2 (ca. 700 μmol CO2 mol(-1)) and N addition (100 kg N ha(-1) yr(-1)) from 2005 to 2009. These mineral elements in the roots responded more strongly to elevated CO2 and N addition than those in the leaves. Elevated CO2 did not consistently decrease the concentrations of plant mineral elements, with increases in K, Al, Cu and Mn in some tree species. N addition decreased K and had no influence on Cu in the five tree species. Given the shifts in plant mineral elements, Schima superba and Castanopsis hystrix were less responsive to elevated CO2 and N addition alone, respectively. Our results indicate that plant stoichiometry would be altered by increasing CO2 and N deposition, and K would likely become a limiting nutrient under increasing N deposition in subtropics. PMID:25794046

  8. Additional results on 'Reducing geometric dilution of precision using ridge regression'

    NASA Astrophysics Data System (ADS)

    Kelly, Robert J.

    1990-07-01

    Kelly (1990) presented preliminary results on the feasibility of using ridge regression (RR) to reduce the effects of geometric dilution of precision (GDOP) error inflation in position-fix navigation systems. Recent results indicate that RR will not reduce GDOP bias inflation when biaslike measurement errors last much longer than the aircraft guidance-loop response time. This conclusion precludes the use of RR on navigation systems whose dominant error sources are biaslike; e.g., the GPS selective-availability error source. The simulation results given by Kelly are, however, valid for the conditions defined. Although RR has not yielded a satisfactory solution to the general GDOP problem, it has illuminated the role that multicollinearity plays in navigation signal processors such as the Kalman filter. Bias inflation, initial position guess errors, ridge-parameter selection methodology, and the recursive ridge filter are discussed.

  9. Assessing College Student Interest in Math and/or Computer Science in a Cross-National Sample Using Classification and Regression Trees

    ERIC Educational Resources Information Center

    Kitsantas, Anastasia; Kitsantas, Panagiota; Kitsantas, Thomas

    2012-01-01

    The purpose of this exploratory study was to assess the relative importance of a number of variables in predicting students' interest in math and/or computer science. Classification and regression trees (CART) were employed in the analysis of survey data collected from 276 college students enrolled in two U.S. and Greek universities. The…

  10. Simulating California reservoir operation using the classification and regression-tree algorithm combined with a shuffled cross-validation scheme

    NASA Astrophysics Data System (ADS)

    Yang, Tiantian; Gao, Xiaogang; Sorooshian, Soroosh; Li, Xin

    2016-03-01

    The controlled outflows from a reservoir or dam are highly dependent on the decisions made by the reservoir operators, instead of a natural hydrological process. Difference exists between the natural upstream inflows to reservoirs and the controlled outflows from reservoirs that supply the downstream users. With the decision maker's awareness of changing climate, reservoir management requires adaptable means to incorporate more information into decision making, such as water delivery requirement, environmental constraints, dry/wet conditions, etc. In this paper, a robust reservoir outflow simulation model is presented, which incorporates one of the well-developed data-mining models (Classification and Regression Tree) to predict the complicated human-controlled reservoir outflows and extract the reservoir operation patterns. A shuffled cross-validation approach is further implemented to improve CART's predictive performance. An application study of nine major reservoirs in California is carried out. Results produced by the enhanced CART, original CART, and random forest are compared with observation. The statistical measurements show that the enhanced CART and random forest overperform the CART control run in general, and the enhanced CART algorithm gives a better predictive performance over random forest in simulating the peak flows. The results also show that the proposed model is able to consistently and reasonably predict the expert release decisions. Experiments indicate that the release operation in the Oroville Lake is significantly dominated by SWP allocation amount and reservoirs with low elevation are more sensitive to inflow amount than others.

  11. Hourly predictive artificial neural network and multivariate regression tree models of Alternaria and Cladosporium spore concentrations in Szczecin (Poland)

    NASA Astrophysics Data System (ADS)

    Grinn-Gofroń, Agnieszka; Strzelczak, Agnieszka

    2009-11-01

    A study was made of the link between time of day, weather variables and the hourly content of certain fungal spores in the atmosphere of the city of Szczecin, Poland, in 2004-2007. Sampling was carried out with a Lanzoni 7-day-recording spore trap. The spores analysed belonged to the taxa Alternaria and Cladosporium. These spores were selected both for their allergenic capacity and for their high level presence in the atmosphere, particularly during summer. Spearman correlation coefficients between spore concentrations, meteorological parameters and time of day showed different indices depending on the taxon being analysed. Relative humidity (RH), air temperature, air pressure and clouds most strongly and significantly influenced the concentration of Alternaria spores. Cladosporium spores correlated less strongly and significantly than Alternaria. Multivariate regression tree analysis revealed that, at air pressures lower than 1,011 hPa the concentration of Alternaria spores was low. Under higher air pressure spore concentrations were higher, particularly when RH was lower than 36.5%. In the case of Cladosporium, under higher air pressure (>1,008 hPa), the spores analysed were more abundant, particularly after 0330 hours. In artificial neural networks, RH, air pressure and air temperature were the most important variables in the model for Alternaria spore concentration. For Cladosporium, clouds, time of day, air pressure, wind speed and dew point temperature were highly significant factors influencing spore concentration. The maximum abundance of Cladosporium spores in air fell between 1200 and 1700 hours.

  12. Hourly predictive artificial neural network and multivariate regression tree models of Alternaria and Cladosporium spore concentrations in Szczecin (Poland).

    PubMed

    Grinn-Gofroń, Agnieszka; Strzelczak, Agnieszka

    2009-11-01

    A study was made of the link between time of day, weather variables and the hourly content of certain fungal spores in the atmosphere of the city of Szczecin, Poland, in 2004-2007. Sampling was carried out with a Lanzoni 7-day-recording spore trap. The spores analysed belonged to the taxa Alternaria and Cladosporium. These spores were selected both for their allergenic capacity and for their high level presence in the atmosphere, particularly during summer. Spearman correlation coefficients between spore concentrations, meteorological parameters and time of day showed different indices depending on the taxon being analysed. Relative humidity (RH), air temperature, air pressure and clouds most strongly and significantly influenced the concentration of Alternaria spores. Cladosporium spores correlated less strongly and significantly than Alternaria. Multivariate regression tree analysis revealed that, at air pressures lower than 1,011 hPa the concentration of Alternaria spores was low. Under higher air pressure spore concentrations were higher, particularly when RH was lower than 36.5%. In the case of Cladosporium, under higher air pressure (>1,008 hPa), the spores analysed were more abundant, particularly after 0330 hours. In artificial neural networks, RH, air pressure and air temperature were the most important variables in the model for Alternaria spore concentration. For Cladosporium, clouds, time of day, air pressure, wind speed and dew point temperature were highly significant factors influencing spore concentration. The maximum abundance of Cladosporium spores in air fell between 1200 and 1700 hours.

  13. Estimating Dbh of Trees Employing Multiple Linear Regression of the best Lidar-Derived Parameter Combination Automated in Python in a Natural Broadleaf Forest in the Philippines

    NASA Astrophysics Data System (ADS)

    Ibanez, C. A. G.; Carcellar, B. G., III; Paringit, E. C.; Argamosa, R. J. L.; Faelga, R. A. G.; Posilero, M. A. V.; Zaragosa, G. P.; Dimayacyac, N. A.

    2016-06-01

    Diameter-at-Breast-Height Estimation is a prerequisite in various allometric equations estimating important forestry indices like stem volume, basal area, biomass and carbon stock. LiDAR Technology has a means of directly obtaining different forest parameters, except DBH, from the behavior and characteristics of point cloud unique in different forest classes. Extensive tree inventory was done on a two-hectare established sample plot in Mt. Makiling, Laguna for a natural growth forest. Coordinates, height, and canopy cover were measured and types of species were identified to compare to LiDAR derivatives. Multiple linear regression was used to get LiDAR-derived DBH by integrating field-derived DBH and 27 LiDAR-derived parameters at 20m, 10m, and 5m grid resolutions. To know the best combination of parameters in DBH Estimation, all possible combinations of parameters were generated and automated using python scripts and additional regression related libraries such as Numpy, Scipy, and Scikit learn were used. The combination that yields the highest r-squared or coefficient of determination and lowest AIC (Akaike's Information Criterion) and BIC (Bayesian Information Criterion) was determined to be the best equation. The equation is at its best using 11 parameters at 10mgrid size and at of 0.604 r-squared, 154.04 AIC and 175.08 BIC. Combination of parameters may differ among forest classes for further studies. Additional statistical tests can be supplemented to help determine the correlation among parameters such as Kaiser- Meyer-Olkin (KMO) Coefficient and the Barlett's Test for Spherecity (BTS).

  14. Landslide susceptibility mapping using decision-tree based CHi-squared automatic interaction detection (CHAID) and Logistic regression (LR) integration

    NASA Astrophysics Data System (ADS)

    Althuwaynee, Omar F.; Pradhan, Biswajeet; Ahmad, Noordin

    2014-06-01

    This article uses methodology based on chi-squared automatic interaction detection (CHAID), as a multivariate method that has an automatic classification capacity to analyse large numbers of landslide conditioning factors. This new algorithm was developed to overcome the subjectivity of the manual categorization of scale data of landslide conditioning factors, and to predict rainfall-induced susceptibility map in Kuala Lumpur city and surrounding areas using geographic information system (GIS). The main objective of this article is to use CHi-squared automatic interaction detection (CHAID) method to perform the best classification fit for each conditioning factor, then, combining it with logistic regression (LR). LR model was used to find the corresponding coefficients of best fitting function that assess the optimal terminal nodes. A cluster pattern of landslide locations was extracted in previous study using nearest neighbor index (NNI), which were then used to identify the clustered landslide locations range. Clustered locations were used as model training data with 14 landslide conditioning factors such as; topographic derived parameters, lithology, NDVI, land use and land cover maps. Pearson chi-squared value was used to find the best classification fit between the dependent variable and conditioning factors. Finally the relationship between conditioning factors were assessed and the landslide susceptibility map (LSM) was produced. An area under the curve (AUC) was used to test the model reliability and prediction capability with the training and validation landslide locations respectively. This study proved the efficiency and reliability of decision tree (DT) model in landslide susceptibility mapping. Also it provided a valuable scientific basis for spatial decision making in planning and urban management studies.

  15. Multi-scale remote sensing sagebrush characterization with regression trees over Wyoming, USA: laying a foundation for monitoring

    USGS Publications Warehouse

    Homer, Collin G.; Aldridge, Cameron L.; Meyer, Debra K.; Schell, Spencer J.

    2012-01-01

    agebrush ecosystems in North America have experienced extensive degradation since European settlement. Further degradation continues from exotic invasive plants, altered fire frequency, intensive grazing practices, oil and gas development, and climate change – adding urgency to the need for ecosystem-wide understanding. Remote sensing is often identified as a key information source to facilitate ecosystem-wide characterization, monitoring, and analysis; however, approaches that characterize sagebrush with sufficient and accurate local detail across large enough areas to support this paradigm are unavailable. We describe the development of a new remote sensing sagebrush characterization approach for the state of Wyoming, U.S.A. This approach integrates 2.4 m QuickBird, 30 m Landsat TM, and 56 m AWiFS imagery into the characterization of four primary continuous field components including percent bare ground, percent herbaceous cover, percent litter, and percent shrub, and four secondary components including percent sagebrush (Artemisia spp.), percent big sagebrush (Artemisia tridentata), percent Wyoming sagebrush (Artemisia tridentata Wyomingensis), and shrub height using a regression tree. According to an independent accuracy assessment, primary component root mean square error (RMSE) values ranged from 4.90 to 10.16 for 2.4 m QuickBird, 6.01 to 15.54 for 30 m Landsat, and 6.97 to 16.14 for 56 m AWiFS. Shrub and herbaceous components outperformed the current data standard called LANDFIRE, with a shrub RMSE value of 6.04 versus 12.64 and a herbaceous component RMSE value of 12.89 versus 14.63. This approach offers new advancements in sagebrush characterization from remote sensing and provides a foundation to quantitatively monitor these components into the future.

  16. Multi-scale remote sensing sagebrush characterization with regression trees over Wyoming, USA: Laying a foundation for monitoring

    NASA Astrophysics Data System (ADS)

    Homer, Collin G.; Aldridge, Cameron L.; Meyer, Debra K.; Schell, Spencer J.

    2012-02-01

    Sagebrush ecosystems in North America have experienced extensive degradation since European settlement. Further degradation continues from exotic invasive plants, altered fire frequency, intensive grazing practices, oil and gas development, and climate change - adding urgency to the need for ecosystem-wide understanding. Remote sensing is often identified as a key information source to facilitate ecosystem-wide characterization, monitoring, and analysis; however, approaches that characterize sagebrush with sufficient and accurate local detail across large enough areas to support this paradigm are unavailable. We describe the development of a new remote sensing sagebrush characterization approach for the state of Wyoming, U.S.A. This approach integrates 2.4 m QuickBird, 30 m Landsat TM, and 56 m AWiFS imagery into the characterization of four primary continuous field components including percent bare ground, percent herbaceous cover, percent litter, and percent shrub, and four secondary components including percent sagebrush ( Artemisia spp.), percent big sagebrush ( Artemisia tridentata), percent Wyoming sagebrush ( Artemisia tridentata Wyomingensis), and shrub height using a regression tree. According to an independent accuracy assessment, primary component root mean square error (RMSE) values ranged from 4.90 to 10.16 for 2.4 m QuickBird, 6.01 to 15.54 for 30 m Landsat, and 6.97 to 16.14 for 56 m AWiFS. Shrub and herbaceous components outperformed the current data standard called LANDFIRE, with a shrub RMSE value of 6.04 versus 12.64 and a herbaceous component RMSE value of 12.89 versus 14.63. This approach offers new advancements in sagebrush characterization from remote sensing and provides a foundation to quantitatively monitor these components into the future.

  17. PARAMETRIC AND NON PARAMETRIC (MARS: MULTIVARIATE ADDITIVE REGRESSION SPLINES) LOGISTIC REGRESSIONS FOR PREDICTION OF A DICHOTOMOUS RESPONSE VARIABLE WITH AN EXAMPLE FOR PRESENCE/ABSENCE OF AMPHIBIANS

    EPA Science Inventory

    The purpose of this report is to provide a reference manual that could be used by investigators for making informed use of logistic regression using two methods (standard logistic regression and MARS). The details for analyses of relationships between a dependent binary response ...

  18. Trees

    ERIC Educational Resources Information Center

    Al-Khaja, Nawal

    2007-01-01

    This is a thematic lesson plan for young learners about palm trees and the importance of taking care of them. The two part lesson teaches listening, reading and speaking skills. The lesson includes parts of a tree; the modal auxiliary, can; dialogues and a role play activity.

  19. Predicting tree species presence and basal area in Utah: A comparison of stochastic gradient boosting, generalized additive models, and tree-based methods

    USGS Publications Warehouse

    Moisen, G.G.; Freeman, E.A.; Blackard, J.A.; Frescino, T.S.; Zimmermann, N.E.; Edwards, T.C.

    2006-01-01

    Many efforts are underway to produce broad-scale forest attribute maps by modelling forest class and structure variables collected in forest inventories as functions of satellite-based and biophysical information. Typically, variants of classification and regression trees implemented in Rulequest's?? See5 and Cubist (for binary and continuous responses, respectively) are the tools of choice in many of these applications. These tools are widely used in large remote sensing applications, but are not easily interpretable, do not have ties with survey estimation methods, and use proprietary unpublished algorithms. Consequently, three alternative modelling techniques were compared for mapping presence and basal area of 13 species located in the mountain ranges of Utah, USA. The modelling techniques compared included the widely used See5/Cubist, generalized additive models (GAMs), and stochastic gradient boosting (SGB). Model performance was evaluated using independent test data sets. Evaluation criteria for mapping species presence included specificity, sensitivity, Kappa, and area under the curve (AUC). Evaluation criteria for the continuous basal area variables included correlation and relative mean squared error. For predicting species presence (setting thresholds to maximize Kappa), SGB had higher values for the majority of the species for specificity and Kappa, while GAMs had higher values for the majority of the species for sensitivity. In evaluating resultant AUC values, GAM and/or SGB models had significantly better results than the See5 models where significant differences could be detected between models. For nine out of 13 species, basal area prediction results for all modelling techniques were poor (correlations less than 0.5 and relative mean squared errors greater than 0.8), but SGB provided the most stable predictions in these instances. SGB and Cubist performed equally well for modelling basal area for three species with moderate prediction success

  20. Understanding Child Stunting in India: A Comprehensive Analysis of Socio-Economic, Nutritional and Environmental Determinants Using Additive Quantile Regression

    PubMed Central

    Fenske, Nora; Burns, Jacob; Hothorn, Torsten; Rehfuess, Eva A.

    2013-01-01

    Background Most attempts to address undernutrition, responsible for one third of global child deaths, have fallen behind expectations. This suggests that the assumptions underlying current modelling and intervention practices should be revisited. Objective We undertook a comprehensive analysis of the determinants of child stunting in India, and explored whether the established focus on linear effects of single risks is appropriate. Design Using cross-sectional data for children aged 0–24 months from the Indian National Family Health Survey for 2005/2006, we populated an evidence-based diagram of immediate, intermediate and underlying determinants of stunting. We modelled linear, non-linear, spatial and age-varying effects of these determinants using additive quantile regression for four quantiles of the Z-score of standardized height-for-age and logistic regression for stunting and severe stunting. Results At least one variable within each of eleven groups of determinants was significantly associated with height-for-age in the 35% Z-score quantile regression. The non-modifiable risk factors child age and sex, and the protective factors household wealth, maternal education and BMI showed the largest effects. Being a twin or multiple birth was associated with dramatically decreased height-for-age. Maternal age, maternal BMI, birth order and number of antenatal visits influenced child stunting in non-linear ways. Findings across the four quantile and two logistic regression models were largely comparable. Conclusions Our analysis confirms the multifactorial nature of child stunting. It emphasizes the need to pursue a systems-based approach and to consider non-linear effects, and suggests that differential effects across the height-for-age distribution do not play a major role. PMID:24223839

  1. Separating the effects of water physicochemistry and sediment contamination on Chironomus tepperi (Skuse) survival, growth and development: a boosted regression tree approach.

    PubMed

    Hale, Robin; Marshall, Stephen; Jeppe, Katherine; Pettigrove, Vincent

    2014-07-01

    More comprehensive ecological risk assessment procedures are needed as the unprecedented rate of anthropogenic disturbances to aquatic ecosystems continues. Identifying the effects of pollutants on aquatic ecosystems is difficult, requiring the individual and joint effects of a range of natural and anthropogenic factors to be isolated, often via the analysis of large, complicated datasets. Ecotoxicologists have traditionally used multiple regression to analyse such datasets, but there are inherent problems with this approach and a need to consider other potentially more suitable methods. Sediment pollution can cause a range of negative effects on aquatic animals, and these are used as the basis for toxicity bioassays to measure the biological impact of pollution and the success of remediation efforts. However, experimental artefacts can also lead to sediments being incorrectly classed as toxic in such studies. Understanding the influence of potentially confounding factors will help more accurate assessments of sediment pollution. In this study, we analysed standardised sediment bioassays conducted using the chironomid Chironomus tepperi, with the aim of modelling the impact of sediment toxicants and water physico-chemistry on four endpoints (survival, growth, median emergence day, and number of emerging adults). We used boosted regression trees (BRT), a method that has a number of advantages over multiple regression, to model bioassay endpoints as a function of water chemistry, sediment quality and underlying geology. Endpoints were generally influenced most strongly by water quality parameters and nutrients, although some metals negatively influenced emergence endpoints. Sub-lethal endpoints were generally better predicted than lethal endpoints; median emergence day was the most sensitive endpoint examined in this study, while the number of emerging adults was the least sensitive. We tested our modelling results by experimentally manipulating sediment and

  2. Contrasting regional and national mechanisms for predicting elevated arsenic in private wells across the United States using classification and regression trees.

    PubMed

    Frederick, Logan; VanDerslice, James; Taddie, Marissa; Malecki, Kristen; Gregg, Josh; Faust, Nicholas; Johnson, William P

    2016-03-15

    Arsenic contamination in groundwater is a public health and environmental concern in the United States (U.S.) particularly where monitoring is not required under the Safe Water Drinking Act. Previous studies suggest the influence of regional mechanisms for arsenic mobilization into groundwater; however, no study has examined how influencing parameters change at a continental scale spanning multiple regions. We herein examine covariates for groundwater in the western, central and eastern U.S. regions representing mechanisms associated with arsenic concentrations exceeding the U.S. Environmental Protection Agency maximum contamination level (MCL) of 10 parts per billion (ppb). Statistically significant covariates were identified via classification and regression tree (CART) analysis, and included hydrometeorological and groundwater chemical parameters. The CART analyses were performed at two scales: national and regional; for which three physiographic regions located in the western (Payette Section and the Snake River Plain), central (Osage Plains of the Central Lowlands), and eastern (Embayed Section of the Coastal Plains) U.S. were examined. Validity of each of the three regional CART models was indicated by values >85% for the area under the receiver-operating characteristic curve. Aridity (precipitation minus potential evapotranspiration) was identified as the primary covariate associated with elevated arsenic at the national scale. At the regional scale, aridity and pH were the major covariates in the arid to semi-arid (western) region; whereas dissolved iron (taken to represent chemically reducing conditions) and pH were major covariates in the temperate (eastern) region, although additional important covariates emerged, including elevated phosphate. Analysis in the central U.S. region indicated that elevated arsenic concentrations were driven by a mixture of those observed in the western and eastern regions.

  3. Contrasting regional and national mechanisms for predicting elevated arsenic in private wells across the United States using classification and regression trees.

    PubMed

    Frederick, Logan; VanDerslice, James; Taddie, Marissa; Malecki, Kristen; Gregg, Josh; Faust, Nicholas; Johnson, William P

    2016-03-15

    Arsenic contamination in groundwater is a public health and environmental concern in the United States (U.S.) particularly where monitoring is not required under the Safe Water Drinking Act. Previous studies suggest the influence of regional mechanisms for arsenic mobilization into groundwater; however, no study has examined how influencing parameters change at a continental scale spanning multiple regions. We herein examine covariates for groundwater in the western, central and eastern U.S. regions representing mechanisms associated with arsenic concentrations exceeding the U.S. Environmental Protection Agency maximum contamination level (MCL) of 10 parts per billion (ppb). Statistically significant covariates were identified via classification and regression tree (CART) analysis, and included hydrometeorological and groundwater chemical parameters. The CART analyses were performed at two scales: national and regional; for which three physiographic regions located in the western (Payette Section and the Snake River Plain), central (Osage Plains of the Central Lowlands), and eastern (Embayed Section of the Coastal Plains) U.S. were examined. Validity of each of the three regional CART models was indicated by values >85% for the area under the receiver-operating characteristic curve. Aridity (precipitation minus potential evapotranspiration) was identified as the primary covariate associated with elevated arsenic at the national scale. At the regional scale, aridity and pH were the major covariates in the arid to semi-arid (western) region; whereas dissolved iron (taken to represent chemically reducing conditions) and pH were major covariates in the temperate (eastern) region, although additional important covariates emerged, including elevated phosphate. Analysis in the central U.S. region indicated that elevated arsenic concentrations were driven by a mixture of those observed in the western and eastern regions. PMID:26803265

  4. An Introduction to Recursive Partitioning: Rationale, Application, and Characteristics of Classification and Regression Trees, Bagging, and Random Forests

    ERIC Educational Resources Information Center

    Strobl, Carolin; Malley, James; Tutz, Gerhard

    2009-01-01

    Recursive partitioning methods have become popular and widely used tools for nonparametric regression and classification in many scientific fields. Especially random forests, which can deal with large numbers of predictor variables even in the presence of complex interactions, have been applied successfully in genetics, clinical medicine, and…

  5. Decomposition of conifer tree bark under field conditions: effects of nitrogen and phosphorus additions

    NASA Astrophysics Data System (ADS)

    Lopes de Gerenyu, Valentin; Kurganova, Irina; Kapitsa, Ekaterina; Shorokhova, Ekaterina

    2016-04-01

    In forest ecosystems, the processes of decomposition of coarse woody debris (CWD) can contribute significantly to the emission component of carbon (C) cycle and thus accelerate the greenhouse effect and global climate change. A better understanding of decomposition of CWD is required to refine estimates of the C balance in forest ecosystems and improve biogeochemical models. These estimates will in turn contribute to assessing the role of forests in maintaining their long-term productivity and other ecosystems services. We examined the decomposition rate of coniferous bark with added nitrogen (N) and phosphorus (P) fertilizers in experiment under field conditions. The experiment was carried out in 2015 during 17 weeks in Moscow region (54o50'N, 37o36'E) under continental-temperate climatic conditions. The conifer tree bark mixture (ca. 70% of Norway spruce and 30% of Scots pine) was combined with soil and placed in piles of soil-bark substrate (SBS) with height of ca. 60 cm and surface area of ca. 3 m2. The dry mass ratio of bark to soil was 10:1. The experimental design included following treatments: (1) soil (Luvisols Haplic) without bark, (S), (2) pure SBS, (3) SBS with N addition in the amount of 1% of total dry bark mass (SBS-N), and (4) SBS with N and P addition in the amount of 1% of total dry bark mass for each element (SBS-NP). The decomposition rate expressed as CO2 emission flux, g C/m2/h was measured using closed chamber method 1-3 times per week from July to early November using LiCor 6400 (Nebraska, USA). During the experiment, we also controlled soil temperature at depths of 5, 20, 40, and 60 cm below surface of SBS using thermochrons iButton (DS1921G, USA). The pattern of CO2 emission rate from SBS depended strongly on fertilizing. The highest decomposition rates (DecR) of 2.8-5.6 g C/m2/h were observed in SBS-NP treatment during the first 6 weeks of experiment. The decay process of bark was less active in the treatment with only N addition. In this

  6. A community-level, mesoscale analysis of fish assemblage structure in shoreline habitats of a large river using multivariate regression trees

    NASA Astrophysics Data System (ADS)

    Wilkes, Martin; Maddock, Ian; Link, Oscar; Habit, Evelyn

    2015-04-01

    Despite the numerous advantages over traditional methods ascribed to community-level analyses, including the ability to rapidly predict the abundance of multiple species and the integration of complex biological interactions, very few applications to the mesoscale of river habitats can be found in the extant literature. Most previous work has been based on single species, species-by-species modelling or reduced dimensionality approaches. Community-level analyses have especially good properties for improving the understanding of habitat associations in large rivers where biological interactions are most intense and applications of the mesohabitat concept relatively sparse. This chapter seeks to identify quantitative relationships between key environmental variables and community structure using a particular type of community-level technique known as multivariate regression trees in order to test the ecological basis for applications of the mesohabitat concept in large rivers. Mesohabitats were mapped and their environmental characteristics recorded along a reach of the San Pedro River, Chile, which is inhabited by a highly endemic fish community. A representative portion of the mesohabitats were selected for fish sampling and multivariate regression trees produced to predict community structure based on combinations of environmental variables. The analyses showed that fish assemblages were distinct at the mesoscale, with flow depth, bank materials, cover and woody debris the key predictor variables. The results support the application of the mesohabitat concept in this geographical context and establish a basis for predicting the community structure of any mesohabitat along the reach.

  7. Partitioning of multivariate phenotypes using regression trees reveals complex patterns of adaptation to climate across the range of black cottonwood (Populus trichocarpa).

    PubMed

    Oubida, Regis W; Gantulga, Dashzeveg; Zhang, Man; Zhou, Lecong; Bawa, Rajesh; Holliday, Jason A

    2015-01-01

    Local adaptation to climate in temperate forest trees involves the integration of multiple physiological, morphological, and phenological traits. Latitudinal clines are frequently observed for these traits, but environmental constraints also track longitude and altitude. We combined extensive phenotyping of 12 candidate adaptive traits, multivariate regression trees, quantitative genetics, and a genome-wide panel of SNP markers to better understand the interplay among geography, climate, and adaptation to abiotic factors in Populus trichocarpa. Heritabilities were low to moderate (0.13-0.32) and population differentiation for many traits exceeded the 99th percentile of the genome-wide distribution of FST, suggesting local adaptation. When climate variables were taken as predictors and the 12 traits as response variables in a multivariate regression tree analysis, evapotranspiration (Eref) explained the most variation, with subsequent splits related to mean temperature of the warmest month, frost-free period (FFP), and mean annual precipitation (MAP). These grouping matched relatively well the splits using geographic variables as predictors: the northernmost groups (short FFP and low Eref) had the lowest growth, and lowest cold injury index; the southern British Columbia group (low Eref and intermediate temperatures) had average growth and cold injury index; the group from the coast of California and Oregon (high Eref and FFP) had the highest growth performance and the highest cold injury index; and the southernmost, high-altitude group (with high Eref and low FFP) performed poorly, had high cold injury index, and lower water use efficiency. Taken together, these results suggest variation in both temperature and water availability across the range shape multivariate adaptive traits in poplar. PMID:25870603

  8. Partitioning of multivariate phenotypes using regression trees reveals complex patterns of adaptation to climate across the range of black cottonwood (Populus trichocarpa)

    PubMed Central

    Oubida, Regis W.; Gantulga, Dashzeveg; Zhang, Man; Zhou, Lecong; Bawa, Rajesh; Holliday, Jason A.

    2015-01-01

    Local adaptation to climate in temperate forest trees involves the integration of multiple physiological, morphological, and phenological traits. Latitudinal clines are frequently observed for these traits, but environmental constraints also track longitude and altitude. We combined extensive phenotyping of 12 candidate adaptive traits, multivariate regression trees, quantitative genetics, and a genome-wide panel of SNP markers to better understand the interplay among geography, climate, and adaptation to abiotic factors in Populus trichocarpa. Heritabilities were low to moderate (0.13–0.32) and population differentiation for many traits exceeded the 99th percentile of the genome-wide distribution of FST, suggesting local adaptation. When climate variables were taken as predictors and the 12 traits as response variables in a multivariate regression tree analysis, evapotranspiration (Eref) explained the most variation, with subsequent splits related to mean temperature of the warmest month, frost-free period (FFP), and mean annual precipitation (MAP). These grouping matched relatively well the splits using geographic variables as predictors: the northernmost groups (short FFP and low Eref) had the lowest growth, and lowest cold injury index; the southern British Columbia group (low Eref and intermediate temperatures) had average growth and cold injury index; the group from the coast of California and Oregon (high Eref and FFP) had the highest growth performance and the highest cold injury index; and the southernmost, high-altitude group (with high Eref and low FFP) performed poorly, had high cold injury index, and lower water use efficiency. Taken together, these results suggest variation in both temperature and water availability across the range shape multivariate adaptive traits in poplar. PMID:25870603

  9. Binary Logistic Regression Versus Boosted Regression Trees in Assessing Landslide Susceptibility for Multiple-Occurring Regional Landslide Events: Application to the 2009 Storm Event in Messina (Sicily, southern Italy).

    NASA Astrophysics Data System (ADS)

    Lombardo, L.; Cama, M.; Maerker, M.; Parisi, L.; Rotigliano, E.

    2014-12-01

    This study aims at comparing the performances of Binary Logistic Regression (BLR) and Boosted Regression Trees (BRT) methods in assessing landslide susceptibility for multiple-occurrence regional landslide events within the Mediterranean region. A test area was selected in the north-eastern sector of Sicily (southern Italy), corresponding to the catchments of the Briga and the Giampilieri streams both stretching for few kilometres from the Peloritan ridge (eastern Sicily, Italy) to the Ionian sea. This area was struck on the 1st October 2009 by an extreme climatic event resulting in thousands of rapid shallow landslides, mainly of debris flows and debris avalanches types involving the weathered layer of a low to high grade metamorphic bedrock. Exploiting the same set of predictors and the 2009 landslide archive, BLR- and BRT-based susceptibility models were obtained for the two catchments separately, adopting a random partition (RP) technique for validation; besides, the models trained in one of the two catchments (Briga) were tested in predicting the landslide distribution in the other (Giampilieri), adopting a spatial partition (SP) based validation procedure. All the validation procedures were based on multi-folds tests so to evaluate and compare the reliability of the fitting, the prediction skill, the coherence in the predictor selection and the precision of the susceptibility estimates. All the obtained models for the two methods produced very high predictive performances, with a general congruence between BLR and BRT in the predictor importance. In particular, the research highlighted that BRT-models reached a higher prediction performance with respect to BLR-models, for RP based modelling, whilst for the SP-based models the difference in predictive skills between the two methods dropped drastically, converging to an analogous excellent performance. However, when looking at the precision of the probability estimates, BLR demonstrated to produce more robust

  10. A regional classification scheme for estimating reference water quality in streams using land-use-adjusted spatial regression-tree analysis

    USGS Publications Warehouse

    Robertson, D.M.; Saad, D.A.; Heisey, D.M.

    2006-01-01

    Various approaches are used to subdivide large areas into regions containing streams that have similar reference or background water quality and that respond similarly to different factors. For many applications, such as establishing reference conditions, it is preferable to use physical characteristics that are not affected by human activities to delineate these regions. However, most approaches, such as ecoregion classifications, rely on land use to delineate regions or have difficulties compensating for the effects of land use. Land use not only directly affects water quality, but it is often correlated with the factors used to define the regions. In this article, we describe modifications to SPARTA (spatial regression-tree analysis), a relatively new approach applied to water-quality and environmental characteristic data to delineate zones with similar factors affecting water quality. In this modified approach, land-use-adjusted (residualized) water quality and environmental characteristics are computed for each site. Regression-tree analysis is applied to the residualized data to determine the most statistically important environmental characteristics describing the distribution of a specific water-quality constituent. Geographic information for small basins throughout the study area is then used to subdivide the area into relatively homogeneous environmental water-quality zones. For each zone, commonly used approaches are subsequently used to define its reference water quality and how its water quality responds to changes in land use. SPARTA is used to delineate zones of similar reference concentrations of total phosphorus and suspended sediment throughout the upper Midwestern part of the United States. ?? 2006 Springer Science+Business Media, Inc.

  11. Effects of multiple chronic conditions on health care costs: an analysis based on an advanced tree-based regression model

    PubMed Central

    2013-01-01

    Background To analyze the impact of multimorbidity (MM) on health care costs taking into account data heterogeneity. Methods Data come from a multicenter prospective cohort study of 1,050 randomly selected primary care patients aged 65 to 85 years suffering from MM in Germany. MM was defined as co-occurrence of ≥3 conditions from a list of 29 chronic diseases. A conditional inference tree (CTREE) algorithm was used to detect the underlying structure and most influential variables on costs of inpatient care, outpatient care, medications as well as formal and informal nursing care. Results Irrespective of the number and combination of co-morbidities, a limited number of factors influential on costs were detected. Parkinson’s disease (PD) and cardiac insufficiency (CI) were the most influential variables for total costs. Compared to patients not suffering from any of the two conditions, PD increases predicted mean total costs 3.5-fold to approximately € 11,000 per 6 months, and CI two-fold to approximately € 6,100. The high total costs of PD are largely due to costs of nursing care. Costs of inpatient care were significantly influenced by cerebral ischemia/chronic stroke, whereas medication costs were associated with COPD, insomnia, PD and Diabetes. Except for costs of nursing care, socio-demographic variables did not significantly influence costs. Conclusions Irrespective of any combination and number of co-occurring diseases, PD and CI appear to be most influential on total health care costs in elderly patients with MM, and only a limited number of factors significantly influenced cost. Trial registration Current Controlled Trials ISRCTN89818205 PMID:23768192

  12. Predictors of adherence with self-care guidelines among persons with type 2 diabetes: results from a logistic regression tree analysis.

    PubMed

    Yamashita, Takashi; Kart, Cary S; Noe, Douglas A

    2012-12-01

    Type 2 diabetes is known to contribute to health disparities in the U.S. and failure to adhere to recommended self-care behaviors is a contributing factor. Intervention programs face difficulties as a result of patient diversity and limited resources. With data from the 2005 Behavioral Risk Factor Surveillance System, this study employs a logistic regression tree algorithm to identify characteristics of sub-populations with type 2 diabetes according to their reported frequency of adherence to four recommended diabetes self-care behaviors including blood glucose monitoring, foot examination, eye examination and HbA1c testing. Using Andersen's health behavior model, need factors appear to dominate the definition of which sub-groups were at greatest risk for low as well as high adherence. Findings demonstrate the utility of easily interpreted tree diagrams to design specific culturally appropriate intervention programs targeting sub-populations of diabetes patients who need to improve their self-care behaviors. Limitations and contributions of the study are discussed.

  13. Performance of seedlings of a shade-tolerant tropical tree species after moderate addition of N and P

    NASA Astrophysics Data System (ADS)

    Cárate Tandalla, Daisy; Leuschner, Christoph; Homeier, Jürgen

    2015-12-01

    Nitrogen deposition to tropical forests is predicted to increase in future in many regions due to agricultural intensification. We conducted a seedling transplantation experiment in a tropical premontane forest in Ecuador with a locally abundant late-successional tree species (Pouteria torta, Sapotaceae) aimed at detecting species-specific responses to moderate N and P addition and to understand how increasing nutrient availability will affect regeneration. From locally collected seeds, 320 seedlings were produced and transplanted to the plots of the Ecuadorian Nutrient Manipulation Experiment (NUMEX) with three treatments (moderate N addition: 50 kg N ha-1 yr-1, moderate P addition: 10 kg P ha-1 yr-1 and combined N and P addition) and a control (80 plants per treatment). After 12 months, mortality, relative growth rate, leaf nutrient content and leaf herbivory rate were measured. N and NP addition significantly increased the mortality rate (70 % vs. 54 % in the control). However, N and P addition also increased the diameter growth rate of the surviving seedlings. N and P addition did not alter foliar nutrient concentrations and leaf N:P ratio, but N addition decreased the leaf C:N ratio and increased SLA. P addition (but not N addition) resulted in higher leaf area loss to herbivore consumption and also shifted carbon allocation to root growth. This fertilization experiment with a common rainforest tree species conducted in old-growth forest shows that already moderate doses of added N and P are affecting seedling performance which most likely will have consequences for the competitive strength in the understory and the recruitment success of P. torta. Simultaneous increases in growth, herbivory and mortality rates make it difficult to assess the species' overall performance and predict how a future increase in nutrient deposition will alter the abundance of this species in the Andean tropical montane forests.

  14. Understanding how roadside concentrations of NOx are influenced by the background levels, traffic density, and meteorological conditions using Boosted Regression Trees

    NASA Astrophysics Data System (ADS)

    Sayegh, Arwa; Tate, James E.; Ropkins, Karl

    2016-02-01

    Oxides of Nitrogen (NOx) is a major component of photochemical smog and its constituents are considered principal traffic-related pollutants affecting human health. This study investigates the influence of background concentrations of NOx, traffic density, and prevailing meteorological conditions on roadside concentrations of NOx at UK urban, open motorway, and motorway tunnel sites using the statistical approach Boosted Regression Trees (BRT). BRT models have been fitted using hourly concentration, traffic, and meteorological data for each site. The models predict, rank, and visualise the relationship between model variables and roadside NOx concentrations. A strong relationship between roadside NOx and monitored local background concentrations is demonstrated. Relationships between roadside NOx and other model variables have been shown to be strongly influenced by the quality and resolution of background concentrations of NOx, i.e. if it were based on monitored data or modelled prediction. The paper proposes a direct method of using site-specific fundamental diagrams for splitting traffic data into four traffic states: free-flow, busy-flow, congested, and severely congested. Using BRT models, the density of traffic (vehicles per kilometre) was observed to have a proportional influence on the concentrations of roadside NOx, with different fitted regression line slopes for the different traffic states. When other influences are conditioned out, the relationship between roadside concentrations and ambient air temperature suggests NOx concentrations reach a minimum at around 22 °C with high concentrations at low ambient air temperatures which could be associated to restricted atmospheric dispersion and/or to changes in road traffic exhaust emission characteristics at low ambient air temperatures. This paper uses BRT models to study how different critical factors, and their relative importance, influence the variation of roadside NOx concentrations. The paper

  15. Application of Classification and Regression Tree (CART) analysis on the microflora of minced meat for classification according to Reg. (EC) 2073/2005.

    PubMed

    Paulsen, P; Smulders, F J M; Tichy, A; Aydin, A; Höck, C

    2011-07-01

    In a retrospective study on the microbiology of minced meat from small food businesses supplying directly to the consumer, the relative contribution of meat supplier, meat species and outlet where meat was minced was assessed by "Classification and Regression Tree" (CART) analysis. Samples (n=888) originated from 129 outlets of a single supermarket chain. Sampling units were 4-5 packs (pork, beef, and mixed pork-beef). Total aerobic counts (TACs) were 5.3±1.0 log CFU/g. In 75.6% of samples, E. coli were <1 log CFU/g. The proportion of "unsatisfactory" sample sets [as defined in Reg. (EC) 2073/2005] were 31.3 and 4.5% for TAC and E. coli, respectively. For classification according to TACs, the outlet where meat was minced and the "meat supplier" were the most important predictors. For E. coli, "outlet" was the most important predictor, but the limit of detection of 1 log CFU/g was not discriminative enough to allow further conclusions.

  16. Additives

    NASA Technical Reports Server (NTRS)

    Smalheer, C. V.

    1973-01-01

    The chemistry of lubricant additives is discussed to show what the additives are chemically and what functions they perform in the lubrication of various kinds of equipment. Current theories regarding the mode of action of lubricant additives are presented. The additive groups discussed include the following: (1) detergents and dispersants, (2) corrosion inhibitors, (3) antioxidants, (4) viscosity index improvers, (5) pour point depressants, and (6) antifouling agents.

  17. Quantifying mineral abundances of complex mixtures by coupling spectral deconvolution of SWIR spectra (2.1-2.4 μm) and regression tree analysis

    USGS Publications Warehouse

    Mulder, V.L.; Plotze, Michael; de Bruin, Sytze; Schaepman, Michael E.; Mavris, C.; Kokaly, Raymond F.; Egli, Markus

    2013-01-01

    This paper presents a methodology for assessing mineral abundances of mixtures having more than two constituents using absorption features in the 2.1-2.4 μm wavelength region. In the first step, the absorption behaviour of mineral mixtures is parameterised by exponential Gaussian optimisation. Next, mineral abundances are predicted by regression tree analysis using these parameters as inputs. The approach is demonstrated on a range of prepared samples with known abundances of kaolinite, dioctahedral mica, smectite, calcite and quartz and on a set of field samples from Morocco. The latter contained varying quantities of other minerals, some of which did not have diagnostic absorption features in the 2.1-2.4 μm region. Cross validation showed that the prepared samples of kaolinite, dioctahedral mica, smectite and calcite were predicted with a root mean square error (RMSE) less than 9 wt.%. For the field samples, the RMSE was less than 8 wt.% for calcite, dioctahedral mica and kaolinite abundances. Smectite could not be well predicted, which was attributed to spectral variation of the cations within the dioctahedral layered smectites. Substitution of part of the quartz by chlorite at the prediction phase hardly affected the accuracy of the predicted mineral content; this suggests that the method is robust in handling the omission of minerals during the training phase. The degree of expression of absorption components was different between the field sample and the laboratory mixtures. This demonstrates that the method should be calibrated and trained on local samples. Our method allows the simultaneous quantification of more than two minerals within a complex mixture and thereby enhances the perspectives of spectral analysis for mineral abundances.

  18. Responses of nitrous oxide emissions to nitrogen and phosphorus additions in two tropical plantations with N-fixing vs. non-N-fixing tree species

    NASA Astrophysics Data System (ADS)

    Zhang, W.; Zhu, X.; Luo, Y.; Rafique, R.; Chen, H.; Huang, J.; Mo, J.

    2014-01-01

    Leguminous tree plantations at phosphorus (P) limited sites may result in higher rates of nitrous oxide (N2O) emissions, however, the effects of nitrogen (N) and P applications on soil N2O emissions from plantations with N-fixing vs. non-N-fixing tree species has rarely been studied in the field. We conducted an experimental manipulation of N and P additions in two tropical plantations with Acacia auriculiformis (AA) and Eucalyptus urophylla (EU) tree species in South China. The objective was to determine the effects of N- or P-addition alone, as well as NP application together on soil N2O emissions from tropical plantations with N-fixing vs. non-N-fixing tree species. We found that the average N2O emission from control was greater in AA (2.26 ± 0.06 kg N2O-N ha-1 yr-1) than in EU plantation (1.87 ± 0.05 kg N2O-N ha-1 yr-1). For the AA plantation, N-addition stimulated the N2O emission from soil while P-addition did not. Applications of N with P together significantly decreased N2O emission compared to N-addition alone, especially in high level treatment plots (decreased by 18%). In the EU plantation, N2O emissions significantly decreased in P-addition plots compared with the controls, however, N- and NP-additions did not. The differing response of N2O emissions to N- or P-addition was attributed to the higher initial soil N status in the AA than that of the EU plantation, due to symbiotic N fixation in the former. Our results suggest that atmospheric N deposition potentially stimulates N2O emissions from leguminous tree plantations in the tropics, whereas P fertilization has the potential to mitigate N deposition-induced N2O emissions from such plantations.

  19. Generalized additive mixed models for disentangling long-term trends, local anomalies, and seasonality in fruit tree phenology

    PubMed Central

    Polansky, Leo; Robbins, Martha M

    2013-01-01

    Quantifying temporal patterns of ephemeral plant structures such as leaves, flowers, and fruits gives insight into both plant and animal ecology. Different scales of temporal changes in fruits, for example within- versus across-year variability, are driven by different processes, but are not always easy to disentangle. We apply generalized additive mixed models (GAMMs) to study a long-term fruit presence–absence data set of individual trees collected from a high-altitude Afromontane tropical rain forest site within Bwindi Impenetrable National Park (BINP), Uganda. Our primary aim was to highlight and evaluate GAMM methodology, and quantify both intra- and interannual changes in fruit production. First, we conduct several simulation experiments to study the practical utility of model selection and smooth term estimation relevant for disentangling intra- and interannual variability. These simulations indicate that estimation of nonlinearity and seasonality is generally accurately identified using asymptotic theory. Applied to the empirical data set, we found that the forest-level fruiting variability arises from both regular seasonality and significant interannual variability, with the years 2009–2010 in particular showing a significant increase in the presence of fruits-driven by increased productivity of most species, and a regular annual peak associated occurring at the end of one of the two dry seasons. Our analyses illustrate a statistical framework for disentangling short-term increases/decreases in fruiting effort while pinpointing specific times in which fruiting is atypical, providing a first step for assessing the impacts of regular and irregular (e.g., climate change) abiotic covariates on fruiting phenology. Some consequences of the rich diversity of fruiting patterns observed here for the population biology of frugivores in BINP are also discussed. PMID:24102000

  20. Generalized additive mixed models for disentangling long-term trends, local anomalies, and seasonality in fruit tree phenology.

    PubMed

    Polansky, Leo; Robbins, Martha M

    2013-09-01

    Quantifying temporal patterns of ephemeral plant structures such as leaves, flowers, and fruits gives insight into both plant and animal ecology. Different scales of temporal changes in fruits, for example within- versus across-year variability, are driven by different processes, but are not always easy to disentangle. We apply generalized additive mixed models (GAMMs) to study a long-term fruit presence-absence data set of individual trees collected from a high-altitude Afromontane tropical rain forest site within Bwindi Impenetrable National Park (BINP), Uganda. Our primary aim was to highlight and evaluate GAMM methodology, and quantify both intra- and interannual changes in fruit production. First, we conduct several simulation experiments to study the practical utility of model selection and smooth term estimation relevant for disentangling intra- and interannual variability. These simulations indicate that estimation of nonlinearity and seasonality is generally accurately identified using asymptotic theory. Applied to the empirical data set, we found that the forest-level fruiting variability arises from both regular seasonality and significant interannual variability, with the years 2009-2010 in particular showing a significant increase in the presence of fruits-driven by increased productivity of most species, and a regular annual peak associated occurring at the end of one of the two dry seasons. Our analyses illustrate a statistical framework for disentangling short-term increases/decreases in fruiting effort while pinpointing specific times in which fruiting is atypical, providing a first step for assessing the impacts of regular and irregular (e.g., climate change) abiotic covariates on fruiting phenology. Some consequences of the rich diversity of fruiting patterns observed here for the population biology of frugivores in BINP are also discussed.

  1. Use of generalized additive models and cokriging of spatial residuals to improve land-use regression estimates of nitrogen oxides in Southern California

    PubMed Central

    Li, Lianfa; Wu, Jun; Wilhelm, Michelle; Ritz, Beate

    2012-01-01

    Land-use regression (LUR) models have been developed to estimate spatial distributions of traffic-related pollutants. Several studies have examined spatial autocorrelation among residuals in LUR models, but few utilized spatial residual information in model prediction, or examined the impact of modeling methods, monitoring site selection, or traffic data quality on LUR performance. This study aims to improve spatial models for traffic-related pollutants using generalized additive models (GAM) combined with cokriging of spatial residuals. Specifically, we developed spatial models for nitrogen dioxide (NO2) and nitrogen oxides (NOx) concentrations in Southern California separately for two seasons (summer and winter) based on over 240 sampling locations. Pollutant concentrations were disaggregated into three components: local means, spatial residuals, and normal random residuals. Local means were modeled by GAM. Spatial residuals were cokriged with global residuals at nearby sampling locations that were spatially auto-correlated. We compared this two-stage approach with four commonly-used spatial models: universal kriging, multiple linear LUR and GAM with and without a spatial smoothing term. Leave-one-out cross validation was conducted for model validation and comparison purposes. The results show that our GAM plus cokriging models predicted summer and winter NO2 and NOx concentration surfaces well, with cross validation R2 values ranging from 0.88 to 0.92. While local covariates accounted for partial variance of the measured NO2 and NOx concentrations, spatial autocorrelation accounted for about 20% of the variance. Our spatial GAM model improved R2 considerably compared to the other four approaches. Conclusively, our two-stage model captured summer and winter differences in NO2 and NOx spatial distributions in Southern California well. When sampling location selection cannot be optimized for the intended model and fewer covariates are available as predictors for

  2. Growth enhancement of Picea abies trees under long-term, low-dose N addition is due to morphological more than to physiological changes.

    PubMed

    Krause, Kim; Cherubini, Paolo; Bugmann, Harald; Schleppi, Patrick

    2012-12-01

    Human activities have drastically increased nitrogen (N) inputs into natural and near-natural terrestrial ecosystems such that critical loads are now being exceeded in many regions of the world. This implies that these ecosystems are shifting from natural N limitation to eutrophication or even N saturation. This process is expected to modify the growth of forests and thus, along with management, to affect their carbon (C) sequestration. However, knowledge of the physiological mechanisms underlying tree response to N inputs, especially in the long term, is still lacking. In this study, we used tree-ring patterns and a dual stable isotope approach (δ(13)C and δ(18)O) to investigate tree growth responses and the underlying physiological reactions in a long-term, low-dose N addition experiment (+23 kg N ha(-1) a(-1)). This experiment has been conducted for 14 years in a mountain Picea abies (L.) Karst. forest in Alptal, Switzerland, using a paired-catchment design. Tree stem C sequestration increased by ∼22%, with an N use efficiency (NUE) of ca. 8 kg additional C in tree stems per kg of N added. Neither earlywood nor latewood δ(13)C values changed significantly compared with the control, indicating that the intrinsic water use efficiency (WUE(i)) (A/g(s)) did not change due to N addition. Further, the isotopic signal of δ(18)O in early- and latewood showed no significant response to the treatment, indicating that neither stomatal conductance nor leaf-level photosynthesis changed significantly. Foliar analyses showed that needle N concentration significantly increased in the fourth to seventh treatment year, accompanied by increased dry mass and area per needle, and by increased tree height growth. Later, N concentration and height growth returned to nearly background values, while dry mass and area per needle remained high. Our results support the hypothesis that enhanced stem growth caused by N addition is mainly due to an increased leaf area index (LAI

  3. Application of the deletion/substitution/addition algorithm to selecting land use regression models for interpolating air pollution measurements in California

    NASA Astrophysics Data System (ADS)

    Beckerman, Bernardo S.; Jerrett, Michael; Martin, Randall V.; van Donkelaar, Aaron; Ross, Zev; Burnett, Richard T.

    2013-10-01

    Land use regression (LUR) models are widely employed in health studies to characterize chronic exposure to air pollution. The LUR is essentially an interpolation technique that employs the pollutant of interest as the dependent variable with proximate land use, traffic, and physical environmental variables used as independent predictors. Two major limitations with this method have not been addressed: (1) variable selection in the model building process, and (2) dealing with unbalanced repeated measures. In this paper, we address these issues with a modeling framework that implements the deletion/substitution/addition (DSA) machine learning algorithm that uses a generalized linear model to average over unbalanced temporal observations. Models were derived for fine particulate matter with aerodynamic diameter of 2.5 microns or less (PM2.5) and nitrogen dioxide (NO2) using monthly observations. We used 4119 observations at 108 sites and 15,301 observations at 138 sites for PM2.5 and NO2, respectively. We derived models with good predictive capacity (cross-validated-R2 values were 0.65 and 0.71 for PM2.5 and NO2, respectively). By addressing these two shortcomings in current approaches to LUR modeling, we have developed a framework that minimizes arbitrary decisions during the model selection process. We have also demonstrated how to integrate temporally unbalanced data in a theoretically sound manner. These developments could have widespread applicability for future LUR modeling efforts.

  4. Responses of nitrous oxide emissions to nitrogen and phosphorus additions in two tropical plantations with N-fixing vs. non-N-fixing tree species

    NASA Astrophysics Data System (ADS)

    Zhang, W.; Zhu, X.; Luo, Y.; Rafique, R.; Chen, H.; Huang, J.; Mo, J.

    2014-09-01

    Leguminous tree plantations at phosphorus (P) limited sites may result in excess nitrogen (N) and higher rates of nitrous oxide (N2O) emissions. However, the effects of N and P applications on soil N2O emissions from plantations with N-fixing vs. non-N-fixing tree species have rarely been studied in the field. We conducted an experimental manipulation of N and/or P additions in two plantations with Acacia auriculiformis (AA, N-fixing) and Eucalyptus urophylla (EU, non-N-fixing) in South China. The objective was to determine the effects of N or P addition alone, as well as NP application together on soil N2O emissions from these tropical plantations. We found that the average N2O emission from control was greater in the AA (2.3 ± 0.1 kg N2O-N ha-1 yr-1) than in EU plantation (1.9 ± 0.1 kg N2O-N ha-1 yr-1). For the AA plantation, N addition stimulated N2O emission from the soil while P addition did not. Applications of N with P together significantly decreased N2O emission compared to N addition alone, especially in the high-level treatments (decreased by 18%). In the EU plantation, N2O emissions significantly decreased in P-addition plots compared with the controls; however, N and NP additions did not. The different response of N2O emission to N or P addition was attributed to the higher initial soil N status in the AA than that of EU plantation, due to symbiotic N fixation in the former. Our result suggests that atmospheric N deposition potentially stimulates N2O emissions from leguminous tree plantations in the tropics, whereas P fertilization has the potential to mitigate N-deposition-induced N2O emissions from such plantations.

  5. The integration of geophysical and enhanced Moderate Resolution Imaging Spectroradiometer Normalized Difference Vegetation Index data into a rule-based, piecewise regression-tree model to estimate cheatgrass beginning of spring growth

    USGS Publications Warehouse

    Boyte, Stephen P.; Wylie, Bruce K.; Major, Donald J.; Brown, Jesslyn F.

    2015-01-01

    Cheatgrass exhibits spatial and temporal phenological variability across the Great Basin as described by ecological models formed using remote sensing and other spatial data-sets. We developed a rule-based, piecewise regression-tree model trained on 99 points that used three data-sets – latitude, elevation, and start of season time based on remote sensing input data – to estimate cheatgrass beginning of spring growth (BOSG) in the northern Great Basin. The model was then applied to map the location and timing of cheatgrass spring growth for the entire area. The model was strong (R2 = 0.85) and predicted an average cheatgrass BOSG across the study area of 29 March–4 April. Of early cheatgrass BOSG areas, 65% occurred at elevations below 1452 m. The highest proportion of cheatgrass BOSG occurred between mid-April and late May. Predicted cheatgrass BOSG in this study matched well with previous Great Basin cheatgrass green-up studies.

  6. Photosynthetic and growth response of sugar maple (Acer saccharum Marsh.) mature trees and seedlings to calcium, magnesium, and nitrogen additions in the Catskill Mountains, NY, USA

    USGS Publications Warehouse

    Momen, Bahram; Behling, Shawna J; Lawrence, Gregory B.; Sullivan, Joseph H

    2015-01-01

    Decline of sugar maple in North American forests has been attributed to changes in soil calcium (Ca) and nitrogen (N) by acidic precipitation. Although N is an essential and usually a limiting factor in forests, atmospheric N deposition may cause N-saturation leading to loss of soil Ca. Such changes can affect carbon gain and growth of sugar maple trees and seedlings. We applied a 22 factorial arrangement of N and dolomitic limestone containing Ca and Magnesium (Mg) to 12 forest plots in the Catskill Mountain region of NY, USA. To quantify the short-term effects, we measured photosynthetic-light responses of sugar maple mature trees and seedlings two or three times during two summers. We estimated maximum net photosynthesis (An-max) and its related light intensity (PAR at An-max), apparent quantum efficiency (Aqe), and light compensation point (LCP). To quantify the long-term effects, we measured basal area of living mature trees before and 4 and 8 years after treatment applications. Soil and foliar chemistry variables were also measured. Dolomitic limestone increased Ca, Mg, and pH in the soil Oe horizon. Mg was increased in the B horizon when comparing the plots receiving N with those receiving CaMg. In mature trees, foliar Ca and Mg concentrations were higher in the CaMg and N+CaMg plots than in the reference or N plots; foliar Ca concentration was higher in the N+CaMg plots compared with the CaMg plots, foliar Mg was higher in the CaMg plots than the N+CaMg plots; An-max was maximized due to N+CaMg treatment; Aqe decreased by N addition; and PAR at An-max increased by N or CaMg treatments alone, but the increase was maximized by their combination. No treatment effect was detected on basal areas of living mature trees four or eight years after treatment applications. In seedlings, An-max was increased by N+CaMg addition. The reference plots had an open herbaceous layer, but the plots receiving N had a dense monoculture of common woodfern in the

  7. Photosynthetic and Growth Response of Sugar Maple (Acer saccharum Marsh.) Mature Trees and Seedlings to Calcium, Magnesium, and Nitrogen Additions in the Catskill Mountains, NY, USA

    PubMed Central

    Momen, Bahram; Behling, Shawna J.; Lawrence, Greg B.; Sullivan, Joseph H.

    2015-01-01

    Decline of sugar maple in North American forests has been attributed to changes in soil calcium (Ca) and nitrogen (N) by acidic precipitation. Although N is an essential and usually a limiting factor in forests, atmospheric N deposition may cause N-saturation leading to loss of soil Ca. Such changes can affect carbon gain and growth of sugar maple trees and seedlings. We applied a 22 factorial arrangement of N and dolomitic limestone containing Ca and Magnesium (Mg) to 12 forest plots in the Catskill Mountain region of NY, USA. To quantify the short-term effects, we measured photosynthetic-light responses of sugar maple mature trees and seedlings two or three times during two summers. We estimated maximum net photosynthesis (An-max) and its related light intensity (PAR at An-max), apparent quantum efficiency (Aqe), and light compensation point (LCP). To quantify the long-term effects, we measured basal area of living mature trees before and 4 and 8 years after treatment applications. Soil and foliar chemistry variables were also measured. Dolomitic limestone increased Ca, Mg, and pH in the soil Oe horizon. Mg was increased in the B horizon when comparing the plots receiving N with those receiving CaMg. In mature trees, foliar Ca and Mg concentrations were higher in the CaMg and N+CaMg plots than in the reference or N plots; foliar Ca concentration was higher in the N+CaMg plots compared with the CaMg plots, foliar Mg was higher in the CaMg plots than the N+CaMg plots; An-max was maximized due to N+CaMg treatment; Aqe decreased by N addition; and PAR at An-max increased by N or CaMg treatments alone, but the increase was maximized by their combination. No treatment effect was detected on basal areas of living mature trees four or eight years after treatment applications. In seedlings, An-max was increased by N+CaMg addition. The reference plots had an open herbaceous layer, but the plots receiving N had a dense monoculture of common woodfern in the forest floor

  8. Photosynthetic and Growth Response of Sugar Maple (Acer saccharum Marsh.) Mature Trees and Seedlings to Calcium, Magnesium, and Nitrogen Additions in the Catskill Mountains, NY, USA.

    PubMed

    Momen, Bahram; Behling, Shawna J; Lawrence, Greg B; Sullivan, Joseph H

    2015-01-01

    Decline of sugar maple in North American forests has been attributed to changes in soil calcium (Ca) and nitrogen (N) by acidic precipitation. Although N is an essential and usually a limiting factor in forests, atmospheric N deposition may cause N-saturation leading to loss of soil Ca. Such changes can affect carbon gain and growth of sugar maple trees and seedlings. We applied a 22 factorial arrangement of N and dolomitic limestone containing Ca and Magnesium (Mg) to 12 forest plots in the Catskill Mountain region of NY, USA. To quantify the short-term effects, we measured photosynthetic-light responses of sugar maple mature trees and seedlings two or three times during two summers. We estimated maximum net photosynthesis (An-max) and its related light intensity (PAR at An-max), apparent quantum efficiency (Aqe), and light compensation point (LCP). To quantify the long-term effects, we measured basal area of living mature trees before and 4 and 8 years after treatment applications. Soil and foliar chemistry variables were also measured. Dolomitic limestone increased Ca, Mg, and pH in the soil Oe horizon. Mg was increased in the B horizon when comparing the plots receiving N with those receiving CaMg. In mature trees, foliar Ca and Mg concentrations were higher in the CaMg and N+CaMg plots than in the reference or N plots; foliar Ca concentration was higher in the N+CaMg plots compared with the CaMg plots, foliar Mg was higher in the CaMg plots than the N+CaMg plots; An-max was maximized due to N+CaMg treatment; Aqe decreased by N addition; and PAR at An-max increased by N or CaMg treatments alone, but the increase was maximized by their combination. No treatment effect was detected on basal areas of living mature trees four or eight years after treatment applications. In seedlings, An-max was increased by N+CaMg addition. The reference plots had an open herbaceous layer, but the plots receiving N had a dense monoculture of common woodfern in the forest floor

  9. Photosynthetic and Growth Response of Sugar Maple (Acer saccharum Marsh.) Mature Trees and Seedlings to Calcium, Magnesium, and Nitrogen Additions in the Catskill Mountains, NY, USA.

    PubMed

    Momen, Bahram; Behling, Shawna J; Lawrence, Greg B; Sullivan, Joseph H

    2015-01-01

    Decline of sugar maple in North American forests has been attributed to changes in soil calcium (Ca) and nitrogen (N) by acidic precipitation. Although N is an essential and usually a limiting factor in forests, atmospheric N deposition may cause N-saturation leading to loss of soil Ca. Such changes can affect carbon gain and growth of sugar maple trees and seedlings. We applied a 22 factorial arrangement of N and dolomitic limestone containing Ca and Magnesium (Mg) to 12 forest plots in the Catskill Mountain region of NY, USA. To quantify the short-term effects, we measured photosynthetic-light responses of sugar maple mature trees and seedlings two or three times during two summers. We estimated maximum net photosynthesis (An-max) and its related light intensity (PAR at An-max), apparent quantum efficiency (Aqe), and light compensation point (LCP). To quantify the long-term effects, we measured basal area of living mature trees before and 4 and 8 years after treatment applications. Soil and foliar chemistry variables were also measured. Dolomitic limestone increased Ca, Mg, and pH in the soil Oe horizon. Mg was increased in the B horizon when comparing the plots receiving N with those receiving CaMg. In mature trees, foliar Ca and Mg concentrations were higher in the CaMg and N+CaMg plots than in the reference or N plots; foliar Ca concentration was higher in the N+CaMg plots compared with the CaMg plots, foliar Mg was higher in the CaMg plots than the N+CaMg plots; An-max was maximized due to N+CaMg treatment; Aqe decreased by N addition; and PAR at An-max increased by N or CaMg treatments alone, but the increase was maximized by their combination. No treatment effect was detected on basal areas of living mature trees four or eight years after treatment applications. In seedlings, An-max was increased by N+CaMg addition. The reference plots had an open herbaceous layer, but the plots receiving N had a dense monoculture of common woodfern in the forest floor

  10. Autistic Regression

    ERIC Educational Resources Information Center

    Matson, Johnny L.; Kozlowski, Alison M.

    2010-01-01

    Autistic regression is one of the many mysteries in the developmental course of autism and pervasive developmental disorders not otherwise specified (PDD-NOS). Various definitions of this phenomenon have been used, further clouding the study of the topic. Despite this problem, some efforts at establishing prevalence have been made. The purpose of…

  11. Effect of fat additions to diets of dairy cattle on milk production and components: a meta-analysis and meta-regression.

    PubMed

    Rabiee, A R; Breinhild, K; Scott, W; Golder, H M; Block, E; Lean, I J

    2012-06-01

    The objectives of this study were to critically review randomized controlled trials, and quantify, using meta-analysis and meta-regression, the effects of supplementation with fats on milk production and components by dairy cows. We reviewed 59 papers, of which 38 (containing 86 comparisons) met eligibility criteria. Five groups of fats were evaluated: tallows, calcium salts of palm fat (Megalac, Church and Dwight Co. Inc., Princeton, NJ), oilseeds, prilled fat, and other calcium salts. Milk production responses to fats were significant, and the estimated mean difference was 1.05 kg/cow per day, but results were heterogeneous. Milk yield increased with increased difference in dry matter intake (DMI) between treatment and control groups, decreased with predicted metabolizable energy (ME) balance between these groups, and decreased with increased difference in soluble protein percentage of the diet between groups. Decreases in DMI were significant for Megalac, oilseeds, and other Ca salts, and approached significance for tallow. Feeding fat for a longer period increased DMI, as did greater differences in the amount of soluble protein percentage of the diet between control and treatment diets. Tallow, oilseeds, and other Ca salts reduced, whereas Megalac increased, milk fat percentage. Milk fat percentage effects were heterogeneous for fat source. Differences between treatment and control groups in duodenal concentrations of C18:2 and C 18:0 fatty acids and Mg percentage reduced the milk fat percentage standardized mean difference. Milk fat yield responses to fat treatments were very variable. The other Ca salts substantially decrease, and the Megalac and oilseeds increased, fat yield. Fat yield increased with increased DMI difference between groups and was lower with an increased estimated ME balance between treatment and control groups, indicating increased partitioning of fat to body tissue reserves. Feeding fats decreased milk protein percentage, but results were

  12. Robust Regression.

    PubMed

    Huang, Dong; Cabral, Ricardo; De la Torre, Fernando

    2016-02-01

    Discriminative methods (e.g., kernel regression, SVM) have been extensively used to solve problems such as object recognition, image alignment and pose estimation from images. These methods typically map image features ( X) to continuous (e.g., pose) or discrete (e.g., object category) values. A major drawback of existing discriminative methods is that samples are directly projected onto a subspace and hence fail to account for outliers common in realistic training sets due to occlusion, specular reflections or noise. It is important to notice that existing discriminative approaches assume the input variables X to be noise free. Thus, discriminative methods experience significant performance degradation when gross outliers are present. Despite its obvious importance, the problem of robust discriminative learning has been relatively unexplored in computer vision. This paper develops the theory of robust regression (RR) and presents an effective convex approach that uses recent advances on rank minimization. The framework applies to a variety of problems in computer vision including robust linear discriminant analysis, regression with missing data, and multi-label classification. Several synthetic and real examples with applications to head pose estimation from images, image and video classification and facial attribute classification with missing data are used to illustrate the benefits of RR. PMID:26761740

  13. Morse–Smale Regression

    SciTech Connect

    Gerber, Samuel; Rubel, Oliver; Bremer, Peer -Timo; Pascucci, Valerio; Whitaker, Ross T.

    2012-01-19

    This paper introduces a novel partition-based regression approach that incorporates topological information. Partition-based regression typically introduces a quality-of-fit-driven decomposition of the domain. The emphasis in this work is on a topologically meaningful segmentation. Thus, the proposed regression approach is based on a segmentation induced by a discrete approximation of the Morse–Smale complex. This yields a segmentation with partitions corresponding to regions of the function with a single minimum and maximum that are often well approximated by a linear model. This approach yields regression models that are amenable to interpretation and have good predictive capacity. Typically, regression estimates are quantified by their geometrical accuracy. For the proposed regression, an important aspect is the quality of the segmentation itself. Thus, this article introduces a new criterion that measures the topological accuracy of the estimate. The topological accuracy provides a complementary measure to the classical geometrical error measures and is very sensitive to overfitting. The Morse–Smale regression is compared to state-of-the-art approaches in terms of geometry and topology and yields comparable or improved fits in many cases. Finally, a detailed study on climate-simulation data demonstrates the application of the Morse–Smale regression. Supplementary Materials are available online and contain an implementation of the proposed approach in the R package msr, an analysis and simulations on the stability of the Morse–Smale complex approximation, and additional tables for the climate-simulation study.

  14. Under which conditions, additional monitoring data are worth gathering for improving decision making? Application of the VOI theory in the Bayesian Event Tree eruption forecasting framework

    NASA Astrophysics Data System (ADS)

    Loschetter, Annick; Rohmer, Jérémy

    2016-04-01

    Standard and new generation of monitoring observations provide in almost real-time important information about the evolution of the volcanic system. These observations are used to update the model and contribute to a better hazard assessment and to support decision making concerning potential evacuation. The framework BET_EF (based on Bayesian Event Tree) developed by INGV enables dealing with the integration of information from monitoring with the prospect of decision making. Using this framework, the objectives of the present work are i. to propose a method to assess the added value of information (within the Value Of Information (VOI) theory) from monitoring; ii. to perform sensitivity analysis on the different parameters that influence the VOI from monitoring. VOI consists in assessing the possible increase in expected value provided by gathering information, for instance through monitoring. Basically, the VOI is the difference between the value with information and the value without additional information in a Cost-Benefit approach. This theory is well suited to deal with situations that can be represented in the form of a decision tree such as the BET_EF tool. Reference values and ranges of variation (for sensitivity analysis) were defined for input parameters, based on data from the MESIMEX exercise (performed at Vesuvio volcano in 2006). Complementary methods for sensitivity analyses were implemented: local, global using Sobol' indices and regional using Contribution to Sample Mean and Variance plots. The results (specific to the case considered) obtained with the different techniques are in good agreement and enable answering the following questions: i. Which characteristics of monitoring are important for early warning (reliability)? ii. How do experts' opinions influence the hazard assessment and thus the decision? Concerning the characteristics of monitoring, the more influent parameters are the means rather than the variances for the case considered

  15. Fault-Tree Compiler

    NASA Technical Reports Server (NTRS)

    Butler, Ricky W.; Boerschlein, David P.

    1993-01-01

    Fault-Tree Compiler (FTC) program, is software tool used to calculate probability of top event in fault tree. Gates of five different types allowed in fault tree: AND, OR, EXCLUSIVE OR, INVERT, and M OF N. High-level input language easy to understand and use. In addition, program supports hierarchical fault-tree definition feature, which simplifies tree-description process and reduces execution time. Set of programs created forming basis for reliability-analysis workstation: SURE, ASSIST, PAWS/STEM, and FTC fault-tree tool (LAR-14586). Written in PASCAL, ANSI-compliant C language, and FORTRAN 77. Other versions available upon request.

  16. Tree based machine learning framework for predicting ground state energies of molecules

    NASA Astrophysics Data System (ADS)

    Himmetoglu, Burak

    2016-10-01

    We present an application of the boosted regression tree algorithm for predicting ground state energies of molecules made up of C, H, N, O, P, and S (CHNOPS). The PubChem chemical compound database has been incorporated to construct a dataset of 16 242 molecules, whose electronic ground state energies have been computed using density functional theory. This dataset is used to train the boosted regression tree algorithm, which allows a computationally efficient and accurate prediction of molecular ground state energies. Predictions from boosted regression trees are compared with neural network regression, a widely used method in the literature, and shown to be more accurate with significantly reduced computational cost. The performance of the regression model trained using the CHNOPS set is also tested on a set of distinct molecules that contain additional Cl and Si atoms. It is shown that the learning algorithms lead to a rich and diverse possibility of applications in molecular discovery and materials informatics.

  17. Data mining methods in the prediction of Dementia: A real-data comparison of the accuracy, sensitivity and specificity of linear discriminant analysis, logistic regression, neural networks, support vector machines, classification trees and random forests

    PubMed Central

    2011-01-01

    Background Dementia and cognitive impairment associated with aging are a major medical and social concern. Neuropsychological testing is a key element in the diagnostic procedures of Mild Cognitive Impairment (MCI), but has presently a limited value in the prediction of progression to dementia. We advance the hypothesis that newer statistical classification methods derived from data mining and machine learning methods like Neural Networks, Support Vector Machines and Random Forests can improve accuracy, sensitivity and specificity of predictions obtained from neuropsychological testing. Seven non parametric classifiers derived from data mining methods (Multilayer Perceptrons Neural Networks, Radial Basis Function Neural Networks, Support Vector Machines, CART, CHAID and QUEST Classification Trees and Random Forests) were compared to three traditional classifiers (Linear Discriminant Analysis, Quadratic Discriminant Analysis and Logistic Regression) in terms of overall classification accuracy, specificity, sensitivity, Area under the ROC curve and Press'Q. Model predictors were 10 neuropsychological tests currently used in the diagnosis of dementia. Statistical distributions of classification parameters obtained from a 5-fold cross-validation were compared using the Friedman's nonparametric test. Results Press' Q test showed that all classifiers performed better than chance alone (p < 0.05). Support Vector Machines showed the larger overall classification accuracy (Median (Me) = 0.76) an area under the ROC (Me = 0.90). However this method showed high specificity (Me = 1.0) but low sensitivity (Me = 0.3). Random Forest ranked second in overall accuracy (Me = 0.73) with high area under the ROC (Me = 0.73) specificity (Me = 0.73) and sensitivity (Me = 0.64). Linear Discriminant Analysis also showed acceptable overall accuracy (Me = 0.66), with acceptable area under the ROC (Me = 0.72) specificity (Me = 0.66) and sensitivity (Me = 0.64). The remaining classifiers showed

  18. The effectiveness of selected feed and water additives for reducing Salmonella spp. of public health importance in broiler chickens: a systematic review, meta-analysis, and meta-regression approach.

    PubMed

    Totton, Sarah C; Farrar, Ashley M; Wilkins, Wendy; Bucher, Oliver; Waddell, Lisa A; Wilhelm, Barbara J; McEwen, Scott A; Rajić, Andrijana

    2012-10-01

    Eating inappropriately prepared poultry meat is a major cause of foodborne salmonellosis. Our objectives were to determine the efficacy of feed and water additives (other than competitive exclusion and antimicrobials) on reducing Salmonella prevalence or concentration in broiler chickens using systematic review-meta-analysis and to explore sources of heterogeneity found in the meta-analysis through meta-regression. Six electronic databases were searched (Current Contents (1999-2009), Agricola (1924-2009), MEDLINE (1860-2009), Scopus (1960-2009), Centre for Agricultural Bioscience (CAB) (1913-2009), and CAB Global Health (1971-2009)), five topic experts were contacted, and the bibliographies of review articles and a topic-relevant textbook were manually searched to identify all relevant research. Study inclusion criteria comprised: English-language primary research investigating the effects of feed and water additives on the Salmonella prevalence or concentration in broiler chickens. Data extraction and study methodological assessment were conducted by two reviewers independently using pretested forms. Seventy challenge studies (n=910 unique treatment-control comparisons), seven controlled studies (n=154), and one quasi-experiment (n=1) met the inclusion criteria. Compared to an assumed control group prevalence of 44 of 1000 broilers, random-effects meta-analysis indicated that the Salmonella cecal colonization in groups with prebiotics (fructooligosaccharide, lactose, whey, dried milk, lactulose, lactosucrose, sucrose, maltose, mannanoligosaccharide) added to feed or water was 15 out of 1000 broilers; with lactose added to feed or water it was 10 out of 1000 broilers; with experimental chlorate product (ECP) added to feed or water it was 21 out of 1000. For ECP the concentration of Salmonella in the ceca was decreased by 0.61 log(10)cfu/g in the treated group compared to the control group. Significant heterogeneity (Cochran's Q-statistic p≤0.10) was observed

  19. The Regression Trunk Approach to Discover Treatment Covariate Interaction

    ERIC Educational Resources Information Center

    Dusseldorp, Elise; Meulman, Jacqueline J.

    2004-01-01

    The regression trunk approach (RTA) is an integration of regression trees and multiple linear regression analysis. In this paper RTA is used to discover treatment covariate interactions, in the regression of one continuous variable on a treatment variable with "multiple" covariates. The performance of RTA is compared to the classical method of…

  20. DIF Trees: Using Classification Trees to Detect Differential Item Functioning

    ERIC Educational Resources Information Center

    Vaughn, Brandon K.; Wang, Qiu

    2010-01-01

    A nonparametric tree classification procedure is used to detect differential item functioning for items that are dichotomously scored. Classification trees are shown to be an alternative procedure to detect differential item functioning other than the use of traditional Mantel-Haenszel and logistic regression analysis. A nonparametric…

  1. Atlas of relations between climatic parameters and distributions of important trees and shrubs in North America; additional conifers, hardwoods, and monocots

    USGS Publications Warehouse

    Thompson, Robert S.; Anderson, Katherine H.; Bartlein, Patrick J.; Smith, Sharon A.

    2000-01-01

    This volume explores the continental-scale relations between climate and the geographic ranges of woody plant species in North America. A 25-km equal-area grid of modern climatic and bioclimatic parameters for North America was constructed from instrumental weather records. The geographic distributions of selected tree and shrub species were digitized, and the presence or absence of each species was determined for each cell on the 25-km grid, thus providing a basis for comparing climatic data and species' distribution.

  2. Changes in the delta13C values of trees during a tropical rainy season: some effects in addition to diffusion and carboxylation by Rubisco?

    PubMed

    Terwilliger, V

    1997-12-01

    The d13C values of deciduous and evergreen tree leaves were compared in open and closed- canopy environments throughout a rainy season in Panamá. Newly emerging leaves had higher d13C values than older leaves of all seedlings and trees at all dates sampled. This was apparently not caused by a decline in water use efficiency as leaves develop because instantaneous ci/ca was significantly higher in newly emerging than in expanded leaves on the same twigs of trees in the field as well as on seedlings growing in a controlled, unchanging environment. Higher d13C values in newly emerging leaves occurred across diverse environmental comparisons. For example, leaves emerging during the rainy season had higher d13C values than corresponding mature leaves that had emerged both during the dry season and when water was abundant. The early enrichment in 13C may thus reflect the translocation of carbon to initiate a new leaf. Furthermore, the lack of sensitivity of this enrichment to a microclimate suggests that it might be the result of processes that occur after carbon fixation by Rubisco. Other changes in d13C values as leaves developed may also have resulted from carbon translocation processes. Foliar d13C decreased significantly after most of the leaf biomass of the deciduous Apeiba membranacea had developed. The d13C values of the evergreen Cecropia insignis were lower in the open canopy than in closed-canopy forests at the end of the rainy season. These findings suggest that the d13C values of leaves can yield ecological information about the allocation of carbon within trees.

  3. Regression: A Bibliography.

    ERIC Educational Resources Information Center

    Pedrini, D. T.; Pedrini, Bonnie C.

    Regression, another mechanism studied by Sigmund Freud, has had much research, e.g., hypnotic regression, frustration regression, schizophrenic regression, and infra-human-animal regression (often directly related to fixation). Many investigators worked with hypnotic age regression, which has a long history, going back to Russian reflexologists.…

  4. Our Air: Unfit for Trees.

    ERIC Educational Resources Information Center

    Dochinger, Leon S.

    To help urban, suburban, and rural tree owners know about air pollution's effects on trees and their tolerance and intolerance to pollutants, the USDA Forest Service has prepared this booklet. It answers the following questions about atmospheric pollution: Where does it come from? What can it do to trees? and What can we do about it? In addition,…

  5. Wrong Signs in Regression Coefficients

    NASA Technical Reports Server (NTRS)

    McGee, Holly

    1999-01-01

    When using parametric cost estimation, it is important to note the possibility of the regression coefficients having the wrong sign. A wrong sign is defined as a sign on the regression coefficient opposite to the researcher's intuition and experience. Some possible causes for the wrong sign discussed in this paper are a small range of x's, leverage points, missing variables, multicollinearity, and computational error. Additionally, techniques for determining the cause of the wrong sign are given.

  6. Tree Lifecycle.

    ERIC Educational Resources Information Center

    Nature Study, 1998

    1998-01-01

    Presents a Project Learning Tree (PLT) activity that has students investigate and compare the lifecycle of a tree to other living things and the tree's role in the ecosystem. Includes background material as well as step-by-step instructions, variation and enrichment ideas, assessment opportunities, and student worksheets. (SJR)

  7. Unitary Response Regression Models

    ERIC Educational Resources Information Center

    Lipovetsky, S.

    2007-01-01

    The dependent variable in a regular linear regression is a numerical variable, and in a logistic regression it is a binary or categorical variable. In these models the dependent variable has varying values. However, there are problems yielding an identity output of a constant value which can also be modelled in a linear or logistic regression with…

  8. NCCS Regression Test Harness

    SciTech Connect

    Tharrington, Arnold N.

    2015-09-09

    The NCCS Regression Test Harness is a software package that provides a framework to perform regression and acceptance testing on NCCS High Performance Computers. The package is written in Python and has only the dependency of a Subversion repository to store the regression tests.

  9. Fully Regressive Melanoma

    PubMed Central

    Ehrsam, Eric; Kallini, Joseph R.; Lebas, Damien; Modiano, Philippe; Cotten, Hervé

    2016-01-01

    Fully regressive melanoma is a phenomenon in which the primary cutaneous melanoma becomes completely replaced by fibrotic components as a result of host immune response. Although 10 to 35 percent of cases of cutaneous melanomas may partially regress, fully regressive melanoma is very rare; only 47 cases have been reported in the literature to date. AH of the cases of fully regressive melanoma reported in the literature were diagnosed in conjunction with metastasis on a patient. The authors describe a case of fully regressive melanoma without any metastases at the time of its diagnosis. Characteristic findings on dermoscopy, as well as the absence of melanoma on final biopsy, confirmed the diagnosis.

  10. Fully Regressive Melanoma

    PubMed Central

    Ehrsam, Eric; Kallini, Joseph R.; Lebas, Damien; Modiano, Philippe; Cotten, Hervé

    2016-01-01

    Fully regressive melanoma is a phenomenon in which the primary cutaneous melanoma becomes completely replaced by fibrotic components as a result of host immune response. Although 10 to 35 percent of cases of cutaneous melanomas may partially regress, fully regressive melanoma is very rare; only 47 cases have been reported in the literature to date. AH of the cases of fully regressive melanoma reported in the literature were diagnosed in conjunction with metastasis on a patient. The authors describe a case of fully regressive melanoma without any metastases at the time of its diagnosis. Characteristic findings on dermoscopy, as well as the absence of melanoma on final biopsy, confirmed the diagnosis. PMID:27672418

  11. Assessing visual green effects of individual urban trees using airborne Lidar data.

    PubMed

    Chen, Ziyue; Xu, Bing; Gao, Bingbo

    2015-12-01

    Urban trees benefit people's daily life in terms of air quality, local climate, recreation and aesthetics. Among these functions, a growing number of studies have been conducted to understand the relationship between residents' preference towards local environments and visual green effects of urban greenery. However, except for on-site photography, there are few quantitative methods to calculate green visibility, especially tree green visibility, from viewers' perspectives. To fill this research gap, a case study was conducted in the city of Cambridge, which has a diversity of tree species, sizes and shapes. Firstly, a photograph-based survey was conducted to approximate the actual value of visual green effects of individual urban trees. In addition, small footprint airborne Lidar (Light detection and ranging) data was employed to measure the size and shape of individual trees. Next, correlations between visual tree green effects and tree structural parameters were examined. Through experiments and gradual refinement, a regression model with satisfactory R2 and limited large errors is proposed. Considering the diversity of sample trees and the result of cross-validation, this model has the potential to be applied to other study sites. This research provides urban planners and decision makers with an innovative method to analyse and evaluate landscape patterns in terms of tree greenness.

  12. Tree Amigos.

    ERIC Educational Resources Information Center

    Center for Environmental Study, Grand Rapids, MI.

    Tree Amigos is a special cross-cultural program that uses trees as a common bond to bring the people of the Americas together in unique partnerships to preserve and protect the shared global environment. It is a tangible program that embodies the philosophy that individuals, acting together, can make a difference. This resource book contains…

  13. Talking Trees

    ERIC Educational Resources Information Center

    Tolman, Marvin

    2005-01-01

    Students love outdoor activities and will love them even more when they build confidence in their tree identification and measurement skills. Through these activities, students will learn to identify the major characteristics of trees and discover how the pace--a nonstandard measuring unit--can be used to estimate not only distances but also the…

  14. Phylogenetic trees in bioinformatics

    SciTech Connect

    Burr, Tom L

    2008-01-01

    Genetic data is often used to infer evolutionary relationships among a collection of viruses, bacteria, animal or plant species, or other operational taxonomic units (OTU). A phylogenetic tree depicts such relationships and provides a visual representation of the estimated branching order of the OTUs. Tree estimation is unique for several reasons, including: the types of data used to represent each OTU; the use ofprobabilistic nucleotide substitution models; the inference goals involving both tree topology and branch length, and the huge number of possible trees for a given sample of a very modest number of OTUs, which implies that fmding the best tree(s) to describe the genetic data for each OTU is computationally demanding. Bioinformatics is too large a field to review here. We focus on that aspect of bioinformatics that includes study of similarities in genetic data from multiple OTUs. Although research questions are diverse, a common underlying challenge is to estimate the evolutionary history of the OTUs. Therefore, this paper reviews the role of phylogenetic tree estimation in bioinformatics, available methods and software, and identifies areas for additional research and development.

  15. Bayesian Evidence Framework for Decision Tree Learning

    NASA Astrophysics Data System (ADS)

    Chatpatanasiri, Ratthachat; Kijsirikul, Boonserm

    2005-11-01

    This work is primary interested in the problem of, given the observed data, selecting a single decision (or classification) tree. Although a single decision tree has a high risk to be overfitted, the induced tree is easily interpreted. Researchers have invented various methods such as tree pruning or tree averaging for preventing the induced tree from overfitting (and from underfitting) the data. In this paper, instead of using those conventional approaches, we apply the Bayesian evidence framework of Gull, Skilling and Mackay to a process of selecting a decision tree. We derive a formal function to measure `the fitness' for each decision tree given a set of observed data. Our method, in fact, is analogous to a well-known Bayesian model selection method for interpolating noisy continuous-value data. As in regression problems, given reasonable assumptions, this derived score function automatically quantifies the principle of Ockham's razor, and hence reasonably deals with the issue of underfitting-overfitting tradeoff.

  16. Terraces in phylogenetic tree space.

    PubMed

    Sanderson, Michael J; McMahon, Michelle M; Steel, Mike

    2011-07-22

    A key step in assembling the tree of life is the construction of species-rich phylogenies from multilocus--but often incomplete--sequence data sets. We describe previously unknown structure in the landscape of solutions to the tree reconstruction problem, comprising sometimes vast "terraces" of trees with identical quality, arranged on islands of phylogenetically similar trees. Phylogenetic ambiguity within a terrace can be characterized efficiently and then ameliorated by new algorithms for obtaining a terrace's maximum-agreement subtree or by identifying the smallest set of new targets for additional sequencing. Algorithms to find optimal trees or estimate Bayesian posterior tree distributions may need to navigate strategically in the neighborhood of large terraces in tree space.

  17. Classificação geométrica de galáxias bianeladas através do metódo CART (Classification And Regression Trees)

    NASA Astrophysics Data System (ADS)

    Ormeño, M. I.; Faúndez-Abans, M.; Cavada, G.

    2003-08-01

    A importância deste trabalho deve-se à seleção de objetos ainda não tratados particularmente como uma família e ao emprego de procedimento estatístico robusto que não precisa de pressupostos ou condições de contorno. Contribui, assim, ao melhor entendimento do cenário das Galáxias Aneladas do diagrama de Hubble via classificação e estudo de subclasses. Selecionaram-se 100 galáxias possuidoras de dois anéis do Catalog of Southern Ringed Galaxies compilado por Ronald Buta, de modo a construir uma amostra completa em termos de conhecimento dos semi-eixos dos anéis interno e externo projetados no plano do céu. Visando uma possível classificação destas galáxias aneladas normais em famílias de acordo com as características geométricas dos anéis, empregou-se primeiramente a Análise de Aglomerados (ferramenta de classificação: medições de semelhança em um espaço bidimensional) para explorar a possível existência de famílias. As variáveis analisadas foram: os diâmetros interiores menores d(I) e maiores D(I), os diâmetros exteriores menores d(E) e maiores D(E), e os ângulos de inclinação dos semi-eixos maiores interiores q(I) e exteriores q(E) dos anéis. Como metodologia de discriminação, empregou-se a construção de Árvores de Classificação. As árvores de classificação constituem um método de discriminação alternativo aos modelos clássicos, tais como a Análise Discriminante e a Regressão Logística, onde uma base de dados é dividida em partições (subgrupos) da árvore por ação de um predictor (variável específica). Os pacotes estatísticos utilizados para o processamento da informação foram: SAS versão 8.0 (Statistical Analisys System) e CART versão 3.6.3. Esta análise estatística sugere a existência de três possíveis famílias de galáxias bianeladas, com base apenas na geometria dos anéis. Como forma exploratória inicial deste resultado, a construção de um diagrama BT (magnitude total) versus o

  18. Improved Regression Calibration

    ERIC Educational Resources Information Center

    Skrondal, Anders; Kuha, Jouni

    2012-01-01

    The likelihood for generalized linear models with covariate measurement error cannot in general be expressed in closed form, which makes maximum likelihood estimation taxing. A popular alternative is regression calibration which is computationally efficient at the cost of inconsistent estimation. We propose an improved regression calibration…

  19. Prediction in Multiple Regression.

    ERIC Educational Resources Information Center

    Osborne, Jason W.

    2000-01-01

    Presents the concept of prediction via multiple regression (MR) and discusses the assumptions underlying multiple regression analyses. Also discusses shrinkage, cross-validation, and double cross-validation of prediction equations and describes how to calculate confidence intervals around individual predictions. (SLD)

  20. Rate of tree carbon accumulation increases continuously with tree size.

    PubMed

    Stephenson, N L; Das, A J; Condit, R; Russo, S E; Baker, P J; Beckman, N G; Coomes, D A; Lines, E R; Morris, W K; Rüger, N; Alvarez, E; Blundo, C; Bunyavejchewin, S; Chuyong, G; Davies, S J; Duque, A; Ewango, C N; Flores, O; Franklin, J F; Grau, H R; Hao, Z; Harmon, M E; Hubbell, S P; Kenfack, D; Lin, Y; Makana, J-R; Malizia, A; Malizia, L R; Pabst, R J; Pongpattananurak, N; Su, S-H; Sun, I-F; Tan, S; Thomas, D; van Mantgem, P J; Wang, X; Wiser, S K; Zavala, M A

    2014-03-01

    Forests are major components of the global carbon cycle, providing substantial feedback to atmospheric greenhouse gas concentrations. Our ability to understand and predict changes in the forest carbon cycle--particularly net primary productivity and carbon storage--increasingly relies on models that represent biological processes across several scales of biological organization, from tree leaves to forest stands. Yet, despite advances in our understanding of productivity at the scales of leaves and stands, no consensus exists about the nature of productivity at the scale of the individual tree, in part because we lack a broad empirical assessment of whether rates of absolute tree mass growth (and thus carbon accumulation) decrease, remain constant, or increase as trees increase in size and age. Here we present a global analysis of 403 tropical and temperate tree species, showing that for most species mass growth rate increases continuously with tree size. Thus, large, old trees do not act simply as senescent carbon reservoirs but actively fix large amounts of carbon compared to smaller trees; at the extreme, a single big tree can add the same amount of carbon to the forest within a year as is contained in an entire mid-sized tree. The apparent paradoxes of individual tree growth increasing with tree size despite declining leaf-level and stand-level productivity can be explained, respectively, by increases in a tree's total leaf area that outpace declines in productivity per unit of leaf area and, among other factors, age-related reductions in population density. Our results resolve conflicting assumptions about the nature of tree growth, inform efforts to undertand and model forest carbon dynamics, and have additional implications for theories of resource allocation and plant senescence.

  1. Rate of tree carbon accumulation increases continuously with tree size

    NASA Astrophysics Data System (ADS)

    Stephenson, N. L.; Das, A. J.; Condit, R.; Russo, S. E.; Baker, P. J.; Beckman, N. G.; Coomes, D. A.; Lines, E. R.; Morris, W. K.; Rüger, N.; Álvarez, E.; Blundo, C.; Bunyavejchewin, S.; Chuyong, G.; Davies, S. J.; Duque, Á.; Ewango, C. N.; Flores, O.; Franklin, J. F.; Grau, H. R.; Hao, Z.; Harmon, M. E.; Hubbell, S. P.; Kenfack, D.; Lin, Y.; Makana, J.-R.; Malizia, A.; Malizia, L. R.; Pabst, R. J.; Pongpattananurak, N.; Su, S.-H.; Sun, I.-F.; Tan, S.; Thomas, D.; van Mantgem, P. J.; Wang, X.; Wiser, S. K.; Zavala, M. A.

    2014-03-01

    Forests are major components of the global carbon cycle, providing substantial feedback to atmospheric greenhouse gas concentrations. Our ability to understand and predict changes in the forest carbon cycle--particularly net primary productivity and carbon storage--increasingly relies on models that represent biological processes across several scales of biological organization, from tree leaves to forest stands. Yet, despite advances in our understanding of productivity at the scales of leaves and stands, no consensus exists about the nature of productivity at the scale of the individual tree, in part because we lack a broad empirical assessment of whether rates of absolute tree mass growth (and thus carbon accumulation) decrease, remain constant, or increase as trees increase in size and age. Here we present a global analysis of 403 tropical and temperate tree species, showing that for most species mass growth rate increases continuously with tree size. Thus, large, old trees do not act simply as senescent carbon reservoirs but actively fix large amounts of carbon compared to smaller trees; at the extreme, a single big tree can add the same amount of carbon to the forest within a year as is contained in an entire mid-sized tree. The apparent paradoxes of individual tree growth increasing with tree size despite declining leaf-level and stand-level productivity can be explained, respectively, by increases in a tree's total leaf area that outpace declines in productivity per unit of leaf area and, among other factors, age-related reductions in population density. Our results resolve conflicting assumptions about the nature of tree growth, inform efforts to undertand and model forest carbon dynamics, and have additional implications for theories of resource allocation and plant senescence.

  2. Rate of tree carbon accumulation increases continuously with tree size.

    PubMed

    Stephenson, N L; Das, A J; Condit, R; Russo, S E; Baker, P J; Beckman, N G; Coomes, D A; Lines, E R; Morris, W K; Rüger, N; Alvarez, E; Blundo, C; Bunyavejchewin, S; Chuyong, G; Davies, S J; Duque, A; Ewango, C N; Flores, O; Franklin, J F; Grau, H R; Hao, Z; Harmon, M E; Hubbell, S P; Kenfack, D; Lin, Y; Makana, J-R; Malizia, A; Malizia, L R; Pabst, R J; Pongpattananurak, N; Su, S-H; Sun, I-F; Tan, S; Thomas, D; van Mantgem, P J; Wang, X; Wiser, S K; Zavala, M A

    2014-03-01

    Forests are major components of the global carbon cycle, providing substantial feedback to atmospheric greenhouse gas concentrations. Our ability to understand and predict changes in the forest carbon cycle--particularly net primary productivity and carbon storage--increasingly relies on models that represent biological processes across several scales of biological organization, from tree leaves to forest stands. Yet, despite advances in our understanding of productivity at the scales of leaves and stands, no consensus exists about the nature of productivity at the scale of the individual tree, in part because we lack a broad empirical assessment of whether rates of absolute tree mass growth (and thus carbon accumulation) decrease, remain constant, or increase as trees increase in size and age. Here we present a global analysis of 403 tropical and temperate tree species, showing that for most species mass growth rate increases continuously with tree size. Thus, large, old trees do not act simply as senescent carbon reservoirs but actively fix large amounts of carbon compared to smaller trees; at the extreme, a single big tree can add the same amount of carbon to the forest within a year as is contained in an entire mid-sized tree. The apparent paradoxes of individual tree growth increasing with tree size despite declining leaf-level and stand-level productivity can be explained, respectively, by increases in a tree's total leaf area that outpace declines in productivity per unit of leaf area and, among other factors, age-related reductions in population density. Our results resolve conflicting assumptions about the nature of tree growth, inform efforts to undertand and model forest carbon dynamics, and have additional implications for theories of resource allocation and plant senescence. PMID:24429523

  3. Regression problems for magnitudes

    NASA Astrophysics Data System (ADS)

    Castellaro, S.; Mulargia, F.; Kagan, Y. Y.

    2006-06-01

    Least-squares linear regression is so popular that it is sometimes applied without checking whether its basic requirements are satisfied. In particular, in studying earthquake phenomena, the conditions (a) that the uncertainty on the independent variable is at least one order of magnitude smaller than the one on the dependent variable, (b) that both data and uncertainties are normally distributed and (c) that residuals are constant are at times disregarded. This may easily lead to wrong results. As an alternative to least squares, when the ratio between errors on the independent and the dependent variable can be estimated, orthogonal regression can be applied. We test the performance of orthogonal regression in its general form against Gaussian and non-Gaussian data and error distributions and compare it with standard least-square regression. General orthogonal regression is found to be superior or equal to the standard least squares in all the cases investigated and its use is recommended. We also compare the performance of orthogonal regression versus standard regression when, as often happens in the literature, the ratio between errors on the independent and the dependent variables cannot be estimated and is arbitrarily set to 1. We apply these results to magnitude scale conversion, which is a common problem in seismology, with important implications in seismic hazard evaluation, and analyse it through specific tests. Our analysis concludes that the commonly used standard regression may induce systematic errors in magnitude conversion as high as 0.3-0.4, and, even more importantly, this can introduce apparent catalogue incompleteness, as well as a heavy bias in estimates of the slope of the frequency-magnitude distributions. All this can be avoided by using the general orthogonal regression in magnitude conversions.

  4. Decision tree modeling using R.

    PubMed

    Zhang, Zhongheng

    2016-08-01

    In machine learning field, decision tree learner is powerful and easy to interpret. It employs recursive binary partitioning algorithm that splits the sample in partitioning variable with the strongest association with the response variable. The process continues until some stopping criteria are met. In the example I focus on conditional inference tree, which incorporates tree-structured regression models into conditional inference procedures. While growing a single tree is subject to small changes in the training data, random forests procedure is introduced to address this problem. The sources of diversity for random forests come from the random sampling and restricted set of input variables to be selected. Finally, I introduce R functions to perform model based recursive partitioning. This method incorporates recursive partitioning into conventional parametric model building. PMID:27570769

  5. Decision tree modeling using R.

    PubMed

    Zhang, Zhongheng

    2016-08-01

    In machine learning field, decision tree learner is powerful and easy to interpret. It employs recursive binary partitioning algorithm that splits the sample in partitioning variable with the strongest association with the response variable. The process continues until some stopping criteria are met. In the example I focus on conditional inference tree, which incorporates tree-structured regression models into conditional inference procedures. While growing a single tree is subject to small changes in the training data, random forests procedure is introduced to address this problem. The sources of diversity for random forests come from the random sampling and restricted set of input variables to be selected. Finally, I introduce R functions to perform model based recursive partitioning. This method incorporates recursive partitioning into conventional parametric model building.

  6. Decision tree modeling using R

    PubMed Central

    2016-01-01

    In machine learning field, decision tree learner is powerful and easy to interpret. It employs recursive binary partitioning algorithm that splits the sample in partitioning variable with the strongest association with the response variable. The process continues until some stopping criteria are met. In the example I focus on conditional inference tree, which incorporates tree-structured regression models into conditional inference procedures. While growing a single tree is subject to small changes in the training data, random forests procedure is introduced to address this problem. The sources of diversity for random forests come from the random sampling and restricted set of input variables to be selected. Finally, I introduce R functions to perform model based recursive partitioning. This method incorporates recursive partitioning into conventional parametric model building. PMID:27570769

  7. Multivariate Regression with Calibration*

    PubMed Central

    Liu, Han; Wang, Lie; Zhao, Tuo

    2014-01-01

    We propose a new method named calibrated multivariate regression (CMR) for fitting high dimensional multivariate regression models. Compared to existing methods, CMR calibrates the regularization for each regression task with respect to its noise level so that it is simultaneously tuning insensitive and achieves an improved finite-sample performance. Computationally, we develop an efficient smoothed proximal gradient algorithm which has a worst-case iteration complexity O(1/ε), where ε is a pre-specified numerical accuracy. Theoretically, we prove that CMR achieves the optimal rate of convergence in parameter estimation. We illustrate the usefulness of CMR by thorough numerical simulations and show that CMR consistently outperforms other high dimensional multivariate regression methods. We also apply CMR on a brain activity prediction problem and find that CMR is as competitive as the handcrafted model created by human experts. PMID:25620861

  8. Metamorphic geodesic regression.

    PubMed

    Hong, Yi; Joshi, Sarang; Sanchez, Mar; Styner, Martin; Niethammer, Marc

    2012-01-01

    We propose a metamorphic geodesic regression approach approximating spatial transformations for image time-series while simultaneously accounting for intensity changes. Such changes occur for example in magnetic resonance imaging (MRI) studies of the developing brain due to myelination. To simplify computations we propose an approximate metamorphic geodesic regression formulation that only requires pairwise computations of image metamorphoses. The approximated solution is an appropriately weighted average of initial momenta. To obtain initial momenta reliably, we develop a shooting method for image metamorphosis.

  9. Rate of tree carbon accumulation increases continuously with tree size

    USGS Publications Warehouse

    Stephenson, N.L.; Das, A.J.; Condit, R.; Russo, S.E.; Baker, P.J.; Beckman, N.G.; Coomes, D.A.; Lines, E.R.; Morris, W.K.; Rüger, N.; Álvarez, E.; Blundo, C.; Bunyavejchewin, S.; Chuyong, G.; Davies, S.J.; Duque, Á.; Ewango, C.N.; Flores, O.; Franklin, J.F.; Grau, H.R.; Hao, Z.; Harmon, M.E.; Hubbell, S.P.; Kenfack, D.; Lin, Y.; Makana, J.-R.; Malizia, A.; Malizia, L.R.; Pabst, R.J.; Pongpattananurak, N.; Su, S.-H.; Sun, I-F.; Tan, S.; Thomas, D.; van Mantgem, P.J.; Wang, X.; Wiser, S.K.; Zavala, M.A.

    2014-01-01

    Forests are major components of the global carbon cycle, providing substantial feedback to atmospheric greenhouse gas concentrations. Our ability to understand and predict changes in the forest carbon cycle—particularly net primary productivity and carbon storage—increasingly relies on models that represent biological processes across several scales of biological organization, from tree leaves to forest stands. Yet, despite advances in our understanding of productivity at the scales of leaves and stands, no consensus exists about the nature of productivity at the scale of the individual tree, in part because we lack a broad empirical assessment of whether rates of absolute tree mass growth (and thus carbon accumulation) decrease, remain constant, or increase as trees increase in size and age. Here we present a global analysis of 403 tropical and temperate tree species, showing that for most species mass growth rate increases continuously with tree size. Thus, large, old trees do not act simply as senescent carbon reservoirs but actively fix large amounts of carbon compared to smaller trees; at the extreme, a single big tree can add the same amount of carbon to the forest within a year as is contained in an entire mid-sized tree. The apparent paradoxes of individual tree growth increasing with tree size despite declining leaf-level and stand-level productivity can be explained, respectively, by increases in a tree’s total leaf area that outpace declines in productivity per unit of leaf area and, among other factors, age-related reductions in population density. Our results resolve conflicting assumptions about the nature of tree growth, inform efforts to understand and model forest carbon dynamics, and have additional implications for theories of resource allocation and plant senescence.

  10. The fault-tree compiler

    NASA Technical Reports Server (NTRS)

    Martensen, Anna L.; Butler, Ricky W.

    1987-01-01

    The Fault Tree Compiler Program is a new reliability tool used to predict the top event probability for a fault tree. Five different gate types are allowed in the fault tree: AND, OR, EXCLUSIVE OR, INVERT, and M OF N gates. The high level input language is easy to understand and use when describing the system tree. In addition, the use of the hierarchical fault tree capability can simplify the tree description and decrease program execution time. The current solution technique provides an answer precise (within the limits of double precision floating point arithmetic) to the five digits in the answer. The user may vary one failure rate or failure probability over a range of values and plot the results for sensitivity analyses. The solution technique is implemented in FORTRAN; the remaining program code is implemented in Pascal. The program is written to run on a Digital Corporation VAX with the VMS operation system.

  11. Evaluating patients with chest pain using classification and regression trees.

    PubMed

    Buntinx, F; Truyen, J; Embrechts, P; Moreel, G; Peeters, R

    1992-06-01

    We collected data on 320 patients complaining to their general practitioner of a new episode of chest pain, discomfort or oppression. Relationships were examined between initial signs and symptoms and a follow-up diagnosis after a period of 2 weeks to 2 months. The data were analysed with CART, a statistical decision theory software package. In our first run, the number of misclassifications by CART was 56%. After regrouping of the data and diagnostic categories, there were 37% misclassifications. The most discriminating variable turned out to be pain on palpation. When comparing each of five diagnostic groups to all others, we found a positive predictive value of 27% for gastrointestinal diseases, 72% for cardiovascular disorders, 69% for respiratory diseases, 58% for psychopathology and 73% for chest wall pathology. The CART methodology needs further investigation and testing before any clinical application will be possible in general practice.

  12. Tree-Based Methods for Discovery of Association between Flow Cytometry Data and Clinical Endpoints.

    PubMed

    Eliot, M; Azzoni, L; Firnhaber, C; Stevens, W; Glencross, D K; Sanne, I; Montaner, L J; Foulkes, A S

    2009-01-01

    We demonstrate the application and comparative interpretations of three tree-based algorithms for the analysis of data arising from flow cytometry: classification and regression trees (CARTs), random forests (RFs), and logic regression (LR). Specifically, we consider the question of what best predicts CD4 T-cell recovery in HIV-1 infected persons starting antiretroviral therapy with CD4 count between 200 and 350 cell/muL. A comparison to a more standard contingency table analysis is provided. While contingency table analysis and RFs provide information on the importance of each potential predictor variable, CART and LR offer additional insight into the combinations of variables that together are predictive of the outcome. In all cases considered, baseline CD3-DR-CD56+CD16+ emerges as an important predictor variable, while the tree-based approaches identify additional variables as potentially informative. Application of tree-based methods to our data suggests that a combination of baseline immune activation states, with emphasis on CD8 T-cell activation, may be a better predictor than any single T-cell/innate cell subset analyzed. Taken together, we show that tree-based methods can be successfully applied to flow cytometry data to better inform and discover associations that may not emerge in the context of a univariate analysis.

  13. Audubon Tree Study Program.

    ERIC Educational Resources Information Center

    National Audubon Society, New York, NY.

    Included are an illustrated student reader, "The Story of Trees," a leaders' guide, and a large tree chart with 37 colored pictures. The student reader reviews several aspects of trees: a definition of a tree; where and how trees grow; flowers, pollination and seed production; how trees make their food; how to recognize trees; seasonal changes;…

  14. Visualizing phylogenetic trees using TreeView.

    PubMed

    Page, Roderic D M

    2002-08-01

    TreeView provides a simple way to view the phylogenetic trees produced by a range of programs, such as PAUP*, PHYLIP, TREE-PUZZLE, and ClustalX. While some phylogenetic programs (such as the Macintosh version of PAUP*) have excellent tree printing facilities, many programs do not have the ability to generate publication quality trees. TreeView addresses this need. The program can read and write a range of tree file formats, display trees in a variety of styles, print trees, and save the tree as a graphic file. Protocols in this unit cover both displaying and printing a tree. Support protocols describe how to download and install TreeView, and how to display bootstrap values in trees generated by ClustalX and PAUP*. PMID:18792942

  15. Regression based modeling of vegetation and climate variables for the Amazon rainforests

    NASA Astrophysics Data System (ADS)

    Kodali, A.; Khandelwal, A.; Ganguly, S.; Bongard, J.; Das, K.

    2015-12-01

    Both short-term (weather) and long-term (climate) variations in the atmosphere directly impact various ecosystems on earth. Forest ecosystems, especially tropical forests, are crucial as they are the largest reserves of terrestrial carbon sink. For example, the Amazon forests are a critical component of global carbon cycle storing about 100 billion tons of carbon in its woody biomass. There is a growing concern that these forests could succumb to precipitation reduction in a progressively warming climate, leading to release of significant amount of carbon in the atmosphere. Therefore, there is a need to accurately quantify the dependence of vegetation growth on different climate variables and obtain better estimates of drought-induced changes to atmospheric CO2. The availability of globally consistent climate and earth observation datasets have allowed global scale monitoring of various climate and vegetation variables such as precipitation, radiation, surface greenness, etc. Using these diverse datasets, we aim to quantify the magnitude and extent of ecosystem exposure, sensitivity and resilience to droughts in forests. The Amazon rainforests have undergone severe droughts twice in last decade (2005 and 2010), which makes them an ideal candidate for the regional scale analysis. Current studies on vegetation and climate relationships have mostly explored linear dependence due to computational and domain knowledge constraints. We explore a modeling technique called symbolic regression based on evolutionary computation that allows discovery of the dependency structure without any prior assumptions. In symbolic regression the population of possible solutions is defined via trees structures. Each tree represents a mathematical expression that includes pre-defined functions (mathematical operators) and terminal sets (independent variables from data). Selection of these sets is critical to computational efficiency and model accuracy. In this work we investigate

  16. Latent Regression Analysis.

    PubMed

    Tarpey, Thaddeus; Petkova, Eva

    2010-07-01

    Finite mixture models have come to play a very prominent role in modelling data. The finite mixture model is predicated on the assumption that distinct latent groups exist in the population. The finite mixture model therefore is based on a categorical latent variable that distinguishes the different groups. Often in practice distinct sub-populations do not actually exist. For example, disease severity (e.g. depression) may vary continuously and therefore, a distinction of diseased and not-diseased may not be based on the existence of distinct sub-populations. Thus, what is needed is a generalization of the finite mixture's discrete latent predictor to a continuous latent predictor. We cast the finite mixture model as a regression model with a latent Bernoulli predictor. A latent regression model is proposed by replacing the discrete Bernoulli predictor by a continuous latent predictor with a beta distribution. Motivation for the latent regression model arises from applications where distinct latent classes do not exist, but instead individuals vary according to a continuous latent variable. The shapes of the beta density are very flexible and can approximate the discrete Bernoulli distribution. Examples and a simulation are provided to illustrate the latent regression model. In particular, the latent regression model is used to model placebo effect among drug treated subjects in a depression study. PMID:20625443

  17. Semiparametric Regression Pursuit.

    PubMed

    Huang, Jian; Wei, Fengrong; Ma, Shuangge

    2012-10-01

    The semiparametric partially linear model allows flexible modeling of covariate effects on the response variable in regression. It combines the flexibility of nonparametric regression and parsimony of linear regression. The most important assumption in the existing methods for the estimation in this model is to assume a priori that it is known which covariates have a linear effect and which do not. However, in applied work, this is rarely known in advance. We consider the problem of estimation in the partially linear models without assuming a priori which covariates have linear effects. We propose a semiparametric regression pursuit method for identifying the covariates with a linear effect. Our proposed method is a penalized regression approach using a group minimax concave penalty. Under suitable conditions we show that the proposed approach is model-pursuit consistent, meaning that it can correctly determine which covariates have a linear effect and which do not with high probability. The performance of the proposed method is evaluated using simulation studies, which support our theoretical results. A real data example is used to illustrated the application of the proposed method. PMID:23559831

  18. Tree harvesting

    SciTech Connect

    Badger, P.C.

    1995-12-31

    Short rotation intensive culture tree plantations have been a major part of biomass energy concepts since the beginning. One aspect receiving less attention than it deserves is harvesting. This article describes an method of harvesting somewhere between agricultural mowing machines and huge feller-bunchers of the pulpwood and lumber industries.

  19. Aspen Trees.

    ERIC Educational Resources Information Center

    Canfield, Elaine

    2002-01-01

    Describes a fifth-grade art activity that offers a new approach to creating pictures of Aspen trees. Explains that the students learned about art concepts, such as line and balance, in this lesson. Discusses the process in detail for creating the pictures. (CMK)

  20. On Tree-Based Phylogenetic Networks.

    PubMed

    Zhang, Louxin

    2016-07-01

    A large class of phylogenetic networks can be obtained from trees by the addition of horizontal edges between the tree edges. These networks are called tree-based networks. We present a simple necessary and sufficient condition for tree-based networks and prove that a universal tree-based network exists for any number of taxa that contains as its base every phylogenetic tree on the same set of taxa. This answers two problems posted by Francis and Steel recently. A byproduct is a computer program for generating random binary phylogenetic networks under the uniform distribution model.

  1. [Understanding logistic regression].

    PubMed

    El Sanharawi, M; Naudet, F

    2013-10-01

    Logistic regression is one of the most common multivariate analysis models utilized in epidemiology. It allows the measurement of the association between the occurrence of an event (qualitative dependent variable) and factors susceptible to influence it (explicative variables). The choice of explicative variables that should be included in the logistic regression model is based on prior knowledge of the disease physiopathology and the statistical association between the variable and the event, as measured by the odds ratio. The main steps for the procedure, the conditions of application, and the essential tools for its interpretation are discussed concisely. We also discuss the importance of the choice of variables that must be included and retained in the regression model in order to avoid the omission of important confounding factors. Finally, by way of illustration, we provide an example from the literature, which should help the reader test his or her knowledge.

  2. Logistic regression: a brief primer.

    PubMed

    Stoltzfus, Jill C

    2011-10-01

    Regression techniques are versatile in their application to medical research because they can measure associations, predict outcomes, and control for confounding variable effects. As one such technique, logistic regression is an efficient and powerful way to analyze the effect of a group of independent variables on a binary outcome by quantifying each independent variable's unique contribution. Using components of linear regression reflected in the logit scale, logistic regression iteratively identifies the strongest linear combination of variables with the greatest probability of detecting the observed outcome. Important considerations when conducting logistic regression include selecting independent variables, ensuring that relevant assumptions are met, and choosing an appropriate model building strategy. For independent variable selection, one should be guided by such factors as accepted theory, previous empirical investigations, clinical considerations, and univariate statistical analyses, with acknowledgement of potential confounding variables that should be accounted for. Basic assumptions that must be met for logistic regression include independence of errors, linearity in the logit for continuous variables, absence of multicollinearity, and lack of strongly influential outliers. Additionally, there should be an adequate number of events per independent variable to avoid an overfit model, with commonly recommended minimum "rules of thumb" ranging from 10 to 20 events per covariate. Regarding model building strategies, the three general types are direct/standard, sequential/hierarchical, and stepwise/statistical, with each having a different emphasis and purpose. Before reaching definitive conclusions from the results of any of these methods, one should formally quantify the model's internal validity (i.e., replicability within the same data set) and external validity (i.e., generalizability beyond the current sample). The resulting logistic regression model

  3. Practical Session: Logistic Regression

    NASA Astrophysics Data System (ADS)

    Clausel, M.; Grégoire, G.

    2014-12-01

    An exercise is proposed to illustrate the logistic regression. One investigates the different risk factors in the apparition of coronary heart disease. It has been proposed in Chapter 5 of the book of D.G. Kleinbaum and M. Klein, "Logistic Regression", Statistics for Biology and Health, Springer Science Business Media, LLC (2010) and also by D. Chessel and A.B. Dufour in Lyon 1 (see Sect. 6 of http://pbil.univ-lyon1.fr/R/pdf/tdr341.pdf). This example is based on data given in the file evans.txt coming from http://www.sph.emory.edu/dkleinb/logreg3.htm#data.

  4. Factors Governing Stemflow Production from Plantation Grown Teak Trees in Thailand

    NASA Astrophysics Data System (ADS)

    Tanaka, N.; Levia, D. F., Jr.; Igarashi, Y.; Yoshifuji, N.; Tanaka, K.; Chatchai, T.; Nanko, K.; Suzuki, M.; Kumagai, T.

    2015-12-01

    Stemflow (SF) is recognized as an important process delivering water, solute, and particulate fluxes to spatially localized areas of the forest floor. Using both long-term SF data from nine even-aged deciduous teak trees grown in the same plantation and meteorological data from a nearby tower, this study seeks to better understand how: (1) specific biotic and abiotic factors control stand-scale SF production of teak; and (2) various biotic and abiotic factors affect tree-to-tree variations in teak SF production. A conventional regression analysis of SF volume against rainfall indicates that, for five individuals among the nine, SF was more efficiently produced in the leafless than in the leafed. However, for the other individuals, there was no such a relation, suggesting tree-to-tree variation in the response of SF to canopy status. A boosted regression tree (BRT) analysis setting daily basis SF funneling ratios (SFF) of the nine trees as dependent variables, indicates that SFF was intricately controlled by a variety of biotic and abiotic factors. The top six influential factors were, in descending order, rainfall duration, tree height, rainfall intensity, air temperature, wind speed, and antecedent dry period length having positive, negative, positive, negative, positive, and negative influence on SFF, respectively. Although teak exhibits drastic intra-annual changes in leaf phenology, leaf area index (LAI) had an unexpectedly small influence on SFF on a stand scale. Additional BRT analyses focusing on individuals with the maximum and the minimum SFF values (among the nine individuals) showed that there was considerable tree-to-tree variation in an array of the influential variables for SFF, even though they were planted in the same year and grown in the same plot. In addition to this difference, the BRT analyses also showed that response of SFF to LAI differs between the two individuals. The differentiating responses to LAI depending on individuals may be the

  5. Modelling of filariasis in East Java with Poisson regression and generalized Poisson regression models

    NASA Astrophysics Data System (ADS)

    Darnah

    2016-04-01

    Poisson regression has been used if the response variable is count data that based on the Poisson distribution. The Poisson distribution assumed equal dispersion. In fact, a situation where count data are over dispersion or under dispersion so that Poisson regression inappropriate because it may underestimate the standard errors and overstate the significance of the regression parameters, and consequently, giving misleading inference about the regression parameters. This paper suggests the generalized Poisson regression model to handling over dispersion and under dispersion on the Poisson regression model. The Poisson regression model and generalized Poisson regression model will be applied the number of filariasis cases in East Java. Based regression Poisson model the factors influence of filariasis are the percentage of families who don't behave clean and healthy living and the percentage of families who don't have a healthy house. The Poisson regression model occurs over dispersion so that we using generalized Poisson regression. The best generalized Poisson regression model showing the factor influence of filariasis is percentage of families who don't have healthy house. Interpretation of result the model is each additional 1 percentage of families who don't have healthy house will add 1 people filariasis patient.

  6. Explorations in Statistics: Regression

    ERIC Educational Resources Information Center

    Curran-Everett, Douglas

    2011-01-01

    Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This seventh installment of "Explorations in Statistics" explores regression, a technique that estimates the nature of the relationship between two things for which we may only surmise a mechanistic or predictive connection.…

  7. Modern Regression Discontinuity Analysis

    ERIC Educational Resources Information Center

    Bloom, Howard S.

    2012-01-01

    This article provides a detailed discussion of the theory and practice of modern regression discontinuity (RD) analysis for estimating the effects of interventions or treatments. Part 1 briefly chronicles the history of RD analysis and summarizes its past applications. Part 2 explains how in theory an RD analysis can identify an average effect of…

  8. Multiple linear regression analysis

    NASA Technical Reports Server (NTRS)

    Edwards, T. R.

    1980-01-01

    Program rapidly selects best-suited set of coefficients. User supplies only vectors of independent and dependent data and specifies confidence level required. Program uses stepwise statistical procedure for relating minimal set of variables to set of observations; final regression contains only most statistically significant coefficients. Program is written in FORTRAN IV for batch execution and has been implemented on NOVA 1200.

  9. Mechanisms of neuroblastoma regression

    PubMed Central

    Brodeur, Garrett M.; Bagatell, Rochelle

    2014-01-01

    Recent genomic and biological studies of neuroblastoma have shed light on the dramatic heterogeneity in the clinical behaviour of this disease, which spans from spontaneous regression or differentiation in some patients, to relentless disease progression in others, despite intensive multimodality therapy. This evidence also suggests several possible mechanisms to explain the phenomena of spontaneous regression in neuroblastomas, including neurotrophin deprivation, humoral or cellular immunity, loss of telomerase activity and alterations in epigenetic regulation. A better understanding of the mechanisms of spontaneous regression might help to identify optimal therapeutic approaches for patients with these tumours. Currently, the most druggable mechanism is the delayed activation of developmentally programmed cell death regulated by the tropomyosin receptor kinase A pathway. Indeed, targeted therapy aimed at inhibiting neurotrophin receptors might be used in lieu of conventional chemotherapy or radiation in infants with biologically favourable tumours that require treatment. Alternative approaches consist of breaking immune tolerance to tumour antigens or activating neurotrophin receptor pathways to induce neuronal differentiation. These approaches are likely to be most effective against biologically favourable tumours, but they might also provide insights into treatment of biologically unfavourable tumours. We describe the different mechanisms of spontaneous neuroblastoma regression and the consequent therapeutic approaches. PMID:25331179

  10. Unimodular trees versus Einstein trees

    NASA Astrophysics Data System (ADS)

    Álvarez, Enrique; González-Martín, Sergio; Martín, Carmelo P.

    2016-10-01

    The maximally helicity violating tree-level scattering amplitudes involving three, four or five gravitons are worked out in Unimodular Gravity. They are found to coincide with the corresponding amplitudes in General Relativity. This a remarkable result, insofar as both the propagators and the vertices are quite different in the two theories.

  11. Current and Potential Tree Locations in Tree Line Ecotone of Changbai Mountains, Northeast China: The Controlling Effects of Topography

    PubMed Central

    Zong, Shengwei; Wu, Zhengfang; Xu, Jiawei; Li, Ming; Gao, Xiaofeng; He, Hongshi; Du, Haibo; Wang, Lei

    2014-01-01

    Tree line ecotone in the Changbai Mountains has undergone large changes in the past decades. Tree locations show variations on the four sides of the mountains, especially on the northern and western sides, which has not been fully explained. Previous studies attributed such variations to the variations in temperature. However, in this study, we hypothesized that topographic controls were responsible for causing the variations in the tree locations in tree line ecotone of the Changbai Mountains. To test the hypothesis, we used IKONOS images and WorldView-1 image to identify the tree locations and developed a logistic regression model using topographical variables to identify the dominant controls of the tree locations. The results showed that aspect, wetness, and slope were dominant controls for tree locations on western side of the mountains, whereas altitude, SPI, and aspect were the dominant factors on northern side. The upmost altitude a tree can currently reach was 2140 m asl on the northern side and 2060 m asl on western side. The model predicted results showed that habitats above the current tree line on the both sides were available for trees. Tree recruitments under the current tree line may take advantage of the available habitats at higher elevations based on the current tree location. Our research confirmed the controlling effects of topography on the tree locations in the tree line ecotone of Changbai Mountains and suggested that it was essential to assess the tree response to topography in the research of tree line ecotone. PMID:25170918

  12. Technical Tree Climbing.

    ERIC Educational Resources Information Center

    Jenkins, Peter

    Tree climbing offers a safe, inexpensive adventure sport that can be performed almost anywhere. Using standard procedures practiced in tree surgery or rock climbing, almost any tree can be climbed. Tree climbing provides challenge and adventure as well as a vigorous upper-body workout. Tree Climbers International classifies trees using a system…

  13. Ridge Regression Signal Processing

    NASA Technical Reports Server (NTRS)

    Kuhl, Mark R.

    1990-01-01

    The introduction of the Global Positioning System (GPS) into the National Airspace System (NAS) necessitates the development of Receiver Autonomous Integrity Monitoring (RAIM) techniques. In order to guarantee a certain level of integrity, a thorough understanding of modern estimation techniques applied to navigational problems is required. The extended Kalman filter (EKF) is derived and analyzed under poor geometry conditions. It was found that the performance of the EKF is difficult to predict, since the EKF is designed for a Gaussian environment. A novel approach is implemented which incorporates ridge regression to explain the behavior of an EKF in the presence of dynamics under poor geometry conditions. The basic principles of ridge regression theory are presented, followed by the derivation of a linearized recursive ridge estimator. Computer simulations are performed to confirm the underlying theory and to provide a comparative analysis of the EKF and the recursive ridge estimator.

  14. The gene tree delusion.

    PubMed

    Springer, Mark S; Gatesy, John

    2016-01-01

    Higher-level relationships among placental mammals are mostly resolved, but several polytomies remain contentious. Song et al. (2012) claimed to have resolved three of these using shortcut coalescence methods (MP-EST, STAR) and further concluded that these methods, which assume no within-locus recombination, are required to unravel deep-level phylogenetic problems that have stymied concatenation. Here, we reanalyze Song et al.'s (2012) data and leverage these re-analyses to explore key issues in systematics including the recombination ratchet, gene tree stoichiometry, the proportion of gene tree incongruence that results from deep coalescence versus other factors, and simulations that compare the performance of coalescence and concatenation methods in species tree estimation. Song et al. (2012) reported an average locus length of 3.1 kb for the 447 protein-coding genes in their phylogenomic dataset, but the true mean length of these loci (start codon to stop codon) is 139.6 kb. Empirical estimates of recombination breakpoints in primates, coupled with consideration of the recombination ratchet, suggest that individual coalescence genes (c-genes) approach ∼12 bp or less for Song et al.'s (2012) dataset, three to four orders of magnitude shorter than the c-genes reported by these authors. This result has general implications for the application of coalescence methods in species tree estimation. We contend that it is illogical to apply coalescence methods to complete protein-coding sequences. Such analyses amalgamate c-genes with different evolutionary histories (i.e., exons separated by >100,000 bp), distort true gene tree stoichiometry that is required for accurate species tree inference, and contradict the central rationale for applying coalescence methods to difficult phylogenetic problems. In addition, Song et al.'s (2012) dataset of 447 genes includes 21 loci with switched taxonomic names, eight duplicated loci, 26 loci with non-homologous sequences that are

  15. The gene tree delusion.

    PubMed

    Springer, Mark S; Gatesy, John

    2016-01-01

    Higher-level relationships among placental mammals are mostly resolved, but several polytomies remain contentious. Song et al. (2012) claimed to have resolved three of these using shortcut coalescence methods (MP-EST, STAR) and further concluded that these methods, which assume no within-locus recombination, are required to unravel deep-level phylogenetic problems that have stymied concatenation. Here, we reanalyze Song et al.'s (2012) data and leverage these re-analyses to explore key issues in systematics including the recombination ratchet, gene tree stoichiometry, the proportion of gene tree incongruence that results from deep coalescence versus other factors, and simulations that compare the performance of coalescence and concatenation methods in species tree estimation. Song et al. (2012) reported an average locus length of 3.1 kb for the 447 protein-coding genes in their phylogenomic dataset, but the true mean length of these loci (start codon to stop codon) is 139.6 kb. Empirical estimates of recombination breakpoints in primates, coupled with consideration of the recombination ratchet, suggest that individual coalescence genes (c-genes) approach ∼12 bp or less for Song et al.'s (2012) dataset, three to four orders of magnitude shorter than the c-genes reported by these authors. This result has general implications for the application of coalescence methods in species tree estimation. We contend that it is illogical to apply coalescence methods to complete protein-coding sequences. Such analyses amalgamate c-genes with different evolutionary histories (i.e., exons separated by >100,000 bp), distort true gene tree stoichiometry that is required for accurate species tree inference, and contradict the central rationale for applying coalescence methods to difficult phylogenetic problems. In addition, Song et al.'s (2012) dataset of 447 genes includes 21 loci with switched taxonomic names, eight duplicated loci, 26 loci with non-homologous sequences that are

  16. Tree thinning as an option to increase herbaceous yield of an encroached semi-arid savanna in South Africa

    PubMed Central

    Smit, Gert N

    2005-01-01

    Background The investigation was conducted in a savanna area covered by what was considered an undesirably dense stand of Colophospermum mopane trees, mainly because such a dense stand of trees often results in the suppression of herbaceous plants. The objectives of this study were to determine the influence of intensity of tree thinning on the dry matter yield of herbaceous plants (notably grasses) and to investigate differences in herbaceous species composition between defined subhabitats (under tree canopies, between tree canopies and where trees have been removed). Seven plots (65 × 180 m) were subjected to different intensities of tree thinning, ranging from a totally cleared plot (0 %) to plots thinned to the equivalent of 10 %, 20%, 35 %, 50% and 75 % of the leaf biomass of a control plot (100 %) with a tree density of 2711 plants ha-1. The establishment of herbaceous plants (grasses and forbs) in response to reduced competition from the woody plants was measured during three full growing seasons following the thinning treatments. Results The grass component reacted positively to the tree thinning in terms of total dry matter (DM) yield, but forbs were negatively influenced. Rainfall interacted with tree density and the differences between grass DM yields in thinned plots during years of below average rainfall were substantially higher than those of the control. At high tree densities, yields differed little between seasons of varying rainfall. The relation between grass DM yield and tree biomass was curvilinear, best described by the exponential regression equation. Subhabitat differentiation by C. mopane trees did provide some qualitative benefits, with certain desirable grass species showing a preference for the subhabitat under tree canopies. Conclusion While it can be concluded from this study that high tree densities suppress herbaceous production, the decision to clear/thin the C. mopane trees should include additional considerations. Thinning of C

  17. MixtureTree annotator: a program for automatic colorization and visual annotation of MixtureTree.

    PubMed

    Chen, Shu-Chuan; Ogata, Aaron

    2015-01-01

    The MixtureTree Annotator, written in JAVA, allows the user to automatically color any phylogenetic tree in Newick format generated from any phylogeny reconstruction program and output the Nexus file. By providing the ability to automatically color the tree by sequence name, the MixtureTree Annotator provides a unique advantage over any other programs which perform a similar function. In addition, the MixtureTree Annotator is the only package that can efficiently annotate the output produced by MixtureTree with mutation information and coalescent time information. In order to visualize the resulting output file, a modified version of FigTree is used. Certain popular methods, which lack good built-in visualization tools, for example, MEGA, Mesquite, PHY-FI, TreeView, treeGraph and Geneious, may give results with human errors due to either manually adding colors to each node or with other limitations, for example only using color based on a number, such as branch length, or by taxonomy. In addition to allowing the user to automatically color any given Newick tree by sequence name, the MixtureTree Annotator is the only method that allows the user to automatically annotate the resulting tree created by the MixtureTree program. The MixtureTree Annotator is fast and easy-to-use, while still allowing the user full control over the coloring and annotating process. PMID:25826378

  18. MixtureTree annotator: a program for automatic colorization and visual annotation of MixtureTree.

    PubMed

    Chen, Shu-Chuan; Ogata, Aaron

    2015-01-01

    The MixtureTree Annotator, written in JAVA, allows the user to automatically color any phylogenetic tree in Newick format generated from any phylogeny reconstruction program and output the Nexus file. By providing the ability to automatically color the tree by sequence name, the MixtureTree Annotator provides a unique advantage over any other programs which perform a similar function. In addition, the MixtureTree Annotator is the only package that can efficiently annotate the output produced by MixtureTree with mutation information and coalescent time information. In order to visualize the resulting output file, a modified version of FigTree is used. Certain popular methods, which lack good built-in visualization tools, for example, MEGA, Mesquite, PHY-FI, TreeView, treeGraph and Geneious, may give results with human errors due to either manually adding colors to each node or with other limitations, for example only using color based on a number, such as branch length, or by taxonomy. In addition to allowing the user to automatically color any given Newick tree by sequence name, the MixtureTree Annotator is the only method that allows the user to automatically annotate the resulting tree created by the MixtureTree program. The MixtureTree Annotator is fast and easy-to-use, while still allowing the user full control over the coloring and annotating process.

  19. Extensions and applications of ensemble-of-trees methods in machine learning

    NASA Astrophysics Data System (ADS)

    Bleich, Justin

    Ensemble-of-trees algorithms have emerged to the forefront of machine learning due to their ability to generate high forecasting accuracy for a wide array of regression and classification problems. Classic ensemble methodologies such as random forests (RF) and stochastic gradient boosting (SGB) rely on algorithmic procedures to generate fits to data. In contrast, more recent ensemble techniques such as Bayesian Additive Regression Trees (BART) and Dynamic Trees (DT) focus on an underlying Bayesian probability model to generate the fits. These new probability model-based approaches show much promise versus their algorithmic counterparts, but also offer substantial room for improvement. The first part of this thesis focuses on methodological advances for ensemble-of-trees techniques with an emphasis on the more recent Bayesian approaches. In particular, we focus on extensions of BART in four distinct ways. First, we develop a more robust implementation of BART for both research and application. We then develop a principled approach to variable selection for BART as well as the ability to naturally incorporate prior information on important covariates into the algorithm. Next, we propose a method for handling missing data that relies on the recursive structure of decision trees and does not require imputation. Last, we relax the assumption of homoskedasticity in the BART model to allow for parametric modeling of heteroskedasticity. The second part of this thesis returns to the classic algorithmic approaches in the context of classification problems with asymmetric costs of forecasting errors. First we consider the performance of RF and SGB more broadly and demonstrate its superiority to logistic regression for applications in criminology with asymmetric costs. Next, we use RF to forecast unplanned hospital readmissions upon patient discharge with asymmetric costs taken into account. Finally, we explore the construction of stable decision trees for forecasts of

  20. Understanding Boswellia papyrifera tree secondary metabolites through bark spectral analysis

    NASA Astrophysics Data System (ADS)

    Girma, Atkilt; Skidmore, Andrew K.; de Bie, C. A. J. M.; Bongers, Frans

    2015-07-01

    Decision makers are concerned whether to tap or rest Boswellia Papyrifera trees. Tapping for the production of frankincense is known to deplete carbon reserves from the tree leading to production of less viable seeds, tree carbon starvation and ultimately tree mortality. Decision makers use traditional experience without considering the amount of metabolites stored or depleted from the stem-bark of the tree. This research was designed to come up with a non-destructive B. papyrifera tree metabolite estimation technique relevant for management using spectroscopy. The concentration of biochemicals (metabolites) found in the tree bark was estimated through spectral analysis. Initially, a random sample of 33 trees was selected, the spectra of bark measured with an Analytical Spectral Device (ASD) spectrometer. Bark samples were air dried and ground. Then, 10 g of sample was soaked in Petroleum ether to extract crude metabolites. Further chemical analysis was conducted to quantify and isolate pure metabolite compounds such as incensole acetate and boswellic acid. The crude metabolites, which relate to frankincense produce, were compared to plant properties (such as diameter and crown area) and reflectance spectra of the bark. Moreover, the extract was compared to the ASD spectra using partial least square regression technique (PLSR) and continuum removed spectral analysis. The continuum removed spectral analysis were performed, on two wavelength regions (1275-1663 and 1836-2217) identified through PLSR, using absorption features such as band depth, area, position, asymmetry and the width to characterize and find relationship with the bark extracts. The results show that tree properties such as diameter at breast height (DBH) and the crown area of untapped and healthy trees were strongly correlated to the amount of stored crude metabolites. In addition, the PLSR technique applied to the first derivative transformation of the reflectance spectrum was found to estimate the

  1. Orthogonal Regression: A Teaching Perspective

    ERIC Educational Resources Information Center

    Carr, James R.

    2012-01-01

    A well-known approach to linear least squares regression is that which involves minimizing the sum of squared orthogonal projections of data points onto the best fit line. This form of regression is known as orthogonal regression, and the linear model that it yields is known as the major axis. A similar method, reduced major axis regression, is…

  2. Atlas of United States Trees, Volume 2: Alaska Trees and Common Shrubs.

    ERIC Educational Resources Information Center

    Viereck, Leslie A.; Little, Elbert L., Jr.

    This volume is the second in a series of atlases describing the natural distribution or range of native tree species in the United States. The 82 species maps include 32 of trees in Alaska, 6 of shrubs rarely reaching tree size, and 44 more of common shrubs. More than 20 additional maps summarize environmental factors and furnish general…

  3. Digression and Value Concatenation to Enable Privacy-Preserving Regression

    PubMed Central

    Li, Xiao-Bai; Sarkar, Sumit

    2015-01-01

    Regression techniques can be used not only for legitimate data analysis, but also to infer private information about individuals. In this paper, we demonstrate that regression trees, a popular data-analysis and data-mining technique, can be used to effectively reveal individuals’ sensitive data. This problem, which we call a “regression attack,” has not been addressed in the data privacy literature, and existing privacy-preserving techniques are not appropriate in coping with this problem. We propose a new approach to counter regression attacks. To protect against privacy disclosure, our approach introduces a novel measure, called digression, which assesses the sensitive value disclosure risk in the process of building a regression tree model. Specifically, we develop an algorithm that uses the measure for pruning the tree to limit disclosure of sensitive data. We also propose a dynamic value-concatenation method for anonymizing data, which better preserves data utility than a user-defined generalization scheme commonly used in existing approaches. Our approach can be used for anonymizing both numeric and categorical data. An experimental study is conducted using real-world financial, economic and healthcare data. The results of the experiments demonstrate that the proposed approach is very effective in protecting data privacy while preserving data quality for research and analysis. PMID:26752802

  4. Interactive effects of ambient ozone and climate measured on growth of mature loblolly pine trees

    SciTech Connect

    McLaughlin, S.B.; Downing, D.J.

    1995-02-01

    Analysis of the seasonal growth patterns of mature loblolly pine trees over the interval 1988-1993 has provided the first direct measurement of reductions of stem growth of large forest trees by ambient ozone. Patterns of stem expansion and contraction of 34 trees were examined in eastern Tennessee using serial measurements with sensitive dendrometer bind systems. Study sites varied in soil moisture, soil fertility, and stand density. Levels of ozone, rainfall, and temperature varied widely over the six year study interval. Regression analysis identified statistically and biologically significant influences of ozone on stem growth. Acting either individually or in interaction with high temperature and moisture stress, higher levels of ozone were associated with reduced stem expansion of individual trees within and across years. Observed responses to ozone were relatively rapid, differed widely among trees, and across years, and were significantly amplified by low soil moisture and high air temperatures. Both short term responses, clearly tied to changing stem water status, and longer term cumulative responses were identified. These data indicate that relatively low levels of ambient ozone can significantly reduce growth of mature forest trees and that interactions between ambient ozone and climate are likely to be important modifiers of future forest growth and function. Additional studies of mechanisms of short term response and inter species comparisons are clearly needed.

  5. The Tree Worker's Manual.

    ERIC Educational Resources Information Center

    Smithyman, S. J.

    This manual is designed to prepare students for entry-level positions as tree care professionals. Addressed in the individual chapters of the guide are the following topics: the tree service industry; clothing, eqiupment, and tools; tree workers; basic tree anatomy; techniques of pruning; procedures for climbing and working in the tree; aerial…

  6. CSWS-related autistic regression versus autistic regression without CSWS.

    PubMed

    Tuchman, Roberto

    2009-08-01

    Continuous spike-waves during slow-wave sleep (CSWS) and Landau-Kleffner syndrome (LKS) are two clinical epileptic syndromes that are associated with the electroencephalography (EEG) pattern of electrical status epilepticus during slow wave sleep (ESES). Autistic regression occurs in approximately 30% of children with autism and is associated with an epileptiform EEG in approximately 20%. The behavioral phenotypes of CSWS, LKS, and autistic regression overlap. However, the differences in age of regression, degree and type of regression, and frequency of epilepsy and EEG abnormalities suggest that these are distinct phenotypes. CSWS with autistic regression is rare, as is autistic regression associated with ESES. The pathophysiology and as such the treatment implications for children with CSWS and autistic regression are distinct from those with autistic regression without CSWS.

  7. Through bore subsea christmas trees

    SciTech Connect

    Huber, D.S.; Simmers, G.F.C.; Johnson, C.S.

    1985-01-01

    The workovers of subsea completed wells are expensive and time consuming as even the most routine tasks must be carried out by a semi-submersible. This paper describes the economic, safety and operational advantages which led to the development and successful first installation of 'through bore' subsea production trees. The conventional wet subsea trees have proved to be very reliable over the past ten years of operation in the Argyll, Duncan and Innes fields, however the completion strings require pulling on the average about once every three to five years. The conventional subsea tree/tubing hanger set up design requires the tree to be tripped and a rig BOP stack run to pull the tubing. This operation is time consuming, very weather sensitive and leaves the well temporarily without a well control stack on the wellhead. The 7 1/16'' 'through bore' subsea tree was developed to minimize the tubing pulling workover time and several trees have been run successfully since the latter part of 1984. The time saving on a tubing pulling workover is three days. In addition, the design considerably reduces the hazards and equipment damage risk inherent in the conventional design. Hamilton Brothers and National Supply Company in Aberdeen designed the equipment which must be considered a new generation of subsea production trees.

  8. Gene Tree Diameter for Deep Coalescence.

    PubMed

    Górecki, Paweł; Eulenstein, Oliver

    2015-01-01

    The deep coalescence cost accounts for discord caused by deep coalescence between a gene tree and a species tree. It is a major concern that the diameter of a gene tree (the tree's maximum deep coalescence cost across all species trees) depends on its topology, which can largely obfuscate phylogenetic studies. While this bias can be compensated by normalizing the deep coalescence cost using diameters, obtaining them efficiently has been posed as an open problem by Than and Rosenberg. Here, we resolve this problem by describing a linear time algorithm to compute the diameter of a gene tree. In addition, we provide a complete classification of the species trees yielding this diameter to guide phylogenetic analyses.

  9. Method for estimating potential tree-grade distributions for northeastern forest species. Forest Service research paper (Final)

    SciTech Connect

    Yaussy, D.A.

    1993-03-01

    The generalized logistic regression was used to distribute trees into four potential tree grades for 20 northeastern species groups. The potential tree grade is defined as the tree grade based on the length and amount of clear cuttings and defects only, disregarding minimum grading diameter. The algorithms described use site index and tree diameter as the predictive variables, allowing the equations to be incorporated into individual-tree growth and yield simulators such as NE-TWIGS.

  10. Estimating Scots Pine Tree Mortality Using High Resolution Multispectral Images

    NASA Astrophysics Data System (ADS)

    Buriak, L.; Sukhinin, A. I.; Conard, S. G.; Ivanova, G. A.; McRae, D. J.; Soja, A. J.; Okhotkina, E.

    2010-12-01

    Scots pine (Pinus sylvestris) forest stands of central Siberia are characterized by a mixed-severity fire regime that is dominated by low- to high-severity surface fires, with crown fires occurring less frequently. The purpose of this study was to link ground measurements with air-borne and satellite observations of active wildfires and older fire scars to better estimate tree mortality remotely. Data from field sampling on experimental fires and wildfires were linked with intermediate-resolution satellite (Landsat Enhanced Thematic Mapper) data to estimate fire severity and carbon emissions. Results are being applied to Advanced Very High Resolution Radiometer (AVHRR) and Moderate Resolution Imaging Spectroradiometer (MODIS) imagery, MERIS, Landsat-ETM, SPOT (i.e., low, middle and high spatial resolution), to understand their remote-sensing capability for mapping fire severity, as indicated by tree mortality. Tree mortality depends on fireline intensity, residence time, and the physiological effects on the cambium layer, foliage and roots. We have correlated tree mortality measured after fires of varying severity with NDVI and other Chlorophyll Indexes to model tree mortality on a landscape scale. The field data obtained on experimental and wildfires are being analyzed and compared with intermediate-resolution satellite data (Landsat7-ETM) to help estimate fire severity, emissions, and carbon balance. In addition, it is being used to monitor immediate ecosystem fire effects (e.g., tree mortality) and long-term postfire vegetation recovery. These data are also being used to validate AVHRR , MODIS, and MERIS estimates of burn area. We studied burned areas in the Angara Region of central Siberia (northeast of Lake Baikal) for which both ground data and satellite data (ENVISAT-MERIS, Spot4, Landsat5, Landsat7-ETM) were available for the 2003 - 2004 and 2006 - 2008 periods. Ground validation was conducted on seventy sample plots established on burned sites differing in

  11. Using tree diversity to compare phylogenetic heuristics

    PubMed Central

    Sul, Seung-Jin; Matthews, Suzanne; Williams, Tiffani L

    2009-01-01

    Background Evolutionary trees are family trees that represent the relationships between a group of organisms. Phylogenetic heuristics are used to search stochastically for the best-scoring trees in tree space. Given that better tree scores are believed to be better approximations of the true phylogeny, traditional evaluation techniques have used tree scores to determine the heuristics that find the best scores in the fastest time. We develop new techniques to evaluate phylogenetic heuristics based on both tree scores and topologies to compare Pauprat and Rec-I-DCM3, two popular Maximum Parsimony search algorithms. Results Our results show that although Pauprat and Rec-I-DCM3 find the trees with the same best scores, topologically these trees are quite different. Furthermore, the Rec-I-DCM3 trees cluster distinctly from the Pauprat trees. In addition to our heatmap visualizations of using parsimony scores and the Robinson-Foulds distance to compare best-scoring trees found by the two heuristics, we also develop entropy-based methods to show the diversity of the trees found. Overall, Pauprat identifies more diverse trees than Rec-I-DCM3. Conclusion Overall, our work shows that there is value to comparing heuristics beyond the parsimony scores that they find. Pauprat is a slower heuristic than Rec-I-DCM3. However, our work shows that there is tremendous value in using Pauprat to reconstruct trees—especially since it finds identical scoring but topologically distinct trees. Hence, instead of discounting Pauprat, effort should go in improving its implementation. Ultimately, improved performance measures lead to better phylogenetic heuristics and will result in better approximations of the true evolutionary history of the organisms of interest. PMID:19426451

  12. Insert tree completion system

    SciTech Connect

    Brands, K.W.; Ball, I.G.; Cegielski, E.J.; Gresham, J.S.; Saunders, D.N.

    1982-09-01

    This paper outlines the overall project for development and installation of a low-profile, caisson-installed subsea Christmas tree. After various design studies and laboratory and field tests of key components, a system for installation inside a 30-in. conductor was ordered in July 1978 from Cameron Iron Works Inc. The system is designed to have all critical-pressure-containing components below the mudline and, with the reduced profile (height) above seabed, provides for improved safety of satellite underwater wells from damage by anchors, trawl boards, and even icebergs. In addition to the innovative nature of the tree design, the completion includes improved 3 1/2-in. through flowline (TFL) pumpdown completion equipment with deep set safety valves and a dual detachable packer head for simplified workover capability. The all-hydraulic control system incorporates a new design of sequencing valve for both Christmas tree control and remote flowline connection. A semisubmersible drilling rig was used to initiate the first end flowline connection at the wellhead for subsequent tie-in to the prelaid, surface-towed, all-welded subsea pipeline bundle.

  13. Wild bootstrap for quantile regression.

    PubMed

    Feng, Xingdong; He, Xuming; Hu, Jianhua

    2011-12-01

    The existing theory of the wild bootstrap has focused on linear estimators. In this note, we broaden its validity by providing a class of weight distributions that is asymptotically valid for quantile regression estimators. As most weight distributions in the literature lead to biased variance estimates for nonlinear estimators of linear regression, we propose a modification of the wild bootstrap that admits a broader class of weight distributions for quantile regression. A simulation study on median regression is carried out to compare various bootstrap methods. With a simple finite-sample correction, the wild bootstrap is shown to account for general forms of heteroscedasticity in a regression model with fixed design points.

  14. Non-phytoseiid Mesostigmata within citrus orchards in Florida: species distribution, relative and seasonal abundance within trees, associated vines and ground cover plants and additional collection records of mites in citrus orchards.

    PubMed

    Childers, Carl C; Ueckermann, Eduard A

    2015-03-01

    Seven citrus orchards on reduced- to no-pesticide spray programs in central and south central Florida were sampled for non-phytoseiid mesostigmatid mites. Inner and outer canopy leaves, fruits, twigs and trunk scrapings were sampled monthly between August 1994 and January 1996. Open flowers were sampled in March from five of the sites. A total of 431 samples from one or more of 82 vine or ground cover plants were sampled monthly in five of the seven orchards. Two of the seven orchards (Mixon I and II) were on full herbicide programs and vines and ground cover plants were absent. A total of 2,655 mites (26 species) within the families: Ascidae, Blattisociidae, Laelapidae, Macrochelidae, Melicharidae, Pachylaelapidae and Parasitidae were identified. A total of 685 mites in the genus Asca (nine species: family Ascidae) were collected from within tree samples, 79 from vine or ground cover plants. Six species of Blattisociidae were collected: Aceodromus convolvuli, Blattisocius dentriticus, B. keegani, Cheiroseius sp. near jamaicensis, Lasioseius athiashenriotae and L. dentatus. A total of 485 Blattisociidae were collected from within tree samples compared with 167 from vine or ground cover plants. Low numbers of Laelapidae and Macrochelidae were collected from within tree samples. One Zygoseius furciger (Pachylaelapidae) was collected from Eleusine indica. Four species of Melicharidae were identified from 34 mites collected from within tree samples and 1,190 from vine or ground cover plants: Proctolaelaps lobatus was the most abundant species with 1,177 specimens collected from seven ground cover plants. One Phorytocarpais fimetorum (Parasitidae) was collected from inner leaves and four from twigs. Species of Ascidae, Blattisociidae, Melicharidae, Laelapidae and Pachylaelapidae were collected from 31 of the 82 vine or ground cover plants sampled, representing only a small fraction of the total number of Phytoseiidae collected from the same plants. Including the

  15. Non-phytoseiid Mesostigmata within citrus orchards in Florida: species distribution, relative and seasonal abundance within trees, associated vines and ground cover plants and additional collection records of mites in citrus orchards.

    PubMed

    Childers, Carl C; Ueckermann, Eduard A

    2015-03-01

    Seven citrus orchards on reduced- to no-pesticide spray programs in central and south central Florida were sampled for non-phytoseiid mesostigmatid mites. Inner and outer canopy leaves, fruits, twigs and trunk scrapings were sampled monthly between August 1994 and January 1996. Open flowers were sampled in March from five of the sites. A total of 431 samples from one or more of 82 vine or ground cover plants were sampled monthly in five of the seven orchards. Two of the seven orchards (Mixon I and II) were on full herbicide programs and vines and ground cover plants were absent. A total of 2,655 mites (26 species) within the families: Ascidae, Blattisociidae, Laelapidae, Macrochelidae, Melicharidae, Pachylaelapidae and Parasitidae were identified. A total of 685 mites in the genus Asca (nine species: family Ascidae) were collected from within tree samples, 79 from vine or ground cover plants. Six species of Blattisociidae were collected: Aceodromus convolvuli, Blattisocius dentriticus, B. keegani, Cheiroseius sp. near jamaicensis, Lasioseius athiashenriotae and L. dentatus. A total of 485 Blattisociidae were collected from within tree samples compared with 167 from vine or ground cover plants. Low numbers of Laelapidae and Macrochelidae were collected from within tree samples. One Zygoseius furciger (Pachylaelapidae) was collected from Eleusine indica. Four species of Melicharidae were identified from 34 mites collected from within tree samples and 1,190 from vine or ground cover plants: Proctolaelaps lobatus was the most abundant species with 1,177 specimens collected from seven ground cover plants. One Phorytocarpais fimetorum (Parasitidae) was collected from inner leaves and four from twigs. Species of Ascidae, Blattisociidae, Melicharidae, Laelapidae and Pachylaelapidae were collected from 31 of the 82 vine or ground cover plants sampled, representing only a small fraction of the total number of Phytoseiidae collected from the same plants. Including the

  16. The influence of tree morphology on stemflow generation in a tropical lowland rainforest

    NASA Astrophysics Data System (ADS)

    Uber, Magdalena; Levia, Delphis F.; Zimmermann, Beate; Zimmermann, Alexander

    2014-05-01

    Even though stemflow usually accounts for only a small proportion of rainfall, it is an important point source of water and ion input to forest floors and may, for instance, influence soil moisture patterns and groundwater recharge. Previous studies showed that the generation of stemflow depends on a multitude of meteorological and biological factors. Interestingly, despite the tremendous progress in stemflow research during the last decades it is still largely unknown which combination of tree characteristics determines stemflow volumes in species-rich tropical forests. This knowledge gap motivated us to analyse the influence of tree characteristics on stemflow volumes in a 1 hectare plot located in a Panamanian lowland rainforest. Our study comprised stemflow measurements in six randomly selected 10 m by 10 m subplots. In each subplot we measured stemflow of all trees with a diameter at breast height (DBH) > 5 cm on an event-basis for a period of six weeks. Additionally, we identified all tree species and determined a set of tree characteristics including DBH, crown diameter, bark roughness, bark furrowing, epiphyte coverage, tree architecture, stem inclination, and crown position. During the sampling period, we collected 985 L of stemflow (0.98 % of total rainfall). Based on regression analyses and comparisons among plant functional groups we show that palms were most efficient in yielding stemflow due to their large inclined fronds. Trees with large emergent crowns also produced relatively large amounts of stemflow. Due to their abundance, understory trees contribute much to stemflow yield not on individual but on the plot scale. Even though parameters such as crown diameter, branch inclination and position of the crown influence stemflow generation to some extent, these parameters explain less than 30 % of the variation in stemflow volumes. In contrast to published results from temperate forests, we did not detect a negative correlation between bark roughness

  17. Interactive Tree Of Life v2: online annotation and display of phylogenetic trees made easy.

    PubMed

    Letunic, Ivica; Bork, Peer

    2011-07-01

    Interactive Tree Of Life (http://itol.embl.de) is a web-based tool for the display, manipulation and annotation of phylogenetic trees. It is freely available and open to everyone. In addition to classical tree viewer functions, iTOL offers many novel ways of annotating trees with various additional data. Current version introduces numerous new features and greatly expands the number of supported data set types. Trees can be interactively manipulated and edited. A free personal account system is available, providing management and sharing of trees in user defined workspaces and projects. Export to various bitmap and vector graphics formats is supported. Batch access interface is available for programmatic access or inclusion of interactive trees into other web services.

  18. Tree Tectonics

    NASA Astrophysics Data System (ADS)

    Vogt, Peter R.

    2004-09-01

    Nature often replicates her processes at different scales of space and time in differing media. Here a tree-trunk cross section I am preparing for a dendrochronological display at the Battle Creek Cypress Swamp Nature Sanctuary (Calvert County, Maryland) dried and cracked in a way that replicates practically all the planform features found along the Mid-Oceanic Ridge (see Figure 1). The left-lateral offset of saw marks, contrasting with the right-lateral ``rift'' offset, even illustrates the distinction between transcurrent (strike-slip) and transform faults, the latter only recognized as a geologic feature, by J. Tuzo Wilson, in 1965. However, wood cracking is but one of many examples of natural processes that replicate one or several elements of lithospheric plate tectonics. Many of these examples occur in everyday venues and thus make great teaching aids, ``teachable'' from primary school to university levels. Plate tectonics, the dominant process of Earth geology, also occurs in miniature on the surface of some lava lakes, and as ``ice plate tectonics'' on our frozen seas and lakes. Ice tectonics also happens at larger spatial and temporal scales on the Jovian moons Europa and perhaps Ganymede. Tabletop plate tectonics, in which a molten-paraffin ``asthenosphere'' is surfaced by a skin of congealing wax ``plates,'' first replicated Mid-Oceanic Ridge type seafloor spreading more than three decades ago. A seismologist (J. Brune, personal communication, 2004) discovered wax plate tectonics by casually and serendipitously pulling a stick across a container of molten wax his wife and daughters had used in making candles. Brune and his student D. Oldenburg followed up and mirabile dictu published the results in Science (178, 301-304).

  19. Evaluating differential effects using regression interactions and regression mixture models

    PubMed Central

    Van Horn, M. Lee; Jaki, Thomas; Masyn, Katherine; Howe, George; Feaster, Daniel J.; Lamont, Andrea E.; George, Melissa R. W.; Kim, Minjung

    2015-01-01

    Research increasingly emphasizes understanding differential effects. This paper focuses on understanding regression mixture models, a relatively new statistical methods for assessing differential effects by comparing results to using an interactive term in linear regression. The research questions which each model answers, their formulation, and their assumptions are compared using Monte Carlo simulations and real data analysis. The capabilities of regression mixture models are described and specific issues to be addressed when conducting regression mixtures are proposed. The paper aims to clarify the role that regression mixtures can take in the estimation of differential effects and increase awareness of the benefits and potential pitfalls of this approach. Regression mixture models are shown to be a potentially effective exploratory method for finding differential effects when these effects can be defined by a small number of classes of respondents who share a typical relationship between a predictor and an outcome. It is also shown that the comparison between regression mixture models and interactions becomes substantially more complex as the number of classes increases. It is argued that regression interactions are well suited for direct tests of specific hypotheses about differential effects and regression mixtures provide a useful approach for exploring effect heterogeneity given adequate samples and study design. PMID:26556903

  20. The Needs of Trees

    ERIC Educational Resources Information Center

    Boyd, Amy E.; Cooper, Jim

    2004-01-01

    Tree rings can be used not only to look at plant growth, but also to make connections between plant growth and resource availability. In this lesson, students in 2nd-4th grades use role-play to become familiar with basic requirements of trees and how availability of those resources is related to tree ring sizes and tree growth. These concepts can…

  1. Large Unbalanced Credit Scoring Using Lasso-Logistic Regression Ensemble

    PubMed Central

    Wang, Hong; Xu, Qingsong; Zhou, Lifeng

    2015-01-01

    Recently, various ensemble learning methods with different base classifiers have been proposed for credit scoring problems. However, for various reasons, there has been little research using logistic regression as the base classifier. In this paper, given large unbalanced data, we consider the plausibility of ensemble learning using regularized logistic regression as the base classifier to deal with credit scoring problems. In this research, the data is first balanced and diversified by clustering and bagging algorithms. Then we apply a Lasso-logistic regression learning ensemble to evaluate the credit risks. We show that the proposed algorithm outperforms popular credit scoring models such as decision tree, Lasso-logistic regression and random forests in terms of AUC and F-measure. We also provide two importance measures for the proposed model to identify important variables in the data. PMID:25706988

  2. Large unbalanced credit scoring using Lasso-logistic regression ensemble.

    PubMed

    Wang, Hong; Xu, Qingsong; Zhou, Lifeng

    2015-01-01

    Recently, various ensemble learning methods with different base classifiers have been proposed for credit scoring problems. However, for various reasons, there has been little research using logistic regression as the base classifier. In this paper, given large unbalanced data, we consider the plausibility of ensemble learning using regularized logistic regression as the base classifier to deal with credit scoring problems. In this research, the data is first balanced and diversified by clustering and bagging algorithms. Then we apply a Lasso-logistic regression learning ensemble to evaluate the credit risks. We show that the proposed algorithm outperforms popular credit scoring models such as decision tree, Lasso-logistic regression and random forests in terms of AUC and F-measure. We also provide two importance measures for the proposed model to identify important variables in the data.

  3. Does Gene Tree Discordance Explain the Mismatch between Macroevolutionary Models and Empirical Patterns of Tree Shape and Branching Times?

    PubMed Central

    Stadler, Tanja; Degnan, James H.; Rosenberg, Noah A.

    2016-01-01

    Classic null models for speciation and extinction give rise to phylogenies that differ in distribution from empirical phylogenies. In particular, empirical phylogenies are less balanced and have branching times closer to the root compared to phylogenies predicted by common null models. This difference might be due to null models of the speciation and extinction process being too simplistic, or due to the empirical datasets not being representative of random phylogenies. A third possibility arises because phylogenetic reconstruction methods often infer gene trees rather than species trees, producing an incongruity between models that predict species tree patterns and empirical analyses that consider gene trees. We investigate the extent to which the difference between gene trees and species trees under a combined birth–death and multispecies coalescent model can explain the difference in empirical trees and birth–death species trees. We simulate gene trees embedded in simulated species trees and investigate their difference with respect to tree balance and branching times. We observe that the gene trees are less balanced and typically have branching times closer to the root than the species trees. Empirical trees from TreeBase are also less balanced than our simulated species trees, and model gene trees can explain an imbalance increase of up to 8% compared to species trees. However, we see a much larger imbalance increase in empirical trees, about 100%, meaning that additional features must also be causing imbalance in empirical trees. This simulation study highlights the necessity of revisiting the assumptions made in phylogenetic analyses, as these assumptions, such as equating the gene tree with the species tree, might lead to a biased conclusion. PMID:26968785

  4. Does Gene Tree Discordance Explain the Mismatch between Macroevolutionary Models and Empirical Patterns of Tree Shape and Branching Times?

    PubMed

    Stadler, Tanja; Degnan, James H; Rosenberg, Noah A

    2016-07-01

    Classic null models for speciation and extinction give rise to phylogenies that differ in distribution from empirical phylogenies. In particular, empirical phylogenies are less balanced and have branching times closer to the root compared to phylogenies predicted by common null models. This difference might be due to null models of the speciation and extinction process being too simplistic, or due to the empirical datasets not being representative of random phylogenies. A third possibility arises because phylogenetic reconstruction methods often infer gene trees rather than species trees, producing an incongruity between models that predict species tree patterns and empirical analyses that consider gene trees. We investigate the extent to which the difference between gene trees and species trees under a combined birth-death and multispecies coalescent model can explain the difference in empirical trees and birth-death species trees. We simulate gene trees embedded in simulated species trees and investigate their difference with respect to tree balance and branching times. We observe that the gene trees are less balanced and typically have branching times closer to the root than the species trees. Empirical trees from TreeBase are also less balanced than our simulated species trees, and model gene trees can explain an imbalance increase of up to 8% compared to species trees. However, we see a much larger imbalance increase in empirical trees, about 100%, meaning that additional features must also be causing imbalance in empirical trees. This simulation study highlights the necessity of revisiting the assumptions made in phylogenetic analyses, as these assumptions, such as equating the gene tree with the species tree, might lead to a biased conclusion. PMID:26968785

  5. Censored partial regression.

    PubMed

    Orbe, Jesus; Ferreira, Eva; Núñez-Antón, Vicente

    2003-01-01

    In this work we study the effect of several covariates on a censored response variable with unknown probability distribution. A semiparametric model is proposed to consider situations where the functional form of the effect of one or more covariates is unknown, as is the case in the application presented in this work. We provide its estimation procedure and, in addition, a bootstrap technique to make inference on the parameters. A simulation study has been carried out to show the good performance of the proposed estimation process and to analyse the effect of the censorship. Finally, we present the results when the methodology is applied to AIDS diagnosed patients.

  6. Support vector machines for classification and regression.

    PubMed

    Brereton, Richard G; Lloyd, Gavin R

    2010-02-01

    The increasing interest in Support Vector Machines (SVMs) over the past 15 years is described. Methods are illustrated using simulated case studies, and 4 experimental case studies, namely mass spectrometry for studying pollution, near infrared analysis of food, thermal analysis of polymers and UV/visible spectroscopy of polyaromatic hydrocarbons. The basis of SVMs as two-class classifiers is shown with extensive visualisation, including learning machines, kernels and penalty functions. The influence of the penalty error and radial basis function radius on the model is illustrated. Multiclass implementations including one vs. all, one vs. one, fuzzy rules and Directed Acyclic Graph (DAG) trees are described. One-class Support Vector Domain Description (SVDD) is described and contrasted to conventional two- or multi-class classifiers. The use of Support Vector Regression (SVR) is illustrated including its application to multivariate calibration, and why it is useful when there are outliers and non-linearities.

  7. Linear regression in astronomy. II

    NASA Technical Reports Server (NTRS)

    Feigelson, Eric D.; Babu, Gutti J.

    1992-01-01

    A wide variety of least-squares linear regression procedures used in observational astronomy, particularly investigations of the cosmic distance scale, are presented and discussed. The classes of linear models considered are (1) unweighted regression lines, with bootstrap and jackknife resampling; (2) regression solutions when measurement error, in one or both variables, dominates the scatter; (3) methods to apply a calibration line to new data; (4) truncated regression models, which apply to flux-limited data sets; and (5) censored regression models, which apply when nondetections are present. For the calibration problem we develop two new procedures: a formula for the intercept offset between two parallel data sets, which propagates slope errors from one regression to the other; and a generalization of the Working-Hotelling confidence bands to nonstandard least-squares lines. They can provide improved error analysis for Faber-Jackson, Tully-Fisher, and similar cosmic distance scale relations.

  8. Quantile regression for climate data

    NASA Astrophysics Data System (ADS)

    Marasinghe, Dilhani Shalika

    Quantile regression is a developing statistical tool which is used to explain the relationship between response and predictor variables. This thesis describes two examples of climatology using quantile regression.Our main goal is to estimate derivatives of a conditional mean and/or conditional quantile function. We introduce a method to handle autocorrelation in the framework of quantile regression and used it with the temperature data. Also we explain some properties of the tornado data which is non-normally distributed. Even though quantile regression provides a more comprehensive view, when talking about residuals with the normality and the constant variance assumption, we would prefer least square regression for our temperature analysis. When dealing with the non-normality and non constant variance assumption, quantile regression is a better candidate for the estimation of the derivative.

  9. A New GP Recombination Method Using Random Tree Sampling

    NASA Astrophysics Data System (ADS)

    Tanji, Makoto; Iba, Hitoshi

    We propose a new program evolution method named PORTS (Program Optimization by Random Tree Sampling) which is motivated by the idea of preservation and control of tree fragments in GP (Genetic Programming). We assume that to recombine genetic materials efficiently, tree fragments of any size should be preserved into the next generation. PORTS samples tree fragments and concatenates them by traversing and transitioning between promising trees instead of using subtree crossover and mutation. Because the size of a fragment preserved during a generation update follows a geometric distribution, merits of the method are that it is relatively easy to predict the behavior of tree fragments over time and to control sampling size, by changing a single parameter. From experimental results on RoyalTree, Symbolic Regression and 6-Multiplexer problem, we observed that the performance of PORTS is competitive with Simple GP. Furthermore, the average node size of optimal solutions obtained by PORTS was simple than Simple GP's result.

  10. Decision Tree Modeling for Ranking Data

    NASA Astrophysics Data System (ADS)

    Yu, Philip L. H.; Wan, Wai Ming; Lee, Paul H.

    Ranking/preference data arises from many applications in marketing, psychology, and politics. We establish a new decision tree model for the analysis of ranking data by adopting the concept of classification and regression tree. The existing splitting criteria are modified in a way that allows them to precisely measure the impurity of a set of ranking data. Two types of impurity measures for ranking data are introduced, namelyg-wise and top-k measures. Theoretical results show that the new measures exhibit properties of impurity functions. In model assessment, the area under the ROC curve (AUC) is applied to evaluate the tree performance. Experiments are carried out to investigate the predictive performance of the tree model for complete and partially ranked data and promising results are obtained. Finally, a real-world application of the proposed methodology to analyze a set of political rankings data is presented.

  11. Evaluating Differential Effects Using Regression Interactions and Regression Mixture Models

    ERIC Educational Resources Information Center

    Van Horn, M. Lee; Jaki, Thomas; Masyn, Katherine; Howe, George; Feaster, Daniel J.; Lamont, Andrea E.; George, Melissa R. W.; Kim, Minjung

    2015-01-01

    Research increasingly emphasizes understanding differential effects. This article focuses on understanding regression mixture models, which are relatively new statistical methods for assessing differential effects by comparing results to using an interactive term in linear regression. The research questions which each model answers, their…

  12. Retro-regression--another important multivariate regression improvement.

    PubMed

    Randić, M

    2001-01-01

    We review the serious problem associated with instabilities of the coefficients of regression equations, referred to as the MRA (multivariate regression analysis) "nightmare of the first kind". This is manifested when in a stepwise regression a descriptor is included or excluded from a regression. The consequence is an unpredictable change of the coefficients of the descriptors that remain in the regression equation. We follow with consideration of an even more serious problem, referred to as the MRA "nightmare of the second kind", arising when optimal descriptors are selected from a large pool of descriptors. This process typically causes at different steps of the stepwise regression a replacement of several previously used descriptors by new ones. We describe a procedure that resolves these difficulties. The approach is illustrated on boiling points of nonanes which are considered (1) by using an ordered connectivity basis; (2) by using an ordering resulting from application of greedy algorithm; and (3) by using an ordering derived from an exhaustive search for optimal descriptors. A novel variant of multiple regression analysis, called retro-regression (RR), is outlined showing how it resolves the ambiguities associated with both "nightmares" of the first and the second kind of MRA. PMID:11410035

  13. Trees Grow on Money: Urban Tree Canopy Cover and Environmental Justice

    PubMed Central

    Schwarz, Kirsten; Fragkias, Michail; Boone, Christopher G.; Zhou, Weiqi; McHale, Melissa; Grove, J. Morgan; O’Neil-Dunne, Jarlath; McFadden, Joseph P.; Buckley, Geoffrey L.; Childers, Dan; Ogden, Laura; Pincetl, Stephanie; Pataki, Diane; Whitmer, Ali; Cadenasso, Mary L.

    2015-01-01

    This study examines the distributional equity of urban tree canopy (UTC) cover for Baltimore, MD, Los Angeles, CA, New York, NY, Philadelphia, PA, Raleigh, NC, Sacramento, CA, and Washington, D.C. using high spatial resolution land cover data and census data. Data are analyzed at the Census Block Group levels using Spearman’s correlation, ordinary least squares regression (OLS), and a spatial autoregressive model (SAR). Across all cities there is a strong positive correlation between UTC cover and median household income. Negative correlations between race and UTC cover exist in bivariate models for some cities, but they are generally not observed using multivariate regressions that include additional variables on income, education, and housing age. SAR models result in higher r-square values compared to the OLS models across all cities, suggesting that spatial autocorrelation is an important feature of our data. Similarities among cities can be found based on shared characteristics of climate, race/ethnicity, and size. Our findings suggest that a suite of variables, including income, contribute to the distribution of UTC cover. These findings can help target simultaneous strategies for UTC goals and environmental justice concerns. PMID:25830303

  14. Trees grow on money: urban tree canopy cover and environmental justice.

    PubMed

    Schwarz, Kirsten; Fragkias, Michail; Boone, Christopher G; Zhou, Weiqi; McHale, Melissa; Grove, J Morgan; O'Neil-Dunne, Jarlath; McFadden, Joseph P; Buckley, Geoffrey L; Childers, Dan; Ogden, Laura; Pincetl, Stephanie; Pataki, Diane; Whitmer, Ali; Cadenasso, Mary L

    2015-01-01

    This study examines the distributional equity of urban tree canopy (UTC) cover for Baltimore, MD, Los Angeles, CA, New York, NY, Philadelphia, PA, Raleigh, NC, Sacramento, CA, and Washington, D.C. using high spatial resolution land cover data and census data. Data are analyzed at the Census Block Group levels using Spearman's correlation, ordinary least squares regression (OLS), and a spatial autoregressive model (SAR). Across all cities there is a strong positive correlation between UTC cover and median household income. Negative correlations between race and UTC cover exist in bivariate models for some cities, but they are generally not observed using multivariate regressions that include additional variables on income, education, and housing age. SAR models result in higher r-square values compared to the OLS models across all cities, suggesting that spatial autocorrelation is an important feature of our data. Similarities among cities can be found based on shared characteristics of climate, race/ethnicity, and size. Our findings suggest that a suite of variables, including income, contribute to the distribution of UTC cover. These findings can help target simultaneous strategies for UTC goals and environmental justice concerns.

  15. Precision Efficacy Analysis for Regression.

    ERIC Educational Resources Information Center

    Brooks, Gordon P.

    When multiple linear regression is used to develop a prediction model, sample size must be large enough to ensure stable coefficients. If the derivation sample size is inadequate, the model may not predict well for future subjects. The precision efficacy analysis for regression (PEAR) method uses a cross- validity approach to select sample sizes…

  16. Can luteal regression be reversed?

    PubMed Central

    Telleria, Carlos M

    2006-01-01

    The corpus luteum is an endocrine gland whose limited lifespan is hormonally programmed. This debate article summarizes findings of our research group that challenge the principle that the end of function of the corpus luteum or luteal regression, once triggered, cannot be reversed. Overturning luteal regression by pharmacological manipulations may be of critical significance in designing strategies to improve fertility efficacy. PMID:17074090

  17. Logistic Regression: Concept and Application

    ERIC Educational Resources Information Center

    Cokluk, Omay

    2010-01-01

    The main focus of logistic regression analysis is classification of individuals in different groups. The aim of the present study is to explain basic concepts and processes of binary logistic regression analysis intended to determine the combination of independent variables which best explain the membership in certain groups called dichotomous…

  18. Wild bootstrap for quantile regression.

    PubMed

    Feng, Xingdong; He, Xuming; Hu, Jianhua

    2011-12-01

    The existing theory of the wild bootstrap has focused on linear estimators. In this note, we broaden its validity by providing a class of weight distributions that is asymptotically valid for quantile regression estimators. As most weight distributions in the literature lead to biased variance estimates for nonlinear estimators of linear regression, we propose a modification of the wild bootstrap that admits a broader class of weight distributions for quantile regression. A simulation study on median regression is carried out to compare various bootstrap methods. With a simple finite-sample correction, the wild bootstrap is shown to account for general forms of heteroscedasticity in a regression model with fixed design points. PMID:23049133

  19. [Regression grading in gastrointestinal tumors].

    PubMed

    Tischoff, I; Tannapfel, A

    2012-02-01

    Preoperative neoadjuvant chemoradiation therapy is a well-established and essential part of the interdisciplinary treatment of gastrointestinal tumors. Neoadjuvant treatment leads to regressive changes in tumors. To evaluate the histological tumor response different scoring systems describing regressive changes are used and known as tumor regression grading. Tumor regression grading is usually based on the presence of residual vital tumor cells in proportion to the total tumor size. Currently, no nationally or internationally accepted grading systems exist. In general, common guidelines should be used in the pathohistological diagnostics of tumors after neoadjuvant therapy. In particularly, the standard tumor grading will be replaced by tumor regression grading. Furthermore, tumors after neoadjuvant treatment are marked with the prefix "y" in the TNM classification. PMID:22293790

  20. Fungible weights in logistic regression.

    PubMed

    Jones, Jeff A; Waller, Niels G

    2016-06-01

    In this article we develop methods for assessing parameter sensitivity in logistic regression models. To set the stage for this work, we first review Waller's (2008) equations for computing fungible weights in linear regression. Next, we describe 2 methods for computing fungible weights in logistic regression. To demonstrate the utility of these methods, we compute fungible logistic regression weights using data from the Centers for Disease Control and Prevention's (2010) Youth Risk Behavior Surveillance Survey, and we illustrate how these alternate weights can be used to evaluate parameter sensitivity. To make our work accessible to the research community, we provide R code (R Core Team, 2015) that will generate both kinds of fungible logistic regression weights. (PsycINFO Database Record

  1. Categorizing Ideas about Trees: A Tree of Trees

    PubMed Central

    Fisler, Marie; Lecointre, Guillaume

    2013-01-01

    The aim of this study is to explore whether matrices and MP trees used to produce systematic categories of organisms could be useful to produce categories of ideas in history of science. We study the history of the use of trees in systematics to represent the diversity of life from 1766 to 1991. We apply to those ideas a method inspired from coding homologous parts of organisms. We discretize conceptual parts of ideas, writings and drawings about trees contained in 41 main writings; we detect shared parts among authors and code them into a 91-characters matrix and use a tree representation to show who shares what with whom. In other words, we propose a hierarchical representation of the shared ideas about trees among authors: this produces a “tree of trees.” Then, we categorize schools of tree-representations. Classical schools like “cladists” and “pheneticists” are recovered but others are not: “gradists” are separated into two blocks, one of them being called here “grade theoreticians.” We propose new interesting categories like the “buffonian school,” the “metaphoricians,” and those using “strictly genealogical classifications.” We consider that networks are not useful to represent shared ideas at the present step of the study. A cladogram is made for showing who is sharing what with whom, but also heterobathmy and homoplasy of characters. The present cladogram is not modelling processes of transmission of ideas about trees, and here it is mostly used to test for proximity of ideas of the same age and for categorization. PMID:23950877

  2. Forest Management Intensity Affects Aquatic Communities in Artificial Tree Holes

    PubMed Central

    Petermann, Jana S.; Rohland, Anja; Sichardt, Nora; Lade, Peggy; Guidetti, Brenda; Weisser, Wolfgang W.; Gossner, Martin M.

    2016-01-01

    Forest management could potentially affect organisms in all forest habitats. However, aquatic communities in water-filled tree-holes may be especially sensitive because of small population sizes, the risk of drought and potential dispersal limitation. We set up artificial tree holes in forest stands subject to different management intensities in two regions in Germany and assessed the influence of local environmental properties (tree-hole opening type, tree diameter, water volume and water temperature) as well as regional drivers (forest management intensity, tree-hole density) on tree-hole insect communities (not considering other organisms such as nematodes or rotifers), detritus content, oxygen and nutrient concentrations. In addition, we compared data from artificial tree holes with data from natural tree holes in the same area to evaluate the methodological approach of using tree-hole analogues. We found that forest management had strong effects on communities in artificial tree holes in both regions and across the season. Abundance and species richness declined, community composition shifted and detritus content declined with increasing forest management intensity. Environmental variables, such as tree-hole density and tree diameter partly explained these changes. However, dispersal limitation, indicated by effects of tree-hole density, generally showed rather weak impacts on communities. Artificial tree holes had higher water temperatures (on average 2°C higher) and oxygen concentrations (on average 25% higher) than natural tree holes. The abundance of organisms was higher but species richness was lower in artificial tree holes. Community composition differed between artificial and natural tree holes. Negative management effects were detectable in both tree-hole systems, despite their abiotic and biotic differences. Our results indicate that forest management has substantial and pervasive effects on tree-hole communities and may alter their structure and

  3. Forest Management Intensity Affects Aquatic Communities in Artificial Tree Holes.

    PubMed

    Petermann, Jana S; Rohland, Anja; Sichardt, Nora; Lade, Peggy; Guidetti, Brenda; Weisser, Wolfgang W; Gossner, Martin M

    2016-01-01

    Forest management could potentially affect organisms in all forest habitats. However, aquatic communities in water-filled tree-holes may be especially sensitive because of small population sizes, the risk of drought and potential dispersal limitation. We set up artificial tree holes in forest stands subject to different management intensities in two regions in Germany and assessed the influence of local environmental properties (tree-hole opening type, tree diameter, water volume and water temperature) as well as regional drivers (forest management intensity, tree-hole density) on tree-hole insect communities (not considering other organisms such as nematodes or rotifers), detritus content, oxygen and nutrient concentrations. In addition, we compared data from artificial tree holes with data from natural tree holes in the same area to evaluate the methodological approach of using tree-hole analogues. We found that forest management had strong effects on communities in artificial tree holes in both regions and across the season. Abundance and species richness declined, community composition shifted and detritus content declined with increasing forest management intensity. Environmental variables, such as tree-hole density and tree diameter partly explained these changes. However, dispersal limitation, indicated by effects of tree-hole density, generally showed rather weak impacts on communities. Artificial tree holes had higher water temperatures (on average 2°C higher) and oxygen concentrations (on average 25% higher) than natural tree holes. The abundance of organisms was higher but species richness was lower in artificial tree holes. Community composition differed between artificial and natural tree holes. Negative management effects were detectable in both tree-hole systems, despite their abiotic and biotic differences. Our results indicate that forest management has substantial and pervasive effects on tree-hole communities and may alter their structure and

  4. Forest Management Intensity Affects Aquatic Communities in Artificial Tree Holes.

    PubMed

    Petermann, Jana S; Rohland, Anja; Sichardt, Nora; Lade, Peggy; Guidetti, Brenda; Weisser, Wolfgang W; Gossner, Martin M

    2016-01-01

    Forest management could potentially affect organisms in all forest habitats. However, aquatic communities in water-filled tree-holes may be especially sensitive because of small population sizes, the risk of drought and potential dispersal limitation. We set up artificial tree holes in forest stands subject to different management intensities in two regions in Germany and assessed the influence of local environmental properties (tree-hole opening type, tree diameter, water volume and water temperature) as well as regional drivers (forest management intensity, tree-hole density) on tree-hole insect communities (not considering other organisms such as nematodes or rotifers), detritus content, oxygen and nutrient concentrations. In addition, we compared data from artificial tree holes with data from natural tree holes in the same area to evaluate the methodological approach of using tree-hole analogues. We found that forest management had strong effects on communities in artificial tree holes in both regions and across the season. Abundance and species richness declined, community composition shifted and detritus content declined with increasing forest management intensity. Environmental variables, such as tree-hole density and tree diameter partly explained these changes. However, dispersal limitation, indicated by effects of tree-hole density, generally showed rather weak impacts on communities. Artificial tree holes had higher water temperatures (on average 2°C higher) and oxygen concentrations (on average 25% higher) than natural tree holes. The abundance of organisms was higher but species richness was lower in artificial tree holes. Community composition differed between artificial and natural tree holes. Negative management effects were detectable in both tree-hole systems, despite their abiotic and biotic differences. Our results indicate that forest management has substantial and pervasive effects on tree-hole communities and may alter their structure and

  5. Chem-Is-Tree.

    ERIC Educational Resources Information Center

    Barry, Dana M.

    1997-01-01

    Provides details on the chemical composition of trees including a definition of wood. Also includes an activity on anthocyanins as well as a discussion of the resistance of wood to solvents and chemicals. Lists interesting products from trees. (DDR)

  6. Tree Classification Software

    NASA Technical Reports Server (NTRS)

    Buntine, Wray

    1993-01-01

    This paper introduces the IND Tree Package to prospective users. IND does supervised learning using classification trees. This learning task is a basic tool used in the development of diagnosis, monitoring and expert systems. The IND Tree Package was developed as part of a NASA project to semi-automate the development of data analysis and modelling algorithms using artificial intelligence techniques. The IND Tree Package integrates features from CART and C4 with newer Bayesian and minimum encoding methods for growing classification trees and graphs. The IND Tree Package also provides an experimental control suite on top. The newer features give improved probability estimates often required in diagnostic and screening tasks. The package comes with a manual, Unix 'man' entries, and a guide to tree methods and research. The IND Tree Package is implemented in C under Unix and was beta-tested at university and commercial research laboratories in the United States.

  7. Category of trees in representation theory of quantum algebras

    SciTech Connect

    Moskaliuk, N. M.; Moskaliuk, S. S.

    2013-10-15

    New applications of categorical methods are connected with new additional structures on categories. One of such structures in representation theory of quantum algebras, the category of Kuznetsov-Smorodinsky-Vilenkin-Smirnov (KSVS) trees, is constructed, whose objects are finite rooted KSVS trees and morphisms generated by the transition from a KSVS tree to another one.

  8. Practical Session: Simple Linear Regression

    NASA Astrophysics Data System (ADS)

    Clausel, M.; Grégoire, G.

    2014-12-01

    Two exercises are proposed to illustrate the simple linear regression. The first one is based on the famous Galton's data set on heredity. We use the lm R command and get coefficients estimates, standard error of the error, R2, residuals …In the second example, devoted to data related to the vapor tension of mercury, we fit a simple linear regression, predict values, and anticipate on multiple linear regression. This pratical session is an excerpt from practical exercises proposed by A. Dalalyan at EPNC (see Exercises 1 and 2 of http://certis.enpc.fr/~dalalyan/Download/TP_ENPC_4.pdf).

  9. A tutorial on Bayesian Normal linear regression

    NASA Astrophysics Data System (ADS)

    Klauenberg, Katy; Wübbeler, Gerd; Mickan, Bodo; Harris, Peter; Elster, Clemens

    2015-12-01

    Regression is a common task in metrology and often applied to calibrate instruments, evaluate inter-laboratory comparisons or determine fundamental constants, for example. Yet, a regression model cannot be uniquely formulated as a measurement function, and consequently the Guide to the Expression of Uncertainty in Measurement (GUM) and its supplements are not applicable directly. Bayesian inference, however, is well suited to regression tasks, and has the advantage of accounting for additional a priori information, which typically robustifies analyses. Furthermore, it is anticipated that future revisions of the GUM shall also embrace the Bayesian view. Guidance on Bayesian inference for regression tasks is largely lacking in metrology. For linear regression models with Gaussian measurement errors this tutorial gives explicit guidance. Divided into three steps, the tutorial first illustrates how a priori knowledge, which is available from previous experiments, can be translated into prior distributions from a specific class. These prior distributions have the advantage of yielding analytical, closed form results, thus avoiding the need to apply numerical methods such as Markov Chain Monte Carlo. Secondly, formulas for the posterior results are given, explained and illustrated, and software implementations are provided. In the third step, Bayesian tools are used to assess the assumptions behind the suggested approach. These three steps (prior elicitation, posterior calculation, and robustness to prior uncertainty and model adequacy) are critical to Bayesian inference. The general guidance given here for Normal linear regression tasks is accompanied by a simple, but real-world, metrological example. The calibration of a flow device serves as a running example and illustrates the three steps. It is shown that prior knowledge from previous calibrations of the same sonic nozzle enables robust predictions even for extrapolations.

  10. Modeling confounding by half-sibling regression.

    PubMed

    Schölkopf, Bernhard; Hogg, David W; Wang, Dun; Foreman-Mackey, Daniel; Janzing, Dominik; Simon-Gabriel, Carl-Johann; Peters, Jonas

    2016-07-01

    We describe a method for removing the effect of confounders to reconstruct a latent quantity of interest. The method, referred to as "half-sibling regression," is inspired by recent work in causal inference using additive noise models. We provide a theoretical justification, discussing both independent and identically distributed as well as time series data, respectively, and illustrate the potential of the method in a challenging astronomy application. PMID:27382154

  11. Modeling confounding by half-sibling regression.

    PubMed

    Schölkopf, Bernhard; Hogg, David W; Wang, Dun; Foreman-Mackey, Daniel; Janzing, Dominik; Simon-Gabriel, Carl-Johann; Peters, Jonas

    2016-07-01

    We describe a method for removing the effect of confounders to reconstruct a latent quantity of interest. The method, referred to as "half-sibling regression," is inspired by recent work in causal inference using additive noise models. We provide a theoretical justification, discussing both independent and identically distributed as well as time series data, respectively, and illustrate the potential of the method in a challenging astronomy application.

  12. Modeling confounding by half-sibling regression

    PubMed Central

    Schölkopf, Bernhard; Hogg, David W.; Wang, Dun; Foreman-Mackey, Daniel; Janzing, Dominik; Simon-Gabriel, Carl-Johann; Peters, Jonas

    2016-01-01

    We describe a method for removing the effect of confounders to reconstruct a latent quantity of interest. The method, referred to as “half-sibling regression,” is inspired by recent work in causal inference using additive noise models. We provide a theoretical justification, discussing both independent and identically distributed as well as time series data, respectively, and illustrate the potential of the method in a challenging astronomy application. PMID:27382154

  13. Carbon dynamics in trees: feast or famine?

    PubMed

    Sala, Anna; Woodruff, David R; Meinzer, Frederick C

    2012-06-01

    Research on the degree to which carbon (C) availability limits growth in trees, as well as recent trends in climate change and concurrent increases in drought-related tree mortality, have led to a renewed focus on the physiological mechanisms associated with tree growth responses to current and future climate. This has led to some dispute over the role of stored non-structural C compounds as indicators of a tree's current demands for photosynthate. Much of the uncertainty surrounding this issue could be resolved by developing a better understanding of the potential functions of non-structural C stored within trees. In addition to functioning as a buffer to reconcile temporal asynchrony between C demand and supply, the storage of non-structural C compounds may be under greater regulation than commonly recognized. We propose that in the face of environmental stochasticity, large, long-lived trees may require larger C investments in storage pools as safety margins than previously recognized, and that an important function of these pools may be to maintain hydraulic transport, particularly during episodes of severe stress. If so, survival and long-term growth in trees remain a function of C availability. Given that drought, freeze-thaw events and increasing tree height all impose additional constraints on vascular transport, the common trend of an increase in non-structural carbohydrate concentrations with tree size, drought or cold is consistent with our hypothesis. If the regulated maintenance of relatively large constitutive stored C pools in trees serves to maintain hydraulic integrity, then the minimum thresholds are expected to vary depending on the specific tissues, species, environment, growth form and habit. Much research is needed to elucidate the extent to which allocation of C to storage in trees is a passive vs. an active process, the specific functions of stored C pools, and the factors that drive active C allocation to storage. PMID:22302370

  14. Carbon dynamics in trees: feast or famine?

    PubMed

    Sala, Anna; Woodruff, David R; Meinzer, Frederick C

    2012-06-01

    Research on the degree to which carbon (C) availability limits growth in trees, as well as recent trends in climate change and concurrent increases in drought-related tree mortality, have led to a renewed focus on the physiological mechanisms associated with tree growth responses to current and future climate. This has led to some dispute over the role of stored non-structural C compounds as indicators of a tree's current demands for photosynthate. Much of the uncertainty surrounding this issue could be resolved by developing a better understanding of the potential functions of non-structural C stored within trees. In addition to functioning as a buffer to reconcile temporal asynchrony between C demand and supply, the storage of non-structural C compounds may be under greater regulation than commonly recognized. We propose that in the face of environmental stochasticity, large, long-lived trees may require larger C investments in storage pools as safety margins than previously recognized, and that an important function of these pools may be to maintain hydraulic transport, particularly during episodes of severe stress. If so, survival and long-term growth in trees remain a function of C availability. Given that drought, freeze-thaw events and increasing tree height all impose additional constraints on vascular transport, the common trend of an increase in non-structural carbohydrate concentrations with tree size, drought or cold is consistent with our hypothesis. If the regulated maintenance of relatively large constitutive stored C pools in trees serves to maintain hydraulic integrity, then the minimum thresholds are expected to vary depending on the specific tissues, species, environment, growth form and habit. Much research is needed to elucidate the extent to which allocation of C to storage in trees is a passive vs. an active process, the specific functions of stored C pools, and the factors that drive active C allocation to storage.

  15. Illumination Under Trees

    SciTech Connect

    Max, N

    2002-08-19

    This paper is a survey of the author's work on illumination and shadows under trees, including the effects of sky illumination, sun penumbras, scattering in a misty atmosphere below the trees, and multiple scattering and transmission between leaves. It also describes a hierarchical image-based rendering method for trees.

  16. Winter Birch Trees

    ERIC Educational Resources Information Center

    Sweeney, Debra; Rounds, Judy

    2011-01-01

    Trees are great inspiration for artists. Many art teachers find themselves inspired and maybe somewhat obsessed with the natural beauty and elegance of the lofty tree, and how it changes through the seasons. One such tree that grows in several regions and always looks magnificent, regardless of the time of year, is the birch. In this article, the…

  17. Minnesota's Forest Trees. Revised.

    ERIC Educational Resources Information Center

    Miles, William R.; Fuller, Bruce L.

    This bulletin describes 46 of the more common trees found in Minnesota's forests and windbreaks. The bulletin contains two tree keys, a summer key and a winter key, to help the reader identify these trees. Besides the two keys, the bulletin includes an introduction, instructions for key use, illustrations of leaf characteristics and twig…

  18. The Wish Tree Project

    ERIC Educational Resources Information Center

    Brooks, Sarah DeWitt

    2010-01-01

    This article describes the author's experience in implementing a Wish Tree project in her school in an effort to bring the school community together with a positive art-making experience during a potentially stressful time. The concept of a wish tree is simple: plant a tree; provide tags and pencils for writing wishes; and encourage everyone to…

  19. Multiple Regression and Its Discontents

    ERIC Educational Resources Information Center

    Snell, Joel C.; Marsh, Mitchell

    2012-01-01

    Multiple regression is part of a larger statistical strategy originated by Gauss. The authors raise questions about the theory and suggest some changes that would make room for Mandelbrot and Serendipity.

  20. Regression methods for spatial data

    NASA Technical Reports Server (NTRS)

    Yakowitz, S. J.; Szidarovszky, F.

    1982-01-01

    The kriging approach, a parametric regression method used by hydrologists and mining engineers, among others also provides an error estimate the integral of the regression function. The kriging method is explored and some of its statistical characteristics are described. The Watson method and theory are extended so that the kriging features are displayed. Theoretical and computational comparisons of the kriging and Watson approaches are offered.

  1. Basis Selection for Wavelet Regression

    NASA Technical Reports Server (NTRS)

    Wheeler, Kevin R.; Lau, Sonie (Technical Monitor)

    1998-01-01

    A wavelet basis selection procedure is presented for wavelet regression. Both the basis and the threshold are selected using cross-validation. The method includes the capability of incorporating prior knowledge on the smoothness (or shape of the basis functions) into the basis selection procedure. The results of the method are demonstrated on sampled functions widely used in the wavelet regression literature. The results of the method are contrasted with other published methods.

  2. Regression Discontinuity Designs in Epidemiology

    PubMed Central

    Moscoe, Ellen; Mutevedzi, Portia; Newell, Marie-Louise; Bärnighausen, Till

    2014-01-01

    When patients receive an intervention based on whether they score below or above some threshold value on a continuously measured random variable, the intervention will be randomly assigned for patients close to the threshold. The regression discontinuity design exploits this fact to estimate causal treatment effects. In spite of its recent proliferation in economics, the regression discontinuity design has not been widely adopted in epidemiology. We describe regression discontinuity, its implementation, and the assumptions required for causal inference. We show that regression discontinuity is generalizable to the survival and nonlinear models that are mainstays of epidemiologic analysis. We then present an application of regression discontinuity to the much-debated epidemiologic question of when to start HIV patients on antiretroviral therapy. Using data from a large South African cohort (2007–2011), we estimate the causal effect of early versus deferred treatment eligibility on mortality. Patients whose first CD4 count was just below the 200 cells/μL CD4 count threshold had a 35% lower hazard of death (hazard ratio = 0.65 [95% confidence interval = 0.45–0.94]) than patients presenting with CD4 counts just above the threshold. We close by discussing the strengths and limitations of regression discontinuity designs for epidemiology. PMID:25061922

  3. Metrics on multilabeled trees: interrelationships and diameter bounds.

    PubMed

    Huber, Katharina T; Spillner, Andreas; Suchecki, Radosław; Moulton, Vincent

    2011-01-01

    Multilabeled trees or MUL-trees, for short, are trees whose leaves are labeled by elements of some nonempty finite set X such that more than one leaf may be labeled by the same element of X. This class of trees includes phylogenetic trees and tree shapes. MUL-trees arise naturally in, for example, biogeography and gene evolution studies and also in the area of phylogenetic network reconstruction. In this paper, we introduce novel metrics which may be used to compare MUL-trees, most of which generalize well-known metrics on phylogenetic trees and tree shapes. These metrics can be used, for example, to better understand the space of MUL-trees or to help visualize collections of MUL-trees. In addition, we describe some relationships between the MUL-tree metrics that we present and also give some novel diameter bounds for these metrics. We conclude by briefly discussing some open problems as well as pointing out how MUL-tree metrics may be used to define metrics on the space of phylogenetic networks.

  4. Methods for identifying SNP interactions: a review on variations of Logic Regression, Random Forest and Bayesian logistic regression.

    PubMed

    Chen, Carla Chia-Ming; Schwender, Holger; Keith, Jonathan; Nunkesser, Robin; Mengersen, Kerrie; Macrossan, Paula

    2011-01-01

    Due to advancements in computational ability, enhanced technology and a reduction in the price of genotyping, more data are being generated for understanding genetic associations with diseases and disorders. However, with the availability of large data sets comes the inherent challenges of new methods of statistical analysis and modeling. Considering a complex phenotype may be the effect of a combination of multiple loci, various statistical methods have been developed for identifying genetic epistasis effects. Among these methods, logic regression (LR) is an intriguing approach incorporating tree-like structures. Various methods have built on the original LR to improve different aspects of the model. In this study, we review four variations of LR, namely Logic Feature Selection, Monte Carlo Logic Regression, Genetic Programming for Association Studies, and Modified Logic Regression-Gene Expression Programming, and investigate the performance of each method using simulated and real genotype data. We contrast these with another tree-like approach, namely Random Forests, and a Bayesian logistic regression with stochastic search variable selection.

  5. A Spectrum Tree Kernel

    NASA Astrophysics Data System (ADS)

    Kuboyama, Tetsuji; Hirata, Kouichi; Kashima, Hisashi; F. Aoki-Kinoshita, Kiyoko; Yasuda, Hiroshi

    Learning from tree-structured data has received increasing interest with the rapid growth of tree-encodable data in the World Wide Web, in biology, and in other areas. Our kernel function measures the similarity between two trees by counting the number of shared sub-patterns called tree q-grams, and runs, in effect, in linear time with respect to the number of tree nodes. We apply our kernel function with a support vector machine (SVM) to classify biological data, the glycans of several blood components. The experimental results show that our kernel function performs as well as one exclusively tailored to glycan properties.

  6. Distributed Contour Trees

    SciTech Connect

    Morozov, Dmitriy; Weber, Gunther H.

    2014-03-31

    Topological techniques provide robust tools for data analysis. They are used, for example, for feature extraction, for data de-noising, and for comparison of data sets. This chapter concerns contour trees, a topological descriptor that records the connectivity of the isosurfaces of scalar functions. These trees are fundamental to analysis and visualization of physical phenomena modeled by real-valued measurements. We study the parallel analysis of contour trees. After describing a particular representation of a contour tree, called local{global representation, we illustrate how di erent problems that rely on contour trees can be solved in parallel with minimal communication.

  7. Trees, soils, and food security

    PubMed Central

    Sanchez, P. A.; Buresh, R. J.; Leakey, R. R. B.

    1997-01-01

    Trees have a different impact on soil properties than annual crops, because of their longer residence time, larger biomass accumulation, and longer-lasting, more extensive root systems. In natural forests nutrients are efficiently cycled with very small inputs and outputs from the system. In most agricultural systems the opposite happens. Agroforestry encompasses the continuum between these extremes, and emerging hard data is showing that successful agroforestry systems increase nutrient inputs, enhance internal flows, decrease nutrient losses and provide environmental benefits: when the competition for growth resources between the tree and the crop component is well managed. The three main determinants for overcoming rural poverty in Africa are (i) reversing soil fertility depletion, (ii) intensifying and diversifying land use with high-value products, and (iii) providing an enabling policy environment for the smallholder farming sector. Agroforestry practices can improve food production in a sustainable way through their contribution to soil fertility replenishment. The use of organic inputs as a source of biologically-fixed nitrogen, together with deep nitrate that is captured by trees, plays a major role in nitrogen replenishment. The combination of commercial phosphorus fertilizers with available organic resources may be the key to increasing and sustaining phosphorus capital. High-value trees, 'Cinderella' species, can fit in specific niches on farms, thereby making the system ecologically stable and more rewarding economically, in addition to diversifying and increasing rural incomes and improving food security. In the most heavily populated areas of East Africa, where farm size is extremely small, the number of trees on farms is increasing as farmers seek to reduce labour demands, compatible with the drift of some members of the family into the towns to earn off-farm income. Contrary to the concept that population pressure promotes deforestation, there is

  8. Growth of a Pine Tree

    ERIC Educational Resources Information Center

    Rollinson, Susan Wells

    2012-01-01

    The growth of a pine tree is examined by preparing "tree cookies" (cross-sectional disks) between whorls of branches. The use of Christmas trees allows the tree cookies to be obtained with inexpensive, commonly available tools. Students use the tree cookies to investigate the annual growth of the tree and how it corresponds to the number of whorls…

  9. Estimating peak flow characteristics at ungaged sites by ridge regression

    USGS Publications Warehouse

    Tasker, Gary D.

    1982-01-01

    A regression simulation model, is combined with a multisite streamflow generator to simulate a regional regression of 50-year peak discharge against a set of basin characteristics. Monte Carlo experiments are used to compare the unbiased ordinary lease squares parameter estimator with Hoerl and Kennard's (1970a) ridge estimator in which the biasing parameter is that proposed by Hoerl, Kennard, and Baldwin (1975). The simulation results indicate a substantial improvement in parameter estimation using ridge regression when the correlation between basin characteristics is more than about 0.90. In addition, results indicate a strong potential for improving the mean square error of prediction of a peak-flow characteristic versus basin characteristics regression model when the basin characteristics are approximately colinear. The simulation covers a range of regression parameters, streamflow statistics, and basin characteristics commonly found in regional regression studies.

  10. Food additives

    PubMed Central

    Spencer, Michael

    1974-01-01

    Food additives are discussed from the food technology point of view. The reasons for their use are summarized: (1) to protect food from chemical and microbiological attack; (2) to even out seasonal supplies; (3) to improve their eating quality; (4) to improve their nutritional value. The various types of food additives are considered, e.g. colours, flavours, emulsifiers, bread and flour additives, preservatives, and nutritional additives. The paper concludes with consideration of those circumstances in which the use of additives is (a) justified and (b) unjustified. PMID:4467857

  11. Controls on tree water uptake and information storage in tree rings

    NASA Astrophysics Data System (ADS)

    Blume, Theresa; Simard, Sonia; Heidbüchel, Ingo; Güntner, Andreas; Heinrich, Ingo

    2016-04-01

    Controls on tree water uptake are investigated in various forest stands in the northeastern German lowlands by a multi-method approach. This approach combines sapflow and dendrometer measurements as well as tree-ring analyses with soil moisture derived root water uptake rates. The latter method has the advantage that it provides depth distributions of root water uptake and thus additional information allowing for a more detailed analysis of the relationship between water availability and water uptake. High resolution climatic data makes it possible to investigate the site specific interplay between atmospheric demand and water availability on the one hand and tree response and adaptation on the other hand. The comparison of spatio-temporal patterns of these responses with concurrent tree growth as well as tree-ring analyses enables a first matching of actual and "archived" patterns and thus an estimate of how much of this information is stored in tree rings.

  12. IND - THE IND DECISION TREE PACKAGE

    NASA Technical Reports Server (NTRS)

    Buntine, W.

    1994-01-01

    A common approach to supervised classification and prediction in artificial intelligence and statistical pattern recognition is the use of decision trees. A tree is "grown" from data using a recursive partitioning algorithm to create a tree which has good prediction of classes on new data. Standard algorithms are CART (by Breiman Friedman, Olshen and Stone) and ID3 and its successor C4 (by Quinlan). As well as reimplementing parts of these algorithms and offering experimental control suites, IND also introduces Bayesian and MML methods and more sophisticated search in growing trees. These produce more accurate class probability estimates that are important in applications like diagnosis. IND is applicable to most data sets consisting of independent instances, each described by a fixed length vector of attribute values. An attribute value may be a number, one of a set of attribute specific symbols, or it may be omitted. One of the attributes is delegated the "target" and IND grows trees to predict the target. Prediction can then be done on new data or the decision tree printed out for inspection. IND provides a range of features and styles with convenience for the casual user as well as fine-tuning for the advanced user or those interested in research. IND can be operated in a CART-like mode (but without regression trees, surrogate splits or multivariate splits), and in a mode like the early version of C4. Advanced features allow more extensive search, interactive control and display of tree growing, and Bayesian and MML algorithms for tree pruning and smoothing. These often produce more accurate class probability estimates at the leaves. IND also comes with a comprehensive experimental control suite. IND consists of four basic kinds of routines: data manipulation routines, tree generation routines, tree testing routines, and tree display routines. The data manipulation routines are used to partition a single large data set into smaller training and test sets. The

  13. Functional Generalized Additive Models.

    PubMed

    McLean, Mathew W; Hooker, Giles; Staicu, Ana-Maria; Scheipl, Fabian; Ruppert, David

    2014-01-01

    We introduce the functional generalized additive model (FGAM), a novel regression model for association studies between a scalar response and a functional predictor. We model the link-transformed mean response as the integral with respect to t of F{X(t), t} where F(·,·) is an unknown regression function and X(t) is a functional covariate. Rather than having an additive model in a finite number of principal components as in Müller and Yao (2008), our model incorporates the functional predictor directly and thus our model can be viewed as the natural functional extension of generalized additive models. We estimate F(·,·) using tensor-product B-splines with roughness penalties. A pointwise quantile transformation of the functional predictor is also considered to ensure each tensor-product B-spline has observed data on its support. The methods are evaluated using simulated data and their predictive performance is compared with other competing scalar-on-function regression alternatives. We illustrate the usefulness of our approach through an application to brain tractography, where X(t) is a signal from diffusion tensor imaging at position, t, along a tract in the brain. In one example, the response is disease-status (case or control) and in a second example, it is the score on a cognitive test. R code for performing the simulations and fitting the FGAM can be found in supplemental materials available online.

  14. Interpretation of Standardized Regression Coefficients in Multiple Regression.

    ERIC Educational Resources Information Center

    Thayer, Jerome D.

    The extent to which standardized regression coefficients (beta values) can be used to determine the importance of a variable in an equation was explored. The beta value and the part correlation coefficient--also called the semi-partial correlation coefficient and reported in squared form as the incremental "r squared"--were compared for variables…

  15. Regressive Evolution in Astyanax Cavefish

    PubMed Central

    Jeffery, William R.

    2013-01-01

    A diverse group of animals, including members of most major phyla, have adapted to life in the perpetual darkness of caves. These animals are united by the convergence of two regressive phenotypes, loss of eyes and pigmentation. The mechanisms of regressive evolution are poorly understood. The teleost Astyanax mexicanus is of special significance in studies of regressive evolution in cave animals. This species includes an ancestral surface dwelling form and many con-specific cave-dwelling forms, some of which have evolved their recessive phenotypes independently. Recent advances in Astyanax development and genetics have provided new information about how eyes and pigment are lost during cavefish evolution; namely, they have revealed some of the molecular and cellular mechanisms involved in trait modification, the number and identity of the underlying genes and mutations, the molecular basis of parallel evolution, and the evolutionary forces driving adaptation to the cave environment. PMID:19640230

  16. Laplace regression with censored data.

    PubMed

    Bottai, Matteo; Zhang, Jiajia

    2010-08-01

    We consider a regression model where the error term is assumed to follow a type of asymmetric Laplace distribution. We explore its use in the estimation of conditional quantiles of a continuous outcome variable given a set of covariates in the presence of random censoring. Censoring may depend on covariates. Estimation of the regression coefficients is carried out by maximizing a non-differentiable likelihood function. In the scenarios considered in a simulation study, the Laplace estimator showed correct coverage and shorter computation time than the alternative methods considered, some of which occasionally failed to converge. We illustrate the use of Laplace regression with an application to survival time in patients with small cell lung cancer.

  17. Survival Data and Regression Models

    NASA Astrophysics Data System (ADS)

    Grégoire, G.

    2014-12-01

    We start this chapter by introducing some basic elements for the analysis of censored survival data. Then we focus on right censored data and develop two types of regression models. The first one concerns the so-called accelerated failure time models (AFT), which are parametric models where a function of a parameter depends linearly on the covariables. The second one is a semiparametric model, where the covariables enter in a multiplicative form in the expression of the hazard rate function. The main statistical tool for analysing these regression models is the maximum likelihood methodology and, in spite we recall some essential results about the ML theory, we refer to the chapter "Logistic Regression" for a more detailed presentation.

  18. Interquantile Shrinkage in Regression Models

    PubMed Central

    Jiang, Liewen; Wang, Huixia Judy; Bondell, Howard D.

    2012-01-01

    Conventional analysis using quantile regression typically focuses on fitting the regression model at different quantiles separately. However, in situations where the quantile coefficients share some common feature, joint modeling of multiple quantiles to accommodate the commonality often leads to more efficient estimation. One example of common features is that a predictor may have a constant effect over one region of quantile levels but varying effects in other regions. To automatically perform estimation and detection of the interquantile commonality, we develop two penalization methods. When the quantile slope coefficients indeed do not change across quantile levels, the proposed methods will shrink the slopes towards constant and thus improve the estimation efficiency. We establish the oracle properties of the two proposed penalization methods. Through numerical investigations, we demonstrate that the proposed methods lead to estimations with competitive or higher efficiency than the standard quantile regression estimation in finite samples. Supplemental materials for the article are available online. PMID:24363546

  19. [Is regression of atherosclerosis possible?].

    PubMed

    Thomas, D; Richard, J L; Emmerich, J; Bruckert, E; Delahaye, F

    1992-10-01

    Experimental studies have shown the regression of atherosclerosis in animals given a cholesterol-rich diet and then given a normal diet or hypolipidemic therapy. Despite favourable results of clinical trials of primary prevention modifying the lipid profile, the concept of atherosclerosis regression in man remains very controversial. The methodological approach is difficult: this is based on angiographic data and requires strict standardisation of angiographic views and reliable quantitative techniques of analysis which are available with image processing. Several methodologically acceptable clinical coronary studies have shown not only stabilisation but also regression of atherosclerotic lesions with reductions of about 25% in total cholesterol levels and of about 40% in LDL cholesterol levels. These reductions were obtained either by drugs as in CLAS (Cholesterol Lowering Atherosclerosis Study), FATS (Familial Atherosclerosis Treatment Study) and SCOR (Specialized Center of Research Intervention Trial), by profound modifications in dietary habits as in the Lifestyle Heart Trial, or by surgery (ileo-caecal bypass) as in POSCH (Program On the Surgical Control of the Hyperlipidemias). On the other hand, trials with non-lipid lowering drugs such as the calcium antagonists (INTACT, MHIS) have not shown significant regression of existing atherosclerotic lesions but only a decrease on the number of new lesions. The clinical benefits of these regression studies are difficult to demonstrate given the limited period of observation, relatively small population numbers and the fact that in some cases the subjects were asymptomatic. The decrease in the number of cardiovascular events therefore seems relatively modest and concerns essentially subjects who were symptomatic initially. The clinical repercussion of studies of prevention involving a single lipid factor is probably partially due to the reduction in progression and anatomical regression of the atherosclerotic plaque

  20. Bayesian Ensemble Trees (BET) for Clustering and Prediction in Heterogeneous Data

    PubMed Central

    Duan, Leo L.; Clancy, John P.; Szczesniak, Rhonda D.

    2016-01-01

    We propose a novel “tree-averaging” model that utilizes the ensemble of classification and regression trees (CART). Each constituent tree is estimated with a subset of similar data. We treat this grouping of subsets as Bayesian Ensemble Trees (BET) and model them as a Dirichlet process. We show that BET determines the optimal number of trees by adapting to the data heterogeneity. Compared with the other ensemble methods, BET requires much fewer trees and shows equivalent prediction accuracy using weighted averaging. Moreover, each tree in BET provides variable selection criterion and interpretation for each subset. We developed an efficient estimating procedure with improved estimation strategies in both CART and mixture models. We demonstrate these advantages of BET with simulations and illustrate the approach with a real-world data example involving regression of lung function measurements obtained from patients with cystic fibrosis. Supplemental materials are available online. PMID:27524872

  1. Progression and regression of the atherosclerotic plaque.

    PubMed

    de Feyter, P J; Vos, J; Deckers, J W

    1995-08-01

    In animals in which atherosclerosis was induced experimentally (by a high cholesterol diet) regression of the atherosclerotic lesion was demonstrated after serum cholesterol was reduced by cholesterol- lowering drugs or a low-fat diet. Whether regression of advanced coronary arterly lesions also takes place in humans after a similar intervention remains conjectural. However, several randomized studies, primarily employing lipid-lowering intervention or comprehensive changes in lifestyle, have demonstrated, using serial angiograms, that it is possible to achieve less progression, arrest or even (small) regression of atherosclerotic lesions. The lipid-lowering trials (NHBLI, CLAS, POSCH, FATS, SCOR and STARS) studied 1240 symptomatic patients, mostly men, with moderately elevated cholesterol levels and moderately severe angiographic-proven coronary artery disease. A variety of lipid-lowering drugs, in addition to a diet, were used over an intervention period ranging from 2 to 3 years. In all but one study (NHBLI), the progression of coronary atherosclerosis was less in the treated group, but regression was induced in only a few patients. The overall relative risk of progression of coronary atherosclerosis was 0 x 62 and 2 x 13, respectively. The induced angiographic differences were small and did not produce any significant haemodynamic benefit. The most important result was tht the disease process could be stabilized in the majority of patients. Three comprehensive lifestyle change trials (the Lifestyle Heart study, STARS and the Heidelberg Study) studied 183 patients, who were subjected to stress management, and/or intensive exercise, in addition to a low fat diet, over a period ranging from 1 to 3 years. All three trials demonstrated less progression, and more regression with overall relative risks of 0 x 40 and 2 x 35 respectively, in the intervention groups. Angiographic trials demonstrated that retardation or arrest of coronary atherosclerosis was possible

  2. Correlation Weights in Multiple Regression

    ERIC Educational Resources Information Center

    Waller, Niels G.; Jones, Jeff A.

    2010-01-01

    A general theory on the use of correlation weights in linear prediction has yet to be proposed. In this paper we take initial steps in developing such a theory by describing the conditions under which correlation weights perform well in population regression models. Using OLS weights as a comparison, we define cases in which the two weighting…

  3. Weighting Regressions by Propensity Scores

    ERIC Educational Resources Information Center

    Freedman, David A.; Berk, Richard A.

    2008-01-01

    Regressions can be weighted by propensity scores in order to reduce bias. However, weighting is likely to increase random error in the estimates, and to bias the estimated standard errors downward, even when selection mechanisms are well understood. Moreover, in some cases, weighting will increase the bias in estimated causal parameters. If…

  4. Multiple Regression: A Leisurely Primer.

    ERIC Educational Resources Information Center

    Daniel, Larry G.; Onwuegbuzie, Anthony J.

    Multiple regression is a useful statistical technique when the researcher is considering situations in which variables of interest are theorized to be multiply caused. It may also be useful in those situations in which the researchers is interested in studies of predictability of phenomena of interest. This paper provides an introduction to…

  5. Cactus: An Introduction to Regression

    ERIC Educational Resources Information Center

    Hyde, Hartley

    2008-01-01

    When the author first used "VisiCalc," the author thought it a very useful tool when he had the formulas. But how could he design a spreadsheet if there was no known formula for the quantities he was trying to predict? A few months later, the author relates he learned to use multiple linear regression software and suddenly it all clicked into…

  6. Ridge Regression for Interactive Models.

    ERIC Educational Resources Information Center

    Tate, Richard L.

    1988-01-01

    An exploratory study of the value of ridge regression for interactive models is reported. Assuming that the linear terms in a simple interactive model are centered to eliminate non-essential multicollinearity, a variety of common models, representing both ordinal and disordinal interactions, are shown to have "orientations" that are favorable to…

  7. Quantile Regression with Censored Data

    ERIC Educational Resources Information Center

    Lin, Guixian

    2009-01-01

    The Cox proportional hazards model and the accelerated failure time model are frequently used in survival data analysis. They are powerful, yet have limitation due to their model assumptions. Quantile regression offers a semiparametric approach to model data with possible heterogeneity. It is particularly powerful for censored responses, where the…

  8. Species integrity in trees.

    PubMed

    Ortiz-Barrientos, Daniel; Baack, Eric J

    2014-09-01

    From California sequoia, to Australian eucalyptus, to the outstanding diversity of Amazonian forests, trees are fundamental to many processes in ecology and evolution. Trees define the communities that they inhabit, are host to a multiplicity of other organisms and can determine the ecological dynamics of other plants and animals. Trees are also at the heart of major patterns of biodiversity such as the latitudinal gradient of species diversity and thus are important systems for studying the origin of new plant species. Although the role of trees in community assembly and ecological succession is partially understood, the origin of tree diversity remains largely opaque. For instance, the relative importance of differing habitats and phenologies as barriers to hybridization between closely related species is still largely uncharacterized in trees. Consequently, we know very little about the origin of trees species and their integrity. Similarly, studies on the interplay between speciation and tree community assembly are in their infancy and so are studies on how processes like forest maturation modifies the context in which reproductive isolation evolves. In this issue of Molecular Ecology, Lindtke et al. (2014) and Lagache et al. (2014) overcome some traditional difficulties in studying mating systems and sexual isolation in the iconic oaks and poplars, providing novel insights about the integrity of tree species and on how ecology leads to variation in selection on reproductive isolation over time and space. PMID:25155715

  9. Tree growth response to ENSO in Durango, Mexico

    NASA Astrophysics Data System (ADS)

    Pompa-García, Marin; Miranda-Aragón, Liliana; Aguirre-Salado, Carlos Arturo

    2015-01-01

    The dynamics of forest ecosystems worldwide have been driven largely by climatic teleconnections. El Niño-Southern Oscillation (ENSO) is the strongest interannual variation of the Earth's climate, affecting the regional climatic regime. These teleconnections may impact plant phenology, growth rate, forest extent, and other gradual changes in forest ecosystems. The objective of this study was to investigate how Pinus cooperi populations face the influence of ENSO and regional microclimates in five ecozones in northwestern Mexico. Using standard dendrochronological techniques, tree-ring chronologies (TRI) were generated. TRI, ENSO, and climate relationships were correlated from 1950-2010. Additionally, multiple regressions were conducted in order to detect those ENSO months with direct relations in TRI ( p < 0.1). The five chronologies showed similar trends during the period they overlapped, indicating that the P. cooperi populations shared an interannual growth variation. In general, ENSO index showed correspondences with tree-ring growth in synchronous periods. We concluded that ENSO had connectivity with regional climate in northern Mexico and radial growth of P. cooperi populations has been driven largely by positive ENSO values (El Niño episodes).

  10. More Trees, More Poverty? The Socioeconomic Effects of Tree Plantations in Chile, 2001-2011.

    PubMed

    Andersson, Krister; Lawrence, Duncan; Zavaleta, Jennifer; Guariguata, Manuel R

    2016-01-01

    Tree plantations play a controversial role in many nations' efforts to balance goals for economic development, ecological conservation, and social justice. This paper seeks to contribute to this debate by analyzing the socioeconomic impact of such plantations. We focus our study on Chile, a country that has experienced extraordinary growth of industrial tree plantations. Our analysis draws on a unique dataset with longitudinal observations collected in 180 municipal territories during 2001-2011. Employing panel data regression techniques, we find that growth in plantation area is associated with higher than average rates of poverty during this period. PMID:26285776

  11. More Trees, More Poverty? The Socioeconomic Effects of Tree Plantations in Chile, 2001-2011

    NASA Astrophysics Data System (ADS)

    Andersson, Krister; Lawrence, Duncan; Zavaleta, Jennifer; Guariguata, Manuel R.

    2016-01-01

    Tree plantations play a controversial role in many nations' efforts to balance goals for economic development, ecological conservation, and social justice. This paper seeks to contribute to this debate by analyzing the socioeconomic impact of such plantations. We focus our study on Chile, a country that has experienced extraordinary growth of industrial tree plantations. Our analysis draws on a unique dataset with longitudinal observations collected in 180 municipal territories during 2001-2011. Employing panel data regression techniques, we find that growth in plantation area is associated with higher than average rates of poverty during this period.

  12. More Trees, More Poverty? The Socioeconomic Effects of Tree Plantations in Chile, 2001-2011.

    PubMed

    Andersson, Krister; Lawrence, Duncan; Zavaleta, Jennifer; Guariguata, Manuel R

    2016-01-01

    Tree plantations play a controversial role in many nations' efforts to balance goals for economic development, ecological conservation, and social justice. This paper seeks to contribute to this debate by analyzing the socioeconomic impact of such plantations. We focus our study on Chile, a country that has experienced extraordinary growth of industrial tree plantations. Our analysis draws on a unique dataset with longitudinal observations collected in 180 municipal territories during 2001-2011. Employing panel data regression techniques, we find that growth in plantation area is associated with higher than average rates of poverty during this period.

  13. Stable feature selection for clinical prediction: exploiting ICD tree structure using Tree-Lasso.

    PubMed

    Kamkar, Iman; Gupta, Sunil Kumar; Phung, Dinh; Venkatesh, Svetha

    2015-02-01

    Modern healthcare is getting reshaped by growing Electronic Medical Records (EMR). Recently, these records have been shown of great value towards building clinical prediction models. In EMR data, patients' diseases and hospital interventions are captured through a set of diagnoses and procedures codes. These codes are usually represented in a tree form (e.g. ICD-10 tree) and the codes within a tree branch may be highly correlated. These codes can be used as features to build a prediction model and an appropriate feature selection can inform a clinician about important risk factors for a disease. Traditional feature selection methods (e.g. Information Gain, T-test, etc.) consider each variable independently and usually end up having a long feature list. Recently, Lasso and related l1-penalty based feature selection methods have become popular due to their joint feature selection property. However, Lasso is known to have problems of selecting one feature of many correlated features randomly. This hinders the clinicians to arrive at a stable feature set, which is crucial for clinical decision making process. In this paper, we solve this problem by using a recently proposed Tree-Lasso model. Since, the stability behavior of Tree-Lasso is not well understood, we study the stability behavior of Tree-Lasso and compare it with other feature selection methods. Using a synthetic and two real-world datasets (Cancer and Acute Myocardial Infarction), we show that Tree-Lasso based feature selection is significantly more stable than Lasso and comparable to other methods e.g. Information Gain, ReliefF and T-test. We further show that, using different types of classifiers such as logistic regression, naive Bayes, support vector machines, decision trees and Random Forest, the classification performance of Tree-Lasso is comparable to Lasso and better than other methods. Our result has implications in identifying stable risk factors for many healthcare problems and therefore can

  14. Amalgamating source trees with different taxonomic levels.

    PubMed

    Berry, Vincent; Bininda-Emonds, Olaf R P; Semple, Charles

    2013-03-01

    Supertree methods combine a collection of source trees into a single parent tree or supertree. For almost all such methods, the terminal taxa across the source trees have to be non-nested for the output supertree to make sense. Motivated by Page, the first supertree method for combining rooted source trees where the taxa can be hierarchically nested is called AncestralBuild. In addition to taxa labeling the leaves, this method allows the rooted source trees to have taxa labeling some of the interior nodes at a higher taxonomic level than their descendants (e.g., genera vs. species). However, the utility of AncestralBuild is somewhat restricted as it is mostly intended to decide if a collection of rooted source trees is compatible. If the initial collection is not compatible, then no tree is returned. To overcome this restriction, we introduce here the MultiLevelSupertree (MLS) supertree method whose input is the same as that for AncestralBuild, but which accommodates incompatibilities among rooted source trees using a MinCut-like procedure. We show that MLS has several desirable properties including the preservation of common subtrees among the source trees, the preservation of ancestral relationships whenever they are compatible, as well as running in polynomial time. Furthermore, application to a small test data set (the mammalian carnivore family Phocidae) indicates that the method correctly places nested taxa at different taxonomic levels (reflecting vertical signal), even in cases where the input trees harbor a significant level of conflict between their clades (i.e., in their horizontal signal).

  15. Regression Verification Using Impact Summaries

    NASA Technical Reports Server (NTRS)

    Backes, John; Person, Suzette J.; Rungta, Neha; Thachuk, Oksana

    2013-01-01

    Regression verification techniques are used to prove equivalence of syntactically similar programs. Checking equivalence of large programs, however, can be computationally expensive. Existing regression verification techniques rely on abstraction and decomposition techniques to reduce the computational effort of checking equivalence of the entire program. These techniques are sound but not complete. In this work, we propose a novel approach to improve scalability of regression verification by classifying the program behaviors generated during symbolic execution as either impacted or unimpacted. Our technique uses a combination of static analysis and symbolic execution to generate summaries of impacted program behaviors. The impact summaries are then checked for equivalence using an o-the-shelf decision procedure. We prove that our approach is both sound and complete for sequential programs, with respect to the depth bound of symbolic execution. Our evaluation on a set of sequential C artifacts shows that reducing the size of the summaries can help reduce the cost of software equivalence checking. Various reduction, abstraction, and compositional techniques have been developed to help scale software verification techniques to industrial-sized systems. Although such techniques have greatly increased the size and complexity of systems that can be checked, analysis of large software systems remains costly. Regression analysis techniques, e.g., regression testing [16], regression model checking [22], and regression verification [19], restrict the scope of the analysis by leveraging the differences between program versions. These techniques are based on the idea that if code is checked early in development, then subsequent versions can be checked against a prior (checked) version, leveraging the results of the previous analysis to reduce analysis cost of the current version. Regression verification addresses the problem of proving equivalence of closely related program

  16. Embedded Sensors for Measuring Surface Regression

    NASA Technical Reports Server (NTRS)

    Gramer, Daniel J.; Taagen, Thomas J.; Vermaak, Anton G.

    2006-01-01

    non-eroding end of the sensor. The sensor signal can be transmitted from inside a high-pressure chamber to the ambient environment, using commercially available feedthrough connectors. Miniaturized internal recorders or wireless data transmission could also potentially be employed to eliminate the need for producing penetrations in the chamber case. The rungs are designed so that as each successive rung is eroded away, the resistance changes by an amount that yields a readily measurable signal larger than the background noise. (In addition, signal-conditioning techniques are used in processing the resistance readings to mitigate the effect of noise.) Hence, each discrete change of resistance serves to indicate the arrival of the regressing host material front at the known depth of the affected resistor rung. The average rate of regression between two adjacent resistors can be calculated simply as the distance between the resistors divided by the time interval between their resistance jumps. Advanced data reduction techniques have also been developed to establish the instantaneous surface position and regression rate when the regressing front is between rungs.

  17. Convex Regression with Interpretable Sharp Partitions

    PubMed Central

    Petersen, Ashley; Simon, Noah; Witten, Daniela

    2016-01-01

    We consider the problem of predicting an outcome variable on the basis of a small number of covariates, using an interpretable yet non-additive model. We propose convex regression with interpretable sharp partitions (CRISP) for this task. CRISP partitions the covariate space into blocks in a data-adaptive way, and fits a mean model within each block. Unlike other partitioning methods, CRISP is fit using a non-greedy approach by solving a convex optimization problem, resulting in low-variance fits. We explore the properties of CRISP, and evaluate its performance in a simulation study and on a housing price data set.

  18. Convex Regression with Interpretable Sharp Partitions

    PubMed Central

    Petersen, Ashley; Simon, Noah; Witten, Daniela

    2016-01-01

    We consider the problem of predicting an outcome variable on the basis of a small number of covariates, using an interpretable yet non-additive model. We propose convex regression with interpretable sharp partitions (CRISP) for this task. CRISP partitions the covariate space into blocks in a data-adaptive way, and fits a mean model within each block. Unlike other partitioning methods, CRISP is fit using a non-greedy approach by solving a convex optimization problem, resulting in low-variance fits. We explore the properties of CRISP, and evaluate its performance in a simulation study and on a housing price data set. PMID:27635120

  19. Quantile Regression With Measurement Error

    PubMed Central

    Wei, Ying; Carroll, Raymond J.

    2010-01-01

    Regression quantiles can be substantially biased when the covariates are measured with error. In this paper we propose a new method that produces consistent linear quantile estimation in the presence of covariate measurement error. The method corrects the measurement error induced bias by constructing joint estimating equations that simultaneously hold for all the quantile levels. An iterative EM-type estimation algorithm to obtain the solutions to such joint estimation equations is provided. The finite sample performance of the proposed method is investigated in a simulation study, and compared to the standard regression calibration approach. Finally, we apply our methodology to part of the National Collaborative Perinatal Project growth data, a longitudinal study with an unusual measurement error structure. PMID:20305802

  20. Precision and Recall for Regression

    NASA Astrophysics Data System (ADS)

    Torgo, Luis; Ribeiro, Rita

    Cost sensitive prediction is a key task in many real world applications. Most existing research in this area deals with classification problems. This paper addresses a related regression problem: the prediction of rare extreme values of a continuous variable. These values are often regarded as outliers and removed from posterior analysis. However, for many applications (e.g. in finance, meteorology, biology, etc.) these are the key values that we want to accurately predict. Any learning method obtains models by optimizing some preference criteria. In this paper we propose new evaluation criteria that are more adequate for these applications. We describe a generalization for regression of the concepts of precision and recall often used in classification. Using these new evaluation metrics we are able to focus the evaluation of predictive models on the cases that really matter for these applications. Our experiments indicate the advantages of the use of these new measures when comparing predictive models in the context of our target applications.

  1. The Flame Tree

    ERIC Educational Resources Information Center

    Lewis, Richard

    2004-01-01

    Lewis's own experiences living in Indonesia are fertile ground for telling "a ripping good story," one found in "The Flame Tree." He hopes people will enjoy the tale and appreciate the differences of an unfamiliar culture. The excerpt from "The Flame Tree" will reel readers in quickly.

  2. CSI for Trees

    ERIC Educational Resources Information Center

    Rubino, Darrin L.; Hanson, Deborah

    2009-01-01

    The circles and patterns in a tree's stem tell a story, but that story can be a mystery. Interpreting the story of tree rings provides a way to heighten the natural curiosity of students and help them gain insight into the interaction of elements in the environment. It also represents a wonderful opportunity to incorporate the nature of science.…

  3. Trees Are Terrific!

    ERIC Educational Resources Information Center

    Braus, Judy, Ed.

    1992-01-01

    Ranger Rick's NatureScope is a creative education series dedicated to inspiring in children an understanding and appreciation of the natural world while developing the skills they will need to make responsible decisions about the environment. Contents are organized into the following sections: (1) "What Makes a Tree a Tree?," including information…

  4. Tree Topology Estimation.

    PubMed

    Estrada, Rolando; Tomasi, Carlo; Schmidler, Scott C; Farsiu, Sina

    2015-08-01

    Tree-like structures are fundamental in nature, and it is often useful to reconstruct the topology of a tree - what connects to what - from a two-dimensional image of it. However, the projected branches often cross in the image: the tree projects to a planar graph, and the inverse problem of reconstructing the topology of the tree from that of the graph is ill-posed. We regularize this problem with a generative, parametric tree-growth model. Under this model, reconstruction is possible in linear time if one knows the direction of each edge in the graph - which edge endpoint is closer to the root of the tree - but becomes NP-hard if the directions are not known. For the latter case, we present a heuristic search algorithm to estimate the most likely topology of a rooted, three-dimensional tree from a single two-dimensional image. Experimental results on retinal vessel, plant root, and synthetic tree data sets show that our methodology is both accurate and efficient. PMID:26353004

  5. Tree nut oils

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The major tree nuts include almonds, Brazil nuts, cashew nuts, hazelnuts, macadamia nuts, pecans, pine nuts, pistachio nuts, and walnuts. Tree nut oils are appreciated in food applications because of their flavors and are generally more expensive than other gourmet oils. Research during the last de...

  6. Trees for Mother Earth.

    ERIC Educational Resources Information Center

    Greer, Sandy

    1993-01-01

    Describes Trees for Mother Earth, a program in which secondary students raise funds to buy fruit trees to plant during visits to the Navajo Reservation. Benefits include developing feelings of self-worth among participants, promoting cultural exchange and understanding, and encouraging self-sufficiency among the Navajo. (LP)

  7. Structural Equation Model Trees

    ERIC Educational Resources Information Center

    Brandmaier, Andreas M.; von Oertzen, Timo; McArdle, John J.; Lindenberger, Ulman

    2013-01-01

    In the behavioral and social sciences, structural equation models (SEMs) have become widely accepted as a modeling tool for the relation between latent and observed variables. SEMs can be seen as a unification of several multivariate analysis techniques. SEM Trees combine the strengths of SEMs and the decision tree paradigm by building tree…

  8. Spontaneous regression of bronchogenic cyst accompanied by pneumonia.

    PubMed

    Himuro, Naoya; Minakata, Takao; Oshima, Yutaka; Kataoka, Daisuke; Yamamoto, Shigeru; Kadokura, Mitsutaka

    2015-12-01

    Bronchogenic cysts arise from abnormal budding of the ventral diverticulum of the foregut or tracheobronchial tree during embryogenesis, are the most common cystic masses in the mediastinum, and are generally asymptomatic. A spontaneous regression in a mediastinal bronchogenic cyst (MBC) with pneumonia is rare. A 30-year-old male had a tumor shadow in the middle mediastinum. When he visited our hospital, he had a mild fever with coughing and sputum. A chest computed tomography (CT) scan showed a decrease in the tumor size and the existence of right pneumonia. MBC may be involved in the etiology of pneumonia; therefore, bronchogenic cysts need to be resected as soon as possible. PMID:26943430

  9. The tree of eukaryotes.

    PubMed

    Keeling, Patrick J; Burger, Gertraud; Durnford, Dion G; Lang, B Franz; Lee, Robert W; Pearlman, Ronald E; Roger, Andrew J; Gray, Michael W

    2005-12-01

    Recent advances in resolving the tree of eukaryotes are converging on a model composed of a few large hypothetical 'supergroups', each comprising a diversity of primarily microbial eukaryotes (protists, or protozoa and algae). The process of resolving the tree involves the synthesis of many kinds of data, including single-gene trees, multigene analyses, and other kinds of molecular and structural characters. Here, we review the recent progress in assembling the tree of eukaryotes, describing the major evidence for each supergroup, and where gaps in our knowledge remain. We also consider other factors emerging from phylogenetic analyses and comparative genomics, in particular lateral gene transfer, and whether such factors confound our understanding of the eukaryotic tree.

  10. From Family Trees to Decision Trees.

    ERIC Educational Resources Information Center

    Trobian, Helen R.

    This paper is a preliminary inquiry by a non-mathematician into graphic methods of sequential planning and ways in which hierarchical analysis and tree structures can be helpful in developing interest in the use of mathematical modeling in the search for creative solutions to real-life problems. Highlights include a discussion of hierarchical…

  11. Pre-growth mortality of Abies cilicica trees and mortality models performance.

    PubMed

    Carus, Serdar

    2010-05-01

    In this study, we compared tree-growth rates (basal area increment) from recently dead and living Taurus fir (Abies cilicica Carr.) trees in the Kovada lake Forest of Isparta, Turkey. For each dead tree, tree-growth rates were analyzed for the presence of pre-death growth depressions in the study area (number of sample plots = 11) in 2006. However, we compared both the magnitude and rate of growth prior to death to a control (living) group of trees. Basal area increment (BAI) averaged substantially less during the last 10 years before death than for control trees. Trees that died started diverging in growth, on average, 50-60 years before death. About 18% of trees that died had chronically slow growth, 46% had pronounced declines in growth, whereas 36% had good growth up to death. However, tree-ring-based growth patterns of dead and living Taurus fir trees were compared and used 12 mortality models that were derived using logistic regression from growth patterns of tree-ring series as predictor variables. The four models with the highest overall performance correctly classified 43.8-56.3% of all dead trees and 75.0-87.5% of all living trees, and they predicted 25.0-43.8% of all dead trees to die within 0-15 years prior to the actual year of death.

  12. Feedback of trees on nitrogen mineralization to restrict the advance of trees in C4 savannahs.

    PubMed

    Higgins, Steven I; Keretetse, Moagi; February, Edmund C

    2015-08-01

    Remote sensing studies suggest that savannahs are transforming into more tree-dominated states; however, progressive nitrogen limitation could potentially retard this putatively CO2-driven invasion. We analysed controls on nitrogen mineralization rates in savannah by manipulating rainfall and the cover of grass and tree elements against the backdrop of the seasonal temperature and rainfall variation. We found that the seasonal pattern of nitrogen mineralization was strongly influenced by rainfall, and that manipulative increases in rainfall could boost mineralization rates. Additionally, mineralization rates were considerably higher on plots with grasses and lower on plots with trees. Our findings suggest that shifting a savannah from a grass to a tree-dominated state can substantially reduce nitrogen mineralization rates, thereby potentially creating a negative feedback on the CO2-induced invasion of savannahs by trees.

  13. Feedback of trees on nitrogen mineralization to restrict the advance of trees in C4 savannahs.

    PubMed

    Higgins, Steven I; Keretetse, Moagi; February, Edmund C

    2015-08-01

    Remote sensing studies suggest that savannahs are transforming into more tree-dominated states; however, progressive nitrogen limitation could potentially retard this putatively CO2-driven invasion. We analysed controls on nitrogen mineralization rates in savannah by manipulating rainfall and the cover of grass and tree elements against the backdrop of the seasonal temperature and rainfall variation. We found that the seasonal pattern of nitrogen mineralization was strongly influenced by rainfall, and that manipulative increases in rainfall could boost mineralization rates. Additionally, mineralization rates were considerably higher on plots with grasses and lower on plots with trees. Our findings suggest that shifting a savannah from a grass to a tree-dominated state can substantially reduce nitrogen mineralization rates, thereby potentially creating a negative feedback on the CO2-induced invasion of savannahs by trees. PMID:26268994

  14. Phosphazene additives

    DOEpatents

    Harrup, Mason K; Rollins, Harry W

    2013-11-26

    An additive comprising a phosphazene compound that has at least two reactive functional groups and at least one capping functional group bonded to phosphorus atoms of the phosphazene compound. One of the at least two reactive functional groups is configured to react with cellulose and the other of the at least two reactive functional groups is configured to react with a resin, such as an amine resin of a polycarboxylic acid resin. The at least one capping functional group is selected from the group consisting of a short chain ether group, an alkoxy group, or an aryloxy group. Also disclosed are an additive-resin admixture, a method of treating a wood product, and a wood product.

  15. Potlining Additives

    SciTech Connect

    Rudolf Keller

    2004-08-10

    In this project, a concept to improve the performance of aluminum production cells by introducing potlining additives was examined and tested. Boron oxide was added to cathode blocks, and titanium was dissolved in the metal pool; this resulted in the formation of titanium diboride and caused the molten aluminum to wet the carbonaceous cathode surface. Such wetting reportedly leads to operational improvements and extended cell life. In addition, boron oxide suppresses cyanide formation. This final report presents and discusses the results of this project. Substantial economic benefits for the practical implementation of the technology are projected, especially for modern cells with graphitized blocks. For example, with an energy savings of about 5% and an increase in pot life from 1500 to 2500 days, a cost savings of $ 0.023 per pound of aluminum produced is projected for a 200 kA pot.

  16. Lazy decision trees

    SciTech Connect

    Friedman, J.H.; Yun, Yeogirl; Kohavi, R.

    1996-12-31

    Lazy learning algorithms, exemplified by nearest-neighbor algorithms, do not induce a concise hypothesis from a given training set; the inductive process is delayed until a test instance is given. Algorithms for constructing decision trees, such as C4.5, ID3, and CART create a single {open_quotes}best{close_quotes} decision tree during the training phase, and this tree is then used to classify test instances. The tests at the nodes of the constructed tree are good on average, but there may be better tests for classifying a specific instance. We propose a lazy decision tree algorithm-LazyDT-that conceptually constructs the {open_quotes}best{close_quote} decision tree for each test instance. In practice, only a path needs to be constructed, and a caching scheme makes the algorithm fast. The algorithm is robust with respect to missing values without resorting to the complicated methods usually seen in induction of decision trees. Experiments on real and artificial problems are presented.

  17. Regression analysis of cytopathological data

    SciTech Connect

    Whittemore, A.S.; McLarty, J.W.; Fortson, N.; Anderson, K.

    1982-12-01

    Epithelial cells from the human body are frequently labelled according to one of several ordered levels of abnormality, ranging from normal to malignant. The label of the most abnormal cell in a specimen determines the score for the specimen. This paper presents a model for the regression of specimen scores against continuous and discrete variables, as in host exposure to carcinogens. Application to data and tests for adequacy of model fit are illustrated using sputum specimens obtained from a cohort of former asbestos workers.

  18. Up in the tree--the overlooked richness of bryophytes and lichens in tree crowns.

    PubMed

    Boch, Steffen; Müller, Jörg; Prati, Daniel; Blaser, Stefan; Fischer, Markus

    2013-01-01

    Assessing diversity is among the major tasks in ecology and conservation science. In ecological and conservation studies, epiphytic cryptogams are usually sampled up to accessible heights in forests. Thus, their diversity, especially of canopy specialists, likely is underestimated. If the proportion of those species differs among forest types, plot-based diversity assessments are biased and may result in misleading conservation recommendations. We sampled bryophytes and lichens in 30 forest plots of 20 m × 20 m in three German regions, considering all substrates, and including epiphytic litter fall. First, the sampling of epiphytic species was restricted to the lower 2 m of trees and shrubs. Then, on one representative tree per plot, we additionally recorded epiphytic species in the crown, using tree climbing techniques. Per tree, on average 54% of lichen and 20% of bryophyte species were overlooked if the crown was not been included. After sampling all substrates per plot, including the bark of all shrubs and trees, still 38% of the lichen and 4% of the bryophyte species were overlooked if the tree crown of the sampled tree was not included. The number of overlooked lichen species varied strongly among regions. Furthermore, the number of overlooked bryophyte and lichen species per plot was higher in European beech than in coniferous stands and increased with increasing diameter at breast height of the sampled tree. Thus, our results indicate a bias of comparative studies which might have led to misleading conservation recommendations of plot-based diversity assessments. PMID:24358373

  19. Phylogenetic Tree Reconstruction Accuracy and Model Fit when Proportions of Variable Sites Change across the Tree

    PubMed Central

    Grievink, Liat Shavit; Penny, David; Hendy, Michael D.; Holland, Barbara R.

    2010-01-01

    Commonly used phylogenetic models assume a homogeneous process through time in all parts of the tree. However, it is known that these models can be too simplistic as they do not account for nonhomogeneous lineage-specific properties. In particular, it is now widely recognized that as constraints on sequences evolve, the proportion and positions of variable sites can vary between lineages causing heterotachy. The extent to which this model misspecification affects tree reconstruction is still unknown. Here, we evaluate the effect of changes in the proportions and positions of variable sites on model fit and tree estimation. We consider 5 current models of nucleotide sequence evolution in a Bayesian Markov chain Monte Carlo framework as well as maximum parsimony (MP). We show that for a tree with 4 lineages where 2 nonsister taxa undergo a change in the proportion of variable sites tree reconstruction under the best-fitting model, which is chosen using a relative test, often results in the wrong tree. In this case, we found that an absolute test of model fit is a better predictor of tree estimation accuracy. We also found further evidence that MP is not immune to heterotachy. In addition, we show that increased sampling of taxa that have undergone a change in proportion and positions of variable sites is critical for accurate tree reconstruction. PMID:20525636

  20. Up in the tree--the overlooked richness of bryophytes and lichens in tree crowns.

    PubMed

    Boch, Steffen; Müller, Jörg; Prati, Daniel; Blaser, Stefan; Fischer, Markus

    2013-01-01

    Assessing diversity is among the major tasks in ecology and conservation science. In ecological and conservation studies, epiphytic cryptogams are usually sampled up to accessible heights in forests. Thus, their diversity, especially of canopy specialists, likely is underestimated. If the proportion of those species differs among forest types, plot-based diversity assessments are biased and may result in misleading conservation recommendations. We sampled bryophytes and lichens in 30 forest plots of 20 m × 20 m in three German regions, considering all substrates, and including epiphytic litter fall. First, the sampling of epiphytic species was restricted to the lower 2 m of trees and shrubs. Then, on one representative tree per plot, we additionally recorded epiphytic species in the crown, using tree climbing techniques. Per tree, on average 54% of lichen and 20% of bryophyte species were overlooked if the crown was not been included. After sampling all substrates per plot, including the bark of all shrubs and trees, still 38% of the lichen and 4% of the bryophyte species were overlooked if the tree crown of the sampled tree was not included. The number of overlooked lichen species varied strongly among regions. Furthermore, the number of overlooked bryophyte and lichen species per plot was higher in European beech than in coniferous stands and increased with increasing diameter at breast height of the sampled tree. Thus, our results indicate a bias of comparative studies which might have led to misleading conservation recommendations of plot-based diversity assessments.

  1. The gravity apple tree

    NASA Astrophysics Data System (ADS)

    Espinosa Aldama, Mariana

    2015-04-01

    The gravity apple tree is a genealogical tree of the gravitation theories developed during the past century. The graphic representation is full of information such as guides in heuristic principles, names of main proponents, dates and references for original articles (See under Supplementary Data for the graphic representation). This visual presentation and its particular classification allows a quick synthetic view for a plurality of theories, many of them well validated in the Solar System domain. Its diachronic structure organizes information in a shape of a tree following similarities through a formal concept analysis. It can be used for educational purposes or as a tool for philosophical discussion.

  2. Learning classification trees

    NASA Technical Reports Server (NTRS)

    Buntine, Wray

    1991-01-01

    Algorithms for learning classification trees have had successes in artificial intelligence and statistics over many years. How a tree learning algorithm can be derived from Bayesian decision theory is outlined. This introduces Bayesian techniques for splitting, smoothing, and tree averaging. The splitting rule turns out to be similar to Quinlan's information gain splitting rule, while smoothing and averaging replace pruning. Comparative experiments with reimplementations of a minimum encoding approach, Quinlan's C4 and Breiman et al. Cart show the full Bayesian algorithm is consistently as good, or more accurate than these other approaches though at a computational price.

  3. Evolutionary tree reconstruction

    NASA Technical Reports Server (NTRS)

    Cheeseman, Peter; Kanefsky, Bob

    1990-01-01

    It is described how Minimum Description Length (MDL) can be applied to the problem of DNA and protein evolutionary tree reconstruction. If there is a set of mutations that transform a common ancestor into a set of the known sequences, and this description is shorter than the information to encode the known sequences directly, then strong evidence for an evolutionary relationship has been found. A heuristic algorithm is described that searches for the simplest tree (smallest MDL) that finds close to optimal trees on the test data. Various ways of extending the MDL theory to more complex evolutionary relationships are discussed.

  4. Analyzing and Synthesizing Phylogenies Using Tree Alignment Graphs

    PubMed Central

    Smith, Stephen A.; Brown, Joseph W.; Hinchliff, Cody E.

    2013-01-01

    Phylogenetic trees are used to analyze and visualize evolution. However, trees can be imperfect datatypes when summarizing multiple trees. This is especially problematic when accommodating for biological phenomena such as horizontal gene transfer, incomplete lineage sorting, and hybridization, as well as topological conflict between datasets. Additionally, researchers may want to combine information from sets of trees that have partially overlapping taxon sets. To address the problem of analyzing sets of trees with conflicting relationships and partially overlapping taxon sets, we introduce methods for aligning, synthesizing and analyzing rooted phylogenetic trees within a graph, called a tree alignment graph (TAG). The TAG can be queried and analyzed to explore uncertainty and conflict. It can also be synthesized to construct trees, presenting an alternative to supertrees approaches. We demonstrate these methods with two empirical datasets. In order to explore uncertainty, we constructed a TAG of the bootstrap trees from the Angiosperm Tree of Life project. Analysis of the resulting graph demonstrates that areas of the dataset that are unresolved in majority-rule consensus tree analyses can be understood in more detail within the context of a graph structure, using measures incorporating node degree and adjacency support. As an exercise in synthesis (i.e., summarization of a TAG constructed from the alignment trees), we also construct a TAG consisting of the taxonomy and source trees from a recent comprehensive bird study. We synthesized this graph into a tree that can be reconstructed in a repeatable fashion and where the underlying source information can be updated. The methods presented here are tractable for large scale analyses and serve as a basis for an alternative to consensus tree and supertree methods. Furthermore, the exploration of these graphs can expose structures and patterns within the dataset that are otherwise difficult to observe. PMID:24086118

  5. Mortality rates associated with crown health for eastern forest tree species.

    PubMed

    Morin, Randall S; Randolph, KaDonna C; Steinman, Jim

    2015-03-01

    The condition of tree crowns is an important indicator of tree and forest health. Crown conditions have been evaluated during inventories of the US Forest Service Forest Inventory and Analysis (FIA) program since 1999. In this study, remeasured data from 55,013 trees on 2616 FIA plots in the eastern USA were used to assess the probability of survival among various tree species using the suite of FIA crown condition variables. Logistic regression procedures were employed to develop models for predicting tree survival. Results of the regression analyses indicated that crown dieback was the most important crown condition variable for predicting tree survival for all species combined and for many of the 15 individual species in the study. The logistic models were generally successful in representing recent tree mortality responses to multiyear infestations of beech bark disease and hemlock woolly adelgid. Although our models are only applicable to trees growing in a forest setting, the utility of models that predict impending tree mortality goes beyond forest inventory or traditional forestry growth and yield models and includes any application where managers need to assess tree health or predict tree mortality including urban forest, recreation, wildlife, and pest management.

  6. Multiatlas Segmentation as Nonparametric Regression

    PubMed Central

    Awate, Suyash P.; Whitaker, Ross T.

    2015-01-01

    This paper proposes a novel theoretical framework to model and analyze the statistical characteristics of a wide range of segmentation methods that incorporate a database of label maps or atlases; such methods are termed as label fusion or multiatlas segmentation. We model these multiatlas segmentation problems as nonparametric regression problems in the high-dimensional space of image patches. We analyze the nonparametric estimator’s convergence behavior that characterizes expected segmentation error as a function of the size of the multiatlas database. We show that this error has an analytic form involving several parameters that are fundamental to the specific segmentation problem (determined by the chosen anatomical structure, imaging modality, registration algorithm, and label-fusion algorithm). We describe how to estimate these parameters and show that several human anatomical structures exhibit the trends modeled analytically. We use these parameter estimates to optimize the regression estimator. We show that the expected error for large database sizes is well predicted by models learned on small databases. Thus, a few expert segmentations can help predict the database sizes required to keep the expected error below a specified tolerance level. Such cost-benefit analysis is crucial for deploying clinical multiatlas segmentation systems. PMID:24802528

  7. Variable Selection in ROC Regression

    PubMed Central

    2013-01-01

    Regression models are introduced into the receiver operating characteristic (ROC) analysis to accommodate effects of covariates, such as genes. If many covariates are available, the variable selection issue arises. The traditional induced methodology separately models outcomes of diseased and nondiseased groups; thus, separate application of variable selections to two models will bring barriers in interpretation, due to differences in selected models. Furthermore, in the ROC regression, the accuracy of area under the curve (AUC) should be the focus instead of aiming at the consistency of model selection or the good prediction performance. In this paper, we obtain one single objective function with the group SCAD to select grouped variables, which adapts to popular criteria of model selection, and propose a two-stage framework to apply the focused information criterion (FIC). Some asymptotic properties of the proposed methods are derived. Simulation studies show that the grouped variable selection is superior to separate model selections. Furthermore, the FIC improves the accuracy of the estimated AUC compared with other criteria. PMID:24312135

  8. Tree-Ring Based May-July Temperature Reconstruction Since AD 1630 on the Western Loess Plateau, China

    PubMed Central

    Song, Huiming; Liu, Yu; Li, Qiang; Gao, Na; Ma, Yongyong; Zhang, Yanhua

    2014-01-01

    Tree-ring samples from Chinese Pine (Pinus tabulaeformis Carr.) collected at Mt. Shimen on the western Loess Plateau, China, were used to reconstruct the mean May–July temperature during AD 1630–2011. The regression model explained 48% of the adjusted variance in the instrumentally observed mean May–July temperature. The reconstruction revealed significant temperature variations at interannual to decadal scales. Cool periods observed in the reconstruction coincided with reduced solar activities. The reconstructed temperature matched well with two other tree-ring based temperature reconstructions conducted on the northern slope of the Qinling Mountains (on the southern margin of the Loess Plateau of China) for both annual and decadal scales. In addition, this study agreed well with several series derived from different proxies. This reconstruction improves upon the sparse network of high-resolution paleoclimatic records for the western Loess Plateau, China. PMID:24690885

  9. Tree-ring based may-july temperature reconstruction since AD 1630 on the Western Loess Plateau, China.

    PubMed

    Song, Huiming; Liu, Yu; Li, Qiang; Gao, Na; Ma, Yongyong; Zhang, Yanhua

    2014-01-01

    Tree-ring samples from Chinese Pine (Pinus tabulaeformis Carr.) collected at Mt. Shimen on the western Loess Plateau, China, were used to reconstruct the mean May-July temperature during AD 1630-2011. The regression model explained 48% of the adjusted variance in the instrumentally observed mean May-July temperature. The reconstruction revealed significant temperature variations at interannual to decadal scales. Cool periods observed in the reconstruction coincided with reduced solar activities. The reconstructed temperature matched well with two other tree-ring based temperature reconstructions conducted on the northern slope of the Qinling Mountains (on the southern margin of the Loess Plateau of China) for both annual and decadal scales. In addition, this study agreed well with several series derived from different proxies. This reconstruction improves upon the sparse network of high-resolution paleoclimatic records for the western Loess Plateau, China.

  10. Using Classification Trees to Predict Alumni Giving for Higher Education

    ERIC Educational Resources Information Center

    Weerts, David J.; Ronca, Justin M.

    2009-01-01

    As the relative level of public support for higher education declines, colleges and universities aim to maximize alumni-giving to keep their programs competitive. Anchored in a utility maximization framework, this study employs the classification and regression tree methodology to examine characteristics of alumni donors and non-donors at a…

  11. Taxonomy of interpretation trees

    NASA Astrophysics Data System (ADS)

    Flynn, Patrick J.; Jain, Anil K.

    1992-02-01

    This paper explores alternative models of the interpretation tree (IT), whose search is one of the dominant paradigms for object recognition. Recurrence relations for the unpruned size of eight different types of search tree are introduced. Since exhaustive search of the IT in most recognition systems is impractical, pruning of various types is employed. It is therefore useful to see how much of the IT will be explored in a typical recognition problem. Probabilistic models of the search process have been proposed in the literature and used as a basis for theoretical bounds on search tree size, but experiments on a large number of images suggest that for 3-D object recognition from range data, the error probabilities (assumed to be constant) display significant variation. Hence, the theoretical bounds on the interpretation tree's size can serve only as rough estimates of the computational burden incurred during object recognition.

  12. Tree Nut Allergies

    MedlinePlus

    ... tree nut used on the label. Read all product labels carefully before purchasing and consuming any item. Ingredients ... Getting Started Newly Diagnosed Emergency Care Plan Food Labels Mislabeled Products Tips for Managing Food Allergies Resources For... Most ...

  13. Generalized constructive tree weights

    SciTech Connect

    Rivasseau, Vincent E-mail: adrian.tanasa@ens-lyon.org; Tanasa, Adrian E-mail: adrian.tanasa@ens-lyon.org

    2014-04-15

    The Loop Vertex Expansion (LVE) is a quantum field theory (QFT) method which explicitly computes the Borel sum of Feynman perturbation series. This LVE relies in a crucial way on symmetric tree weights which define a measure on the set of spanning trees of any connected graph. In this paper we generalize this method by defining new tree weights. They depend on the choice of a partition of a set of vertices of the graph, and when the partition is non-trivial, they are no longer symmetric under permutation of vertices. Nevertheless we prove they have the required positivity property to lead to a convergent LVE; in fact we formulate this positivity property precisely for the first time. Our generalized tree weights are inspired by the Brydges-Battle-Federbush work on cluster expansions and could be particularly suited to the computation of connected functions in QFT. Several concrete examples are explicitly given.

  14. Leonardo's Tree Theory.

    ERIC Educational Resources Information Center

    Werner, Suzanne K.

    2003-01-01

    Describes a series of activities exploring Leonardo da Vinci's tree theory that are designed to strengthen 8th grade students' data collection and problem solving skills in physical science classes. (KHR)

  15. The tree BVOC index.

    PubMed

    Simpson, J R; McPherson, E G

    2011-01-01

    Urban trees can produce a number of benefits, among them improved air quality. Biogenic volatile organic compounds (BVOCs) emitted by some species are ozone precursors. Modifying future tree planting to favor lower-emitting species can reduce these emissions and aid air management districts in meeting federally mandated emissions reductions for these compounds. Changes in BVOC emissions are calculated as the result of transitioning to a lower-emitting species mix in future planting. A simplified method for calculating the emissions reduction and a Tree BVOC index based on the calculated reduction is described. An example illustrates the use of the index as a tool for implementation and monitoring of a tree program designed to reduce BVOC emissions as a control measure being developed as part of the State Implementation Plan (SIP) for the Sacramento Federal Nonattainment Area. PMID:21435760

  16. Tree-bank grammars

    SciTech Connect

    Charniak, E.

    1996-12-31

    By a {open_quotes}tree-bank grammar{close_quotes} we mean a context-free grammar created by reading the production rules directly from hand-parsed sentences in a tree bank. Common wisdom has it that such grammars do not perform well, though we know of no published data on the issue. The primary purpose of this paper is to show that the common wisdom is wrong. In particular, we present results on a tree-bank grammar based on the Penn Wall Street Journal tree bank. To the best of our knowledge, this grammar outperforms all other non-word-based statistical parsers/grammars on this corpus. That is, it outperforms parsers that consider the input as a string of tags and ignore the actual words of the corpus.

  17. Mapping geogenic radon potential by regression kriging.

    PubMed

    Pásztor, László; Szabó, Katalin Zsuzsanna; Szatmári, Gábor; Laborczi, Annamária; Horváth, Ákos

    2016-02-15

    Radon ((222)Rn) gas is produced in the radioactive decay chain of uranium ((238)U) which is an element that is naturally present in soils. Radon is transported mainly by diffusion and convection mechanisms through the soil depending mainly on the physical and meteorological parameters of the soil and can enter and accumulate in buildings. Health risks originating from indoor radon concentration can be attributed to natural factors and is characterized by geogenic radon potential (GRP). Identification of areas with high health risks require spatial modeling, that is, mapping of radon risk. In addition to geology and meteorology, physical soil properties play a significant role in the determination of GRP. In order to compile a reliable GRP map for a model area in Central-Hungary, spatial auxiliary information representing GRP forming environmental factors were taken into account to support the spatial inference of the locally measured GRP values. Since the number of measured sites was limited, efficient spatial prediction methodologies were searched for to construct a reliable map for a larger area. Regression kriging (RK) was applied for the interpolation using spatially exhaustive auxiliary data on soil, geology, topography, land use and climate. RK divides the spatial inference into two parts. Firstly, the deterministic component of the target variable is determined by a regression model. The residuals of the multiple linear regression analysis represent the spatially varying but dependent stochastic component, which are interpolated by kriging. The final map is the sum of the two component predictions. Overall accuracy of the map was tested by Leave-One-Out Cross-Validation. Furthermore the spatial reliability of the resultant map is also estimated by the calculation of the 90% prediction interval of the local prediction values. The applicability of the applied method as well as that of the map is discussed briefly. PMID:26706761

  18. Mapping geogenic radon potential by regression kriging.

    PubMed

    Pásztor, László; Szabó, Katalin Zsuzsanna; Szatmári, Gábor; Laborczi, Annamária; Horváth, Ákos

    2016-02-15

    Radon ((222)Rn) gas is produced in the radioactive decay chain of uranium ((238)U) which is an element that is naturally present in soils. Radon is transported mainly by diffusion and convection mechanisms through the soil depending mainly on the physical and meteorological parameters of the soil and can enter and accumulate in buildings. Health risks originating from indoor radon concentration can be attributed to natural factors and is characterized by geogenic radon potential (GRP). Identification of areas with high health risks require spatial modeling, that is, mapping of radon risk. In addition to geology and meteorology, physical soil properties play a significant role in the determination of GRP. In order to compile a reliable GRP map for a model area in Central-Hungary, spatial auxiliary information representing GRP forming environmental factors were taken into account to support the spatial inference of the locally measured GRP values. Since the number of measured sites was limited, efficient spatial prediction methodologies were searched for to construct a reliable map for a larger area. Regression kriging (RK) was applied for the interpolation using spatially exhaustive auxiliary data on soil, geology, topography, land use and climate. RK divides the spatial inference into two parts. Firstly, the deterministic component of the target variable is determined by a regression model. The residuals of the multiple linear regression analysis represent the spatially varying but dependent stochastic component, which are interpolated by kriging. The final map is the sum of the two component predictions. Overall accuracy of the map was tested by Leave-One-Out Cross-Validation. Furthermore the spatial reliability of the resultant map is also estimated by the calculation of the 90% prediction interval of the local prediction values. The applicability of the applied method as well as that of the map is discussed briefly.

  19. Tree Topology Estimation

    PubMed Central

    Estrada, Rolando; Tomasi, Carlo; Schmidler, Scott C.; Farsiu, Sina

    2015-01-01

    Tree-like structures are fundamental in nature, and it is often useful to reconstruct the topology of a tree—what connects to what—from a two-dimensional image of it. However, the projected branches often cross in the image: the tree projects to a planar graph, and the inverse problem of reconstructing the topology of the tree from that of the graph is ill-posed. We regularize this problem with a generative, parametric tree-growth model. Under this model, reconstruction is possible in linear time if one knows the direction of each edge in the graph—which edge endpoint is closer to the root of the tree—but becomes NP-hard if the directions are not known. For the latter case, we present a heuristic search algorithm to estimate the most likely topology of a rooted, three-dimensional tree from a single two-dimensional image. Experimental results on retinal vessel, plant root, and synthetic tree datasets show that our methodology is both accurate and efficient. PMID:26353004

  20. Exposure and effects of perfluoroalkyl substances in tree swallows nesting in Minnesota and Wisconsin, USA

    USGS Publications Warehouse

    Custer, Christine M.; Custer, Thomas W.; Dummer, Paul; Etterson, Matthew A.; Thogmartin, Wayne E.; Wu, Qian; Kannan, Kurunthachalam; Trowbridge, Annette; McKann, Patrick C.

    2013-01-01

    The exposure and effects of perfluoroalkyl substances (PFASs) were studied at eight locations in Minnesota and Wisconsin between 2007 and 2011 using tree swallows (Tachycineta bicolor). Concentrations of PFASs were quantified as were reproductive success end points. The sample egg method was used wherein an egg sample is collected, and the hatching success of the remaining eggs in the nest is assessed. The association between PFAS exposure and reproductive success was assessed by site comparisons, logistic regression analysis, and multistate modeling, a technique not previously used in this context. There was a negative association between concentrations of perfluorooctane sulfonate (PFOS) in eggs and hatching success. The concentration at which effects became evident (150–200 ng/g wet weight) was far lower than effect levels found in laboratory feeding trials or egg-injection studies of other avian species. This discrepancy was likely because behavioral effects and other extrinsic factors are not accounted for in these laboratory studies and the possibility that tree swallows are unusually sensitive to PFASs. The results from multistate modeling and simple logistic regression analyses were nearly identical. Multistate modeling provides a better method to examine possible effects of additional covariates and assessment of models using Akaike information criteria analyses. There was a credible association between PFOS concentrations in plasma and eggs, so extrapolation between these two commonly sampled tissues can be performed.

  1. Practical Session: Multiple Linear Regression

    NASA Astrophysics Data System (ADS)

    Clausel, M.; Grégoire, G.

    2014-12-01

    Three exercises are proposed to illustrate the simple linear regression. In the first one investigates the influence of several factors on atmospheric pollution. It has been proposed by D. Chessel and A.B. Dufour in Lyon 1 (see Sect. 6 of http://pbil.univ-lyon1.fr/R/pdf/tdr33.pdf) and is based on data coming from 20 cities of U.S. Exercise 2 is an introduction to model selection whereas Exercise 3 provides a first example of analysis of variance. Exercises 2 and 3 have been proposed by A. Dalalyan at ENPC (see Exercises 2 and 3 of http://certis.enpc.fr/~dalalyan/Download/TP_ENPC_5.pdf).

  2. How Trees Can Save Energy.

    ERIC Educational Resources Information Center

    Fazio, James R., Ed.

    1991-01-01

    This document might easily have been called "How To Use Trees To Save Energy". It presents the energy saving advantages of landscaping the home and community with trees. The discussion includes: (1) landscaping advice to obtain the benefits of tree shade; (2) the heat island phenomenon in cities; (3) how and where to properly plant trees for…

  3. State Trees and Arbor Days.

    ERIC Educational Resources Information Center

    Forest Service (USDA), Washington, DC.

    Provides information on state trees for each of the 50 states and the District of Columbia. Includes for each state: (1) year in which state tree was chosen; (2) common and scientific names of the tree; (3) arbor day observance; (4) address of state forester; and (5) drawings of the tree, leaf, and fruit or cone. (JN)

  4. A Gibbs sampler for multivariate linear regression

    NASA Astrophysics Data System (ADS)

    Mantz, Adam B.

    2016-04-01

    Kelly described an efficient algorithm, using Gibbs sampling, for performing linear regression in the fairly general case where non-zero measurement errors exist for both the covariates and response variables, where these measurements may be correlated (for the same data point), where the response variable is affected by intrinsic scatter in addition to measurement error, and where the prior distribution of covariates is modelled by a flexible mixture of Gaussians rather than assumed to be uniform. Here, I extend the Kelly algorithm in two ways. First, the procedure is generalized to the case of multiple response variables. Secondly, I describe how to model the prior distribution of covariates using a Dirichlet process, which can be thought of as a Gaussian mixture where the number of mixture components is learned from the data. I present an example of multivariate regression using the extended algorithm, namely fitting scaling relations of the gas mass, temperature, and luminosity of dynamically relaxed galaxy clusters as a function of their mass and redshift. An implementation of the Gibbs sampler in the R language, called LRGS, is provided.

  5. Tree growth and competition in an old-growth Picea abies forest of boreal Sweden: influence of tree spatial patterning

    USGS Publications Warehouse

    Fraver, Shawn; D'Amato, Anthony W.; Bradford, John B.; Jonsson, Bengt Gunnar; Jönsson, Mari; Esseen, Per-Anders

    2013-01-01

    Question: What factors best characterize tree competitive environments in this structurally diverse old-growth forest, and do these factors vary spatially within and among stands? Location: Old-growth Picea abies forest of boreal Sweden. Methods: Using long-term, mapped permanent plot data augmented with dendrochronological analyses, we evaluated the effect of neighbourhood competition on focal tree growth by means of standard competition indices, each modified to include various metrics of trees size, neighbour mortality weighting (for neighbours that died during the inventory period), and within-neighbourhood tree clustering. Candidate models were evaluated using mixed-model linear regression analyses, with mean basal area increment as the response variable. We then analysed stand-level spatial patterns of competition indices and growth rates (via kriging) to determine if the relationship between these patterns could further elucidate factors influencing tree growth. Results: Inter-tree competition clearly affected growth rates, with crown volume being the size metric most strongly influencing the neighbourhood competitive environment. Including neighbour tree mortality weightings in models only slightly improved descriptions of competitive interactions. Although the within-neighbourhood clustering index did not improve model predictions, competition intensity was influenced by the underlying stand-level tree spatial arrangement: stand-level clustering locally intensified competition and reduced tree growth, whereas in the absence of such clustering, inter-tree competition played a lesser role in constraining tree growth. Conclusions: Our findings demonstrate that competition continues to influence forest processes and structures in an old-growth system that has not experienced major disturbances for at least two centuries. The finding that the underlying tree spatial pattern influenced the competitive environment suggests caution in interpreting traditional tree

  6. Identifying representative trees from ensembles.

    PubMed

    Banerjee, Mousumi; Ding, Ying; Noone, Anne-Michelle

    2012-07-10

    Tree-based methods have become popular for analyzing complex data structures where the primary goal is risk stratification of patients. Ensemble techniques improve the accuracy in prediction and address the instability in a single tree by growing an ensemble of trees and aggregating. However, in the process, individual trees get lost. In this paper, we propose a methodology for identifying the most representative trees in an ensemble on the basis of several tree distance metrics. Although our focus is on binary outcomes, the methods are applicable to censored data as well. For any two trees, the distance metrics are chosen to (1) measure similarity of the covariates used to split the trees; (2) reflect similar clustering of patients in the terminal nodes of the trees; and (3) measure similarity in predictions from the two trees. Whereas the latter focuses on prediction, the first two metrics focus on the architectural similarity between two trees. The most representative trees in the ensemble are chosen on the basis of the average distance between a tree and all other trees in the ensemble. Out-of-bag estimate of error rate is obtained using neighborhoods of representative trees. Simulations and data examples show gains in predictive accuracy when averaging over such neighborhoods. We illustrate our methods using a dataset of kidney cancer treatment receipt (binary outcome) and a second dataset of breast cancer survival (censored outcome).

  7. How tree roots respond to drought

    PubMed Central

    Brunner, Ivano; Herzog, Claude; Dawes, Melissa A.; Arend, Matthias; Sperisen, Christoph

    2015-01-01

    The ongoing climate change is characterized by increased temperatures and altered precipitation patterns. In addition, there has been an increase in both the frequency and intensity of extreme climatic events such as drought. Episodes of drought induce a series of interconnected effects, all of which have the potential to alter the carbon balance of forest ecosystems profoundly at different scales of plant organization and ecosystem functioning. During recent years, considerable progress has been made in the understanding of how aboveground parts of trees respond to drought and how these responses affect carbon assimilation. In contrast, processes of belowground parts are relatively underrepresented in research on climate change. In this review, we describe current knowledge about responses of tree roots to drought. Tree roots are capable of responding to drought through a variety of strategies that enable them to avoid and tolerate stress. Responses include root biomass adjustments, anatomical alterations, and physiological acclimations. The molecular mechanisms underlying these responses are characterized to some extent, and involve stress signaling and the induction of numerous genes, leading to the activation of tolerance pathways. In addition, mycorrhizas seem to play important protective roles. The current knowledge compiled in this review supports the view that tree roots are well equipped to withstand drought situations and maintain morphological and physiological functions as long as possible. Further, the reviewed literature demonstrates the important role of tree roots in the functioning of forest ecosystems and highlights the need for more research in this emerging field. PMID:26284083

  8. Multidimensional minimal spanning tree: The Dow Jones case

    NASA Astrophysics Data System (ADS)

    Brida, Juan Gabriel; Risso, Wiston Adrián

    2008-09-01

    This paper introduces a new methodology in order to construct Minimal Spanning Trees (MST) and Hierarchical Trees (HT) using the information provided by more than one variable. In fact, the Symbolic Time Series Analysis (STSA) approach is applied to the Dow Jones companies using information not only from asset returns but also for trading volume. The US stock market structure is obtained, showing eight clusters of companies and General Electric as a central node in the tree. We use different partitions showing that the results do not depend on the particular partition. In addition, we apply Monte Carlo simulations suggesting that the tree is not the result of random connections.

  9. Semiparametric regression during 2003–2007*

    PubMed Central

    Ruppert, David; Wand, M.P.; Carroll, Raymond J.

    2010-01-01

    Semiparametric regression is a fusion between parametric regression and nonparametric regression that integrates low-rank penalized splines, mixed model and hierarchical Bayesian methodology – thus allowing more streamlined handling of longitudinal and spatial correlation. We review progress in the field over the five-year period between 2003 and 2007. We find semiparametric regression to be a vibrant field with substantial involvement and activity, continual enhancement and widespread application. PMID:20305800

  10. MetaTreeMap: An Alternative Visualization Method for Displaying Metagenomic Phylogenic Trees

    PubMed Central

    Taylor, Todd D.

    2016-01-01

    Metagenomic samples can contain hundreds or thousands of different species. The most common method to identify these species is to sequence the samples and then classify the reads to nodes along a phylogenic tree. Linear representations of trees with so many nodes face legibility issues. In addition, such views are not optimal for appreciating the read quantity assigned to each node. The problem is exaggerated when comparison between multiple samples is needed. MetaTreeMap adapts a visualization method that addresses these weaknesses. The tree is represented by nested rectangles that illustrate the number or percentage of assigned reads. MetaTreeMap implements various options specific to phylogenic trees that allow for quick overview and investigation of the information. More generally, the goal of this software is to provide the user with the ability to easily display phylogenic trees based on various quantities assigned to the nodes, such as read number, percentage or other values. The tool can be used online at http://metasystems.riken.jp/visualization/treemap/. PMID:27336370

  11. The inference of gene trees with species trees.

    PubMed

    Szöllősi, Gergely J; Tannier, Eric; Daubin, Vincent; Boussau, Bastien

    2015-01-01

    This article reviews the various models that have been used to describe the relationships between gene trees and species trees. Molecular phylogeny has focused mainly on improving models for the reconstruction of gene trees based on sequence alignments. Yet, most phylogeneticists seek to reveal the history of species. Although the histories of genes and species are tightly linked, they are seldom identical, because genes duplicate, are lost or horizontally transferred, and because alleles can coexist in populations for periods that may span several speciation events. Building models describing the relationship between gene and species trees can thus improve the reconstruction of gene trees when a species tree is known, and vice versa. Several approaches have been proposed to solve the problem in one direction or the other, but in general neither gene trees nor species trees are known. Only a few studies have attempted to jointly infer gene trees and species trees. These models account for gene duplication and loss, transfer or incomplete lineage sorting. Some of them consider several types of events together, but none exists currently that considers the full repertoire of processes that generate gene trees along the species tree. Simulations as well as empirical studies on genomic data show that combining gene tree-species tree models with models of sequence evolution improves gene tree reconstruction. In turn, these better gene trees provide a more reliable basis for studying genome evolution or reconstructing ancestral chromosomes and ancestral gene sequences. We predict that gene tree-species tree methods that can deal with genomic data sets will be instrumental to advancing our understanding of genomic evolution.

  12. [Spatial interpolation of soil organic matter using regression Kriging and geographically weighted regression Kriging].

    PubMed

    Yang, Shun-hua; Zhang, Hai-tao; Guo, Long; Ren, Yan

    2015-06-01

    Relative elevation and stream power index were selected as auxiliary variables based on correlation analysis for mapping soil organic matter. Geographically weighted regression Kriging (GWRK) and regression Kriging (RK) were used for spatial interpolation of soil organic matter and compared with ordinary Kriging (OK), which acts as a control. The results indicated that soil or- ganic matter was significantly positively correlated with relative elevation whilst it had a significantly negative correlation with stream power index. Semivariance analysis showed that both soil organic matter content and its residuals (including ordinary least square regression residual and GWR resi- dual) had strong spatial autocorrelation. Interpolation accuracies by different methods were esti- mated based on a data set of 98 validation samples. Results showed that the mean error (ME), mean absolute error (MAE) and root mean square error (RMSE) of RK were respectively 39.2%, 17.7% and 20.6% lower than the corresponding values of OK, with a relative-improvement (RI) of 20.63. GWRK showed a similar tendency, having its ME, MAE and RMSE to be respectively 60.6%, 23.7% and 27.6% lower than those of OK, with a RI of 59.79. Therefore, both RK and GWRK significantly improved the accuracy of OK interpolation of soil organic matter due to their in- corporation of auxiliary variables. In addition, GWRK performed obviously better than RK did in this study, and its improved performance should be attributed to the consideration of sample spatial locations. PMID:26572015

  13. Diurnal Cycles of Tree Mass Obtained Using Accelerometers

    NASA Astrophysics Data System (ADS)

    Llamas, R. A.; Niemeier, J. J.; Kruger, A.; Lintz, H. E.; Kleinknecht, G. J.; Miller, R. A.

    2013-12-01

    We used a non-invasive technique to estimate the mass of trees using accelerometers. The technique was inspired by Selker et al., 2011 who performed experiments with an oak tree to estimate the time-varying canopy mass. The technique consists of placing an accelerometer on the trunk of a live tree. The resonance frequency is related to the mass of the tree. Wind drives the tree and the accelerometer data are analyzed to obtain estimates of the resonance frequency and mass of the tree. In addition to wind speed and direction, we also collected ambient temperature and rain accumulation using co-located instruments. We collected data for 3 months using several accelerometers configured for different sampling rates. Analysis of the data revealed diurnal cycles in temperature, wind speed, and tree mass derived from the tree resonance frequency. We used the Welch method for power spectral density estimation to obtain hourly estimates of the tree resonance frequency. Our hypothesis is that the mass diurnal cycle is related to the tree water content.

  14. Tree Testing of Hierarchical Menu Structures for Health Applications

    PubMed Central

    Le, Thai; Chaudhuri, Shomir; Chung, Jane; Thompson, Hilaire J; Demiris, George

    2014-01-01

    To address the need for greater evidence-based evaluation of Health Information Technology (HIT) systems we introduce a method of usability testing termed tree testing. In a tree test, participants are presented with an abstract hierarchical tree of the system taxonomy and asked to navigate through the tree in completing representative tasks. We apply tree testing to a commercially available health application, demonstrating a use case and providing a comparison with more traditional in-person usability testing methods. Online tree tests (N=54) and in-person usability tests (N=15) were conducted from August to September 2013. Tree testing provided a method to quantitatively evaluate the information structure of a system using various navigational metrics including completion time, task accuracy, and path length. The results of the analyses compared favorably to the results seen from the traditional usability test. Tree testing provides a flexible, evidence-based approach for researchers to evaluate the information structure of HITs. In addition, remote tree testing provides a quick, flexible, and high volume method of acquiring feedback in a structured format that allows for quantitative comparisons. With the diverse nature and often large quantities of health information available, addressing issues of terminology and concept classifications during the early development process of a health information system will improve navigation through the system and save future resources. Tree testing is a usability method that can be used to quickly and easily assess information hierarchy of health information systems. PMID:24582924

  15. Developmental Regression in Autism Spectrum Disorders

    ERIC Educational Resources Information Center

    Rogers, Sally J.

    2004-01-01

    The occurrence of developmental regression in autism is one of the more puzzling features of this disorder. Although several studies have documented the validity of parental reports of regression using home videos, accumulating data suggest that most children who demonstrate regression also demonstrated previous, subtle, developmental differences.…

  16. Building Regression Models: The Importance of Graphics.

    ERIC Educational Resources Information Center

    Dunn, Richard

    1989-01-01

    Points out reasons for using graphical methods to teach simple and multiple regression analysis. Argues that a graphically oriented approach has considerable pedagogic advantages in the exposition of simple and multiple regression. Shows that graphical methods may play a central role in the process of building regression models. (Author/LS)

  17. Regression Analysis by Example. 5th Edition

    ERIC Educational Resources Information Center

    Chatterjee, Samprit; Hadi, Ali S.

    2012-01-01

    Regression analysis is a conceptually simple method for investigating relationships among variables. Carrying out a successful application of regression analysis, however, requires a balance of theoretical results, empirical rules, and subjective judgment. "Regression Analysis by Example, Fifth Edition" has been expanded and thoroughly…

  18. Bayesian Unimodal Density Regression for Causal Inference

    ERIC Educational Resources Information Center

    Karabatsos, George; Walker, Stephen G.

    2011-01-01

    Karabatsos and Walker (2011) introduced a new Bayesian nonparametric (BNP) regression model. Through analyses of real and simulated data, they showed that the BNP regression model outperforms other parametric and nonparametric regression models of common use, in terms of predictive accuracy of the outcome (dependent) variable. The other,…

  19. Standards for Standardized Logistic Regression Coefficients

    ERIC Educational Resources Information Center

    Menard, Scott

    2011-01-01

    Standardized coefficients in logistic regression analysis have the same utility as standardized coefficients in linear regression analysis. Although there has been no consensus on the best way to construct standardized logistic regression coefficients, there is now sufficient evidence to suggest a single best approach to the construction of a…

  20. Streamflow forecasting using functional regression

    NASA Astrophysics Data System (ADS)

    Masselot, Pierre; Dabo-Niang, Sophie; Chebana, Fateh; Ouarda, Taha B. M. J.

    2016-07-01

    Streamflow, as a natural phenomenon, is continuous in time and so are the meteorological variables which influence its variability. In practice, it can be of interest to forecast the whole flow curve instead of points (daily or hourly). To this end, this paper introduces the functional linear models and adapts it to hydrological forecasting. More precisely, functional linear models are regression models based on curves instead of single values. They allow to consider the whole process instead of a limited number of time points or features. We apply these models to analyse the flow volume and the whole streamflow curve during a given period by using precipitations curves. The functional model is shown to lead to encouraging results. The potential of functional linear models to detect special features that would have been hard to see otherwise is pointed out. The functional model is also compared to the artificial neural network approach and the advantages and disadvantages of both models are discussed. Finally, future research directions involving the functional model in hydrology are presented.

  1. Estimating equivalence with quantile regression

    USGS Publications Warehouse

    Cade, B.S.

    2011-01-01

    Equivalence testing and corresponding confidence interval estimates are used to provide more enlightened statistical statements about parameter estimates by relating them to intervals of effect sizes deemed to be of scientific or practical importance rather than just to an effect size of zero. Equivalence tests and confidence interval estimates are based on a null hypothesis that a parameter estimate is either outside (inequivalence hypothesis) or inside (equivalence hypothesis) an equivalence region, depending on the question of interest and assignment of risk. The former approach, often referred to as bioequivalence testing, is often used in regulatory settings because it reverses the burden of proof compared to a standard test of significance, following a precautionary principle for environmental protection. Unfortunately, many applications of equivalence testing focus on establishing average equivalence by estimating differences in means of distributions that do not have homogeneous variances. I discuss how to compare equivalence across quantiles of distributions using confidence intervals on quantile regression estimates that detect differences in heterogeneous distributions missed by focusing on means. I used one-tailed confidence intervals based on inequivalence hypotheses in a two-group treatment-control design for estimating bioequivalence of arsenic concentrations in soils at an old ammunition testing site and bioequivalence of vegetation biomass at a reclaimed mining site. Two-tailed confidence intervals based both on inequivalence and equivalence hypotheses were used to examine quantile equivalence for negligible trends over time for a continuous exponential model of amphibian abundance. ?? 2011 by the Ecological Society of America.

  2. Insulin resistance: regression and clustering.

    PubMed

    Yoon, Sangho; Assimes, Themistocles L; Quertermous, Thomas; Hsiao, Chin-Fu; Chuang, Lee-Ming; Hwu, Chii-Min; Rajaratnam, Bala; Olshen, Richard A

    2014-01-01

    In this paper we try to define insulin resistance (IR) precisely for a group of Chinese women. Our definition deliberately does not depend upon body mass index (BMI) or age, although in other studies, with particular random effects models quite different from models used here, BMI accounts for a large part of the variability in IR. We accomplish our goal through application of Gauss mixture vector quantization (GMVQ), a technique for clustering that was developed for application to lossy data compression. Defining data come from measurements that play major roles in medical practice. A precise statement of what the data are is in Section 1. Their family structures are described in detail. They concern levels of lipids and the results of an oral glucose tolerance test (OGTT). We apply GMVQ to residuals obtained from regressions of outcomes of an OGTT and lipids on functions of age and BMI that are inferred from the data. A bootstrap procedure developed for our family data supplemented by insights from other approaches leads us to believe that two clusters are appropriate for defining IR precisely. One cluster consists of women who are IR, and the other of women who seem not to be. Genes and other features are used to predict cluster membership. We argue that prediction with "main effects" is not satisfactory, but prediction that includes interactions may be. PMID:24887437

  3. The Fault Tree Compiler (FTC): Program and mathematics

    NASA Technical Reports Server (NTRS)

    Butler, Ricky W.; Martensen, Anna L.

    1989-01-01

    The Fault Tree Compiler Program is a new reliability tool used to predict the top-event probability for a fault tree. Five different gate types are allowed in the fault tree: AND, OR, EXCLUSIVE OR, INVERT, AND m OF n gates. The high-level input language is easy to understand and use when describing the system tree. In addition, the use of the hierarchical fault tree capability can simplify the tree description and decrease program execution time. The current solution technique provides an answer precisely (within the limits of double precision floating point arithmetic) within a user specified number of digits accuracy. The user may vary one failure rate or failure probability over a range of values and plot the results for sensitivity analyses. The solution technique is implemented in FORTRAN; the remaining program code is implemented in Pascal. The program is written to run on a Digital Equipment Corporation (DEC) VAX computer with the VMS operation system.

  4. Tree nut allergens.

    PubMed

    Roux, Kenneth H; Teuber, Suzanne S; Sathe, Shridhar K

    2003-08-01

    Allergic reactions to tree nuts can be serious and life threatening. Considerable research has been conducted in recent years in an attempt to characterize those allergens that are most responsible for allergy sensitization and triggering. Both native and recombinant nut allergens have been identified and characterized and, for some, the IgE-reactive epitopes described. Some allergens, such as lipid transfer proteins, profilins, and members of the Bet v 1-related family, represent minor constituents in tree nuts. These allergens are frequently cross-reactive with other food and pollen homologues, and are considered panallergens. Others, such as legumins, vicilins, and 2S albumins, represent major seed storage protein constituents of the nuts. The allergenic tree nuts discussed in this review include those most commonly responsible for allergic reactions such as hazelnut, walnut, cashew, and almond as well as those less frequently associated with allergies including pecan, chestnut, Brazil nut, pine nut, macadamia nut, pistachio, coconut, Nangai nut, and acorn.

  5. Tree nut allergens.

    PubMed

    Roux, Kenneth H; Teuber, Suzanne S; Sathe, Shridhar K

    2003-08-01

    Allergic reactions to tree nuts can be serious and life threatening. Considerable research has been conducted in recent years in an attempt to characterize those allergens that are most responsible for allergy sensitization and triggering. Both native and recombinant nut allergens have been identified and characterized and, for some, the IgE-reactive epitopes described. Some allergens, such as lipid transfer proteins, profilins, and members of the Bet v 1-related family, represent minor constituents in tree nuts. These allergens are frequently cross-reactive with other food and pollen homologues, and are considered panallergens. Others, such as legumins, vicilins, and 2S albumins, represent major seed storage protein constituents of the nuts. The allergenic tree nuts discussed in this review include those most commonly responsible for allergic reactions such as hazelnut, walnut, cashew, and almond as well as those less frequently associated with allergies including pecan, chestnut, Brazil nut, pine nut, macadamia nut, pistachio, coconut, Nangai nut, and acorn. PMID:12915766

  6. Nonparametric instrumental regression with non-convex constraints

    NASA Astrophysics Data System (ADS)

    Grasmair, M.; Scherzer, O.; Vanhems, A.

    2013-03-01

    This paper considers the nonparametric regression model with an additive error that is dependent on the explanatory variables. As is common in empirical studies in epidemiology and economics, it also supposes that valid instrumental variables are observed. A classical example in microeconomics considers the consumer demand function as a function of the price of goods and the income, both variables often considered as endogenous. In this framework, the economic theory also imposes shape restrictions on the demand function, such as integrability conditions. Motivated by this illustration in microeconomics, we study an estimator of a nonparametric constrained regression function using instrumental variables by means of Tikhonov regularization. We derive rates of convergence for the regularized model both in a deterministic and stochastic setting under the assumption that the true regression function satisfies a projected source condition including, because of the non-convexity of the imposed constraints, an additional smallness condition.

  7. Long Tree-Ring Chronologies Provide Evidence of Recent Tree Growth Decrease in a Central African Tropical Forest

    PubMed Central

    Battipaglia, Giovanna; Zalloni, Enrica; Castaldi, Simona; Marzaioli, Fabio; Cazzolla- Gatti, Roberto; Lasserre, Bruno; Tognetti, Roberto; Marchetti, Marco; Valentini, Riccardo

    2015-01-01

    It is still unclear whether the exponential rise of atmospheric CO2 concentration has produced a fertilization effect on tropical forests, thus incrementing their growth rate, in the last two centuries. As many factors affect tree growth patterns, short -term studies might be influenced by the confounding effect of several interacting environmental variables on plant growth. Long-term analyses of tree growth can elucidate long-term trends of plant growth response to dominant drivers. The study of annual rings, applied to long tree-ring chronologies in tropical forest trees enables such analysis. Long-term tree-ring chronologies of three widespread African species were measured in Central Africa to analyze the growth of trees over the last two centuries. Growth trends were correlated to changes in global atmospheric CO2 concentration and local variations in the main climatic drivers, temperature and rainfall. Our results provided no evidence for a fertilization effect of CO2 on tree growth. On the contrary, an overall growth decline was observed for all three species in the last century, which appears to be significantly correlated to the increase in local temperature. These findings provide additional support to the global observations of a slowing down of C sequestration in the trunks of forest trees in recent decades. Data indicate that the CO2 increase alone has not been sufficient to obtain a tree growth increase in tropical trees. The effect of other changing environmental factors, like temperature, may have overridden the fertilization effect of CO2. PMID:25806946

  8. Fully Regressive Melanoma: A Case Without Metastasis.

    PubMed

    Ehrsam, Eric; Kallini, Joseph R; Lebas, Damien; Khachemoune, Amor; Modiano, Philippe; Cotten, Hervé

    2016-08-01

    Fully regressive melanoma is a phenomenon in which the primary cutaneous melanoma becomes completely replaced by fibrotic components as a result of host immune response. Although 10 to 35 percent of cases of cutaneous melanomas may partially regress, fully regressive melanoma is very rare; only 47 cases have been reported in the literature to date. AH of the cases of fully regressive melanoma reported in the literature were diagnosed in conjunction with metastasis on a patient. The authors describe a case of fully regressive melanoma without any metastases at the time of its diagnosis. Characteristic findings on dermoscopy, as well as the absence of melanoma on final biopsy, confirmed the diagnosis. PMID:27672418

  9. Heartwood and tree exudates

    SciTech Connect

    Hillis, W.E.

    1987-01-01

    Increasingly, mankind will depend on renewable resources produced at low energy cost - such as forest products. Greater demands will require increased growth as well as utilisation with reduced loss. After a certain age, trees from heartwood containing increased amounts of extractives which are also formed in injured sapwood or are exuded. Their presence can provide trees with resistance to disease and insect attack and they can also affect the efficient utilisation of wood. In this book different facets of heartwood, extractives and exudates are reviewed as a whole for the first time.

  10. Developmental regression in autism spectrum disorder

    PubMed Central

    Al Backer, Nouf Backer

    2015-01-01

    The occurrence of developmental regression in autism spectrum disorder (ASD) is one of the most puzzling phenomena of this disorder. A little is known about the nature and mechanism of developmental regression in ASD. About one-third of young children with ASD lose some skills during the preschool period, usually speech, but sometimes also nonverbal communication, social or play skills are also affected. There is a lot of evidence suggesting that most children who demonstrate regression also had previous, subtle, developmental differences. It is difficult to predict the prognosis of autistic children with developmental regression. It seems that the earlier development of social, language, and attachment behaviors followed by regression does not predict the later recovery of skills or better developmental outcomes. The underlying mechanisms that lead to regression in autism are unknown. The role of subclinical epilepsy in the developmental regression of children with autism remains unclear. PMID:27493417

  11. Optimizing a basal bark spray of dinotefuran to manage armored scales (Hemiptera: Diaspididae) in Christmas tree plantations.

    PubMed

    Cowles, Richard S

    2010-10-01

    The armored scales Fiorinia externa Ferris and Aspidiotus cryptomeriae Kuwana (Hemiptera: Diaspididae) are increasingly damaging to Christmas tree plantings in southern New England. The systemic insecticide dinotefuran was investigated for selectively suppressing armored scale populations relative to their natural enemies in cooperating growers' fields in 2008 and 2009. Banded soil application of dinotefuran resulted in poor control. However, a dinotefuran spray applied to the basal 25 cm of trunk resulted in its absorption through the bark, translocation to the foliage, and good efficacy. The basal bark spray did not significantly impact the activity of predators Chilocorus stigma (Say) or Cybocephalus nipponicus Enrody-Younga and in 2009 showed a dosage-dependent improvement in the percentage of scales parasitized by Encarsia citrina Craw. A field dosage-response factorial experiment revealed that a 0.25% (vol:vol) addition of a surfactant with dinotefuran did not enhance insecticidal effect. Probit-transformed scale population reduction relative to the untreated check was subjected to linear regression analysis; reduction of scale populations was proportional to the log of insecticide dosage, whereas basal bark spray efficacy declined in proportion to the cube of tree height. The regression equation can be used to optimize dosage relative to tree height. Excellent efficacy resulted from basal bark spray application dates of 28 April (prebud break) to mid-June, but earlier spray timing within that treatment window had fewer crawlers discoloring new growth with their short-lived feeding. A basal bark spray of dinotefuran is well suited for integration with natural enemies to manage armored scales in Christmas tree plantations.

  12. The Inference of Gene Trees with Species Trees

    PubMed Central

    Szöllősi, Gergely J.; Tannier, Eric; Daubin, Vincent; Boussau, Bastien

    2015-01-01

    This article reviews the various models that have been used to describe the relationships between gene trees and species trees. Molecular phylogeny has focused mainly on improving models for the reconstruction of gene trees based on sequence alignments. Yet, most phylogeneticists seek to reveal the history of species. Although the histories of genes and species are tightly linked, they are seldom identical, because genes duplicate, are lost or horizontally transferred, and because alleles can coexist in populations for periods that may span several speciation events. Building models describing the relationship between gene and species trees can thus improve the reconstruction of gene trees when a species tree is known, and vice versa. Several approaches have been proposed to solve the problem in one direction or the other, but in general neither gene trees nor species trees are known. Only a few studies have attempted to jointly infer gene trees and species trees. These models account for gene duplication and loss, transfer or incomplete lineage sorting. Some of them consider several types of events together, but none exists currently that considers the full repertoire of processes that generate gene trees along the species tree. Simulations as well as empirical studies on genomic data show that combining gene tree–species tree models with models of sequence evolution improves gene tree reconstruction. In turn, these better gene trees provide a more reliable basis for studying genome evolution or reconstructing ancestral chromosomes and ancestral gene sequences. We predict that gene tree–species tree methods that can deal with genomic data sets will be instrumental to advancing our understanding of genomic evolution. PMID:25070970

  13. Forward estimation for game-tree search

    SciTech Connect

    Zhang, Weixiong

    1996-12-31

    It is known that bounds on the minimax values of nodes in a game tree can be used to reduce the computational complexity of minimax search for two-player games. We describe a very simple method to estimate bounds on the minimax values of interior nodes of a game tree, and use the bounds to improve minimax search. The new algorithm, called forward estimation, does not require additional domain knowledge other than a static node evaluation function, and has small constant overhead per node expansion. We also propose a variation of forward estimation, which provides a tradeoff between computational complexity and decision quality. Our experimental results show that forward estimation outperforms alpha-beta pruning on random game trees and the game of Othello.

  14. Tree attenuation at 20 GHz: Foliage effects

    NASA Technical Reports Server (NTRS)

    Vogel, Wolfhard J.; Goldhirsh, Julius

    1993-01-01

    Static tree attenuation measurements at 20 GHz (K-Band) on a 30 deg slant path through a mature Pecan tree with and without leaves showed median fades exceeding approximately 23 dB and 7 dB, respectively. The corresponding 1% probability fades were 43 dB and 25 dB. Previous 1.6 GHz (L-Band) measurements for the bare tree case showed fades larger than those at K-Band by 3.4 dB for the median and smaller by approximately 7 dB at the 1% probability. While the presence of foliage had only a small effect on fading at L-Band (approximately 1 dB additional for the median to 1% probability range), the attenuation increase was significant at K-Band, where it increased by about 17 dB over the same probability range.

  15. Arbutus unedo, Strawberry Tree

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The Encylopedia of Fruit and Nuts is designed as a research reference source on temperate and tropical fruit and nut crops. Strawberry tree or madrone is native to the Mediterranean region of southern Europe (Arbutus unedo L., Ericaceae) with a relict population in Ireland, as well as in North Ameri...

  16. A Universal Phylogenetic Tree.

    ERIC Educational Resources Information Center

    Offner, Susan

    2001-01-01

    Presents a universal phylogenetic tree suitable for use in high school and college-level biology classrooms. Illustrates the antiquity of life and that all life is related, even if it dates back 3.5 billion years. Reflects important evolutionary relationships and provides an exciting way to learn about the history of life. (SAH)

  17. MPI File Tree Walk

    2007-04-30

    MPI-FTW is a scalable MPI based software application that navigates a directory tree by dynamically allocating processes to navigate sub-directories found. Upon completion, MPI-FTW provides statistics on the number of directories found, files found, and time to complete. Inaddition, commands can be executed at each directory level.

  18. Tree-Ties.

    ERIC Educational Resources Information Center

    Gresczyk, Rick

    Created to help students understand how plants were used for food, for medicine, and for arts and crafts among the Ojibwe (Chippewa) Indians, the game Tree-Ties combines earth and social sciences within a specific culture. The game requires mutual respect, understanding, and agreement to succeed. Sounding like the word "treaties", the title is a…

  19. The Medicine Tree.

    ERIC Educational Resources Information Center

    Brokenleg, Martin

    2000-01-01

    Demographic changes in population continue to bring children of different cultural backgrounds to classrooms. This article provides suggestions teachers and counselors can use to bridge cultures. Using the parable of a medicine tree, it explains how no society can endure without caring for its young. (Author/JDM)

  20. Phylogenics & Tree-Thinking

    ERIC Educational Resources Information Center

    Baum, David A.; Offner, Susan

    2008-01-01

    Phylogenetic trees, which are depictions of the inferred evolutionary relationships among a set of species, now permeate almost all branches of biology and are appearing in increasing numbers in biology textbooks. While few state standards explicitly require knowledge of phylogenetics, most require some knowledge of evolutionary biology, and many…

  1. Tree theorem for inflation

    SciTech Connect

    Weinberg, Steven

    2008-09-15

    It is shown that the generating function for tree graphs in the ''in-in'' formalism may be calculated by solving the classical equations of motion subject to certain constraints. This theorem is illustrated by application to the evolution of a single inflaton field in a Robertson-Walker background.

  2. Digging Deeper with Trees.

    ERIC Educational Resources Information Center

    Growing Ideas, 2001

    2001-01-01

    Describes hands-on science areas that focus on trees. A project on leaf pigmentation involves putting crushed leaves in a test tube with solvent acetone to dissolve pigment. In another project, students learn taxonomy by sorting and classifying leaves based on observable characteristics. Includes a language arts connection. (PVD)

  3. Trees at the Center.

    ERIC Educational Resources Information Center

    Flannery, Maura

    1998-01-01

    Recommends introducing students to biology using a topical focus that can offer intriguing perspectives on the discipline. Describes a biology course that uses trees as a topical focus. Presents a list of literary resources and reviews student interactions. Contains 50 references. (DDR)

  4. Christmas Tree Category Manual.

    ERIC Educational Resources Information Center

    Bowman, James S.; Turmel, Jon P.

    This manual provides information needed to meet the standards for pesticide applicator certification. Pests and diseases of christmas tree plantations are identified and discussed. Section one deals with weeds and woody plants and the application, formulation and effects of herbicides in controlling them. Section two discusses specific diseases…

  5. The Sacred Tree.

    ERIC Educational Resources Information Center

    Lethbridge Univ. (Alberta).

    Designed as a text for high school students and adults, this illustrated book presents ethical concepts and teachings of Native societies throughout North America concerning the nature and possibilities of human existence. The final component of a course in self-discovery and development, the book begins with the legend of the "Sacred Tree"…

  6. Flexible regression models over river networks

    PubMed Central

    O’Donnell, David; Rushworth, Alastair; Bowman, Adrian W; Marian Scott, E; Hallard, Mark

    2014-01-01

    Many statistical models are available for spatial data but the vast majority of these assume that spatial separation can be measured by Euclidean distance. Data which are collected over river networks constitute a notable and commonly occurring exception, where distance must be measured along complex paths and, in addition, account must be taken of the relative flows of water into and out of confluences. Suitable models for this type of data have been constructed based on covariance functions. The aim of the paper is to place the focus on underlying spatial trends by adopting a regression formulation and using methods which allow smooth but flexible patterns. Specifically, kernel methods and penalized splines are investigated, with the latter proving more suitable from both computational and modelling perspectives. In addition to their use in a purely spatial setting, penalized splines also offer a convenient route to the construction of spatiotemporal models, where data are available over time as well as over space. Models which include main effects and spatiotemporal interactions, as well as seasonal terms and interactions, are constructed for data on nitrate pollution in the River Tweed. The results give valuable insight into the changes in water quality in both space and time. PMID:25653460

  7. Tree-Ring Evidence for Volcanic Eruption Effects on Temperate and Boreal Tree Net Primary Productivity

    NASA Astrophysics Data System (ADS)

    Krakauer, N. Y.; Smith, N. V.; Randerson, J. T.

    2003-12-01

    The 1991 Pinatubo eruption and the apparent increased terrestrial carbon uptake in 1992 and 1993 have motivated interest in understanding the impact on plant productivity of the climate and radiative change resulting from volcanic eruptions that generate large stratospheric aerosol loadings. We used tree ring width series to look for anomalously high or low tree growth following 10 large eruptions since 1500 (not including Pinatubo) that resulted in stratospheric aerosol loadings comparable to Pinatubo's. We obtained crossdated ring width series from the International Tree Ring Data Bank, developed regional mean width indices, and used a Monte Carlo approach to test for significant departures in the indices following eruptions. Boreal zone trees (north of 50° N) showed significantly reduced widths ( ˜5% below average) for several years centered around years 4-6 after eruptions. Temperate zone (35° -50° N) trees in eastern North America showed significantly increased (by ˜6%) widths on years 0-2 after eruptions. Temperate zone trees in western North America showed a smaller increase, and trees in Europe showed no increase. We tentatively suggest that eruption-induced cooling causes the growth reduction in boreal trees, whereas the differing regional patterns found in temperate trees could be due to a combination of differences in eruption climate effects between regions, temperature versus moisture limited growth depending on ambient climate, and enhancement of tree light use efficiency in closed-canopy forest because of an increase in diffuse light fraction. Our findings invite additional research to clarify how regional climate and ecology modulate the effects of eruptions on tree growth and to assess the net effect of eruptions on global plant productivity. A series of annual tree carbon increment compiled from coring in plots of Harvard Forest, Massachusetts (42.5° N, 72.2° W), as well as other series for eastern North America do not show increased growth

  8. Unravelling the limits to tree height: a major role for water and nutrient trade-offs.

    PubMed

    Cramer, Michael D

    2012-05-01

    Competition for light has driven forest trees to grow exceedingly tall, but the lack of a single universal limit to tree height indicates multiple interacting environmental limitations. Because soil nutrient availability is determined by both nutrient concentrations and soil water, water and nutrient availabilities may interact in determining realised nutrient availability and consequently tree height. In SW Australia, which is characterised by nutrient impoverished soils that support some of the world's tallest forests, total [P] and water availability were independently correlated with tree height (r = 0.42 and 0.39, respectively). However, interactions between water availability and each of total [P], pH and [Mg] contributed to a multiple linear regression model of tree height (r = 0.72). A boosted regression tree model showed that maximum tree height was correlated with water availability (24%), followed by soil properties including total P (11%), Mg (10%) and total N (9%), amongst others, and that there was an interaction between water availability and total [P] in determining maximum tree height. These interactions indicated a trade-off between water and P availability in determining maximum tree height in SW Australia. This is enabled by a species assemblage capable of growing tall and surviving (some) disturbances. The mechanism for this trade-off is suggested to be through water enabling mass-flow and diffusive mobility of P, particularly of relatively mobile organic P, although water interactions with microbial activity could also play a role.

  9. Using GA-Ridge regression to select hydro-geological parameters influencing groundwater pollution vulnerability.

    PubMed

    Ahn, Jae Joon; Kim, Young Min; Yoo, Keunje; Park, Joonhong; Oh, Kyong Joo

    2012-11-01

    For groundwater conservation and management, it is important to accurately assess groundwater pollution vulnerability. This study proposed an integrated model using ridge regression and a genetic algorithm (GA) to effectively select the major hydro-geological parameters influencing groundwater pollution vulnerability in an aquifer. The GA-Ridge regression method determined that depth to water, net recharge, topography, and the impact of vadose zone media were the hydro-geological parameters that influenced trichloroethene pollution vulnerability in a Korean aquifer. When using these selected hydro-geological parameters, the accuracy was improved for various statistical nonlinear and artificial intelligence (AI) techniques, such as multinomial logistic regression, decision trees, artificial neural networks, and case-based reasoning. These results provide a proof of concept that the GA-Ridge regression is effective at determining influential hydro-geological parameters for the pollution vulnerability of an aquifer, and in turn, improves the AI performance in assessing groundwater pollution vulnerability.

  10. A tree grows in Seattle.

    PubMed

    1978-01-01

    In 1975 in Seattle, Washington, a nonprescription contraceptive specialty shop known as the Rubber Tree opened as a nonprofit project of the Seattle Chapter of Zero Population Growth. A visit to the shop confirms the logic of the concept that unhealthy attitudes toward contraceptives, particularly condoms, are reinforced when the products are not sold openly. The shop's clientele - the only shop of its kind in the U.S. - has steadily grown over 3 years; an estimated 7000 person were served last year. Inside the shop, situated on a quiet residential street, there are brightly packaged contraceptives attractively arranged on shelves as well as colorful posters, T-shirts, and bumper stickers bearing population control messages, and a variety of books dealing with human sexuality and family planning. In addition to making nonprescription contraceptives easily available and working to abolish social taboos, the Rubber Tree's mission includes the dissemination of medically accurate information in the hope of reducing the incidence of venereal disease and unplanned pregnancies. The shop also has a mail-order service to help increase the availability of contraceptives and family planning information in small towns and rural areas.

  11. Waveform correlation by tree matching.

    PubMed

    Cheng, Y C; Lu, S Y

    1985-03-01

    A waveform correlation scheme is presented. The scheme consists of four parts: 1) the representation of waveforms by trees, 2) the definition of basic operations on tree nodes and tree distance, 3) a tree matching algorithm, and 4) a backtracking procedure to find the best node-to-node correlation. This correlation scheme has been implemented. Results show that the scheme has the capability of handling distortions that result from stretching or shrinking of intervals or from missing intervals.

  12. A Beta-splitting model for evolutionary trees

    PubMed Central

    Sainudiin, Raazesh

    2016-01-01

    In this article, we construct a generalization of the Blum–François Beta-splitting model for evolutionary trees, which was itself inspired by Aldous' Beta-splitting model on cladograms. The novelty of our approach allows for asymmetric shares of diversification rates (or diversification ‘potential’) between two sister species in an evolutionarily interpretable manner, as well as the addition of extinction to the model in a natural way. We describe the incremental evolutionary construction of a tree with n leaves by splitting or freezing extant lineages through the generating, organizing and deleting processes. We then give the probability of any (binary rooted) tree under this model with no extinction, at several resolutions: ranked planar trees giving asymmetric roles to the first and second offspring species of a given species and keeping track of the order of the speciation events occurring during the creation of the tree, unranked planar trees, ranked non-planar trees and finally (unranked non-planar) trees. We also describe a continuous-time equivalent of the generating, organizing and deleting processes where tree topology and branch lengths are jointly modelled and provide code in SageMath/Python for these algorithms. PMID:27293780

  13. A Beta-splitting model for evolutionary trees.

    PubMed

    Sainudiin, Raazesh; Véber, Amandine

    2016-05-01

    In this article, we construct a generalization of the Blum-François Beta-splitting model for evolutionary trees, which was itself inspired by Aldous' Beta-splitting model on cladograms. The novelty of our approach allows for asymmetric shares of diversification rates (or diversification 'potential') between two sister species in an evolutionarily interpretable manner, as well as the addition of extinction to the model in a natural way. We describe the incremental evolutionary construction of a tree with n leaves by splitting or freezing extant lineages through the generating, organizing and deleting processes. We then give the probability of any (binary rooted) tree under this model with no extinction, at several resolutions: ranked planar trees giving asymmetric roles to the first and second offspring species of a given species and keeping track of the order of the speciation events occurring during the creation of the tree, unranked planar trees, ranked non-planar trees and finally (unranked non-planar) trees. We also describe a continuous-time equivalent of the generating, organizing and deleting processes where tree topology and branch lengths are jointly modelled and provide code in SageMath/Python for these algorithms.

  14. A Beta-splitting model for evolutionary trees.

    PubMed

    Sainudiin, Raazesh; Véber, Amandine

    2016-05-01

    In this article, we construct a generalization of the Blum-François Beta-splitting model for evolutionary trees, which was itself inspired by Aldous' Beta-splitting model on cladograms. The novelty of our approach allows for asymmetric shares of diversification rates (or diversification 'potential') between two sister species in an evolutionarily interpretable manner, as well as the addition of extinction to the model in a natural way. We describe the incremental evolutionary construction of a tree with n leaves by splitting or freezing extant lineages through the generating, organizing and deleting processes. We then give the probability of any (binary rooted) tree under this model with no extinction, at several resolutions: ranked planar trees giving asymmetric roles to the first and second offspring species of a given species and keeping track of the order of the speciation events occurring during the creation of the tree, unranked planar trees, ranked non-planar trees and finally (unranked non-planar) trees. We also describe a continuous-time equivalent of the generating, organizing and deleting processes where tree topology and branch lengths are jointly modelled and provide code in SageMath/Python for these algorithms. PMID:27293780

  15. Efficient Gene Tree Correction Guided by Genome Evolution

    PubMed Central

    Lafond, Manuel; Seguin, Jonathan; Boussau, Bastien; Guéguen, Laurent; El-Mabrouk, Nadia; Tannier, Eric

    2016-01-01

    Motivations Gene trees inferred solely from multiple alignments of homologous sequences often contain weakly supported and uncertain branches. Information for their full resolution may lie in the dependency between gene families and their genomic context. Integrative methods, using species tree information in addition to sequence information, often rely on a computationally intensive tree space search which forecloses an application to large genomic databases. Results We propose a new method, called ProfileNJ, that takes a gene tree with statistical supports on its branches, and corrects its weakly supported parts by using a combination of information from a species tree and a distance matrix. Its low running time enabled us to use it on the whole Ensembl Compara database, for which we propose an alternative, arguably more plausible set of gene trees. This allowed us to perform a genome-wide analysis of duplication and loss patterns on the history of 63 eukaryote species, and predict ancestral gene content and order for all ancestors along the phylogeny. Availability A web interface called RefineTree, including ProfileNJ as well as a other gene tree correction methods, which we also test on the Ensembl gene families, is available at: http://www-ens.iro.umontreal.ca/~adbit/polytomysolver.html. The code of ProfileNJ as well as the set of gene trees corrected by ProfileNJ from Ensembl Compara version 73 families are also made available. PMID:27513924

  16. Optimizing Urban Tree Soil Substrate for the City of Vienna

    NASA Astrophysics Data System (ADS)

    Murer, Erwin; Strauss, Peter; Schmidt, Stefan

    2015-04-01

    Many of the city garden managements in Central Europe encounter problems with the sustainable growing of trees in the cities. Tree root space is more and more limited by pavements and roads and is polluted by salt application during winter time. Thus, the life expectancy of the city trees is decreasing because the trees become more susceptible to diseases. Diseased trees are a safety risk. These challenges are additionally enforced by lower budgets to re-establish new trees. To actively react on this challenge a new soil substrate for city trees has been developed and tested combining cost effectiveness with improved characteristics for water retention and nutrient delivery on one side and drainage capabilities on the other side. The new substrate should be inexpensive, easy and simple to produce and well miscible. Therefore, easily available materials have been tested which are river sediments that are delivered by annual floods; compost produced by a city owned composting plant and low cost dolomite chippings from quarries near Vienna. The final composition of the new Vienna tree substrate consists of 3 mineral components and one organic component. These are mixed in a relationship of 4 parts dolomite chippings, 3 parts sand and 3 parts of fluvial fine sediment and 2 parts of compost. After a laboratory phase to develop the new substrate, field testing of the newly developed substrate is presently carried out in three different types of field experiments consisting of 20 implementation sites distributed over the city of Vienna, with annual checking for the growth of trees, 2 implementation sites with sensors to measure the water and salt balance and 6 city lysimeters with implementation of enhanced facilities to monitor substrate and water behaviour. These facilities will be used to relate the growing factors in connection with the site properties, to developing of a fertilizer recommendation for urban trees and to make tests for the compatibility of the trees

  17. The Hopi Fruit Tree Book.

    ERIC Educational Resources Information Center

    Nyhuis, Jane

    Referring as often as possible to traditional Hopi practices and to materials readily available on the reservation, the illustrated booklet provides information on the care and maintenance of young fruit trees. An introduction to fruit trees explains the special characteristics of new trees, e.g., grafting, planting pits, and watering. The…

  18. Building up rhetorical structure trees

    SciTech Connect

    Marcu, D.

    1996-12-31

    I use the distinction between the nuclei and the satellites that pertain to discourse relations to introduce a compositionality criterion for discourse trees. I provide a first-order formalization of rhetorical structure trees and, on its basis, I derive an algorithm that constructs all the valid rhetorical trees that can be associated with a given discourse.

  19. New Life From Dead Trees

    ERIC Educational Resources Information Center

    DeGraaf, Richard M.

    1978-01-01

    There are numerous bird species that will nest only in dead or dying trees. Current forestry practices include clearing forests of these snags, or dead trees. This practice is driving many species out of the forests. An illustrated example of bird succession in and on a tree is given. (MA)

  20. The Re-Think Tree.

    ERIC Educational Resources Information Center

    Gear, Jim

    1993-01-01

    The Re-Think Tree is a simple framework to help individuals assess and improve their behaviors related to environmental issues. The branches of the tree in order of priority are refuse, reduce, re-use, and recycle. Roots of the tree include such things as public opinion, education, and watchdog groups. (KS)

  1. LRGS: Linear Regression by Gibbs Sampling

    NASA Astrophysics Data System (ADS)

    Mantz, Adam B.

    2016-02-01

    LRGS (Linear Regression by Gibbs Sampling) implements a Gibbs sampler to solve the problem of multivariate linear regression with uncertainties in all measured quantities and intrinsic scatter. LRGS extends an algorithm by Kelly (2007) that used Gibbs sampling for performing linear regression in fairly general cases in two ways: generalizing the procedure for multiple response variables, and modeling the prior distribution of covariates using a Dirichlet process.

  2. Quantile regression applied to spectral distance decay

    USGS Publications Warehouse

    Rocchini, D.; Cade, B.S.

    2008-01-01

    Remotely sensed imagery has long been recognized as a powerful support for characterizing and estimating biodiversity. Spectral distance among sites has proven to be a powerful approach for detecting species composition variability. Regression analysis of species similarity versus spectral distance allows us to quantitatively estimate the amount of turnover in species composition with respect to spectral and ecological variability. In classical regression analysis, the residual sum of squares is minimized for the mean of the dependent variable distribution. However, many ecological data sets are characterized by a high number of zeroes that add noise to the regression model. Quantile regressions can be used to evaluate trend in the upper quantiles rather than a mean trend across the whole distribution of the dependent variable. In this letter, we used ordinary least squares (OLS) and quantile regressions to estimate the decay of species similarity versus spectral distance. The achieved decay rates were statistically nonzero (p < 0.01), considering both OLS and quantile regressions. Nonetheless, the OLS regression estimate of the mean decay rate was only half the decay rate indicated by the upper quantiles. Moreover, the intercept value, representing the similarity reached when the spectral distance approaches zero, was very low compared with the intercepts of the upper quantiles, which detected high species similarity when habitats are more similar. In this letter, we demonstrated the power of using quantile regressions applied to spectral distance decay to reveal species diversity patterns otherwise lost or underestimated by OLS regression. ?? 2008 IEEE.

  3. Regression Calibration with Heteroscedastic Error Variance

    PubMed Central

    Spiegelman, Donna; Logan, Roger; Grove, Douglas

    2011-01-01

    The problem of covariate measurement error with heteroscedastic measurement error variance is considered. Standard regression calibration assumes that the measurement error has a homoscedastic measurement error variance. An estimator is proposed to correct regression coefficients for covariate measurement error with heteroscedastic variance. Point and interval estimates are derived. Validation data containing the gold standard must be available. This estimator is a closed-form correction of the uncorrected primary regression coefficients, which may be of logistic or Cox proportional hazards model form, and is closely related to the version of regression calibration developed by Rosner et al. (1990). The primary regression model can include multiple covariates measured without error. The use of these estimators is illustrated in two data sets, one taken from occupational epidemiology (the ACE study) and one taken from nutritional epidemiology (the Nurses’ Health Study). In both cases, although there was evidence of moderate heteroscedasticity, there was little difference in estimation or inference using this new procedure compared to standard regression calibration. It is shown theoretically that unless the relative risk is large or measurement error severe, standard regression calibration approximations will typically be adequate, even with moderate heteroscedasticity in the measurement error model variance. In a detailed simulation study, standard regression calibration performed either as well as or better than the new estimator. When the disease is rare and the errors normally distributed, or when measurement error is moderate, standard regression calibration remains the method of choice. PMID:22848187

  4. Process modeling with the regression network.

    PubMed

    van der Walt, T; Barnard, E; van Deventer, J

    1995-01-01

    A new connectionist network topology called the regression network is proposed. The structural and underlying mathematical features of the regression network are investigated. Emphasis is placed on the intricacies of the optimization process for the regression network and some measures to alleviate these difficulties of optimization are proposed and investigated. The ability of the regression network algorithm to perform either nonparametric or parametric optimization, as well as a combination of both, is also highlighted. It is further shown how the regression network can be used to model systems which are poorly understood on the basis of sparse data. A semi-empirical regression network model is developed for a metallurgical processing operation (a hydrocyclone classifier) by building mechanistic knowledge into the connectionist structure of the regression network model. Poorly understood aspects of the process are provided for by use of nonparametric regions within the structure of the semi-empirical connectionist model. The performance of the regression network model is compared to the corresponding generalization performance results obtained by some other nonparametric regression techniques.

  5. Hybrid fuzzy regression with trapezoidal fuzzy data

    NASA Astrophysics Data System (ADS)

    Razzaghnia, T.; Danesh, S.; Maleki, A.

    2011-12-01

    In this regard, this research deals with a method for hybrid fuzzy least-squares regression. The extension of symmetric triangular fuzzy coefficients to asymmetric trapezoidal fuzzy coefficients is considered as an effective measure for removing unnecessary fuzziness of the linear fuzzy model. First, trapezoidal fuzzy variable is applied to derive a bivariate regression model. In the following, normal equations are formulated to solve the four parts of hybrid regression coefficients. Also the model is extended to multiple regression analysis. Eventually, method is compared with Y-H.O. chang's model.

  6. [From clinical judgment to linear regression model.

    PubMed

    Palacios-Cruz, Lino; Pérez, Marcela; Rivas-Ruiz, Rodolfo; Talavera, Juan O

    2013-01-01

    When we think about mathematical models, such as linear regression model, we think that these terms are only used by those engaged in research, a notion that is far from the truth. Legendre described the first mathematical model in 1805, and Galton introduced the formal term in 1886. Linear regression is one of the most commonly used regression models in clinical practice. It is useful to predict or show the relationship between two or more variables as long as the dependent variable is quantitative and has normal distribution. Stated in another way, the regression is used to predict a measure based on the knowledge of at least one other variable. Linear regression has as it's first objective to determine the slope or inclination of the regression line: Y = a + bx, where "a" is the intercept or regression constant and it is equivalent to "Y" value when "X" equals 0 and "b" (also called slope) indicates the increase or decrease that occurs when the variable "x" increases or decreases in one unit. In the regression line, "b" is called regression coefficient. The coefficient of determination (R(2)) indicates the importance of independent variables in the outcome.

  7. Geodesic least squares regression on information manifolds

    SciTech Connect

    Verdoolaege, Geert

    2014-12-05

    We present a novel regression method targeted at situations with significant uncertainty on both the dependent and independent variables or with non-Gaussian distribution models. Unlike the classic regression model, the conditional distribution of the response variable suggested by the data need not be the same as the modeled distribution. Instead they are matched by minimizing the Rao geodesic distance between them. This yields a more flexible regression method that is less constrained by the assumptions imposed through the regression model. As an example, we demonstrate the improved resistance of our method against some flawed model assumptions and we apply this to scaling laws in magnetic confinement fusion.

  8. Tests with VHR images for the identification of olive trees and other fruit trees in the European Union

    NASA Astrophysics Data System (ADS)

    Masson, Josiane; Soille, Pierre; Mueller, Rick

    2004-10-01

    In the context of the Common Agricultural Policy (CAP) there is a strong interest of the European Commission for counting and individually locating fruit trees. An automatic counting algorithm developed by the JRC (OLICOUNT) was used in the past for olive trees only, on 1m black and white orthophotos but with limits in case of young trees or irregular groves. This study investigates the improvement of fruit tree identification using VHR images on a large set of data in three test sites, one in Creta (Greece; one in the south-east of France with a majority of olive trees and associated fruit trees, and the last one in Florida on citrus trees. OLICOUNT was compared with two other automatic tree counting, applications, one using the CRISP software on citrus trees and the other completely automatic based on regional minima (morphological image analysis). Additional investigation was undertaken to refine the methods. This paper describes the automatic methods and presents the results derived from the tests.

  9. Normalization Ridge Regression in Practice I: Comparisons Between Ordinary Least Squares, Ridge Regression and Normalization Ridge Regression.

    ERIC Educational Resources Information Center

    Bulcock, J. W.

    The problem of model estimation when the data are collinear was examined. Though the ridge regression (RR) outperforms ordinary least squares (OLS) regression in the presence of acute multicollinearity, it is not a problem free technique for reducing the variance of the estimates. It is a stochastic procedure when it should be nonstochastic and it…

  10. Consistency and inconsistency of consensus methods for inferring species trees from gene trees in the presence of ancestral population structure.

    PubMed

    DeGiorgio, Michael; Rosenberg, Noah A

    2016-08-01

    In the last few years, several statistically consistent consensus methods for species tree inference have been devised that are robust to the gene tree discordance caused by incomplete lineage sorting in unstructured ancestral populations. One source of gene tree discordance that has only recently been identified as a potential obstacle for phylogenetic inference is ancestral population structure. In this article, we describe a general model of ancestral population structure, and by relying on a single carefully constructed example scenario, we show that the consensus methods Democratic Vote, STEAC, STAR, R(∗) Consensus, Rooted Triple Consensus, Minimize Deep Coalescences, and Majority-Rule Consensus are statistically inconsistent under the model. We find that among the consensus methods evaluated, the only method that is statistically consistent in the presence of ancestral population structure is GLASS/Maximum Tree. We use simulations to evaluate the behavior of the various consensus methods in a model with ancestral population structure, showing that as the number of gene trees increases, estimates on the basis of GLASS/Maximum Tree approach the true species tree topology irrespective of the level of population structure, whereas estimates based on the remaining methods only approach the true species tree topology if the level of structure is low. However, through simulations using species trees both with and without ancestral population structure, we show that GLASS/Maximum Tree performs unusually poorly on gene trees inferred from alignments with little information. This practical limitation of GLASS/Maximum Tree together with the inconsistency of other methods prompts the need for both further testing of additional existing methods and development of novel methods under conditions that incorporate ancestral population structure.

  11. Joint regression analysis and AMMI model applied to oat improvement

    NASA Astrophysics Data System (ADS)

    Oliveira, A.; Oliveira, T. A.; Mejza, S.

    2012-09-01

    In our work we present an application of some biometrical methods useful in genotype stability evaluation, namely AMMI model, Joint Regression Analysis (JRA) and multiple comparison tests. A genotype stability analysis of oat (Avena Sativa L.) grain yield was carried out using data of the Portuguese Plant Breeding Board, sample of the 22 different genotypes during the years 2002, 2003 and 2004 in six locations. In Ferreira et al. (2006) the authors state the relevance of the regression models and of the Additive Main Effects and Multiplicative Interactions (AMMI) model, to study and to estimate phenotypic stability effects. As computational techniques we use the Zigzag algorithm to estimate the regression coefficients and the agricolae-package available in R software for AMMI model analysis.

  12. Use of probabilistic weights to enhance linear regression myoelectric control

    NASA Astrophysics Data System (ADS)

    Smith, Lauren H.; Kuiken, Todd A.; Hargrove, Levi J.

    2015-12-01

    Objective. Clinically available prostheses for transradial amputees do not allow simultaneous myoelectric control of degrees of freedom (DOFs). Linear regression methods can provide simultaneous myoelectric control, but frequently also result in difficulty with isolating individual DOFs when desired. This study evaluated the potential of using probabilistic estimates of categories of gross prosthesis movement, which are commonly used in classification-based myoelectric control, to enhance linear regression myoelectric control. Approach. Gaussian models were fit to electromyogram (EMG) feature distributions for three movement classes at each DOF (no movement, or movement in either direction) and used to weight the output of linear regression models by the probability that the user intended the movement. Eight able-bodied and two transradial amputee subjects worked in a virtual Fitts’ law task to evaluate differences in controllability between linear regression and probability-weighted regression for an intramuscular EMG-based three-DOF wrist and hand system. Main results. Real-time and offline analyses in able-bodied subjects demonstrated that probability weighting improved performance during single-DOF tasks (p < 0.05) by preventing extraneous movement at additional DOFs. Similar results were seen in experiments with two transradial amputees. Though goodness-of-fit evaluations suggested that the EMG feature distributions showed some deviations from the Gaussian, equal-covariance assumptions used in this experiment, the assumptions were sufficiently met to provide improved performance compared to linear regression control. Significance. Use of probability weights can improve the ability to isolate individual during linear regression myoelectric control, while maintaining the ability to simultaneously control multiple DOFs.

  13. Global Value Trees

    PubMed Central

    Zhu, Zhen; Puliga, Michelangelo; Cerina, Federica; Chessa, Alessandro; Riccaboni, Massimo

    2015-01-01

    The fragmentation of production across countries has become an important feature of the globalization in recent decades and is often conceptualized by the term “global value chains” (GVCs). When empirically investigating the GVCs, previous studies are mainly interested in knowing how global the GVCs are rather than how the GVCs look like. From a complex networks perspective, we use the World Input-Output Database (WIOD) to study the evolution of the global production system. We find that the industry-level GVCs are indeed not chain-like but are better characterized by the tree topology. Hence, we compute the global value trees (GVTs) for all the industries available in the WIOD. Moreover, we compute an industry importance measure based on the GVTs and compare it with other network centrality measures. Finally, we discuss some future applications of the GVTs. PMID:25978067

  14. Doubly robust survival trees.

    PubMed

    Steingrimsson, Jon Arni; Diao, Liqun; Molinaro, Annette M; Strawderman, Robert L

    2016-09-10

    Estimating a patient's mortality risk is important in making treatment decisions. Survival trees are a useful tool and employ recursive partitioning to separate patients into different risk groups. Existing 'loss based' recursive partitioning procedures that would be used in the absence of censoring have previously been extended to the setting of right censored outcomes using inverse probability censoring weighted estimators of loss functions. In this paper, we propose new 'doubly robust' extensions of these loss estimators motivated by semiparametric efficiency theory for missing data that better utilize available data. Simulations and a data analysis demonstrate strong performance of the doubly robust survival trees compared with previously used methods. Copyright © 2016 John Wiley & Sons, Ltd. PMID:27037609

  15. Narrowing historical uncertainty: probabilistic classification of ambiguously identified tree species in historical forest survey data

    USGS Publications Warehouse

    Mladenoff, D.J.; Dahir, S.E.; Nordheim, E.V.; Schulte, L.A.; Guntenspergen, G.R.

    2002-01-01

    Historical data have increasingly become appreciated for insight into the past conditions of ecosystems. Uses of such data include assessing the extent of ecosystem change; deriving ecological baselines for management, restoration, and modeling; and assessing the importance of past conditions on the composition and function of current systems. One historical data set of this type is the Public Land Survey (PLS) of the United States General Land Office, which contains data on multiple tree species, sizes, and distances recorded at each survey point, located at half-mile (0.8 km) intervals on a 1-mi (1.6 km) grid. This survey method was begun in the 1790s on US federal lands extending westward from Ohio. Thus, the data have the potential of providing a view of much of the US landscape from the mid-1800s, and they have been used extensively for this purpose. However, historical data sources, such as those describing the species composition of forests, can often be limited in the detail recorded and the reliability of the data, since the information was often not originally recorded for ecological purposes. Forest trees are sometimes recorded ambiguously, using generic or obscure common names. For the PLS data of northern Wisconsin, USA, we developed a method to classify ambiguously identified tree species using logistic regression analysis, using data on trees that were clearly identified to species and a set of independent predictor variables to build the models. The models were first created on partial data sets for each species and then tested for fit against the remaining data. Validations were conducted using repeated, random subsets of the data. Model prediction accuracy ranged from 81% to 96% in differentiating congeneric species among oak, pine, ash, maple, birch, and elm. Major predictor variables were tree size, associated species, landscape classes indicative of soil type, and spatial location within the study region. Results help to clarify ambiguities

  16. Spatial vulnerability assessments by regression kriging

    NASA Astrophysics Data System (ADS)

    Pásztor, László; Laborczi, Annamária; Takács, Katalin; Szatmári, Gábor

    2016-04-01

    information representing IEW or GRP forming environmental factors were taken into account to support the spatial inference of the locally experienced IEW frequency and measured GRP values respectively. An efficient spatial prediction methodology was applied to construct reliable maps, namely regression kriging (RK) using spatially exhaustive auxiliary data on soil, geology, topography, land use and climate. RK divides the spatial inference into two parts. Firstly the deterministic component of the target variable is determined by a regression model. The residuals of the multiple linear regression analysis represent the spatially varying but dependent stochastic component, which are interpolated by kriging. The final map is the sum of the two component predictions. Application of RK also provides the possibility of inherent accuracy assessment. The resulting maps are characterized by global and local measures of its accuracy. Additionally the method enables interval estimation for spatial extension of the areas of predefined risk categories. All of these outputs provide useful contribution to spatial planning, action planning and decision making. Acknowledgement: Our work was partly supported by the Hungarian National Scientific Research Foundation (OTKA, Grant No. K105167).

  17. Predicting 'very poor' beach water quality gradings using classification tree.

    PubMed

    Thoe, Wai; Choi, King Wah; Lee, Joseph Hun-wei

    2016-02-01

    A beach water quality prediction system has been developed in Hong Kong using multiple linear regression (MLR) models. However, linear models are found to be weak at capturing the infrequent 'very poor' water quality occasions when Escherichia coli (E. coli) concentration exceeds 610 counts/100 mL. This study uses a classification tree to increase the accuracy in predicting the 'very poor' water quality events at three Hong Kong beaches affected either by non-point source or point source pollution. Binary-output classification trees (to predict whether E. coli concentration exceeds 610 counts/100 mL) are developed over the periods before and after the implementation of the Harbour Area Treatment Scheme, when systematic changes in water quality were observed. Results show that classification trees can capture more 'very poor' events in both periods when compared to the corresponding linear models, with an increase in correct positives by an average of 20%. Classification trees are also developed at two beaches to predict the four-category Beach Water Quality Indices. They perform worse than the binary tree and give excessive false alarms of 'very poor' events. Finally, a combined modelling approach using both MLR model and classification tree is proposed to enhance the beach water quality prediction system for Hong Kong.

  18. Predicting 'very poor' beach water quality gradings using classification tree.

    PubMed

    Thoe, Wai; Choi, King Wah; Lee, Joseph Hun-wei

    2016-02-01

    A beach water quality prediction system has been developed in Hong Kong using multiple linear regression (MLR) models. However, linear models are found to be weak at capturing the infrequent 'very poor' water quality occasions when Escherichia coli (E. coli) concentration exceeds 610 counts/100 mL. This study uses a classification tree to increase the accuracy in predicting the 'very poor' water quality events at three Hong Kong beaches affected either by non-point source or point source pollution. Binary-output classification trees (to predict whether E. coli concentration exceeds 610 counts/100 mL) are developed over the periods before and after the implementation of the Harbour Area Treatment Scheme, when systematic changes in water quality were observed. Results show that classification trees can capture more 'very poor' events in both periods when compared to the corresponding linear models, with an increase in correct positives by an average of 20%. Classification trees are also developed at two beaches to predict the four-category Beach Water Quality Indices. They perform worse than the binary tree and give excessive false alarms of 'very poor' events. Finally, a combined modelling approach using both MLR model and classification tree is proposed to enhance the beach water quality prediction system for Hong Kong. PMID:26837834

  19. Fault-Tree Compiler Program

    NASA Technical Reports Server (NTRS)

    Butler, Ricky W.; Martensen, Anna L.

    1992-01-01

    FTC, Fault-Tree Compiler program, is reliability-analysis software tool used to calculate probability of top event of fault tree. Five different types of gates allowed in fault tree: AND, OR, EXCLUSIVE OR, INVERT, and M OF N. High-level input language of FTC easy to understand and use. Program supports hierarchical fault-tree-definition feature simplifying process of description of tree and reduces execution time. Solution technique implemented in FORTRAN, and user interface in Pascal. Written to run on DEC VAX computer operating under VMS operating system.

  20. Hide and vanish: data sets where the most parsimonious tree is known but hard to find, and their implications for tree search methods.

    PubMed

    Goloboff, Pablo A

    2014-10-01

    Three different types of data sets, for which the uniquely most parsimonious tree can be known exactly but is hard to find with heuristic tree search methods, are studied. Tree searches are complicated more by the shape of the tree landscape (i.e. the distribution of homoplasy on different trees) than by the sheer abundance of homoplasy or character conflict. Data sets of Type 1 are those constructed by Radel et al. (2013). Data sets of Type 2 present a very rugged landscape, with narrow peaks and valleys, but relatively low amounts of homoplasy. For such a tree landscape, subjecting the trees to TBR and saving suboptimal trees produces much better results when the sequence of clipping for the tree branches is randomized instead of fixed. An unexpected finding for data sets of Types 1 and 2 is that starting a search from a random tree instead of a random addition sequence Wagner tree may increase the probability that the search finds the most parsimonious tree; a small artificial example where these probabilities can be calculated exactly is presented. Data sets of Type 3, the most difficult data sets studied here, comprise only congruent characters, and a single island with only one most parsimonious tree. Even if there is a single island, missing entries create a very flat landscape which is difficult to traverse with tree search algorithms because the number of equally parsimonious trees that need to be saved and swapped to effectively move around the plateaus is too large. Minor modifications of the parameters of tree drifting, ratchet, and sectorial searches allow travelling around these plateaus much more efficiently than saving and swapping large numbers of equally parsimonious trees with TBR. For these data sets, two new related criteria for selecting taxon addition sequences in Wagner trees (the "selected" and "informative" addition sequences) produce much better results than the standard random or closest addition sequences. These new methods for Wagner

  1. Penalized spline estimation for functional coefficient regression models

    PubMed Central

    Cao, Yanrong; Lin, Haiqun; Wu, Tracy Z.

    2011-01-01

    The functional coefficient regression models assume that the regression coefficients vary with some “threshold” variable, providing appreciable flexibility in capturing the underlying dynamics in data and avoiding the so-called “curse of dimensionality” in multivariate nonparametric estimation. We first investigate the estimation, inference, and forecasting for the functional coefficient regression models with dependent observations via penalized splines. The P-spline approach, as a direct ridge regression shrinkage type global smoothing method, is computationally efficient and stable. With established fixed-knot asymptotics, inference is readily available. Exact inference can be obtained for fixed smoothing parameter λ, which is most appealing for finite samples. Our penalized spline approach gives an explicit model expression, which also enables multi-step-ahead forecasting via simulations. Furthermore, we examine different methods of choosing the important smoothing parameter λ: modified multi-fold cross-validation (MCV), generalized cross-validation (GCV), and an extension of empirical bias bandwidth selection (EBBS) to P-splines. In addition, we implement smoothing parameter selection using mixed model framework through restricted maximum likelihood (REML) for P-spline functional coefficient regression models with independent observations. The P-spline approach also easily allows different smoothness for different functional coefficients, which is enabled by assigning different penalty λ accordingly. We demonstrate the proposed approach by both simulation examples and a real data application. PMID:21516260

  2. Analysis of Sting Balance Calibration Data Using Optimized Regression Models

    NASA Technical Reports Server (NTRS)

    Ulbrich, N.; Bader, Jon B.

    2010-01-01

    Calibration data of a wind tunnel sting balance was processed using a candidate math model search algorithm that recommends an optimized regression model for the data analysis. During the calibration the normal force and the moment at the balance moment center were selected as independent calibration variables. The sting balance itself had two moment gages. Therefore, after analyzing the connection between calibration loads and gage outputs, it was decided to choose the difference and the sum of the gage outputs as the two responses that best describe the behavior of the balance. The math model search algorithm was applied to these two responses. An optimized regression model was obtained for each response. Classical strain gage balance load transformations and the equations of the deflection of a cantilever beam under load are used to show that the search algorithm s two optimized regression models are supported by a theoretical analysis of the relationship between the applied calibration loads and the measured gage outputs. The analysis of the sting balance calibration data set is a rare example of a situation when terms of a regression model of a balance can directly be derived from first principles of physics. In addition, it is interesting to note that the search algorithm recommended the correct regression model term combinations using only a set of statistical quality metrics that were applied to the experimental data during the algorithm s term selection process.

  3. Interactive regional regression approach to estimating flood quantiles

    USGS Publications Warehouse

    Tasker, Gary D.; Slade, Raymond M.

    1994-01-01

    In Texas, a computer program has been developed which will estimate flood quantiles for an ungaged site based on data from gaging stations with similar watershed characteristics. The user enters site location and watershed characteristics for an ungaged site and the program selects, from a data base of gaging stations, a subset of stations to be used in the regression analysis. The subset of stations are selected based on the similarity of their basin characteristics to the ungaged site's basin characteristics. This approach offers several advantages over the usual regional regression approach. For example, the estimation data includes only stations whose size, topography, and climate are similar to the ungaged site. Therefore, predictions tend to be made near the center of the space of the explanatory variables, and extrapolation errors are reduced. In addition, any violation of the assumption of linearity for the regression is less likely to cause problems. A new regression equation is developed for each prediction site, thus numerous calculations are necessary. However, today's desktop computers can make the calculations easily. A split sampling study is used to compare this technique with the more conventional regional regression approach.

  4. Association between regression and self injury among children with autism.

    PubMed

    Lance, Eboni I; York, Janet M; Lee, Li-Ching; Zimmerman, Andrew W

    2014-02-01

    Self injurious behaviors (SIBs) are challenging clinical problems in individuals with autism spectrum disorders (ASDs). This study is one of the first and largest to utilize inpatient data to examine the associations between autism, developmental regression, and SIBs. Medical records of 125 neurobehavioral hospitalized patients with diagnoses of ASDs and SIBs between 4 and 17 years of age were reviewed. Data were collected from medical records on the type and frequency of SIBs and a history of language, social, or behavioral regression during development. The children with a history of any type of developmental regression (social, behavioral, or language) were more likely to have a diagnosis of autistic disorder than other ASD diagnoses. There were no significant differences in the occurrence of self injurious or other problem behaviors (such as aggression or disruption) between children with and without regression. Regression may influence the diagnostic considerations in ASDs but does not seem to influence the clinical phenotype with regard to behavioral issues. Additional data analyses explored the frequencies and subtypes of SIBs and other medical diagnoses in ASDs, with intellectual disability and disruptive behavior disorder found most commonly.

  5. PhyBin: binning trees by topology.

    PubMed

    Newton, Ryan R; Newton, Irene L G

    2013-01-01

    A major goal of many evolutionary analyses is to determine the true evolutionary history of an organism. Molecular methods that rely on the phylogenetic signal generated by a few to a handful of loci can be used to approximate the evolution of the entire organism but fall short of providing a global, genome-wide, perspective on evolutionary processes. Indeed, individual genes in a genome may have different evolutionary histories. Therefore, it is informative to analyze the number and kind of phylogenetic topologies found within an orthologous set of genes across a genome. Here we present PhyBin: a flexible program for clustering gene trees based on topological structure. PhyBin can generate bins of topologies corresponding to exactly identical trees or can utilize Robinson-Fould's distance matrices to generate clusters of similar trees, using a user-defined threshold. Additionally, PhyBin allows the user to adjust for potential noise in the dataset (as may be produced when comparing very closely related organisms) by pre-processing trees to collapse very short branches or those nodes not meeting a defined bootstrap threshold. As a test case, we generated individual trees based on an orthologous gene set from 10 Wolbachia species across four different supergroups (A-D) and utilized PhyBin to categorize the complete set of topologies produced from this dataset. Using this approach, we were able to show that although a single topology generally dominated the analysis, confirming the separation of the supergroups, many genes supported alternative evolutionary histories. Because PhyBin's output provides the user with lists of gene trees in each topological cluster, it can be used to explore potential reasons for discrepancies between phylogenies including homoplasies, long-branch attraction, or horizontal gene transfer events.

  6. First analysis of risk factors associated with bee colony collapse disorder by classification and regression trees

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Sudden losses of managed honey bee (Apis mellifera L.) colonies are considered an important problem worldwide but the underlying cause or causes of these losses are currently unknown. In the United States, this syndrome was termed Colony Collapse Disorder (CCD), since the defining trait was a rapid ...

  7. Pesticides in Urban Multiunit Dwellings: Hazard IdentificationUsing Classification and Regression Tree (CART) Analysis

    EPA Science Inventory

    Many units in public housing or other low-income urban dwellings may have elevated pesticide residues, given recurring infestation, but it would be logistically and economically infeasible to sample a large number of units to identify highly exposed households to design interven...

  8. How To Write a Municipal Tree Ordinance.

    ERIC Educational Resources Information Center

    Fazio, James R., Ed.

    1990-01-01

    At the heart of the Tree City USA program are four basic requirements: The community must have the following: (1) a tree board or department; (2) an annual community forestry program with financial provisions for trees and tree care; (3) an annual Arbor Day proclamation and observance; and (4) a tree ordinance. Sections of a model tree ordinance…

  9. Suppression Situations in Multiple Linear Regression

    ERIC Educational Resources Information Center

    Shieh, Gwowen

    2006-01-01

    This article proposes alternative expressions for the two most prevailing definitions of suppression without resorting to the standardized regression modeling. The formulation provides a simple basis for the examination of their relationship. For the two-predictor regression, the author demonstrates that the previous results in the literature are…

  10. Deriving the Regression Equation without Using Calculus

    ERIC Educational Resources Information Center

    Gordon, Sheldon P.; Gordon, Florence S.

    2004-01-01

    Probably the one "new" mathematical topic that is most responsible for modernizing courses in college algebra and precalculus over the last few years is the idea of fitting a function to a set of data in the sense of a least squares fit. Whether it be simple linear regression or nonlinear regression, this topic opens the door to applying the…

  11. A Practical Guide to Regression Discontinuity

    ERIC Educational Resources Information Center

    Jacob, Robin; Zhu, Pei; Somers, Marie-Andrée; Bloom, Howard

    2012-01-01

    Regression discontinuity (RD) analysis is a rigorous nonexperimental approach that can be used to estimate program impacts in situations in which candidates are selected for treatment based on whether their value for a numeric rating exceeds a designated threshold or cut-point. Over the last two decades, the regression discontinuity approach has…

  12. Dealing with Outliers: Robust, Resistant Regression

    ERIC Educational Resources Information Center

    Glasser, Leslie

    2007-01-01

    Least-squares linear regression is the best of statistics and it is the worst of statistics. The reasons for this paradoxical claim, arising from possible inapplicability of the method and the excessive influence of "outliers", are discussed and substitute regression methods based on median selection, which is both robust and resistant, are…

  13. Cross-Validation, Shrinkage, and Multiple Regression.

    ERIC Educational Resources Information Center

    Hynes, Kevin

    One aspect of multiple regression--the shrinkage of the multiple correlation coefficient on cross-validation is reviewed. The paper consists of four sections. In section one, the distinction between a fixed and a random multiple regression model is made explicit. In section two, the cross-validation paradigm and an explanation for the occurrence…

  14. Application and Interpretation of Hierarchical Multiple Regression.

    PubMed

    Jeong, Younhee; Jung, Mi Jung

    2016-01-01

    The authors reported the association between motivation and self-management behavior of individuals with chronic low back pain after adjusting control variables using hierarchical multiple regression (). This article describes details of the hierarchical regression applying the actual data used in the article by , including how to test assumptions, run the statistical tests, and report the results. PMID:27648796

  15. Regression Analysis: Legal Applications in Institutional Research

    ERIC Educational Resources Information Center

    Frizell, Julie A.; Shippen, Benjamin S., Jr.; Luna, Andrew L.

    2008-01-01

    This article reviews multiple regression analysis, describes how its results should be interpreted, and instructs institutional researchers on how to conduct such analyses using an example focused on faculty pay equity between men and women. The use of multiple regression analysis will be presented as a method with which to compare salaries of…

  16. A Simulation Investigation of Principal Component Regression.

    ERIC Educational Resources Information Center

    Allen, David E.

    Regression analysis is one of the more common analytic tools used by researchers. However, multicollinearity between the predictor variables can cause problems in using the results of regression analyses. Problems associated with multicollinearity include entanglement of relative influences of variables due to reduced precision of estimation,…

  17. Incremental Net Effects in Multiple Regression

    ERIC Educational Resources Information Center

    Lipovetsky, Stan; Conklin, Michael

    2005-01-01

    A regular problem in regression analysis is estimating the comparative importance of the predictors in the model. This work considers the 'net effects', or shares of the predictors in the coefficient of the multiple determination, which is a widely used characteristic of the quality of a regression model. Estimation of the net effects can be a…

  18. Illustration of Regression towards the Means

    ERIC Educational Resources Information Center

    Govindaraju, K.; Haslett, S. J.

    2008-01-01

    This article presents a procedure for generating a sequence of data sets which will yield exactly the same fitted simple linear regression equation y = a + bx. Unless rescaled, the generated data sets will have progressively smaller variability for the two variables, and the associated response and covariate will "regress" towards their…

  19. Regression Analysis and the Sociological Imagination

    ERIC Educational Resources Information Center

    De Maio, Fernando

    2014-01-01

    Regression analysis is an important aspect of most introductory statistics courses in sociology but is often presented in contexts divorced from the central concerns that bring students into the discipline. Consequently, we present five lesson ideas that emerge from a regression analysis of income inequality and mortality in the USA and Canada.

  20. Three-Dimensional Modeling in Linear Regression.

    ERIC Educational Resources Information Center

    Herman, James D.

    Linear regression examines the relationship between one or more independent (predictor) variables and a dependent variable. By using a particular formula, regression determines the weights needed to minimize the error term for a given set of predictors. With one predictor variable, the relationship between the predictor and the dependent variable…

  1. Protection of individual ash trees from emerald ash borer (Coleoptera: Buprestidae) with basal soil applications of imidacloprid.

    PubMed

    Smitley, D R; Rebek, E J; Royalty, R N; Davis, T W; Newhouse, K F

    2010-02-01

    We conducted field trials at five different locations over a period of 6 yr to investigate the efficacy of imidacloprid applied each spring as a basal soil drench for protection against emerald ash borer, Agrilus planipennis Fairmaire (Coleoptera: Buprestidae). Canopy thinning and emerald ash borer larval density were used to evaluate efficacy for 3-4 yr at each location while treatments continued. Test sites included small urban trees (5-15 cm diameter at breast height [dbh]), medium to large (15-65 cm dbh) trees at golf courses, and medium to large street trees. Annual basal drenches with imidacloprid gave complete protection of small ash trees for three years. At three sites where the size of trees ranged from 23 to 37 cm dbh, we successfully protected all ash trees beginning the test with <60% canopy thinning. Regression analysis of data from two sites reveals that tree size explains 46% of the variation in efficacy of imidacloprid drenches. The smallest trees (<30 cm dbh) remained in excellent condition for 3 yr, whereas most of the largest trees (>38 cm dbh) declined to a weakened state and undesirable appearance. The five-fold increase in trunk and branch surface area of ash trees as the tree dbh doubles may account for reduced efficacy on larger trees, and suggests a need to increase treatment rates for larger trees.

  2. Error analysis of leaf area estimates made from allometric regression models

    NASA Technical Reports Server (NTRS)

    Feiveson, A. H.; Chhikara, R. S.

    1986-01-01

    Biological net productivity, measured in terms of the change in biomass with time, affects global productivity and the quality of life through biochemical and hydrological cycles and by its effect on the overall energy balance. Estimating leaf area for large ecosystems is one of the more important means of monitoring this productivity. For a particular forest plot, the leaf area is often estimated by a two-stage process. In the first stage, known as dimension analysis, a small number of trees are felled so that their areas can be measured as accurately as possible. These leaf areas are then related to non-destructive, easily-measured features such as bole diameter and tree height, by using a regression model. In the second stage, the non-destructive features are measured for all or for a sample of trees in the plots and then used as input into the regression model to estimate the total leaf area. Because both stages of the estimation process are subject to error, it is difficult to evaluate the accuracy of the final plot leaf area estimates. This paper illustrates how a complete error analysis can be made, using an example from a study made on aspen trees in northern Minnesota. The study was a joint effort by NASA and the University of California at Santa Barbara known as COVER (Characterization of Vegetation with Remote Sensing).

  3. Nodule Regression in Adults With Nodular Gastritis

    PubMed Central

    Kim, Ji Wan; Lee, Sun-Young; Kim, Jeong Hwan; Sung, In-Kyung; Park, Hyung Seok; Shim, Chan-Sup; Han, Hye Seung

    2015-01-01

    Background Nodular gastritis (NG) is associated with the presence of Helicobacter pylori infection, but there are controversies on nodule regression in adults. The aim of this study was to analyze the factors that are related to the nodule regression in adults diagnosed as NG. Methods Adult population who were diagnosed as NG with H. pylori infection during esophagogastroduodenoscopy (EGD) at our center were included. Changes in the size and location of the nodules, status of H. pylori infection, upper gastrointestinal (UGI) symptom, EGD and pathology findings were analyzed between the initial and follow-up tests. Results Of the 117 NG patients, 66.7% (12/18) of the eradicated NG patients showed nodule regression after H. pylori eradication, whereas 9.9% (9/99) of the non-eradicated NG patients showed spontaneous nodule regression without H. pylori eradication (P < 0.001). Nodule regression was more frequent in NG patients with antral nodule location (P = 0.010), small-sized nodules (P = 0.029), H. pylori eradication (P < 0.001), UGI symptom (P = 0.007), and a long-term follow-up period (P = 0.030). On the logistic regression analysis, nodule regression was inversely correlated with the persistent H. pylori infection on the follow-up test (odds ratio (OR): 0.020, 95% confidence interval (CI): 0.003 - 0.137, P < 0.001) and short-term follow-up period < 30.5 months (OR: 0.140, 95% CI: 0.028 - 0.700, P = 0.017). Conclusions In adults with NG, H. pylori eradication is the most significant factor associated with nodule regression. Long-term follow-up period is also correlated with nodule regression, but is less significant than H. pylori eradication. Our findings suggest that H. pylori eradication should be considered to promote nodule regression in NG patients with H. pylori infection.

  4. Tamarind tree seed dispersal by ring-tailed lemurs.

    PubMed

    Mertl-Millhollen, Anne S; Blumenfeld-Jones, Kathryn; Raharison, Sahoby Marin; Tsaramanana, Donald Raymond; Rasamimanana, Hantanirina

    2011-10-01

    In Madagascar, the gallery forests of the south are among the most endangered. Tamarind trees (Tamarindus indica) dominate these riverine forests and are a keystone food resource for ring-tailed lemurs (Lemur catta). At Berenty Reserve, the presence of tamarind trees is declining, and there is little recruitment of young trees. Because mature tamarinds inhibit growth under their crowns, seeds must be dispersed away from adult trees if tree recruitment is to occur. Ring-tailed lemurs are likely seed dispersers; however, because they spend much of their feeding, siesta, and sleeping time in tamarinds, they may defecate a majority of the tamarind seeds under tamarind trees. To determine whether they disperse tamarind seeds away from overhanging tamarind tree crowns, we observed two troops for 10 days each, noted the locations of feeding and defecation, and collected seeds from feces and fruit for germination. We also collected additional data on tamarind seedling recruitment under natural conditions, in which seedling germination was abundant after extensive rain, including under the canopy. However, seedling survival to 1 year was lower when growing under mature tamarind tree crowns than when growing away from an overhanging crown. Despite low fruit abundance averaging two fruits/m(3) in tamarind crowns, lemurs fed on tamarind fruit for 32% of their feeding samples. Daily path lengths averaged 1,266 m, and lemurs deposited seeds throughout their ranges. Fifty-eight percent of the 417 recorded lemur defecations were on the ground away from overhanging tamarind tree crowns. Tamarind seeds collected from both fruit and feces germinated. Because lemurs deposited viable seeds on the ground away from overhanging mature tamarind tree crowns, we conclude that ring-tailed lemurs provide tamarind tree seed dispersal services.

  5. Tamarind tree seed dispersal by ring-tailed lemurs.

    PubMed

    Mertl-Millhollen, Anne S; Blumenfeld-Jones, Kathryn; Raharison, Sahoby Marin; Tsaramanana, Donald Raymond; Rasamimanana, Hantanirina

    2011-10-01

    In Madagascar, the gallery forests of the south are among the most endangered. Tamarind trees (Tamarindus indica) dominate these riverine forests and are a keystone food resource for ring-tailed lemurs (Lemur catta). At Berenty Reserve, the presence of tamarind trees is declining, and there is little recruitment of young trees. Because mature tamarinds inhibit growth under their crowns, seeds must be dispersed away from adult trees if tree recruitment is to occur. Ring-tailed lemurs are likely seed dispersers; however, because they spend much of their feeding, siesta, and sleeping time in tamarinds, they may defecate a majority of the tamarind seeds under tamarind trees. To determine whether they disperse tamarind seeds away from overhanging tamarind tree crowns, we observed two troops for 10 days each, noted the locations of feeding and defecation, and collected seeds from feces and fruit for germination. We also collected additional data on tamarind seedling recruitment under natural conditions, in which seedling germination was abundant after extensive rain, including under the canopy. However, seedling survival to 1 year was lower when growing under mature tamarind tree crowns than when growing away from an overhanging crown. Despite low fruit abundance averaging two fruits/m(3) in tamarind crowns, lemurs fed on tamarind fruit for 32% of their feeding samples. Daily path lengths averaged 1,266 m, and lemurs deposited seeds throughout their ranges. Fifty-eight percent of the 417 recorded lemur defecations were on the ground away from overhanging tamarind tree crowns. Tamarind seeds collected from both fruit and feces germinated. Because lemurs deposited viable seeds on the ground away from overhanging mature tamarind tree crowns, we conclude that ring-tailed lemurs provide tamarind tree seed dispersal services. PMID:21629992

  6. Improving ensemble decision tree performance using Adaboost and Bagging

    NASA Astrophysics Data System (ADS)

    Hasan, Md. Rajib; Siraj, Fadzilah; Sainin, Mohd Shamrie

    2015-12-01

    Ensemble classifier systems are considered as one of the most promising in medical data classification and the performance of deceision tree classifier can be increased by the ensemble method as it is proven to be better than single classifiers. However, in a ensemble settings the performance depends on the selection of suitable base classifier. This research employed two prominent esemble s namely Adaboost and Bagging with base classifiers such as Random Forest, Random Tree, j48, j48grafts and Logistic Model Regression (LMT) that have been selected independently. The empirical study shows that the performance varries when different base classifiers are selected and even some places overfitting issue also been noted. The evidence shows that ensemble decision tree classfiers using Adaboost and Bagging improves the performance of selected medical data sets.

  7. Investigating how students communicate tree-thinking

    NASA Astrophysics Data System (ADS)

    Boyce, Carrie Jo

    Learning is often an active endeavor that requires students work at building conceptual understandings of complex topics. Personal experiences, ideas, and communication all play large roles in developing knowledge of and understanding complex topics. Sometimes these experiences can promote formation of scientifically inaccurate or incomplete ideas. Representations are tools used to help individuals understand complex topics. In biology, one way that educators help people understand evolutionary histories of organisms is by using representations called phylogenetic trees. In order to understand phylogenetics trees, individuals need to understand the conventions associated with phylogenies. My dissertation, supported by the Tree-Thinking Representational Competence and Word Association frameworks, is a mixed-methods study investigating the changes in students' tree-reading, representational competence and mental association of phylogenetic terminology after participation in varied instruction. Participants included 128 introductory biology majors from a mid-sized southern research university. Participants were enrolled in either Introductory Biology I, where they were not taught phylogenetics, or Introductory Biology II, where they were explicitly taught phylogenetics. I collected data using a pre- and post-assessment consisting of a word association task and tree-thinking diagnostic (n=128). Additionally, I recruited a subset of students from both courses (n=37) to complete a computer simulation designed to teach students about phylogenetic trees. I then conducted semi-structured interviews consisting of a word association exercise with card sort task, a retrospective pre-assessment discussion, a post-assessment discussion, and interview questions. I found that students who received explicit lecture instruction had a significantly higher increase in scores on a tree-thinking diagnostic than students who did not receive lecture instruction. Students who received both

  8. On Determining if Tree-based Networks Contain Fixed Trees.

    PubMed

    Anaya, Maria; Anipchenko-Ulaj, Olga; Ashfaq, Aisha; Chiu, Joyce; Kaiser, Mahedi; Ohsawa, Max Shoji; Owen, Megan; Pavlechko, Ella; St John, Katherine; Suleria, Shivam; Thompson, Keith; Yap, Corrine

    2016-05-01

    We address an open question of Francis and Steel about phylogenetic networks and trees. They give a polynomial time algorithm to decide if a phylogenetic network, N, is tree-based and pose the problem: given a fixed tree T and network N, is N based on T? We show that it is [Formula: see text]-hard to decide, by reduction from 3-Dimensional Matching (3DM) and further that the problem is fixed-parameter tractable. PMID:27125655

  9. Chilling and heat requirements for flowering in temperate fruit trees

    NASA Astrophysics Data System (ADS)

    Guo, Liang; Dai, Junhu; Ranjitkar, Sailesh; Yu, Haiying; Xu, Jianchu; Luedeling, Eike

    2014-08-01

    Climate change has affected the rates of chilling and heat accumulation, which are vital for flowering and production, in temperate fruit trees, but few studies have been conducted in the cold-winter climates of East Asia. To evaluate tree responses to variation in chill and heat accumulation rates, partial least squares regression was used to correlate first flowering dates of chestnut ( Castanea mollissima Blume) and jujube ( Zizyphus jujube Mill.) in Beijing, China, with daily chill and heat accumulation between 1963 and 2008. The Dynamic Model and the Growing Degree Hour Model were used to convert daily records of minimum and maximum temperature into horticulturally meaningful metrics. Regression analyses identified the chilling and forcing periods for chestnut and jujube. The forcing periods started when half the chilling requirements were fulfilled. Over the past 50 years, heat accumulation during tree dormancy increased significantly, while chill accumulation remained relatively stable for both species. Heat accumulation was the main driver of bloom timing, with effects of variation in chill accumulation negligible in Beijing's cold-winter climate. It does not seem likely that reductions in chill will have a major effect on the studied species in Beijing in the near future. Such problems are much more likely for trees grown in locations that are substantially warmer than their native habitats, such as temperate species in the subtropics and tropics.

  10. Chilling and heat requirements for flowering in temperate fruit trees.

    PubMed

    Guo, Liang; Dai, Junhu; Ranjitkar, Sailesh; Yu, Haiying; Xu, Jianchu; Luedeling, Eike

    2014-08-01

    Climate change has affected the rates of chilling and heat accumulation, which are vital for flowering and production, in temperate fruit trees, but few studies have been conducted in the cold-winter climates of East Asia. To evaluate tree responses to variation in chill and heat accumulation rates, partial least squares regression was used to correlate first flowering dates of chestnut (Castanea mollissima Blume) and jujube (Zizyphus jujube Mill.) in Beijing, China, with daily chill and heat accumulation between 1963 and 2008. The Dynamic Model and the Growing Degree Hour Model were used to convert daily records of minimum and maximum temperature into horticulturally meaningful metrics. Regression analyses identified the chilling and forcing periods for chestnut and jujube. The forcing periods started when half the chilling requirements were fulfilled. Over the past 50 years, heat accumulation during tree dormancy increased significantly, while chill accumulation remained relatively stable for both species. Heat accumulation was the main driver of bloom timing, with effects of variation in chill accumulation negligible in Beijing’s cold-winter climate. It does not seem likely that reductions in chill will have a major effect on the studied species in Beijing in the near future. Such problems are much more likely for trees grown in locations that are substantially warmer than their native habitats, such as temperate species in the subtropics and tropics. PMID:23958788

  11. Chilling and heat requirements for flowering in temperate fruit trees.

    PubMed

    Guo, Liang; Dai, Junhu; Ranjitkar, Sailesh; Yu, Haiying; Xu, Jianchu; Luedeling, Eike

    2014-08-01

    Climate change has affected the rates of chilling and heat accumulation, which are vital for flowering and production, in temperate fruit trees, but few studies have been conducted in the cold-winter climates of East Asia. To evaluate tree responses to variation in chill and heat accumulation rates, partial least squares regression was used to correlate first flowering dates of chestnut (Castanea mollissima Blume) and jujube (Zizyphus jujube Mill.) in Beijing, China, with daily chill and heat accumulation between 1963 and 2008. The Dynamic Model and the Growing Degree Hour Model were used to convert daily records of minimum and maximum temperature into horticulturally meaningful metrics. Regression analyses identified the chilling and forcing periods for chestnut and jujube. The forcing periods started when half the chilling requirements were fulfilled. Over the past 50 years, heat accumulation during tree dormancy increased significantly, while chill accumulation remained relatively stable for both species. Heat accumulation was the main driver of bloom timing, with effects of variation in chill accumulation negligible in Beijing’s cold-winter climate. It does not seem likely that reductions in chill will have a major effect on the studied species in Beijing in the near future. Such problems are much more likely for trees grown in locations that are substantially warmer than their native habitats, such as temperate species in the subtropics and tropics.

  12. Regression modeling of ground-water flow

    USGS Publications Warehouse

    Cooley, R.L.; Naff, R.L.

    1985-01-01

    Nonlinear multiple regression methods are developed to model and analyze groundwater flow systems. Complete descriptions of regression methodology as applied to groundwater flow models allow scientists and engineers engaged in flow modeling to apply the methods to a wide range of problems. Organization of the text proceeds from an introduction that discusses the general topic of groundwater flow modeling, to a review of basic statistics necessary to properly apply regression techniques, and then to the main topic: exposition and use of linear and nonlinear regression to model groundwater flow. Statistical procedures are given to analyze and use the regression models. A number of exercises and answers are included to exercise the student on nearly all the methods that are presented for modeling and statistical analysis. Three computer programs implement the more complex methods. These three are a general two-dimensional, steady-state regression model for flow in an anisotropic, heterogeneous porous medium, a program to calculate a measure of model nonlinearity with respect to the regression parameters, and a program to analyze model errors in computed dependent variables such as hydraulic head. (USGS)

  13. Informing tree-ring reconstructions with automated dendrometer data: the case of single-leaf pinyon (Pinus monophylla) from Great Basin National Park, Nevada, USA

    NASA Astrophysics Data System (ADS)

    Biondi, F.

    2012-12-01

    One of the most pressing issues in modern tree-ring science is to reduce uncertainty of reconstructions while emphasizing that the composition and dynamics of modern ecosystems cannot be understood from the present alone. I present here the latest results from research on the environmental factors that control radial growth of single-leaf pinyon (Pinus monophylla) in the Great Basin of North America using dendrometer data collected at half-hour intervals during two full growing season, 2010 and 2011. Automated (solar-powered) sensors at the site consisted of 8 point dendrometers installed on 7 trees to measure stem size, together with environmental probes that recorded air temperature, soil temperature and soil moisture. Additional meteorological variables at hourly timesteps were available from the EPA-CASTNET station located within 100 m of the dendrometer site. Daily cycles of stem expansion and contraction were quantified using the approach of Deslauriers et al. 2011, and the amount of daily radial stem increment was regressed against environmental variables. Graphical and numerical results showed that tree growth is relatively insensitive to surface soil moisture during the growing season. This finding corroborates empirical dendroclimatic results that showed how tree-ring chronologies of single-leaf pinyon are mostly a proxy for the balance between winter-spring precipitation supply and growing season evapotranspiration demand, thereby making it an ideal species for drought reconstructions.

  14. Automatic generation of tree level helicity amplitudes

    NASA Astrophysics Data System (ADS)

    Stelzer, T.; Long, W. F.

    1994-11-01

    The program MadGraph is presented which automatically generates postscript Feynman diagrams and Fortran code to calculate arbitrary tree level helicity amplitudes by calling HELAS[1] subroutines. The program is written in Fortran and is available in Unix and VMS versions. MadGraph currently includes standard model interactions of QCD and QFD, but is easily modified to include additional models such as supersymmetry.

  15. Automatic generation of tree level helicity amplitudes

    NASA Astrophysics Data System (ADS)

    Stelzer, T.; Long, W. F.

    1994-07-01

    The program MadGraph is presented which automatically generates postscript Feynman diagrams and Fortran code to calculate arbitrary tree level helicity amplitudes by calling HELAS[1] subroutines. The program is written in Fortran and is available in Unix and VMS versions. MadGraph currently includes standard model interactions of QCD and QFD, but is easily modified to include additional models such as supersymmetry.

  16. A comparison of regression and regression-kriging for soil characterization using remote sensing imagery

    Technology Transfer Automated Retrieval System (TEKTRAN)

    In precision agriculture regression has been used widely to quality the relationship between soil attributes and other environmental variables. However, spatial correlation existing in soil samples usually makes the regression model suboptimal. In this study, a regression-kriging method was attemp...

  17. Generalized linear and generalized additive models in studies of species distributions: Setting the scene

    USGS Publications Warehouse

    Guisan, A.; Edwards, T.C.; Hastie, T.

    2002-01-01

    An important statistical development of the last 30 years has been the advance in regression analysis provided by generalized linear models (GLMs) and generalized additive models (GAMs). Here we introduce a series of papers prepared within the framework of an international workshop entitled: Advances in GLMs/GAMs modeling: from species distribution to environmental management, held in Riederalp, Switzerland, 6-11 August 2001. We first discuss some general uses of statistical models in ecology, as well as provide a short review of several key examples of the use of GLMs and GAMs in ecological modeling efforts. We next present an overview of GLMs and GAMs, and discuss some of their related statistics used for predictor selection, model diagnostics, and evaluation. Included is a discussion of several new approaches applicable to GLMs and GAMs, such as ridge regression, an alternative to stepwise selection of predictors, and methods for the identification of interactions by a combined use of regression trees and several other approaches. We close with an overview of the papers and how we feel they advance our understanding of their application to ecological modeling. ?? 2002 Elsevier Science B.V. All rights reserved.

  18. Save a Tree

    NASA Astrophysics Data System (ADS)

    Williams, Kathryn R.

    1999-10-01

    Starting in September 1925, JCE reproduced pictures of famous chemists or chemistry-related works of art as frontispieces. Often, the Journal included a biography or other article about the picture. The August 1945 frontispiece featured the largest cork oak in the United States. An accompanying article described the goals of the Cork Project to plant cork trees in suitable locations in the U.S., to compensate for uncertain European and African sources during World War II. The final frontispiece appeared in December 1956. To view supplementary material, please refer to JCE Online's supplementary links.

  19. Regression of altitude-produced cardiac hypertrophy.

    NASA Technical Reports Server (NTRS)

    Sizemore, D. A.; Mcintyre, T. W.; Van Liere, E. J.; Wilson , M. F.

    1973-01-01

    The rate of regression of cardiac hypertrophy with time has been determined in adult male albino rats. The hypertrophy was induced by intermittent exposure to simulated high altitude. The percentage hypertrophy was much greater (46%) in the right ventricle than in the left (16%). The regression could be adequately fitted to a single exponential function with a half-time of 6.73 plus or minus 0.71 days (90% CI). There was no significant difference in the rates of regression for the two ventricles.

  20. Elevation-dependent responses of tree mast seeding to climate change over 45 years

    PubMed Central

    Allen, Robert B; Hurst, Jennifer M; Portier, Jeanne; Richardson, Sarah J

    2014-01-01

    We use seed count data from a New Zealand mono-specific mountain beech forest to test for decadal trends in seed production along an elevation gradient in relation to changes in climate. Seedfall was collected (1965 to 2009) from seed trays located on transect lines at fixed elevations along an elevation gradient (1020 to 1370 m). We counted the number of seeds in the catch of each tray, for each year, and determined the number of viable seeds. Climate variables were obtained from a nearby (<2 km) climate station (914-m elevation). Variables were the sum or mean of daily measurements, using periods within each year known to correlate with subsequent interannual variation in seed production. To determine trends in mean seed production, at each elevation, and climate variables, we used generalized least squares (GLS) regression. We demonstrate a trend of increasing total and viable seed production, particularly at higher elevations, which emerged from marked interannual variation. Significant changes in four seasonal climate variables had GLS regression coefficients consistent with predictions of increased seed production. These variables subsumed the effect of year in GLS regressions with a greater influence on seed production with increasing elevation. Regression models enforce a view that the sequence of climate variables was additive in their influence on seed production throughout a reproductive cycle spanning more than 2 years and including three summers. Models with the most support always included summer precipitation as the earliest variable in the sequence followed by summer maximum daily temperatures. We interpret this as reflecting precipitation driven increases in soil nutrient availability enhancing seed production at higher elevations rather than the direct effects of climate, stand development or rising atmospheric CO2 partial pressures. Greater sensitivity of tree seeding at higher elevations to changes in climate reveals how ecosystem responses to

  1. Using Evidence-Based Decision Trees Instead of Formulas to Identify At-Risk Readers. REL 2014-036

    ERIC Educational Resources Information Center

    Koon, Sharon; Petscher, Yaacov; Foorman, Barbara R.

    2014-01-01

    This study examines whether the classification and regression tree (CART) model improves the early identification of students at risk for reading comprehension difficulties compared with the more difficult to interpret logistic regression model. CART is a type of predictive modeling that relies on nonparametric techniques. It presents results in…

  2. Barking up the Right Tree

    ERIC Educational Resources Information Center

    Houston, Paul D.

    2006-01-01

    There is a childhood saying about a confused dog who thinks he sees a possum in a tree. The problem is that the possum is actually in a different tree so the dog barks up the wrong tree. American education is constantly playing both dog and possum. Sometimes they are the prey, and sometimes they are just confused about what and where the prey is.…

  3. Automatic localization of bifurcations and vessel crossings in digital fundus photographs using location regression

    NASA Astrophysics Data System (ADS)

    Niemeijer, Meindert; Dumitrescu, Alina V.; van Ginneken, Bram; Abrámoff, Michael D.

    2011-03-01

    Parameters extracted from the vasculature on the retina are correlated with various conditions such as diabetic retinopathy and cardiovascular diseases such as stroke. Segmentation of the vasculature on the retina has been a topic that has received much attention in the literature over the past decade. Analysis of the segmentation result, however, has only received limited attention with most works describing methods to accurately measure the width of the vessels. Analyzing the connectedness of the vascular network is an important step towards the characterization of the complete vascular tree. The retinal vascular tree, from an image interpretation point of view, originates at the optic disc and spreads out over the retina. The tree bifurcates and the vessels also cross each other. The points where this happens form the key to determining the connectedness of the complete tree. We present a supervised method to detect the bifurcations and crossing points of the vasculature of the retina. The method uses features extracted from the vasculature as well as the image in a location regression approach to find those locations of the segmented vascular tree where the bifurcation or crossing occurs (from here, POI, points of interest). We evaluate the method on the publicly available DRIVE database in which an ophthalmologist has marked the POI.

  4. Error Tree: A Tree Structure for Hamming and Edit Distances and Wildcards Matching.

    PubMed

    Al-Okaily, Anas

    2015-12-01

    Approximate pattern matching is a fundamental problem in the bioinformatics and information retrieval applications. The problem involves different matching relations such as Hamming distance, edit distances, and the wildcards matching problem. The input is usually a text of length n over a fixed alphabet of length Σ, a pattern of length m, and an integer k. The output is to find all positions that have ≤ k Hamming distance, edit distance, or wildcards matching with P. Many algorithms and indexes have been proposed to solve the problems more efficiently, but due to the space and time complexities of the problems, most tools adopted heuristics approaches based on, for instance, suffix tree, suffix array, or Burrows Wheeler Transform to reach practical implementations. Error Tree is a novel tree structure that is mainly oriented to solve the approximate pattern matching problems, using less space and faster computation time. The algorithm proposes for Hamming distance and wildcards matching a tree structure that needs [Formula: see text] words and takes [Formula: see text] in the average case) of query time for any online/offline pattern, where occ is the number of outputs. In addition, a tree structure of [Formula: see text] words and [Formula: see text] in the average case) query time for edit distance for any online/offline pattern. PMID:26402070

  5. Tree shrew database (TreeshrewDB): a genomic knowledge base for the Chinese tree shrew.

    PubMed

    Fan, Yu; Yu, Dandan; Yao, Yong-Gang

    2014-11-21

    The tree shrew (Tupaia belangeri) is a small mammal with a close relationship to primates and it has been proposed as an alternative experimental animal to primates in biomedical research. The recent release of a high-quality Chinese tree shrew genome enables more researchers to use this species as the model animal in their studies. With the aim to making the access to an extensively annotated genome database straightforward and easy, we have created the Tree shrew Database (TreeshrewDB). This is a web-based platform that integrates the currently available data from the tree shrew genome, including an updated gene set, with a systematic functional annotation and a mRNA expression pattern. In addition, to assist with automatic gene sequence analysis, we have integrated the common programs Blast, Muscle, GBrowse, GeneWise and codeml, into TreeshrewDB. We have also developed a pipeline for the analysis of positive selection. The user-friendly interface of TreeshrewDB, which is available at http://www.treeshrewdb.org, will undoubtedly help in many areas of biological research into the tree shrew.

  6. Tree allometry, leaf size and adult tree size in old-growth forests of western Oregon.

    PubMed

    King, D A

    1991-10-01

    Relationships between tree height and crown dimensions and trunk diameter were determined for shade-tolerant species of old-growth forests of western Oregon. The study included both understory and overstory species, deciduous and evergreen angiosperms and evergreen conifers. A comparison of adult understory species with sapling overstory species of similar height showed greater crown width and trunk diameter in the former, whether the comparison is made among conifers or deciduous trees. Conifer saplings had wider crowns than deciduous saplings, but the crown widths of the two groups converged with increase in tree height. Conifer saplings had thicker trunks than deciduous saplings of similar crown width, possibly because of selection for resistance to stem bending under snow loads. The results suggest that understory species have morphologies that increase light interception and persistence in the understory, whereas overstory species allocate their biomass for efficient height growth, thereby attaining the high-light environment of the canopy. The greater crown widths and the additional strength requirements imposed by snow loads on conifer saplings result in less height growth per biomass increment in conifer saplings than in deciduous saplings. However, the convergence in crown width of the two groups at heights greater than 20 m, and the proportionately smaller effect of snow loads on large trees, may result in older conifers equalling or surpassing deciduous trees in biomass allocation to height growth.

  7. REGRESSION ESTIMATES FOR TOPOLOGICAL-HYDROGRAPH INPUT.

    USGS Publications Warehouse

    Karlinger, Michael R.; Guertin, D. Phillip; Troutman, Brent M.

    1988-01-01

    Physiographic, hydrologic, and rainfall data from 18 small drainage basins in semiarid, central Wyoming were used to calibrate topological, unit-hydrograph models for celerity, the average rate of travel of a flood wave through the basin. The data set consisted of basin characteristics and hydrologic data for the 18 basins and rainfall data for 68 storms. Calibrated values of celerity and peak discharges subsequently were regressed as a function of the basin characteristics and excess rainfall volume. Predicted values obtained in this way can be used as input for estimating hydrographs in ungaged basins. The regression models included ordinary least-squares and seemingly unrelated regression. This latter regression model jointly estimated the celerity and peak discharge.

  8. TWSVR: Regression via Twin Support Vector Machine.

    PubMed

    Khemchandani, Reshma; Goyal, Keshav; Chandra, Suresh

    2016-02-01

    Taking motivation from Twin Support Vector Machine (TWSVM) formulation, Peng (2010) attempted to propose Twin Support Vector Regression (TSVR) where the regressor is obtained via solving a pair of quadratic programming problems (QPPs). In this paper we argue that TSVR formulation is not in the true spirit of TWSVM. Further, taking motivation from Bi and Bennett (2003), we propose an alternative approach to find a formulation for Twin Support Vector Regression (TWSVR) which is in the true spirit of TWSVM. We show that our proposed TWSVR can be derived from TWSVM for an appropriately constructed classification problem. To check the efficacy of our proposed TWSVR we compare its performance with TSVR and classical Support Vector Regression(SVR) on various regression datasets.

  9. TWSVR: Regression via Twin Support Vector Machine.

    PubMed

    Khemchandani, Reshma; Goyal, Keshav; Chandra, Suresh

    2016-02-01

    Taking motivation from Twin Support Vector Machine (TWSVM) formulation, Peng (2010) attempted to propose Twin Support Vector Regression (TSVR) where the regressor is obtained via solving a pair of quadratic programming problems (QPPs). In this paper we argue that TSVR formulation is not in the true spirit of TWSVM. Further, taking motivation from Bi and Bennett (2003), we propose an alternative approach to find a formulation for Twin Support Vector Regression (TWSVR) which is in the true spirit of TWSVM. We show that our proposed TWSVR can be derived from TWSVM for an appropriately constructed classification problem. To check the efficacy of our proposed TWSVR we compare its performance with TSVR and classical Support Vector Regression(SVR) on various regression datasets. PMID:26624223

  10. Some Simple Computational Formulas for Multiple Regression

    ERIC Educational Resources Information Center

    Aiken, Lewis R., Jr.

    1974-01-01

    Short-cut formulas are presented for direct computation of the beta weights, the standard errors of the beta weights, and the multiple correlation coefficient for multiple regression problems involving three independent variables and one dependent variable. (Author)

  11. Trends and Tipping Points of Drought-induced Tree Mortality

    NASA Astrophysics Data System (ADS)

    Huang, K.; Yi, C.; Wu, D.; Zhou, T.; Zhao, X.; Blanford, W. J.; Wei, S.; Wu, H.; Du, L.

    2014-12-01

    Drought-induced tree mortality worldwide has been recently reported in a review of the literature by Allen et al. (2010). However, a quantitative relationship between widespread loss of forest from mortality and drought is still a key knowledge gap. Specifically, the field lacks quantitative knowledge of tipping point in trees when coping with water stress, which inhibits the assessments of how climate change affects the forest ecosystem. We investigate the statistical relationships for different (seven) conifer species between Ring Width Index (RWI) and Standardized Precipitation Evapotranspiration Index (SPEI), based on 411 chronologies from the International Tree-Ring Data Bank across 11 states of the western United States. We found robust species-specific relationships between RWI and SPEI for all seven conifer species at dry condition. The regression models show that the RWI decreases with SPEI decreasing (drying) and more than 76% variation of tree growth (RWI) can be explained by the drought index (SPEI). However, when soil water is sufficient (i.e., SPEI>SPEIu), soil water is no longer a restrictive factor for tree growth and, therefore, the RWI shows a weak correlation with SPEI. Based on the statistical models, we derived the tipping point of SPEI (SPEItp) where the RWI equals 0, which means the carbon efflux by tree respiration equals carbon influx by tree photosynthesis. When the severity of drought exceeds this tipping point(i.e. SPEItrees might not be able to sustain their lives as the carbon assimilated by photosynthesis could not suffice the lowest need of trees maintain respiration. The ranges of the tipping points for seven species-specific trees vary between -2.45 and -1.40. The lower value of a tipping point represents the stronger ability to endure drought. The predicted tipping points can be used as reference of tree mortality for assessment of forest mortality risk under climate change.This work was supported by the Fund for

  12. Distributed Merge Trees

    SciTech Connect

    Morozov, Dmitriy; Weber, Gunther

    2013-01-08

    Improved simulations and sensors are producing datasets whose increasing complexity exhausts our ability to visualize and comprehend them directly. To cope with this problem, we can detect and extract significant features in the data and use them as the basis for subsequent analysis. Topological methods are valuable in this context because they provide robust and general feature definitions. As the growth of serial computational power has stalled, data analysis is becoming increasingly dependent on massively parallel machines. To satisfy the computational demand created by complex datasets, algorithms need to effectively utilize these computer architectures. The main strength of topological methods, their emphasis on global information, turns into an obstacle during parallelization. We present two approaches to alleviate this problem. We develop a distributed representation of the merge tree that avoids computing the global tree on a single processor and lets us parallelize subsequent queries. To account for the increasing number of cores per processor, we develop a new data structure that lets us take advantage of multiple shared-memory cores to parallelize the work on a single node. Finally, we present experiments that illustrate the strengths of our approach as well as help identify future challenges.

  13. Human decision error (HUMDEE) trees

    SciTech Connect

    Ostrom, L.T.

    1993-08-01

    Graphical presentations of human actions in incident and accident sequences have been used for many years. However, for the most part, human decision making has been underrepresented in these trees. This paper presents a method of incorporating the human decision process into graphical presentations of incident/accident sequences. This presentation is in the form of logic trees. These trees are called Human Decision Error Trees or HUMDEE for short. The primary benefit of HUMDEE trees is that they graphically illustrate what else the individuals involved in the event could have done to prevent either the initiation or continuation of the event. HUMDEE trees also present the alternate paths available at the operator decision points in the incident/accident sequence. This is different from the Technique for Human Error Rate Prediction (THERP) event trees. There are many uses of these trees. They can be used for incident/accident investigations to show what other courses of actions were available and for training operators. The trees also have a consequence component so that not only the decision can be explored, also the consequence of that decision.

  14. The Geometry of Enhancement in Multiple Regression

    ERIC Educational Resources Information Center

    Waller, Niels G.

    2011-01-01

    In linear multiple regression, "enhancement" is said to occur when R[superscript 2] = b[prime]r greater than r[prime]r, where b is a p x 1 vector of standardized regression coefficients and r is a p x 1 vector of correlations between a criterion y and a set of standardized regressors, x. When p = 1 then b [is congruent to] r and enhancement cannot…

  15. There is No Quantum Regression Theorem

    SciTech Connect

    Ford, G.W.; OConnell, R.F.

    1996-07-01

    The Onsager regression hypothesis states that the regression of fluctuations is governed by macroscopic equations describing the approach to equilibrium. It is here asserted that this hypothesis fails in the quantum case. This is shown first by explicit calculation for the example of quantum Brownian motion of an oscillator and then in general from the fluctuation-dissipation theorem. It is asserted that the correct generalization of the Onsager hypothesis is the fluctuation-dissipation theorem. {copyright} {ital 1996 The American Physical Society.}

  16. Distributed game-tree searching

    SciTech Connect

    Schaeffer, J. )

    1989-02-01

    Conventional parallelizations of the alpha-beta ({alpha}{beta}) algorithm have met with limited success. Implementations suffer primarily from the synchronization and search overheads of parallelization. This paper describes a parallel {alpha}{beta} searching program that achieves high performance through the use of four different types of processes: Controllers, Searchers, Table Managers, and Scouts. Synchronization is reduced by having Controller process reassigning idle processes to help out busy ones. Search overhead is reduced by having two types of parallel table management: global Table Managers and the periodic merging and redistribution of local tables. Experiments show that nine processors can achieve 5.67-fold speedups but beyond that, additional processors provide diminishing returns. Given that additional resources are of little benefit, speculative computing is introduced as a means of extending the effective number of processors that can be utilized. Scout processes speculatively search ahead in the tree looking for interesting features and communicate this information back to the {alpha}{beta} program. In this way, the effective search depth is extended. These ideas have been tested experimentally and empirically as part of the chess program ParaPhoenix.

  17. USE OF LETHALITY DATA DURING CATEGORICAL REGRESSION MODELING OF ACUTE REFERENCE EXPOSURES

    EPA Science Inventory

    Categorical regression is being considered by the U.S. EPA as an additional tool for derivation of acute reference exposures (AREs) to be used for human health risk assessment for exposure to inhaled chemicals. Categorical regression is used to calculate probability-response fun...

  18. Dieback and episodic mortality of Cercidium microphyllum (foothill paloverde), a dominant Sonoran Desert tree

    USGS Publications Warehouse

    Bowers, Janice E.; Turner, R.M.

    2001-01-01

    Past and current dieback of Cercidium microphyllum, a dominant, drought-deciduous tree in the Sonoran Desert, was investigated at Tumamoc Hill, Tucson, Arizona, USA. Logistic regression predicted that the odds of a Cercidium plant being alive should decrease with increasing circumference, association with the columnar cactus Carnegiea gigantea, and occurrence on steep slopes. Slope azimuth, parasitization by Phoradendron californicum, and distance to nearest Cercidium within 5 m did not significantly affect the odds of survival. Carnegiea was a source of background mortality rather than a primary cause of dieback. Of the >1,000 living and dead plants sampled, 7.7% had died within the past 5 to 7 years. An additional 12.8% died in the more distant past. Diebacks tended to occur during severe deficits in annual, especially summer, rain. More than half of the dead plants in the sample were ???50 cm in girth. In current and past diebacks on Tumamoc Hill, it seems likely that severe drought interacted with natural senescence of an aging population, weakening large, old trees and hastening their deaths.

  19. Elevation, Not Deforestation, Promotes Genetic Differentiation in a Pioneer Tropical Tree

    PubMed Central

    Castilla, Antonio R.; Pope, Nathaniel; Jaffé, Rodolfo; Jha, Shalene

    2016-01-01

    The regeneration of disturbed forest is an essential part of tropical forest ecology, both with respect to natural disturbance regimes and large-scale human-mediated logging, grazing, and agriculture. Pioneer tree species are critical for facilitating the transition from deforested land to secondary forest because they stabilize terrain and enhance connectivity between forest fragments by increasing matrix permeability and initiating disperser community assembly. Despite the ecological importance of early successional species, little is known about their ability to maintain gene flow across deforested landscapes. Utilizing highly polymorphic microsatellite markers, we examined patterns of genetic diversity and differentiation for the pioneer understory tree Miconia affinis across the Isthmus of Panama. Furthermore, we investigated the impact of geographic distance, forest cover, and elevation on genetic differentiation among populations using circuit theory and regression modeling within a landscape genetics framework. We report marked differences in historical and contemporary migration rates and moderately high levels of genetic differentiation in M. affinis populations across the Isthmus of Panama. Genetic differentiation increased significantly with elevation and geographic distance among populations; however, we did not find that forest cover enhanced or reduced genetic differentiation in the study region. Overall, our results reveal strong dispersal for M. affinis across human-altered landscapes, highlighting the potential use of this species for reforestation in tropical regions. Additionally, this study demonstrates the importance of considering topography when designing programs aimed at conserving genetic diversity within degraded tropical landscapes. PMID:27280872

  20. Elevation, Not Deforestation, Promotes Genetic Differentiation in a Pioneer Tropical Tree.

    PubMed

    Castilla, Antonio R; Pope, Nathaniel; Jaffé, Rodolfo; Jha, Shalene

    2016-01-01

    The regeneration of disturbed forest is an essential part of tropical forest ecology, both with respect to natural disturbance regimes and large-scale human-mediated logging, grazing, and agriculture. Pioneer tree species are critical for facilitating the transition from deforested land to secondary forest because they stabilize terrain and enhance connectivity between forest fragments by increasing matrix permeability and initiating disperser community assembly. Despite the ecological importance of early successional species, little is known about their ability to maintain gene flow across deforested landscapes. Utilizing highly polymorphic microsatellite markers, we examined patterns of genetic diversity and differentiation for the pioneer understory tree Miconia affinis across the Isthmus of Panama. Furthermore, we investigated the impact of geographic distance, forest cover, and elevation on genetic differentiation among populations using circuit theory and regression modeling within a landscape genetics framework. We report marked differences in historical and contemporary migration rates and moderately high levels of genetic differentiation in M. affinis populations across the Isthmus of Panama. Genetic differentiation increased significantly with elevation and geographic distance among populations; however, we did not find that forest cover enhanced or reduced genetic differentiation in the study region. Overall, our results reveal strong dispersal for M. affinis across human-altered landscapes, highlighting the potential use of this species for reforestation in tropical regions. Additionally, this study demonstrates the importance of considering topography when designing programs aimed at conserving genetic diversity within degraded tropical landscapes. PMID:27280872

  1. Synthesizing regression results: a factored likelihood method.

    PubMed

    Wu, Meng-Jia; Becker, Betsy Jane

    2013-06-01

    Regression methods are widely used by researchers in many fields, yet methods for synthesizing regression results are scarce. This study proposes using a factored likelihood method, originally developed to handle missing data, to appropriately synthesize regression models involving different predictors. This method uses the correlations reported in the regression studies to calculate synthesized standardized slopes. It uses available correlations to estimate missing ones through a series of regressions, allowing us to synthesize correlations among variables as if each included study contained all the same variables. Great accuracy and stability of this method under fixed-effects models were found through Monte Carlo simulation. An example was provided to demonstrate the steps for calculating the synthesized slopes through sweep operators. By rearranging the predictors in the included regression models or omitting a relatively small number of correlations from those models, we can easily apply the factored likelihood method to many situations involving synthesis of linear models. Limitations and other possible methods for synthesizing more complicated models are discussed. Copyright © 2012 John Wiley & Sons, Ltd. PMID:26053653

  2. Post-processing through linear regression

    NASA Astrophysics Data System (ADS)

    van Schaeybroeck, B.; Vannitsem, S.

    2011-03-01

    Various post-processing techniques are compared for both deterministic and ensemble forecasts, all based on linear regression between forecast data and observations. In order to evaluate the quality of the regression methods, three criteria are proposed, related to the effective correction of forecast error, the optimal variability of the corrected forecast and multicollinearity. The regression schemes under consideration include the ordinary least-square (OLS) method, a new time-dependent Tikhonov regularization (TDTR) method, the total least-square method, a new geometric-mean regression (GM), a recently introduced error-in-variables (EVMOS) method and, finally, a "best member" OLS method. The advantages and drawbacks of each method are clarified. These techniques are applied in the context of the 63 Lorenz system, whose model version is affected by both initial condition and model errors. For short forecast lead times, the number and choice of predictors plays an important role. Contrarily to the other techniques, GM degrades when the number of predictors increases. At intermediate lead times, linear regression is unable to provide corrections to the forecast and can sometimes degrade the performance (GM and the best member OLS with noise). At long lead times the regression schemes (EVMOS, TDTR) which yield the correct variability and the largest correlation between ensemble error and spread, should be preferred.

  3. Relating phylogenetic trees to transmission trees of infectious disease outbreaks.

    PubMed

    Ypma, Rolf J F; van Ballegooijen, W Marijn; Wallinga, Jacco

    2013-11-01

    Transmission events are the fundamental building blocks of the dynamics of any infectious disease. Much about the epidemiology of a disease can be learned when these individual transmission events are known or can be estimated. Such estimations are difficult and generally feasible only when detailed epidemiological data are available. The genealogy estimated from genetic sequences of sampled pathogens is another rich source of information on transmission history. Optimal inference of transmission events calls for the combination of genetic data and epidemiological data into one joint analysis. A key difficulty is that the transmission tree, which describes the transmission events between infected hosts, differs from the phylogenetic tree, which describes the ancestral relationships between pathogens sampled from these hosts. The trees differ both in timing of the internal nodes and in topology. These differences become more pronounced when a higher fraction of infected hosts is sampled. We show how the phylogenetic tree of sampled pathogens is related to the transmission tree of an outbreak of an infectious disease, by the within-host dynamics of pathogens. We provide a statistical framework to infer key epidemiological and mutational parameters by simultaneously estimating the phylogenetic tree and the transmission tree. We test the approach using simulations and illustrate its use on an outbreak of foot-and-mouth disease. The approach unifies existing methods in the emerging field of phylodynamics with transmission tree reconstruction methods that are used in infectious disease epidemiology.

  4. CartograTree: connecting tree genomes, phenotypes and environment.

    PubMed

    Vasquez-Gross, Hans A; Yu, John J; Figueroa, Ben; Gessler, Damian D G; Neale, David B; Wegrzyn, Jill L

    2013-05-01

    Today, researchers spend a tremendous amount of time gathering, formatting, filtering and visualizing data collected from disparate sources. Under the umbrella of forest tree biology, we seek to provide a platform and leverage modern technologies to connect biotic and abiotic data. Our goal is to provide an integrated web-based workspace that connects environmental, genomic and phenotypic data via geo-referenced coordinates. Here, we connect the genomic query web-based workspace, DiversiTree and a novel geographical interface called CartograTree to data housed on the TreeGenes database. To accomplish this goal, we implemented Simple Semantic Web Architecture and Protocol to enable the primary genomics database, TreeGenes, to communicate with semantic web services regardless of platform or back-end technologies. The novelty of CartograTree lies in the interactive workspace that allows for geographical visualization and engagement of high performance computing (HPC) resources. The application provides a unique tool set to facilitate research on the ecology, physiology and evolution of forest tree species. CartograTree can be accessed at: http://dendrome.ucdavis.edu/cartogratree.

  5. Leukemia regression by vascular disruption and antiangiogenic therapy

    PubMed Central

    Madlambayan, Gerard J.; Meacham, Amy M.; Hosaka, Koji; Mir, Saad; Jorgensen, Marda; Scott, Edward W.; Siemann, Dietmar W.

    2010-01-01

    Acute myelogenous leukemias (AMLs) and endothelial cells depend on each other for survival and proliferation. Monotherapy antivascular strategies such as targeting vascular endothelial growth factor (VEGF) has limited efficacy in treating AML. Thus, in search of a multitarget antivascular treatment strategy for AML, we tested a novel vascular disrupting agent, OXi4503, alone and in combination with the anti-VEGF antibody, bevacizumab. Using xenotransplant animal models, OXi4503 treatment of human AML chloromas led to vascular disruption in leukemia cores that displayed increased leukemia cell apoptosis. However, viable rims of leukemia cells remained and were richly vascular with increased VEGF-A expression. To target this peripheral reactive angiogenesis, bevacizumab was combined with OXi4503 and abrogated viable vascular rims, thereby leading to enhanced leukemia regression. In a systemic model of primary human AML, OXi4503 regressed leukemia engraftment alone and in combination with bevacizumab. Differences in blood vessel density alone could not account for the observed regression, suggesting that OXi4503 also exhibited direct cytotoxic effects on leukemia cells. In vitro analyses confirmed this targeted effect, which was mediated by the production of reactive oxygen species and resulted in apoptosis. Together, these data show that OXi4503 alone is capable of regressing AML by a multitargeted mechanism and that the addition of bevacizumab mitigates reactive angiogenesis. PMID:20472832

  6. Analysis of Sting Balance Calibration Data Using Optimized Regression Models

    NASA Technical Reports Server (NTRS)

    Ulbrich, Norbert; Bader, Jon B.

    2009-01-01

    Calibration data of a wind tunnel sting balance was processed using a search algorithm that identifies an optimized regression model for the data analysis. The selected sting balance had two moment gages that were mounted forward and aft of the balance moment center. The difference and the sum of the two gage outputs were fitted in the least squares sense using the normal force and the pitching moment at the balance moment center as independent variables. The regression model search algorithm predicted that the difference of the gage outputs should be modeled using the intercept and the normal force. The sum of the two gage outputs, on the other hand, should be modeled using the intercept, the pitching moment, and the square of the pitching moment. Equations of the deflection of a cantilever beam are used to show that the search algorithm s two recommended math models can also be obtained after performing a rigorous theoretical analysis of the deflection of the sting balance under load. The analysis of the sting balance calibration data set is a rare example of a situation when regression models of balance calibration data can directly be derived from first principles of physics and engineering. In addition, it is interesting to see that the search algorithm recommended the same regression models for the data analysis using only a set of statistical quality metrics.

  7. Supercooling Capacity Increases from Sea Level to Tree Line in the Hawaiian Tree Species Metrosideros polymorpha.

    PubMed

    Melcher; Cordell; Jones; Scowcroft; Niemczura; Giambelluca; Goldstein

    2000-05-01

    Population-specific differences in the freezing resistance of Metrosideros polymorpha leaves were studied along an elevational gradient from sea level to tree line (located at ca. 2500 m above sea level) on the east flank of the Mauna Loa volcano in Hawaii. In addition, we also studied 8-yr-old saplings grown in a common garden from seeds collected from the same field populations. Leaves of low-elevation field plants exhibited damage at -2 degrees C, before the onset of ice formation, which occurred at -5.7 degrees C. Leaves of high-elevation plants exhibited damage at ca. -8.5 degrees C, concurrent with ice formation in the leaf tissue, which is typical of plants that avoid freezing in their natural environment by supercooling. Nuclear magnetic resonance studies revealed that water molecules of both extra- and intracellular leaf water fractions from high-elevation plants had restricted mobility, which is consistent with their low water content and their high levels of osmotically active solutes. Decreased mobility of water molecules may delay ice nucleation and/or ice growth and may therefore enhance the ability of plant tissues to supercool. Leaf traits that correlated with specific differences in supercooling capacity were in part genetically determined and in part environmentally induced. Evidence indicated that lower apoplastic water content and smaller intercellular spaces were associated with the larger supercooling capacity of the plant's foliage at tree line. The irreversible tissue-damage temperature decreased by ca. 7 degrees C from sea level to tree line in leaves of field populations. However, this decrease appears to be only large enough to allow M. polymorpha trees to avoid leaf tissue damage from freezing up to a level of ca. 2500 m elevation, which is also the current tree line location on the east flank of Mauna Loa. The limited freezing resistance of M. polymorpha leaves may be partially responsible for the occurrence of tree line at a relatively

  8. Supercooling Capacity Increases from Sea Level to Tree Line in the Hawaiian Tree Species Metrosideros polymorpha.

    PubMed

    Melcher; Cordell; Jones; Scowcroft; Niemczura; Giambelluca; Goldstein

    2000-05-01

    Population-specific differences in the freezing resistance of Metrosideros polymorpha leaves were studied along an elevational gradient from sea level to tree line (located at ca. 2500 m above sea level) on the east flank of the Mauna Loa volcano in Hawaii. In addition, we also studied 8-yr-old saplings grown in a common garden from seeds collected from the same field populations. Leaves of low-elevation field plants exhibited damage at -2 degrees C, before the onset of ice formation, which occurred at -5.7 degrees C. Leaves of high-elevation plants exhibited damage at ca. -8.5 degrees C, concurrent with ice formation in the leaf tissue, which is typical of plants that avoid freezing in their natural environment by supercooling. Nuclear magnetic resonance studies revealed that water molecules of both extra- and intracellular leaf water fractions from high-elevation plants had restricted mobility, which is consistent with their low water content and their high levels of osmotically active solutes. Decreased mobility of water molecules may delay ice nucleation and/or ice growth and may therefore enhance the ability of plant tissues to supercool. Leaf traits that correlated with specific differences in supercooling capacity were in part genetically determined and in part environmentally induced. Evidence indicated that lower apoplastic water content and smaller intercellular spaces were associated with the larger supercooling capacity of the plant's foliage at tree line. The irreversible tissue-damage temperature decreased by ca. 7 degrees C from sea level to tree line in leaves of field populations. However, this decrease appears to be only large enough to allow M. polymorpha trees to avoid leaf tissue damage from freezing up to a level of ca. 2500 m elevation, which is also the current tree line location on the east flank of Mauna Loa. The limited freezing resistance of M. polymorpha leaves may be partially responsible for the occurrence of tree line at a relatively

  9. Rubbery Polya Tree

    PubMed Central

    NIETO-BARAJAS, LUIS E.; MÜLLER, PETER

    2013-01-01

    Polya trees (PT) are random probability measures which can assign probability 1 to the set of continuous distributions for certain specifications of the hyperparameters. This feature distinguishes the PT from the popular Dirichlet process (DP) model which assigns probability 1 to the set of discrete distributions. However, the PT is not nearly as widely used as the DP prior. Probably the main reason is an awkward dependence of posterior inference on the choice of the partitioning subsets in the definition of the PT. We propose a generalization of the PT prior that mitigates this undesirable dependence on the partition structure, by allowing the branching probabilities to be dependent within the same level. The proposed new process is not a PT anymore. However, it is still a tail-free process and many of the prior properties remain the same as those for the PT. PMID:24368872

  10. Predicting species' range limits from functional traits for the tree flora of North America.

    PubMed

    Stahl, Ulrike; Reu, Björn; Wirth, Christian

    2014-09-23

    Using functional traits to explain species' range limits is a promising approach in functional biogeography. It replaces the idiosyncrasy of species-specific climate ranges with a generic trait-based predictive framework. In addition, it has the potential to shed light on specific filter mechanisms creating large-scale vegetation patterns. However, its application to a continental flora, spanning large climate gradients, has been hampered by a lack of trait data. Here, we explore whether five key plant functional traits (seed mass, wood density, specific leaf area (SLA), maximum height, and longevity of a tree)--indicative of life history, mechanical, and physiological adaptations--explain the climate ranges of 250 North American tree species distributed from the boreal to the subtropics. Although the relationship between traits and the median climate across a species range is weak, quantile regressions revealed strong effects on range limits. Wood density and seed mass were strongly related to the lower but not upper temperature range limits of species. Maximum height affects the species range limits in both dry and humid climates, whereas SLA and longevity do not show clear relationships. These results allow the definition and delineation of climatic "no-go areas" for North American tree species based on key traits. As some of these key traits serve as important parameters in recent vegetation models, the implementation of trait-based climatic constraints has the potential to predict both range shifts and ecosystem consequences on a more functional basis. Moreover, for future trait-based vegetation models our results provide a benchmark for model evaluation. PMID:25225398

  11. Demographic Predictors of Peanut, Tree Nut, Fish, Shellfish, and Sesame Allergy in Canada

    PubMed Central

    Ben-Shoshan, M.; Harrington, D. W.; Soller, L.; Fragapane, J.; Joseph, L.; Pierre, Y. St.; Godefroy, S. B.; Elliott, S. J.; Clarke, A. E.

    2012-01-01

    Background. Studies suggest that the rising prevalence of food allergy during recent decades may have stabilized. Although genetics undoubtedly contribute to the emergence of food allergy, it is likely that other factors play a crucial role in mediating such short-term changes. Objective. To identify potential demographic predictors of food allergies. Methods. We performed a cross-Canada, random telephone survey. Criteria for food allergy were self-report of convincing symptoms and/or physician diagnosis of allergy. Multivariate logistic regressions were used to assess potential determinants. Results. Of 10,596 households surveyed in 2008/2009, 3666 responded, representing 9667 individuals. Peanut, tree nut, and sesame allergy were more common in children (odds ratio (OR) 2.24 (95% CI, 1.40, 3.59), 1.73 (95% CI, 1.11, 2.68), and 5.63 (95% CI, 1.39, 22.87), resp.) while fish and shellfish allergy were less common in children (OR 0.17 (95% CI, 0.04, 0.72) and 0.29 (95% CI, 0.14, 0.61)). Tree nut and shellfish allergy were less common in males (OR 0.55 (95% CI, 0.36, 0.83) and 0.63 (95% CI, 0.43, 0.91)). Shellfish allergy was more common in urban settings (OR 1.55 (95% CI, 1.04, 2.31)). There was a trend for most food allergies to be more prevalent in the more educated (tree nut OR 1.90 (95% CI, 1.18, 3.04)) and less prevalent in immigrants (shellfish OR 0.49 (95% CI, 0.26, 0.95)), but wide CIs preclude definitive conclusions for most foods. Conclusions. Our results reveal that in addition to age and sex, place of residence, socioeconomic status, and birth place may influence the development of food allergy. PMID:22187574

  12. Predicting species' range limits from functional traits for the tree flora of North America.

    PubMed

    Stahl, Ulrike; Reu, Björn; Wirth, Christian

    2014-09-23

    Using functional traits to explain species' range limits is a promising approach in functional biogeography. It replaces the idiosyncrasy of species-specific climate ranges with a generic trait-based predictive framework. In addition, it has the potential to shed light on specific filter mechanisms creating large-scale vegetation patterns. However, its application to a continental flora, spanning large climate gradients, has been hampered by a lack of trait data. Here, we explore whether five key plant functional traits (seed mass, wood density, specific leaf area (SLA), maximum height, and longevity of a tree)--indicative of life history, mechanical, and physiological adaptations--explain the climate ranges of 250 North American tree species distributed from the boreal to the subtropics. Although the relationship between traits and the median climate across a species range is weak, quantile regressions revealed strong effects on range limits. Wood density and seed mass were strongly related to the lower but not upper temperature range limits of species. Maximum height affects the species range limits in both dry and humid climates, whereas SLA and longevity do not show clear relationships. These results allow the definition and delineation of climatic "no-go areas" for North American tree species based on key traits. As some of these key traits serve as important parameters in recent vegetation models, the implementation of trait-based climatic constraints has the potential to predict both range shifts and ecosystem consequences on a more functional basis. Moreover, for future trait-based vegetation models our results provide a benchmark for model evaluation.

  13. Predicting species’ range limits from functional traits for the tree flora of North America

    PubMed Central

    Stahl, Ulrike; Reu, Björn; Wirth, Christian

    2014-01-01

    Using functional traits to explain species’ range limits is a promising approach in functional biogeography. It replaces the idiosyncrasy of species-specific climate ranges with a generic trait-based predictive framework. In addition, it has the potential to shed light on specific filter mechanisms creating large-scale vegetation patterns. However, its application to a continental flora, spanning large climate gradients, has been hampered by a lack of trait data. Here, we explore whether five key plant functional traits (seed mass, wood density, specific leaf area (SLA), maximum height, and longevity of a tree)—indicative of life history, mechanical, and physiological adaptations—explain the climate ranges of 250 North American tree species distributed from the boreal to the subtropics. Although the relationship between traits and the median climate across a species range is weak, quantile regressions revealed strong effects on range limits. Wood density and seed mass were strongly related to the lower but not upper temperature range limits of species. Maximum height affects the species range limits in both dry and humid climates, whereas SLA and longevity do not show clear relationships. These results allow the definition and delineation of climatic “no-go areas” for North American tree species based on key traits. As some of these key traits serve as important parameters in recent vegetation models, the implementation of trait-based climatic constraints has the potential to predict both range shifts and ecosystem consequences on a more functional basis. Moreover, for future trait-based vegetation models our results provide a benchmark for model evaluation. PMID:25225398

  14. Demographic predictors of peanut, tree nut, fish, shellfish, and sesame allergy in Canada.

    PubMed

    Ben-Shoshan, M; Harrington, D W; Soller, L; Fragapane, J; Joseph, L; Pierre, Y St; Godefroy, S B; Elliott, S J; Clarke, A E

    2012-01-01

    Background. Studies suggest that the rising prevalence of food allergy during recent decades may have stabilized. Although genetics undoubtedly contribute to the emergence of food allergy, it is likely that other factors play a crucial role in mediating such short-term changes. Objective. To identify potential demographic predictors of food allergies. Methods. We performed a cross-Canada, random telephone survey. Criteria for food allergy were self-report of convincing symptoms and/or physician diagnosis of allergy. Multivariate logistic regressions were used to assess potential determinants. Results. Of 10,596 households surveyed in 2008/2009, 3666 responded, representing 9667 individuals. Peanut, tree nut, and sesame allergy were more common in children (odds ratio (OR) 2.24 (95% CI, 1.40, 3.59), 1.73 (95% CI, 1.11, 2.68), and 5.63 (95% CI, 1.39, 22.87), resp.) while fish and shellfish allergy were less common in children (OR 0.17 (95% CI, 0.04, 0.72) and 0.29 (95% CI, 0.14, 0.61)). Tree nut and shellfish allergy were less common in males (OR 0.55 (95% CI, 0.36, 0.83) and 0.63 (95% CI, 0.43, 0.91)). Shellfish allergy was more common in urban settings (OR 1.55 (95% CI, 1.04, 2.31)). There was a trend for most food allergies to be more prevalent in the more educated (tree nut OR 1.90 (95% CI, 1.18, 3.04)) and less prevalent in immigrants (shellfish OR 0.49 (95% CI, 0.26, 0.95)), but wide CIs preclude definitive conclusions for most foods. Conclusions. Our results reveal that in addition to age and sex, place of residence, socioeconomic status, and birth place may influence the development of food allergy.

  15. The Group Tree of Experience.

    ERIC Educational Resources Information Center

    Ping, Ki

    1994-01-01

    Describes a group activity that uses a tree as a metaphor to reflect both group and personal growth during adventure activities. The tree's roots represent the group's formation, the branches and leaves represent the group's diversity and capabilities, and the seeds represent the personal learning and growth that took place within the group.…

  16. Studying Evergreen Trees in December.

    ERIC Educational Resources Information Center

    Platt, Dorothy K.

    1991-01-01

    This lesson plan uses evergreen trees on sale in cities and villages during the Christmas season to teach identification techniques. Background information, activities, and recommended references guides deal with historical, symbolic and current uses of evergreen trees, physical characteristics, selection, care, and suggestions for post-Christmas…

  17. Tree Hydraulics: How Sap Rises

    ERIC Educational Resources Information Center

    Denny, Mark

    2012-01-01

    Trees transport water from roots to crown--a height that can exceed 100 m. The physics of tree hydraulics can be conveyed with simple fluid dynamics based upon the Hagen-Poiseuille equation and Murray's law. Here the conduit structure is modelled as conical pipes and as branching pipes. The force required to lift sap is generated mostly by…

  18. Boosted Decision Trees and Applications

    NASA Astrophysics Data System (ADS)

    Coadou, Yann

    2013-07-01

    Decision trees are a machine learning technique more and more commonly used in high energy physics, while it has been widely used in the social sciences. After introducing the concepts of decision trees, this article focuses on its application in particle physics.

  19. Using farm trees for fuelwood

    SciTech Connect

    Poulsen, G.

    1983-01-01

    In the tropics, a significant proportion of wood supplies is obtained from trees on farmland rather than from forest. Reliable estimates of wood fuel resources are difficult to obtain by conventional mensuration techniques since such trees are often subjected to regular heavy pruning and pollarding. Productive potential of hedgerows and other small scrub vegetation used for fuel is also difficult to measure.

  20. Hazard Tree Management for Camps.

    ERIC Educational Resources Information Center

    Kong, Earl

    2002-01-01

    The principles behind a camp's hazard tree program are, first, identifying and removing those hazards that offer a clear, immediate threat, and then creating a management plan for the other trees. The plan should be written and contain goals and objectives, field evaluations, and treatments. Follow-up evaluations should be done annually and after…