Sample records for additive regression trees

  1. Boosted regression tree, table, and figure data

    EPA Pesticide Factsheets

    Spreadsheets are included here to support the manuscript Boosted Regression Tree Models to Explain Watershed Nutrient Concentrations and Biological Condition. This dataset is associated with the following publication: Golden, H., C. Lane, A. Prues, and E. D'Amico. Boosted Regression Tree Models to Explain Watershed Nutrient Concentrations and Biological Condition. JAWRA. American Water Resources Association, Middleburg, VA, USA, 52(5): 1251-1274, (2016).

  2. Regression analysis using dependent Polya trees.

    PubMed

    Schörgendorfer, Angela; Branscum, Adam J

    2013-11-30

    Many commonly used models for linear regression analysis force overly simplistic shape and scale constraints on the residual structure of data. We propose a semiparametric Bayesian model for regression analysis that produces data-driven inference by using a new type of dependent Polya tree prior to model arbitrary residual distributions that are allowed to evolve across increasing levels of an ordinal covariate (e.g., time, in repeated measurement studies). By modeling residual distributions at consecutive covariate levels or time points using separate, but dependent Polya tree priors, distributional information is pooled while allowing for broad pliability to accommodate many types of changing residual distributions. We can use the proposed dependent residual structure in a wide range of regression settings, including fixed-effects and mixed-effects linear and nonlinear models for cross-sectional, prospective, and repeated measurement data. A simulation study illustrates the flexibility of our novel semiparametric regression model to accurately capture evolving residual distributions. In an application to immune development data on immunoglobulin G antibodies in children, our new model outperforms several contemporary semiparametric regression models based on a predictive model selection criterion. Copyright © 2013 John Wiley & Sons, Ltd.

  3. Aneurysmal subarachnoid hemorrhage prognostic decision-making algorithm using classification and regression tree analysis.

    PubMed

    Lo, Benjamin W Y; Fukuda, Hitoshi; Angle, Mark; Teitelbaum, Jeanne; Macdonald, R Loch; Farrokhyar, Forough; Thabane, Lehana; Levine, Mitchell A H

    2016-01-01

    Classification and regression tree analysis involves the creation of a decision tree by recursive partitioning of a dataset into more homogeneous subgroups. Thus far, there is scarce literature on using this technique to create clinical prediction tools for aneurysmal subarachnoid hemorrhage (SAH). The classification and regression tree analysis technique was applied to the multicenter Tirilazad database (3551 patients) in order to create the decision-making algorithm. In order to elucidate prognostic subgroups in aneurysmal SAH, neurologic, systemic, and demographic factors were taken into account. The dependent variable used for analysis was the dichotomized Glasgow Outcome Score at 3 months. Classification and regression tree analysis revealed seven prognostic subgroups. Neurological grade, occurrence of post-admission stroke, occurrence of post-admission fever, and age represented the explanatory nodes of this decision tree. Split sample validation revealed classification accuracy of 79% for the training dataset and 77% for the testing dataset. In addition, the occurrence of fever at 1-week post-aneurysmal SAH is associated with increased odds of post-admission stroke (odds ratio: 1.83, 95% confidence interval: 1.56-2.45, P < 0.01). A clinically useful classification tree was generated, which serves as a prediction tool to guide bedside prognostication and clinical treatment decision making. This prognostic decision-making algorithm also shed light on the complex interactions between a number of risk factors in determining outcome after aneurysmal SAH.
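
    The recursive partitioning described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' analysis: it assumes Gini impurity as the homogeneity criterion, and the six toy patients (neurological grade, age) and their outcomes are invented for the example.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly homogeneous)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_impurity) of the best binary split, or None."""
    best = None
    n = len(rows)
    for j in range(len(rows[0])):
        for t in sorted({r[j] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            if not left or not right:
                continue  # a split must put at least one case on each side
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (j, t, score)
    return best

def grow(rows, labels, depth=0, max_depth=3):
    """Recursively partition until a node is pure, unsplittable, or max_depth is reached."""
    split = best_split(rows, labels) if depth < max_depth and gini(labels) > 0 else None
    if split is None:
        return {"leaf": Counter(labels).most_common(1)[0][0]}
    j, t, _ = split
    left = [(r, y) for r, y in zip(rows, labels) if r[j] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[j] > t]
    return {
        "feature": j,
        "threshold": t,
        "left": grow([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
        "right": grow([r for r, _ in right], [y for _, y in right], depth + 1, max_depth),
    }

def predict(tree, row):
    """Follow the splits from the root down to a leaf label."""
    while "leaf" not in tree:
        tree = tree["left"] if row[tree["feature"]] <= tree["threshold"] else tree["right"]
    return tree["leaf"]

# Invented toy data: (neurological grade, age) -> dichotomized outcome.
rows = [(1, 40), (2, 55), (1, 70), (4, 45), (5, 60), (4, 72)]
labels = ["good", "good", "good", "poor", "poor", "poor"]
tree = grow(rows, labels)
```

    On this toy data the root split falls on the first feature at threshold 2, which separates the two outcome groups completely, so both children become leaves.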

  4. Scalable Regression Tree Learning on Hadoop using OpenPlanet

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yin, Wei; Simmhan, Yogesh; Prasanna, Viktor

    As scientific and engineering domains attempt to effectively analyze the deluge of data arriving from sensors and instruments, machine learning is becoming a key data mining tool to build prediction models. The regression tree is a popular learning model that combines decision trees and linear regression to forecast numerical target variables based on a set of input features. MapReduce is well suited for addressing such data-intensive learning applications, and a proprietary regression tree algorithm, PLANET, using MapReduce has been proposed earlier. In this paper, we describe an open source implementation of this algorithm, OpenPlanet, on the Hadoop framework using a hybrid approach. Further, we evaluate the performance of OpenPlanet using real-world datasets from the Smart Power Grid domain to perform energy use forecasting, and propose tuning strategies of Hadoop parameters to improve the performance of the default configuration by 75% for a training dataset of 17 million tuples on a 64-core Hadoop cluster on FutureGrid.

  5. Comparative study of biodegradability prediction of chemicals using decision trees, functional trees, and logistic regression.

    PubMed

    Chen, Guangchao; Li, Xuehua; Chen, Jingwen; Zhang, Ya-Nan; Peijnenburg, Willie J G M

    2014-12-01

    Biodegradation is the principal environmental dissipation process of chemicals. As such, it is a dominant factor determining the persistence and fate of organic chemicals in the environment, and is therefore of critical importance to chemical management and regulation. In the present study, the authors developed in silico methods assessing biodegradability based on a large heterogeneous set of 825 organic compounds, using the techniques of the C4.5 decision tree, the functional inner regression tree, and logistic regression. External validation was subsequently carried out by 2 independent test sets of 777 and 27 chemicals. As a result, the functional inner regression tree exhibited the best predictability with predictive accuracies of 81.5% and 81.0%, respectively, on the training set (825 chemicals) and test set I (777 chemicals). Performance of the developed models on the 2 test sets was subsequently compared with that of the Estimation Program Interface (EPI) Suite Biowin 5 and Biowin 6 models, which also showed a better predictability of the functional inner regression tree model. The model built in the present study exhibits a reasonable predictability compared with existing models while possessing a transparent algorithm. Interpretation of the mechanisms of biodegradation was also carried out based on the models developed. © 2014 SETAC.

  6. Dynamic travel time estimation using regression trees.

    DOT National Transportation Integrated Search

    2008-10-01

    This report presents a methodology for travel time estimation by using regression trees. The dissemination of travel time information has become crucial for effective traffic management, especially under congested road conditions. In the absence of c...

  7. The process and utility of classification and regression tree methodology in nursing research

    PubMed Central

    Kuhn, Lisa; Page, Karen; Ward, John; Worrall-Carter, Linda

    2014-01-01

    Aim: This paper presents a discussion of classification and regression tree analysis and its utility in nursing research. Background: Classification and regression tree analysis is an exploratory research method used to illustrate associations between variables not suited to traditional regression analysis. Complex interactions are demonstrated between covariates and variables of interest in inverted tree diagrams. Design: Discussion paper. Data sources: English language literature was sourced from eBooks, Medline Complete and CINAHL Plus databases, Google and Google Scholar, hard copy research texts and retrieved reference lists for terms including classification and regression tree* and derivatives and recursive partitioning from 1984–2013. Discussion: Classification and regression tree analysis is an important method used to identify previously unknown patterns amongst data. Whilst there are several reasons to embrace this method as a means of exploratory quantitative research, issues regarding quality of data as well as the usefulness and validity of the findings should be considered. Implications for Nursing Research: Classification and regression tree analysis is a valuable tool to guide nurses to reduce gaps in the application of evidence to practice. With the ever-expanding availability of data, it is important that nurses understand the utility and limitations of the research method. Conclusion: Classification and regression tree analysis is an easily interpreted method for modelling interactions between health-related variables that would otherwise remain obscured. Knowledge is presented graphically, providing insightful understanding of complex and hierarchical relationships in an accessible and useful way to nursing and other health professions. PMID:24237048

  8. The process and utility of classification and regression tree methodology in nursing research.

    PubMed

    Kuhn, Lisa; Page, Karen; Ward, John; Worrall-Carter, Linda

    2014-06-01

    This paper presents a discussion of classification and regression tree analysis and its utility in nursing research. Classification and regression tree analysis is an exploratory research method used to illustrate associations between variables not suited to traditional regression analysis. Complex interactions are demonstrated between covariates and variables of interest in inverted tree diagrams. Discussion paper. English language literature was sourced from eBooks, Medline Complete and CINAHL Plus databases, Google and Google Scholar, hard copy research texts and retrieved reference lists for terms including classification and regression tree* and derivatives and recursive partitioning from 1984-2013. Classification and regression tree analysis is an important method used to identify previously unknown patterns amongst data. Whilst there are several reasons to embrace this method as a means of exploratory quantitative research, issues regarding quality of data as well as the usefulness and validity of the findings should be considered. Classification and regression tree analysis is a valuable tool to guide nurses to reduce gaps in the application of evidence to practice. With the ever-expanding availability of data, it is important that nurses understand the utility and limitations of the research method. Classification and regression tree analysis is an easily interpreted method for modelling interactions between health-related variables that would otherwise remain obscured. Knowledge is presented graphically, providing insightful understanding of complex and hierarchical relationships in an accessible and useful way to nursing and other health professions. © 2013 The Authors. Journal of Advanced Nursing Published by John Wiley & Sons Ltd.

  9. Building interpretable predictive models for pediatric hospital readmission using Tree-Lasso logistic regression.

    PubMed

    Jovanovic, Milos; Radovanovic, Sandro; Vukicevic, Milan; Van Poucke, Sven; Delibasic, Boris

    2016-09-01

    Quantification and early identification of unplanned readmission risk have the potential to improve the quality of care during hospitalization and after discharge. However, high dimensionality, sparsity, and class imbalance of electronic health data and the complexity of risk quantification, challenge the development of accurate predictive models. Predictive models require a certain level of interpretability in order to be applicable in real settings and create actionable insights. This paper aims to develop accurate and interpretable predictive models for readmission in a general pediatric patient population, by integrating a data-driven model (sparse logistic regression) and domain knowledge based on the international classification of diseases 9th-revision clinical modification (ICD-9-CM) hierarchy of diseases. Additionally, we propose a way to quantify the interpretability of a model and inspect the stability of alternative solutions. The analysis was conducted on >66,000 pediatric hospital discharge records from California, State Inpatient Databases, Healthcare Cost and Utilization Project between 2009 and 2011. We incorporated domain knowledge based on the ICD-9-CM hierarchy in a data driven, Tree-Lasso regularized logistic regression model, providing the framework for model interpretation. This approach was compared with traditional Lasso logistic regression resulting in models that are easier to interpret by fewer high-level diagnoses, with comparable prediction accuracy. The results revealed that the use of a Tree-Lasso model was as competitive in terms of accuracy (measured by area under the receiver operating characteristic curve-AUC) as the traditional Lasso logistic regression, but integration with the ICD-9-CM hierarchy of diseases provided more interpretable models in terms of high-level diagnoses. Additionally, interpretations of models are in accordance with existing medical understanding of pediatric readmission. Best performing models have

  10. Inferring gene regression networks with model trees

    PubMed Central

    2010-01-01

    Background: Novel strategies are required in order to handle the huge amount of data produced by microarray technologies. To infer gene regulatory networks, the first step is to find direct regulatory relationships between genes building the so-called gene co-expression networks. They are typically generated using correlation statistics as pairwise similarity measures. Correlation-based methods are very useful in order to determine whether two genes have a strong global similarity but do not detect local similarities. Results: We propose model trees as a method to identify gene interaction networks. While correlation-based methods analyze each pair of genes, in our approach we generate a single regression tree for each gene from the remaining genes. Finally, a graph from all the relationships among output and input genes is built taking into account whether the pair of genes is statistically significant. For this reason we apply a statistical procedure to control the false discovery rate. The performance of our approach, named REGNET, is experimentally tested on two well-known data sets: the Saccharomyces cerevisiae and E. coli data sets. First, the biological coherence of the results is tested. Second, the E. coli transcriptional network (in the Regulon database) is used as control to compare the results to that of a correlation-based method. This experiment shows that REGNET performs more accurately at detecting true gene associations than the Pearson and Spearman zeroth and first-order correlation-based methods. Conclusions: REGNET generates gene association networks from gene expression data, and differs from correlation-based methods in that the relationship between one gene and others is calculated simultaneously. Model trees are very useful techniques to estimate the numerical values for the target genes by linear regression functions. They are very often more precise than linear regression models because they can add just different linear regressions to separate
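
    The abstract does not name the procedure used to control the false discovery rate. As an illustration only, the sketch below assumes the widely used Benjamini-Hochberg step-up procedure; the function name, the alpha level, and the p-values are hypothetical and are not taken from REGNET.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where a hypothesis is rejected at FDR level alpha.

    Assumption: Benjamini-Hochberg step-up procedure (not confirmed by the abstract).
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p-value
    # Find the largest 1-based rank k with p_(k) <= (k / m) * alpha.
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * alpha:
            k_max = rank
    # Reject every hypothesis whose rank is at most k_max.
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= k_max:
            rejected[i] = True
    return rejected

# Hypothetical p-values for four candidate gene-gene relationships.
edge_p_values = [0.02, 0.03, 0.035, 0.9]
keep_edge = benjamini_hochberg(edge_p_values, alpha=0.05)
```

    In the hypothetical example the first two p-values miss their own rank thresholds, but the third meets its threshold, so the step-up rule retains all three as edges.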

  11. Newer classification and regression tree techniques: Bagging and Random Forests for ecological prediction

    Treesearch

    Anantha M. Prasad; Louis R. Iverson; Andy Liaw

    2006-01-01

    We evaluated four statistical models - Regression Tree Analysis (RTA), Bagging Trees (BT), Random Forests (RF), and Multivariate Adaptive Regression Splines (MARS) - for predictive vegetation mapping under current and future climate scenarios according to the Canadian Climate Centre global circulation model.

  12. Individualized Prediction of Heat Stress in Firefighters: A Data-Driven Approach Using Classification and Regression Trees.

    PubMed

    Mani, Ashutosh; Rao, Marepalli; James, Kelley; Bhattacharya, Amit

    2015-01-01

    The purpose of this study was to explore data-driven models, based on decision trees, to develop practical and easy-to-use predictive models for early identification of firefighters who are likely to cross the threshold of hyperthermia during live-fire training. Predictive models were created for three consecutive live-fire training scenarios. The final predicted outcome was a categorical variable: will a firefighter cross the upper threshold of hyperthermia - Yes/No. Two tiers of models were built, one with and one without taking into account the outcome (whether a firefighter crossed hyperthermia or not) from the previous training scenario. The first tier of models included age, baseline heart rate and core body temperature, body mass index, and duration of training scenario as predictors. The second tier of models included the outcome of the previous scenario in the prediction space, in addition to all the predictors from the first tier of models. Classification and regression trees were used independently for prediction. The response variable for the regression tree was the quantitative variable: core body temperature at the end of each scenario. The predicted quantitative variable from regression trees was compared to the upper threshold of hyperthermia (38°C) to predict whether a firefighter would enter hyperthermia. The performance of classification and regression tree models was satisfactory for the second (success rate = 79%) and third (success rate = 89%) training scenarios but not for the first (success rate = 43%). Data-driven models based on decision trees can be a useful tool for predicting physiological response without modeling the underlying physiological systems. Early prediction of heat stress coupled with proactive interventions, such as pre-cooling, can help reduce heat stress in firefighters.
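
    The step that turns a regression tree's temperature prediction into a Yes/No outcome can be sketched as below. Two points are assumptions, since the abstract does not specify them: that "crossing" means strictly exceeding 38°C, and that "success rate" means the fraction of correct Yes/No predictions. The temperatures and outcomes are invented.

```python
HYPERTHERMIA_THRESHOLD_C = 38.0  # upper threshold quoted in the abstract

def crosses_threshold(predicted_core_temp_c, threshold=HYPERTHERMIA_THRESHOLD_C):
    """Convert a predicted core body temperature into a Yes/No flag.

    Assumption: 'crossing' is a strict comparison (temperature > threshold).
    """
    return predicted_core_temp_c > threshold

def success_rate(predicted_flags, observed_flags):
    """Fraction of cases where the predicted flag matches the observed flag (assumed definition)."""
    matches = sum(p == o for p, o in zip(predicted_flags, observed_flags))
    return matches / len(observed_flags)

# Invented regression-tree outputs (deg C) and observed outcomes for four firefighters.
predicted_temps = [37.6, 38.4, 38.1, 37.9]
observed = [False, True, False, False]
flags = [crosses_threshold(t) for t in predicted_temps]
rate = success_rate(flags, observed)
```

    Here three of the four invented predictions agree with the observed outcome, so the sketch reports a success rate of 0.75.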

  13. Predicting volume of distribution with decision tree-based regression methods using predicted tissue:plasma partition coefficients.

    PubMed

    Freitas, Alex A; Limbu, Kriti; Ghafourian, Taravat

    2015-01-01

    Volume of distribution is an important pharmacokinetic property that indicates the extent of a drug's distribution in the body tissues. This paper addresses the problem of how to estimate the apparent volume of distribution at steady state (Vss) of chemical compounds in the human body using decision tree-based regression methods from the area of data mining (or machine learning). Hence, the pros and cons of several different types of decision tree-based regression methods have been discussed. The regression methods predict Vss using, as predictive features, both the compounds' molecular descriptors and the compounds' tissue:plasma partition coefficients (Kt:p) - often used in physiologically-based pharmacokinetics. Therefore, this work has assessed whether the data mining-based prediction of Vss can be made more accurate by using as input not only the compounds' molecular descriptors but also (a subset of) their predicted Kt:p values. Comparison of the models that used only molecular descriptors, in particular, the Bagging decision tree (mean fold error of 2.33), with those employing predicted Kt:p values in addition to the molecular descriptors, such as the Bagging decision tree using adipose Kt:p (mean fold error of 2.29), indicated that the use of predicted Kt:p values as descriptors may be beneficial for accurate prediction of Vss using decision trees if prior feature selection is applied. Decision tree based models presented in this work have an accuracy that is reasonable and similar to the accuracy of reported Vss inter-species extrapolations in the literature. The estimation of Vss for new compounds in drug discovery will benefit from methods that are able to integrate large and varied sources of data and flexible non-linear data mining methods such as decision trees, which can produce interpretable models. Graphical abstract: Decision trees for the prediction of tissue partition coefficient and volume of distribution of drugs.

  14. Methods for estimating population density in data-limited areas: evaluating regression and tree-based models in Peru.

    PubMed

    Anderson, Weston; Guikema, Seth; Zaitchik, Ben; Pan, William

    2014-01-01

    Obtaining accurate small area estimates of population is essential for policy and health planning but is often difficult in countries with limited data. In lieu of available population data, small area estimate models draw information from previous time periods or from similar areas. This study focuses on model-based methods for estimating population when no direct samples are available in the area of interest. To explore the efficacy of tree-based models for estimating population density, we compare six different model structures including Random Forest and Bayesian Additive Regression Trees. Results demonstrate that without information from prior time periods, non-parametric tree-based models produced more accurate predictions than did conventional regression methods. Improving estimates of population density in non-sampled areas is important for regions with incomplete census data and has implications for economic, health and development policies.

  15. Methods for Estimating Population Density in Data-Limited Areas: Evaluating Regression and Tree-Based Models in Peru

    PubMed Central

    Anderson, Weston; Guikema, Seth; Zaitchik, Ben; Pan, William

    2014-01-01

    Obtaining accurate small area estimates of population is essential for policy and health planning but is often difficult in countries with limited data. In lieu of available population data, small area estimate models draw information from previous time periods or from similar areas. This study focuses on model-based methods for estimating population when no direct samples are available in the area of interest. To explore the efficacy of tree-based models for estimating population density, we compare six different model structures including Random Forest and Bayesian Additive Regression Trees. Results demonstrate that without information from prior time periods, non-parametric tree-based models produced more accurate predictions than did conventional regression methods. Improving estimates of population density in non-sampled areas is important for regions with incomplete census data and has implications for economic, health and development policies. PMID:24992657

  16. Bayesian additive decision trees of biomarker by treatment interactions for predictive biomarker detection and subgroup identification.

    PubMed

    Zhao, Yang; Zheng, Wei; Zhuo, Daisy Y; Lu, Yuefeng; Ma, Xiwen; Liu, Hengchang; Zeng, Zhen; Laird, Glen

    2017-10-11

    Personalized medicine, or tailored therapy, has been an active and important topic in recent medical research. Many methods have been proposed in the literature for predictive biomarker detection and subgroup identification. In this article, we propose a novel decision tree-based approach applicable in randomized clinical trials. We model the prognostic effects of the biomarkers using additive regression trees and the biomarker-by-treatment effect using a single regression tree. A Bayesian approach is used to periodically revise the split variables and the split rules of the decision trees, which provides a better overall fit. A Gibbs sampler is implemented in the MCMC procedure, which updates the prognostic trees and the interaction tree separately. We use the posterior distribution of the interaction tree to construct the predictive scores of the biomarkers and to identify the subgroup where the treatment is superior to the control. Numerical simulations show that our proposed method performs well under various settings compared with existing methods. We also demonstrate an application of our method in a real clinical trial.
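
    The model structure in this abstract (a sum of prognostic trees plus one treatment interaction tree) can be illustrated with a small sketch. It covers only the prediction step, not the Bayesian fitting or the Gibbs sampler. The nested-dict tree layout, the 0/1 coding of treatment, the biomarker names, and all leaf values are assumptions made for the example.

```python
def tree_predict(tree, x):
    """Walk a tree stored as nested dicts down to a leaf and return the leaf value."""
    while "value" not in tree:
        tree = tree["left"] if x[tree["var"]] <= tree["cut"] else tree["right"]
    return tree["value"]

def expected_outcome(prognostic_trees, interaction_tree, x, treated):
    """Additive prognostic part plus the interaction tree, applied only under treatment.

    Assumption: treatment is coded 1 (treated) or 0 (control).
    """
    prognostic = sum(tree_predict(t, x) for t in prognostic_trees)
    interaction = tree_predict(interaction_tree, x) if treated else 0.0
    return prognostic + interaction

# Hypothetical trees and patient; none of these values come from the paper.
prognostic_trees = [
    {"var": "age", "cut": 60, "left": {"value": 1.0}, "right": {"value": -0.5}},
    {"var": "marker_a", "cut": 0.3, "left": {"value": 0.2}, "right": {"value": 0.4}},
]
interaction_tree = {"var": "marker_b", "cut": 1.5, "left": {"value": 0.0}, "right": {"value": 0.8}}
patient = {"age": 52, "marker_a": 0.7, "marker_b": 2.1}

treated_outcome = expected_outcome(prognostic_trees, interaction_tree, patient, treated=True)
control_outcome = expected_outcome(prognostic_trees, interaction_tree, patient, treated=False)
treatment_effect = treated_outcome - control_outcome
```

    In this layout the estimated treatment effect for a patient is simply the leaf value that the interaction tree assigns, so a subgroup that benefits corresponds to the leaves with a positive value.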

  17. Predicting the limits to tree height using statistical regressions of leaf traits.

    PubMed

    Burgess, Stephen S O; Dawson, Todd E

    2007-01-01

    Leaf morphology and physiological functioning demonstrate considerable plasticity within tree crowns, with various leaf traits often exhibiting pronounced vertical gradients in very tall trees. It has been proposed that the trajectory of these gradients, as determined by regression methods, could be used in conjunction with theoretical biophysical limits to estimate the maximum height to which trees can grow. Here, we examined this approach using published and new experimental data from tall conifer and angiosperm species. We showed that height predictions were sensitive to tree-to-tree variation in the shape of the regression and to the biophysical endpoints selected. We examined the suitability of proposed end-points and their theoretical validity. We also noted that site and environment influenced height predictions considerably. Use of leaf mass per unit area or leaf water potential coupled with vulnerability of twigs to cavitation poses a number of difficulties for predicting tree height. Photosynthetic rate and carbon isotope discrimination show more promise, but in the second case, the complex relationship between light, water availability, photosynthetic capacity and internal conductance to CO2 must first be characterized.

  18. Factor complexity of crash occurrence: An empirical demonstration using boosted regression trees.

    PubMed

    Chung, Yi-Shih

    2013-12-01

    Factor complexity is a characteristic of traffic crashes. This paper proposes a novel method, namely boosted regression trees (BRT), to investigate the complex and nonlinear relationships in high-variance traffic crash data. The Taiwanese 2004-2005 single-vehicle motorcycle crash data are used to demonstrate the utility of BRT. Traditional logistic regression and classification and regression tree (CART) models are also used to compare their estimation results and external validities. Both the in-sample cross-validation and out-of-sample validation results show that an increase in tree complexity provides improved, although declining, classification performance, indicating a limited factor complexity of single-vehicle motorcycle crashes. The effects of crucial variables including geographical, time, and sociodemographic factors explain some fatal crashes. Relatively unique fatal crashes are better approximated by interactive terms, especially combinations of behavioral factors. BRT models generally provide better transferability than conventional logistic regression and CART models. This study also discusses the implications of the results for devising safety policies. Copyright © 2012 Elsevier Ltd. All rights reserved.

  19. A stepwise regression tree for nonlinear approximation: applications to estimating subpixel land cover

    USGS Publications Warehouse

    Huang, C.; Townshend, J.R.G.

    2003-01-01

    A stepwise regression tree (SRT) algorithm was developed for approximating complex nonlinear relationships. Based on the regression tree of Breiman et al. (BRT) and a stepwise linear regression (SLR) method, this algorithm represents an improvement over SLR in that it can approximate nonlinear relationships and over BRT in that it gives more realistic predictions. The applicability of this method to estimating subpixel forest was demonstrated using three test data sets, on all of which it gave more accurate predictions than SLR and BRT. SRT also generated more compact trees and performed better than or at least as well as BRT in all 10 equal forest proportion intervals ranging from 0 to 100%. This method is appealing for estimating subpixel land cover over large areas.

  20. Classification and regression tree analysis vs. multivariable linear and logistic regression methods as statistical tools for studying haemophilia.

    PubMed

    Henrard, S; Speybroeck, N; Hermans, C

    2015-11-01

    Haemophilia is a rare genetic haemorrhagic disease characterized by partial or complete deficiency of coagulation factor VIII, for haemophilia A, or IX, for haemophilia B. As in any other medical research domain, the field of haemophilia research is increasingly concerned with finding factors associated with binary or continuous outcomes through multivariable models. Traditional models include multiple logistic regressions, for binary outcomes, and multiple linear regressions for continuous outcomes. Yet these regression models are at times difficult to implement, especially for non-statisticians, and can be difficult to interpret. The present paper sought to didactically explain how, why, and when to use classification and regression tree (CART) analysis for haemophilia research. The CART method is non-parametric and non-linear, based on the repeated partitioning of a sample into subgroups based on a certain criterion. Breiman developed this method in 1984. Classification trees (CTs) are used to analyse categorical outcomes and regression trees (RTs) to analyse continuous ones. The CART methodology has become increasingly popular in the medical field, yet only a few examples of studies using this methodology specifically in haemophilia have to date been published. Two examples using CART analysis and previously published in this field are didactically explained in detail. There is increasing interest in using CART analysis in the health domain, primarily due to its ease of implementation, use, and interpretation, thus facilitating medical decision-making. This method should be promoted for analysing continuous or categorical outcomes in haemophilia, when applicable. © 2015 John Wiley & Sons Ltd.

  1. Comparing Methodologies for Developing an Early Warning System: Classification and Regression Tree Model versus Logistic Regression. REL 2015-077

    ERIC Educational Resources Information Center

    Koon, Sharon; Petscher, Yaacov

    2015-01-01

    The purpose of this report was to explicate the use of logistic regression and classification and regression tree (CART) analysis in the development of early warning systems. It was motivated by state education leaders' interest in maintaining high classification accuracy while simultaneously improving practitioner understanding of the rules by…

  2. Application of Boosting Regression Trees to Preliminary Cost Estimation in Building Construction Projects

    PubMed Central

    2015-01-01

    Among the recent data mining techniques available, the boosting approach has attracted a great deal of attention because of its effective learning algorithm and strong boundaries in terms of its generalization performance. However, the boosting approach has yet to be used in regression problems within the construction domain, including cost estimations, but has been actively utilized in other domains. Therefore, a boosting regression tree (BRT) is applied to cost estimations at the early stage of a construction project to examine the applicability of the boosting approach to a regression problem within the construction domain. To evaluate the performance of the BRT model, its performance was compared with that of a neural network (NN) model, which has been proven to have a high performance in cost estimation domains. The BRT model has shown results similar to those of the NN model using 234 actual cost datasets of a building construction project. In addition, the BRT model can provide additional information such as the importance plot and structure model, which can support estimators in comprehending the decision-making process. Consequently, the boosting approach has potential applicability in preliminary cost estimations in a building construction project. PMID:26339227

  3. Application of Boosting Regression Trees to Preliminary Cost Estimation in Building Construction Projects.

    PubMed

    Shin, Yoonseok

    2015-01-01

    Among the recent data mining techniques available, the boosting approach has attracted a great deal of attention because of its effective learning algorithm and strong boundaries in terms of its generalization performance. However, the boosting approach has yet to be used in regression problems within the construction domain, including cost estimations, but has been actively utilized in other domains. Therefore, a boosting regression tree (BRT) is applied to cost estimations at the early stage of a construction project to examine the applicability of the boosting approach to a regression problem within the construction domain. To evaluate the performance of the BRT model, its performance was compared with that of a neural network (NN) model, which has been proven to have a high performance in cost estimation domains. The BRT model has shown results similar to those of NN model using 234 actual cost datasets of a building construction project. In addition, the BRT model can provide additional information such as the importance plot and structure model, which can support estimators in comprehending the decision making process. Consequently, the boosting approach has potential applicability in preliminary cost estimations in a building construction project.

  4. [Hyperspectral Estimation of Apple Tree Canopy LAI Based on SVM and RF Regression].

    PubMed

    Han, Zhao-ying; Zhu, Xi-cun; Fang, Xian-yi; Wang, Zhuo-yuan; Wang, Ling; Zhao, Geng-Xing; Jiang, Yuan-mao

    2016-03-01

    Leaf area index (LAI) is a dynamic index of crop population size. Hyperspectral technology can be used to estimate apple canopy LAI rapidly and nondestructively, and it can provide a reference for monitoring tree growth and estimating yield. The research objects were Red Fuji apple trees in the full fruit-bearing period. Canopy spectral reflectance and LAI values of ninety apple trees were measured with the ASD Fieldspec3 spectrometer and the LAI-2200 in thirty orchards over two consecutive years in the Qixia research area of Shandong Province. The optimal vegetation indices were selected by correlation analysis of the original spectral reflectance and the vegetation indices. Models for predicting LAI were built with the multivariate regression methods of support vector machine (SVM) and random forest (RF). The new vegetation indices GNDVI527, ND-VI676, RVI682, FD-NVI656 and GRVI517 and the two main existing vegetation indices, NDVI670 and NDVI705, are in accordance with LAI. In the RF regression model, the calibration set coefficient of determination C-R2 of 0.920 and the validation set coefficient of determination V-R2 of 0.889 are higher than those of the SVM regression model by 0.045 and 0.033, respectively. The calibration set root mean square error C-RMSE of 0.249 and the validation set root mean square error V-RMSE of 0.236 are lower than those of the SVM regression model by 0.054 and 0.058, respectively. The relative analysis error of the calibration set C-RPD and of the validation set V-RPD reached 3.363 and 2.520, which were higher than those of the SVM regression model by 0.598 and 0.262, respectively. The slopes of the trend lines of the measured-versus-predicted scatterplots for the calibration and validation sets, C-S and V-S, are close to 1. The estimation result of the RF regression model is better than that of the SVM model. The RF regression model can be used to estimate the LAI of Red Fuji apple trees in the full fruit period.

  5. Regression trees for predicting mortality in patients with cardiovascular disease: What improvement is achieved by using ensemble-based methods?

    PubMed Central

    Austin, Peter C; Lee, Douglas S; Steyerberg, Ewout W; Tu, Jack V

    2012-01-01

    In biomedical research, the logistic regression model is the most commonly used method for predicting the probability of a binary outcome. While many clinical researchers have expressed an enthusiasm for regression trees, this method may have limited accuracy for predicting health outcomes. We aimed to evaluate the improvement that is achieved by using ensemble-based methods, including bootstrap aggregation (bagging) of regression trees, random forests, and boosted regression trees. We analyzed 30-day mortality in two large cohorts of patients hospitalized with either acute myocardial infarction (N = 16,230) or congestive heart failure (N = 15,848) in two distinct eras (1999–2001 and 2004–2005). We found that both the in-sample and out-of-sample prediction of ensemble methods offered substantial improvement in predicting cardiovascular mortality compared to conventional regression trees. However, conventional logistic regression models that incorporated restricted cubic smoothing splines had even better performance. We conclude that ensemble methods from the data mining and machine learning literature increase the predictive performance of regression trees, but may not lead to clear advantages over conventional logistic regression models for predicting short-term mortality in population-based samples of subjects with cardiovascular disease. PMID:22777999

  6. Assessment of wastewater treatment facility compliance with decreasing ammonia discharge limits using a regression tree model.

    PubMed

    Suchetana, Bihu; Rajagopalan, Balaji; Silverstein, JoAnn

    2017-11-15

    A regression tree-based diagnostic approach is developed to evaluate factors affecting US wastewater treatment plant compliance with ammonia discharge permit limits using Discharge Monthly Report (DMR) data from a sample of 106 municipal treatment plants for the period of 2004-2008. Predictor variables used to fit the regression tree are selected using random forests, and consist of the previous month's effluent ammonia, influent flow rates and plant capacity utilization. The tree models are first used to evaluate compliance with existing ammonia discharge standards at each facility and then applied assuming more stringent discharge limits, which are under consideration in many states. The model predicts that the ability to meet both current and future limits depends primarily on the previous month's treatment performance. With more stringent discharge limits, the predicted ammonia concentration relative to the discharge limit increases. In-sample validation shows that the regression trees can provide a median classification accuracy of >70%. The regression tree model is validated using ammonia discharge data from an operating wastewater treatment plant and is able to accurately predict the observed ammonia discharge category approximately 80% of the time, indicating that the regression tree model can be applied to predict compliance for individual treatment plants, providing practical guidance for utilities and regulators with an interest in controlling ammonia discharges. The proposed methodology is also used to demonstrate how to delineate reliable sources of demand and supply in a point source-to-point source nutrient credit trading scheme, as well as how planners and decision makers can set reasonable discharge limits in the future. Copyright © 2017 Elsevier B.V. All rights reserved.

  7. A hybrid approach of stepwise regression, logistic regression, support vector machine, and decision tree for forecasting fraudulent financial statements.

    PubMed

    Chen, Suduan; Goo, Yeong-Jia James; Shen, Zone-De

    2014-01-01

    As fraudulent financial statements by enterprises become increasingly serious, establishing a valid model for forecasting fraudulent financial statements has become an important question for academic research and financial practice. After screening the important variables using stepwise regression, the study applies logistic regression, support vector machine, and decision tree methods to construct classification models for comparison. The study adopts financial and nonfinancial variables to assist in establishing the forecasting model. The research objects are companies with fraudulent and nonfraudulent financial statements between 1998 and 2012. The findings are that financial and nonfinancial information can be used effectively to distinguish fraudulent financial statements, and that the C5.0 decision tree has the best classification performance, at 85.71%.

  8. Hyperspectral Analysis of Soil Nitrogen, Carbon, Carbonate, and Organic Matter Using Regression Trees

    PubMed Central

    Gmur, Stephan; Vogt, Daniel; Zabowski, Darlene; Moskal, L. Monika

    2012-01-01

    The characterization of soil attributes using hyperspectral sensors has revealed patterns in soil spectra that are known to respond to mineral composition, organic matter, soil moisture and particle size distribution. Soil samples from different soil horizons of replicated soil series from sites located within Washington and Oregon were analyzed with the FieldSpec Spectroradiometer to measure their spectral signatures across the electromagnetic range of 400 to 1,000 nm. Similarity rankings of individual soil samples reveal differences between replicate series as well as samples within the same replicate series. Using classification and regression tree statistical methods, regression trees were fitted to each spectral response using concentrations of nitrogen, carbon, carbonate and organic matter as the response variables. Statistics resulting from fitted trees were: nitrogen R2 0.91 (p < 0.01) at 403, 470, 687, and 846 nm spectral band widths, carbonate R2 0.95 (p < 0.01) at 531 and 898 nm band widths, total carbon R2 0.93 (p < 0.01) at 400, 409, 441 and 907 nm band widths, and organic matter R2 0.98 (p < 0.01) at 300, 400, 441, 832 and 907 nm band widths. Use of the 400 to 1,000 nm electromagnetic range utilizing regression trees provided a powerful, rapid and inexpensive method for assessing nitrogen, carbon, carbonate and organic matter for upper soil horizons in a nondestructive method. PMID:23112620

  9. GIS-based groundwater potential mapping using boosted regression tree, classification and regression tree, and random forest machine learning models in Iran.

    PubMed

    Naghibi, Seyed Amir; Pourghasemi, Hamid Reza; Dixon, Barnali

    2016-01-01

    Groundwater is considered one of the most valuable fresh water resources. The main objective of this study was to produce groundwater spring potential maps in the Koohrang Watershed, Chaharmahal-e-Bakhtiari Province, Iran, using three machine learning models: boosted regression tree (BRT), classification and regression tree (CART), and random forest (RF). Thirteen hydrological-geological-physiographical (HGP) factors that influence locations of springs were considered in this research. These factors include slope degree, slope aspect, altitude, topographic wetness index (TWI), slope length (LS), plan curvature, profile curvature, distance to rivers, distance to faults, lithology, land use, drainage density, and fault density. Subsequently, groundwater spring potential was modeled and mapped using CART, RF, and BRT algorithms. The predicted results from the three models were validated using the receiver operating characteristics curve (ROC). From 864 springs identified, 605 (≈70 %) locations were used for the spring potential mapping, while the remaining 259 (≈30 %) springs were used for the model validation. The area under the curve (AUC) for the BRT model was calculated as 0.8103 and for CART and RF the AUC were 0.7870 and 0.7119, respectively. Therefore, it was concluded that the BRT model produced the best prediction results while predicting locations of springs followed by CART and RF models, respectively. Geospatially integrated BRT, CART, and RF methods proved to be useful in generating the spring potential map (SPM) with reasonable accuracy.
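
    The ROC-based validation described in this abstract can be illustrated with a minimal sketch. It computes the area under the ROC curve through the rank (Mann-Whitney) formulation, using only the Python standard library. The function name, labels, and scores below are hypothetical illustrations and are not taken from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation.

    labels: iterable of 0/1 values (1 = positive case, e.g. a spring is present).
    scores: model outputs, where a higher score means "more likely positive".
    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical validation labels and predicted potential scores (assumed data).
val_labels = [1, 1, 1, 0, 0, 0]
val_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
auc = roc_auc(val_labels, val_scores)
print(f"AUC = {auc:.4f}")
```

    An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the reported values of 0.8103, 0.7870, and 0.7119 are compared.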

  10. A Hybrid Approach of Stepwise Regression, Logistic Regression, Support Vector Machine, and Decision Tree for Forecasting Fraudulent Financial Statements

    PubMed Central

    Goo, Yeong-Jia James; Shen, Zone-De

    2014-01-01

    As fraudulent financial statements by enterprises become increasingly serious, establishing a valid model for forecasting fraudulent financial statements has become an important question for academic research and financial practice. After screening the important variables using stepwise regression, the study applies logistic regression, support vector machine, and decision tree methods to construct classification models for comparison. The study adopts financial and nonfinancial variables to assist in establishing the forecasting model. The research objects are companies with fraudulent and nonfraudulent financial statements between 1998 and 2012. The findings are that financial and nonfinancial information can be used effectively to distinguish fraudulent financial statements, and that the C5.0 decision tree has the best classification performance, at 85.71%. PMID:25302338

  11. Boosted Regression Tree Models to Explain Watershed Nutrient Concentrations and Biological Condition

    EPA Science Inventory

    Boosted regression tree (BRT) models were developed to quantify the nonlinear relationships between landscape variables and nutrient concentrations in a mesoscale mixed land cover watershed during base-flow conditions. Factors that affect instream biological components, based on ...

  12. Cloud-Free Satellite Image Mosaics with Regression Trees and Histogram Matching.

    Treesearch

    E.H. Helmer; B. Ruefenacht

    2005-01-01

    Cloud-free optical satellite imagery simplifies remote sensing, but land-cover phenology limits existing solutions to persistent cloudiness to compositing temporally resolute, spatially coarser imagery. Here, a new strategy for developing cloud-free imagery at finer resolution permits simple automatic change detection. The strategy uses regression trees to predict...

  13. Perceived Organizational Support for Enhancing Welfare at Work: A Regression Tree Model

    PubMed Central

    Giorgi, Gabriele; Dubin, David; Perez, Javier Fiz

    2016-01-01

    When trying to examine outcomes such as welfare and well-being, research tends to focus on main effects and take into account limited numbers of variables at a time. There are a number of techniques that may help address this problem. For example, many statistical packages available in R provide easy-to-use methods of modeling complicated analysis such as classification and tree regression (i.e., recursive partitioning). The present research illustrates the value of recursive partitioning in the prediction of perceived organizational support in a sample of more than 6000 Italian bankers. Utilizing the tree function party package in R, we estimated a regression tree model predicting perceived organizational support from a multitude of job characteristics including job demand, lack of job control, lack of supervisor support, training, etc. The resulting model appears particularly helpful in pointing out several interactions in the prediction of perceived organizational support. In particular, training is the dominant factor. Another dimension that seems to influence organizational support is reporting (perceived communication about safety and stress concerns). Results are discussed from a theoretical and methodological point of view. PMID:28082924

  14. A review of logistic regression models used to predict post-fire tree mortality of western North American conifers

    Treesearch

    Travis Woolley; David C. Shaw; Lisa M. Ganio; Stephen Fitzgerald

    2012-01-01

    Logistic regression models used to predict tree mortality are critical to post-fire management, planning prescribed burns and understanding disturbance ecology. We review literature concerning post-fire mortality prediction using logistic regression models for coniferous tree species in the western USA. We include synthesis and review of: methods to develop, evaluate...

  15. Regression Trees Identify Relevant Interactions: Can This Improve the Predictive Performance of Risk Adjustment?

    PubMed

    Buchner, Florian; Wasem, Jürgen; Schillo, Sonja

    2017-01-01

    Risk equalization formulas have been refined since their introduction about two decades ago. Because of the complexity and the abundance of possible interactions between the variables used, hardly any interactions are considered. A regression tree is used to systematically search for interactions, a methodologically new approach in risk equalization. Analyses are based on a data set of nearly 2.9 million individuals from a major German social health insurer. A two-step approach is applied: In the first step a regression tree is built on the basis of the learning data set. Terminal nodes characterized by more than one morbidity-group-split represent interaction effects of different morbidity groups. In the second step the 'traditional' weighted least squares regression equation is expanded by adding interaction terms for all interactions detected by the tree, and regression coefficients are recalculated. The resulting risk adjustment formula shows an improvement in the adjusted R2 from 25.43% to 25.81% on the evaluation data set. Predictive ratios are calculated for subgroups affected by the interactions. The R2 improvement detected is only marginal. According to the sample level performance measures used, not involving a considerable number of morbidity interactions forms no relevant loss in accuracy. Copyright © 2015 John Wiley & Sons, Ltd.

  16. Using ROC curves to compare neural networks and logistic regression for modeling individual noncatastrophic tree mortality

    Treesearch

    Susan L. King

    2003-01-01

    The performance of two classifiers, logistic regression and neural networks, are compared for modeling noncatastrophic individual tree mortality for 21 species of trees in West Virginia. The output of the classifier is usually a continuous number between 0 and 1. A threshold is selected between 0 and 1 and all of the trees below the threshold are classified as...

  17. Estimating parameters for tree basal area growth with a system of equations and seemingly unrelated regressions

    Treesearch

    Charles E. Rose; Thomas B. Lynch

    2001-01-01

    A method was developed for estimating parameters in an individual tree basal area growth model using a system of equations based on dbh rank classes. The estimation method developed is a compromise between an individual tree and a stand level basal area growth model that accounts for the correlation between trees within a plot by using seemingly unrelated regression (...

  18. Additivity in tree biomass components of Pyrenean oak (Quercus pyrenaica Willd.)

    Treesearch

    Joao P. Carvalho; Bernard R. Parresol

    2003-01-01

    In tree biomass estimations, it is important to consider the property of additivity, i.e., the total tree biomass should equal the sum of the components. This work presents functions that allow estimation of the stem and crown dry weight components of Pyrenean oak (Quercus pyrenaica Willd.) trees. A procedure that considers additivity of tree biomass...

  19. Reconstructing missing daily precipitation data using regression trees and artificial neural networks

    USDA-ARS's Scientific Manuscript database

    Incomplete meteorological data has been a problem in environmental modeling studies. The objective of this work was to develop a technique to reconstruct missing daily precipitation data in the central part of Chesapeake Bay Watershed using regression trees (RT) and artificial neural networks (ANN)....

  20. Reconstructing missing daily precipitation data using regression trees and artificial neural networks

    USDA-ARS's Scientific Manuscript database

    Missing meteorological data have to be estimated for agricultural and environmental modeling. The objective of this work was to develop a technique to reconstruct the missing daily precipitation data in the central part of the Chesapeake Bay Watershed using regression trees (RT) and artificial neura...

  1. Development of hybrid genetic-algorithm-based neural networks using regression trees for modeling air quality inside a public transportation bus.

    PubMed

    Kadiyala, Akhil; Kaur, Devinder; Kumar, Ashok

    2013-02-01

    The present study developed a novel approach to modeling indoor air quality (IAQ) of a public transportation bus through hybrid genetic-algorithm-based neural networks (also known as evolutionary neural networks) with input variables optimized using regression trees, referred to as the GART approach. This study validated the applicability of the GART modeling approach in solving complex nonlinear systems by accurately predicting the monitored contaminants of carbon dioxide (CO2), carbon monoxide (CO), nitric oxide (NO), sulfur dioxide (SO2), 0.3-0.4 µm sized particle numbers, 0.4-0.5 µm sized particle numbers, particulate matter (PM) concentrations less than 1.0 µm (PM1.0), and PM concentrations less than 2.5 µm (PM2.5) inside a public transportation bus operating on 20% grade biodiesel in Toledo, OH. First, the important variables affecting each monitored in-bus contaminant were determined using regression trees. Second, the analysis of variance was used as a complementary sensitivity analysis to the regression tree results to determine a subset of statistically significant variables affecting each monitored in-bus contaminant. Finally, the identified subsets of statistically significant variables were used as inputs to develop three artificial neural network (ANN) models. The models developed were regression tree-based back-propagation network (BPN-RT), regression tree-based radial basis function network (RBFN-RT), and GART models. Performance measures were used to validate the predictive capacity of the developed IAQ models. The results from this approach were compared with the results obtained from using a theoretical approach and a generalized practicable approach to modeling IAQ that included the consideration of additional independent variables when developing the aforementioned ANN models. The hybrid GART models were able to capture the majority of the variance in the monitored in-bus contaminants. The genetic

  2. Propensity score estimation: neural networks, support vector machines, decision trees (CART), and meta-classifiers as alternatives to logistic regression.

    PubMed

    Westreich, Daniel; Lessler, Justin; Funk, Michele Jonsson

    2010-08-01

    Propensity scores for the analysis of observational data are typically estimated using logistic regression. Our objective in this review was to assess machine learning alternatives to logistic regression, which may accomplish the same goals but with fewer assumptions or greater accuracy. We identified alternative methods for propensity score estimation and/or classification from the public health, biostatistics, discrete mathematics, and computer science literature, and evaluated these algorithms for applicability to the problem of propensity score estimation, potential advantages over logistic regression, and ease of use. We identified four techniques as alternatives to logistic regression: neural networks, support vector machines, decision trees (classification and regression trees [CART]), and meta-classifiers (in particular, boosting). Although the assumptions of logistic regression are well understood, those assumptions are frequently ignored. All four alternatives have advantages and disadvantages compared with logistic regression. Boosting (meta-classifiers) and, to a lesser extent, decision trees (particularly CART), appear to be most promising for use in the context of propensity score analysis, but extensive simulation studies are needed to establish their utility in practice. Copyright (c) 2010 Elsevier Inc. All rights reserved.

  3. Additivity of nonlinear biomass equations

    Treesearch

    Bernard R. Parresol

    2001-01-01

    Two procedures that guarantee the property of additivity among the components of tree biomass and total tree biomass utilizing nonlinear functions are developed. Procedure 1 is a simple combination approach, and procedure 2 is based on nonlinear joint-generalized regression (nonlinear seemingly unrelated regressions) with parameter restrictions. Statistical theory is...

  4. Prevalence and Determinants of Preterm Birth in Tehran, Iran: A Comparison between Logistic Regression and Decision Tree Methods.

    PubMed

    Amini, Payam; Maroufizadeh, Saman; Samani, Reza Omani; Hamidi, Omid; Sepidarkish, Mahdi

    2017-06-01

    Preterm birth (PTB) is a leading cause of neonatal death and the second biggest cause of death in children under five years of age. The objective of this study was to determine the prevalence of PTB and its associated factors using logistic regression and decision tree classification methods. This cross-sectional study was conducted on 4,415 pregnant women in Tehran, Iran, from July 6-21, 2015. Data were collected by a researcher-developed questionnaire through interviews with mothers and review of their medical records. To evaluate the accuracy of the logistic regression and decision tree methods, several indices such as sensitivity, specificity, and the area under the curve were used. The PTB rate was 5.5% in this study. The logistic regression outperformed the decision tree for the classification of PTB based on risk factors. Logistic regression showed that multiple pregnancies, mothers with preeclampsia, and those who conceived with assisted reproductive technology had an increased risk for PTB (p < 0.05). Identifying and training mothers at risk as well as improving prenatal care may reduce the PTB rate. We also recommend that statisticians utilize the logistic regression model for the classification of risk groups for PTB.
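
    The sensitivity and specificity indices mentioned in this abstract can be sketched as follows, assuming binary labels where 1 marks the positive class. The function name and the example labels are hypothetical and do not reproduce the study's data.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded as 0/1.

    sensitivity = TP / (TP + FN): share of true positives that are detected.
    specificity = TN / (TN + FP): share of true negatives that are detected.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical observed outcomes and classifier predictions (assumed data).
observed = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
sens, spec = sensitivity_specificity(observed, predicted)
print(f"sensitivity = {sens:.3f}, specificity = {spec:.3f}")
```

    With a rare outcome such as a 5.5% event rate, reporting both indices matters, because a classifier that always predicts the majority class would have high accuracy but zero sensitivity.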

  5. What Satisfies Students?: Mining Student-Opinion Data with Regression and Decision Tree Analysis

    ERIC Educational Resources Information Center

    Thomas, Emily H.; Galambos, Nora

    2004-01-01

    To investigate how students' characteristics and experiences affect satisfaction, this study uses regression and decision tree analysis with the CHAID algorithm to analyze student-opinion data. A data mining approach identifies the specific aspects of students' university experience that most influence three measures of general satisfaction. The…

  6. Estimating cavity tree and snag abundance using negative binomial regression models and nearest neighbor imputation methods

    Treesearch

    Bianca N.I. Eskelson; Hailemariam Temesgen; Tara M. Barrett

    2009-01-01

    Cavity tree and snag abundance data are highly variable and contain many zero observations. We predict cavity tree and snag abundance from variables that are readily available from forest cover maps or remotely sensed data using negative binomial (NB), zero-inflated NB, and zero-altered NB (ZANB) regression models as well as nearest neighbor (NN) imputation methods....

  7. Logistic regression trees for initial selection of interesting loci in case-control studies

    PubMed Central

    Nickolov, Radoslav Z; Milanov, Valentin B

    2007-01-01

    Modern genetic epidemiology faces the challenge of dealing with hundreds of thousands of genetic markers. The selection of a small initial subset of interesting markers for further investigation can greatly facilitate genetic studies. In this contribution we suggest the use of a logistic regression tree algorithm known as logistic tree with unbiased selection. Using the simulated data provided for Genetic Analysis Workshop 15, we show how this algorithm, with incorporation of multifactor dimensionality reduction method, can reduce an initial large pool of markers to a small set that includes the interesting markers with high probability. PMID:18466557

  8. Industrial and occupational ergonomics in the petrochemical process industry: a regression trees approach.

    PubMed

    Bevilacqua, M; Ciarapica, F E; Giacchetta, G

    2008-07-01

    This work is an attempt to apply classification tree methods to data regarding accidents in a medium-sized refinery, so as to identify the important relationships between the variables, which can be considered as decision-making rules when adopting any measures for improvement. The results obtained using the CART (Classification And Regression Trees) method proved to be the most precise and, in general, they are encouraging concerning the use of tree diagrams as preliminary explorative techniques for the assessment of the ergonomic, management and operational parameters which influence high accident risk situations. The Occupational Injury analysis carried out in this paper was planned as a dynamic process and can be repeated systematically. The CART technique, which considers a very wide set of objective and predictive variables, shows new cause-effect correlations in occupational safety which had never been previously described, highlighting possible injury risk groups and supporting decision-making in these areas. The use of classification trees must not, however, be seen as an attempt to supplant other techniques, but as a complementary method which can be integrated into traditional types of analysis.

  9. Risk Factors of Falls in Community-Dwelling Older Adults: Logistic Regression Tree Analysis

    ERIC Educational Resources Information Center

    Yamashita, Takashi; Noe, Douglas A.; Bailer, A. John

    2012-01-01

    Purpose of the Study: A novel logistic regression tree-based method was applied to identify fall risk factors and possible interaction effects of those risk factors. Design and Methods: A nationally representative sample of American older adults aged 65 years and older (N = 9,592) in the Health and Retirement Study 2004 and 2006 modules was used.…

  10. Combining logistic regression with classification and regression tree to predict quality of care in a home health nursing data set.

    PubMed

    Guo, Huey-Ming; Shyu, Yea-Ing Lotus; Chang, Her-Kun

    2006-01-01

    In this article, the authors provide an overview of a research method for predicting quality of care in a home health nursing data set. The results of this study can be visualized through classification and regression tree (CART) graphs. The analysis was more effective and the results were more informative because the home health nursing data set was analyzed with a combination of logistic regression and CART; the two techniques complement each other. The results were also more informative in that more patient characteristics were found to be related to quality of care in home care. The results help home health nurses predict patient outcomes in case management. Improved prediction is needed so that interventions can be appropriately targeted to improve patient outcomes and quality of care.

  11. An optimal sample data usage strategy to minimize overfitting and underfitting effects in regression tree models based on remotely-sensed data

    USGS Publications Warehouse

    Gu, Yingxin; Wylie, Bruce K.; Boyte, Stephen; Picotte, Joshua J.; Howard, Danny; Smith, Kelcy; Nelson, Kurtis

    2016-01-01

    Regression tree models have been widely used for remote sensing-based ecosystem mapping. Improper use of the sample data (model training and testing data) may cause overfitting and underfitting effects in the model. The goal of this study is to develop an optimal sampling data usage strategy for any dataset and identify an appropriate number of rules in the regression tree model that will improve its accuracy and robustness. Landsat 8 data and Moderate-Resolution Imaging Spectroradiometer-scaled Normalized Difference Vegetation Index (NDVI) were used to develop regression tree models. A Python procedure was designed to generate random replications of model parameter options across a range of model development data sizes and rule number constraints. The mean absolute difference (MAD) between the predicted and actual NDVI (scaled NDVI, value from 0–200) and its variability across the different randomized replications were calculated to assess the accuracy and stability of the models. In our case study, a six-rule regression tree model developed from 80% of the sample data had the lowest MAD (MADtraining = 2.5 and MADtesting = 2.4), which was suggested as the optimal model. This study demonstrates how the training data and rule number selections impact model accuracy and provides important guidance for future remote-sensing-based ecosystem modeling.
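
    The replication procedure described in this abstract can be sketched roughly as below: repeated random training/testing splits, with the mean absolute difference (MAD) and its spread collected across replications. This is not the authors' Python procedure. The helper names, the synthetic samples, and the placeholder mean-value model (which stands in for a regression tree) are all assumptions made for illustration.

```python
import random
import statistics


def mean_absolute_difference(predicted, actual):
    """Mean of |predicted - actual| over paired values."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def replicate_mad(samples, train_fraction, n_replications, fit, seed=0):
    """Repeat a random train/test split and collect the testing MAD.

    samples: list of (x, y) pairs.
    fit: callable taking the training pairs and returning predict(x) -> y.
    Returns (mean MAD, population standard deviation of MAD).
    """
    rng = random.Random(seed)
    n_train = int(round(train_fraction * len(samples)))
    mads = []
    for _ in range(n_replications):
        shuffled = samples[:]
        rng.shuffle(shuffled)
        train, test = shuffled[:n_train], shuffled[n_train:]
        predict = fit(train)
        predicted = [predict(x) for x, _ in test]
        actual = [y for _, y in test]
        mads.append(mean_absolute_difference(predicted, actual))
    return statistics.mean(mads), statistics.pstdev(mads)


def fit_mean_model(train):
    """Placeholder model (NOT a regression tree): always predicts the training mean."""
    mean_y = statistics.mean([y for _, y in train])
    return lambda x: mean_y


# Synthetic samples (assumed): x is an index, y a value near 100.
samples = [(i, 100 + (i % 5)) for i in range(50)]
mean_mad, sd_mad = replicate_mad(samples, 0.8, 20, fit_mean_model)
print(f"mean MAD = {mean_mad:.3f}, SD across replications = {sd_mad:.3f}")
```

    Swapping `fit_mean_model` for a real regression tree fit, and looping over several values of `train_fraction`, would give the kind of accuracy-versus-stability comparison the abstract describes.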

  12. Improved predictive mapping of indoor radon concentrations using ensemble regression trees based on automatic clustering of geological units.

    PubMed

    Kropat, Georg; Bochud, Francois; Jaboyedoff, Michel; Laedermann, Jean-Pascal; Murith, Christophe; Palacios Gruson, Martha; Baechler, Sébastien

    2015-09-01

    According to estimates, around 230 people die as a result of radon exposure in Switzerland. This public health concern makes reliable indoor radon prediction and mapping methods necessary in order to improve risk communication to the public. The aim of this study was to develop an automated method to classify lithological units according to their radon characteristics and to develop mapping and predictive tools in order to improve local radon prediction. About 240 000 indoor radon concentration (IRC) measurements in about 150 000 buildings were available for our analysis. The automated classification of lithological units was based on k-medoids clustering via pair-wise Kolmogorov distances between IRC distributions of lithological units. For IRC mapping and prediction we used random forests and Bayesian additive regression trees (BART). The automated classification groups lithological units well in terms of their IRC characteristics. Especially the IRC differences in metamorphic rocks like gneiss are well revealed by this method. The maps produced by random forests soundly represent the regional difference of IRCs in Switzerland and improve the spatial detail compared to existing approaches. We could explain 33% of the variations in IRC data with random forests. Additionally, the influence of a variable evaluated by random forests shows that building characteristics are less important predictors for IRCs than spatial/geological influences. BART could explain 29% of IRC variability and produced maps that indicate the prediction uncertainty. Ensemble regression trees are a powerful tool to model and understand the multidimensional influences on IRCs. Automatic clustering of lithological units complements this method by facilitating the interpretation of radon properties of rock types. This study provides an important element for radon risk communication. Future approaches should consider taking into account further variables like soil gas radon measurements as
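
    The pair-wise Kolmogorov distance that feeds the clustering step can be illustrated with a short sketch. It assumes the distance is the largest absolute gap between two empirical distribution functions; the unit names and concentration values below are made up for illustration and are not taken from the study, and the k-medoids step itself is not shown.

```python
from bisect import bisect_right


def kolmogorov_distance(sample_a, sample_b):
    """Largest absolute gap between the empirical CDFs of two samples."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    gap = 0.0
    # Both empirical CDFs are step functions, so the largest gap is
    # attained at one of the observed values.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def pairwise_distances(units):
    """Distance for every unordered pair of units (dict: name -> sample)."""
    names = sorted(units)
    return {
        (u, v): kolmogorov_distance(units[u], units[v])
        for i, u in enumerate(names)
        for v in names[i + 1:]
    }


# Hypothetical indoor radon samples per lithological unit (illustrative values).
units = {
    "gneiss": [120, 180, 240, 310, 450],
    "limestone": [40, 60, 75, 90, 130],
    "granite": [110, 170, 260, 300, 480],
}
for pair, d in pairwise_distances(units).items():
    print(pair, round(d, 2))
```

    A distance matrix of this kind is the usual input to a k-medoids routine, which only needs dissimilarities between units rather than coordinates.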

  13. Tree Biomass Allocation and Its Model Additivity for Casuarina equisetifolia in a Tropical Forest of Hainan Island, China.

    PubMed

    Xue, Yang; Yang, Zhongyang; Wang, Xiaoyan; Lin, Zhipan; Li, Dunxi; Su, Shaofeng

    2016-01-01

    Casuarina equisetifolia is commonly planted and used in the construction of coastal shelterbelt protection in Hainan Island. Thus, it is critical to accurately estimate the tree biomass of Casuarina equisetifolia L. for forest managers to evaluate the biomass stock in Hainan. The data for this work consisted of 72 trees, which were divided into three age groups: young forest, middle-aged forest, and mature forest. The proportion of biomass from the trunk significantly increased with age (P<0.05). However, the biomass of the branch and leaf decreased, and the biomass of the root did not change. To test whether the crown radius (CR) can improve biomass estimates of C. equisetifolia, we introduced CR into the biomass models. Here, six models were used to estimate the biomass of each component, including the trunk, the branch, the leaf, and the root. In each group, we selected one model among these six models for each component. The results showed that including the CR greatly improved the model performance and reduced the error, especially for the young and mature forests. In addition, to ensure biomass additivity, the selected equation for each component was fitted as a system of equations using seemingly unrelated regression (SUR). The SUR method not only gave efficient and accurate estimates but also achieved the logical additivity. The results in this study provide a robust estimation of tree biomass components and total biomass over three groups of C. equisetifolia.

  14. Tree Biomass Allocation and Its Model Additivity for Casuarina equisetifolia in a Tropical Forest of Hainan Island, China

    PubMed Central

    Xue, Yang; Yang, Zhongyang; Wang, Xiaoyan; Lin, Zhipan; Li, Dunxi; Su, Shaofeng

    2016-01-01

    Casuarina equisetifolia is commonly planted and used in the construction of coastal shelterbelt protection in Hainan Island. Thus, it is critical to accurately estimate the tree biomass of Casuarina equisetifolia L. for forest managers to evaluate the biomass stock in Hainan. The data for this work consisted of 72 trees, which were divided into three age groups: young forest, middle-aged forest, and mature forest. The proportion of biomass from the trunk significantly increased with age (P<0.05). However, the biomass of the branch and leaf decreased, and the biomass of the root did not change. To test whether the crown radius (CR) can improve biomass estimates of C. equisetifolia, we introduced CR into the biomass models. Here, six models were used to estimate the biomass of each component, including the trunk, the branch, the leaf, and the root. In each group, we selected one model among these six models for each component. The results showed that including the CR greatly improved the model performance and reduced the error, especially for the young and mature forests. In addition, to ensure biomass additivity, the selected equation for each component was fitted as a system of equations using seemingly unrelated regression (SUR). The SUR method not only gave efficient and accurate estimates but also achieved the logical additivity. The results in this study provide a robust estimation of tree biomass components and total biomass over three groups of C. equisetifolia. PMID:27002822

  15. Spatial Assessment of Model Errors from Four Regression Techniques

    Treesearch

    Lianjun Zhang; Jeffrey H. Gove; Jeffrey H. Gove

    2005-01-01

    Forest modelers have attempted to account for the spatial autocorrelations among trees in growth and yield models by applying alternative regression techniques such as linear mixed models (LMM), generalized additive models (GAM), and geographically weighted regression (GWR). However, the model errors are commonly assessed using average errors across the entire study...

  16. Regionalization of meso-scale physically based nitrogen modeling outputs to the macro-scale by the use of regression trees

    NASA Astrophysics Data System (ADS)

    Künne, A.; Fink, M.; Kipka, H.; Krause, P.; Flügel, W.-A.

    2012-06-01

    In this paper, a method is presented to estimate excess nitrogen on large scales considering single field processes. The approach was implemented by using the physically based model J2000-S to simulate the nitrogen balance as well as the hydrological dynamics within meso-scale test catchments. The model input data, the parameterization, the results and a detailed system understanding were used to generate the regression tree models with GUIDE (Loh, 2002). For each landscape type in the federal state of Thuringia a regression tree was calibrated and validated using the model data and results of excess nitrogen from the test catchments. Hydrological parameters such as precipitation and evapotranspiration were also used to predict excess nitrogen by the regression tree model. Hence they had to be calculated and regionalized as well for the state of Thuringia. Here the model J2000g was used to simulate the water balance on the macro scale. With the regression trees the excess nitrogen was regionalized for each landscape type of Thuringia. The approach allows calculating the potential nitrogen input into the streams of the drainage area. The results show that the applied methodology was able to transfer the detailed model results of the meso-scale catchments to the entire state of Thuringia with low computing time and without losing the detailed knowledge from the nitrogen transport modeling. This was validated with modeling results from Fink (2004) in a catchment lying in the regionalization area. The regionalized and modeled excess nitrogen show a correspondence of 94%. The study was conducted within the framework of a project in collaboration with the Thuringian Environmental Ministry, whose overall aim was to assess the effect of agro-environmental measures regarding load reduction in the water bodies of Thuringia to fulfill the requirements of the European Water Framework Directive (Bäse et al., 2007; Fink, 2006; Fink et al., 2007).

  17. Using the PDD Behavior Inventory as a Level 2 Screener: A Classification and Regression Trees Analysis

    ERIC Educational Resources Information Center

    Cohen, Ira L.; Liu, Xudong; Hudson, Melissa; Gillis, Jennifer; Cavalari, Rachel N. S.; Romanczyk, Raymond G.; Karmel, Bernard Z.; Gardner, Judith M.

    2016-01-01

    In order to improve discrimination accuracy between Autism Spectrum Disorder (ASD) and similar neurodevelopmental disorders, a data mining procedure, Classification and Regression Trees (CART), was used on a large multi-site sample of PDD Behavior Inventory (PDDBI) forms on children with and without ASD. Discrimination accuracy exceeded 80%,…

  18. Boosting structured additive quantile regression for longitudinal childhood obesity data.

    PubMed

    Fenske, Nora; Fahrmeir, Ludwig; Hothorn, Torsten; Rzehak, Peter; Höhle, Michael

    2013-07-25

    Childhood obesity and the investigation of its risk factors has become an important public health issue. Our work is based on and motivated by a German longitudinal study including 2,226 children with up to ten measurements on their body mass index (BMI) and risk factors from birth to the age of 10 years. We introduce boosting of structured additive quantile regression as a novel distribution-free approach for longitudinal quantile regression. The quantile-specific predictors of our model include conventional linear population effects, smooth nonlinear functional effects, varying-coefficient terms, and individual-specific effects, such as intercepts and slopes. Estimation is based on boosting, a computer intensive inference method for highly complex models. We propose a component-wise functional gradient descent boosting algorithm that allows for penalized estimation of the large variety of different effects, particularly leading to individual-specific effects shrunken toward zero. This concept allows us to flexibly estimate the nonlinear age curves of upper quantiles of the BMI distribution, both on population and on individual-specific level, adjusted for further risk factors and to detect age-varying effects of categorical risk factors. Our model approach can be regarded as the quantile regression analog of Gaussian additive mixed models (or structured additive mean regression models), and we compare both model classes with respect to our obesity data.
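
    Quantile boosting of the kind described here is usually built on the check (pinball) loss, whose negative gradient is what each boosting step fits. The sketch below shows only that loss and its negative gradient; it is not the authors' component-wise algorithm, and the function names and toy data are assumptions made for illustration.

```python
def pinball_loss(y, f, tau):
    """Check loss for quantile level tau at observation y and prediction f."""
    u = y - f
    return u * tau if u >= 0 else u * (tau - 1)


def negative_gradient(y, f, tau):
    """Negative derivative of the check loss with respect to the prediction f.
    At y == f the loss has a kink; 0.0 is used there as one valid subgradient."""
    if y > f:
        return tau
    if y < f:
        return tau - 1
    return 0.0


def mean_pinball_loss(ys, constant, tau):
    """Average check loss of a constant prediction over a sample."""
    return sum(pinball_loss(y, constant, tau) for y in ys) / len(ys)


# Toy sample: a constant near the 90% quantile should give the lowest loss.
ys = list(range(1, 11))
for c in (5, 8, 9):
    print(c, round(mean_pinball_loss(ys, c, tau=0.9), 3))
```

    Because the gradient takes only the values tau and tau - 1 away from the kink, under-predictions and over-predictions are penalized asymmetrically, which is what pushes the fitted curve toward the chosen quantile rather than the mean.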

  19. Boosted Regression Tree Models to Explain Watershed ...

    EPA Pesticide Factsheets

    Boosted regression tree (BRT) models were developed to quantify the nonlinear relationships between landscape variables and nutrient concentrations in a mesoscale mixed land cover watershed during base-flow conditions. Factors that affect instream biological components, based on the Index of Biotic Integrity (IBI), were also analyzed. Seasonal BRT models at two spatial scales (watershed and riparian buffered area [RBA]) for nitrite-nitrate (NO2-NO3), total Kjeldahl nitrogen, and total phosphorus (TP) and annual models for the IBI score were developed. Two primary factors — location within the watershed (i.e., geographic position, stream order, and distance to a downstream confluence) and percentage of urban land cover (both scales) — emerged as important predictor variables. Latitude and longitude interacted with other factors to explain the variability in summer NO2-NO3 concentrations and IBI scores. BRT results also suggested that location might be associated with indicators of sources (e.g., land cover), runoff potential (e.g., soil and topographic factors), and processes not easily represented by spatial data indicators. Runoff indicators (e.g., Hydrological Soil Group D and Topographic Wetness Indices) explained a substantial portion of the variability in nutrient concentrations as did point sources for TP in the summer months. The results from our BRT approach can help prioritize areas for nutrient management in mixed-use and heavily impacted watershed

  20. Identification of Sexually Abused Female Adolescents at Risk for Suicidal Ideations: A Classification and Regression Tree Analysis

    ERIC Educational Resources Information Center

    Brabant, Marie-Eve; Hebert, Martine; Chagnon, Francois

    2013-01-01

    This study explored the clinical profiles of 77 female teenager survivors of sexual abuse and examined the association of abuse-related and personal variables with suicidal ideations. Analyses revealed that 64% of participants experienced suicidal ideations. Findings from classification and regression tree analysis indicated that depression,…

  1. New machine learning tools for predictive vegetation mapping after climate change: Bagging and Random Forest perform better than Regression Tree Analysis

    Treesearch

    L.R. Iverson; A.M. Prasad; A. Liaw

    2004-01-01

    More and better machine learning tools are becoming available for landscape ecologists to aid in understanding species-environment relationships and to map probable species occurrence now and potentially into the future. To that end, we evaluated three statistical models: Regression Tree Analysis (RTA), Bagging Trees (BT) and Random Forest (RF) for their utility in...

  2. Classification and regression tree (CART) analysis of endometrial carcinoma: Seeing the forest for the trees.

    PubMed

    Barlin, Joyce N; Zhou, Qin; St Clair, Caryn M; Iasonos, Alexia; Soslow, Robert A; Alektiar, Kaled M; Hensley, Martee L; Leitao, Mario M; Barakat, Richard R; Abu-Rustum, Nadeem R

    2013-09-01

    The objectives of the study are to evaluate which clinicopathologic factors influenced overall survival (OS) in endometrial carcinoma and to determine if the surgical effort to assess para-aortic (PA) lymph nodes (LNs) at initial staging surgery impacts OS. All patients diagnosed with endometrial cancer from 1/1993-12/2011 who had LNs excised were included. PALN assessment was defined by the identification of one or more PALNs on final pathology. A multivariate analysis was performed to assess the effect of PALNs on OS. A form of recursive partitioning called classification and regression tree (CART) analysis was implemented. Variables included: age, stage, tumor subtype, grade, myometrial invasion, total LNs removed, evaluation of PALNs, and adjuvant chemotherapy. The cohort included 1920 patients, with a median age of 62 years. The median number of LNs removed was 16 (range, 1-99). The removal of PALNs was not associated with OS (P=0.450). Using the CART hierarchically, stage I vs. stages II-IV and grades 1-2 vs. grade 3 emerged as predictors of OS. If the tree was allowed to grow, further branching was based on age and myometrial invasion. Total number of LNs removed and assessment of PALNs as defined in this study were not predictive of OS. This innovative CART analysis emphasized the importance of proper stage assignment and a binary grading system in impacting OS. Notably, the total number of LNs removed and specific evaluation of PALNs as defined in this study were not important predictors of OS. Copyright © 2013 Elsevier Inc. All rights reserved.

  3. Downscaling soil moisture over East Asia through multi-sensor data fusion and optimization of regression trees

    NASA Astrophysics Data System (ADS)

    Park, Seonyoung; Im, Jungho; Park, Sumin; Rhee, Jinyoung

    2017-04-01

    Soil moisture is one of the most important keys for understanding regional and global climate systems. Soil moisture is directly related to agricultural processes as well as hydrological processes because soil moisture highly influences vegetation growth and determines water supply in the agroecosystem. Accurate monitoring of the spatiotemporal pattern of soil moisture is important. Soil moisture has been generally provided through in situ measurements at stations. Although field survey from in situ measurements provides accurate soil moisture with high temporal resolution, it requires high cost and does not provide the spatial distribution of soil moisture over large areas. Microwave satellite (e.g., advanced Microwave Scanning Radiometer on the Earth Observing System (AMSR2), the Advanced Scatterometer (ASCAT), and Soil Moisture Active Passive (SMAP)) -based approaches and numerical models such as Global Land Data Assimilation System (GLDAS) and Modern-Era Retrospective Analysis for Research and Applications (MERRA) provide spatiotemporally continuous soil moisture products at global scale. However, since those global soil moisture products have coarse spatial resolution (25-40 km), their applications for agriculture and water resources at local and regional scales are very limited. Thus, soil moisture downscaling is needed to overcome the limitation of the spatial resolution of soil moisture products. In this study, GLDAS soil moisture data were downscaled up to 1 km spatial resolution through the integration of AMSR2 and ASCAT soil moisture data, Shuttle Radar Topography Mission (SRTM) Digital Elevation Model (DEM), and Moderate Resolution Imaging Spectroradiometer (MODIS) data—Land Surface Temperature, Normalized Difference Vegetation Index, and Land cover—using modified regression trees over East Asia from 2013 to 2015. Modified regression trees were implemented using Cubist, a commercial software tool based on machine learning. An

  4. Quantile regression via vector generalized additive models.

    PubMed

    Yee, Thomas W

    2004-07-30

    One of the most popular methods for quantile regression is the LMS method of Cole and Green. The method naturally falls within a penalized likelihood framework, and consequently allows for considerable flexibility because all three parameters may be modelled by cubic smoothing splines. The model is also very understandable: for a given value of the covariate, the LMS method applies a Box-Cox transformation to the response in order to transform it to standard normality; to obtain the quantiles, an inverse Box-Cox transformation is applied to the quantiles of the standard normal distribution. The purposes of this article are three-fold. Firstly, LMS quantile regression is presented within the framework of the class of vector generalized additive models. This confers a number of advantages such as a unifying theory and estimation process. Secondly, a new LMS method based on the Yeo-Johnson transformation is proposed, which has the advantage that the response is not restricted to be positive. Lastly, this paper describes a software implementation of three LMS quantile regression methods in the S language. This includes the LMS-Yeo-Johnson method, which is estimated efficiently by a new numerical integration scheme. The LMS-Yeo-Johnson method is illustrated by way of a large cross-sectional data set from a New Zealand working population. Copyright 2004 John Wiley & Sons, Ltd.
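
    The forward and inverse Box-Cox steps of the LMS method can be written out as a small sketch. It assumes the usual LMS parameterization with three values at a given covariate level (lam for the Box-Cox power, mu for the median, sigma for the coefficient of variation); the numeric values are hypothetical, and the smoothing-spline estimation of the three curves is not shown.

```python
import math
from statistics import NormalDist


def lms_z_score(y, lam, mu, sigma):
    """Box-Cox transform of a positive response y to a standard normal score."""
    if lam == 0:
        return math.log(y / mu) / sigma
    return ((y / mu) ** lam - 1) / (lam * sigma)


def lms_quantile(alpha, lam, mu, sigma):
    """Inverse Box-Cox transform of the standard normal alpha-quantile."""
    z = NormalDist().inv_cdf(alpha)
    if lam == 0:
        return mu * math.exp(sigma * z)
    return mu * (1 + lam * sigma * z) ** (1 / lam)


# Hypothetical LMS values at one covariate level (illustrative only).
lam, mu, sigma = -1.0, 22.0, 0.12
for alpha in (0.1, 0.5, 0.9):
    print(alpha, round(lms_quantile(alpha, lam, mu, sigma), 2))
```

    With a negative power such as the one assumed here, the upper quantiles sit further from the median than the lower ones, which is how the method accommodates right-skewed responses.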

  5. What Satisfies Students? Mining Student-Opinion Data with Regression and Decision-Tree Analysis. AIR 2002 Forum Paper.

    ERIC Educational Resources Information Center

    Thomas, Emily H.; Galambos, Nora

    To investigate how students' characteristics and experiences affect satisfaction, this study used regression and decision-tree analysis with the CHAID algorithm to analyze student opinion data from a sample of 1,783 college students. A data-mining approach identifies the specific aspects of students' university experience that most influence three…

  6. Regression models for estimating leaf area of seedlings and adult individuals of Neotropical rainforest tree species.

    PubMed

    Brito-Rocha, E; Schilling, A C; Dos Anjos, L; Piotto, D; Dalmolin, A C; Mielke, M S

    2016-01-01

    Individual leaf area (LA) is a key variable in studies of tree ecophysiology because it directly influences light interception, photosynthesis and evapotranspiration of adult trees and seedlings. We analyzed the leaf dimensions (length - L and width - W) of seedlings and adults of seven Neotropical rainforest tree species (Brosimum rubescens, Manilkara maxima, Pouteria caimito, Pouteria torta, Psidium cattleyanum, Symphonia globulifera and Tabebuia stenocalyx) with the objective to test the feasibility of single regression models to estimate LA of both adults and seedlings. In southern Bahia, Brazil, a first set of data was collected between March and October 2012. From the seven species analyzed, only two (P. cattleyanum and T. stenocalyx) had very similar relationships between LW and LA in both ontogenetic stages. For these two species, a second set of data was collected in August 2014, in order to validate the single models encompassing adult and seedlings. Our results show the possibility of development of models for predicting individual leaf area encompassing different ontogenetic stages for tropical tree species. The development of these models was more dependent on the species than the differences in leaf size between seedlings and adults.

  7. Predicting tree species presence and basal area in Utah: A comparison of stochastic gradient boosting, generalized additive models, and tree-based methods

    Treesearch

    Gretchen G. Moisen; Elizabeth A. Freeman; Jock A. Blackard; Tracey S. Frescino; Niklaus E. Zimmermann; Thomas C. Edwards

    2006-01-01

    Many efforts are underway to produce broad-scale forest attribute maps by modelling forest class and structure variables collected in forest inventories as functions of satellite-based and biophysical information. Typically, variants of classification and regression trees implemented in Rulequest's© See5 and Cubist (for binary and continuous responses,...

  8. [Application of regression tree in analyzing the effects of climate factors on NDVI in loess hilly area of Shaanxi Province].

    PubMed

    Liu, Yang; Lü, Yi-he; Zheng, Hai-feng; Chen, Li-ding

    2010-05-01

    Based on the 10-day SPOT VEGETATION NDVI data and the daily meteorological data from 1998 to 2007 in Yan'an City, the main meteorological variables affecting the annual and interannual variations of NDVI were determined by using regression tree. It was found that the effects of test meteorological variables on the variability of NDVI differed with seasons and time lags. Temperature and precipitation were the most important meteorological variables affecting the annual variation of NDVI, and the average highest temperature was the most important meteorological variable affecting the inter-annual variation of NDVI. Regression tree was very powerful in determining the key meteorological variables affecting NDVI variation, but could not build quantitative relations between NDVI and meteorological variables, which limited its further and wider application.

  9. Foot and hip contributions to high frontal plane knee projection angle in athletes: a classification and regression tree approach.

    PubMed

    Bittencourt, Natalia F N; Ocarino, Juliana M; Mendonça, Luciana D M; Hewett, Timothy E; Fonseca, Sergio T

    2012-12-01

    Cross-sectional. To investigate predictors of increased frontal plane knee projection angle (FPKPA) in athletes. The underlying mechanisms that lead to increased FPKPA are likely multifactorial and depend on how the musculoskeletal system adapts to the possible interactions between its distal and proximal segments. Bivariate and linear analyses traditionally employed to analyze the occurrence of increased FPKPA are not sufficiently robust to capture complex relationships among predictors. The investigation of nonlinear interactions among biomechanical factors is necessary to further our understanding of the interdependence of lower-limb segments and resultant dynamic knee alignment. The FPKPA was assessed in 101 athletes during a single-leg squat and in 72 athletes at the moment of landing from a jump. The investigated predictors were sex, hip abductor isometric torque, passive range of motion (ROM) of hip internal rotation (IR), and shank-forefoot alignment. Classification and regression trees were used to investigate nonlinear interactions among predictors and their influence on the occurrence of increased FPKPA. During single-leg squatting, the occurrence of high FPKPA was predicted by the interaction between hip abductor isometric torque and passive hip IR ROM. At the moment of landing, the shank-forefoot alignment, abductor isometric torque, and passive hip IR ROM were predictors of high FPKPA. In addition, the classification and regression trees established cutoff points that could be used in clinical practice to identify athletes who are at potential risk for excessive FPKPA. The models captured nonlinear interactions between hip abductor isometric torque, passive hip IR ROM, and shank-forefoot alignment.
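
    The way a classification tree arrives at a cutoff point on a single predictor can be sketched as a search over candidate thresholds that minimizes weighted Gini impurity. This is a generic illustration of the CART splitting idea, not the study's analysis; the torque values and outcome labels below are invented.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_cutoff(values, labels):
    """Return (cutoff, weighted impurity) for the best single split."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, float("inf"))
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate identical values
        cut = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (cut, score)
    return best


# Invented data: a strength measure and whether a high angle was observed (1/0).
torque = [10, 12, 14, 20, 22, 24]
high_angle = [1, 1, 1, 0, 0, 0]
print(best_cutoff(torque, high_angle))
```

    A full tree repeats this search over every predictor and then recurses into each resulting subgroup, which is how interactions between predictors appear as nested cutoffs.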

  10. Application of classification tree and logistic regression for the management and health intervention plans in a community-based study.

    PubMed

    Teng, Ju-Hsi; Lin, Kuan-Chia; Ho, Bin-Shenq

    2007-10-01

    A community-based aboriginal study was conducted and analysed to explore the application of classification tree and logistic regression. A total of 1066 aboriginal residents in Yilan County were screened during 2003-2004. The independent variables include demographic characteristics, physical examinations, geographic location, health behaviours, dietary habits and family hereditary diseases history. Risk factors of cardiovascular diseases were selected as the dependent variables in further analysis. The completion rate for health interview is 88.9%. The classification tree results find that if body mass index is higher than 25.72 kg/m² and the age is above 51 years, the predicted probability for number of cardiovascular risk factors ≥3 is 73.6% and the population is 322. If body mass index is higher than 26.35 kg/m² and geographical latitude of the village is lower than 24°22.8', the predicted probability for number of cardiovascular risk factors ≥4 is 60.8% and the population is 74. The logistic regression results indicate that body mass index, drinking habit and menopause are the top three significant independent variables. The classification tree model specifically shows the discrimination paths and interactions between the risk groups. The logistic regression model presents and analyses the statistical independent factors of cardiovascular risks. Applying both models to specific situations will provide a different angle for the design and management of future health intervention plans after community-based study.

  11. Classification and regression tree analysis of acute-on-chronic hepatitis B liver failure: Seeing the forest for the trees.

    PubMed

    Shi, K-Q; Zhou, Y-Y; Yan, H-D; Li, H; Wu, F-L; Xie, Y-Y; Braddock, M; Lin, X-Y; Zheng, M-H

    2017-02-01

    At present, there is no ideal model for predicting the short-term outcome of patients with acute-on-chronic hepatitis B liver failure (ACHBLF). This study aimed to establish and validate a prognostic model by using the classification and regression tree (CART) analysis. A total of 1047 patients from two separate medical centres with suspected ACHBLF were screened in the study, which were recognized as derivation cohort and validation cohort, respectively. CART analysis was applied to predict the 3-month mortality of patients with ACHBLF. The accuracy of the CART model was tested using the area under the receiver operating characteristic curve, which was compared with the model for end-stage liver disease (MELD) score and a new logistic regression model. CART analysis identified four variables as prognostic factors of ACHBLF: total bilirubin, age, serum sodium and INR, and three distinct risk groups: low risk (4.2%), intermediate risk (30.2%-53.2%) and high risk (81.4%-96.9%). The new logistic regression model was constructed with four independent factors, including age, total bilirubin, serum sodium and prothrombin activity by multivariate logistic regression analysis. The performances of the CART model (0.896), similar to the logistic regression model (0.914, P=.382), exceeded that of MELD score (0.667, P<.001). The results were confirmed in the validation cohort. We have developed and validated a novel CART model superior to MELD for predicting three-month mortality of patients with ACHBLF. Thus, the CART model could facilitate medical decision-making and provide clinicians with a validated practical bedside tool for ACHBLF risk stratification. © 2016 John Wiley & Sons Ltd.
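
    The area under the receiver operating characteristic curve used to compare the models can be computed directly from predicted risks and observed outcomes. The sketch below uses the pairwise-comparison definition (the probability that a randomly chosen death receives a higher predicted risk than a randomly chosen survivor, with ties counted as one half); the scores and outcomes are invented and do not reproduce the reported values.

```python
def auc(scores, outcomes):
    """Area under the ROC curve from predicted scores and 0/1 outcomes."""
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties count as half a concordant pair
    return wins / (len(pos) * len(neg))


# Invented predicted 3-month mortality risks and observed outcomes (1 = died).
risk = [0.9, 0.8, 0.3, 0.2]
died = [1, 0, 1, 0]
print(auc(risk, died))
```

    The double loop is quadratic in the sample size, which is fine for a sketch; rank-based formulas give the same value more efficiently on large cohorts.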

  12. Estimating interaction on an additive scale between continuous determinants in a logistic regression model.

    PubMed

    Knol, Mirjam J; van der Tweel, Ingeborg; Grobbee, Diederick E; Numans, Mattijs E; Geerlings, Mirjam I

    2007-10-01

    To determine the presence of interaction in epidemiologic research, typically a product term is added to the regression model. In linear regression, the regression coefficient of the product term reflects interaction as departure from additivity. However, in logistic regression it refers to interaction as departure from multiplicativity. Rothman has argued that interaction estimated as departure from additivity better reflects biologic interaction. So far, literature on estimating interaction on an additive scale using logistic regression only focused on dichotomous determinants. The objective of the present study was to provide the methods to estimate interaction between continuous determinants and to illustrate these methods with a clinical example. From the existing literature we derived the formulas to quantify interaction as departure from additivity between one continuous and one dichotomous determinant and between two continuous determinants using logistic regression. Bootstrapping was used to calculate the corresponding confidence intervals. To illustrate the theory with an empirical example, data from the Utrecht Health Project were used, with age and body mass index as risk factors for elevated diastolic blood pressure. The methods and formulas presented in this article are intended to assist epidemiologists to calculate interaction on an additive scale between two variables on a certain outcome. The proposed methods are included in a spreadsheet which is freely available at: http://www.juliuscenter.nl/additive-interaction.xls.
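
    One common measure of departure from additivity is the relative excess risk due to interaction (RERI), which can be computed from the coefficients of a logistic model with a product term. The sketch below derives the odds ratios directly from such a model for chosen reference and comparison values; it is an illustration under that assumption, not necessarily the exact formulas of the article, and all coefficient values are hypothetical. The bootstrap confidence interval is not shown.

```python
import math


def odds_ratio(b1, b2, b3, x1, x2, ref1, ref2):
    """Odds ratio of (x1, x2) versus (ref1, ref2) in a logistic model
    logit(p) = b0 + b1*x1 + b2*x2 + b3*x1*x2."""
    return math.exp(
        b1 * (x1 - ref1) + b2 * (x2 - ref2) + b3 * (x1 * x2 - ref1 * ref2)
    )


def reri(b1, b2, b3, x1, x2, ref1=0.0, ref2=0.0):
    """Relative excess risk due to interaction: OR11 - OR10 - OR01 + 1."""
    or11 = odds_ratio(b1, b2, b3, x1, x2, ref1, ref2)
    or10 = odds_ratio(b1, b2, b3, x1, ref2, ref1, ref2)
    or01 = odds_ratio(b1, b2, b3, ref1, x2, ref1, ref2)
    return or11 - or10 - or01 + 1


# Hypothetical coefficients for two continuous determinants (not from the study).
b_age, b_bmi, b_product = 0.03, 0.08, 0.001
# Compare age 60 with 50 and body mass index 30 with 25.
print(round(reri(b_age, b_bmi, b_product, 60, 30, ref1=50, ref2=25), 3))
```

    With continuous determinants the result depends on the reference values and the size of the increments, so these should be reported alongside the estimate.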

  13. Incorporating additional tree and environmental variables in a lodgepole pine stem profile model

    Treesearch

    John C. Byrne

    1993-01-01

    A new variable-form segmented stem profile model is developed for lodgepole pine (Pinus contorta) trees from the northern Rocky Mountains of the United States. I improved estimates of stem diameter by predicting two of the model coefficients with linear equations using a measure of tree form, defined as a ratio of dbh and total height. Additional improvements were...

  14. Estimating tree biomass regressions and their error, proceedings of the workshop on tree biomass regression functions and their contribution to the error

    Treesearch

    Eric H. Wharton; Tiberius Cunia

    1987-01-01

    Proceedings of a workshop co-sponsored by the USDA Forest Service, the State University of New York, and the Society of American Foresters. Presented were papers on the methodology of sample tree selection, tree biomass measurement, construction of biomass tables and estimation of their error, and combining the error of biomass tables with that of the sample plots or...

  15. Structured functional additive regression in reproducing kernel Hilbert spaces.

    PubMed

    Zhu, Hongxiao; Yao, Fang; Zhang, Hao Helen

    2014-06-01

    Functional additive models (FAMs) provide a flexible yet simple framework for regressions involving functional predictors. The utilization of data-driven basis in an additive rather than linear structure naturally extends the classical functional linear model. However, the critical issue of selecting nonlinear additive components has been less studied. In this work, we propose a new regularization framework for the structure estimation in the context of Reproducing Kernel Hilbert Spaces. The proposed approach takes advantage of the functional principal components which greatly facilitates the implementation and the theoretical analysis. The selection and estimation are achieved by penalized least squares using a penalty which encourages the sparse structure of the additive components. Theoretical properties such as the rate of convergence are investigated. The empirical performance is demonstrated through simulation studies and a real data application.

  16. [A strategy for assessing environmental influence on airway allergy using a regression binary tree-based method].

    PubMed

    Yoshioka, Fumi; Azuma, Emiko; Nakajima, Takae; Hashimoto, Masafumi; Toyoshima, Kyoichiro; Komachi, Yoshio

    2004-08-01

    To clarify the living environment factors that increase the risk of allergic sensitization to house dust mites, we applied a regression binary tree-based method (CART, Classification & Regression Trees) to an epidemiological study on airway allergy. The utility of the tree map in personal sanitary guidance for preventing allergic sensitization was examined with respect to feasibility and validity. A questionnaire was given to 386 healthy adult women, asking them about their individual living environments. Also, blood samples were collected to measure Dermatophagoides pteronyssinus (Dp)-specific IgE, the presence/absence of Dp-sensitization being expressed as positive/negative. The questionnaire consisted of nine items on (1) home ventilation by keeping windows open, (2) personal or family smoking habits, (3) use of air conditioners in hot weather, (4) type of flooring (tatami/wooden/carpet) in the living room, (5) visible mold proliferation in the kitchen, (6) type of housing (concrete/wooden), (7) residential area (heavy or light traffic area), (8) heating system (use of unventilated combustion appliances), and (9) frequency of cleaning (every day or less often). There also were queries on the past history of airway allergic diseases, such as bronchial asthma and allergic rhinitis. CART and a multivariate logistic regression analysis (MLRA) were performed. The subjects were first classified into two groups, with and without a history of airway allergic diseases (Groups WPH and WOPH). In each group, the involvement of living environment factors in Dp-sensitization was examined using CART and MLRA. In the MLRA study, individual living environment factors showed promotional or suppressive effects on Dp-sensitization with differences between the two groups. With respect to the CART results, the two groups were first split by the factor that had the most significant odds ratio for MLRA. In Group WPH, which had a Dp-sensitization risk of 19.5%, the first split was by the...

  17. Structured functional additive regression in reproducing kernel Hilbert spaces

    PubMed Central

    Zhu, Hongxiao; Yao, Fang; Zhang, Hao Helen

    2013-01-01

    Summary: Functional additive models (FAMs) provide a flexible yet simple framework for regressions involving functional predictors. The utilization of data-driven basis in an additive rather than linear structure naturally extends the classical functional linear model. However, the critical issue of selecting nonlinear additive components has been less studied. In this work, we propose a new regularization framework for the structure estimation in the context of Reproducing Kernel Hilbert Spaces. The proposed approach takes advantage of the functional principal components which greatly facilitates the implementation and the theoretical analysis. The selection and estimation are achieved by penalized least squares using a penalty which encourages the sparse structure of the additive components. Theoretical properties such as the rate of convergence are investigated. The empirical performance is demonstrated through simulation studies and a real data application. PMID:25013362

  18. A cross-sectional study for predicting tail biting risk in pig farms using classification and regression tree analysis.

    PubMed

    Scollo, Annalisa; Gottardo, Flaviana; Contiero, Barbara; Edwards, Sandra A

    2017-10-01

    Tail biting in pigs has been an identified behavioural, welfare and economic problem for decades, and requires appropriate but sometimes difficult on-farm interventions. The aim of the paper is to introduce the Classification and Regression Tree (CRT) methodologies to develop a tool for prevention of acute tail biting lesions in pigs on-farm. A sample of 60 commercial farms rearing heavy pigs were involved; an on-farm visit and an interview with the farmer collected data on general management, herd health, disease prevention, climate control, feeding and production traits. Results suggest a value for the CRT analysis in managing the risk factors behind tail biting on a farm-specific level, showing 86.7% sensitivity for the Classification Tree and a correlation of 0.7 between observed and predicted prevalence of tail biting obtained with the Regression Tree. CRT analysis showed five main variables (stocking density, ammonia levels, number of pigs per stockman, type of floor and timeliness in feed supply) as critical predictors of acute tail biting lesions, which demonstrate different importance in different farms subgroups. The model might have reliable and practical applications for the support and implementation of tail biting prevention interventions, especially in case of subgroups of pigs with higher risk, helping farmers and veterinarians to assess the risk in their own farm and to manage their predisposing variables in order to reduce acute tail biting lesions. Copyright © 2017 Elsevier B.V. All rights reserved.

  19. Regression modeling and mapping of coniferous forest basal area and tree density from discrete-return lidar and multispectral data

    Treesearch

    Andrew T. Hudak; Nicholas L. Crookston; Jeffrey S. Evans; Michael K. Falkowski; Alistair M. S. Smith; Paul E. Gessler; Penelope Morgan

    2006-01-01

    We compared the utility of discrete-return light detection and ranging (lidar) data and multispectral satellite imagery, and their integration, for modeling and mapping basal area and tree density across two diverse coniferous forest landscapes in north-central Idaho. We applied multiple linear regression models subset from a suite of 26 predictor variables derived...

  20. Grassland and cropland net ecosystem production of the U.S. Great Plains: Regression tree model development and comparative analysis

    USGS Publications Warehouse

    Wylie, Bruce K.; Howard, Daniel; Dahal, Devendra; Gilmanov, Tagir; Ji, Lei; Zhang, Li; Smith, Kelcy

    2016-01-01

    This paper presents the methodology and results of two ecological-based net ecosystem production (NEP) regression tree models capable of upscaling measurements made at various flux tower sites throughout the U.S. Great Plains. Separate grassland and cropland NEP regression tree models were trained using various remote sensing data and other biogeophysical data, along with 15 flux towers contributing to the grassland model and 15 flux towers for the cropland model. The models yielded weekly mean daily grassland and cropland NEP maps of the U.S. Great Plains at 250 m resolution for 2000–2008. The grassland and cropland NEP maps were spatially summarized and statistically compared. The results of this study indicate that grassland and cropland ecosystems generally performed as weak net carbon (C) sinks, absorbing more C from the atmosphere than they released from 2000 to 2008. Grasslands demonstrated higher carbon sink potential (139 g C·m−2·year−1) than non-irrigated croplands. A closer look into the weekly time series reveals the C fluctuation through time and space for each land cover type.

  1. Which sociodemographic factors are important on smoking behaviour of high school students? The contribution of classification and regression tree methodology in a broad epidemiological survey

    PubMed Central

    Özge, C; Toros, F; Bayramkaya, E; Çamdeviren, H; Şaşmaz, T

    2006-01-01

    Background: The purpose of this study is to evaluate the most important sociodemographic factors on smoking status of high school students using a broad randomised epidemiological survey. Methods: Using in-class, self administered questionnaire about their sociodemographic variables and smoking behaviour, a representative sample of total 3304 students of preparatory, 9th, 10th, and 11th grades, from 22 randomly selected schools of Mersin, were evaluated and discriminative factors have been determined using appropriate statistics. In addition to binary logistic regression analysis, the study evaluated combined effects of these factors using classification and regression tree methodology, as a new statistical method. Results: The data showed that 38% of the students reported lifetime smoking and 16.9% of them reported current smoking with a male predominancy and increasing prevalence by age. Second hand smoking was reported at a 74.3% frequency with father predominance (56.6%). The significantly important factors that affect current smoking in these age groups were increased by household size, late birth rank, certain school types, low academic performance, increased second hand smoking, and stress (especially reported as separation from a close friend or because of violence at home). Classification and regression tree methodology showed the importance of some neglected sociodemographic factors with a good classification capacity. Conclusions: It was concluded that, as closely related with sociocultural factors, smoking was a common problem in this young population, generating important academic and social burden in youth life and with increasing data about this behaviour and using new statistical methods, effective coping strategies could be composed. PMID:16891446

  2. Which sociodemographic factors are important on smoking behaviour of high school students? The contribution of classification and regression tree methodology in a broad epidemiological survey.

    PubMed

    Ozge, C; Toros, F; Bayramkaya, E; Camdeviren, H; Sasmaz, T

    2006-08-01

    The purpose of this study is to evaluate the most important sociodemographic factors on smoking status of high school students using a broad randomised epidemiological survey. Using in-class, self administered questionnaire about their sociodemographic variables and smoking behaviour, a representative sample of total 3304 students of preparatory, 9th, 10th, and 11th grades, from 22 randomly selected schools of Mersin, were evaluated and discriminative factors have been determined using appropriate statistics. In addition to binary logistic regression analysis, the study evaluated combined effects of these factors using classification and regression tree methodology, as a new statistical method. The data showed that 38% of the students reported lifetime smoking and 16.9% of them reported current smoking with a male predominancy and increasing prevalence by age. Second hand smoking was reported at a 74.3% frequency with father predominance (56.6%). The significantly important factors that affect current smoking in these age groups were increased by household size, late birth rank, certain school types, low academic performance, increased second hand smoking, and stress (especially reported as separation from a close friend or because of violence at home). Classification and regression tree methodology showed the importance of some neglected sociodemographic factors with a good classification capacity. It was concluded that, as closely related with sociocultural factors, smoking was a common problem in this young population, generating important academic and social burden in youth life and with increasing data about this behaviour and using new statistical methods, effective coping strategies could be composed.

  3. Regression tree modeling of forest NPP using site conditions and climate variables across eastern USA

    NASA Astrophysics Data System (ADS)

    Kwon, Y.

    2013-12-01

    As evidence of global warming continues to increase, being able to predict forest response to climate changes, such as expected rise of temperature and precipitation, will be vital for maintaining the sustainability and productivity of forests. Mapping forest species redistribution under climate change scenarios has been successful; however, most species redistribution maps lack the mechanistic understanding needed to explain why trees grow under the novel conditions of a changing climate. A distributional map is only capable of predicting under the equilibrium assumption that the communities would exist following a prolonged period under the new climate. In this context, forest NPP as a surrogate for growth rate, the most important facet that determines stand dynamics, can lead to valid prediction on the transition stage to new vegetation-climate equilibrium as it represents changes in structure of forest reflecting site conditions and climate factors. The objective of this study is to develop a forest growth map using regression tree analysis by extracting large-scale non-linear structures from both field-based FIA and remotely sensed MODIS data sets. The major issue addressed in this approach is non-linear spatial patterns of forest attributes. Forest inventory data showed complex spatial patterns that reflect environmental states and processes that originate at different spatial scales. At broad scales, non-linear spatial trends in forest attributes and a mixture of continuous and discrete types of environmental variables make traditional statistical (multivariate regression) and geostatistical (kriging) models inefficient. This calls into question some traditional underlying assumptions of spatial trends that are uncritically accepted in forest data. To solve the controversy surrounding the suitability of forest data, regression tree analyses are performed using the software See5 and Cubist. Four publicly available data sets were obtained: First, field-based Forest Inventory and Analysis (USDA...

  4. Differential Diagnosis of Erythmato-Squamous Diseases Using Classification and Regression Tree.

    PubMed

    Maghooli, Keivan; Langarizadeh, Mostafa; Shahmoradi, Leila; Habibi-Koolaee, Mahdi; Jebraeily, Mohamad; Bouraghi, Hamid

    2016-10-01

    Differential diagnosis of Erythmato-Squamous Diseases (ESD) is a major challenge in the field of dermatology. The ESD diseases are placed into six different classes. Data mining is the process for detection of hidden patterns. In the case of ESD, data mining helps us to predict the diseases. Different algorithms were developed for this purpose. We aimed to use the Classification and Regression Tree (CART) to predict differential diagnosis of ESD. We used the Cross Industry Standard Process for Data Mining (CRISP-DM) methodology. For this purpose, the dermatology data set from the UCI machine learning repository was obtained. The Clementine 12.0 software from IBM Company was used for modelling. To evaluate the model, we calculated its accuracy, sensitivity and specificity. The proposed model had an accuracy of 94.84% (standard deviation: 24.42) in correctly predicting the ESD disease. Results indicated that using this classifier could be useful. However, it is strongly recommended that a combination of machine learning methods could be more useful in terms of prediction of ESD.

  5. Assessing College Student Interest in Math and/or Computer Science in a Cross-National Sample Using Classification and Regression Trees

    ERIC Educational Resources Information Center

    Kitsantas, Anastasia; Kitsantas, Panagiota; Kitsantas, Thomas

    2012-01-01

    The purpose of this exploratory study was to assess the relative importance of a number of variables in predicting students' interest in math and/or computer science. Classification and regression trees (CART) were employed in the analysis of survey data collected from 276 college students enrolled in two U.S. and Greek universities. The results…

  6. Real-time quality monitoring in debutanizer column with regression tree and ANFIS

    NASA Astrophysics Data System (ADS)

    Siddharth, Kumar; Pathak, Amey; Pani, Ajaya Kumar

    2018-05-01

    A debutanizer column is an integral part of any petroleum refinery. Online composition monitoring of debutanizer column outlet streams is highly desirable in order to maximize the production of liquefied petroleum gas. In this article, data-driven models for the debutanizer column are developed for real-time composition monitoring. The dataset used has seven process variables as inputs and the output is the butane concentration in the debutanizer column bottom product. The input-output dataset is divided equally into a training (calibration) set and a validation (testing) set. The training set data were used to develop fuzzy inference, adaptive neuro fuzzy (ANFIS) and regression tree models for the debutanizer column. The accuracy of the developed models was evaluated by simulation of the models with the validation dataset. It is observed that the ANFIS model has better estimation accuracy than other models developed in this work and many data-driven models proposed so far in the literature for the debutanizer column.

  7. The use of regression tree analysis for predicting the functional outcome following traumatic spinal cord injury.

    PubMed

    Facchinello, Yann; Beauséjour, Marie; Richard-Denis, Andreane; Thompson, Cynthia; Mac-Thiong, Jean-Marc

    2017-10-25

    Predicting the long-term functional outcome following traumatic spinal cord injury is needed to adapt medical strategies and to plan an optimized rehabilitation. This study investigates the use of regression trees for the development of predictive models based on acute clinical and demographic predictors. This prospective study was performed on 172 patients hospitalized following traumatic spinal cord injury. Functional outcome was quantified using the Spinal Cord Independence Measure collected within the first-year post injury. Age, delay prior to surgery and Injury Severity Score were considered as continuous predictors while energy of injury, trauma mechanisms, neurological level of injury, injury severity, occurrence of early spasticity, urinary tract infection, pressure ulcer and pneumonia were coded as categorical inputs. A simplified model was built using only injury severity, neurological level, energy and age as predictors and was compared to a more complex model considering all 11 predictors mentioned above. The models built using 4 and 11 predictors were found to explain 51.4% and 62.3% of the variance of the Spinal Cord Independence Measure total score after validation, respectively. The severity of the neurological deficit at admission was found to be the most important predictor. Other important predictors were the Injury Severity Score, age, neurological level and delay prior to surgery. Regression trees offer promising performances for predicting the functional outcome after a traumatic spinal cord injury. They could help to determine the number and type of predictors leading to a prediction model of the functional outcome that can be used clinically in the future.
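    The abstract above does not state how its regression trees choose splits. As a minimal sketch, assuming the usual CART-style least-squares criterion, the following searches one continuous predictor for the threshold that minimizes the summed squared error of the outcome in the two resulting groups. The function name and the toy data are hypothetical and unrelated to the study's data.

```python
def best_split(x, y):
    """Find the threshold on x that minimizes the within-group squared error of y.

    Returns (threshold, sse). The threshold is None when no split improves
    on leaving all observations in a single node.
    """
    def sse(values):
        # Sum of squared deviations from the group mean.
        if not values:
            return 0.0
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    pairs = sorted(zip(x, y))
    best = (None, sse(list(y)))
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate identical predictor values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [v for _, v in pairs[:i]]
        right = [v for _, v in pairs[i:]]
        total = sse(left) + sse(right)
        if total < best[1]:
            best = (threshold, total)
    return best


# Toy data: the outcome jumps between x = 3 and x = 10.
threshold, error = best_split([1, 2, 3, 10, 11, 12], [1.0, 1.0, 1.0, 5.0, 5.0, 5.0])
```

    A full tree applies this search recursively to each resulting group and across all predictors. The share of variance explained, as reported in the abstract, corresponds to one minus the ratio of the remaining squared error to the total squared error.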

  8. Additive or non-additive effect of mixing oak in pine stands on soil properties depends on the tree species in Mediterranean forests.

    PubMed

    Brunel, Caroline; Gros, Raphael; Ziarelli, Fabio; Farnet Da Silva, Anne Marie

    2017-07-15

    This study investigated how oak abundance in pine stands (using relative Oak Basal Area %, OBA%) may modulate soil microbial functioning. Forests were composed of sclerophyllous species i.e. Quercus ilex mixed with Pinus halepensis Miller or of Q. pubescens mixed with P. sylvestris. We used a series of plots with OBA% ranging from 0 to 100% in the two types of stand (n=60) and both OLF and A-horizon compartments were analysed. Relations between OBA% and either soil chemical (C and N contents, quality of organic matter via solid-state NMR, pH, CaCO3) or microbial (enzyme activities, basal respiration, biomass and catabolic diversity via BIOLOG) characteristics were described. OBA% increase led to a decrease in the recalcitrant fraction of organic matter (OM) in OLF and promoted microbial growth. Catabolic profiles of microbial communities from A-horizon were significantly modulated in Q. ilex and P. halepensis stands by OBA% and alkyl C to carboxyl C ratio (characteristic of cutin from Q. ilex tissues) and in Q. pubescens and P. sylvestris stands, by OBA% and pH. In A-horizon under Q. ilex and P. halepensis stands, linear regressions were found between catabolic diversity, microbial biomass and OBA% suggesting an additive effect. Conversely, in A-horizon under Q. pubescens and P. sylvestris stands, the relationship between OBA% and either cellulase activities, polysaccharides or ammonium contents, suggested a non-additive effect of Q. pubescens and P. sylvestris, enhancing mineralization of the OM labile fraction for plots characterized by an OBA% ranging from 40% to 60%. Mixing oak with pine thus favored microbial dynamics in both types of stands though OBA% print varied with tree species and consequently sustainable soil functioning depends strongly on the composition of mixed stands. Our study indeed revealed that, when evaluating the benefits of forest mixed stand on soil microbial functioning and OM turnover, the identity of tree species has to be considered.

  9. Differential Diagnosis of Erythmato-Squamous Diseases Using Classification and Regression Tree

    PubMed Central

    Maghooli, Keivan; Langarizadeh, Mostafa; Shahmoradi, Leila; Habibi-koolaee, Mahdi; Jebraeily, Mohamad; Bouraghi, Hamid

    2016-01-01

    Introduction: Differential diagnosis of Erythmato-Squamous Diseases (ESD) is a major challenge in the field of dermatology. The ESD diseases are placed into six different classes. Data mining is the process for detection of hidden patterns. In the case of ESD, data mining helps us to predict the diseases. Different algorithms were developed for this purpose. Objective: We aimed to use the Classification and Regression Tree (CART) to predict differential diagnosis of ESD. Methods: We used the Cross Industry Standard Process for Data Mining (CRISP-DM) methodology. For this purpose, the dermatology data set from the UCI machine learning repository was obtained. The Clementine 12.0 software from IBM Company was used for modelling. To evaluate the model, we calculated its accuracy, sensitivity and specificity. Results: The proposed model had an accuracy of 94.84% (Standard Deviation: 24.42) in correctly predicting the ESD disease. Conclusions: Results indicated that using this classifier could be useful. However, it is strongly recommended that a combination of machine learning methods could be more useful in terms of prediction of ESD. PMID:28077889

  10. A regression tree for identifying combinations of fall risk factors associated to recurrent falling: a cross-sectional elderly population-based study.

    PubMed

    Kabeshova, A; Annweiler, C; Fantino, B; Philip, T; Gromov, V A; Launay, C P; Beauchet, O

    2014-06-01

    Regression tree (RT) analyses are particularly adapted to explore the risk of recurrent falling according to various combinations of fall risk factors compared to logistic regression models. The aims of this study were (1) to determine which combinations of fall risk factors were associated with the occurrence of recurrent falls in older community-dwellers, and (2) to compare the efficacy of RT and multiple logistic regression model for the identification of recurrent falls. A total of 1,760 community-dwelling volunteers (mean age ± standard deviation, 71.0 ± 5.1 years; 49.4 % female) were recruited prospectively in this cross-sectional study. Age, gender, polypharmacy, use of psychoactive drugs, fear of falling (FOF), cognitive disorders and sad mood were recorded. In addition, the history of falls within the past year was recorded using a standardized questionnaire. Among 1,760 participants, 19.7 % (n = 346) were recurrent fallers. The RT identified 14 nodes groups and 8 end nodes with FOF as the first major split. Among participants with FOF, those who had sad mood and polypharmacy formed the end node with the greatest OR for recurrent falls (OR = 6.06 with p < 0.001). Among participants without FOF, those who were male and not sad had the lowest OR for recurrent falls (OR = 0.25 with p < 0.001). The RT correctly classified 1,356 from 1,414 non-recurrent fallers (specificity = 95.6 %), and 65 from 346 recurrent fallers (sensitivity = 18.8 %). The overall classification accuracy was 81.0 %. The multiple logistic regression correctly classified 1,372 from 1,414 non-recurrent fallers (specificity = 97.0 %), and 61 from 346 recurrent fallers (sensitivity = 17.6 %). The overall classification accuracy was 81.4 %. Our results show that RT may identify specific combinations of risk factors for recurrent falls, the combination most associated with recurrent falls involving FOF, sad mood and polypharmacy. The FOF emerged as the risk factor strongly associated with
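    The specificity, sensitivity and accuracy figures quoted above follow from counts of correctly and incorrectly classified participants. As a small sketch, the following recomputes the three measures from the counts the abstract reports for the multiple logistic regression model; the function and variable names are hypothetical.

```python
def classification_metrics(tn: int, fp: int, tp: int, fn: int):
    """Return (specificity, sensitivity, accuracy) from confusion-matrix counts."""
    specificity = tn / (tn + fp)   # share of non-recurrent fallers identified
    sensitivity = tp / (tp + fn)   # share of recurrent fallers identified
    accuracy = (tn + tp) / (tn + fp + tp + fn)
    return specificity, sensitivity, accuracy


# Logistic regression counts from the abstract:
# 1,372 of 1,414 non-recurrent fallers and 61 of 346 recurrent fallers correct.
spec, sens, acc = classification_metrics(
    tn=1372, fp=1414 - 1372, tp=61, fn=346 - 61
)
print(f"specificity={spec:.1%} sensitivity={sens:.1%} accuracy={acc:.1%}")
```

    The low sensitivity alongside the high accuracy reflects the class imbalance: fewer than one in five participants were recurrent fallers.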

  11. Spatial prediction of landslides using a hybrid machine learning approach based on Random Subspace and Classification and Regression Trees

    NASA Astrophysics Data System (ADS)

    Pham, Binh Thai; Prakash, Indra; Tien Bui, Dieu

    2018-02-01

    A hybrid machine learning approach of Random Subspace (RSS) and Classification And Regression Trees (CART) is proposed to develop a model named RSSCART for spatial prediction of landslides. This model is a combination of the RSS method which is known as an efficient ensemble technique and the CART which is a state of the art classifier. The Luc Yen district of Yen Bai province, a prominent landslide prone area of Viet Nam, was selected for the model development. Performance of the RSSCART model was evaluated through the Receiver Operating Characteristic (ROC) curve, statistical analysis methods, and the Chi Square test. Results were compared with other benchmark landslide models namely Support Vector Machines (SVM), single CART, Naïve Bayes Trees (NBT), and Logistic Regression (LR). In the development of the model, ten important landslide affecting factors related to geomorphology, geology and geo-environment were considered namely slope angles, elevation, slope aspect, curvature, lithology, distance to faults, distance to rivers, distance to roads, and rainfall. Performance of the RSSCART model (AUC = 0.841) is the best compared with other popular landslide models namely SVM (0.835), single CART (0.822), NBT (0.821), and LR (0.723). These results indicate that the RSSCART is a promising method for spatial landslide prediction.
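    The AUC values compared above summarize the ROC curve in a single number. As a minimal sketch of what that number measures, the following computes AUC as the fraction of (landslide, non-landslide) pairs in which the landslide location receives the higher predicted score, counting ties as half. The function name and scores are hypothetical and not taken from the study.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison of predicted scores."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical susceptibility scores for three landslide and two stable cells.
value = auc([0.9, 0.8, 0.4], [0.7, 0.3])
```

    A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the reported model differences should be read.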

  12. DIF Trees: Using Classification Trees to Detect Differential Item Functioning

    ERIC Educational Resources Information Center

    Vaughn, Brandon K.; Wang, Qiu

    2010-01-01

    A nonparametric tree classification procedure is used to detect differential item functioning for items that are dichotomously scored. Classification trees are shown to be an alternative procedure to detect differential item functioning other than the use of traditional Mantel-Haenszel and logistic regression analysis. A nonparametric…

  13. Predicting the occurrence of wildfires with binary structured additive regression models.

    PubMed

    Ríos-Pena, Laura; Kneib, Thomas; Cadarso-Suárez, Carmen; Marey-Pérez, Manuel

    2017-02-01

    Wildfires are one of the main environmental problems facing societies today, and in the case of Galicia (north-west Spain), they are the main cause of forest destruction. This paper used binary structured additive regression (STAR) for modelling the occurrence of wildfires in Galicia. Binary STAR models are a recent contribution to the classical logistic regression and binary generalized additive models. Their main advantage lies in their flexibility for modelling non-linear effects, while simultaneously incorporating spatial and temporal variables directly, thereby making it possible to reveal possible relationships among the variables considered. The results showed that the occurrence of wildfires depends on many covariates which display variable behaviour across space and time, and which largely determine the likelihood of ignition of a fire. The joint possibility of working on spatial scales with a resolution of 1 × 1 km cells and mapping predictions in a colour range makes STAR models a useful tool for plotting and predicting wildfire occurrence. Lastly, it will facilitate the development of fire behaviour models, which can be invaluable when it comes to drawing up fire-prevention and firefighting plans. Copyright © 2016 Elsevier Ltd. All rights reserved.

  14. Regression analysis of informative current status data with the additive hazards model.

    PubMed

    Zhao, Shishun; Hu, Tao; Ma, Ling; Wang, Peijie; Sun, Jianguo

    2015-04-01

    This paper discusses regression analysis of current status failure time data arising from the additive hazards model in the presence of informative censoring. Many methods have been developed for regression analysis of current status data under various regression models if the censoring is noninformative, and also there exists a large literature on parametric analysis of informative current status data in the context of tumorigenicity experiments. In this paper, a semiparametric maximum likelihood estimation procedure is presented and in the method, the copula model is employed to describe the relationship between the failure time of interest and the censoring time. Furthermore, I-splines are used to approximate the nonparametric functions involved and the asymptotic consistency and normality of the proposed estimators are established. A simulation study is conducted and indicates that the proposed approach works well for practical situations. An illustrative example is also provided.

  15. Modeling individual tree survival

    Treesearch

    Quang V. Cao

    2016-01-01

    Information provided by growth and yield models is the basis for forest managers to make decisions on how to manage their forests. Among different types of growth models, whole-stand models offer predictions at stand level, whereas individual-tree models give detailed information at tree level. The well-known logistic regression is commonly used to predict tree...

  16. Regression: The Apple Does Not Fall Far From the Tree.

    PubMed

    Vetter, Thomas R; Schober, Patrick

    2018-05-15

    Researchers and clinicians are frequently interested in either: (1) assessing whether there is a relationship or association between 2 or more variables and quantifying this association; or (2) determining whether 1 or more variables can predict another variable. The strength of such an association is mainly described by the correlation. However, regression analysis and regression models can be used not only to identify whether there is a significant relationship or association between variables but also to generate estimations of such a predictive relationship between variables. This basic statistical tutorial discusses the fundamental concepts and techniques related to the most common types of regression analysis and modeling, including simple linear regression, multiple regression, logistic regression, ordinal regression, and Poisson regression, as well as the common yet often underrecognized phenomenon of regression toward the mean. The various types of regression analysis are powerful statistical techniques, which when appropriately applied, can allow for the valid interpretation of complex, multifactorial data. Regression analysis and models can assess whether there is a relationship or association between 2 or more observed variables and estimate the strength of this association, as well as determine whether 1 or more variables can predict another variable. Regression is thus being applied more commonly in anesthesia, perioperative, critical care, and pain research. However, it is crucial to note that regression can identify plausible risk factors; it does not prove causation (a definitive cause and effect relationship). The results of a regression analysis instead identify independent (predictor) variable(s) associated with the dependent (outcome) variable. As with other statistical methods, applying regression requires that certain assumptions be met, which can be tested with specific diagnostics.

  17. Analysis of the effect of evergreen and deciduous trees on urban nitrogen dioxide levels in the U.S. using land-use regression

    NASA Astrophysics Data System (ADS)

    Rao, M.; George, L. A.

    2012-12-01

    Nitrogen dioxide (NO2), an atmospheric pollutant generated primarily by anthropogenic combustion processes, is typically found at higher concentrations in urban areas compared to non-urbanized environments. Elevated NO2 levels have multiple ecosystem effects at different spatial scales. At the local scale, elevated levels affect human health directly and through the formation of secondary pollutants such as ozone and aerosols; at the regional scale secondary pollutants such as nitric acid and organic nitrates have deleterious effects on non-urbanized areas; and, at the global scale, nitrogen oxide emissions significantly alter the natural biogeochemical nitrogen cycle. As cities globally become larger and larger sources of nitrogen oxide emissions, it is important to assess possible mitigation strategies to reduce the impact of emissions locally, regionally and globally. In this study, we build a national land-use regression (LUR) model to compare the impacts of deciduous and evergreen trees on urban NO2 levels in the United States. We use the EPA monitoring network values of NO2 levels for 2006, the 2006 NLCD tree canopy data for deciduous and evergreen canopies, and the US Census Bureau's TIGER shapefiles for roads, railroads, impervious area, and population density as proxies for the NO2 sources on-road traffic, railroad traffic, off-road sources, and area sources, respectively. Our preliminary LUR model corroborates previous LUR studies showing that the presence of trees is associated with reduced urban NO2 levels. Additionally, our model indicates that deciduous and evergreen trees reduce NO2 to different extents, and that the amount of NO2 reduced varies seasonally. The model indicates that every square kilometer of deciduous canopy within a 2 km buffer is associated with a reduction in ambient NO2 levels of 0.64 ppb in summer and 0.46 ppb in winter. Similarly, every square kilometer of evergreen tree canopy within a 2 km buffer is associated with a reduction in ambient NO2 by

  18. CART (Classification and Regression Trees) Program: The Implementation of the CART Program and Its Application to Estimating Attrition Rates.

    DTIC Science & Technology

    1985-12-01

    consists of the node t and all descendants of t in T. (3) Definition 3. Pruning a branch Tt from a tree T consists of deleting from T all...The default is 1.0 so that actually, this keyword did not need to appear in the above file. (5) DELETE . This keyword does not appear in our example, but...when it is used associated with some variable names, it indicates that we want to delete these variables from the regression. If this keyword is

  19. Prediction of fishing effort distributions using boosted regression trees.

    PubMed

    Soykan, Candan U; Eguchi, Tomoharu; Kohin, Suzanne; Dewar, Heidi

    2014-01-01

    Concerns about bycatch of protected species have become a dominant factor shaping fisheries management. However, efforts to mitigate bycatch are often hindered by a lack of data on the distributions of fishing effort and protected species. One approach to overcoming this problem has been to overlay the distribution of past fishing effort with known locations of protected species, often obtained through satellite telemetry and occurrence data, to identify potential bycatch hotspots. This approach, however, generates static bycatch risk maps, calling into question their ability to forecast into the future, particularly when dealing with spatiotemporally dynamic fisheries and highly migratory bycatch species. In this study, we use boosted regression trees to model the spatiotemporal distribution of fishing effort for two distinct fisheries in the North Pacific Ocean, the albacore (Thunnus alalunga) troll fishery and the California drift gillnet fishery that targets swordfish (Xiphias gladius). Our results suggest that it is possible to accurately predict fishing effort using < 10 readily available predictor variables (cross-validated correlations between model predictions and observed data ~0.6). Although the two fisheries are quite different in their gears and fishing areas, their respective models had high predictive ability, even when input data sets were restricted to a fraction of the full time series. The implications for conservation and management are encouraging: Across a range of target species, fishing methods, and spatial scales, even a relatively short time series of fisheries data may suffice to accurately predict the location of fishing effort into the future. In combination with species distribution modeling of bycatch species, this approach holds promise as a mitigation tool when observer data are limited. Even in data-rich regions, modeling fishing effort and bycatch may provide more accurate estimates of bycatch risk than partial observer coverage

  20. Combinations of Stressors in Midlife: Examining Role and Domain Stressors Using Regression Trees and Random Forests

    PubMed Central

    2013-01-01

    Objectives. Global perceptions of stress (GPS) have major implications for mental and physical health, and stress in midlife may influence adaptation in later life. Thus, it is important to determine the unique and interactive effects of diverse influences of role stress (at work or in personal relationships), loneliness, life events, time pressure, caregiving, finances, discrimination, and neighborhood circumstances on these GPS. Method. Exploratory regression trees and random forests were used to examine complex interactions among myriad events and chronic stressors in middle-aged participants’ (N = 410; mean age = 52.12) GPS. Results. Different role and domain stressors were influential at high and low levels of loneliness. Varied combinations of these stressors resulting in similar levels of perceived stress are also outlined as examples of equifinality. Loneliness emerged as an important predictor across trees. Discussion. Exploring multiple stressors simultaneously provides insights into the diversity of stressor combinations across individuals—even those with similar levels of global perceived stress—and answers theoretical mandates to better understand the influence of stress by sampling from many domain and role stressors. Further, the unique influences of each predictor relative to the others inform theory and applied work. Finally, examples of equifinality and multifinality call for targeted interventions. PMID:23341437

  1. Using Classification and Regression Trees (CART) and random forests to analyze attrition: Results from two simulations.

    PubMed

    Hayes, Timothy; Usami, Satoshi; Jacobucci, Ross; McArdle, John J

    2015-12-01

    In this article, we describe a recent development in the analysis of attrition: using classification and regression trees (CART) and random forest methods to generate inverse sampling weights. These flexible machine learning techniques have the potential to capture complex nonlinear, interactive selection models, yet to our knowledge, their performance in the missing data analysis context has never been evaluated. To assess the potential benefits of these methods, we compare their performance with commonly employed multiple imputation and complete case techniques in 2 simulations. These initial results suggest that weights computed from pruned CART analyses performed well in terms of both bias and efficiency when compared with other methods. We discuss the implications of these findings for applied researchers. (c) 2015 APA, all rights reserved.
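
    The weighting idea in this abstract can be sketched in a few lines. The example below assumes a tree has already been fitted to predict who stays in the study, so that every participant carries a leaf label; the leaf labels, the data, and the function name are hypothetical and are not taken from the article. Each respondent receives the inverse of the response rate observed in their leaf.

```python
from collections import defaultdict

def inverse_probability_weights(leaf_ids, responded):
    """Weight each respondent by 1 / (response rate in their tree leaf).

    leaf_ids  : leaf label per participant (assumed to come from an
                already-fitted, pruned tree; hypothetical here)
    responded : 1 if the participant stayed in the study, else 0
    Dropouts get weight 0.0 because they contribute no outcome data.
    """
    counts = defaultdict(lambda: [0, 0])  # leaf -> [respondents, total]
    for leaf, r in zip(leaf_ids, responded):
        counts[leaf][0] += int(r)
        counts[leaf][1] += 1
    weights = []
    for leaf, r in zip(leaf_ids, responded):
        n_resp, n_total = counts[leaf]
        weights.append(n_total / n_resp if r else 0.0)
    return weights

# Made-up example: everyone in leaf "A" responds, half of leaf "B" drops out.
leaf_ids = ["A", "A", "A", "A", "B", "B", "B", "B"]
responded = [1, 1, 1, 1, 1, 0, 0, 1]
weights = inverse_probability_weights(leaf_ids, responded)
```

    In this toy case the respondents in leaf "B" are weighted up so that the weighted respondents again represent all eight participants.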

  2. Using Classification and Regression Trees (CART) and Random Forests to Analyze Attrition: Results From Two Simulations

    PubMed Central

    Hayes, Timothy; Usami, Satoshi; Jacobucci, Ross; McArdle, John J.

    2016-01-01

    In this article, we describe a recent development in the analysis of attrition: using classification and regression trees (CART) and random forest methods to generate inverse sampling weights. These flexible machine learning techniques have the potential to capture complex nonlinear, interactive selection models, yet to our knowledge, their performance in the missing data analysis context has never been evaluated. To assess the potential benefits of these methods, we compare their performance with commonly employed multiple imputation and complete case techniques in 2 simulations. These initial results suggest that weights computed from pruned CART analyses performed well in terms of both bias and efficiency when compared with other methods. We discuss the implications of these findings for applied researchers. PMID:26389526

  3. Methods for identifying SNP interactions: a review on variations of Logic Regression, Random Forest and Bayesian logistic regression.

    PubMed

    Chen, Carla Chia-Ming; Schwender, Holger; Keith, Jonathan; Nunkesser, Robin; Mengersen, Kerrie; Macrossan, Paula

    2011-01-01

    Due to advancements in computational ability, enhanced technology and a reduction in the price of genotyping, more data are being generated for understanding genetic associations with diseases and disorders. However, with the availability of large data sets comes the inherent challenges of new methods of statistical analysis and modeling. Considering a complex phenotype may be the effect of a combination of multiple loci, various statistical methods have been developed for identifying genetic epistasis effects. Among these methods, logic regression (LR) is an intriguing approach incorporating tree-like structures. Various methods have built on the original LR to improve different aspects of the model. In this study, we review four variations of LR, namely Logic Feature Selection, Monte Carlo Logic Regression, Genetic Programming for Association Studies, and Modified Logic Regression-Gene Expression Programming, and investigate the performance of each method using simulated and real genotype data. We contrast these with another tree-like approach, namely Random Forests, and a Bayesian logistic regression with stochastic search variable selection.

  4. Additive hazards regression and partial likelihood estimation for ecological monitoring data across space.

    PubMed

    Lin, Feng-Chang; Zhu, Jun

    2012-01-01

    We develop continuous-time models for the analysis of environmental or ecological monitoring data such that subjects are observed at multiple monitoring time points across space. Of particular interest are additive hazards regression models where the baseline hazard function can take on flexible forms. We consider time-varying covariates and take into account spatial dependence via autoregression in space and time. We develop statistical inference for the regression coefficients via partial likelihood. Asymptotic properties, including consistency and asymptotic normality, are established for parameter estimates under suitable regularity conditions. Feasible algorithms utilizing existing statistical software packages are developed for computation. We also consider a simpler additive hazards model with homogeneous baseline hazard and develop hypothesis testing for homogeneity. A simulation study demonstrates that the statistical inference using partial likelihood has sound finite-sample properties and offers a viable alternative to maximum likelihood estimation. For illustration, we analyze data from an ecological study that monitors bark beetle colonization of red pines in a plantation of Wisconsin.

  5. Comprehensive database of diameter-based biomass regressions for North American tree species

    Treesearch

    Jennifer C. Jenkins; David C. Chojnacky; Linda S. Heath; Richard A. Birdsey

    2004-01-01

    A database consisting of 2,640 equations compiled from the literature for predicting the biomass of trees and tree components from diameter measurements of species found in North America. Bibliographic information, geographic locations, diameter limits, diameter and biomass units, equation forms, statistical errors, and coefficients are provided for each equation,...

  6. Ensemble of trees approaches to risk adjustment for evaluating a hospital's performance.

    PubMed

    Liu, Yang; Traskin, Mikhail; Lorch, Scott A; George, Edward I; Small, Dylan

    2015-03-01

    A commonly used method for evaluating a hospital's performance on an outcome is to compare the hospital's observed outcome rate to the hospital's expected outcome rate given its patient (case) mix and service. The process of calculating the hospital's expected outcome rate given its patient mix and service is called risk adjustment (Iezzoni 1997). Risk adjustment is critical for accurately evaluating and comparing hospitals' performances since we would not want to unfairly penalize a hospital just because it treats sicker patients. The key to risk adjustment is accurately estimating the probability of an Outcome given patient characteristics. For cases with binary outcomes, the method that is commonly used in risk adjustment is logistic regression. In this paper, we consider ensemble of trees methods as alternatives for risk adjustment, including random forests and Bayesian additive regression trees (BART). Both random forests and BART are modern machine learning methods that have been shown recently to have excellent performance for prediction of outcomes in many settings. We apply these methods to carry out risk adjustment for the performance of neonatal intensive care units (NICU). We show that these ensemble of trees methods outperform logistic regression in predicting mortality among babies treated in NICU, and provide a superior method of risk adjustment compared to logistic regression.
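
    The comparison described above can be illustrated with a small sketch. It assumes each patient already has a predicted outcome probability from some risk model (logistic regression, random forest, or BART), and it summarises the comparison as an observed-to-expected ratio, which is one common choice and an assumption here rather than something stated in the abstract. All numbers and names are hypothetical.

```python
def observed_to_expected(outcomes, predicted_probs):
    """Return (observed rate, expected rate, O/E ratio) for one hospital.

    outcomes        : 1 if the adverse outcome occurred, else 0
    predicted_probs : risk-model probability of the outcome per patient
    """
    if len(outcomes) != len(predicted_probs) or not outcomes:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(outcomes)
    observed = sum(outcomes) / n
    expected = sum(predicted_probs) / n
    if expected == 0:
        raise ValueError("expected rate is zero; ratio is undefined")
    return observed, expected, observed / expected

# Made-up patients for a single hospital.
outcomes = [0, 0, 1, 0, 1, 0, 0, 0]
predicted = [0.05, 0.10, 0.60, 0.05, 0.40, 0.10, 0.20, 0.10]
obs_rate, exp_rate, oe_ratio = observed_to_expected(outcomes, predicted)
```

    A ratio above 1 would mean more adverse outcomes than the case mix predicts; how accurate that reading is depends entirely on how well the risk model estimates the probabilities.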

  7. Identification of sexually abused female adolescents at risk for suicidal ideations: a classification and regression tree analysis.

    PubMed

    Brabant, Marie-Eve; Hébert, Martine; Chagnon, François

    2013-01-01

    This study explored the clinical profiles of 77 female teenager survivors of sexual abuse and examined the association of abuse-related and personal variables with suicidal ideations. Analyses revealed that 64% of participants experienced suicidal ideations. Findings from classification and regression tree analysis indicated that depression, posttraumatic stress symptoms, and hopelessness discriminated profiles of suicidal and nonsuicidal survivors. The elevated prevalence of suicidal ideations among adolescent survivors of sexual abuse underscores the importance of investigating the presence of suicidal ideations in sexual abuse survivors. However, suicidal ideation is not the sole variable that needs to be investigated; depression, hopelessness and posttraumatic stress symptoms are also related to suicidal ideations in survivors and could therefore guide interventions.

  8. Towards lidar-based mapping of tree age at the Arctic forest tundra ecotone.

    NASA Astrophysics Data System (ADS)

    Jensen, J.; Maguire, A.; Oelkers, R.; Andreu-Hayles, L.; Boelman, N.; D'Arrigo, R.; Griffin, K. L.; Jennewein, J. S.; Hiers, E.; Meddens, A. J.; Russell, M.; Vierling, L. A.; Eitel, J.

    2017-12-01

    Climate change may cause spatial shifts in the forest-tundra ecotone (FTE). To improve our ability to study these spatial shifts, information on tree demography along the FTE is needed. The objective of this study was to assess the suitability of lidar-derived tree heights as a surrogate for tree age. We calculated individual tree age from 48 tree cores collected at basal height from white spruce (Picea glauca) within the FTE in northern Alaska. Tree height was obtained from terrestrial lidar scans (<1 cm spatial resolution). The relationship between age and height was examined using a linear regression model forced through the origin. We found a very strong predictive relationship between tree height and age (R² = 0.90, RMSE = 19.34 years) for trees that ranged from 14 to 230 years. Separate regression models were also developed for small (height < 3 m) and large trees (height >= 3 m), yielding strong predictive relationships between height and age (R² = 0.86, RMSE = 12.21 years, and R² = 0.93, RMSE = 25.16 years, respectively). The slope coefficients for the small and large tree models (16.83 and 12.98 years/m, respectively) indicate that small trees grow 1.3 times faster than large trees at these FTE study sites. Although a strong, predictive relationship between age and height is uncommon in light-limited forest environments, our findings suggest that the sparseness of trees within the FTE may explain the strong tree height-age relationships found herein. Further analysis of 36 additional tree cores recently collected within the FTE near Inuvik, Canada will be performed. Our preliminary analysis suggests that lidar-derived tree height could be a reliable proxy for tree age at the FTE, thereby establishing a new technique for scaling tree structure and demographics across larger portions of this sensitive ecotone.
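
    As a rough illustration of the model form named in this abstract, the sketch below fits a least-squares line forced through the origin (slope = sum of x*y divided by sum of x*x) and forms the ratio of the two slopes the abstract reports. The height and age values are invented for illustration and are not the study's data; the function names are hypothetical.

```python
from math import sqrt

def fit_through_origin(heights, ages):
    """Least-squares slope for age = slope * height (no intercept)."""
    sxy = sum(h * a for h, a in zip(heights, ages))
    sxx = sum(h * h for h in heights)
    return sxy / sxx

def rmse(heights, ages, slope):
    """Root-mean-square error of the through-origin fit, in years."""
    errors = [(a - slope * h) ** 2 for h, a in zip(heights, ages)]
    return sqrt(sum(errors) / len(errors))

# Invented tree heights (m) and ages (years), not from the study.
heights = [1.0, 2.0, 3.0]
ages = [17.0, 33.0, 52.0]
slope = fit_through_origin(heights, ages)   # years per metre
fit_error = rmse(heights, ages, slope)

# Ratio of the two slopes reported in the abstract (years/m).
slope_ratio = 16.83 / 12.98
```

    The ratio of the reported slopes is about 1.3, which is the factor quoted in the abstract.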

  9. A regional classification scheme for estimating reference water quality in streams using land-use-adjusted spatial regression-tree analysis

    USGS Publications Warehouse

    Robertson, Dale M.; Saad, D.A.; Heisey, D.M.

    2006-01-01

    Various approaches are used to subdivide large areas into regions containing streams that have similar reference or background water quality and that respond similarly to different factors. For many applications, such as establishing reference conditions, it is preferable to use physical characteristics that are not affected by human activities to delineate these regions. However, most approaches, such as ecoregion classifications, rely on land use to delineate regions or have difficulties compensating for the effects of land use. Land use not only directly affects water quality, but it is often correlated with the factors used to define the regions. In this article, we describe modifications to SPARTA (spatial regression-tree analysis), a relatively new approach applied to water-quality and environmental characteristic data to delineate zones with similar factors affecting water quality. In this modified approach, land-use-adjusted (residualized) water quality and environmental characteristics are computed for each site. Regression-tree analysis is applied to the residualized data to determine the most statistically important environmental characteristics describing the distribution of a specific water-quality constituent. Geographic information for small basins throughout the study area is then used to subdivide the area into relatively homogeneous environmental water-quality zones. For each zone, commonly used approaches are subsequently used to define its reference water quality and how its water quality responds to changes in land use. SPARTA is used to delineate zones of similar reference concentrations of total phosphorus and suspended sediment throughout the upper Midwestern part of the United States. © 2006 Springer Science+Business Media, Inc.
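
    The two-step idea in this abstract (remove the land-use signal, then let a regression tree split on what remains) can be sketched as follows. This is a minimal illustration under strong assumptions, not the SPARTA procedure itself: land use is reduced to a single variable, the adjustment is a simple least-squares fit, only the water-quality variable is residualized, and the "tree" is a single split. All data and names are hypothetical.

```python
from statistics import mean

def ols_residuals(x, y):
    """Residuals of y after a simple least-squares fit on x."""
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    intercept = my - slope * mx
    return [b - (intercept + slope * a) for a, b in zip(x, y)]

def sse(values):
    """Sum of squared deviations from the group mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def best_split(feature, target):
    """Threshold on `feature` that minimises the summed SSE of `target`
    in the two resulting groups (the usual regression-tree criterion)."""
    pairs = sorted(zip(feature, target))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between equal feature values
        cost = (sse([t for _, t in pairs[:i]])
                + sse([t for _, t in pairs[i:]]))
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        if best is None or cost < best[1]:
            best = (threshold, cost)
    return best

# Invented sites: percent agricultural land, total phosphorus, percent clay.
land_use = [10, 20, 30, 40, 50, 60]
phosphorus = [0.1, 0.7, 0.3, 0.9, 0.5, 1.1]
clay = [5, 30, 8, 35, 6, 32]

adjusted = ols_residuals(land_use, phosphorus)   # land-use-adjusted values
threshold, cost = best_split(clay, adjusted)
```

    In this made-up data set the split falls between the low-clay and high-clay sites, which is the kind of environmental boundary the zones in the abstract are built from.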

  10. Binary Logistic Regression Versus Boosted Regression Trees in Assessing Landslide Susceptibility for Multiple-Occurring Regional Landslide Events: Application to the 2009 Storm Event in Messina (Sicily, southern Italy).

    NASA Astrophysics Data System (ADS)

    Lombardo, L.; Cama, M.; Maerker, M.; Parisi, L.; Rotigliano, E.

    2014-12-01

    This study aims to compare the performance of Binary Logistic Regression (BLR) and Boosted Regression Trees (BRT) methods in assessing landslide susceptibility for multiple-occurrence regional landslide events within the Mediterranean region. A test area was selected in the north-eastern sector of Sicily (southern Italy), corresponding to the catchments of the Briga and the Giampilieri streams, both stretching for a few kilometres from the Peloritan ridge (eastern Sicily, Italy) to the Ionian Sea. This area was struck on 1 October 2009 by an extreme climatic event resulting in thousands of rapid shallow landslides, mainly of the debris flow and debris avalanche types, involving the weathered layer of a low to high grade metamorphic bedrock. Exploiting the same set of predictors and the 2009 landslide archive, BLR- and BRT-based susceptibility models were obtained for the two catchments separately, adopting a random partition (RP) technique for validation; in addition, the models trained in one of the two catchments (Briga) were tested in predicting the landslide distribution in the other (Giampilieri), adopting a spatial partition (SP) based validation procedure. All the validation procedures were based on multi-fold tests so as to evaluate and compare the reliability of the fitting, the prediction skill, the coherence in the predictor selection and the precision of the susceptibility estimates. All the obtained models for the two methods produced very high predictive performances, with a general congruence between BLR and BRT in the predictor importance. In particular, the research highlighted that BRT models reached a higher prediction performance than BLR models for RP-based modelling, whilst for the SP-based models the difference in predictive skill between the two methods dropped drastically, converging to an analogous excellent performance. However, when looking at the precision of the probability estimates, BLR demonstrated to produce more robust

  11. Predictors of adherence with self-care guidelines among persons with type 2 diabetes: results from a logistic regression tree analysis.

    PubMed

    Yamashita, Takashi; Kart, Cary S; Noe, Douglas A

    2012-12-01

    Type 2 diabetes is known to contribute to health disparities in the U.S. and failure to adhere to recommended self-care behaviors is a contributing factor. Intervention programs face difficulties as a result of patient diversity and limited resources. With data from the 2005 Behavioral Risk Factor Surveillance System, this study employs a logistic regression tree algorithm to identify characteristics of sub-populations with type 2 diabetes according to their reported frequency of adherence to four recommended diabetes self-care behaviors including blood glucose monitoring, foot examination, eye examination and HbA1c testing. Using Andersen's health behavior model, need factors appear to dominate the definition of which sub-groups were at greatest risk for low as well as high adherence. Findings demonstrate the utility of easily interpreted tree diagrams to design specific culturally appropriate intervention programs targeting sub-populations of diabetes patients who need to improve their self-care behaviors. Limitations and contributions of the study are discussed.

  12. Schistosoma mansoni reinfection: Analysis of risk factors by classification and regression tree (CART) modeling

    PubMed Central

    Oliveira-Prado, Roberta; Matoso, Leonardo Ferreira; Veloso, Bráulio M.; Andrade, Gisele; Kloos, Helmut; Bethony, Jeffrey M.; Assunção, Renato M.; Correa-Oliveira, Rodrigo

    2017-01-01

    Praziquantel (PZQ) is an effective chemotherapy for schistosomiasis mansoni and a mainstay for its control and potential elimination. However, it does not protect against reinfection, which can occur rapidly in areas with active transmission. A guide to ranking the risk factors for Schistosoma mansoni reinfection would greatly contribute to prioritizing resources and focusing prevention and control measures to prevent rapid reinfection. The objective of the current study was to explore the relationship among the socioeconomic, demographic, and epidemiological factors that can influence reinfection by S. mansoni one year after successful treatment with PZQ in school-aged children in northeastern Minas Gerais state, Brazil. Parasitological, socioeconomic, demographic, and water contact information was surveyed in 506 S. mansoni-infected individuals, aged 6 to 15 years, resident in these endemic areas. Eligible individuals were treated with PZQ until they were determined to be negative by the absence of S. mansoni eggs in the feces on two consecutive days of Kato-Katz fecal thick smear. These individuals were surveyed again 12 months from the date of successful treatment with PZQ. Classification and regression tree (CART) modeling was then used to explore the relationship between socioeconomic, demographic, and epidemiological variables and reinfection status. The most important risk factor identified for S. mansoni reinfection was “heavy” infection at baseline. Additional analyses, excluding heavy infection status, showed that lower socioeconomic status and a lower level of education of the household head were also important risk factors for S. mansoni reinfection. Our results provide an important contribution toward the control and possible elimination of schistosomiasis by identifying three major risk factors that can be used for targeted treatment and monitoring of reinfection. We suggest that control measures that target heavily infected

  13. Digression and Value Concatenation to Enable Privacy-Preserving Regression.

    PubMed

    Li, Xiao-Bai; Sarkar, Sumit

    2014-09-01

    Regression techniques can be used not only for legitimate data analysis, but also to infer private information about individuals. In this paper, we demonstrate that regression trees, a popular data-analysis and data-mining technique, can be used to effectively reveal individuals' sensitive data. This problem, which we call a "regression attack," has not been addressed in the data privacy literature, and existing privacy-preserving techniques are not appropriate in coping with this problem. We propose a new approach to counter regression attacks. To protect against privacy disclosure, our approach introduces a novel measure, called digression, which assesses the sensitive value disclosure risk in the process of building a regression tree model. Specifically, we develop an algorithm that uses the measure for pruning the tree to limit disclosure of sensitive data. We also propose a dynamic value-concatenation method for anonymizing data, which better preserves data utility than a user-defined generalization scheme commonly used in existing approaches. Our approach can be used for anonymizing both numeric and categorical data. An experimental study is conducted using real-world financial, economic and healthcare data. The results of the experiments demonstrate that the proposed approach is very effective in protecting data privacy while preserving data quality for research and analysis.

  14. Marginal regression approach for additive hazards models with clustered current status data.

    PubMed

    Su, Pei-Fang; Chi, Yunchan

    2014-01-15

    Current status data arise naturally from tumorigenicity experiments, epidemiology studies, biomedicine, econometrics and demographic and sociology studies. Moreover, clustered current status data may occur with animals from the same litter in tumorigenicity experiments or with subjects from the same family in epidemiology studies. Because the only information extracted from current status data is whether the survival times are before or after the monitoring or censoring times, the nonparametric maximum likelihood estimator of survival function converges at a rate of n^(1/3) to a complicated limiting distribution. Hence, semiparametric regression models such as the additive hazards model have been extended for independent current status data to derive the test statistics, whose distributions converge at a rate of n^(1/2), for testing the regression parameters. However, a straightforward application of these statistical methods to clustered current status data is not appropriate because intracluster correlation needs to be taken into account. Therefore, this paper proposes two estimating functions for estimating the parameters in the additive hazards model for clustered current status data. The comparative results from simulation studies are presented, and the application of the proposed estimating functions to one real data set is illustrated. Copyright © 2013 John Wiley & Sons, Ltd.

  15. Decision trees in epidemiological research.

    PubMed

    Venkatasubramaniam, Ashwini; Wolfson, Julian; Mitchell, Nathan; Barnes, Timothy; JaKa, Meghan; French, Simone

    2017-01-01

    In many studies, it is of interest to identify population subgroups that are relatively homogeneous with respect to an outcome. The nature of these subgroups can provide insight into effect mechanisms and suggest targets for tailored interventions. However, identifying relevant subgroups can be challenging with standard statistical methods. We review the literature on decision trees, a family of techniques for partitioning the population, on the basis of covariates, into distinct subgroups who share similar values of an outcome variable. We compare two decision tree methods, the popular Classification and Regression tree (CART) technique and the newer Conditional Inference tree (CTree) technique, assessing their performance in a simulation study and using data from the Box Lunch Study, a randomized controlled trial of a portion size intervention. Both CART and CTree identify homogeneous population subgroups and offer improved prediction accuracy relative to regression-based approaches when subgroups are truly present in the data. An important distinction between CART and CTree is that the latter uses a formal statistical hypothesis testing framework in building decision trees, which simplifies the process of identifying and interpreting the final tree model. We also introduce a novel way to visualize the subgroups defined by decision trees. Our novel graphical visualization provides a more scientifically meaningful characterization of the subgroups identified by decision trees. Decision trees are a useful tool for identifying homogeneous subgroups defined by combinations of individual characteristics. While all decision tree techniques generate subgroups, we advocate the use of the newer CTree technique due to its simplicity and ease of interpretation.

  16. Extensions and applications of ensemble-of-trees methods in machine learning

    NASA Astrophysics Data System (ADS)

    Bleich, Justin

    Ensemble-of-trees algorithms have emerged to the forefront of machine learning due to their ability to generate high forecasting accuracy for a wide array of regression and classification problems. Classic ensemble methodologies such as random forests (RF) and stochastic gradient boosting (SGB) rely on algorithmic procedures to generate fits to data. In contrast, more recent ensemble techniques such as Bayesian Additive Regression Trees (BART) and Dynamic Trees (DT) focus on an underlying Bayesian probability model to generate the fits. These new probability model-based approaches show much promise versus their algorithmic counterparts, but also offer substantial room for improvement. The first part of this thesis focuses on methodological advances for ensemble-of-trees techniques with an emphasis on the more recent Bayesian approaches. In particular, we focus on extensions of BART in four distinct ways. First, we develop a more robust implementation of BART for both research and application. We then develop a principled approach to variable selection for BART as well as the ability to naturally incorporate prior information on important covariates into the algorithm. Next, we propose a method for handling missing data that relies on the recursive structure of decision trees and does not require imputation. Last, we relax the assumption of homoskedasticity in the BART model to allow for parametric modeling of heteroskedasticity. The second part of this thesis returns to the classic algorithmic approaches in the context of classification problems with asymmetric costs of forecasting errors. First we consider the performance of RF and SGB more broadly and demonstrate its superiority to logistic regression for applications in criminology with asymmetric costs. Next, we use RF to forecast unplanned hospital readmissions upon patient discharge with asymmetric costs taken into account. Finally, we explore the construction of stable decision trees for forecasts of

  17. Weighing risk factors associated with bee colony collapse disorder by classification and regression tree analysis.

    PubMed

    VanEngelsdorp, Dennis; Speybroeck, Niko; Evans, Jay D; Nguyen, Bach Kim; Mullin, Chris; Frazier, Maryann; Frazier, Jim; Cox-Foster, Diana; Chen, Yanping; Tarpy, David R; Haubruge, Eric; Pettis, Jeffrey S; Saegerman, Claude

    2010-10-01

    Colony collapse disorder (CCD), a syndrome whose defining trait is the rapid loss of adult worker honey bees, Apis mellifera L., is thought to be responsible for a minority of the large overwintering losses experienced by U.S. beekeepers since the winter 2006-2007. Using the same data set developed to perform a monofactorial analysis (PloS ONE 4: e6481, 2009), we conducted a classification and regression tree (CART) analysis in an attempt to better understand the relative importance and interrelations among different risk variables in explaining CCD. Fifty-five exploratory variables were used to construct two CART models: one model with and one model without a cost of misclassifying a CCD-diagnosed colony as a non-CCD colony. The resulting model tree that permitted for misclassification had a sensitivity and specificity of 85 and 74%, respectively. Although factors measuring colony stress (e.g., adult bee physiological measures, such as fluctuating asymmetry or mass of head) were important discriminating values, six of the 19 variables having the greatest discriminatory value were pesticide levels in different hive matrices. Notably, coumaphos levels in brood (a miticide commonly used by beekeepers) had the highest discriminatory value and were highest in control (healthy) colonies. Our CART analysis provides evidence that CCD is probably the result of several factors acting in concert, making afflicted colonies more susceptible to disease. This analysis highlights several areas that warrant further attention, including the effect of sublethal pesticide exposure on pathogen prevalence and the role of variability in bee tolerance to pesticides on colony survivorship.

  18. Modeling Caribbean tree stem diameters from tree height and crown width measurements

    Treesearch

    Thomas Brandeis; KaDonna Randolph; Mike Strub

    2009-01-01

    Regression models to predict diameter at breast height (DBH) as a function of tree height and maximum crown radius were developed for Caribbean forests based on data collected by the U.S. Forest Service in the Commonwealth of Puerto Rico and Territory of the U.S. Virgin Islands. The model predicting DBH from tree height fit reasonably well (R2 = 0.7110), with...

  19. Online monitoring and conditional regression tree test: Useful tools for a better understanding of combined sewer network behavior.

    PubMed

    Bersinger, T; Bareille, G; Pigot, T; Bru, N; Le Hécho, I

    2018-06-01

    A good knowledge of the dynamics of pollutant concentration and flux in a combined sewer network is necessary when considering solutions to limit the pollutants discharged by combined sewer overflow (CSO) into receiving water during wet weather. Identification of the parameters that influence pollutant concentration and flux is important. Nevertheless, few studies have obtained satisfactory results for the identification of these parameters using statistical tools. Thus, this work uses a large database of rain events (116 over one year) obtained via continuous measurement of rainfall, discharge flow and chemical oxygen demand (COD) estimated using online turbidity for the identification of these parameters. We carried out a statistical study of the parameters influencing the maximum COD concentration, the discharge flow and the discharge COD flux. In this study a new test was used that has never been used in this field: the conditional regression tree test. We have demonstrated that the antecedent dry weather period, the rain event average intensity and the flow before the event are the three main factors influencing the maximum COD concentration during a rainfall event. The discharge flow is mainly influenced by the overall rainfall height but not by the maximum rainfall intensity. Finally, COD discharge flux is influenced by the discharge volume and the maximum COD concentration. Regression trees seem much more appropriate than common tests like PCA and PLS for this type of study as they take into account the thresholds and cumulative effects of various parameters as a function of the target variable. These results could help to improve sewer and CSO management in order to decrease the discharge of pollutants into receiving waters. Copyright © 2017 Elsevier B.V. All rights reserved.

  20. [Application of SAS macro to evaluated multiplicative and additive interaction in logistic and Cox regression in clinical practices].

    PubMed

    Nie, Z Q; Ou, Y Q; Zhuang, J; Qu, Y J; Mai, J Z; Chen, J M; Liu, X Q

    2016-05-01

    Conditional and unconditional logistic regression analyses are commonly used in case-control studies, while the Cox proportional hazards model is often used in survival data analysis. Most of the literature refers only to main-effect models; however, generalized linear models differ from general linear models, and interaction comprises multiplicative interaction and additive interaction. The former has only statistical significance, whereas the latter has biological significance. In this paper, macros were written in SAS 9.4 to calculate the contrast ratio, the attributable proportion due to interaction and the synergy index alongside the interaction terms of logistic and Cox regression, and Wald, delta and profile likelihood confidence intervals were used to evaluate additive interaction, for reference in big data analysis in clinical epidemiology and in the analysis of genetic multiplicative and additive interactions.
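
    The three additive-interaction measures named in this abstract can be illustrated with a short Python sketch. It assumes that "contrast ratio" refers to the relative excess risk due to interaction (RERI) and uses the commonly cited definitions based on three relative risks (or odds or hazard ratios) against the doubly unexposed reference group. The function name and the input values are hypothetical, and the confidence intervals computed by the SAS macros are not reproduced here.

```python
def additive_interaction(rr10, rr01, rr11):
    """Return (RERI, AP, S) for two binary exposures A and B.

    rr10: relative risk for A only, rr01: for B only, rr11: for both,
    each relative to the group exposed to neither (hypothetical inputs).
    """
    reri = rr11 - rr10 - rr01 + 1.0                    # relative excess risk due to interaction
    ap = reri / rr11                                   # attributable proportion due to interaction
    s = (rr11 - 1.0) / ((rr10 - 1.0) + (rr01 - 1.0))   # synergy index
    return reri, ap, s


# Illustrative numbers only, not taken from the paper.
reri, ap, s = additive_interaction(rr10=2.0, rr01=3.0, rr11=7.0)
print(f"RERI={reri:.2f}  AP={ap:.3f}  S={s:.2f}")
```

    Under these definitions, no additive interaction corresponds to RERI = 0, AP = 0 and S = 1.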

  1. Using boosted regression trees to predict the near-saturated hydraulic conductivity of undisturbed soils

    NASA Astrophysics Data System (ADS)

    Koestel, John; Bechtold, Michel; Jorda, Helena; Jarvis, Nicholas

    2015-04-01

    The saturated and near-saturated hydraulic conductivity of soil is of key importance for modelling water and solute fluxes in the vadose zone. Hydraulic conductivity measurements are cumbersome at the Darcy scale and practically impossible at larger scales where water and solute transport models are mostly applied. Hydraulic conductivity must therefore be estimated from proxy variables. Such pedotransfer functions are known to work decently well for e.g. water retention curves but rather poorly for near-saturated and saturated hydraulic conductivities. Recently, Weynants et al. (2009, Revisiting Vereecken pedotransfer functions: Introducing a closed-form hydraulic model. Vadose Zone Journal, 8, 86-95) reported a coefficient of determination of 0.25 (validation with an independent data set) for the saturated hydraulic conductivity from lab-measurements of Belgian soil samples. In our study, we trained boosted regression trees on a global meta-database containing tension-disk infiltrometer data (see Jarvis et al. 2013. Influence of soil, land use and climatic factors on the hydraulic conductivity of soil. Hydrology & Earth System Sciences, 17, 5185-5195) to predict the saturated hydraulic conductivity (Ks) and the conductivity at a tension of 10 cm (K10). We found coefficients of determination of 0.39 and 0.62 under a simple 10-fold cross-validation for Ks and K10. When carrying out the validation folded over the data-sources, i.e. the source publications, we found that the corresponding coefficients of determination reduced to 0.15 and 0.36, respectively. We conclude that the stricter source-wise cross-validation should be applied in future pedotransfer studies to prevent overly optimistic validation results. The boosted regression trees also allowed for an investigation of relevant predictors for estimating the near-saturated hydraulic conductivity. We found that land use and bulk density were most important to predict Ks. We also observed that Ks is large in fine
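
    The difference between simple and source-wise cross-validation described in this abstract comes down to how samples are assigned to folds. The Python sketch below is an illustration under stated assumptions, not the authors' procedure: the function names, the greedy balancing rule and the source labels are all hypothetical. It shows a plain shuffled k-fold split next to a split in which all samples from one source publication stay in the same fold.

```python
import random
from collections import defaultdict


def kfold_indices(n, k, seed=0):
    """Simple k-fold: shuffle sample indices and deal them into k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def sourcewise_folds(sources, k):
    """Source-wise k-fold: samples sharing a source label never straddle folds."""
    by_source = defaultdict(list)
    for i, label in enumerate(sources):
        by_source[label].append(i)
    folds = [[] for _ in range(k)]
    # Greedy balancing (an assumption): largest sources first, smallest fold next.
    for members in sorted(by_source.values(), key=len, reverse=True):
        min(folds, key=len).extend(members)
    return folds


# Hypothetical source-publication label for each of 12 samples.
sources = ["A", "A", "A", "B", "B", "C", "C", "C", "C", "D", "E", "E"]
folds = sourcewise_folds(sources, k=3)
for test_idx in folds:
    held_out = sorted({sources[i] for i in test_idx})
    print(f"test fold of size {len(test_idx)} holds sources {held_out}")
```

    With the simple split, samples from one publication can appear in both the training and the test fold, which is one plausible reason for the more optimistic scores the abstract reports for that scheme.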

  2. Effects of aluminum and iron nanoparticle additives on composite AP/HTPB solid propellant regression rate

    NASA Astrophysics Data System (ADS)

    Styborski, Jeremy A.

    This project was started in the interest of supplementing existing data on additives to composite solid propellants. The study on the addition of iron and aluminum nanoparticles to composite AP/HTPB propellants was conducted at the Combustion and Energy Systems Laboratory at RPI in the new strand-burner experiment setup. For this study, a large literature review was conducted on history of solid propellant combustion modeling and the empirical results of tests on binders, plasticizers, AP particle size, and additives. The study focused on the addition of nano-scale aluminum and iron in small concentrations to AP/HTPB solid propellants with an average AP particle size of 200 microns. Replacing 1% of the propellant's AP with 40-60 nm aluminum particles produced no change in combustive behavior. The addition of 1% 60-80 nm iron particles produced a significant increase in burn rate, although the increase was lesser at higher pressures. These results are summarized in Table 2. The increase in the burn rate at all pressures due to the addition of iron nanoparticles warranted further study on the effect of concentration of iron. Tests conducted at 10 atm showed that the mean regression rate varied with iron concentration, peaking at 1% and 3%. Regardless of the iron concentration, the regression rate was higher than the baseline AP/HTPB propellants. These results are summarized in Table 3.

  3. Ecological Factors of Being Bullied Among Adolescents: a Classification and Regression Tree Approach

    PubMed Central

    Moon, Sung Seek; Kim, Heeyoung; Seay, Kristen; Small, Eusebius; Kim, Youn Kyoung

    2015-01-01

    Being bullied is a well-recognized trauma for adolescents. Bullying can best be understood through an ecological framework since bullying or being bullied involves risk factors at multiple contextual levels. The purpose of the study was to identify the risk and protective factors that best differentiate groups along with the outcome variable of interest (being bullied) using Classification and Regression Tree (CART) analysis. The study used the Health Behavior in School-Aged Children (HBSC) data collected from a nationally representative sample of students in grades six through ten during the 2005–2006 school year. This study identified that for adolescents 12 and younger, lower parental support is a critical risk factor associated with bullying, and among those 13 to 14 with lower parental support, adolescents with higher academic pressure reported experiencing more bullying. For the older group of adolescents (aged 15 and older), school-related factors were identified to increase the risk level of being bullied. There was a critical age (15 years old) for implementing victimization interventions to reduce the damage from being bullied. Service providers working with adolescents aged 14 and less should focus more on family-oriented intervention and those working with adolescents aged 15 and more should offer peer- or school-related interventions. PMID:27617043

  4. Prediction of cadmium enrichment in reclaimed coastal soils by classification and regression tree

    NASA Astrophysics Data System (ADS)

    Ru, Feng; Yin, Aijing; Jin, Jiaxin; Zhang, Xiuying; Yang, Xiaohui; Zhang, Ming; Gao, Chao

    2016-08-01

    Reclamation of coastal land is one of the most common ways to obtain land resources in China. However, it has long been acknowledged that the artificial interference with coastal land has disadvantageous effects, such as heavy metal contamination. This study aimed to develop a prediction model for cadmium enrichment levels and assess the importance of affecting factors in typical reclaimed land in Eastern China (DFCL: Dafeng Coastal Land). Two hundred and twenty seven surficial soil/sediment samples were collected and analyzed to identify the enrichment levels of cadmium and the possible affecting factors in soils and sediments. The classification and regression tree (CART) model was applied in this study to predict cadmium enrichment levels. The prediction results showed that cadmium enrichment levels assessed by the CART model had an accuracy of 78.0%. The CART model could extract more information on factors affecting the environmental behavior of cadmium than correlation analysis. The integration of correlation analysis and the CART model showed that fertilizer application and organic carbon accumulation were the most important factors affecting soil/sediment cadmium enrichment levels, followed by particle size effects (Al2O3, TFe2O3 and SiO2), contents of Cl and S, surrounding construction areas and reclamation history.

  5. Using nonlinear quantile regression to estimate the self-thinning boundary curve

    Treesearch

    Quang V. Cao; Thomas J. Dean

    2015-01-01

    The relationship between tree size (quadratic mean diameter) and tree density (number of trees per unit area) has been a topic of research and discussion for many decades. Starting with Reineke in 1933, the maximum size-density relationship, on a log-log scale, has been assumed to be linear. Several techniques, including linear quantile regression, have been employed...

  6. Forest inventory predictions from individual tree crowns: regression modeling within a sample framework

    Treesearch

    James W. Flewelling

    2009-01-01

    Remotely sensed data can be used to make digital maps showing individual tree crowns (ITC) for entire forests. Attributes of the ITCs may include area, shape, height, and color. The crown map is sampled in a way that provides an unbiased linkage between ITCs and identifiable trees measured on the ground. Methods of avoiding edge bias are given. In an example from a...

  7. Generalized and synthetic regression estimators for randomized branch sampling

    Treesearch

    David L. R. Affleck; Timothy G. Gregoire

    2015-01-01

    In felled-tree studies, ratio and regression estimators are commonly used to convert more readily measured branch characteristics to dry crown mass estimates. In some cases, data from multiple trees are pooled to form these estimates. This research evaluates the utility of both tactics in the estimation of crown biomass following randomized branch sampling (...

  8. The limits to tree height.

    PubMed

    Koch, George W; Sillett, Stephen C; Jennings, Gregory M; Davis, Stephen D

    2004-04-22

    Trees grow tall where resources are abundant, stresses are minor, and competition for light places a premium on height growth. The height to which trees can grow and the biophysical determinants of maximum height are poorly understood. Some models predict heights of up to 120 m in the absence of mechanical damage, but there are historical accounts of taller trees. Current hypotheses of height limitation focus on increasing water transport constraints in taller trees and the resulting reductions in leaf photosynthesis. We studied redwoods (Sequoia sempervirens), including the tallest known tree on Earth (112.7 m), in wet temperate forests of northern California. Our regression analyses of height gradients in leaf functional characteristics estimate a maximum tree height of 122-130 m barring mechanical damage, similar to the tallest recorded trees of the past. As trees grow taller, increasing leaf water stress due to gravity and path length resistance may ultimately limit leaf expansion and photosynthesis for further height growth, even with ample soil moisture.

  9. Estimating Dbh of Trees Employing Multiple Linear Regression of the best Lidar-Derived Parameter Combination Automated in Python in a Natural Broadleaf Forest in the Philippines

    NASA Astrophysics Data System (ADS)

    Ibanez, C. A. G.; Carcellar, B. G., III; Paringit, E. C.; Argamosa, R. J. L.; Faelga, R. A. G.; Posilero, M. A. V.; Zaragosa, G. P.; Dimayacyac, N. A.

    2016-06-01

    Diameter-at-breast-height (DBH) estimation is a prerequisite in various allometric equations estimating important forestry indices like stem volume, basal area, biomass and carbon stock. LiDAR technology can directly obtain different forest parameters, except DBH, from the behavior and characteristics of the point cloud, which are unique to different forest classes. An extensive tree inventory was done on a two-hectare established sample plot in Mt. Makiling, Laguna for a natural growth forest. Coordinates, height, and canopy cover were measured and species were identified for comparison with LiDAR derivatives. Multiple linear regression was used to get LiDAR-derived DBH by integrating field-derived DBH and 27 LiDAR-derived parameters at 20 m, 10 m, and 5 m grid resolutions. To find the best combination of parameters for DBH estimation, all possible combinations of parameters were generated automatically using Python scripts, with additional regression-related libraries such as NumPy, SciPy, and scikit-learn. The combination that yields the highest r-squared (coefficient of determination) and the lowest AIC (Akaike's Information Criterion) and BIC (Bayesian Information Criterion) was determined to be the best equation. The equation is at its best using 11 parameters at a 10 m grid size, with an r-squared of 0.604, an AIC of 154.04 and a BIC of 175.08. The combination of parameters may differ among forest classes, which is left for further studies. Additional statistical tests, such as the Kaiser-Meyer-Olkin (KMO) coefficient and Bartlett's Test of Sphericity (BTS), can help determine the correlation among parameters.
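
    The exhaustive search over parameter combinations described in this abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' script: it uses only the standard library instead of the libraries named above, the predictor names and data are hypothetical, and candidates are ranked by AIC and then BIC only. AIC and BIC are written in the Gaussian least-squares form up to an additive constant, so their absolute values are not comparable with those reported in the abstract.

```python
import math
from itertools import combinations


def ols_rss(X, y):
    """Residual sum of squares of a least-squares fit with an intercept."""
    A = [[1.0] + list(row) for row in X]            # design matrix with intercept column
    p = len(A[0])
    # Normal equations (A^T A) b = A^T y, stored as an augmented matrix.
    M = [[sum(r[i] * r[j] for r in A) for j in range(p)]
         + [sum(r[i] * t for r, t in zip(A, y))]
         for i in range(p)]
    for c in range(p):                              # Gauss-Jordan with partial pivoting
        piv = max(range(c, p), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(p):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    beta = [M[i][p] / M[i][i] for i in range(p)]
    return sum((t - sum(b * a for b, a in zip(beta, r))) ** 2 for r, t in zip(A, y))


def best_subset(X, y, names):
    """Fit every non-empty predictor subset; return the (AIC, BIC, names) minimum."""
    n = len(y)
    results = []
    for size in range(1, len(names) + 1):
        for cols in combinations(range(len(names)), size):
            rss = ols_rss([[row[c] for c in cols] for row in X], y)
            k = size + 1                            # fitted coefficients incl. intercept
            aic = n * math.log(rss / n) + 2 * k
            bic = n * math.log(rss / n) + k * math.log(n)
            results.append((aic, bic, [names[c] for c in cols]))
    return min(results)                             # lowest AIC, ties broken by BIC


# Hypothetical data: the response depends on "height" only; "cover" is irrelevant.
names = ["height", "cover"]
X = [[1, 1], [2, 1], [3, 0], [4, 0], [5, 1], [6, 1], [7, 0], [8, 0]]
noise = [0.5, -0.5, -0.5, 0.5, 0.5, -0.5, -0.5, 0.5]
y = [2.0 * row[0] + e for row, e in zip(X, noise)]
aic, bic, chosen = best_subset(X, y, names)
print(chosen, round(aic, 2), round(bic, 2))
```

    The number of subsets grows as 2^p - 1, so with 27 candidate parameters as in the abstract an exhaustive search fits on the order of 10^8 models per grid resolution.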

  10. Modelling the spatial distribution of Fasciola hepatica in bovines using decision tree, logistic regression and GIS query approaches for Brazil.

    PubMed

    Bennema, S C; Molento, M B; Scholte, R G; Carvalho, O S; Pritsch, I

    2017-11-01

    Fascioliasis is a condition caused by the trematode Fasciola hepatica. In this paper, the spatial distribution of F. hepatica in bovines in Brazil was modelled using a decision tree approach and a logistic regression, combined with a geographic information system (GIS) query. In the decision tree and the logistic model, isothermality had the strongest influence on disease prevalence. Also, the 50-year average precipitation in the warmest quarter of the year was included as a risk factor, having a negative influence on the parasite prevalence. The risk maps developed using both techniques showed a predicted higher prevalence mainly in the South of Brazil. The prediction performance seemed to be high, but both techniques failed to reach a high accuracy in predicting the medium and high prevalence classes for the entire country. The GIS query map, based on the range of isothermality, minimum temperature of coldest month, precipitation of warmest quarter of the year, altitude and the average daily land surface temperature, showed a possibility of presence of F. hepatica in a very large area. The risk maps produced using these methods can be used to focus activities of animal and public health programmes, even on non-evaluated F. hepatica areas.

  11. Log and tree sawing times for hardwood mills

    Treesearch

    Everette D. Rast

    1974-01-01

    Data on 6,850 logs and 1,181 trees were analyzed to predict sawing times. For both logs and trees, regression equations were derived that express (in minutes) sawing time per log or tree and per Mbf. For trees, merchantable height is expressed in number of logs as well as in feet. One of the major uses for the tables of average sawing times is as a bench mark against...

  12. Drought forecasting in eastern Australia using multivariate adaptive regression spline, least square support vector machine and M5Tree model

    NASA Astrophysics Data System (ADS)

    Deo, Ravinesh C.; Kisi, Ozgur; Singh, Vijay P.

    2017-02-01

    Drought forecasting using standardized metrics of rainfall is a core task in hydrology and water resources management. Standardized Precipitation Index (SPI) is a rainfall-based metric that caters for different time-scales at which the drought occurs, and due to its standardization, is well-suited for forecasting drought at different periods in climatically diverse regions. This study advances drought modelling using multivariate adaptive regression splines (MARS), least square support vector machine (LSSVM), and M5Tree models by forecasting SPI in eastern Australia. MARS model incorporated rainfall as mandatory predictor with month (periodicity), Southern Oscillation Index, Pacific Decadal Oscillation Index and Indian Ocean Dipole, ENSO Modoki and Nino 3.0, 3.4 and 4.0 data added gradually. The performance was evaluated with root mean square error (RMSE), mean absolute error (MAE), and coefficient of determination (r2). Best MARS model required different input combinations, where rainfall, sea surface temperature and periodicity were used for all stations, but ENSO Modoki and Pacific Decadal Oscillation indices were not required for Bathurst, Collarenebri and Yamba, and the Southern Oscillation Index was not required for Collarenebri. Inclusion of periodicity increased the r2 value by 0.5-8.1% and reduced RMSE by 3.0-178.5%. Comparisons showed that MARS superseded the performance of the other counterparts for three out of five stations with lower MAE by 15.0-73.9% and 7.3-42.2%, respectively. For the other stations, M5Tree was better than MARS/LSSVM with lower MAE by 13.8-13.4% and 25.7-52.2%, respectively, and for Bathurst, LSSVM yielded more accurate result. For droughts identified by SPI ≤ -0.5, accurate forecasts were attained by MARS/M5Tree for Bathurst, Yamba and Peak Hill, whereas for Collarenebri and Barraba, M5Tree was better than LSSVM/MARS. Seasonal analysis revealed disparate results where MARS/M5Tree was better than LSSVM. The results highlight the
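
    The three evaluation metrics named in this abstract can be written out as a small Python sketch. The function names and the sample values are hypothetical. The r2 function assumes the squared Pearson correlation between observed and forecast values, which is one common reading of "coefficient of determination" in hydrology; the abstract does not say which definition was used.

```python
import math


def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def r2(obs, pred):
    """Squared Pearson correlation (assumed definition, see lead-in)."""
    mo = sum(obs) / len(obs)
    mp = sum(pred) / len(pred)
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_p = sum((p - mp) ** 2 for p in pred)
    return cov * cov / (var_o * var_p)


# Illustrative values only: forecasts with a constant bias of +0.5.
observed = [1.0, 2.0, 3.0, 4.0]
forecast = [1.5, 2.5, 3.5, 4.5]
print(rmse(observed, forecast), mae(observed, forecast), r2(observed, forecast))
```

    In this example the constant bias shows up in RMSE and MAE but not in the squared correlation, which is one reason to report the error metrics alongside r2.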

  13. Multivariate regression model for partitioning tree volume of white oak into round-product classes

    Treesearch

    Daniel A. Yaussy; David L. Sonderman

    1984-01-01

    Describes the development of multivariate equations that predict the expected cubic volume of four round-product classes from independent variables composed of individual tree-quality characteristics. Although the model has limited application at this time, it does demonstrate the feasibility of partitioning total tree cubic volume into round-product classes based on...

  14. A hierarchical linear model for tree height prediction.

    Treesearch

    Vicente J. Monleon

    2003-01-01

    Measuring tree height is a time-consuming process. Often, tree diameter is measured and height is estimated from a published regression model. Trees used to develop these models are clustered into stands, but this structure is ignored and independence is assumed. In this study, hierarchical linear models that account explicitly for the clustered structure of the data...

  15. Estimating carbon and showing impacts of drought using satellite data in regression-tree models

    USGS Publications Warehouse

    Boyte, Stephen; Wylie, Bruce K.; Howard, Danny; Dahal, Devendra; Gilmanov, Tagir G.

    2018-01-01

    Integrating spatially explicit biogeophysical and remotely sensed data into regression-tree models enables the spatial extrapolation of training data over large geographic spaces, allowing a better understanding of broad-scale ecosystem processes. The current study presents annual gross primary production (GPP) and annual ecosystem respiration (RE) for 2000–2013 in several short-statured vegetation types using carbon flux data from towers that are located strategically across the conterminous United States (CONUS). We calculate carbon fluxes (annual net ecosystem production [NEP]) for each year in our study period, which includes 2012 when drought and higher-than-normal temperatures influence vegetation productivity in large parts of the study area. We present and analyse carbon flux dynamics in the CONUS to better understand how drought affects GPP, RE, and NEP. Model accuracy metrics show strong correlation coefficients (r) (r ≥ 94%) between training and estimated data for both GPP and RE. Overall, average annual GPP, RE, and NEP are relatively constant throughout the study period except during 2012 when almost 60% less carbon is sequestered than normal. These results allow us to conclude that this modelling method effectively estimates carbon dynamics through time and allows the exploration of impacts of meteorological anomalies and vegetation types on carbon dynamics.

  16. Is Susceptibility to Prenatal Methylmercury Exposure from Fish Consumption Non-Homogeneous? Tree-Structured Analysis for the Seychelles Child Development Study

    PubMed Central

    Huang, Li-Shan; Myers, Gary J.; Davidson, Philip W.; Cox, Christopher; Xiao, Fenyuan; Thurston, Sally W.; Cernichiari, Elsa; Shamlaye, Conrad F.; Sloane-Reeves, Jean; Georger, Lesley; Clarkson, Thomas W.

    2007-01-01

    Studies of the association between prenatal methylmercury exposure from maternal fish consumption during pregnancy and neurodevelopmental test scores in the Seychelles Child Development Study have found no consistent pattern of associations through age nine years. The analyses for the most recent nine-year data examined the population effects of prenatal exposure, but did not address the possibility of non-homogeneous susceptibility. This paper presents a regression tree approach: covariate effects are treated nonlinearly and non-additively and non-homogeneous effects of prenatal methylmercury exposure are permitted among the covariate clusters identified by the regression tree. The approach allows us to address whether children in the lower or higher ends of the developmental spectrum differ in susceptibility to subtle exposure effects. Of twenty-one endpoints available at age nine years, we chose the Wechsler Full Scale IQ and its associated covariates to construct the regression tree. The prenatal mercury effect in each of the nine resulting clusters was assessed linearly and non-homogeneously. In addition we reanalyzed five other nine-year endpoints that in the linear analysis had a two-tailed p-value <0.2 for the effect of prenatal exposure. In this analysis, motor proficiency and activity level improved significantly with increasing MeHg for 53% of the children who had an average home environment. Motor proficiency significantly decreased with increasing prenatal MeHg exposure in 7% of the children whose home environment was below average. The regression tree results support previous analyses of outcomes in this cohort. However, this analysis raises the intriguing possibility that an effect may be non-homogeneous among children with different backgrounds and IQ levels. PMID:17942158

  17. Is susceptibility to prenatal methylmercury exposure from fish consumption non-homogeneous? Tree-structured analysis for the Seychelles Child Development Study.

    PubMed

    Huang, Li-Shan; Myers, Gary J; Davidson, Philip W; Cox, Christopher; Xiao, Fenyuan; Thurston, Sally W; Cernichiari, Elsa; Shamlaye, Conrad F; Sloane-Reeves, Jean; Georger, Lesley; Clarkson, Thomas W

    2007-11-01

    Studies of the association between prenatal methylmercury exposure from maternal fish consumption during pregnancy and neurodevelopmental test scores in the Seychelles Child Development Study have found no consistent pattern of associations through age 9 years. The analyses for the most recent 9-year data examined the population effects of prenatal exposure, but did not address the possibility of non-homogeneous susceptibility. This paper presents a regression tree approach: covariate effects are treated non-linearly and non-additively and non-homogeneous effects of prenatal methylmercury exposure are permitted among the covariate clusters identified by the regression tree. The approach allows us to address whether children in the lower or higher ends of the developmental spectrum differ in susceptibility to subtle exposure effects. Of 21 endpoints available at age 9 years, we chose the Wechsler Full Scale IQ and its associated covariates to construct the regression tree. The prenatal mercury effect in each of the nine resulting clusters was assessed linearly and non-homogeneously. In addition we reanalyzed five other 9-year endpoints that in the linear analysis had a two-tailed p-value <0.2 for the effect of prenatal exposure. In this analysis, motor proficiency and activity level improved significantly with increasing MeHg for 53% of the children who had an average home environment. Motor proficiency significantly decreased with increasing prenatal MeHg exposure in 7% of the children whose home environment was below average. The regression tree results support previous analyses of outcomes in this cohort. However, this analysis raises the intriguing possibility that an effect may be non-homogeneous among children with different backgrounds and IQ levels.

  18. Predicting outcome on admission and post-admission for acetaminophen-induced acute liver failure using classification and regression tree models.

    PubMed

    Speiser, Jaime Lynn; Lee, William M; Karvellas, Constantine J

    2015-01-01

    Assessing prognosis for acetaminophen-induced acute liver failure (APAP-ALF) patients often presents significant challenges. King's College Criteria (KCC) have been validated on hospital admission, but little has been published on later phases of illness. We aimed to improve determinations of prognosis both at the time of and following admission for APAP-ALF using Classification and Regression Tree (CART) models. CART models were applied to US ALFSG registry data to predict 21-day death or liver transplant early (on admission) and post-admission (days 3-7) for 803 APAP-ALF patients enrolled 01/1998-09/2013. Accuracy in prediction of outcome (AC), sensitivity (SN), specificity (SP), and area under the receiver operating characteristic curve (AUROC) were compared between 3 models: KCC (INR, creatinine, coma grade, pH), CART analysis using only KCC variables (KCC-CART) and a CART model using new variables (NEW-CART). Traditional KCC yielded 69% AC, 90% SP, 27% SN, and 0.58 AUROC on admission, with similar performance post-admission. KCC-CART at admission offered predictive 66% AC, 65% SP, 67% SN, and 0.74 AUROC. Post-admission, KCC-CART had predictive 82% AC, 86% SP, 46% SN and 0.81 AUROC. NEW-CART models using MELD (Model for End-Stage Liver Disease), lactate and mechanical ventilation on admission yielded predictive 72% AC, 71% SP, 77% SN and AUROC 0.79. For later stages, NEW-CART (MELD, lactate, coma grade) offered predictive AC 86%, SP 91%, SN 46%, AUROC 0.73. CARTs offer simple prognostic models for APAP-ALF patients, which have higher AUROC and SN than KCC, with similar AC and negligibly worse SP. Admission and post-admission predictions were developed. • Prognostication in acetaminophen-induced acute liver failure (APAP-ALF) is challenging beyond admission • Little has been published regarding the use of King's College Criteria (KCC) beyond admission and KCC has shown limited sensitivity in subsequent studies • Classification and Regression Tree (CART) methodology allows the
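
    Accuracy, sensitivity and specificity as compared in this abstract all follow from the four cells of a binary confusion matrix. The Python sketch below shows the arithmetic on made-up labels; the function name and data are hypothetical, 1 is assumed to mark the adverse outcome (death or transplant), and AUROC is left out because it needs predicted scores rather than hard labels.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity from binary labels (1 = event)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)   # events correctly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)   # events missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)   # non-events correctly cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)   # false alarms
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Illustrative labels only, not registry data.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

    A rule can score well on accuracy and specificity while missing most events, which is the pattern the abstract reports for traditional KCC (90% SP, 27% SN).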

  19. Simple street tree sampling

    Treesearch

    David J. Nowak; Jeffrey T. Walton; James Baldwin; Jerry Bond

    2015-01-01

    Information on street trees is critical for management of this important resource. Sampling of street tree populations provides an efficient means to obtain street tree population information. Long-term repeat measures of street tree samples supply additional information on street tree changes and can be used to report damages from catastrophic events. Analyses of...

  20. Assessing the predictive capability of randomized tree-based ensembles in streamflow modelling

    NASA Astrophysics Data System (ADS)

    Galelli, S.; Castelletti, A.

    2013-02-01

    Combining randomization methods with ensemble prediction is emerging as an effective option to balance accuracy and computational efficiency in data-driven modeling. In this paper we investigate the prediction capability of extremely randomized trees (Extra-Trees), in terms of accuracy, explanation ability and computational efficiency, in a streamflow modeling exercise. Extra-Trees are a totally randomized tree-based ensemble method that (i) alleviates the poor generalization property and tendency to overfitting of traditional standalone decision trees (e.g. CART); (ii) is computationally very efficient; and, (iii) allows to infer the relative importance of the input variables, which might help in the ex-post physical interpretation of the model. The Extra-Trees potential is analyzed on two real-world case studies (Marina catchment (Singapore) and Canning River (Western Australia)) representing two different morphoclimatic contexts comparatively with other tree-based methods (CART and M5) and parametric data-driven approaches (ANNs and multiple linear regression). Results show that Extra-Trees perform comparatively well to the best of the benchmarks (i.e. M5) in both the watersheds, while outperforming the other approaches in terms of computational requirement when adopted on large datasets. In addition, the ranking of the input variable provided can be given a physically meaningful interpretation.

  1. Assessing the predictive capability of randomized tree-based ensembles in streamflow modelling

    NASA Astrophysics Data System (ADS)

    Galelli, S.; Castelletti, A.

    2013-07-01

    Combining randomization methods with ensemble prediction is emerging as an effective option to balance accuracy and computational efficiency in data-driven modelling. In this paper, we investigate the prediction capability of extremely randomized trees (Extra-Trees), in terms of accuracy, explanation ability and computational efficiency, in a streamflow modelling exercise. Extra-Trees are a totally randomized tree-based ensemble method that (i) alleviates the poor generalisation property and tendency to overfitting of traditional standalone decision trees (e.g. CART); (ii) is computationally efficient; and, (iii) allows to infer the relative importance of the input variables, which might help in the ex-post physical interpretation of the model. The Extra-Trees potential is analysed on two real-world case studies - Marina catchment (Singapore) and Canning River (Western Australia) - representing two different morphoclimatic contexts. The evaluation is performed against other tree-based methods (CART and M5) and parametric data-driven approaches (ANNs and multiple linear regression). Results show that Extra-Trees perform comparatively well to the best of the benchmarks (i.e. M5) in both the watersheds, while outperforming the other approaches in terms of computational requirement when adopted on large datasets. In addition, the ranking of the input variable provided can be given a physically meaningful interpretation.

  2. Method for estimating potential tree-grade distributions for northeastern forest species

    Treesearch

    Daniel A. Yaussy

    1993-01-01

    Generalized logistic regression was used to distribute trees into four potential tree grades for 20 northeastern species groups. The potential tree grade is defined as the tree grade based on the length and amount of clear cuttings and defects only, disregarding minimum grading diameter. The algorithms described use site index and tree diameter as the predictive...

  3. Bayesian Ensemble Trees (BET) for Clustering and Prediction in Heterogeneous Data

    PubMed Central

    Duan, Leo L.; Clancy, John P.; Szczesniak, Rhonda D.

    2016-01-01

    We propose a novel “tree-averaging” model that utilizes the ensemble of classification and regression trees (CART). Each constituent tree is estimated with a subset of similar data. We treat this grouping of subsets as Bayesian Ensemble Trees (BET) and model them as a Dirichlet process. We show that BET determines the optimal number of trees by adapting to the data heterogeneity. Compared with the other ensemble methods, BET requires much fewer trees and shows equivalent prediction accuracy using weighted averaging. Moreover, each tree in BET provides variable selection criterion and interpretation for each subset. We developed an efficient estimating procedure with improved estimation strategies in both CART and mixture models. We demonstrate these advantages of BET with simulations and illustrate the approach with a real-world data example involving regression of lung function measurements obtained from patients with cystic fibrosis. Supplemental materials are available online. PMID:27524872

  4. Classification and regression tree (CART) analyses of genomic signatures reveal sets of tetramers that discriminate temperature optima of archaea and bacteria

    PubMed Central

    Dyer, Betsey D.; Kahn, Michael J.; LeBlanc, Mark D.

    2008-01-01

    Classification and regression tree (CART) analysis was applied to genome-wide tetranucleotide frequencies (genomic signatures) of 195 archaea and bacteria. Although genomic signatures have typically been used to classify evolutionary divergence, in this study, convergent evolution was the focus. Temperature optima for most of the organisms examined could be distinguished by CART analyses of tetranucleotide frequencies. This suggests that pervasive (nonlinear) qualities of genomes may reflect certain environmental conditions (such as temperature) in which those genomes evolved. The predominant use of GAGA and AGGA as the discriminating tetramers in CART models suggests that purine-loading and codon biases of thermophiles may explain some of the results. PMID:19054742

  5. Applying additive logistic regression to data derived from sensors monitoring behavioral and physiological characteristics of dairy cows to detect lameness.

    PubMed

    Kamphuis, C; Frank, E; Burke, J K; Verkerk, G A; Jago, J G

    2013-01-01

    The hypothesis was that sensors currently available on farm that monitor behavioral and physiological characteristics have potential for the detection of lameness in dairy cows. This was tested by applying additive logistic regression to variables derived from sensor data. Data were collected between November 2010 and June 2012 on 5 commercial pasture-based dairy farms. Sensor data from weigh scales (liveweight), pedometers (activity), and milk meters (milking order, unadjusted and adjusted milk yield in the first 2 min of milking, total milk yield, and milking duration) were collected at every milking from 4,904 cows. Lameness events were recorded by farmers who were trained in detecting lameness before the study commenced. A total of 318 lameness events affecting 292 cows were available for statistical analyses. For each lameness event, the lame cow's sensor data for a time period of 14 d before observation date were randomly matched by farm and date to 10 healthy cows (i.e., cows that were not lame and had no other health event recorded for the matched time period). Sensor data relating to the 14-d time periods were used for developing univariable (using one source of sensor data) and multivariable (using multiple sources of sensor data) models. Model development involved the use of additive logistic regression by applying the LogitBoost algorithm with a regression tree as base learner. The model's output was a probability estimate for lameness, given the sensor data collected during the 14-d time period. Models were validated using leave-one-farm-out cross-validation and, as a result of this validation, each cow in the data set (318 lame and 3,180 nonlame cows) received a probability estimate for lameness. Based on the area under the curve (AUC), results indicated that univariable models had low predictive potential, with the highest AUC values found for liveweight (AUC=0.66), activity (AUC=0.60), and milking order (AUC=0.65). Combining these 3 sensors improved

  6. Finding structure in data using multivariate tree boosting

    PubMed Central

    Miller, Patrick J.; Lubke, Gitta H.; McArtor, Daniel B.; Bergeman, C. S.

    2016-01-01

    Technology and collaboration enable dramatic increases in the size of psychological and psychiatric data collections, but finding structure in these large data sets with many collected variables is challenging. Decision tree ensembles such as random forests (Strobl, Malley, & Tutz, 2009) are a useful tool for finding structure, but are difficult to interpret with multiple outcome variables which are often of interest in psychology. To find and interpret structure in data sets with multiple outcomes and many predictors (possibly exceeding the sample size), we introduce a multivariate extension to a decision tree ensemble method called gradient boosted regression trees (Friedman, 2001). Our extension, multivariate tree boosting, is a method for nonparametric regression that is useful for identifying important predictors, detecting predictors with nonlinear effects and interactions without specification of such effects, and for identifying predictors that cause two or more outcome variables to covary. We provide the R package ‘mvtboost’ to estimate, tune, and interpret the resulting model, which extends the implementation of univariate boosting in the R package ‘gbm’ (Ridgeway et al., 2015) to continuous, multivariate outcomes. To illustrate the approach, we analyze predictors of psychological well-being (Ryff & Keyes, 1995). Simulations verify that our approach identifies predictors with nonlinear effects and achieves high prediction accuracy, exceeding or matching the performance of (penalized) multivariate multiple regression and multivariate decision trees over a wide range of conditions. PMID:27918183
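
    The additive structure behind gradient boosted regression trees can be illustrated with a minimal sketch. It assumes a single predictor, a single outcome, squared-error loss, and depth-1 trees (stumps); it is not the 'mvtboost' or 'gbm' implementation, and the names fit_stump, boost, and predict_boosted are hypothetical. The predictor must take at least two distinct values.

```python
from statistics import mean


def fit_stump(x, residuals):
    """Return (threshold, left_mean, right_mean) minimising squared error."""
    best = None
    for threshold in sorted(set(x))[:-1]:  # every cut between distinct x values
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean, right_mean = mean(left), mean(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


def predict_stump(stump, xi):
    threshold, left_mean, right_mean = stump
    return left_mean if xi <= threshold else right_mean


def boost(x, y, n_trees=50, learning_rate=0.1):
    """Fit a constant plus a sum of shrunken stumps to (x, y)."""
    base = mean(y)
    fitted = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        # For squared-error loss the negative gradient is the plain residual.
        residuals = [yi - fi for yi, fi in zip(y, fitted)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        fitted = [fi + learning_rate * predict_stump(stump, xi)
                  for fi, xi in zip(fitted, x)]
    return base, learning_rate, stumps


def predict_boosted(model, xi):
    """Evaluate the additive model at a single predictor value."""
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, xi) for s in stumps)
```

    Each round fits a small tree to what the current model still gets wrong and adds a shrunken copy of it, so the learning rate and the number of trees together control how closely the model follows the training data.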

  7. Regression trees modeling and forecasting of PM10 air pollution in urban areas

    NASA Astrophysics Data System (ADS)

    Stoimenova, M.; Voynikova, D.; Ivanov, A.; Gocheva-Ilieva, S.; Iliev, I.

    2017-10-01

    Fine particulate matter (PM10) air pollution is a serious problem affecting the health of the population in many Bulgarian cities. As an example, the object of this study is the pollution with PM10 of the town of Pleven, Northern Bulgaria. The measured concentrations of this air pollutant for this city consistently exceeded the permissible limits set by European and national legislation. Based on data for the last 6 years (2011-2016), the analysis shows that this applies both to the daily limit of 50 micrograms per cubic meter and to the allowable number of daily concentration exceedances of 35 per year. Also, the average annual concentration of PM10 exceeded the prescribed norm of no more than 40 micrograms per cubic meter. The aim of this work is to build high-performance mathematical models for effective prediction and forecasting of the level of PM10 pollution. The study was conducted with the powerful flexible data mining technique Classification and Regression Trees (CART). The values of PM10 were fitted with respect to meteorological data such as maximum and minimum air temperature, relative humidity, wind speed and direction and others, as well as with time and autoregressive variables. As a result the obtained CART models demonstrate high predictive ability and fit the actual data with up to 80%. The best models were applied for forecasting the pollution level for 3 to 7 days ahead. An interpretation of the modeling results is presented.

  8. Bayesian structured additive regression modeling of epidemic data: application to cholera

    PubMed Central

    2012-01-01

    Background: A significant interest in spatial epidemiology lies in identifying associated risk factors that enhance the risk of infection. Most studies, however, make no, or only limited, use of the spatial structure of the data, as well as possible nonlinear effects of the risk factors. Methods: We develop a Bayesian Structured Additive Regression model for cholera epidemic data. Model estimation and inference are based on a fully Bayesian approach via Markov Chain Monte Carlo (MCMC) simulations. The model is applied to cholera epidemic data in the Kumasi Metropolis, Ghana. Proximity to refuse dumps, density of refuse dumps, and proximity to potential cholera reservoirs were modeled as continuous functions; presence of slum settlers and population density were modeled as fixed effects, whereas spatial references to the communities were modeled as structured and unstructured spatial effects. Results: We observe that the risk of cholera is associated with slum settlements and high population density. The risk of cholera is equal and lower for communities with fewer refuse dumps, but variable and higher for communities with more refuse dumps. The risk is also lower for communities distant from refuse dumps and potential cholera reservoirs. The results also indicate distinct spatial variation in the risk of cholera infection. Conclusion: The study highlights the usefulness of Bayesian semi-parametric regression models in analyzing public health data. These findings could serve as novel information to help health planners and policy makers in making effective decisions to control or prevent cholera epidemics. PMID:22866662

  9. Predicting 30-day Hospital Readmission with Publicly Available Administrative Database. A Conditional Logistic Regression Modeling Approach.

    PubMed

    Zhu, K; Lou, Z; Zhou, J; Ballester, N; Kong, N; Parikh, P

    2015-01-01

    more than 10% over the standard classification models, which can be translated to correct labeling of an additional 400 - 500 readmissions for heart failure patients in the state of California over a year. Lastly, several key predictors identified from the HCUP data include the disposition location from discharge, the number of chronic conditions, and the number of acute procedures. It would be beneficial to apply simple decision rules obtained from the decision tree in an ad-hoc manner to guide the cohort stratification. It could be potentially beneficial to explore the effect of pairwise interactions between influential predictors when building the logistic regression models for different data strata. Judicious use of the ad-hoc CLR models developed offers insights into future development of prediction models for hospital readmissions, which can lead to better intuition in identifying high-risk patients and developing effective post-discharge care strategies. Lastly, this paper is expected to raise the awareness of collecting data on additional markers and developing necessary database infrastructure for larger-scale exploratory studies on readmission risk prediction.

  10. Interactive Tree Of Life v2: online annotation and display of phylogenetic trees made easy.

    PubMed

    Letunic, Ivica; Bork, Peer

    2011-07-01

    Interactive Tree Of Life (http://itol.embl.de) is a web-based tool for the display, manipulation and annotation of phylogenetic trees. It is freely available and open to everyone. In addition to classical tree viewer functions, iTOL offers many novel ways of annotating trees with various additional data. Current version introduces numerous new features and greatly expands the number of supported data set types. Trees can be interactively manipulated and edited. A free personal account system is available, providing management and sharing of trees in user defined workspaces and projects. Export to various bitmap and vector graphics formats is supported. Batch access interface is available for programmatic access or inclusion of interactive trees into other web services.

  11. Assessing visual green effects of individual urban trees using airborne Lidar data.

    PubMed

    Chen, Ziyue; Xu, Bing; Gao, Bingbo

    2015-12-01

    Urban trees benefit people's daily life in terms of air quality, local climate, recreation and aesthetics. Among these functions, a growing number of studies have been conducted to understand the relationship between residents' preference towards local environments and visual green effects of urban greenery. However, except for on-site photography, there are few quantitative methods to calculate green visibility, especially tree green visibility, from viewers' perspectives. To fill this research gap, a case study was conducted in the city of Cambridge, which has a diversity of tree species, sizes and shapes. Firstly, a photograph-based survey was conducted to approximate the actual value of visual green effects of individual urban trees. In addition, small footprint airborne Lidar (Light detection and ranging) data was employed to measure the size and shape of individual trees. Next, correlations between visual tree green effects and tree structural parameters were examined. Through experiments and gradual refinement, a regression model with satisfactory R2 and limited large errors is proposed. Considering the diversity of sample trees and the result of cross-validation, this model has the potential to be applied to other study sites. This research provides urban planners and decision makers with an innovative method to analyse and evaluate landscape patterns in terms of tree greenness. Copyright © 2015 Elsevier B.V. All rights reserved.

  12. Decision tree modeling using R.

    PubMed

    Zhang, Zhongheng

    2016-08-01

    In the machine learning field, the decision tree learner is powerful and easy to interpret. It employs a recursive binary partitioning algorithm that splits the sample on the partitioning variable with the strongest association with the response variable. The process continues until some stopping criteria are met. In the example, I focus on the conditional inference tree, which incorporates tree-structured regression models into conditional inference procedures. Because a single tree is sensitive to small changes in the training data, the random forests procedure is introduced to address this problem. The sources of diversity for random forests come from random sampling and the restricted set of input variables available for selection. Finally, I introduce R functions to perform model-based recursive partitioning. This method incorporates recursive partitioning into conventional parametric model building.
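
    Recursive binary partitioning with stopping criteria can be sketched as follows. Note the substitution: a conditional inference tree chooses the split variable with association tests, whereas this sketch uses the simpler CART-style rule of minimising the children's summed squared error. The names sse, best_split, grow, and predict_tree, and the defaults max_depth=3 and min_size=2, are hypothetical.

```python
from statistics import mean


def sse(values):
    """Sum of squared deviations from the node mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(rows, y, min_size):
    """Return (feature index, threshold) that most reduces SSE, or None."""
    best, best_score = None, sse(y)
    for j in range(len(rows[0])):
        for threshold in sorted({row[j] for row in rows})[:-1]:
            left = [yi for row, yi in zip(rows, y) if row[j] <= threshold]
            right = [yi for row, yi in zip(rows, y) if row[j] > threshold]
            if len(left) < min_size or len(right) < min_size:
                continue  # stopping criterion: children must not be too small
            score = sse(left) + sse(right)
            if score < best_score:
                best, best_score = (j, threshold), score
    return best


def grow(rows, y, depth=0, max_depth=3, min_size=2):
    """Split recursively until no admissible split improves the fit."""
    split = best_split(rows, y, min_size) if depth < max_depth else None
    if split is None:
        return {"leaf": mean(y)}
    j, threshold = split
    left = [(row, yi) for row, yi in zip(rows, y) if row[j] <= threshold]
    right = [(row, yi) for row, yi in zip(rows, y) if row[j] > threshold]
    return {
        "feature": j,
        "threshold": threshold,
        "left": grow([r for r, _ in left], [v for _, v in left],
                     depth + 1, max_depth, min_size),
        "right": grow([r for r, _ in right], [v for _, v in right],
                      depth + 1, max_depth, min_size),
    }


def predict_tree(tree, row):
    """Follow the splits down to a leaf and return its mean response."""
    while "leaf" not in tree:
        go_left = row[tree["feature"]] <= tree["threshold"]
        tree = tree["left"] if go_left else tree["right"]
    return tree["leaf"]
```

    The nested dictionary returned by grow can be read directly as a set of decision rules, which is the ease of interpretation the abstract refers to.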

  13. Fault-Tree Compiler

    NASA Technical Reports Server (NTRS)

    Butler, Ricky W.; Boerschlein, David P.

    1993-01-01

    Fault-Tree Compiler (FTC) program is software tool used to calculate probability of top event in fault tree. Gates of five different types allowed in fault tree: AND, OR, EXCLUSIVE OR, INVERT, and M OF N. High-level input language easy to understand and use. In addition, program supports hierarchical fault-tree definition feature, which simplifies tree-description process and reduces execution time. Set of programs created forming basis for reliability-analysis workstation: SURE, ASSIST, PAWS/STEM, and FTC fault-tree tool (LAR-14586). Written in PASCAL, ANSI-compliant C language, and FORTRAN 77. Other versions available upon request.
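
    To illustrate how a top-event probability follows from the five gate types named above, here is a minimal sketch. It is not the FTC algorithm or its input language. It assumes every gate input is statistically independent (no basic event shared between branches), treats EXCLUSIVE OR as a two-input gate, and reads M OF N as "at least M of the N inputs occur"; all function names are hypothetical.

```python
from math import prod


def p_and(probs):
    """All inputs occur."""
    return prod(probs)


def p_or(probs):
    """At least one input occurs."""
    return 1.0 - prod(1.0 - p for p in probs)


def p_xor(p1, p2):
    """Exactly one of two inputs occurs."""
    return p1 * (1.0 - p2) + p2 * (1.0 - p1)


def p_invert(p):
    """The input does not occur."""
    return 1.0 - p


def p_m_of_n(m, probs):
    """At least m of the inputs occur (inputs may have unequal probabilities)."""
    dist = [1.0]  # dist[k] = probability that exactly k inputs have occurred
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for k, q in enumerate(dist):
            nxt[k] += q * (1.0 - p)   # this input does not occur
            nxt[k + 1] += q * p       # this input occurs
        dist = nxt
    return sum(dist[m:])


def example_top_event():
    """Top event = OR(AND(a, b), 2-of-3(c, d, e)) with made-up probabilities."""
    return p_or([p_and([0.1, 0.2]), p_m_of_n(2, [0.1, 0.1, 0.1])])
```

    A hierarchical tree is evaluated bottom-up by feeding each gate's result into its parent gate, as example_top_event does for a two-level tree.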

  14. Regression with Small Data Sets: A Case Study using Code Surrogates in Additive Manufacturing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kamath, C.; Fan, Y. J.

    There has been an increasing interest in recent years in the mining of massive data sets whose sizes are measured in terabytes. While it is easy to collect such large data sets in some application domains, there are others where collecting even a single data point can be very expensive, so the resulting data sets have only tens or hundreds of samples. For example, when complex computer simulations are used to understand a scientific phenomenon, we want to run the simulation for many different values of the input parameters and analyze the resulting output. The data set relating the simulation inputs and outputs is typically quite small, especially when each run of the simulation is expensive. However, regression techniques can still be used on such data sets to build an inexpensive "surrogate" that could provide an approximate output for a given set of inputs. A good surrogate can be very useful in sensitivity analysis, uncertainty analysis, and in designing experiments. In this paper, we compare different regression techniques to determine how well they predict melt-pool characteristics in the problem domain of additive manufacturing. Our analysis indicates that some of the commonly used regression methods do perform quite well even on small data sets.

  15. Trees grow on money: urban tree canopy cover and environmental justice.

    PubMed

    Schwarz, Kirsten; Fragkias, Michail; Boone, Christopher G; Zhou, Weiqi; McHale, Melissa; Grove, J Morgan; O'Neil-Dunne, Jarlath; McFadden, Joseph P; Buckley, Geoffrey L; Childers, Dan; Ogden, Laura; Pincetl, Stephanie; Pataki, Diane; Whitmer, Ali; Cadenasso, Mary L

    2015-01-01

    This study examines the distributional equity of urban tree canopy (UTC) cover for Baltimore, MD, Los Angeles, CA, New York, NY, Philadelphia, PA, Raleigh, NC, Sacramento, CA, and Washington, D.C. using high spatial resolution land cover data and census data. Data are analyzed at the Census Block Group levels using Spearman's correlation, ordinary least squares regression (OLS), and a spatial autoregressive model (SAR). Across all cities there is a strong positive correlation between UTC cover and median household income. Negative correlations between race and UTC cover exist in bivariate models for some cities, but they are generally not observed using multivariate regressions that include additional variables on income, education, and housing age. SAR models result in higher r-square values compared to the OLS models across all cities, suggesting that spatial autocorrelation is an important feature of our data. Similarities among cities can be found based on shared characteristics of climate, race/ethnicity, and size. Our findings suggest that a suite of variables, including income, contribute to the distribution of UTC cover. These findings can help target simultaneous strategies for UTC goals and environmental justice concerns.

  16. Trees Grow on Money: Urban Tree Canopy Cover and Environmental Justice

    PubMed Central

    Schwarz, Kirsten; Fragkias, Michail; Boone, Christopher G.; Zhou, Weiqi; McHale, Melissa; Grove, J. Morgan; O’Neil-Dunne, Jarlath; McFadden, Joseph P.; Buckley, Geoffrey L.; Childers, Dan; Ogden, Laura; Pincetl, Stephanie; Pataki, Diane; Whitmer, Ali; Cadenasso, Mary L.

    2015-01-01

    This study examines the distributional equity of urban tree canopy (UTC) cover for Baltimore, MD, Los Angeles, CA, New York, NY, Philadelphia, PA, Raleigh, NC, Sacramento, CA, and Washington, D.C. using high spatial resolution land cover data and census data. Data are analyzed at the Census Block Group levels using Spearman’s correlation, ordinary least squares regression (OLS), and a spatial autoregressive model (SAR). Across all cities there is a strong positive correlation between UTC cover and median household income. Negative correlations between race and UTC cover exist in bivariate models for some cities, but they are generally not observed using multivariate regressions that include additional variables on income, education, and housing age. SAR models result in higher r-square values compared to the OLS models across all cities, suggesting that spatial autocorrelation is an important feature of our data. Similarities among cities can be found based on shared characteristics of climate, race/ethnicity, and size. Our findings suggest that a suite of variables, including income, contribute to the distribution of UTC cover. These findings can help target simultaneous strategies for UTC goals and environmental justice concerns. PMID:25830303

  17. Regression estimators for late-instar gypsy moth larvae at low population densities

    Treesearch

    W.E. Wallner; A.S. Devito; Stanley J. Zarnoch

    1989-01-01

    Two regression estimators were developed for determining densities of late-instar gypsy moth, Lymantria dispar (Lepidoptera: Lymantriidae), larvae from burlap band and pyrethrin spray counts on oak trees in Vermont, Massachusetts, Connecticut, and New York. Studies were conducted by marking larvae on individual burlap banded trees within 15...

  18. Diameter-growth model across shortleaf pine range using regression tree analysis

    Treesearch

    Daniel Yaussy; Louis Iverson; Anantha Prasad

    1999-01-01

    Diameter growth of a tree in most gap-phase models is limited by light, nutrients, moisture, and temperature. Growing-season temperature is represented by growing degree days (gdd), which is the sum of the average daily temperatures above a baseline temperature. Gap-phase models determine the north-south range of a species by the gdd limits at the north and south...
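
    The growing degree days definition above can be written out as a short sketch. It assumes one common reading of that sentence, in which each day contributes the amount by which its mean temperature exceeds the baseline and days at or below the baseline contribute nothing; the function name and the sample values in the note below are hypothetical.

```python
def growing_degree_days(daily_mean_temps, base_temp):
    """Sum each day's excess of mean temperature over the baseline."""
    return sum(max(0.0, t - base_temp) for t in daily_mean_temps)
```

    For daily means of 3, 5, 8 and 12 degrees with a baseline of 5 degrees, only the last two days contribute (3 and 7 degree days).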

  19. Trees in the city: valuing street trees in Portland, Oregon

    Treesearch

    G.H. Donovan; D.T. Butry

    2010-01-01

    We use a hedonic price model to simultaneously estimate the effects of street trees on the sales price and the time-on-market (TOM) of houses in Portland, Oregon. On average, street trees add $8,870 to sales price and reduce TOM by 1.7 days. In addition, we found that the benefits of street trees spill over to neighboring houses. Because the provision and maintenance...

  20. An introduction to tree-structured modeling with application to quality of life data.

    PubMed

    Su, Xiaogang; Azuero, Andres; Cho, June; Kvale, Elizabeth; Meneses, Karen M; McNees, M Patrick

    2011-01-01

    Investigators addressing nursing research are faced increasingly with the need to analyze data that involve variables of mixed types and are characterized by complex nonlinearity and interactions. Tree-based methods, also called recursive partitioning, are gaining popularity in various fields. In addition to efficiency and flexibility in handling multifaceted data, tree-based methods offer ease of interpretation. The aims of this study were to introduce tree-based methods, discuss their advantages and pitfalls in application, and describe their potential use in nursing research. In this article, (a) an introduction to tree-structured methods is presented, (b) the technique is illustrated via quality of life (QOL) data collected in the Breast Cancer Education Intervention study, and (c) implications for their potential use in nursing research are discussed. As illustrated by the QOL analysis example, tree methods generate interesting and easily understood findings that cannot be uncovered via traditional linear regression analysis. The expanding breadth and complexity of nursing research may entail the use of new tools to improve efficiency and gain new insights. In certain situations, tree-based methods offer an attractive approach that helps address such needs.

  1. Differentiating regressed melanoma from regressed lichenoid keratosis.

    PubMed

    Chan, Aegean H; Shulman, Kenneth J; Lee, Bonnie A

    2017-04-01

    Distinguishing regressed lichen planus-like keratosis (LPLK) from regressed melanoma can be difficult on histopathologic examination, potentially resulting in mismanagement of patients. We aimed to identify histopathologic features by which regressed melanoma can be differentiated from regressed LPLK. Twenty actively inflamed LPLK, 12 LPLK with regression and 15 melanomas with regression were compared and evaluated by hematoxylin and eosin staining as well as Melan-A, microphthalmia transcription factor (MiTF) and cytokeratin (AE1/AE3) immunostaining. (1) A total of 40% of regressed melanomas showed complete or near complete loss of melanocytes within the epidermis with Melan-A and MiTF immunostaining, while 8% of regressed LPLK exhibited this finding. (2) Necrotic keratinocytes were seen in the epidermis in 33% regressed melanomas as opposed to all of the regressed LPLK. (3) A dense infiltrate of melanophages in the papillary dermis was seen in 40% of regressed melanomas, a feature not seen in regressed LPLK. In summary, our findings suggest that a complete or near complete loss of melanocytes within the epidermis strongly favors a regressed melanoma over a regressed LPLK. In addition, necrotic epidermal keratinocytes and the presence of a dense band-like distribution of dermal melanophages can be helpful in differentiating these lesions. © 2016 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  2. Risk Factors Predicting Infectious Lactational Mastitis: Decision Tree Approach versus Logistic Regression Analysis.

    PubMed

    Fernández, Leónides; Mediano, Pilar; García, Ricardo; Rodríguez, Juan M; Marín, María

    2016-09-01

    Objectives Lactational mastitis frequently leads to a premature abandonment of breastfeeding; its development has been associated with several risk factors. This study aims to use a decision tree (DT) approach to establish the main risk factors involved in mastitis and to compare its performance for predicting this condition with a stepwise logistic regression (LR) model. Methods Data from 368 cases (breastfeeding women with mastitis) and 148 controls were collected by a questionnaire about risk factors related to medical history of mother and infant, pregnancy, delivery, postpartum, and breastfeeding practices. The performance of the DT and LR analyses was compared using the area under the receiver operating characteristic (ROC) curve. Sensitivity, specificity and accuracy of both models were calculated. Results Cracked nipples, antibiotics and antifungal drugs during breastfeeding, infant age, breast pumps, familial history of mastitis and throat infection were significant risk factors associated with mastitis in both analyses. Bottle-feeding and milk supply were related to mastitis for certain subgroups in the DT model. The areas under the ROC curves were similar for LR and DT models (0.870 and 0.835, respectively). The LR model had better classification accuracy and sensitivity than the DT model, but the last one presented better specificity at the optimal threshold of each curve. Conclusions The DT and LR models constitute useful and complementary analytical tools to assess the risk of lactational infectious mastitis. The DT approach identifies high-risk subpopulations that need specific mastitis prevention programs and, therefore, it could be used to make the most of public health resources.
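
    The comparison measures used in this abstract (sensitivity, specificity, accuracy, and area under the ROC curve) can be computed as in the sketch below. It assumes outcomes coded 1 for cases and 0 for controls, and it computes the AUC through its rank interpretation: the probability that a randomly chosen case receives a higher score than a randomly chosen control, with ties counted as one half. The function names are hypothetical and the study's own software is not specified here.

```python
def confusion_metrics(y_true, y_pred):
    """Sensitivity, specificity and accuracy from 0/1 labels and predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),  # share of cases correctly flagged
        "specificity": tn / (tn + fp),  # share of controls correctly cleared
        "accuracy": (tp + tn) / len(y_true),
    }


def auc(y_true, scores):
    """Area under the ROC curve via pairwise case/control comparisons."""
    cases = [s for t, s in zip(y_true, scores) if t == 1]
    controls = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if c > k else 0.5 if c == k else 0.0
               for c in cases for k in controls)
    return wins / (len(cases) * len(controls))
```

    Sensitivity, specificity, and accuracy depend on the probability threshold used to turn scores into predictions, whereas the AUC summarises performance across all thresholds, which is why two models can have similar AUCs yet differ in sensitivity and specificity at their chosen cut-offs.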

  3. DupTree: a program for large-scale phylogenetic analyses using gene tree parsimony.

    PubMed

    Wehe, André; Bansal, Mukul S; Burleigh, J Gordon; Eulenstein, Oliver

    2008-07-01

    DupTree is a new software program for inferring rooted species trees from collections of gene trees using the gene tree parsimony approach. The program implements a novel algorithm that significantly improves upon the run time of standard search heuristics for gene tree parsimony, and enables the first truly genome-scale phylogenetic analyses. In addition, DupTree allows users to examine alternate rootings and to weight the reconciliation costs for gene trees. DupTree is an open source project written in C++. DupTree for Mac OS X, Windows, and Linux along with a sample dataset and an on-line manual are available at http://genome.cs.iastate.edu/CBL/DupTree

  4. Using multiobjective tradeoff sets and Multivariate Regression Trees to identify critical and robust decisions for long term water utility planning

    NASA Astrophysics Data System (ADS)

    Smith, R.; Kasprzyk, J. R.; Balaji, R.

    2017-12-01

    In light of deeply uncertain factors like future climate change and population shifts, responsible resource management will require new types of information and strategies. For water utilities, this entails potential expansion and efficient management of water supply infrastructure systems for changes in overall supply; changes in frequency and severity of climate extremes such as droughts and floods; and variable demands, all while accounting for conflicting long and short term performance objectives. Multiobjective Evolutionary Algorithms (MOEAs) are emerging decision support tools that have been used by researchers and, more recently, water utilities to efficiently generate and evaluate thousands of planning portfolios. The tradeoffs between conflicting objectives are explored in an automated way to produce (often large) suites of portfolios that strike different balances of performance. Once generated, the sets of optimized portfolios are used to support relatively subjective assertions of priorities and human reasoning, leading to adoption of a plan. These large tradeoff sets contain information about complex relationships between decisions and between groups of decisions and performance that, until now, has not been quantitatively described. We present a novel use of Multivariate Regression Trees (MRTs) to analyze tradeoff sets to reveal these relationships and critical decisions. Additionally, when MRTs are applied to tradeoff sets developed for different realizations of an uncertain future, they can identify decisions that are robust across a wide range of conditions and produce fundamental insights about the system being optimized.

  5. The allometry of coarse root biomass: log-transformed linear regression or nonlinear regression?

    PubMed

    Lai, Jiangshan; Yang, Bo; Lin, Dunmei; Kerkhoff, Andrew J; Ma, Keping

    2013-01-01

    Precise estimation of root biomass is important for understanding carbon stocks and dynamics in forests. Traditionally, biomass estimates are based on allometric scaling relationships between stem diameter and coarse root biomass calculated using linear regression (LR) on log-transformed data. Recently, it has been suggested that nonlinear regression (NLR) is a preferable fitting method for scaling relationships. But while this claim has been contested on both theoretical and empirical grounds, and statistical methods have been developed to aid in choosing between the two methods in particular cases, few studies have examined the ramifications of erroneously applying NLR. Here, we use direct measurements of 159 trees belonging to three locally dominant species in east China to compare the LR and NLR models of diameter-root biomass allometry. We then contrast model predictions by estimating stand coarse root biomass based on census data from the nearby 24-ha Gutianshan forest plot and by testing the ability of the models to predict known root biomass values measured on multiple tropical species at the Pasoh Forest Reserve in Malaysia. Based on likelihood estimates for model error distributions, as well as the accuracy of extrapolative predictions, we find that LR on log-transformed data is superior to NLR for fitting diameter-root biomass scaling models. More importantly, inappropriately using NLR leads to grossly inaccurate stand biomass estimates, especially for stands dominated by smaller trees.
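    The log-transformed linear regression (LR) described in this record can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the usual power-law form B = a * D**b, fits it by ordinary least squares on ln(D) and ln(B), and uses made-up, noise-free data. The function names `fit_loglog` and `predict` are hypothetical, and no back-transformation bias correction is applied.

```python
import math

def fit_loglog(diameters, biomasses):
    """Fit ln(B) = ln(a) + b * ln(D) by ordinary least squares; return (a, b)."""
    xs = [math.log(d) for d in diameters]
    ys = [math.log(m) for m in biomasses]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                      # scaling exponent
    a = math.exp(mean_y - b * mean_x)  # back-transformed intercept (no bias correction)
    return a, b

def predict(a, b, diameter):
    """Predict biomass from diameter with the fitted power law."""
    return a * diameter ** b

# Hypothetical data generated exactly from B = 0.05 * D**2.4 (illustration only).
diam = [5.0, 10.0, 20.0, 40.0]
mass = [0.05 * d ** 2.4 for d in diam]
a, b = fit_loglog(diam, mass)
```

    With real measurements the residuals on the log scale would be checked before choosing between LR and NLR, which is the comparison the abstract reports.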

  6. Tree STEM and Canopy Biomass Estimates from Terrestrial Laser Scanning Data

    NASA Astrophysics Data System (ADS)

    Olofsson, K.; Holmgren, J.

    2017-10-01

    In this study an automatic method for estimating both the tree stem and the tree canopy biomass is presented. The point cloud tree extraction techniques operate on TLS data and model the biomass using the estimated stem and canopy volume as independent variables. The regression model fit error is of the order of less than 5 kg, which gives a relative model error of about 5 % for the stem estimate and 10-15 % for the spruce and pine canopy biomass estimates. The canopy biomass estimate was improved by separating the models by tree species, which indicates that the method is allometry dependent and that the regression models need to be recomputed for different areas with different climate and different vegetation.

  7. Does the high–tech industry consistently reduce CO2 emissions? Results from nonparametric additive regression model

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Xu, Bin; Research Center of Applied Statistics, Jiangxi University of Finance and Economics, Nanchang, Jiangxi 330013; Lin, Boqiang, E-mail: bqlin@xmu.edu.cn

    China is currently the world's largest carbon dioxide (CO2) emitter. Moreover, total energy consumption and CO2 emissions in China will continue to increase due to the rapid growth of industrialization and urbanization. Therefore, vigorously developing the high–tech industry becomes an inevitable choice to reduce CO2 emissions at the moment or in the future. However, ignoring the existing nonlinear links between economic variables, most scholars use traditional linear models to explore the impact of the high–tech industry on CO2 emissions from an aggregate perspective. Few studies have focused on nonlinear relationships and regional differences in China. Based on panel data of 1998–2014, this study uses the nonparametric additive regression model to explore the nonlinear effect of the high–tech industry from a regional perspective. The estimated results show that the residual sum of squares (SSR) of the nonparametric additive regression model in the eastern, central and western regions are 0.693, 0.054 and 0.085 respectively, which are much less than those of the traditional linear regression model (3.158, 4.227 and 7.196). This verifies that the nonparametric additive regression model has a better fitting effect. Specifically, the high–tech industry produces an inverted “U–shaped” nonlinear impact on CO2 emissions in the eastern region, but a positive “U–shaped” nonlinear effect in the central and western regions. Therefore, the nonlinear impact of the high–tech industry on CO2 emissions in the three regions should be given adequate attention in developing effective abatement policies. - Highlights: • The nonlinear effect of the high–tech industry on CO2 emissions was investigated. • The high–tech industry yields an inverted “U–shaped” effect in the eastern region. • The high–tech industry has a positive “U–shaped” nonlinear effect in other regions. • The linear

  8. Use of generalized regression tree models to characterize vegetation favoring Anopheles albimanus breeding.

    PubMed

    Hernandez, J E; Epstein, L D; Rodriguez, M H; Rodriguez, A D; Rejmankova, E; Roberts, D R

    1997-03-01

    We propose the use of generalized tree models (GTMs) to analyze data from entomological field studies. Generalized tree models can be used to characterize environments with different mosquito breeding capacity. A GTM simultaneously analyzes a set of predictor variables (e.g., vegetation coverage) in relation to a response variable (e.g., counts of Anopheles albimanus larvae), and how it varies with respect to a set of criterion variables (e.g., presence of predators). The algorithm produces a treelike graphical display with its root at the top and 2 branches stemming down from each node. At each node, conditions on the value of predictors partition the observations into subgroups (environments) in which the relation between response and criterion variables is most homogeneous.

  9. Propensity score estimation: machine learning and classification methods as alternatives to logistic regression

    PubMed Central

    Westreich, Daniel; Lessler, Justin; Funk, Michele Jonsson

    2010-01-01

    Objective: Propensity scores for the analysis of observational data are typically estimated using logistic regression. Our objective in this Review was to assess machine learning alternatives to logistic regression which may accomplish the same goals but with fewer assumptions or greater accuracy. Study Design and Setting: We identified alternative methods for propensity score estimation and/or classification from the public health, biostatistics, discrete mathematics, and computer science literature, and evaluated these algorithms for applicability to the problem of propensity score estimation, potential advantages over logistic regression, and ease of use. Results: We identified four techniques as alternatives to logistic regression: neural networks, support vector machines, decision trees (CART), and meta-classifiers (in particular, boosting). Conclusion: While the assumptions of logistic regression are well understood, those assumptions are frequently ignored. All four alternatives have advantages and disadvantages compared with logistic regression. Boosting (meta-classifiers) and to a lesser extent decision trees (particularly CART) appear to be most promising for use in the context of propensity score analysis, but extensive simulation studies are needed to establish their utility in practice. PMID:20630332

  10. Atlas of United States Trees, Volume 2: Alaska Trees and Common Shrubs.

    ERIC Educational Resources Information Center

    Viereck, Leslie A.; Little, Elbert L., Jr.

    This volume is the second in a series of atlases describing the natural distribution or range of native tree species in the United States. The 82 species maps include 32 of trees in Alaska, 6 of shrubs rarely reaching tree size, and 44 more of common shrubs. More than 20 additional maps summarize environmental factors and furnish general…

  11. Distribution of cavity trees in midwestern old-growth and second-growth forests

    Treesearch

    Zhaofei Fan; Stephen R. Shifley; Martin A. Spetich; Frank R., III Thompson; David R. Larsen

    2003-01-01

    We used classification and regression tree analysis to determine the primary variables associated with the occurrence of cavity trees and the hierarchical structure among those variables. We applied that information to develop logistic models predicting cavity tree probability as a function of diameter, species group, and decay class. Inventories of cavity abundance in...

  12. The influence of tree morphology on stemflow generation in a tropical lowland rainforest

    NASA Astrophysics Data System (ADS)

    Uber, Magdalena; Levia, Delphis F.; Zimmermann, Beate; Zimmermann, Alexander

    2014-05-01

    Even though stemflow usually accounts for only a small proportion of rainfall, it is an important point source of water and ion input to forest floors and may, for instance, influence soil moisture patterns and groundwater recharge. Previous studies showed that the generation of stemflow depends on a multitude of meteorological and biological factors. Interestingly, despite the tremendous progress in stemflow research during the last decades, it is still largely unknown which combination of tree characteristics determines stemflow volumes in species-rich tropical forests. This knowledge gap motivated us to analyse the influence of tree characteristics on stemflow volumes in a 1 hectare plot located in a Panamanian lowland rainforest. Our study comprised stemflow measurements in six randomly selected 10 m by 10 m subplots. In each subplot we measured stemflow of all trees with a diameter at breast height (DBH) > 5 cm on an event basis for a period of six weeks. Additionally, we identified all tree species and determined a set of tree characteristics including DBH, crown diameter, bark roughness, bark furrowing, epiphyte coverage, tree architecture, stem inclination, and crown position. During the sampling period, we collected 985 L of stemflow (0.98 % of total rainfall). Based on regression analyses and comparisons among plant functional groups we show that palms were most efficient in yielding stemflow due to their large inclined fronds. Trees with large emergent crowns also produced relatively large amounts of stemflow. Due to their abundance, understory trees contribute substantially to stemflow yield at the plot scale, though not at the individual scale. Even though parameters such as crown diameter, branch inclination and position of the crown influence stemflow generation to some extent, these parameters explain less than 30 % of the variation in stemflow volumes. In contrast to published results from temperate forests, we did not detect a negative correlation between bark roughness

  13. Predicting Potential Changes in Suitable Habitat and Distribution by 2100 for Tree Species of the Eastern United States

    Treesearch

    Louis R Iverson; Anantha M. Prasad; Mark W. Schwartz

    2005-01-01

    We predict current distribution and abundance for tree species present in eastern North America, and subsequently estimate potential suitable habitat for those species under a changed climate with 2 x CO2. We used a series of statistical models (i.e., Regression Tree Analysis (RTA), Multivariate Adaptive Regression Splines (MARS), Bagging Trees (...

  14. Trees of Yap: a field guide

    Treesearch

    Marjorie V. Cushing Falanruw

    2015-01-01

    Descriptions, drawings, and photographs are presented for trees found on the Yap Islands in the Federated States of Micronesia. Included are all recorded native trees and most introduced trees as well as new records of native and introduced trees. Additional information is provided on tree distribution, status, vernacular names in Micronesia, and English names when...

  15. City housing atmospheric pollutant impact on emergency visit for asthma: A classification and regression tree approach.

    PubMed

    Mazenq, Julie; Dubus, Jean-Christophe; Gaudart, Jean; Charpin, Denis; Viudes, Gilles; Noel, Guilhem

    2017-11-01

    Particulate matter, nitrogen dioxide (NO2) and ozone are recognized as the three pollutants that most significantly affect human health. Asthma is a multifactorial disease. However, the place of residence has rarely been investigated. We compared the impact of air pollution, measured near patients' homes, on emergency department (ED) visits for asthma or trauma (controls) within the Provence-Alpes-Côte-d'Azur region. Variables were selected using classification and regression trees on asthmatic and control population, 3-99 years, visiting ED from January 1 to December 31, 2013. Then in a nested case control study, randomization was based on the day of ED visit and on defined age groups. Pollution, meteorological, pollens and viral data measured that day were linked to the patient's ZIP code. A total of 794,884 visits were reported including 6250 for asthma and 278,192 for trauma. Factors associated with an excess risk of emergency visit for asthma included short-term exposure to NO2, female gender, high viral load and a combination of low temperature and high humidity. Short-term exposures to high NO2 concentrations, as assessed close to the homes of the patients, were significantly associated with asthma-related ED visits in children and adults. Copyright © 2017 Elsevier Ltd. All rights reserved.

  16. Distribution of cavity trees in midwestern old-growth and second-growth forests

    Treesearch

    Zhaofei Fan; Stephen R. Shifley; Martin A. Spetich; Frank R. Thompson; David R. Larsen

    2003-01-01

    We used classification and regression tree analysis to determine the primary variables associated with the occurrence of cavity trees and the hierarchical structure among those variables. We applied that information to develop logistic models predicting cavity tree probability as a function of diameter, species group, and decay class. Inventories of cavity abundance in...

  17. More Trees, More Poverty? The Socioeconomic Effects of Tree Plantations in Chile, 2001-2011

    NASA Astrophysics Data System (ADS)

    Andersson, Krister; Lawrence, Duncan; Zavaleta, Jennifer; Guariguata, Manuel R.

    2016-01-01

    Tree plantations play a controversial role in many nations' efforts to balance goals for economic development, ecological conservation, and social justice. This paper seeks to contribute to this debate by analyzing the socioeconomic impact of such plantations. We focus our study on Chile, a country that has experienced extraordinary growth of industrial tree plantations. Our analysis draws on a unique dataset with longitudinal observations collected in 180 municipal territories during 2001-2011. Employing panel data regression techniques, we find that growth in plantation area is associated with higher than average rates of poverty during this period.

  18. More Trees, More Poverty? The Socioeconomic Effects of Tree Plantations in Chile, 2001-2011.

    PubMed

    Andersson, Krister; Lawrence, Duncan; Zavaleta, Jennifer; Guariguata, Manuel R

    2016-01-01

    Tree plantations play a controversial role in many nations' efforts to balance goals for economic development, ecological conservation, and social justice. This paper seeks to contribute to this debate by analyzing the socioeconomic impact of such plantations. We focus our study on Chile, a country that has experienced extraordinary growth of industrial tree plantations. Our analysis draws on a unique dataset with longitudinal observations collected in 180 municipal territories during 2001-2011. Employing panel data regression techniques, we find that growth in plantation area is associated with higher than average rates of poverty during this period.

  19. On Tree-Based Phylogenetic Networks.

    PubMed

    Zhang, Louxin

    2016-07-01

    A large class of phylogenetic networks can be obtained from trees by the addition of horizontal edges between the tree edges. These networks are called tree-based networks. We present a simple necessary and sufficient condition for tree-based networks and prove that a universal tree-based network exists for any number of taxa that contains as its base every phylogenetic tree on the same set of taxa. This answers two problems posted by Francis and Steel recently. A byproduct is a computer program for generating random binary phylogenetic networks under the uniform distribution model.

  20. Developing Models to Forecast Sales of Natural Christmas Trees

    Treesearch

    Lawrence D. Garrett; Thomas H. Pendleton

    1977-01-01

    A study of practices for marketing Christmas trees in Winston-Salem, North Carolina, and Denver, Colorado, revealed that such factors as retail lot competition, tree price, consumer traffic, and consumer income were very important in determining a particular retailer's sales. Analyses of 4 years of market data were used in developing regression models for...

  1. Tree biomass in the Swiss landscape: nationwide modelling for improved accounting for forest and non-forest trees.

    PubMed

    Price, B; Gomez, A; Mathys, L; Gardi, O; Schellenberger, A; Ginzler, C; Thürig, E

    2017-03-01

    Trees outside forest (TOF) can perform a variety of social, economic and ecological functions including carbon sequestration. However, detailed quantification of tree biomass is usually limited to forest areas. Taking advantage of structural information available from stereo aerial imagery and airborne laser scanning (ALS), this research models tree biomass using national forest inventory data and linear least-square regression and applies the model both inside and outside of forest to create a nationwide model for tree biomass (above ground and below ground). Validation of the tree biomass model against TOF data within settlement areas shows relatively low model performance (R² of 0.44) but still a considerable improvement on current biomass estimates used for greenhouse gas inventory and carbon accounting. We demonstrate an efficient and easily implementable approach to modelling tree biomass across a large heterogeneous nationwide area. The model offers significant opportunity for improved estimates on land use combination categories (CC) where tree biomass has either not been included or only roughly estimated until now. The ALS biomass model also offers the advantage of providing greater spatial resolution and greater within CC spatial variability compared to the current nationwide estimates.

  2. Predicting tree species presence and basal area in Utah: A comparison of stochastic gradient boosting, generalized additive models, and tree-based methods

    USGS Publications Warehouse

    Moisen, Gretchen G.; Freeman, E.A.; Blackard, J.A.; Frescino, T.S.; Zimmermann, N.E.; Edwards, T.C.

    2006-01-01

    Many efforts are underway to produce broad-scale forest attribute maps by modelling forest class and structure variables collected in forest inventories as functions of satellite-based and biophysical information. Typically, variants of classification and regression trees implemented in Rulequest's See5 and Cubist (for binary and continuous responses, respectively) are the tools of choice in many of these applications. These tools are widely used in large remote sensing applications, but are not easily interpretable, do not have ties with survey estimation methods, and use proprietary unpublished algorithms. Consequently, three alternative modelling techniques were compared for mapping presence and basal area of 13 species located in the mountain ranges of Utah, USA. The modelling techniques compared included the widely used See5/Cubist, generalized additive models (GAMs), and stochastic gradient boosting (SGB). Model performance was evaluated using independent test data sets. Evaluation criteria for mapping species presence included specificity, sensitivity, Kappa, and area under the curve (AUC). Evaluation criteria for the continuous basal area variables included correlation and relative mean squared error. For predicting species presence (setting thresholds to maximize Kappa), SGB had higher values for the majority of the species for specificity and Kappa, while GAMs had higher values for the majority of the species for sensitivity. In evaluating resultant AUC values, GAM and/or SGB models had significantly better results than the See5 models where significant differences could be detected between models. For nine out of 13 species, basal area prediction results for all modelling techniques were poor (correlations less than 0.5 and relative mean squared errors greater than 0.8), but SGB provided the most stable predictions in these instances. SGB and Cubist performed equally well for modelling basal area for three species with moderate prediction success
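    The presence/absence criteria named in this record (sensitivity, specificity, Kappa, and a threshold chosen to maximize Kappa) can be illustrated with a short sketch. It is an assumption-laden example, not the study's procedure: the data are invented, the function names are hypothetical, candidate thresholds are simply the observed scores, and Kappa is computed as Cohen's kappa from a 2x2 confusion table.

```python
def confusion(y_true, scores, threshold):
    """Count TP, FP, TN, FN when predicting presence for score >= threshold."""
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, scores):
        pred = 1 if s >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, Cohen's kappa) for a 2x2 table."""
    n = tp + fp + tn + fn
    sens = tp / (tp + fn) if tp + fn else 0.0
    spec = tn / (tn + fp) if tn + fp else 0.0
    observed = (tp + tn) / n
    # Agreement expected by chance from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (observed - expected) / (1 - expected) if expected < 1 else 0.0
    return sens, spec, kappa

def best_threshold(y_true, scores):
    """Pick the observed score that maximizes kappa when used as the cutoff."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda t: metrics(*confusion(y_true, scores, t))[2])

# Invented presence (1) / absence (0) labels and model scores.
labels = [0, 0, 0, 1, 0, 1, 1]
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.7, 0.9]
cutoff = best_threshold(labels, scores)
sens, spec, kappa = metrics(*confusion(labels, scores, cutoff))
```

    In practice the threshold would be tuned on training data and the criteria reported on independent test data, as the abstract describes.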

  3. Tree nut allergens.

    PubMed

    Geiselhart, Sabine; Hoffmann-Sommergruber, Karin; Bublin, Merima

    2018-04-18

    Tree nuts are considered as part of a healthy diet due to their high nutritional quality. However, they are also a potent source of allergenic proteins inducing IgE mediated hypersensitivity often causing serious, life-threatening reactions. The reported prevalence of tree nut allergy is up to 4.9% worldwide. The general term "tree nuts" comprises a number of nuts, seeds, and drupes, derived from trees from different botanical families. For hazelnut and walnut several allergens have been identified which are already partly applied in component resolved diagnosis, while for other tree nuts such as macadamia, coconut, and Brazil nut only individual allergens were identified and data on additional allergenic proteins are missing. This review summarizes the current knowledge on tree nut allergens and describes their physicochemical and immunological characterization and clinical relevance. Copyright © 2018 The Authors. Published by Elsevier Ltd. All rights reserved.

  4. Current and Potential Tree Locations in Tree Line Ecotone of Changbai Mountains, Northeast China: The Controlling Effects of Topography

    PubMed Central

    Zong, Shengwei; Wu, Zhengfang; Xu, Jiawei; Li, Ming; Gao, Xiaofeng; He, Hongshi; Du, Haibo; Wang, Lei

    2014-01-01

    Tree line ecotone in the Changbai Mountains has undergone large changes in the past decades. Tree locations show variations on the four sides of the mountains, especially on the northern and western sides, which has not been fully explained. Previous studies attributed such variations to the variations in temperature. However, in this study, we hypothesized that topographic controls were responsible for causing the variations in the tree locations in tree line ecotone of the Changbai Mountains. To test the hypothesis, we used IKONOS images and WorldView-1 image to identify the tree locations and developed a logistic regression model using topographical variables to identify the dominant controls of the tree locations. The results showed that aspect, wetness, and slope were dominant controls for tree locations on western side of the mountains, whereas altitude, SPI, and aspect were the dominant factors on northern side. The upmost altitude a tree can currently reach was 2140 m asl on the northern side and 2060 m asl on western side. The model predicted results showed that habitats above the current tree line on the both sides were available for trees. Tree recruitments under the current tree line may take advantage of the available habitats at higher elevations based on the current tree location. Our research confirmed the controlling effects of topography on the tree locations in the tree line ecotone of Changbai Mountains and suggested that it was essential to assess the tree response to topography in the research of tree line ecotone. PMID:25170918

  5. Current and potential tree locations in tree line ecotone of Changbai Mountains, Northeast China: the controlling effects of topography.

    PubMed

    Zong, Shengwei; Wu, Zhengfang; Xu, Jiawei; Li, Ming; Gao, Xiaofeng; He, Hongshi; Du, Haibo; Wang, Lei

    2014-01-01

    Tree line ecotone in the Changbai Mountains has undergone large changes in the past decades. Tree locations show variations on the four sides of the mountains, especially on the northern and western sides, which has not been fully explained. Previous studies attributed such variations to the variations in temperature. However, in this study, we hypothesized that topographic controls were responsible for causing the variations in the tree locations in tree line ecotone of the Changbai Mountains. To test the hypothesis, we used IKONOS images and WorldView-1 image to identify the tree locations and developed a logistic regression model using topographical variables to identify the dominant controls of the tree locations. The results showed that aspect, wetness, and slope were dominant controls for tree locations on western side of the mountains, whereas altitude, SPI, and aspect were the dominant factors on northern side. The upmost altitude a tree can currently reach was 2140 m asl on the northern side and 2060 m asl on western side. The model predicted results showed that habitats above the current tree line on the both sides were available for trees. Tree recruitments under the current tree line may take advantage of the available habitats at higher elevations based on the current tree location. Our research confirmed the controlling effects of topography on the tree locations in the tree line ecotone of Changbai Mountains and suggested that it was essential to assess the tree response to topography in the research of tree line ecotone.

  6. Remeasuring tree heights on permanent plots using rectangular coordinates and one angle per tree

    Treesearch

    Robert L. Neal

    1973-01-01

    Heights of permanent sample trees with tops visible from any point can be measured from that point with any clinometer, measuring one vertical angle per tree. Two horizontal angles and one additional vertical angle per observation point are necessary to orient the point to the plot. Permanently recorded coordinates and elevations of tree locations are used with the...

  7. Modeling time-to-event (survival) data using classification tree analysis.

    PubMed

    Linden, Ariel; Yarnold, Paul R

    2017-12-01

    Time to the occurrence of an event is often studied in health research. Survival analysis differs from other designs in that follow-up times for individuals who do not experience the event by the end of the study (called censored) are accounted for in the analysis. Cox regression is the standard method for analysing censored data, but the assumptions required of these models are easily violated. In this paper, we introduce classification tree analysis (CTA) as a flexible alternative for modelling censored data. Classification tree analysis is a "decision-tree"-like classification model that provides parsimonious, transparent (ie, easy to visually display and interpret) decision rules that maximize predictive accuracy, derives exact P values via permutation tests, and evaluates model cross-generalizability. Using empirical data, we identify all statistically valid, reproducible, longitudinally consistent, and cross-generalizable CTA survival models and then compare their predictive accuracy to estimates derived via Cox regression and an unadjusted naïve model. Model performance is assessed using integrated Brier scores and a comparison between estimated survival curves. The Cox regression model best predicts average incidence of the outcome over time, whereas CTA survival models best predict either relatively high, or low, incidence of the outcome over time. Classification tree analysis survival models offer many advantages over Cox regression, such as explicit maximization of predictive accuracy, parsimony, statistical robustness, and transparency. Therefore, researchers interested in accurate prognoses and clear decision rules should consider developing models using the CTA-survival framework. © 2017 John Wiley & Sons, Ltd.

  8. Rate of tree carbon accumulation increases continuously with tree size.

    PubMed

    Stephenson, N L; Das, A J; Condit, R; Russo, S E; Baker, P J; Beckman, N G; Coomes, D A; Lines, E R; Morris, W K; Rüger, N; Alvarez, E; Blundo, C; Bunyavejchewin, S; Chuyong, G; Davies, S J; Duque, A; Ewango, C N; Flores, O; Franklin, J F; Grau, H R; Hao, Z; Harmon, M E; Hubbell, S P; Kenfack, D; Lin, Y; Makana, J-R; Malizia, A; Malizia, L R; Pabst, R J; Pongpattananurak, N; Su, S-H; Sun, I-F; Tan, S; Thomas, D; van Mantgem, P J; Wang, X; Wiser, S K; Zavala, M A

    2014-03-06

    Forests are major components of the global carbon cycle, providing substantial feedback to atmospheric greenhouse gas concentrations. Our ability to understand and predict changes in the forest carbon cycle--particularly net primary productivity and carbon storage--increasingly relies on models that represent biological processes across several scales of biological organization, from tree leaves to forest stands. Yet, despite advances in our understanding of productivity at the scales of leaves and stands, no consensus exists about the nature of productivity at the scale of the individual tree, in part because we lack a broad empirical assessment of whether rates of absolute tree mass growth (and thus carbon accumulation) decrease, remain constant, or increase as trees increase in size and age. Here we present a global analysis of 403 tropical and temperate tree species, showing that for most species mass growth rate increases continuously with tree size. Thus, large, old trees do not act simply as senescent carbon reservoirs but actively fix large amounts of carbon compared to smaller trees; at the extreme, a single big tree can add the same amount of carbon to the forest within a year as is contained in an entire mid-sized tree. The apparent paradoxes of individual tree growth increasing with tree size despite declining leaf-level and stand-level productivity can be explained, respectively, by increases in a tree's total leaf area that outpace declines in productivity per unit of leaf area and, among other factors, age-related reductions in population density. Our results resolve conflicting assumptions about the nature of tree growth, inform efforts to understand and model forest carbon dynamics, and have additional implications for theories of resource allocation and plant senescence.

  9. QSRR modeling for diverse drugs using different feature selection methods coupled with linear and nonlinear regressions.

    PubMed

    Goodarzi, Mohammad; Jensen, Richard; Vander Heyden, Yvan

    2012-12-01

    A Quantitative Structure-Retention Relationship (QSRR) is proposed to estimate the chromatographic retention of 83 diverse drugs on a Unisphere poly butadiene (PBD) column, using isocratic elutions at pH 11.7. Previous work has generated QSRR models for them using Classification And Regression Trees (CART). In this work, Ant Colony Optimization is used as a feature selection method to find the best molecular descriptors from a large pool. In addition, several other selection methods have been applied, such as Genetic Algorithms, Stepwise Regression and the Relief method, not only to evaluate Ant Colony Optimization as a feature selection method but also to investigate its ability to find the important descriptors in QSRR. Multiple Linear Regression (MLR) and Support Vector Machines (SVMs) were applied as linear and nonlinear regression methods, respectively, giving excellent correlation between the experimental, i.e. extrapolated to a mobile phase consisting of pure water, and predicted logarithms of the retention factors of the drugs (logk(w)). The overall best model was the SVM one built using descriptors selected by ACO. Copyright © 2012 Elsevier B.V. All rights reserved.

  10. Modelling daily dissolved oxygen concentration using least square support vector machine, multivariate adaptive regression splines and M5 model tree

    NASA Astrophysics Data System (ADS)

    Heddam, Salim; Kisi, Ozgur

    2018-04-01

    In the present study, three types of artificial intelligence techniques, least square support vector machine (LSSVM), multivariate adaptive regression splines (MARS) and M5 model tree (M5T), are applied for modeling daily dissolved oxygen (DO) concentration using several water quality variables as inputs. The DO concentration and water quality data from three stations operated by the United States Geological Survey (USGS) were used for developing the three models. The selected water quality data, consisting of daily measurements of water temperature (TE, °C), pH (std. unit), specific conductance (SC, μS/cm) and discharge (DI, cfs), were used as inputs to the LSSVM, MARS and M5T models. The three models were applied to each station separately and compared to each other. According to the results obtained, it was found that: (i) the DO concentration could be successfully estimated using the three models and (ii) the best model among all others differs from one station to another.

  11. Error analysis of leaf area estimates made from allometric regression models

    NASA Technical Reports Server (NTRS)

    Feiveson, A. H.; Chhikara, R. S.

    1986-01-01

    Biological net productivity, measured in terms of the change in biomass with time, affects global productivity and the quality of life through biochemical and hydrological cycles and by its effect on the overall energy balance. Estimating leaf area for large ecosystems is one of the more important means of monitoring this productivity. For a particular forest plot, the leaf area is often estimated by a two-stage process. In the first stage, known as dimension analysis, a small number of trees are felled so that their areas can be measured as accurately as possible. These leaf areas are then related to non-destructive, easily-measured features such as bole diameter and tree height, by using a regression model. In the second stage, the non-destructive features are measured for all or for a sample of trees in the plots and then used as input into the regression model to estimate the total leaf area. Because both stages of the estimation process are subject to error, it is difficult to evaluate the accuracy of the final plot leaf area estimates. This paper illustrates how a complete error analysis can be made, using an example from a study made on aspen trees in northern Minnesota. The study was a joint effort by NASA and the University of California at Santa Barbara known as COVER (Characterization of Vegetation with Remote Sensing).
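The two-stage estimate described in this abstract can be sketched as follows. This is a minimal illustration only: it assumes a log-log allometric model of the form leaf_area = a * diameter^b fitted by ordinary least squares, which the abstract does not specify, and the function names and data values are hypothetical rather than taken from the COVER study.

```python
import math

def fit_loglog(diameters, leaf_areas):
    """Stage 1: fit ln(area) = ln(a) + b * ln(d) by OLS on the felled trees."""
    xs = [math.log(d) for d in diameters]
    ys = [math.log(a) for a in leaf_areas]
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(my - b * mx)
    return a, b

def plot_leaf_area(diameters, a, b):
    """Stage 2: apply the fitted model to each measured tree and sum."""
    return sum(a * d ** b for d in diameters)

# Hypothetical felled-tree data generated exactly from area = 0.5 * d^2
felled_d = [5.0, 10.0, 15.0, 20.0]
felled_area = [0.5 * d ** 2 for d in felled_d]
a, b = fit_loglog(felled_d, felled_area)

# Hypothetical non-destructive measurements for the rest of the plot
total = plot_leaf_area([8.0, 12.0], a, b)
```

The sketch ignores both sources of error the paper is concerned with (regression error in stage 1 and sampling error in stage 2) as well as back-transformation bias from the log scale, so it shows only the point estimate, not the error analysis.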

  12. Our Air: Unfit for Trees.

    ERIC Educational Resources Information Center

    Dochinger, Leon S.

    To help urban, suburban, and rural tree owners know about air pollution's effects on trees and their tolerance and intolerance to pollutants, the USDA Forest Service has prepared this booklet. It answers the following questions about atmospheric pollution: Where does it come from? What can it do to trees? and What can we do about it? In addition,…

  13. Rate of tree carbon accumulation increases continuously with tree size

    USGS Publications Warehouse

    Stephenson, N.L.; Das, A.J.; Condit, R.; Russo, S.E.; Baker, P.J.; Beckman, N.G.; Coomes, D.A.; Lines, E.R.; Morris, W.K.; Rüger, N.; Álvarez, E.; Blundo, C.; Bunyavejchewin, S.; Chuyong, G.; Davies, S.J.; Duque, Á.; Ewango, C.N.; Flores, O.; Franklin, J.F.; Grau, H.R.; Hao, Z.; Harmon, M.E.; Hubbell, S.P.; Kenfack, D.; Lin, Y.; Makana, J.-R.; Malizia, A.; Malizia, L.R.; Pabst, R.J.; Pongpattananurak, N.; Su, S.-H.; Sun, I-F.; Tan, S.; Thomas, D.; van Mantgem, P.J.; Wang, X.; Wiser, S.K.; Zavala, M.A.

    2014-01-01

    Forests are major components of the global carbon cycle, providing substantial feedback to atmospheric greenhouse gas concentrations. Our ability to understand and predict changes in the forest carbon cycle, particularly net primary productivity and carbon storage, increasingly relies on models that represent biological processes across several scales of biological organization, from tree leaves to forest stands. Yet, despite advances in our understanding of productivity at the scales of leaves and stands, no consensus exists about the nature of productivity at the scale of the individual tree, in part because we lack a broad empirical assessment of whether rates of absolute tree mass growth (and thus carbon accumulation) decrease, remain constant, or increase as trees increase in size and age. Here we present a global analysis of 403 tropical and temperate tree species, showing that for most species mass growth rate increases continuously with tree size. Thus, large, old trees do not act simply as senescent carbon reservoirs but actively fix large amounts of carbon compared to smaller trees; at the extreme, a single big tree can add the same amount of carbon to the forest within a year as is contained in an entire mid-sized tree. The apparent paradoxes of individual tree growth increasing with tree size despite declining leaf-level and stand-level productivity can be explained, respectively, by increases in a tree’s total leaf area that outpace declines in productivity per unit of leaf area and, among other factors, age-related reductions in population density. Our results resolve conflicting assumptions about the nature of tree growth, inform efforts to understand and model forest carbon dynamics, and have additional implications for theories of resource allocation and plant senescence.

  14. Large unbalanced credit scoring using Lasso-logistic regression ensemble.

    PubMed

    Wang, Hong; Xu, Qingsong; Zhou, Lifeng

    2015-01-01

    Recently, various ensemble learning methods with different base classifiers have been proposed for credit scoring problems. However, for various reasons, there has been little research using logistic regression as the base classifier. In this paper, given large unbalanced data, we consider the plausibility of ensemble learning using regularized logistic regression as the base classifier to deal with credit scoring problems. In this research, the data is first balanced and diversified by clustering and bagging algorithms. Then we apply a Lasso-logistic regression learning ensemble to evaluate the credit risks. We show that the proposed algorithm outperforms popular credit scoring models such as decision tree, Lasso-logistic regression and random forests in terms of AUC and F-measure. We also provide two importance measures for the proposed model to identify important variables in the data.

  15. Relationships between individual-tree mortality and water-balance variables indicate positive trends in water stress-induced tree mortality across North America.

    PubMed

    Hember, Robbie A; Kurz, Werner A; Coops, Nicholas C

    2017-04-01

    Accounting for water stress-induced tree mortality in forest productivity models remains a challenge due to uncertainty in stress tolerance of tree populations. In this study, logistic regression models were developed to assess species-specific relationships between probability of mortality (P_m) and drought, drawing on 8.1 million observations of change in vital status (m) of individual trees across North America. Drought was defined by standardized (relative) values of soil water content (W_{s,z}) and reference evapotranspiration (ET_{r,z}) at each field plot. The models additionally tested for interactions between the water-balance variables, aridity class of the site (AC), and estimated tree height (h). Considering drought improved model performance in 95 (80) per cent of the 64 tested species during calibration (cross-validation). On average, sensitivity to relative drought increased with site AC (i.e. aridity). Interaction between water-balance variables and estimated tree height indicated that drought sensitivity commonly decreased during early height development and increased during late height development, which may reflect expansion of the root system and decreasing whole-plant, leaf-specific hydraulic conductance, respectively. Across North America, predictions suggested that changes in the water balance caused mortality to increase from 1.1% yr⁻¹ in 1951 to 2.0% yr⁻¹ in 2014 (a net change of 0.9 ± 0.3% yr⁻¹). Interannual variation in mortality also increased, driven by increasingly severe droughts in 1988, 1998, 2006, 2007 and 2012. With strong confidence, this study indicates that water stress is a common cause of tree mortality. With weak-to-moderate confidence, this study strengthens previous claims attributing positive trends in mortality to increasing levels of water stress. This 'learn-as-we-go' approach - defined by sampling rare drought events as they continue to intensify - will help to constrain the hydraulic limits of dominant tree

  16. A spatially explicit approach to the study of socio-demographic inequality in the spatial distribution of trees across Boston neighborhoods.

    PubMed

    Duncan, Dustin T; Kawachi, Ichiro; Kum, Susan; Aldstadt, Jared; Piras, Gianfranco; Matthews, Stephen A; Arbia, Giuseppe; Castro, Marcia C; White, Kellee; Williams, David R

    2014-04-01

    The racial/ethnic and income composition of neighborhoods often influences local amenities, including the potential spatial distribution of trees, which are important for population health and community wellbeing, particularly in urban areas. This ecological study used spatial analytical methods to assess the relationship between neighborhood socio-demographic characteristics (i.e. minority racial/ethnic composition and poverty) and tree density at the census tract level in Boston, Massachusetts (US). We examined spatial autocorrelation with the Global Moran's I for all study variables and in the ordinary least squares (OLS) regression residuals as well as computed Spearman correlations non-adjusted and adjusted for spatial autocorrelation between socio-demographic characteristics and tree density. Next, we fit traditional regressions (i.e. OLS regression models) and spatial regressions (i.e. spatial simultaneous autoregressive models), as appropriate. We found significant positive spatial autocorrelation for all neighborhood socio-demographic characteristics (Global Moran's I range from 0.24 to 0.86, all P = 0.001), for tree density (Global Moran's I = 0.452, P = 0.001), and in the OLS regression residuals (Global Moran's I range from 0.32 to 0.38, all P < 0.001). Therefore, we fit the spatial simultaneous autoregressive models. There was a negative correlation between neighborhood percent non-Hispanic Black and tree density (r_S = -0.19; conventional P-value = 0.016; spatially adjusted P-value = 0.299) as well as a negative correlation between predominantly non-Hispanic Black (over 60% Black) neighborhoods and tree density (r_S = -0.18; conventional P-value = 0.019; spatially adjusted P-value = 0.180). While the conventional OLS regression model found a marginally significant inverse relationship between Black neighborhoods and tree density, we found no statistically significant relationship between neighborhood socio-demographic composition and tree density in the spatial
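For readers unfamiliar with the Global Moran's I statistic used in this record, the following is a minimal sketch of its standard definition, I = (n / W) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2, where W is the sum of all spatial weights. The four-tract chain and its binary adjacency weights are hypothetical, not the Boston data, and the sketch omits the permutation-based significance test that produces the reported P-values.

```python
def morans_i(values, weights):
    """Global Moran's I for values x_i and a square spatial weight matrix w_ij."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    # Cross-products of deviations, weighted by spatial proximity
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / w_sum) * num / den

# Hypothetical example: four tracts in a row, each adjacent to its neighbours
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
smooth = morans_i([1.0, 2.0, 3.0, 4.0], adjacency)         # similar neighbours
alternating = morans_i([1.0, -1.0, 1.0, -1.0], adjacency)  # dissimilar neighbours
```

Positive values indicate that neighbouring tracts tend to have similar values, which is the situation the abstract reports for tree density and for the OLS residuals; negative values indicate that neighbours tend to differ.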

  17. A spatially explicit approach to the study of socio-demographic inequality in the spatial distribution of trees across Boston neighborhoods

    PubMed Central

    Duncan, Dustin T.; Kawachi, Ichiro; Kum, Susan; Aldstadt, Jared; Piras, Gianfranco; Matthews, Stephen A.; Arbia, Giuseppe; Castro, Marcia C.; White, Kellee; Williams, David R.

    2017-01-01

    The racial/ethnic and income composition of neighborhoods often influences local amenities, including the potential spatial distribution of trees, which are important for population health and community wellbeing, particularly in urban areas. This ecological study used spatial analytical methods to assess the relationship between neighborhood socio-demographic characteristics (i.e. minority racial/ethnic composition and poverty) and tree density at the census tract level in Boston, Massachusetts (US). We examined spatial autocorrelation with the Global Moran’s I for all study variables and in the ordinary least squares (OLS) regression residuals as well as computed Spearman correlations non-adjusted and adjusted for spatial autocorrelation between socio-demographic characteristics and tree density. Next, we fit traditional regressions (i.e. OLS regression models) and spatial regressions (i.e. spatial simultaneous autoregressive models), as appropriate. We found significant positive spatial autocorrelation for all neighborhood socio-demographic characteristics (Global Moran’s I range from 0.24 to 0.86, all P=0.001), for tree density (Global Moran’s I=0.452, P=0.001), and in the OLS regression residuals (Global Moran’s I range from 0.32 to 0.38, all P<0.001). Therefore, we fit the spatial simultaneous autoregressive models. There was a negative correlation between neighborhood percent non-Hispanic Black and tree density (rS=−0.19; conventional P-value=0.016; spatially adjusted P-value=0.299) as well as a negative correlation between predominantly non-Hispanic Black (over 60% Black) neighborhoods and tree density (rS=−0.18; conventional P-value=0.019; spatially adjusted P-value=0.180). While the conventional OLS regression model found a marginally significant inverse relationship between Black neighborhoods and tree density, we found no statistically significant relationship between neighborhood socio-demographic composition and tree density in the spatial

  18. Sampling the quality of hardwood trees

    Treesearch

    Adrian M. Gilbert

    1959-01-01

    Anyone acquainted with the conversion of hardwood trees into wood products knows that timber has a wide range in quality. Some trees will yield better products than others. So, in addition to rate of growth and size, tree values are affected by the quality of products yielded.

  19. Assessment of land use factors associated with dengue cases in Malaysia using Boosted Regression Trees.

    PubMed

    Cheong, Yoon Ling; Leitão, Pedro J; Lakes, Tobia

    2014-07-01

    The transmission of dengue disease is influenced by complex interactions among vector, host and virus. Land use such as water bodies or certain agricultural practices have been identified as likely risk factors for dengue because of the provision of suitable habitats for the vector. Many studies have focused on the land use factors of dengue vector abundance in small areas but have not yet studied the relationship between land use factors and dengue cases for large regions. This study aims to clarify if land use factors other than human settlements, e.g. different types of agricultural land use, water bodies and forest are associated with reported dengue cases from 2008 to 2010 in the state of Selangor, Malaysia. From the correlative relationship, we aim to generate a prediction risk map. We used Boosted Regression Trees (BRT) to account for nonlinearities and interactions between the factors with high predictive accuracies. Our model with a cross-validated performance score (Area Under the Receiver Operator Characteristic Curve, ROC AUC) of 0.81 showed that the most important land use factors are human settlements (model importance of 39.2%), followed by water bodies (16.1%), mixed horticulture (8.7%), open land (7.5%) and neglected grassland (6.7%). A risk map after 100 model runs with a cross-validated ROC AUC mean of 0.81 (±0.001 s.d.) is presented. Our findings may be an important asset for improving surveillance and control interventions for dengue. Copyright © 2014 The Authors. Published by Elsevier Ltd.. All rights reserved.
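As a rough illustration of how boosted regression trees build an additive model, the sketch below fits depth-1 trees (stumps) one after another to the current residuals under squared-error loss and adds each one with a shrinkage factor. It is a simplification under stated assumptions: a single predictor, squared-error loss, and hypothetical data and function names. It is not the configuration or software used in the study.

```python
def fit_stump(xs, residuals):
    """Return (threshold, left_mean, right_mean) minimising squared error."""
    best = None
    cuts = sorted(set(xs))
    for lo, hi in zip(cuts, cuts[1:]):
        t = (lo + hi) / 2
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - lm) ** 2 for r in left)
               + sum((r - rm) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    return best[1:]

def boost(xs, ys, n_trees=50, rate=0.1):
    """Additive model: base value plus a shrunken sum of stumps."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_trees):
        # Each new stump is fitted to what the current ensemble still gets wrong
        residuals = [y - p for y, p in zip(ys, preds)]
        t, lm, rm = fit_stump(xs, residuals)
        stumps.append((t, lm, rm))
        preds = [p + rate * (lm if x <= t else rm) for x, p in zip(xs, preds)]
    return base, rate, stumps

def predict(model, x):
    base, rate, stumps = model
    return base + rate * sum(lm if x <= t else rm for t, lm, rm in stumps)

# Hypothetical step-shaped response that a linear model would fit poorly
xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]
model = boost(xs, ys)
```

With several predictors and deeper trees, the same loop is what lets the method capture the nonlinearities and interactions mentioned in the abstract; the small learning rate trades more trees for a smoother fit.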

  20. Evaluating multimedia chemical persistence: Classification and regression tree analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bennett, D.H.; McKone, T.E.; Kastenberg, W.E.

    2000-04-01

    For the thousands of chemicals continuously released into the environment, it is desirable to make prospective assessments of those likely to be persistent. Widely distributed persistent chemicals are impossible to remove from the environment and remediation by natural processes may take decades, which is problematic if adverse health or ecological effects are discovered after prolonged release into the environment. A tiered approach using a classification scheme and a multimedia model for determining persistence is presented. Using specific criteria for persistence, a classification tree is developed to classify a chemical as persistent or nonpersistent based on the chemical properties. In this approach, the classification is derived from the results of a standardized unit world multimedia model. Thus, the classifications are more robust for multimedia pollutants than classifications using a single medium half-life. The method can be readily implemented and provides insight without requiring extensive and often unavailable data. This method can be used to classify chemicals when only a few properties are known and can be used to direct further data collection. Case studies are presented to demonstrate the advantages of the approach.

  1. Regression Tree-Based Methodology for Customizing Building Energy Benchmarks to Individual Commercial Buildings

    NASA Astrophysics Data System (ADS)

    Kaskhedikar, Apoorva Prakash

    According to the U.S. Energy Information Administration, commercial buildings represent about 40% of the United States' energy consumption, of which office buildings consume a major portion. Gauging the extent to which an individual building consumes energy in excess of its peers is the first step in initiating energy efficiency improvement. Energy benchmarking offers initial building energy performance assessment without rigorous evaluation. Energy benchmarking tools based on the Commercial Buildings Energy Consumption Survey (CBECS) database are investigated in this thesis. This study proposes a new benchmarking methodology based on decision trees, where a relationship between the energy use intensities (EUI) and building parameters (continuous and categorical) is developed for different building types. This methodology was applied to medium office and school building types contained in the CBECS database. The Random Forest technique was used to find the most influential parameters that impact building energy use intensities. Subsequently, significant correlations were identified between EUIs and CBECS variables. Other than floor area, some of the important variables were number of workers, location, number of PCs and main cooling equipment. The coefficient of variation was used to evaluate the effectiveness of the new model. The customization technique proposed in this thesis was compared with another benchmarking model that is widely used by building owners and designers, namely ENERGY STAR's Portfolio Manager. This tool relies on standard linear regression methods, which are only able to handle continuous variables. The proposed model uses a data mining technique and was found to perform slightly better than the Portfolio Manager. The broader impact of the proposed benchmarking methodology is that it allows for identifying important categorical variables, and then incorporating them in a local, as against a global, model framework for EUI

  2. Classification and regression tree (CART) model to predict pulmonary tuberculosis in hospitalized patients.

    PubMed

    Aguiar, Fabio S; Almeida, Luciana L; Ruffino-Netto, Antonio; Kritski, Afranio Lineu; Mello, Fernanda Cq; Werneck, Guilherme L

    2012-08-07

    Tuberculosis (TB) remains a public health issue worldwide. The lack of specific clinical symptoms to diagnose TB makes the correct decision to admit patients to respiratory isolation a difficult task for the clinician. Isolation of patients without the disease is common and increases health costs. Decision models for the diagnosis of TB in patients attending hospitals can increase the quality of care and decrease costs, without the risk of hospital transmission. We present a model for predicting pulmonary TB in hospitalized patients in a high prevalence area in order to contribute to a more rational use of isolation rooms without increasing the risk of transmission. Cross-sectional study of patients admitted to CFFH from March 2003 to December 2004. A classification and regression tree (CART) model was generated and validated. The area under the ROC curve (AUC), sensitivity, specificity, positive and negative predictive values were used to evaluate the performance of the model. Validation of the model was performed with a different sample of patients admitted to the same hospital from January to December 2005. We studied 290 patients admitted with clinical suspicion of TB. Diagnosis was confirmed in 26.5% of them. Pulmonary TB was present in 83.7% of the patients with TB (62.3% with positive sputum smear) and HIV/AIDS was present in 56.9% of patients. The validated CART model showed sensitivity, specificity, positive predictive value and negative predictive value of 60.00%, 76.16%, 33.33%, and 90.55%, respectively. The AUC was 79.70%. The CART model developed for these hospitalized patients with clinical suspicion of TB had fair to good predictive performance for pulmonary TB. The most important variable for prediction of TB diagnosis was chest radiograph results. Prospective validation is still necessary, but our model offers an alternative for decision making on whether to isolate patients with clinical suspicion of TB in tertiary health facilities in
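The four performance measures reported in this abstract all derive from a 2x2 confusion matrix. The sketch below shows the standard definitions; the counts are hypothetical and are not the study's validation data, and the function name is an illustrative choice.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard diagnostic measures from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased patients correctly flagged
        "specificity": tn / (tn + fp),  # non-diseased patients correctly cleared
        "ppv": tp / (tp + fp),          # flagged patients who truly have disease
        "npv": tn / (tn + fn),          # cleared patients who truly are disease-free
    }

# Hypothetical counts: 50 patients with TB, 150 without
metrics = diagnostic_metrics(tp=40, fp=30, fn=10, tn=120)
```

Sensitivity and specificity describe the classifier itself, whereas the predictive values also depend on disease prevalence in the sample, which is one reason a model can pair a modest positive predictive value with a high negative predictive value.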

  3. Regression Analysis of Mixed Recurrent-Event and Panel-Count Data with Additive Rate Models

    PubMed Central

    Zhu, Liang; Zhao, Hui; Sun, Jianguo; Leisenring, Wendy; Robison, Leslie L.

    2015-01-01

    Event-history studies of recurrent events are often conducted in fields such as demography, epidemiology, medicine, and social sciences (Cook and Lawless, 2007; Zhao et al., 2011). For such analysis, two types of data have been extensively investigated: recurrent-event data and panel-count data. However, in practice, one may face a third type of data, mixed recurrent-event and panel-count data or mixed event-history data. Such data occur if some study subjects are monitored or observed continuously and thus provide recurrent-event data, while the others are observed only at discrete times and hence give only panel-count data. A more general situation is that each subject is observed continuously over certain time periods but only at discrete times over other time periods. There exists little literature on the analysis of such mixed data except that published by Zhu et al. (2013). In this paper, we consider the regression analysis of mixed data using the additive rate model and develop some estimating equation-based approaches to estimate the regression parameters of interest. Both finite sample and asymptotic properties of the resulting estimators are established, and the numerical studies suggest that the proposed methodology works well for practical situations. The approach is applied to a Childhood Cancer Survivor Study that motivated this study. PMID:25345405

  4. Large Unbalanced Credit Scoring Using Lasso-Logistic Regression Ensemble

    PubMed Central

    Wang, Hong; Xu, Qingsong; Zhou, Lifeng

    2015-01-01

    Recently, various ensemble learning methods with different base classifiers have been proposed for credit scoring problems. However, for various reasons, there has been little research using logistic regression as the base classifier. In this paper, given large unbalanced data, we consider the plausibility of ensemble learning using regularized logistic regression as the base classifier to deal with credit scoring problems. In this research, the data is first balanced and diversified by clustering and bagging algorithms. Then we apply a Lasso-logistic regression learning ensemble to evaluate the credit risks. We show that the proposed algorithm outperforms popular credit scoring models such as decision tree, Lasso-logistic regression and random forests in terms of AUC and F-measure. We also provide two importance measures for the proposed model to identify important variables in the data. PMID:25706988

  5. Integration of vessel traits, wood density, and height in angiosperm shrubs and trees.

    PubMed

    Martínez-Cabrera, Hugo I; Schenk, H Jochen; Cevallos-Ferriz, Sergio R S; Jones, Cynthia S

    2011-05-01

    Trees and shrubs tend to occupy different niches within and across ecosystems; therefore, traits related to their resource use and life history are expected to differ. Here we analyzed how growth form is related to variation in integration among vessel traits, wood density, and height. We also considered the ecological and evolutionary consequences of such differences. In a sample of 200 woody plant species (65 shrubs and 135 trees) from Argentina, Mexico, and the United States, standardized major axis (SMA) regression, correlation analyses, and ANOVA were used to determine whether relationships among traits differed between growth forms. The influence of phylogenetic relationships was examined with a phylogenetic ANOVA and phylogenetically independent contrasts (PICs). A principal component analysis was conducted to determine whether trees and shrubs occupy different portions of multivariate trait space. Wood density did not differ between shrubs and trees, but there were significant differences in vessel diameter, vessel density, theoretical conductivity, and as expected, height. In addition, relationships between vessel traits and wood density differed between growth forms. Trees showed coordination among vessel traits, wood density, and height, but in shrubs, wood density and vessel traits were independent. These results hold when phylogenetic relationships were considered. In the multivariate analyses, these differences translated as significantly different positions in multivariate trait space occupied by shrubs and trees. Differences in trait integration between growth forms suggest that evolution of growth form in some lineages might be associated with the degree of trait interrelation.
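Standardized major axis (SMA) regression, named in this abstract, differs from ordinary least squares in that it treats both variables symmetrically: the slope is the ratio of the standard deviations, signed by the correlation. A minimal sketch of that definition follows; the data are hypothetical and the function name is illustrative.

```python
import math

def sma_fit(xs, ys):
    """SMA line: slope = sign(r) * sd(y) / sd(x), through the means."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sign = 1.0 if sxy >= 0 else -1.0
    slope = sign * math.sqrt(syy / sxx)
    intercept = my - slope * mx
    return slope, intercept

# Exactly linear hypothetical data: SMA and OLS agree
exact_slope, exact_intercept = sma_fit([1, 2, 3, 4], [3, 5, 7, 9])

# Noisy hypothetical data: the SMA slope is steeper than the OLS slope (0.8 here)
noisy_slope, _ = sma_fit([1, 2, 3, 4], [1, 3, 2, 4])
```

Because neither trait is treated as the error-free predictor, SMA is a common choice for trait-scaling questions such as the vessel trait and wood density relationships described above.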

  6. STX--Fortran-4 program for estimates of tree populations from 3P sample-tree-measurements

    Treesearch

    L. R. Grosenbaugh

    1967-01-01

    Describes how to use an improved and greatly expanded version of an earlier computer program (1964) that converts dendrometer measurements of 3P-sample trees to population values in terms of whatever units user desires. Many new options are available, including that of obtaining a product-yield and appraisal report based on regression coefficients supplied by user....

  7. Tree-, stand- and site-specific controls on landscape-scale patterns of transpiration

    NASA Astrophysics Data System (ADS)

    Hassler, Sibylle; Weiler, Markus; Blume, Theresa

    2017-04-01

    Transpiration is a key process in the hydrological cycle and a sound understanding and quantification of transpiration and its spatial variability is essential for management decisions as well as for improving the parameterisation of hydrological and soil-vegetation-atmosphere transfer models. For individual trees, transpiration is commonly estimated by measuring sap flow. Besides evaporative demand and water availability, tree-specific characteristics such as species, size or social status control sap flow amounts of individual trees. Within forest stands, properties such as species composition, basal area or stand density additionally affect sap flow, for example via competition mechanisms. Finally, sap flow patterns might also be influenced by landscape-scale characteristics such as geology, slope position or aspect because they affect water and energy availability; however, little is known about the dynamic interplay of these controls. We studied the relative importance of various tree-, stand- and site-specific characteristics with multiple linear regression models to explain the variability of sap velocity measurements in 61 beech and oak trees, located at 24 sites spread over a 290 km² catchment in Luxembourg. For each of 132 consecutive days of the growing season of 2014 we modelled the daily sap velocities of these 61 trees and determined the importance of the different predictors. Results indicate that a combination of tree-, stand- and site-specific factors controls sap velocity patterns in the landscape, namely tree species, tree diameter, the stand density, geology and aspect. Compared to these predictors, spatial variability of atmospheric demand and soil moisture explains only a small fraction of the variability in the daily datasets. However, the temporal dynamics of the explanatory power of the tree-specific characteristics, especially species, are correlated to the temporal dynamics of potential evaporation. Thus, transpiration estimates at the

  8. A Metric on Phylogenetic Tree Shapes

    PubMed Central

    Plazzotta, G.

    2018-01-01

    The shapes of evolutionary trees are influenced by the nature of the evolutionary process but comparisons of trees from different processes are hindered by the challenge of completely describing tree shape. We present a full characterization of the shapes of rooted branching trees in a form that lends itself to natural tree comparisons. We use this characterization to define a metric, in the sense of a true distance function, on tree shapes. The metric distinguishes trees from random models known to produce different tree shapes. It separates trees derived from tropical versus USA influenza A sequences, which reflect the differing epidemiology of tropical and seasonal flu. We describe several metrics based on the same core characterization, and illustrate how to extend the metric to incorporate trees’ branch lengths or other features such as overall imbalance. Our approach allows us to construct addition and multiplication on trees, and to create a convex metric on tree shapes which formally allows computation of average tree shapes. PMID:28472435

  9. National assessment of Tree City USA participation

    EPA Pesticide Factsheets

    Tree City USA is a national program that recognizes municipal commitment to community forestry. In return for meeting program requirements, Tree City USA participants expect social, economic, and/or environmental benefits. Understanding the geographic distribution and socioeconomic characteristics of Tree City USA communities at the national scale can offer insights into the motivations or barriers to program participation, and provide context for community forestry research at finer scales. In this study, researchers assessed patterns in Tree City USA participation for all U.S. communities with more than 2,500 people according to geography, community population size, and socioeconomic characteristics, such as income, education, and race. Nationally, 23.5% of communities studied were Tree City USA participants, and this accounted for 53.9% of the total population in these communities. Tree City USA participation rates varied substantially by U.S. region, but in each region participation rates were higher in larger communities, and long-term participants tended to be larger communities than more recent enrollees. In logistic regression models, owner occupancy rates were significant negative predictors of Tree City USA participation, education and percent white population were positive predictors in many U.S. regions, and inconsistent patterns were observed for income and population age. The findings indicate that communities with smaller populations, lower educat

  10. Contaminant gradients in trees: Directional tree coring reveals boundaries of soil and soil-gas contamination with potential applications in vapor intrusion assessment

    USGS Publications Warehouse

    Wilson, Jordan L.; Samaranayake, V.A.; Limmer, Matthew A.; Schumacher, John G.; Burken, Joel G.

    2017-01-01

    Contaminated sites pose ecological and human-health risks through exposure to contaminated soil and groundwater. Whereas we can readily locate, monitor, and track contaminants in groundwater, it is harder to perform these tasks in the vadose zone. In this study, tree-core samples were collected at a Superfund site to determine if the sample-collection location around a particular tree could reveal the subsurface location, or direction, of soil and soil-gas contaminant plumes. Contaminant-centroid vectors were calculated from tree-core data to reveal contaminant distributions in directional tree samples at a higher resolution, and vectors were correlated with soil-gas characterization collected using conventional methods. Results clearly demonstrated that directional tree coring around tree trunks can indicate gradients in soil and soil-gas contaminant plumes, and the strength of the correlations was directly proportionate to the magnitude of tree-core concentration gradients (Spearman’s coefficient of -0.61 and -0.55 in soil and tree-core gradients, respectively). Linear regression indicates agreement between the concentration-centroid vectors is significantly affected by in-planta and soil concentration gradients and when concentration centroids in soil are closer to trees. Given the existing link between soil-gas and vapor intrusion, this study also indicates that directional tree coring might be applicable in vapor intrusion assessment.

  11. Contaminant Gradients in Trees: Directional Tree Coring Reveals Boundaries of Soil and Soil-Gas Contamination with Potential Applications in Vapor Intrusion Assessment.

    PubMed

    Wilson, Jordan L; Samaranayake, V A; Limmer, Matthew A; Schumacher, John G; Burken, Joel G

    2017-12-19

    Contaminated sites pose ecological and human-health risks through exposure to contaminated soil and groundwater. Whereas we can readily locate, monitor, and track contaminants in groundwater, it is harder to perform these tasks in the vadose zone. In this study, tree-core samples were collected at a Superfund site to determine if the sample-collection location around a particular tree could reveal the subsurface location, or direction, of soil and soil-gas contaminant plumes. Contaminant-centroid vectors were calculated from tree-core data to reveal contaminant distributions in directional tree samples at a higher resolution, and vectors were correlated with soil-gas characterization collected using conventional methods. Results clearly demonstrated that directional tree coring around tree trunks can indicate gradients in soil and soil-gas contaminant plumes, and the strength of the correlations was directly proportionate to the magnitude of tree-core concentration gradients (Spearman's coefficient of -0.61 and -0.55 in soil and tree-core gradients, respectively). Linear regression indicates agreement between the concentration-centroid vectors is significantly affected by in planta and soil concentration gradients and when concentration centroids in soil are closer to trees. Given the existing link between soil-gas and vapor intrusion, this study also indicates that directional tree coring might be applicable in vapor intrusion assessment.

  12. Hide and vanish: data sets where the most parsimonious tree is known but hard to find, and their implications for tree search methods.

    PubMed

    Goloboff, Pablo A

    2014-10-01

    Three different types of data sets, for which the uniquely most parsimonious tree can be known exactly but is hard to find with heuristic tree search methods, are studied. Tree searches are complicated more by the shape of the tree landscape (i.e. the distribution of homoplasy on different trees) than by the sheer abundance of homoplasy or character conflict. Data sets of Type 1 are those constructed by Radel et al. (2013). Data sets of Type 2 present a very rugged landscape, with narrow peaks and valleys, but relatively low amounts of homoplasy. For such a tree landscape, subjecting the trees to TBR and saving suboptimal trees produces much better results when the sequence of clipping for the tree branches is randomized instead of fixed. An unexpected finding for data sets of Types 1 and 2 is that starting a search from a random tree instead of a random addition sequence Wagner tree may increase the probability that the search finds the most parsimonious tree; a small artificial example where these probabilities can be calculated exactly is presented. Data sets of Type 3, the most difficult data sets studied here, comprise only congruent characters, and a single island with only one most parsimonious tree. Even if there is a single island, missing entries create a very flat landscape which is difficult to traverse with tree search algorithms because the number of equally parsimonious trees that need to be saved and swapped to effectively move around the plateaus is too large. Minor modifications of the parameters of tree drifting, ratchet, and sectorial searches allow travelling around these plateaus much more efficiently than saving and swapping large numbers of equally parsimonious trees with TBR. For these data sets, two new related criteria for selecting taxon addition sequences in Wagner trees (the "selected" and "informative" addition sequences) produce much better results than the standard random or closest addition sequences. 
    These new methods for Wagner

  13. Estimating tree crown widths for the primary Acadian species in Maine

    Treesearch

    Matthew B. Russell; Aaron R. Weiskittel

    2012-01-01

    In this analysis, data for seven conifer and eight hardwood species were gathered from across the state of Maine for estimating tree crown widths. Maximum and largest crown width equations were developed using tree diameter at breast height as the primary predicting variable. Quantile regression techniques were used to estimate the maximum crown width and a constrained...

  14. The fault-tree compiler

    NASA Technical Reports Server (NTRS)

    Martensen, Anna L.; Butler, Ricky W.

    1987-01-01

    The Fault Tree Compiler Program is a new reliability tool used to predict the top event probability for a fault tree. Five different gate types are allowed in the fault tree: AND, OR, EXCLUSIVE OR, INVERT, and M OF N gates. The high-level input language is easy to understand and use when describing the system tree. In addition, the use of the hierarchical fault tree capability can simplify the tree description and decrease program execution time. The current solution technique provides an answer that is precise to five digits (within the limits of double-precision floating-point arithmetic). The user may vary one failure rate or failure probability over a range of values and plot the results for sensitivity analyses. The solution technique is implemented in FORTRAN; the remaining program code is implemented in Pascal. The program is written to run on a Digital Corporation VAX with the VMS operating system.
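
    As a rough illustration of what "top event probability" means for the five gate types named above, the following Python sketch evaluates a small fault tree. It is not the Fault Tree Compiler's solution technique, which the abstract does not describe. It assumes that all basic events are statistically independent, that no basic event feeds more than one gate, and that an M OF N gate fires when at least M of its N inputs occur. The function names and the example probabilities are hypothetical.

```python
from math import prod


def p_and(ps):
    """AND gate: all independent inputs occur."""
    return prod(ps)


def p_or(ps):
    """OR gate: at least one independent input occurs."""
    return 1.0 - prod(1.0 - p for p in ps)


def p_xor(p, q):
    """EXCLUSIVE OR gate: exactly one of two independent inputs occurs."""
    return p * (1.0 - q) + q * (1.0 - p)


def p_invert(p):
    """INVERT gate: the input does not occur."""
    return 1.0 - p


def p_m_of_n(m, ps):
    """M OF N gate (assumed 'at least m'): build the distribution of the
    number of inputs that occur, then sum the tail from m upward."""
    dist = [1.0]  # dist[k] = probability that exactly k inputs occurred so far
    for p in ps:
        new = [0.0] * (len(dist) + 1)
        for k, v in enumerate(dist):
            new[k] += v * (1.0 - p)   # this input does not occur
            new[k + 1] += v * p       # this input occurs
        dist = new
    return sum(dist[m:])


# Hypothetical system: the top event occurs if both power supplies fail,
# or if at least 2 of 3 sensors fail.
power = p_and([0.1, 0.2])
sensors = p_m_of_n(2, [0.1, 0.1, 0.1])
top = p_or([power, sensors])
print(f"top event probability = {top:.5f}")
```

    Under these assumptions the gates compose bottom-up, so a hierarchical tree is evaluated by feeding the result of one gate into the next. Trees in which a basic event is shared between branches need a different method, because the branch probabilities are then no longer independent.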

  15. Structured Additive Quantile Regression for Assessing the Determinants of Childhood Anemia in Rwanda.

    PubMed

    Habyarimana, Faustin; Zewotir, Temesgen; Ramroop, Shaun

    2017-06-17

    Childhood anemia is among the most significant health problems faced by public health departments in developing countries. This study aims at assessing the determinants and possible spatial effects associated with childhood anemia in Rwanda. The 2014/2015 Rwanda Demographic and Health Survey (RDHS) data were used. The analysis was done using the structured spatial additive quantile regression model. The findings of this study revealed that the child's age; the duration of breastfeeding; gender of the child; the nutritional status of the child (whether underweight and/or wasting); whether the child had a fever; had a cough in the two weeks prior to the survey or not; whether the child received vitamin A supplementation in the six weeks before the survey or not; the household wealth index; literacy of the mother; mother's anemia status; and mother's age at the birth are all significant factors associated with childhood anemia in Rwanda. Furthermore, significant structured spatial location effects on childhood anemia were found.

  16. Recovery efficiency of whole-tree harvesting

    Treesearch

    Bryce J. Stokes; William F. Watson

    1988-01-01

    The recovery of total tree biomass and most components of a stand is a practical economic and management alternative to tree-length harvesting. First, the increased utilization of woody biomass provides additional revenues from the site. Second, the removal and utilization of the stems and crowns reduces site preparation costs and makes tree planting easier. Third,...

  17. Landscape-scale consequences of differential tree mortality from catastrophic wind disturbance in the Amazon.

    PubMed

    Rifai, Sami W; Urquiza Muñoz, José D; Negrón-Juárez, Robinson I; Ramírez Arévalo, Fredy R; Tello-Espinoza, Rodil; Vanderwel, Mark C; Lichstein, Jeremy W; Chambers, Jeffrey Q; Bohlman, Stephanie A

    2016-10-01

    Wind disturbance can create large forest blowdowns, which greatly reduce live biomass and add uncertainty to the strength of the Amazon carbon sink. Observational studies from within the central Amazon have quantified blowdown size and estimated total mortality but have not determined which trees are most likely to die from a catastrophic wind disturbance. Also, the impact of spatial dependence upon tree mortality from wind disturbance has seldom been quantified, which is important because wind disturbance often kills clusters of trees due to large treefalls killing surrounding neighbors. We examine (1) the causes of differential mortality between adult trees from a 300-ha blowdown event in the Peruvian region of the northwestern Amazon, (2) how accounting for spatial dependence affects mortality predictions, and (3) how incorporating both differential mortality and spatial dependence affects the landscape-level estimation of necromass produced from the blowdown. Standard regression and spatial regression models were used to estimate how stem diameter, wood density, elevation, and a satellite-derived disturbance metric influenced the probability of tree death from the blowdown event. The model parameters regarding tree characteristics, topography, and spatial autocorrelation of the field data were then used to determine the consequences of non-random mortality for landscape production of necromass through a simulation model. Tree mortality was highly non-random within the blowdown, where tree mortality rates were highest for trees that were large, had low wood density, and were located at high elevation. Of the differential mortality models, the non-spatial models overpredicted necromass, whereas the spatial model slightly underpredicted necromass. When parameterized from the same field data, the spatial regression model with differential mortality estimated only 7.5% more dead trees across the entire blowdown than the random mortality model, yet it estimated 51

  18. A Metric on Phylogenetic Tree Shapes.

    PubMed

    Colijn, C; Plazzotta, G

    2018-01-01

    The shapes of evolutionary trees are influenced by the nature of the evolutionary process but comparisons of trees from different processes are hindered by the challenge of completely describing tree shape. We present a full characterization of the shapes of rooted branching trees in a form that lends itself to natural tree comparisons. We use this characterization to define a metric, in the sense of a true distance function, on tree shapes. The metric distinguishes trees from random models known to produce different tree shapes. It separates trees derived from tropical versus USA influenza A sequences, which reflect the differing epidemiology of tropical and seasonal flu. We describe several metrics based on the same core characterization, and illustrate how to extend the metric to incorporate trees' branch lengths or other features such as overall imbalance. Our approach allows us to construct addition and multiplication on trees, and to create a convex metric on tree shapes which formally allows computation of average tree shapes. © The Author(s) 2017. Published by Oxford University Press, on behalf of the Society of Systematic Biologists.

  19. Analysis of the Importance of Oxides and Clays in Cd, Cr, Cu, Ni, Pb and Zn Adsorption and Retention with Regression Trees

    PubMed Central

    González-Costa, Juan José; Reigosa, Manuel Joaquín; Matías, José María; Fernández-Covelo, Emma

    2017-01-01

    This study determines the influence of the different soil components and of the cation-exchange capacity on the adsorption and retention of different heavy metals: cadmium, chromium, copper, nickel, lead and zinc. In order to do so, regression models were created through decision trees and the importance of soil components was assessed. The variables used were: humified organic matter, specific cation-exchange capacity, percentages of sand and silt, proportions of Mn, Fe and Al oxides and hematite, and the proportion of quartz, plagioclase and mica, and the proportions of the different clays: kaolinite, vermiculite, gibbsite and chlorite. The most important components in the obtained models were vermiculite and gibbsite, especially for the adsorption of cadmium and zinc, while clays were less relevant. Oxides are less important than clays, especially for the adsorption of chromium and lead and the retention of chromium, copper and lead. PMID:28072849

  20. Regression analysis of mixed recurrent-event and panel-count data with additive rate models.

    PubMed

    Zhu, Liang; Zhao, Hui; Sun, Jianguo; Leisenring, Wendy; Robison, Leslie L

    2015-03-01

    Event-history studies of recurrent events are often conducted in fields such as demography, epidemiology, medicine, and social sciences (Cook and Lawless, 2007, The Statistical Analysis of Recurrent Events. New York: Springer-Verlag; Zhao et al., 2011, Test 20, 1-42). For such analysis, two types of data have been extensively investigated: recurrent-event data and panel-count data. However, in practice, one may face a third type of data, mixed recurrent-event and panel-count data or mixed event-history data. Such data occur if some study subjects are monitored or observed continuously and thus provide recurrent-event data, while the others are observed only at discrete times and hence give only panel-count data. A more general situation is that each subject is observed continuously over certain time periods but only at discrete times over other time periods. There exists little literature on the analysis of such mixed data except that published by Zhu et al. (2013, Statistics in Medicine 32, 1954-1963). In this article, we consider the regression analysis of mixed data using the additive rate model and develop some estimating equation-based approaches to estimate the regression parameters of interest. Both finite sample and asymptotic properties of the resulting estimators are established, and the numerical studies suggest that the proposed methodology works well for practical situations. The approach is applied to a Childhood Cancer Survivor Study that motivated this study. © 2014, The International Biometric Society.

  1. Prediction of Patient-Controlled Analgesic Consumption: A Multimodel Regression Tree Approach.

    PubMed

    Hu, Yuh-Jyh; Ku, Tien-Hsiung; Yang, Yu-Hung; Shen, Jia-Ying

    2018-01-01

    Several factors contribute to individual variability in postoperative pain; therefore, individuals consume postoperative analgesics at different rates. Although many statistical studies have analyzed postoperative pain and analgesic consumption, most have identified only the correlation and have not subjected the statistical model to further tests in order to evaluate its predictive accuracy. In this study involving 3052 patients, a multistrategy computational approach was developed for analgesic consumption prediction. This approach uses data on patient-controlled analgesia demand behavior over time and combines clustering, classification, and regression to mitigate the limitations of current statistical models. Cross-validation results indicated that the proposed approach significantly outperforms various existing regression methods. Moreover, a comparison between the predictions by anesthesiologists and medical specialists and those of the computational approach for an independent test data set of 60 patients further evidenced the superiority of the computational approach in predicting analgesic consumption because it produced markedly lower root mean squared errors.

  2. Phylogenetic trees and Euclidean embeddings.

    PubMed

    Layer, Mark; Rhodes, John A

    2017-01-01

    It was recently observed by de Vienne et al. (Syst Biol 60(6):826-832, 2011) that a simple square root transformation of distances between taxa on a phylogenetic tree allowed for an embedding of the taxa into Euclidean space. While the justification for this was based on a diffusion model of continuous character evolution along the tree, here we give a direct and elementary explanation for it that provides substantial additional insight. We use this embedding to reinterpret the differences between the NJ and BIONJ tree building algorithms, providing one illustration of how this embedding reflects tree structures in data.
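
    The square-root observation above can be checked on a small example. The Python sketch below uses one elementary construction, which may or may not be the argument the authors give: each edge of the tree gets its own coordinate axis, and a taxon's coordinate on that axis is the square root of the edge length if the taxon lies on a chosen side of the edge, and zero otherwise. Two taxa then differ on exactly the axes of the edges that separate them, so their squared Euclidean distance equals their tree distance. The four-taxon tree, its edge lengths, and all function names are hypothetical.

```python
from itertools import combinations
from math import isclose, sqrt

# Hypothetical quartet tree ((A,B),(C,D)) with internal nodes u and v.
edges = [("A", "u", 1.0), ("B", "u", 2.0), ("u", "v", 0.5),
         ("C", "v", 1.5), ("D", "v", 3.0)]
taxa = ["A", "B", "C", "D"]


def adjacency(edge_list):
    """Build an undirected adjacency map: node -> [(neighbour, length)]."""
    adj = {}
    for a, b, w in edge_list:
        adj.setdefault(a, []).append((b, w))
        adj.setdefault(b, []).append((a, w))
    return adj


def tree_distance(adj, src, dst):
    """Sum of edge lengths on the unique path from src to dst."""
    stack = [(src, None, 0.0)]
    while stack:
        node, parent, dist = stack.pop()
        if node == dst:
            return dist
        for nxt, w in adj[node]:
            if nxt != parent:
                stack.append((nxt, node, dist + w))
    raise ValueError("nodes are not connected")


def side_of_edge(adj, edge):
    """Nodes reachable from the first endpoint without crossing the edge."""
    a, b, _ = edge
    seen, stack = {a}, [a]
    while stack:
        node = stack.pop()
        for nxt, _w in adj[node]:
            if nxt in seen or (node == a and nxt == b):
                continue
            seen.add(nxt)
            stack.append(nxt)
    return seen


adj = adjacency(edges)
sides = [side_of_edge(adj, e) for e in edges]

# One coordinate per edge: sqrt(length) on one side of the edge, 0 on the other.
coords = {t: [sqrt(e[2]) if t in s else 0.0 for e, s in zip(edges, sides)]
          for t in taxa}

for x, y in combinations(taxa, 2):
    euclid = sqrt(sum((p - q) ** 2 for p, q in zip(coords[x], coords[y])))
    target = sqrt(tree_distance(adj, x, y))
    print(x, y, round(euclid, 6), round(target, 6))
```

    The embedding uses one dimension per edge, which is more than necessary but makes the equality between Euclidean distance and the square root of tree distance easy to see.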

  3. Tree growth response to ENSO in Durango, Mexico

    NASA Astrophysics Data System (ADS)

    Pompa-García, Marin; Miranda-Aragón, Liliana; Aguirre-Salado, Carlos Arturo

    2015-01-01

    The dynamics of forest ecosystems worldwide have been driven largely by climatic teleconnections. El Niño-Southern Oscillation (ENSO) is the strongest interannual variation of the Earth's climate, affecting the regional climatic regime. These teleconnections may impact plant phenology, growth rate, forest extent, and other gradual changes in forest ecosystems. The objective of this study was to investigate how Pinus cooperi populations face the influence of ENSO and regional microclimates in five ecozones in northwestern Mexico. Using standard dendrochronological techniques, tree-ring chronologies (TRI) were generated. TRI, ENSO, and climate relationships were correlated from 1950-2010. Additionally, multiple regressions were conducted in order to detect those ENSO months with direct relations in TRI (p < 0.1). The five chronologies showed similar trends during the period they overlapped, indicating that the P. cooperi populations shared an interannual growth variation. In general, ENSO index showed correspondences with tree-ring growth in synchronous periods. We concluded that ENSO had connectivity with regional climate in northern Mexico and radial growth of P. cooperi populations has been driven largely by positive ENSO values (El Niño episodes).

  4. Tree growth response to ENSO in Durango, Mexico.

    PubMed

    Pompa-García, Marin; Miranda-Aragón, Liliana; Aguirre-Salado, Carlos Arturo

    2015-01-01

    The dynamics of forest ecosystems worldwide have been driven largely by climatic teleconnections. El Niño-Southern Oscillation (ENSO) is the strongest interannual variation of the Earth's climate, affecting the regional climatic regime. These teleconnections may impact plant phenology, growth rate, forest extent, and other gradual changes in forest ecosystems. The objective of this study was to investigate how Pinus cooperi populations face the influence of ENSO and regional microclimates in five ecozones in northwestern Mexico. Using standard dendrochronological techniques, tree-ring chronologies (TRI) were generated. TRI, ENSO, and climate relationships were correlated from 1950-2010. Additionally, multiple regressions were conducted in order to detect those ENSO months with direct relations in TRI (p < 0.1). The five chronologies showed similar trends during the period they overlapped, indicating that the P. cooperi populations shared an interannual growth variation. In general, ENSO index showed correspondences with tree-ring growth in synchronous periods. We concluded that ENSO had connectivity with regional climate in northern Mexico and radial growth of P. cooperi populations has been driven largely by positive ENSO values (El Niño episodes).

  5. Faster Bit-Parallel Algorithms for Unordered Pseudo-tree Matching and Tree Homeomorphism

    NASA Astrophysics Data System (ADS)

    Kaneta, Yusaku; Arimura, Hiroki

    In this paper, we consider the unordered pseudo-tree matching problem: given two unordered labeled trees P and T, find all occurrences of P in T via many-one embeddings that preserve node labels and parent-child relationships. This problem is closely related to the tree pattern matching problem for XPath queries with child axis only. For m > w, we present an efficient algorithm that solves the problem in O(nm log(w)/w) time using O(hm/w + m log(w)/w) space and O(m log(w)) preprocessing on a unit-cost arithmetic RAM model with addition, where m is the number of nodes in P, n is the number of nodes in T, h is the height of T, and w is the word length. We also discuss a modification of our algorithm for the unordered tree homeomorphism problem, which corresponds to a tree pattern matching problem for XPath queries with descendant axis only.
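
    To make the problem statement concrete, here is a naive recursive baseline in Python. It is not the bit-parallel algorithm of the paper and does not achieve the stated bounds. It assumes that an occurrence is a node of T onto which the root of P can be mapped, and that "many-one" means several children of a pattern node may be mapped to the same child in T. The tuple representation of trees and the function names are hypothetical.

```python
# A tree is a (label, [children]) tuple; child order carries no meaning.

def matches(p, t):
    """True if pattern node p can be mapped onto text node t such that
    labels agree and every child of p maps onto some child of t.
    The mapping is many-one: two children of p may share one child of t."""
    p_label, p_children = p
    t_label, t_children = t
    if p_label != t_label:
        return False
    return all(any(matches(pc, tc) for tc in t_children) for pc in p_children)


def occurrences(p, t, path=()):
    """Return the positions in t (as tuples of child indices from the root)
    at which the root of p matches."""
    found = [path] if matches(p, t) else []
    for i, child in enumerate(t[1]):
        found.extend(occurrences(p, child, path + (i,)))
    return found


# Pattern: a node 'a' with two children labelled 'b'.
P = ("a", [("b", []), ("b", [])])

# Text: root 'a' with children 'b' and 'a'; the inner 'a' has children 'b', 'c'.
T = ("a", [("b", []),
           ("a", [("b", []), ("c", [])])])

print(occurrences(P, T))
```

    Both pattern children labelled 'b' map onto the single 'b' child at each matching node, which is what distinguishes a pseudo-tree (many-one) match from an injective one.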

  6. Tree growth and competition in an old-growth Picea abies forest of boreal Sweden: influence of tree spatial patterning

    USGS Publications Warehouse

    Fraver, Shawn; D'Amato, Anthony W.; Bradford, John B.; Jonsson, Bengt Gunnar; Jönsson, Mari; Esseen, Per-Anders

    2013-01-01

    Question: What factors best characterize tree competitive environments in this structurally diverse old-growth forest, and do these factors vary spatially within and among stands? Location: Old-growth Picea abies forest of boreal Sweden. Methods: Using long-term, mapped permanent plot data augmented with dendrochronological analyses, we evaluated the effect of neighbourhood competition on focal tree growth by means of standard competition indices, each modified to include various metrics of tree size, neighbour mortality weighting (for neighbours that died during the inventory period), and within-neighbourhood tree clustering. Candidate models were evaluated using mixed-model linear regression analyses, with mean basal area increment as the response variable. We then analysed stand-level spatial patterns of competition indices and growth rates (via kriging) to determine if the relationship between these patterns could further elucidate factors influencing tree growth. Results: Inter-tree competition clearly affected growth rates, with crown volume being the size metric most strongly influencing the neighbourhood competitive environment. Including neighbour tree mortality weightings in models only slightly improved descriptions of competitive interactions. Although the within-neighbourhood clustering index did not improve model predictions, competition intensity was influenced by the underlying stand-level tree spatial arrangement: stand-level clustering locally intensified competition and reduced tree growth, whereas in the absence of such clustering, inter-tree competition played a lesser role in constraining tree growth. Conclusions: Our findings demonstrate that competition continues to influence forest processes and structures in an old-growth system that has not experienced major disturbances for at least two centuries. The finding that the underlying tree spatial pattern influenced the competitive environment suggests caution in interpreting traditional tree

  7. Nonbinary Tree-Based Phylogenetic Networks.

    PubMed

    Jetten, Laura; van Iersel, Leo

    2018-01-01

    Rooted phylogenetic networks are used to describe evolutionary histories that contain non-treelike evolutionary events such as hybridization and horizontal gene transfer. In some cases, such histories can be described by a phylogenetic base-tree with additional linking arcs, which can, for example, represent gene transfer events. Such phylogenetic networks are called tree-based. Here, we consider two possible generalizations of this concept to nonbinary networks, which we call tree-based and strictly-tree-based nonbinary phylogenetic networks. We give simple graph-theoretic characterizations of tree-based and strictly-tree-based nonbinary phylogenetic networks. Moreover, we show for each of these two classes that it can be decided in polynomial time whether a given network is contained in the class. Our approach also provides a new view on tree-based binary phylogenetic networks. Finally, we discuss two examples of nonbinary phylogenetic networks in biology and show how our results can be applied to them.

  8. Availability and capacity of substance abuse programs in correctional settings: A classification and regression tree analysis.

    PubMed

    Taxman, Faye S; Kitsantas, Panagiota

    2009-08-01

    OBJECTIVE TO BE ADDRESSED: The purpose of this study was to investigate the structural and organizational factors that contribute to the availability and increased capacity for substance abuse treatment programs in correctional settings. We used classification and regression tree statistical procedures to identify how multi-level data can explain the variability in availability and capacity of substance abuse treatment programs in jails and probation/parole offices. The data for this study combined the National Criminal Justice Treatment Practices (NCJTP) Survey and the 2000 Census. The NCJTP survey was a nationally representative sample of correctional administrators for jails and probation/parole agencies. The sample size included 295 substance abuse treatment programs that were classified according to the intensity of their services: high, medium, and low. The independent variables included jurisdictional-level structural variables, attributes of the correctional administrators, and program and service delivery characteristics of the correctional agency. The two most important variables in predicting the availability of all three types of services were stronger working relationships with other organizations and the adoption of a standardized substance abuse screening tool by correctional agencies. For high and medium intensive programs, the capacity increased when an organizational learning strategy was used by administrators and the organization used a substance abuse screening tool. Implications on advancing treatment practices in correctional settings are discussed, including further work to test theories on how to better understand access to intensive treatment services. This study presents the first phase of understanding capacity-related issues regarding treatment programs offered in correctional settings.

  9. Chronic subdural hematoma: Surgical management and outcome in 986 cases: A classification and regression tree approach

    PubMed Central

    Rovlias, Aristedis; Theodoropoulos, Spyridon; Papoutsakis, Dimitrios

    2015-01-01

    Background: Chronic subdural hematoma (CSDH) is one of the most common clinical entities in daily neurosurgical practice which carries a most favorable prognosis. However, because of the advanced age and medical problems of patients, surgical therapy is frequently associated with various complications. This study evaluated the clinical features, radiological findings, and neurological outcome in a large series of patients with CSDH. Methods: A classification and regression tree (CART) technique was employed in the analysis of data from 986 patients who were operated on at Asclepeion General Hospital of Athens from January 1986 to December 2011. Burr hole evacuation with closed system drainage has been the operative technique of first choice at our institution for 29 consecutive years. A total of 27 prognostic factors were examined to predict the outcome at 3 months postoperatively. Results: Our results indicated that neurological status on admission was the best predictor of outcome. With regard to the other data, age, brain atrophy, thickness and density of hematoma, subdural accumulation of air, and antiplatelet and anticoagulant therapy were found to correlate significantly with prognosis. The overall cross-validated predictive accuracy of the CART model was 85.34%, with a cross-validated relative error of 0.326. Conclusions: Methodologically, the CART technique is quite different from the more commonly used methods, with the primary benefit of illustrating the important prognostic variables as related to outcome. Since the ideal therapy for the treatment of CSDH is still under debate, this technique may prove useful in developing new therapeutic strategies and approaches for patients with CSDH. PMID:26257985

  10. [Predicting very early rebleeding after acute variceal bleeding based on classification and regression tree analysis (CRTA).].

    PubMed

    Altamirano, J; Augustin, S; Muntaner, L; Zapata, L; González-Angulo, A; Martínez, B; Flores-Arroyo, A; Camargo, L; Genescá, J

    2010-01-01

    Variceal bleeding (VB) is the main cause of death among cirrhotic patients. About 30-50% of early rebleeding is encountered within a few days after the acute episode of VB. It is necessary to stratify patients with high risk of very early rebleeding (VER) for more aggressive therapies. However, there are few and incompletely understood prognostic models for this purpose. To determine the risk factors associated with VER after an acute VB. Assessment and comparison of a novel prognostic model generated by Classification and Regression Tree Analysis (CART) with classically used models (MELD and Child-Pugh [CP]). Sixty consecutive cirrhotic patients with acute variceal bleeding. CART analysis, MELD and Child-Pugh scores were performed at admission. Receiver operating characteristic (ROC) curves were constructed to evaluate the predictive performance of the models. Very early rebleeding rate was 13%. Variables associated with VER were: serum albumin (p = 0.027), creatinine (p = 0.021) and transfused blood units in the first 24 hrs (p = 0.05). The areas under the ROC curve for MELD, Child-Pugh and CART were 0.46, 0.50 and 0.82, respectively. The cut-off values identified by CART for the significant variables were: 1) albumin 2.85 mg/dL, 2) packed red cells 2 units and 3) creatinine 1.65 mg/dL. Serum albumin, creatinine and number of transfused blood units were associated with VER. A simple CART algorithm combining these variables allows an accurate predictive assessment of VER after acute variceal bleeding. Key words: cirrhosis, variceal bleeding, esophageal varices, prognosis, portal hypertension.

  11. Phylogenetic trees in bioinformatics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Burr, Tom L

    2008-01-01

    Genetic data is often used to infer evolutionary relationships among a collection of viruses, bacteria, animal or plant species, or other operational taxonomic units (OTU). A phylogenetic tree depicts such relationships and provides a visual representation of the estimated branching order of the OTUs. Tree estimation is unique for several reasons, including: the types of data used to represent each OTU; the use of probabilistic nucleotide substitution models; the inference goals involving both tree topology and branch length, and the huge number of possible trees for a given sample of a very modest number of OTUs, which implies that finding the best tree(s) to describe the genetic data for each OTU is computationally demanding. Bioinformatics is too large a field to review here. We focus on that aspect of bioinformatics that includes study of similarities in genetic data from multiple OTUs. Although research questions are diverse, a common underlying challenge is to estimate the evolutionary history of the OTUs. Therefore, this paper reviews the role of phylogenetic tree estimation in bioinformatics, available methods and software, and identifies areas for additional research and development.

  12. Harmonic regression of Landsat time series for modeling attributes from national forest inventory data

    NASA Astrophysics Data System (ADS)

    Wilson, Barry T.; Knight, Joseph F.; McRoberts, Ronald E.

    2018-03-01

    Imagery from the Landsat Program has been used frequently as a source of auxiliary data for modeling land cover, as well as a variety of attributes associated with tree cover. With ready access to all scenes in the archive since 2008 due to the USGS Landsat Data Policy, new approaches to deriving such auxiliary data from dense Landsat time series are required. Several methods have previously been developed for use with finer temporal resolution imagery (e.g. AVHRR and MODIS), including image compositing and harmonic regression using Fourier series. The manuscript presents a study, using Minnesota, USA during the years 2009-2013 as the study area and timeframe. The study examined the relative predictive power of land cover models, in particular those related to tree cover, using predictor variables based solely on composite imagery versus those using estimated harmonic regression coefficients. The study used two common non-parametric modeling approaches (i.e. k-nearest neighbors and random forests) for fitting classification and regression models of multiple attributes measured on USFS Forest Inventory and Analysis plots using all available Landsat imagery for the study area and timeframe. The estimated Fourier coefficients developed by harmonic regression of tasseled cap transformation time series data were shown to be correlated with land cover, including tree cover. Regression models using estimated Fourier coefficients as predictor variables showed a two- to threefold increase in explained variance for a small set of continuous response variables, relative to comparable models using monthly image composites. Similarly, the overall accuracies of classification models using the estimated Fourier coefficients were approximately 10-20 percentage points higher than the models using the image composites, with corresponding individual class accuracies between six and 45 percentage points higher.
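
    The harmonic regression described above fits a Fourier series to a time series and then uses the estimated coefficients as predictor variables. A minimal sketch of the fitting step follows, assuming an annual period, a single harmonic, and ordinary least squares solved through the normal equations. The function names, the 16-day sampling interval, and the synthetic series are illustrative assumptions and are not taken from the study.

```python
import math


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_harmonic(times, values, period=365.25, n_harmonics=1):
    """Least-squares fit of a mean plus cosine/sine pairs.

    Returns [mean, cos_1, sin_1, cos_2, sin_2, ...].
    """
    rows = []
    for t in times:
        row = [1.0]
        for k in range(1, n_harmonics + 1):
            angle = 2.0 * math.pi * k * t / period
            row += [math.cos(angle), math.sin(angle)]
        rows.append(row)
    p = len(rows[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Synthetic, noise-free series sampled every 16 days over three years.
days = list(range(0, 1096, 16))
series = [
    0.3
    + 0.2 * math.cos(2.0 * math.pi * d / 365.25)
    + 0.1 * math.sin(2.0 * math.pi * d / 365.25)
    for d in days
]
mean_term, cos_term, sin_term = fit_harmonic(days, series)
amplitude = math.hypot(cos_term, sin_term)
print(mean_term, cos_term, sin_term, amplitude)
```

    In a workflow like the one described, the fitted mean, cosine, and sine terms for each pixel would take the place of the monthly composites as model inputs.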

  13. Multivariate Linear Regression and CART Regression Analysis of TBM Performance at Abu Hamour Phase-I Tunnel

    NASA Astrophysics Data System (ADS)

    Jakubowski, J.; Stypulkowski, J. B.; Bernardeau, F. G.

    2017-12-01

    The first phase of the Abu Hamour drainage and storm tunnel was completed in early 2017. The 9.5 km long, 3.7 m diameter tunnel was excavated with two Earth Pressure Balance (EPB) Tunnel Boring Machines from Herrenknecht. TBM operation processes were monitored and recorded by Data Acquisition and Evaluation System. The authors coupled collected TBM drive data with available information on rock mass properties, cleansed, completed with secondary variables and aggregated by weeks and shifts. Correlations and descriptive statistics charts were examined. Multivariate Linear Regression and CART regression tree models linking TBM penetration rate (PR), penetration per revolution (PPR) and field penetration index (FPI) with TBM operational and geotechnical characteristics were performed for the conditions of the weak/soft rock of Doha. Both regression methods are interpretable and the data were screened with different computational approaches allowing enriched insight. The primary goal of the analysis was to investigate empirical relations between multiple explanatory and responding variables, to search for best subsets of explanatory variables and to evaluate the strength of linear and non-linear relations. For each of the penetration indices, a predictive model coupling both regression methods was built and validated. The resultant models appeared to be stronger than constituent ones and indicated an opportunity for more accurate and robust TBM performance predictions.

  14. Optimizing a basal bark spray of dinotefuran to manage armored scales (Hemiptera: Diaspididae) in Christmas tree plantations.

    PubMed

    Cowles, Richard S

    2010-10-01

    The armored scales Fiorinia externa Ferris and Aspidiotus cryptomeriae Kuwana (Hemiptera: Diaspididae) are increasingly damaging to Christmas tree plantings in southern New England. The systemic insecticide dinotefuran was investigated for selectively suppressing armored scale populations relative to their natural enemies in cooperating growers' fields in 2008 and 2009. Banded soil application of dinotefuran resulted in poor control. However, a dinotefuran spray applied to the basal 25 cm of trunk resulted in its absorption through the bark, translocation to the foliage, and good efficacy. The basal bark spray did not significantly impact the activity of predators Chilocorus stigma (Say) or Cybocephalus nipponicus Enrody-Younga and in 2009 showed a dosage-dependent improvement in the percentage of scales parasitized by Encarsia citrina Craw. A field dosage-response factorial experiment revealed that a 0.25% (vol:vol) addition of a surfactant with dinotefuran did not enhance insecticidal effect. Probit-transformed scale population reduction relative to the untreated check was subjected to linear regression analysis; reduction of scale populations was proportional to the log of insecticide dosage, whereas basal bark spray efficacy declined in proportion to the cube of tree height. The regression equation can be used to optimize dosage relative to tree height. Excellent efficacy resulted from basal bark spray application dates of 28 April (prebud break) to mid-June, but earlier spray timing within that treatment window had fewer crawlers discoloring new growth with their short-lived feeding. A basal bark spray of dinotefuran is well suited for integration with natural enemies to manage armored scales in Christmas tree plantations.

  15. Acid rain, air pollution, and tree growth in southeastern New York

    USGS Publications Warehouse

    Puckett, L.J.

    1982-01-01

    Whether dendroecological analyses could be used to detect changes in the relationship of tree growth to climate that might have resulted from chronic exposure to components of the acid rain-air pollution complex was determined. Tree-ring indices of white pine (Pinus strobus L.), eastern hemlock (Tsuga canadensis (L.) Carr.), pitch pine (Pinus rigida Mill.), and chestnut oak (Quercus prinus L.) were regressed against orthogonally transformed values of temperature and precipitation in order to derive a response-function relationship. Results of the regression analyses for three time periods, 1901–1920, 1926–1945, and 1954–1973, suggest that the relationship of tree growth to climate has been altered. Statistical tests of the temperature and precipitation data suggest that this change was nonclimatic. Temporally, the shift in growth response appears to correspond with the suspected increase in acid rain and air pollution in the Shawangunk Mountain area of southeastern New York in the early 1950s. This change could be the result of physiological stress induced by components of the acid rain-air pollution complex, causing climatic conditions to be more limiting to tree growth.

  16. Using structured additive regression models to estimate risk factors of malaria: analysis of 2010 Malawi malaria indicator survey data.

    PubMed

    Chirombo, James; Lowe, Rachel; Kazembe, Lawrence

    2014-01-01

    After years of implementing Roll Back Malaria (RBM) interventions, the changing landscape of malaria in terms of risk factors and spatial pattern has not been fully investigated. This paper uses the 2010 malaria indicator survey data to investigate if known malaria risk factors remain relevant after many years of interventions. We adopted a structured additive logistic regression model that allowed for spatial correlation, to more realistically estimate malaria risk factors. Our model included child and household level covariates, as well as climatic and environmental factors. Continuous variables were modelled by assuming second order random walk priors, while spatial correlation was specified as a Markov random field prior, with fixed effects assigned diffuse priors. Inference was fully Bayesian resulting in an under five malaria risk map for Malawi. Malaria risk increased with increasing age of the child. With respect to socio-economic factors, the greater the household wealth, the lower the malaria prevalence. A general decline in malaria risk was observed as altitude increased. Minimum temperatures and average total rainfall in the three months preceding the survey did not show a strong association with disease risk. The structured additive regression model offered a flexible extension to standard regression models by enabling simultaneous modelling of possible nonlinear effects of continuous covariates, spatial correlation and heterogeneity, while estimating usual fixed effects of categorical and continuous observed variables. Our results confirmed that malaria epidemiology is a complex interaction of biotic and abiotic factors, both at the individual, household and community level and that risk factors are still relevant many years after extensive implementation of RBM activities.

  17. Height-age relationships for regeneration-size trees in the northern Rocky Mountains, USA

    Treesearch

    Dennis E. Ferguson; Clinton E. Carlson

    2010-01-01

    Regression equations were developed to predict heights of 10 conifer species in regenerating stands in central and northern Idaho, western Montana, and eastern Washington. Most sample trees were natural regeneration that became established after conventional harvest and site preparation methods. Heights are predicted as a function of tree age, residual overstory density...

  18. Los Angeles 1-Million tree canopy cover assessment

    Treesearch

    Gregory E. McPherson; James R. Simpson; Qingfu Xiao; Wu Chunxia

    2008-01-01

    The Million Trees LA initiative intends to chart a course for sustainable growth through planting and stewardship of trees. The purpose of this study was to measure Los Angeles's existing tree canopy cover (TCC), determine if space exists for 1 million additional trees, and estimate future benefits from the planting. High resolution QuickBird remote sensing data,...

  19. Converting international ¼ inch tree volume to Doyle

    Treesearch

    Aaron Holley; John R. Brooks; Stuart A. Moss

    2014-01-01

    An equation for converting Mesavage and Girard's International ¼ inch tree volumes to the Doyle log rule is presented as a function of tree diameter. Trees having fewer than four logs exhibited volume prediction errors within a range of ±10 board feet. In addition, volume prediction error as a percent of actual Doyle tree volume...

  20. Boosted structured additive regression for Escherichia coli fed-batch fermentation modeling.

    PubMed

    Melcher, Michael; Scharl, Theresa; Luchner, Markus; Striedner, Gerald; Leisch, Friedrich

    2017-02-01

    The quality of biopharmaceuticals and patients' safety are of highest priority and there are tremendous efforts to replace empirical production process designs by knowledge-based approaches. The main challenge in this context is that real-time access to process variables related to product quality and quantity is severely limited. To date, comprehensive on- and offline monitoring platforms are used to generate process data sets that allow for development of mechanistic and/or data driven models for real-time prediction of these important quantities. The ultimate goal is to implement model based feed-back control loops that facilitate online control of product quality. In this contribution, we explore structured additive regression (STAR) models in combination with boosting as a variable selection tool for modeling the cell dry mass, product concentration, and optical density on the basis of online available process variables and two-dimensional fluorescence spectroscopic data. STAR models are powerful extensions of linear models allowing for inclusion of smooth effects or interactions between predictors. Boosting constructs the final model in a stepwise manner and provides a variable importance measure via predictor selection frequencies. Our results show that the cell dry mass can be modeled with a relative error of about ±3%, the optical density with ±6%, the soluble protein with ±16%, and the insoluble product with an accuracy of ±12%. Biotechnol. Bioeng. 2017;114: 321-334. © 2016 Wiley Periodicals, Inc.
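
    The stepwise construction and the selection-frequency importance measure mentioned above can be illustrated with component-wise L2 boosting. The sketch below is a simplification under stated assumptions: it uses plain linear base-learners on centered predictors rather than the smooth effects and interactions of a STAR model, and the function name, step size, and toy data are hypothetical rather than taken from the paper.

```python
def componentwise_boost(X, y, n_steps=100, step_size=0.1):
    """Component-wise L2 boosting with one linear base-learner per predictor.

    At every step each predictor is fitted to the current residuals on its
    own, and only the one that reduces the residual sum of squares the most
    is updated (by a shrunken amount). Returns the intercept (for centered
    predictors), the coefficients, and the selection frequency per predictor.
    """
    n, p = len(X), len(X[0])
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - col_means[j] for j in range(p)] for row in X]
    sum_sq = [sum(row[j] ** 2 for row in Xc) for j in range(p)]

    intercept = sum(y) / n
    coefs = [0.0] * p
    counts = [0] * p
    resid = [yi - intercept for yi in y]

    for _ in range(n_steps):
        best_j, best_beta, best_gain = None, 0.0, -1.0
        for j in range(p):
            if sum_sq[j] == 0.0:
                continue  # constant predictor carries no information
            beta = sum(row[j] * e for row, e in zip(Xc, resid)) / sum_sq[j]
            gain = beta * beta * sum_sq[j]  # drop in residual sum of squares
            if gain > best_gain:
                best_j, best_beta, best_gain = j, beta, gain
        if best_j is None:
            break
        coefs[best_j] += step_size * best_beta
        counts[best_j] += 1
        resid = [e - step_size * best_beta * row[best_j]
                 for row, e in zip(Xc, resid)]

    frequencies = [c / n_steps for c in counts]
    return intercept, coefs, frequencies


# Toy data: the response depends on the first predictor only.
toy_X = [[float(i), float((i * 7) % 5), float((i * 3) % 4)] for i in range(20)]
toy_y = [2.0 * row[0] + 1.0 for row in toy_X]
toy_intercept, toy_coefs, toy_freq = componentwise_boost(toy_X, toy_y)
print(toy_coefs, toy_freq)
```

    Predictors that are never selected keep a coefficient of exactly zero, which is why this style of boosting doubles as a variable selection tool.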

  1. Modeling Tree Mortality Following Wildfire in Pinus ponderosa Forests in the Central Sierra Nevada of California

    Treesearch

    Jon C. Regelbrugge

    1993-01-01

    Abstract. We modeled tree mortality occurring two years following wildfire in Pinus ponderosa forests using data from 1275 trees in 25 stands burned during the 1987 Stanislaus Complex fires. We used logistic regression analysis to develop models relating the probability of wildfire-induced mortality with tree size and fire severity for Pinus ponderosa, Calocedrus...

  2. Prediction of strontium bromide laser efficiency using cluster and decision tree analysis

    NASA Astrophysics Data System (ADS)

    Iliev, Iliycho; Gocheva-Ilieva, Snezhana; Kulin, Chavdar

    2018-01-01

    The subject of investigation is a new high-powered strontium bromide (SrBr2) vapor laser emitting in a multiline region of wavelengths. The laser is an alternative to the atom strontium lasers and electron free lasers, especially at the 6.45 μm line, which is used in surgery for medical processing of biological tissues and bones with minimal damage. In this paper the experimental data from measurements of operational and output characteristics of the laser are statistically processed by means of cluster analysis and tree-based regression techniques. The aim is to extract from the available data the more important relationships and dependences that influence the increase of the overall laser efficiency. A set of cluster models is constructed and analyzed. Using different cluster methods, it is shown that the seven investigated operational characteristics (laser tube diameter, length, supplied electrical power, and others) and laser efficiency are combined in 2 clusters. Regression tree models built with the Classification and Regression Trees (CART) technique yield dependences that predict the values of efficiency, and especially the maximum efficiency, with over 95% accuracy.

  3. Tree-Structured Infinite Sparse Factor Model

    PubMed Central

    Zhang, XianXing; Dunson, David B.; Carin, Lawrence

    2013-01-01

    A tree-structured multiplicative gamma process (TMGP) is developed, for inferring the depth of a tree-based factor-analysis model. This new model is coupled with the nested Chinese restaurant process, to nonparametrically infer the depth and width (structure) of the tree. In addition to developing the model, theoretical properties of the TMGP are addressed, and a novel MCMC sampler is developed. The structure of the inferred tree is used to learn relationships between high-dimensional data, and the model is also applied to compressive sensing and interpolation of incomplete images. PMID:25279389

  4. Tree-, stand- and site-specific controls on landscape-scale patterns of transpiration

    NASA Astrophysics Data System (ADS)

    Hassler, Sibylle Kathrin; Weiler, Markus; Blume, Theresa

    2018-01-01

    Transpiration is a key process in the hydrological cycle, and a sound understanding and quantification of transpiration and its spatial variability is essential for management decisions as well as for improving the parameterisation and evaluation of hydrological and soil-vegetation-atmosphere transfer models. For individual trees, transpiration is commonly estimated by measuring sap flow. Besides evaporative demand and water availability, tree-specific characteristics such as species, size or social status control sap flow amounts of individual trees. Within forest stands, properties such as species composition, basal area or stand density additionally affect sap flow, for example via competition mechanisms. Finally, sap flow patterns might also be influenced by landscape-scale characteristics such as geology and soils, slope position or aspect because they affect water and energy availability; however, little is known about the dynamic interplay of these controls. We studied the relative importance of various tree-, stand- and site-specific characteristics with multiple linear regression models to explain the variability of sap velocity measurements in 61 beech and oak trees, located at 24 sites across a 290 km² catchment in Luxembourg. For each of 132 consecutive days of the growing season of 2014 we modelled the daily sap velocity and derived sap flow patterns of these 61 trees, and we determined the importance of the different controls. Results indicate that a combination of mainly tree- and site-specific factors controls sap velocity patterns in the landscape, namely tree species, tree diameter, geology and aspect. For sap flow we included only the stand- and site-specific predictors in the models to ensure variable independence. Of those, geology and aspect were most important. Compared to these predictors, spatial variability of atmospheric demand and soil moisture explains only a small fraction of the variability in the daily datasets. However, the temporal

  5. The effect of different distance measures in detecting outliers using clustering-based algorithm for circular regression model

    NASA Astrophysics Data System (ADS)

    Di, Nur Faraidah Muhammad; Satari, Siti Zanariah

    2017-05-01

    Outlier detection in linear data sets has been studied extensively, but only a small amount of work has been done on outlier detection in circular data. In this study, we propose a method for detecting multiple outliers in circular regression models based on a clustering algorithm. Clustering techniques use a distance measure to define the distance between data points. Here, we introduce a similarity distance based on Euclidean distance for the circular model and obtain a cluster tree using the single linkage clustering algorithm. Then, a stopping rule for the cluster tree based on the mean direction and circular standard deviation of the tree height is proposed. We classify a cluster group that exceeds the stopping rule as a potential outlier. Our aim is to demonstrate the effectiveness of the proposed algorithms with the similarity distances in detecting outliers. It is found that the proposed methods perform well and are applicable to circular regression models.

  6. The wisdom of the commons: ensemble tree classifiers for prostate cancer prognosis.

    PubMed

    Koziol, James A; Feng, Anne C; Jia, Zhenyu; Wang, Yipeng; Goodison, Seven; McClelland, Michael; Mercola, Dan

    2009-01-01

    Classification and regression trees have long been used for cancer diagnosis and prognosis. Nevertheless, instability and variable selection bias, as well as overfitting, are well-known problems of tree-based methods. In this article, we investigate whether ensemble tree classifiers can ameliorate these difficulties, using data from two recent studies of radical prostatectomy in prostate cancer. Using time to progression following prostatectomy as the relevant clinical endpoint, we found that ensemble tree classifiers robustly and reproducibly identified three subgroups of patients in the two clinical datasets: non-progressors, early progressors and late progressors. Moreover, the consensus classifications were independent predictors of time to progression compared to known clinical prognostic factors.

  7. Potential Changes in Tree Species Richness and Forest Community Types following Climate Change

    Treesearch

    Louis R. Iverson; Anantha M. Prasad

    2001-01-01

    Potential changes in tree species richness and forest community types were evaluated for the eastern United States according to five scenarios of future climate change resulting from a doubling of atmospheric carbon dioxide (CO2). DISTRIB, an empirical model that uses a regression tree analysis approach, was used to generate suitable habitat, or potential future...

  8. Unravelling the limits to tree height: a major role for water and nutrient trade-offs.

    PubMed

    Cramer, Michael D

    2012-05-01

    Competition for light has driven forest trees to grow exceedingly tall, but the lack of a single universal limit to tree height indicates multiple interacting environmental limitations. Because soil nutrient availability is determined by both nutrient concentrations and soil water, water and nutrient availabilities may interact in determining realised nutrient availability and consequently tree height. In SW Australia, which is characterised by nutrient impoverished soils that support some of the world's tallest forests, total [P] and water availability were independently correlated with tree height (r = 0.42 and 0.39, respectively). However, interactions between water availability and each of total [P], pH and [Mg] contributed to a multiple linear regression model of tree height (r = 0.72). A boosted regression tree model showed that maximum tree height was correlated with water availability (24%), followed by soil properties including total P (11%), Mg (10%) and total N (9%), amongst others, and that there was an interaction between water availability and total [P] in determining maximum tree height. These interactions indicated a trade-off between water and P availability in determining maximum tree height in SW Australia. This is enabled by a species assemblage capable of growing tall and surviving (some) disturbances. The mechanism for this trade-off is suggested to be through water enabling mass-flow and diffusive mobility of P, particularly of relatively mobile organic P, although water interactions with microbial activity could also play a role.

  9. Using Structured Additive Regression Models to Estimate Risk Factors of Malaria: Analysis of 2010 Malawi Malaria Indicator Survey Data

    PubMed Central

    Chirombo, James; Lowe, Rachel; Kazembe, Lawrence

    2014-01-01

    Background: After years of implementing Roll Back Malaria (RBM) interventions, the changing landscape of malaria in terms of risk factors and spatial pattern has not been fully investigated. This paper uses the 2010 malaria indicator survey data to investigate if known malaria risk factors remain relevant after many years of interventions. Methods: We adopted a structured additive logistic regression model that allowed for spatial correlation, to more realistically estimate malaria risk factors. Our model included child and household level covariates, as well as climatic and environmental factors. Continuous variables were modelled by assuming second order random walk priors, while spatial correlation was specified as a Markov random field prior, with fixed effects assigned diffuse priors. Inference was fully Bayesian resulting in an under five malaria risk map for Malawi. Results: Malaria risk increased with increasing age of the child. With respect to socio-economic factors, the greater the household wealth, the lower the malaria prevalence. A general decline in malaria risk was observed as altitude increased. Minimum temperatures and average total rainfall in the three months preceding the survey did not show a strong association with disease risk. Conclusions: The structured additive regression model offered a flexible extension to standard regression models by enabling simultaneous modelling of possible nonlinear effects of continuous covariates, spatial correlation and heterogeneity, while estimating usual fixed effects of categorical and continuous observed variables. Our results confirmed that malaria epidemiology is a complex interaction of biotic and abiotic factors, both at the individual, household and community level and that risk factors are still relevant many years after extensive implementation of RBM activities. PMID:24991915

  10. Tests with VHR images for the identification of olive trees and other fruit trees in the European Union

    NASA Astrophysics Data System (ADS)

    Masson, Josiane; Soille, Pierre; Mueller, Rick

    2004-10-01

    In the context of the Common Agricultural Policy (CAP), the European Commission has a strong interest in counting and individually locating fruit trees. An automatic counting algorithm developed by the JRC (OLICOUNT) was used in the past for olive trees only, on 1 m black and white orthophotos, but with limits in the case of young trees or irregular groves. This study investigates the improvement of fruit tree identification using VHR images on a large set of data in three test sites: one in Crete (Greece), one in the south-east of France with a majority of olive trees and associated fruit trees, and the last one in Florida on citrus trees. OLICOUNT was compared with two other automatic tree counting applications, one using the CRISP software on citrus trees and the other completely automatic, based on regional minima (morphological image analysis). Additional investigation was undertaken to refine the methods. This paper describes the automatic methods and presents the results derived from the tests.

  11. A Comparative Assessment of the Influences of Human Impacts on Soil Cd Concentrations Based on Stepwise Linear Regression, Classification and Regression Tree, and Random Forest Models

    PubMed Central

    Qiu, Lefeng; Wang, Kai; Long, Wenli; Wang, Ke; Hu, Wei; Amable, Gabriel S.

    2016-01-01

    Soil cadmium (Cd) contamination has attracted a great deal of attention because of its detrimental effects on animals and humans. This study aimed to develop and compare the performances of stepwise linear regression (SLR), classification and regression tree (CART) and random forest (RF) models in the prediction and mapping of the spatial distribution of soil Cd and to identify likely sources of Cd accumulation in Fuyang County, eastern China. Soil Cd data from 276 topsoil (0–20 cm) samples were collected and randomly divided into calibration (222 samples) and validation datasets (54 samples). Auxiliary data, including detailed land use information, soil organic matter, soil pH, and topographic data, were incorporated into the models to simulate the soil Cd concentrations and further identify the main factors influencing soil Cd variation. The predictive models for soil Cd concentration exhibited acceptable overall accuracies (72.22% for SLR, 70.37% for CART, and 75.93% for RF). The SLR model exhibited the largest predicted deviation, with a mean error (ME) of 0.074 mg/kg, a mean absolute error (MAE) of 0.160 mg/kg, and a root mean squared error (RMSE) of 0.274 mg/kg, and the RF model produced the results closest to the observed values, with an ME of 0.002 mg/kg, an MAE of 0.132 mg/kg, and an RMSE of 0.198 mg/kg. The RF model also exhibited the greatest R2 value (0.772). The CART model predictions closely followed, with ME, MAE, RMSE, and R2 values of 0.013 mg/kg, 0.154 mg/kg, 0.230 mg/kg and 0.644, respectively. The three prediction maps generally exhibited similar and realistic spatial patterns of soil Cd contamination. The heavily Cd-affected areas were primarily located in the alluvial valley plain of the Fuchun River and its tributaries because of the dramatic industrialization and urbanization processes that have occurred there. The most important variable for explaining high levels of soil Cd accumulation was the presence of metal smelting industries. The

  12. A Comparative Assessment of the Influences of Human Impacts on Soil Cd Concentrations Based on Stepwise Linear Regression, Classification and Regression Tree, and Random Forest Models.

    PubMed

    Qiu, Lefeng; Wang, Kai; Long, Wenli; Wang, Ke; Hu, Wei; Amable, Gabriel S

    2016-01-01

    Soil cadmium (Cd) contamination has attracted a great deal of attention because of its detrimental effects on animals and humans. This study aimed to develop and compare the performances of stepwise linear regression (SLR), classification and regression tree (CART) and random forest (RF) models in the prediction and mapping of the spatial distribution of soil Cd and to identify likely sources of Cd accumulation in Fuyang County, eastern China. Soil Cd data from 276 topsoil (0-20 cm) samples were collected and randomly divided into calibration (222 samples) and validation datasets (54 samples). Auxiliary data, including detailed land use information, soil organic matter, soil pH, and topographic data, were incorporated into the models to simulate the soil Cd concentrations and further identify the main factors influencing soil Cd variation. The predictive models for soil Cd concentration exhibited acceptable overall accuracies (72.22% for SLR, 70.37% for CART, and 75.93% for RF). The SLR model exhibited the largest predicted deviation, with a mean error (ME) of 0.074 mg/kg, a mean absolute error (MAE) of 0.160 mg/kg, and a root mean squared error (RMSE) of 0.274 mg/kg, and the RF model produced the results closest to the observed values, with an ME of 0.002 mg/kg, an MAE of 0.132 mg/kg, and an RMSE of 0.198 mg/kg. The RF model also exhibited the greatest R2 value (0.772). The CART model predictions closely followed, with ME, MAE, RMSE, and R2 values of 0.013 mg/kg, 0.154 mg/kg, 0.230 mg/kg and 0.644, respectively. The three prediction maps generally exhibited similar and realistic spatial patterns of soil Cd contamination. The heavily Cd-affected areas were primarily located in the alluvial valley plain of the Fuchun River and its tributaries because of the dramatic industrialization and urbanization processes that have occurred there. The most important variable for explaining high levels of soil Cd accumulation was the presence of metal smelting industries. The
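
    The validation statistics quoted in the abstract above (ME, MAE, RMSE and R2) can be computed from paired observed and predicted values. The sketch below assumes ME is the mean of predicted minus observed and R2 is the coefficient of determination, 1 - SS_res / SS_tot. The abstract does not state which conventions the authors used, so both are assumptions, and the sample values are hypothetical.

```python
import math


def validation_metrics(observed, predicted):
    """Return (ME, MAE, RMSE, R2) for paired observations and predictions."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]

    me = sum(errors) / n                              # mean error (bias)
    mae = sum(abs(e) for e in errors) / n             # mean absolute error
    rmse = math.sqrt(sum(e * e for e in errors) / n)  # root mean squared error

    mean_obs = sum(observed) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot                        # assumes ss_tot > 0
    return me, mae, rmse, r2


# Hypothetical concentrations in mg/kg (not data from the study).
obs = [0.10, 0.20, 0.30, 0.40]
pred = [0.12, 0.18, 0.33, 0.37]
me, mae, rmse, r2 = validation_metrics(obs, pred)
print(me, mae, rmse, r2)
```

    A mean error near zero alongside a larger MAE, as in this toy example, indicates errors that cancel on average rather than a model that is accurate for each sample.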

  13. Decision Tree Approach for Soil Liquefaction Assessment

    PubMed Central

    Gandomi, Amir H.; Fridline, Mark M.; Roke, David A.

    2013-01-01

    In the current study, the performances of some decision tree (DT) techniques are evaluated for postearthquake soil liquefaction assessment. A database containing 620 records of seismic parameters and soil properties is used in this study. Three decision tree techniques are used here in two different ways, considering statistical and engineering points of view, to develop decision rules. The DT results are compared to the logistic regression (LR) model. The results of this study indicate that the DTs not only successfully predict liquefaction but they can also outperform the LR model. The best DT models are interpreted and evaluated based on an engineering point of view. PMID:24489498

  14. Decision tree approach for soil liquefaction assessment.

    PubMed

    Gandomi, Amir H; Fridline, Mark M; Roke, David A

    2013-01-01

    In the current study, the performances of some decision tree (DT) techniques are evaluated for postearthquake soil liquefaction assessment. A database containing 620 records of seismic parameters and soil properties is used in this study. Three decision tree techniques are used here in two different ways, considering statistical and engineering points of view, to develop decision rules. The DT results are compared to the logistic regression (LR) model. The results of this study indicate that the DTs not only successfully predict liquefaction but they can also outperform the LR model. The best DT models are interpreted and evaluated based on an engineering point of view.

  15. Censored quantile regression with recursive partitioning-based weights

    PubMed Central

    Wey, Andrew; Wang, Lan; Rudser, Kyle

    2014-01-01

    Censored quantile regression provides a useful alternative to the Cox proportional hazards model for analyzing survival data. It directly models the conditional quantile of the survival time and hence is easy to interpret. Moreover, it relaxes the proportionality constraint on the hazard function associated with the popular Cox model and is natural for modeling heterogeneity of the data. Recently, Wang and Wang (2009. Locally weighted censored quantile regression. Journal of the American Statistical Association 103, 1117–1128) proposed a locally weighted censored quantile regression approach that allows for covariate-dependent censoring and is less restrictive than other censored quantile regression methods. However, their kernel smoothing-based weighting scheme requires all covariates to be continuous and encounters practical difficulty with even a moderate number of covariates. We propose a new weighting approach that uses recursive partitioning, e.g. survival trees, that offers greater flexibility in handling covariate-dependent censoring in moderately high dimensions and can incorporate both continuous and discrete covariates. We prove that this new weighting scheme leads to consistent estimation of the quantile regression coefficients and demonstrate its effectiveness via Monte Carlo simulations. We also illustrate the new method using a widely recognized data set from a clinical trial on primary biliary cirrhosis. PMID:23975800

  16. Chilling and heat requirements for flowering in temperate fruit trees

    NASA Astrophysics Data System (ADS)

    Guo, Liang; Dai, Junhu; Ranjitkar, Sailesh; Yu, Haiying; Xu, Jianchu; Luedeling, Eike

    2014-08-01

    Climate change has affected the rates of chilling and heat accumulation, which are vital for flowering and production, in temperate fruit trees, but few studies have been conducted in the cold-winter climates of East Asia. To evaluate tree responses to variation in chill and heat accumulation rates, partial least squares regression was used to correlate first flowering dates of chestnut (Castanea mollissima Blume) and jujube (Zizyphus jujube Mill.) in Beijing, China, with daily chill and heat accumulation between 1963 and 2008. The Dynamic Model and the Growing Degree Hour Model were used to convert daily records of minimum and maximum temperature into horticulturally meaningful metrics. Regression analyses identified the chilling and forcing periods for chestnut and jujube. The forcing periods started when half the chilling requirements were fulfilled. Over the past 50 years, heat accumulation during tree dormancy increased significantly, while chill accumulation remained relatively stable for both species. Heat accumulation was the main driver of bloom timing, with effects of variation in chill accumulation negligible in Beijing's cold-winter climate. It does not seem likely that reductions in chill will have a major effect on the studied species in Beijing in the near future. Such problems are much more likely for trees grown in locations that are substantially warmer than their native habitats, such as temperate species in the subtropics and tropics.

  17. Chilling and heat requirements for flowering in temperate fruit trees.

    PubMed

    Guo, Liang; Dai, Junhu; Ranjitkar, Sailesh; Yu, Haiying; Xu, Jianchu; Luedeling, Eike

    2014-08-01

    Climate change has affected the rates of chilling and heat accumulation, which are vital for flowering and production, in temperate fruit trees, but few studies have been conducted in the cold-winter climates of East Asia. To evaluate tree responses to variation in chill and heat accumulation rates, partial least squares regression was used to correlate first flowering dates of chestnut (Castanea mollissima Blume) and jujube (Zizyphus jujube Mill.) in Beijing, China, with daily chill and heat accumulation between 1963 and 2008. The Dynamic Model and the Growing Degree Hour Model were used to convert daily records of minimum and maximum temperature into horticulturally meaningful metrics. Regression analyses identified the chilling and forcing periods for chestnut and jujube. The forcing periods started when half the chilling requirements were fulfilled. Over the past 50 years, heat accumulation during tree dormancy increased significantly, while chill accumulation remained relatively stable for both species. Heat accumulation was the main driver of bloom timing, with effects of variation in chill accumulation negligible in Beijing’s cold-winter climate. It does not seem likely that reductions in chill will have a major effect on the studied species in Beijing in the near future. Such problems are much more likely for trees grown in locations that are substantially warmer than their native habitats, such as temperate species in the subtropics and tropics.

  18. Comprehensive decision tree models in bioinformatics.

    PubMed

    Stiglic, Gregor; Kocbek, Simon; Pernek, Igor; Kokol, Peter

    2012-01-01

    Classification is an important and widely used machine learning technique in bioinformatics. Researchers and other end-users of machine learning software often prefer to work with comprehensible models where knowledge extraction and explanation of reasoning behind the classification model are possible. This paper presents an extension to an existing machine learning environment and a study on visual tuning of decision tree classifiers. The motivation for this research comes from the need to build effective and easily interpretable decision tree models by a so-called one-button data mining approach where no parameter tuning is needed. To avoid bias in classification, no classification performance measure is used during the tuning of the model, which is constrained exclusively by the dimensions of the produced decision tree. The proposed visual tuning of decision trees was evaluated on 40 datasets containing classical machine learning problems and 31 datasets from the field of bioinformatics. Although we did not expect significant differences in classification performance, the results demonstrate a significant increase of accuracy in less complex visually tuned decision trees. In contrast to classical machine learning benchmarking datasets, we observe higher accuracy gains in bioinformatics datasets. Additionally, a user study was carried out to confirm the assumption that the tree tuning times are significantly lower for the proposed method in comparison to manual tuning of the decision tree. The empirical results demonstrate that by building simple models constrained by predefined visual boundaries, one not only achieves good comprehensibility, but also very good classification performance that does not differ from usually more complex models built using default settings of the classical decision tree algorithm. In addition, our study demonstrates the suitability of visually tuned decision trees for datasets with binary class attributes and a high number of possibly

  19. Comprehensive Decision Tree Models in Bioinformatics

    PubMed Central

    Stiglic, Gregor; Kocbek, Simon; Pernek, Igor; Kokol, Peter

    2012-01-01

    Purpose: Classification is an important and widely used machine learning technique in bioinformatics. Researchers and other end-users of machine learning software often prefer to work with comprehensible models where knowledge extraction and explanation of reasoning behind the classification model are possible. Methods: This paper presents an extension to an existing machine learning environment and a study on visual tuning of decision tree classifiers. The motivation for this research comes from the need to build effective and easily interpretable decision tree models by a so-called one-button data mining approach where no parameter tuning is needed. To avoid bias in classification, no classification performance measure is used during the tuning of the model, which is constrained exclusively by the dimensions of the produced decision tree. Results: The proposed visual tuning of decision trees was evaluated on 40 datasets containing classical machine learning problems and 31 datasets from the field of bioinformatics. Although we did not expect significant differences in classification performance, the results demonstrate a significant increase of accuracy in less complex visually tuned decision trees. In contrast to classical machine learning benchmarking datasets, we observe higher accuracy gains in bioinformatics datasets. Additionally, a user study was carried out to confirm the assumption that the tree tuning times are significantly lower for the proposed method in comparison to manual tuning of the decision tree. Conclusions: The empirical results demonstrate that by building simple models constrained by predefined visual boundaries, one not only achieves good comprehensibility, but also very good classification performance that does not differ from usually more complex models built using default settings of the classical decision tree algorithm. In addition, our study demonstrates the suitability of visually tuned decision trees for datasets with binary class

  20. Tree-Ring Based May-July Temperature Reconstruction Since AD 1630 on the Western Loess Plateau, China

    PubMed Central

    Song, Huiming; Liu, Yu; Li, Qiang; Gao, Na; Ma, Yongyong; Zhang, Yanhua

    2014-01-01

    Tree-ring samples from Chinese Pine (Pinus tabulaeformis Carr.) collected at Mt. Shimen on the western Loess Plateau, China, were used to reconstruct the mean May–July temperature during AD 1630–2011. The regression model explained 48% of the adjusted variance in the instrumentally observed mean May–July temperature. The reconstruction revealed significant temperature variations at interannual to decadal scales. Cool periods observed in the reconstruction coincided with reduced solar activities. The reconstructed temperature matched well with two other tree-ring based temperature reconstructions conducted on the northern slope of the Qinling Mountains (on the southern margin of the Loess Plateau of China) for both annual and decadal scales. In addition, this study agreed well with several series derived from different proxies. This reconstruction improves upon the sparse network of high-resolution paleoclimatic records for the western Loess Plateau, China. PMID:24690885

  1. Understanding Poisson regression.

    PubMed

    Hayat, Matthew J; Higgins, Melinda

    2014-04-01

    Nurse investigators often collect study data in the form of counts. Traditional methods of data analysis have historically approached analysis of count data either as if the count data were continuous and normally distributed or with dichotomization of the counts into the categories of occurred or did not occur. These outdated methods for analyzing count data have been replaced with more appropriate statistical methods that make use of the Poisson probability distribution, which is useful for analyzing count data. The purpose of this article is to provide an overview of the Poisson distribution and its use in Poisson regression. Assumption violations for the standard Poisson regression model are addressed with alternative approaches, including addition of an overdispersion parameter or negative binomial regression. An illustrative example is presented with an application from the ENSPIRE study, and regression modeling of comorbidity data is included for illustrative purposes. Copyright 2014, SLACK Incorporated.
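
    The overdispersion check mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the article's analysis: the counts are made up, and the function names fit_poisson and pearson_dispersion are hypothetical. It fits a one-predictor Poisson regression by Newton-Raphson and then computes the Pearson dispersion statistic; values well above 1 are the usual signal that a quasi-Poisson or negative binomial model may be more appropriate.

```python
from math import exp, log


def fit_poisson(x, y, max_iter=50, tol=1e-10):
    """Fit log(mu) = b0 + b1 * x by Newton-Raphson and return (b0, b1)."""
    b0, b1 = log(sum(y) / len(y)), 0.0  # start from the intercept-only fit
    for _ in range(max_iter):
        mu = [exp(b0 + b1 * xi) for xi in x]
        # Score vector (gradient of the Poisson log-likelihood).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Fisher information (2x2), inverted by hand.
        h00 = sum(mu)
        h01 = sum(mi * xi for xi, mi in zip(x, mu))
        h11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


def pearson_dispersion(x, y, b0, b1, n_params=2):
    """Pearson chi-square divided by residual degrees of freedom."""
    mu = [exp(b0 + b1 * xi) for xi in x]
    chi2 = sum((yi - mi) ** 2 / mi for yi, mi in zip(y, mu))
    return chi2 / (len(y) - n_params)


# Made-up example data: a predictor and observed event counts.
exposure = [0, 1, 2, 3, 4, 5, 6, 7]
counts = [1, 2, 2, 4, 5, 9, 12, 20]

b0, b1 = fit_poisson(exposure, counts)
fitted = [exp(b0 + b1 * xi) for xi in exposure]
dispersion = pearson_dispersion(exposure, counts, b0, b1)
print(f"intercept={b0:.3f} slope={b1:.3f} dispersion={dispersion:.3f}")
```

    In practice a statistics package would be used for the fit; the hand-written solver only keeps the sketch free of dependencies.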

  2. Annual Tree Growth Predictions From Periodic Measurements

    Treesearch

    Quang V. Cao

    2004-01-01

    Data from annual measurements of a loblolly pine (Pinus taeda L.) plantation were available for this study. Regression techniques were employed to model annual changes of individual trees in terms of diameters, heights, and survival probabilities. Subsets of the data that include measurements every 2, 3, 4, 5, and 6 years were used to fit the same...

  3. Random Forest as a Predictive Analytics Alternative to Regression in Institutional Research

    ERIC Educational Resources Information Center

    He, Lingjun; Levine, Richard A.; Fan, Juanjuan; Beemer, Joshua; Stronach, Jeanne

    2018-01-01

    In institutional research, modern data mining approaches are seldom considered to address predictive analytics problems. The goal of this paper is to highlight the advantages of tree-based machine learning algorithms over classic (logistic) regression methods for data-informed decision making in higher education problems, and stress the success of…

  4. Use of GLM approach to assess the responses of tropical trees to urban air pollution in relation to leaf functional traits and tree characteristics.

    PubMed

    Mukherjee, Arideep; Agrawal, Madhoolika

    2018-05-15

    Responses of urban vegetation to air pollution stress in relation to their tolerance and sensitivity have been extensively studied; however, studies of air pollution responses based on different leaf functional traits and tree characteristics are limited. In this paper, we have tried to assess the combined and individual effects of the major air pollutants PM10 (particulate matter ≤ 10 µm), TSP (total suspended particulate matter), SO2 (sulphur dioxide), NO2 (nitrogen dioxide) and O3 (ozone) on thirteen tropical tree species in relation to fifteen leaf functional traits and different tree characteristics. Stepwise linear regression, a general linear modelling approach, was used to quantify the pollution response of trees against air pollutants. The study was performed for six successive seasons over two years in three distinct urban areas (traffic, industrial and residential) of Varanasi city in India. At all the study sites, concentrations of air pollutants, specifically PM (particulate matter) and NO2, were above the specified standards. Distinct variations were recorded in all the fifteen leaf functional traits with pollution load. Caesalpinia sappan was identified as the most tolerant species, followed by Psidium guajava, Dalbergia sissoo and Albizia lebbeck. Stepwise regression analysis identified the maximum response of Eucalyptus citriodora and P. guajava to air pollutants, explaining overall 59% and 58% of the variability in leaf functional traits, respectively. Among leaf functional traits, the maximum effect of air pollutants was observed on non-enzymatic antioxidants, followed by photosynthetic pigments and leaf water status. Among the pollutants, PM was identified as the major stress factor, followed by O3, explaining 47% and 33% of the variability in leaf functional traits. Tolerance and pollution response were regulated by different tree characteristics such as height, canopy size, leaf form, texture and nature of tree. Outcomes of this study will help in urban forest

  5. The wisdom of the commons: ensemble tree classifiers for prostate cancer prognosis

    PubMed Central

    Koziol, James A.; Feng, Anne C.; Jia, Zhenyu; Wang, Yipeng; Goodison, Seven; McClelland, Michael; Mercola, Dan

    2009-01-01

    Motivation: Classification and regression trees have long been used for cancer diagnosis and prognosis. Nevertheless, instability and variable selection bias, as well as overfitting, are well-known problems of tree-based methods. In this article, we investigate whether ensemble tree classifiers can ameliorate these difficulties, using data from two recent studies of radical prostatectomy in prostate cancer. Results: Using time to progression following prostatectomy as the relevant clinical endpoint, we found that ensemble tree classifiers robustly and reproducibly identified three subgroups of patients in the two clinical datasets: non-progressors, early progressors and late progressors. Moreover, the consensus classifications were independent predictors of time to progression compared to known clinical prognostic factors. Contact: dmercola@uci.edu PMID:18628288

  6. A study of Solar-ENSO correlation with southern Brazil tree ring index (1955-1991)

    NASA Astrophysics Data System (ADS)

    Rigozo, N.; Nordemann, D.; Vieira, L.; Echer, E.

    The effects of solar activity and El Niño-Southern Oscillation on tree growth in Southern Brazil were studied by correlation analysis. Trees for this study were native Araucaria (Araucaria angustifolia) from four locations in Rio Grande do Sul State, in Southern Brazil: Canela (29°18'S, 50°51'W, 790 m asl), Nova Petropolis (29°2'S, 51°10'W, 579 m asl), Sao Francisco de Paula (29°25'S, 50°24'W, 930 m asl) and Sao Martinho da Serra (29°30'S, 53°53'W, 484 m asl). From these four sites, an average tree ring Index for this region was derived for the period 1955-1991. Linear correlations were computed on annual and 10-year running averages of this tree ring Index, of sunspot number Rz and of the Southern Oscillation Index (SOI). For annual averages, the correlation coefficients were low, and the multiple regression between tree ring and SOI and Rz indicates that 20% of the variance in tree rings was explained by solar activity and ENSO variability. However, when the correlations were computed on 10-year running averages, the correlation coefficients were much higher. A clear anticorrelation is observed between SOI and Index (r=-0.81), whereas Rz and Index show a positive correlation (r=0.67). The multiple regression of 10-year running averages indicates that 76% of the variance in the tree ring Index was explained by solar activity and ENSO. These results indicate that the effects of solar activity and ENSO on tree rings are better seen on long timescales.
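
    The comparison between annual and 10-year running-average correlations can be illustrated with a short sketch. The series below are synthetic, not the study's data: an 11-year sine wave stands in for the driver and an alternating year-to-year term stands in for noise, chosen so that it cancels within a 10-year window. The helper names running_mean and pearson_r are hypothetical.

```python
from math import pi, sin, sqrt


def running_mean(values, window):
    """Mean over each run of `window` consecutive values."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


def pearson_r(x, y):
    """Pearson linear correlation coefficient of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


years = range(1955, 1992)  # 37 annual values, matching the study period
# Synthetic slow driver: an 11-year cycle (assumption, for illustration only).
driver = [sin(2 * pi * (yr - 1955) / 11) for yr in years]
# Synthetic year-to-year noise that averages out over an even window.
noise = [1.5 if (yr - 1955) % 2 == 0 else -1.5 for yr in years]
ring_index = [d + e for d, e in zip(driver, noise)]

r_annual = pearson_r(driver, ring_index)
r_smooth = pearson_r(running_mean(driver, 10), running_mean(ring_index, 10))
print(f"annual r = {r_annual:.2f}, 10-year running-mean r = {r_smooth:.2f}")
```

    Running means make neighbouring values strongly autocorrelated, so the effective number of independent samples is much smaller than the series length; significance tests on smoothed series should account for that.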

  7. Coalescent methods for estimating phylogenetic trees.

    PubMed

    Liu, Liang; Yu, Lili; Kubatko, Laura; Pearl, Dennis K; Edwards, Scott V

    2009-10-01

    We review recent models to estimate phylogenetic trees under the multispecies coalescent. Although the distinction between gene trees and species trees has come to the fore of phylogenetics, only recently have methods been developed that explicitly estimate species trees. Of the several factors that can cause gene tree heterogeneity and discordance with the species tree, deep coalescence due to random genetic drift in branches of the species tree has been modeled most thoroughly. Bayesian approaches to estimating species trees utilize two likelihood functions, one of which has been widely used in traditional phylogenetics and involves the model of nucleotide substitution, and the second of which is less familiar to phylogeneticists and involves the probability distribution of gene trees given a species tree. Other recent parametric and nonparametric methods for estimating species trees involve parsimony criteria, summary statistics, supertree and consensus methods. Species tree approaches are an appropriate goal for systematics, appear to work well in some cases where concatenation can be misleading, and suggest that sampling many independent loci will be paramount. Such methods can also be challenging to implement because of the complexity of the models and computational time. In addition, further elaboration of the simplest of coalescent models will be required to incorporate commonly known issues such as deviation from the molecular clock, gene flow and other genetic forces.

  8. The hydrological vulnerability of western North American boreal tree species based on ground-based observations of tree mortality

    NASA Astrophysics Data System (ADS)

    Hember, R. A.; Kurz, W. A.; Coops, N. C.

    2017-12-01

    Several studies indicate that climate change has increased rates of tree mortality, adversely affecting timber supply and carbon storage in western North American boreal forests. Statistical models of tree mortality can play a complementary role in detecting and diagnosing forest change. Yet, such models struggle to address real-world complexity, including expectations that hydrological vulnerability arises from both drought stress and excess-water stress, and that these effects vary by species, tree size, and competitive status. Here, we describe models that predict annual probability of tree mortality (Pm) of common boreal tree species based on tree height (H), biomass of larger trees (BLT), soil water content (W), reference evapotranspiration (E), and two-way interactions. We show that interactions among H and hydrological variables are consistently significant. Vulnerability to extreme droughts consistently increases as H approaches maximum observed values of each species, while some species additionally show increasing vulnerability at low H. Some species additionally show increasing vulnerability to low W under high BLT, or increasing drought vulnerability under low BLT. These results suggest that vulnerability of trees to increasingly severe droughts depends on the hydraulic efficiency, competitive status, and microclimate of individual trees. Static simulations of Pm across a 1-km grid (i.e., with time-independent inputs of H, BLT, and species composition) indicate complex spatial patterns in the time trends during 1965-2014 and a mean change in Pm of 42 %. Lastly, we discuss how the size-dependence of hydrological vulnerability, in concert with increasingly severe drought events, may shape future responses of stand-level biomass production to continued warming and increasing carbon dioxide concentration in the region.

  9. Measuring (bio)physical tree properties using accelerometers

    NASA Astrophysics Data System (ADS)

    van Emmerik, Tim; Steele-Dunne, Susan; Hut, Rolf; Gentine, Pierre; Selker, John; van de Giesen, Nick

    2017-04-01

    Trees play a crucial role in the water, carbon and nitrogen cycle on local, regional and global scales. Understanding the exchange of heat, water, and CO2 between trees and the atmosphere is important to assess the impact of drought, deforestation and climate change. Unfortunately, ground measurements of tree dynamics are often expensive, or difficult due to challenging environments. We demonstrate the potential of measuring (bio)physical properties of trees using robust and affordable acceleration sensors. Tree sway is dependent on e.g. mass and wind energy absorption of the tree. By measuring tree acceleration we can relate the tree motion to external loads (e.g. precipitation), and tree (bio)physical properties (e.g. mass). Using five months of acceleration data of 19 trees in the Brazilian Amazon, we show that the frequency spectrum of tree sway is related to mass, precipitation, and canopy drag. This presentation aims to show the concept of using accelerometers to measure tree dynamics, and we acknowledge that the presented example applications are not an exhaustive list. Further analyses are the scope of current research, and we hope to inspire others to explore additional applications.

  10. Characteristics of the tree-drawing test in chronic schizophrenia.

    PubMed

    Kaneda, Ayako; Yasui-Furukori, Norio; Saito, Manabu; Sugawara, Norio; Nakagami, Taku; Furukori, Hanako; Kaneko, Sunao

    2010-04-01

    A tree-drawing test acts as both a projective psychological examination as well as a supplementary psychodiagnostic tool. There is little information relating the characteristics of schizophrenia and the tree-drawing test. The present study compared the structural and morphological differences in the results of the tree-drawing test between schizophrenic patients and healthy individuals, as well as between schizophrenic patients who responded well to treatment and those who responded poorly. The subjects included 202 chronic schizophrenic patients and 113 healthy individuals. The schizophrenic patients were categorized as 'good responders' or 'poor responders' based on their response to medical treatments. The tree-drawing test was performed on all subjects. The tree drawn by each subject was analyzed structurally and morphologically. There were significant differences between the trunk and branches drawn by schizophrenic patients and those drawn by healthy controls. There were no significant differences between the good responders and the poor responders in any aspect of the tree drawings. Multiple regression models showed that the ratio of the tree area to the total area of the drawing paper, the width of the trunk, the trunk base opening, and the size of the branch ends were significantly associated with schizophrenia. The present study suggests that the trees drawn by schizophrenic patients are significantly different from those drawn by healthy individuals, but among schizophrenic patients, it is difficult to distinguish between good responders and poor responders using the tree-drawing test.

  11. Potential redistribution of tree species habitat under five climate change scenarios in the eastern US

    Treesearch

    Louis R. Iverson; Anantha M. Prasad

    2002-01-01

    Global climate change could have profound effects on the Earth's biota, including large redistributions of tree species and forest types. We used DISTRIB, a deterministic regression tree analysis model, to examine environmental drivers related to current forest-species distributions and then model potential suitable habitat under five climate change scenarios...

  12. Categorizing Ideas about Trees: A Tree of Trees

    PubMed Central

    Fisler, Marie; Lecointre, Guillaume

    2013-01-01

    The aim of this study is to explore whether matrices and MP trees used to produce systematic categories of organisms could be useful to produce categories of ideas in history of science. We study the history of the use of trees in systematics to represent the diversity of life from 1766 to 1991. We apply to those ideas a method inspired from coding homologous parts of organisms. We discretize conceptual parts of ideas, writings and drawings about trees contained in 41 main writings; we detect shared parts among authors and code them into a 91-characters matrix and use a tree representation to show who shares what with whom. In other words, we propose a hierarchical representation of the shared ideas about trees among authors: this produces a “tree of trees.” Then, we categorize schools of tree-representations. Classical schools like “cladists” and “pheneticists” are recovered but others are not: “gradists” are separated into two blocks, one of them being called here “grade theoreticians.” We propose new interesting categories like the “buffonian school,” the “metaphoricians,” and those using “strictly genealogical classifications.” We consider that networks are not useful to represent shared ideas at the present step of the study. A cladogram is made for showing who is sharing what with whom, but also heterobathmy and homoplasy of characters. The present cladogram is not modelling processes of transmission of ideas about trees, and here it is mostly used to test for proximity of ideas of the same age and for categorization. PMID:23950877

  13. Tree Morphologic Plasticity Explains Deviation from Metabolic Scaling Theory in Semi-Arid Conifer Forests, Southwestern USA

    PubMed Central

    O’Connor, Christopher D.; Lynch, Ann M.

    2016-01-01

    A significant concern about Metabolic Scaling Theory (MST) in real forests relates to consistent differences between the values of power law scaling exponents of tree primary size measures used to estimate mass and those predicted by MST. Here we consider why observed scaling exponents for diameter and height relationships deviate from MST predictions across three semi-arid conifer forests in relation to: (1) tree condition and physical form, (2) the level of inter-tree competition (e.g. open vs closed stand structure), (3) increasing tree age, and (4) differences in site productivity. Scaling exponent values derived from non-linear least-squares regression for trees in excellent condition (n = 381) were above the MST prediction at the 95% confidence level, while the exponent for trees in good condition was no different from MST (n = 926). Trees that were in fair or poor condition, characterized as diseased, leaning, or sparsely crowned, had exponent values below MST predictions (n = 2,058), as did recently dead standing trees (n = 375). The exponent value of the mean-tree model that disregarded tree condition (n = 3,740) was consistent with other studies that reject MST scaling. Ostensibly, as stand density and competition increased, trees exhibited greater morphological plasticity whereby the majority had characteristically fair or poor growth forms. Fitting by least-squares regression biases the mean-tree model scaling exponent toward values that are below MST idealized predictions. For 368 trees from Arizona with known establishment dates, increasing age had no significant impact on expected scaling. We further suggest height to diameter ratios below MST relate to vertical truncation caused by limitation in plant water availability. Even with environmentally imposed height limitation, proportionality between height and diameter scaling exponents was consistent with the predictions of MST. PMID:27391084
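
    Estimating a power-law scaling exponent of the kind discussed in this abstract can be sketched as below. Note the substitution: the study reports non-linear least-squares fits, whereas this sketch uses ordinary least squares on log-transformed values as a simpler stand-in. The data are synthetic, the function name fit_power_law is hypothetical, and the reference exponent of 2/3 for height against diameter is an assumption not stated in the abstract.

```python
from math import exp, log


def fit_power_law(diameters, heights):
    """Fit height = a * diameter**b by OLS on log-log values; return (a, b)."""
    lx = [log(d) for d in diameters]
    ly = [log(h) for h in heights]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    # Slope of the log-log line is the scaling exponent b.
    b = (sum((u - mx) * (v - my) for u, v in zip(lx, ly))
         / sum((u - mx) ** 2 for u in lx))
    a = exp(my - b * mx)  # intercept back-transformed to the original scale
    return a, b


# Synthetic trees generated from an exact power law with exponent 0.6.
diameters = [5.0, 10.0, 20.0, 40.0, 80.0]
heights = [1.5 * d ** 0.6 for d in diameters]

a, b = fit_power_law(diameters, heights)
reference_exponent = 2 / 3  # assumed reference value, for comparison only
print(f"a = {a:.3f}, b = {b:.3f}, below reference: {b < reference_exponent}")
```

    Log-log OLS and non-linear least squares weight observations differently and can yield different exponents on noisy data, which is relevant to the abstract's remark that the fitting method biases the mean-tree exponent.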

  14. Tree Morphologic Plasticity Explains Deviation from Metabolic Scaling Theory in Semi-Arid Conifer Forests, Southwestern USA.

    PubMed

    Swetnam, Tyson L; O'Connor, Christopher D; Lynch, Ann M

    2016-01-01

    A significant concern about Metabolic Scaling Theory (MST) in real forests relates to consistent differences between the values of power law scaling exponents of tree primary size measures used to estimate mass and those predicted by MST. Here we consider why observed scaling exponents for diameter and height relationships deviate from MST predictions across three semi-arid conifer forests in relation to: (1) tree condition and physical form, (2) the level of inter-tree competition (e.g. open vs closed stand structure), (3) increasing tree age, and (4) differences in site productivity. Scaling exponent values derived from non-linear least-squares regression for trees in excellent condition (n = 381) were above the MST prediction at the 95% confidence level, while the exponent for trees in good condition was no different from MST (n = 926). Trees that were in fair or poor condition, characterized as diseased, leaning, or sparsely crowned, had exponent values below MST predictions (n = 2,058), as did recently dead standing trees (n = 375). The exponent value of the mean-tree model that disregarded tree condition (n = 3,740) was consistent with other studies that reject MST scaling. Ostensibly, as stand density and competition increased, trees exhibited greater morphological plasticity whereby the majority had characteristically fair or poor growth forms. Fitting by least-squares regression biases the mean-tree model scaling exponent toward values that are below MST idealized predictions. For 368 trees from Arizona with known establishment dates, increasing age had no significant impact on expected scaling. We further suggest height to diameter ratios below MST relate to vertical truncation caused by limitation in plant water availability. Even with environmentally imposed height limitation, proportionality between height and diameter scaling exponents was consistent with the predictions of MST.

  15. Ultrasonographic Diagnosis of Biliary Atresia Based on a Decision-Making Tree Model.

    PubMed

    Lee, So Mi; Cheon, Jung-Eun; Choi, Young Hun; Kim, Woo Sun; Cho, Hyun-Hae; Cho, Hyun-Hye; Kim, In-One; You, Sun Kyoung

    2015-01-01

    To assess the diagnostic value of various ultrasound (US) findings and to make a decision-tree model for US diagnosis of biliary atresia (BA). From March 2008 to January 2014, the following US findings were retrospectively evaluated in 100 infants with cholestatic jaundice (BA, n = 46; non-BA, n = 54): length and morphology of the gallbladder, triangular cord thickness, hepatic artery and portal vein diameters, and visualization of the common bile duct. Logistic regression analyses were performed to determine the features that would be useful in predicting BA. Conditional inference tree analysis was used to generate a decision-making tree for classifying patients into the BA or non-BA groups. Multivariate logistic regression analysis showed that abnormal gallbladder morphology and greater triangular cord thickness were significant predictors of BA (p = 0.003 and 0.001; adjusted odds ratio: 345.6 and 65.6, respectively). In the decision-making tree using conditional inference tree analysis, gallbladder morphology and triangular cord thickness (optimal cutoff value of triangular cord thickness, 3.4 mm) were also selected as significant discriminators for differential diagnosis of BA, and gallbladder morphology was the first discriminator. The diagnostic performance of the decision-making tree was excellent, with sensitivity of 100% (46/46), specificity of 94.4% (51/54), and overall accuracy of 97% (97/100). Abnormal gallbladder morphology and greater triangular cord thickness (> 3.4 mm) were the most useful predictors of BA on US. We suggest that the gallbladder morphology should be evaluated first and that triangular cord thickness should be evaluated subsequently in cases with normal gallbladder morphology.
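
    The reported diagnostic performance follows directly from the counts given in the abstract (46 of 46 BA cases and 51 of 54 non-BA cases classified correctly). A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # true positives among all BA cases
    specificity = tn / (tn + fp)          # true negatives among all non-BA cases
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Counts taken from the abstract: BA n = 46 (all detected), non-BA n = 54
# (51 correctly classified, so 3 false positives).
sens, spec, acc = diagnostic_metrics(tp=46, fn=0, tn=51, fp=3)
print(f"sensitivity={sens:.1%} specificity={spec:.1%} accuracy={acc:.1%}")
# prints "sensitivity=100.0% specificity=94.4% accuracy=97.0%"
```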

  16. Climatic response of annual tree-rings

    NASA Astrophysics Data System (ADS)

    Ageev, Boris G.; Gruzdev, Aleksandr N.; Ponomarev, Yurii N.; Sapozhnikova, Valeria A.

    2014-11-01

    An extensive literature is devoted to investigations into the influence of environmental conditions on plant respiration and respiration rate. It is generally accepted that the respired CO2 generated in a stem completely diffuses into the atmosphere. Results obtained from explorations into the CO2 content in disc tree rings by the method proposed in this work show that a major part of CO2 remains in tree stems and exhibits inter-annual variability. Different methods are used to describe CO2 and H2O distributions in disc tree rings. The relation of CO2 and H2O variations in a Siberian stone pine disc to meteorological parameters is analyzed with the use of wavelet, spectral and cross-spectral techniques. According to a multiple linear regression model, the time evolution of the width of Siberian stone pine rings can be partly explained by a combined influence of air temperature, precipitation, cloudiness and solar activity. Conclusions are made regarding the response of the CO2 and H2O content in coniferous tree disc rings to various climatic factors. The suggested method of CO2 and (CO2+H2O) detection can be used for studying stem respiration in ecological risk areas.

  17. Blood oxygen level dependent magnetic resonance imaging for detecting pathological patterns in lupus nephritis patients: a preliminary study using a decision tree model.

    PubMed

    Shi, Huilan; Jia, Junya; Li, Dong; Wei, Li; Shang, Wenya; Zheng, Zhenfeng

    2018-02-09

    Precise renal histopathological diagnosis will guide therapy strategy in patients with lupus nephritis. Blood oxygen level dependent (BOLD) magnetic resonance imaging (MRI) is an applicable noninvasive technique in renal disease. The current study was performed to explore whether BOLD MRI could contribute to diagnosing renal pathological patterns. Adult patients with a renal pathological diagnosis of lupus nephritis were recruited for this study. Renal biopsy tissues were assessed based on the lupus nephritis ISN/RPS 2003 classification. BOLD MRI was used to obtain a functional magnetic resonance parameter, the R2* value. Several functions of R2* values were calculated and used to construct algorithmic models for renal pathological patterns. In addition, the algorithmic models were compared as to their diagnostic capability. Both histopathology and BOLD MRI were used to examine a total of twelve patients. Renal pathological patterns included five class III cases (including 3 as class III + V) and seven class IV cases (including 4 as class IV + V). Three algorithmic models, including decision tree, linear discriminant, and logistic regression, were constructed to distinguish the renal pathological pattern of class III and class IV. The sensitivity of the decision tree model was better than that of the linear discriminant model (71.87% vs 59.48%, P < 0.001) and inferior to that of the logistic regression model (71.87% vs 78.71%, P < 0.001). The specificity of the decision tree model was equivalent to that of the linear discriminant model (63.87% vs 63.73%, P = 0.939) and higher than that of the logistic regression model (63.87% vs 38.0%, P < 0.001). The area under the ROC curve (AUROCC) of the decision tree model was greater than that of the linear discriminant model (0.765 vs 0.629, P < 0.001) and the logistic regression model (0.765 vs 0.662, P < 0.001). BOLD MRI is a useful non-invasive imaging technique

  18. Variances in the projections, resulting from CLIMEX, Boosted Regression Trees and Random Forests techniques

    NASA Astrophysics Data System (ADS)

    Shabani, Farzin; Kumar, Lalit; Solhjouy-fard, Samaneh

    2017-08-01

    The aim of this study was to compare and evaluate the capabilities of correlative and mechanistic modeling processes, applied to the projection of future distributions of date palm in novel environments, and to establish a method of minimizing uncertainty in the projections of differing techniques. The study area comprises Middle Eastern countries. We compared the mechanistic model CLIMEX (CL) with the correlative models MaxEnt (MX), Boosted Regression Trees (BRT), and Random Forests (RF) to project current and future distributions of date palm (Phoenix dactylifera L.). The Global Climate Model (GCM) CSIRO-Mk3.0 (CS), using the A2 emissions scenario, was selected for making projections. Both indigenous and alien distribution data of the species were utilized in the modeling process. The common areas predicted by MX, BRT, RF, and CL from the CS GCM were extracted and compared to ascertain projection uncertainty levels of each individual technique. The common areas identified by all four modeling techniques were used to produce a map indicating suitable and unsuitable areas for date palm cultivation for Middle Eastern countries, for the present and the year 2100. The four different modeling approaches predict fairly different distributions. Projections from CL were more conservative than from MX. The BRT and RF were the most conservative methods in terms of projections for the current time. The combination of the final CL and MX projections for the present and 2100 provides higher certainty concerning those areas that will become highly suitable for future date palm cultivation. According to the four models, cold, hot, and wet stress, with differences on a regional basis, appear to be the major restrictions on future date palm distribution. The results demonstrate variances in the projections, resulting from different techniques. The assessment and interpretation of model projections requires reservations

  19. Determinants of reproductive success in dominant pairs of clownfish: a boosted regression tree analysis.

    PubMed

    Buston, Peter M; Elith, Jane

    2011-05-01

    1. Central questions of behavioural and evolutionary ecology are what factors influence the reproductive success of dominant breeders and subordinate nonbreeders within animal societies? A complete understanding of any society requires that these questions be answered for all individuals. 2. The clown anemonefish, Amphiprion percula, forms simple societies that live in close association with sea anemones, Heteractis magnifica. Here, we use data from a well-studied population of A. percula to determine the major predictors of reproductive success of dominant pairs in this species. 3. We analyse the effect of multiple predictors on four components of reproductive success, using a relatively new technique from the field of statistical learning: boosted regression trees (BRTs). BRTs have the potential to model complex relationships in ways that give powerful insight. 4. We show that the reproductive success of dominant pairs is unrelated to the presence, number or phenotype of nonbreeders. This is consistent with the observation that nonbreeders do not help or hinder breeders in any way, confirming and extending the results of a previous study. 5. Primarily, reproductive success is negatively related to male growth and positively related to breeding experience. It is likely that these effects are interrelated because males that grow a lot have little breeding experience. These effects are indicative of a trade-off between male growth and parental investment. 6. Secondarily, reproductive success is positively related to female growth and size. In this population, female size is positively related to group size and anemone size, also. These positive correlations among traits likely are caused by variation in site quality and are suggestive of a silver-spoon effect. 7. Noteworthily, whereas reproductive success is positively related to female size, it is unrelated to male size. This observation provides support for the size advantage hypothesis for sex change: both

  20. Comparing pseudo-absences generation techniques in Boosted Regression Trees models for conservation purposes: A case study on amphibians in a protected area.

    PubMed

    Cerasoli, Francesco; Iannella, Mattia; D'Alessandro, Paola; Biondi, Maurizio

    2017-01-01

    Boosted Regression Trees (BRT) is one of the modelling techniques most recently applied to biodiversity conservation and it can be implemented with presence-only data through the generation of artificial absences (pseudo-absences). In this paper, three pseudo-absences generation techniques are compared, namely the generation of pseudo-absences within target-group background (TGB), testing both the weighted (WTGB) and unweighted (UTGB) scheme, and the generation at random (RDM), evaluating their performance and applicability in distribution modelling and species conservation. The choice of the target group fell on amphibians, because of their rapid decline worldwide and the frequent lack of guidelines for conservation strategies and regional-scale planning, which instead could be provided through an appropriate implementation of SDMs. Bufo bufo, Salamandrina perspicillata and Triturus carnifex were considered as target species, in order to perform our analysis with species having different ecological and distributional characteristics. The study area is the "Gran Sasso-Monti della Laga" National Park, which hosts 15 Natura 2000 sites and represents one of the most important biodiversity hotspots in Europe. Our results show that the model calibration improves when using the target-group based pseudo-absences compared to the random ones, especially when applying the WTGB. In contrast, model discrimination did not significantly vary in a consistent way among the three approaches with respect to the three target species. Both WTGB and RDM clearly isolate the highly contributing variables, supplying many relevant indications for species conservation actions. Moreover, the assessment of pairwise variable interactions and their three-dimensional visualization further increase the amount of useful information for protected areas' managers. Finally, we suggest the use of RDM as an admissible alternative when it is not possible to identify a suitable set of species as a

  1. Improving Cluster Analysis with Automatic Variable Selection Based on Trees

    DTIC Science & Technology

    2014-12-01

    regression trees Daisy DISsimilAritY PAM partitioning around medoids PMA penalized multivariate analysis SPC sparse principal components UPGMA unweighted...unweighted pair-group average method ( UPGMA ). This method measures dissimilarities between all objects in two clusters and takes the average value
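
    The averaging step described for UPGMA can be sketched as follows. This is a minimal illustration of the average-linkage criterion only, not the report's implementation; the function name, the toy one-dimensional objects, and the absolute-difference dissimilarity are assumptions made here.

```python
from itertools import product

def average_linkage(cluster_a, cluster_b, dissimilarity):
    """Mean pairwise dissimilarity between the objects of two clusters."""
    pairs = list(product(cluster_a, cluster_b))   # every cross-cluster pair
    return sum(dissimilarity(a, b) for a, b in pairs) / len(pairs)

# Hypothetical one-dimensional objects; absolute difference as the dissimilarity.
d = average_linkage([1.0, 2.0], [4.0, 6.0], lambda a, b: abs(a - b))
print(d)
```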

  2. Million trees Los Angeles canopy cover and benefit assessment

    Treesearch

    E.G. McPherson; J.R. Simpson; Q. Xiao; C. Wu

    2011-01-01

    The Million Trees LA initiative intends to improve Los Angeles’s environment through planting and stewardship of 1 million trees. The purpose of this study was to measure Los Angeles’s existing tree canopy cover (TCC), determine if space exists for 1 million additional trees, and estimate future benefits from the planting. High-resolution QuickBird remote sensing data...

  3. Observed Methods for Felling Hardwood Trees with Chain Saws

    Treesearch

    Jerry L. Koger

    1983-01-01

    The angles and lengths of the cutting surfaces made by chain saw operators on hardwood tree stumps are described by means, standard deviations, ranges, and regression equations. Recommended felling guidelines are compared with observed felling methods used by experienced timber cutters in the southern Appalachian Mountains.

  4. A way forward for fire-caused tree mortality prediction: Modeling a physiological consequence of fire

    Treesearch

    Kathleen L. Kavanaugh; Matthew B. Dickinson; Anthony S. Bova

    2010-01-01

    Current operational methods for predicting tree mortality from fire injury are regression-based models that only indirectly consider underlying causes and, thus, have limited generality. A better understanding of the physiological consequences of tree heating and injury are needed to develop biophysical process models that can make predictions under changing or novel...

  5. Estimating tree species diversity in the savannah using NDVI and woody canopy cover

    NASA Astrophysics Data System (ADS)

    Madonsela, Sabelo; Cho, Moses Azong; Ramoelo, Abel; Mutanga, Onisimo; Naidoo, Laven

    2018-04-01

    Remote sensing applications in biodiversity research often rely on the establishment of relationships between spectral information from the image and tree species diversity measured in the field. Most studies have used the normalized difference vegetation index (NDVI) to estimate tree species diversity on the basis that it is sensitive to primary productivity, which defines spatial variation in plant diversity. The NDVI signal is influenced by photosynthetically active vegetation which, in the savannah, includes woody canopy foliage and grasses. The question is whether the relationship between NDVI and tree species diversity in the savannah depends on the woody cover percentage. This study explored the relationship between woody canopy cover (WCC) and tree species diversity in the savannah woodland of southern Africa and also investigated whether there is a significant interaction between seasonal NDVI and WCC in the factorial model when estimating tree species diversity. To fulfil our aim, we followed a stratified random sampling approach and surveyed tree species in 68 plots of 90 m × 90 m across the study area. Within each plot, all trees with diameter at breast height of >10 cm were sampled and the Shannon index - a common measure of species diversity which considers both species richness and abundance - was used to quantify tree species diversity. We then extracted WCC in each plot from an existing fractional woody cover product produced from Synthetic Aperture Radar (SAR) data. A factorial regression model was used to determine the interaction effect between NDVI and WCC when estimating tree species diversity. Results from regression analysis showed that (i) WCC has a highly significant relationship with tree species diversity (r2 = 0.21; p < 0.01), (ii) the interaction between the NDVI and WCC is not significant; however, the factorial model significantly reduced the error of prediction (RMSE = 0.47, p < 0.05) compared to NDVI (RMSE = 0.49) or WCC (RMSE = 0.49) model during
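

    For readers unfamiliar with the Shannon index mentioned in this abstract, a minimal sketch is given here. It uses the standard definition H = -sum(p_i ln p_i) over species proportions; the plot data are hypothetical and the function name is an assumption, not the authors' code.

```python
import math
from collections import Counter

def shannon_index(species_labels):
    """Shannon diversity H = -sum(p * ln p) over the species proportions in one plot."""
    counts = Counter(species_labels)
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values())

# Hypothetical plot: two species, five sampled trees each.
plot = ["species_a"] * 5 + ["species_b"] * 5
h = shannon_index(plot)
print(round(h, 3))
```

    With two equally abundant species the index equals ln 2; a plot containing a single species gives zero.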

  6. Improving ensemble decision tree performance using Adaboost and Bagging

    NASA Astrophysics Data System (ADS)

    Hasan, Md. Rajib; Siraj, Fadzilah; Sainin, Mohd Shamrie

    2015-12-01

    Ensemble classifier systems are considered among the most promising approaches in medical data classification, and the performance of a decision tree classifier can be increased by the ensemble method, as it is proven to be better than single classifiers. However, in an ensemble setting the performance depends on the selection of a suitable base classifier. This research employed two prominent ensembles, namely Adaboost and Bagging, with base classifiers such as Random Forest, Random Tree, j48, j48grafts and Logistic Model Regression (LMT) that have been selected independently. The empirical study shows that the performance varies when different base classifiers are selected, and in some cases an overfitting issue has also been noted. The evidence shows that ensemble decision tree classifiers using Adaboost and Bagging improve the performance on the selected medical data sets.

  7. Does Gene Tree Discordance Explain the Mismatch between Macroevolutionary Models and Empirical Patterns of Tree Shape and Branching Times?

    PubMed Central

    Stadler, Tanja; Degnan, James H.; Rosenberg, Noah A.

    2016-01-01

    Classic null models for speciation and extinction give rise to phylogenies that differ in distribution from empirical phylogenies. In particular, empirical phylogenies are less balanced and have branching times closer to the root compared to phylogenies predicted by common null models. This difference might be due to null models of the speciation and extinction process being too simplistic, or due to the empirical datasets not being representative of random phylogenies. A third possibility arises because phylogenetic reconstruction methods often infer gene trees rather than species trees, producing an incongruity between models that predict species tree patterns and empirical analyses that consider gene trees. We investigate the extent to which the difference between gene trees and species trees under a combined birth–death and multispecies coalescent model can explain the difference in empirical trees and birth–death species trees. We simulate gene trees embedded in simulated species trees and investigate their difference with respect to tree balance and branching times. We observe that the gene trees are less balanced and typically have branching times closer to the root than the species trees. Empirical trees from TreeBase are also less balanced than our simulated species trees, and model gene trees can explain an imbalance increase of up to 8% compared to species trees. However, we see a much larger imbalance increase in empirical trees, about 100%, meaning that additional features must also be causing imbalance in empirical trees. This simulation study highlights the necessity of revisiting the assumptions made in phylogenetic analyses, as these assumptions, such as equating the gene tree with the species tree, might lead to a biased conclusion. PMID:26968785

  8. Association of tree nut and coconut sensitizations.

    PubMed

    Polk, Brooke I; Dinakarpandian, Deendayal; Nanda, Maya; Barnes, Charles; Dinakar, Chitra

    2016-10-01

    Coconut (Cocos nucifera), despite being a drupe, was added to the US Food and Drug Administration list of tree nuts in 2006, causing potential confusion regarding the prevalence of coconut allergy among tree nut allergic patients. To determine whether sensitization to tree nuts is associated with increased odds of coconut sensitization. A single-center retrospective analysis of serum specific IgE levels to coconut, tree nuts (almond, Brazil nut, cashew, chestnut, hazelnut, macadamia, pecan, pistachio, and walnut), and controls (milk and peanut) was performed using deidentified data from January 2000 to August 2012. Spearman correlation (ρ) between coconut and each tree nut was determined, followed by hierarchical clustering. Sensitization was defined as a nut specific IgE level of 0.35 kU/L or higher. Unadjusted and adjusted associations between coconut and tree nut sensitization were tested by logistic regression. Of 298 coconut IgE values, 90 (30%) were considered positive results, with a mean (SD) of 1.70 (8.28) kU/L. Macadamia had the strongest correlation (ρ = 0.77), whereas most other tree nuts had significant (P < .05) but low correlation (ρ < 0.5) with coconut. The adjusted odds ratio between coconut and macadamia was 7.39 (95% confidence interval, 2.60-21.02; P < .001) and 5.32 (95% confidence interval, 2.18-12.95; P < .001) between coconut and almond, with other nuts not being statistically significant. Our findings suggest that although sensitization to most tree nuts appears to correlate with coconut, this is largely explained by sensitization to almond and macadamia. This finding has not previously been reported in the literature. Further study correlating these results with clinical symptoms is planned. Copyright © 2016 American College of Allergy, Asthma & Immunology. Published by Elsevier Inc. All rights reserved.

  9. Fuzzy tree automata and syntactic pattern recognition.

    PubMed

    Lee, E T

    1982-04-01

    An approach of representing patterns by trees and processing these trees by fuzzy tree automata is described. Fuzzy tree automata are defined and investigated. The results include that the class of fuzzy root-to-frontier recognizable Σ-trees is closed under intersection, union, and complementation. Thus, the class of fuzzy root-to-frontier recognizable Σ-trees forms a Boolean algebra. Fuzzy tree automata are applied to processing fuzzy tree representation of patterns based on syntactic pattern recognition. The grade of acceptance is defined and investigated. Quantitative measures of "approximate isosceles triangle," "approximate elongated isosceles triangle," "approximate rectangle," and "approximate cross" are defined and used in the illustrative examples of this approach. By using these quantitative measures, a house, a house with high roof, and a church are also presented as illustrative examples. In addition, three fuzzy tree automata are constructed which have the capability of processing the fuzzy tree representations of "fuzzy houses," "houses with high roofs," and "fuzzy churches," respectively. The results may have useful applications in pattern recognition, image processing, artificial intelligence, pattern database design and processing, image science, and pictorial information systems.

  10. Predicting Diameter at Breast Height from Stump Diameters for Northeastern Tree Species

    Treesearch

    Eric H. Wharton

    1984-01-01

    Presents equations to predict diameter at breast height from stump diameter measurements for 17 northeastern tree species. Simple linear regression was used to develop the equations. Application of the equations is discussed.
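
    A simple linear regression of the kind described can be sketched as below. The measurements are invented for illustration and do not come from the report, and the function name is an assumption; the report's species-specific coefficients are not reproduced here.

```python
def fit_simple_linear(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical stump diameters and diameters at breast height (same units).
stump = [10.0, 14.0, 18.0, 22.0]
dbh = [8.0, 11.0, 14.0, 17.0]
intercept, slope = fit_simple_linear(stump, dbh)
predicted_dbh = intercept + slope * 16.0   # predict DBH for a 16-unit stump
print(intercept, slope, predicted_dbh)
```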

  11. Multi-scale remote sensing sagebrush characterization with regression trees over Wyoming, USA: laying a foundation for monitoring

    USGS Publications Warehouse

    Homer, Collin G.; Aldridge, Cameron L.; Meyer, Debra K.; Schell, Spencer J.

    2012-01-01

    Sagebrush ecosystems in North America have experienced extensive degradation since European settlement. Further degradation continues from exotic invasive plants, altered fire frequency, intensive grazing practices, oil and gas development, and climate change – adding urgency to the need for ecosystem-wide understanding. Remote sensing is often identified as a key information source to facilitate ecosystem-wide characterization, monitoring, and analysis; however, approaches that characterize sagebrush with sufficient and accurate local detail across large enough areas to support this paradigm are unavailable. We describe the development of a new remote sensing sagebrush characterization approach for the state of Wyoming, U.S.A. This approach integrates 2.4 m QuickBird, 30 m Landsat TM, and 56 m AWiFS imagery into the characterization of four primary continuous field components including percent bare ground, percent herbaceous cover, percent litter, and percent shrub, and four secondary components including percent sagebrush (Artemisia spp.), percent big sagebrush (Artemisia tridentata), percent Wyoming sagebrush (Artemisia tridentata Wyomingensis), and shrub height using a regression tree. According to an independent accuracy assessment, primary component root mean square error (RMSE) values ranged from 4.90 to 10.16 for 2.4 m QuickBird, 6.01 to 15.54 for 30 m Landsat, and 6.97 to 16.14 for 56 m AWiFS. Shrub and herbaceous components outperformed the current data standard called LANDFIRE, with a shrub RMSE value of 6.04 versus 12.64 and a herbaceous component RMSE value of 12.89 versus 14.63. This approach offers new advancements in sagebrush characterization from remote sensing and provides a foundation to quantitatively monitor these components into the future.
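
    The accuracy figures in this abstract are root mean square errors. A minimal sketch of that metric follows; the observed and predicted values are hypothetical percent-cover numbers chosen for illustration, not data from the study.

```python
import math

def rmse(observed, predicted):
    """Root mean square error between paired observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical percent shrub cover at three validation plots.
observed = [10.0, 20.0, 30.0]
predicted = [13.0, 16.0, 30.0]
error = rmse(observed, predicted)
print(round(error, 2))
```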

  12. Understanding Child Stunting in India: A Comprehensive Analysis of Socio-Economic, Nutritional and Environmental Determinants Using Additive Quantile Regression

    PubMed Central

    Fenske, Nora; Burns, Jacob; Hothorn, Torsten; Rehfuess, Eva A.

    2013-01-01

    Background: Most attempts to address undernutrition, responsible for one third of global child deaths, have fallen behind expectations. This suggests that the assumptions underlying current modelling and intervention practices should be revisited. Objective: We undertook a comprehensive analysis of the determinants of child stunting in India, and explored whether the established focus on linear effects of single risks is appropriate. Design: Using cross-sectional data for children aged 0–24 months from the Indian National Family Health Survey for 2005/2006, we populated an evidence-based diagram of immediate, intermediate and underlying determinants of stunting. We modelled linear, non-linear, spatial and age-varying effects of these determinants using additive quantile regression for four quantiles of the Z-score of standardized height-for-age and logistic regression for stunting and severe stunting. Results: At least one variable within each of eleven groups of determinants was significantly associated with height-for-age in the 35% Z-score quantile regression. The non-modifiable risk factors child age and sex, and the protective factors household wealth, maternal education and BMI showed the largest effects. Being a twin or multiple birth was associated with dramatically decreased height-for-age. Maternal age, maternal BMI, birth order and number of antenatal visits influenced child stunting in non-linear ways. Findings across the four quantile and two logistic regression models were largely comparable. Conclusions: Our analysis confirms the multifactorial nature of child stunting. It emphasizes the need to pursue a systems-based approach and to consider non-linear effects, and suggests that differential effects across the height-for-age distribution do not play a major role. PMID:24223839
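
    Quantile regression fits a chosen quantile by minimizing the check (pinball) loss rather than squared error. The sketch below illustrates only that loss on a constant predictor; it is not the additive model used in the study, and the Z-scores and function name are hypothetical.

```python
def pinball_loss(y_true, y_pred, tau):
    """Mean check loss for quantile level tau (0 < tau < 1)."""
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        # Under-predictions are weighted by tau, over-predictions by (1 - tau).
        total += tau * diff if diff >= 0 else (tau - 1) * diff
    return total / len(y_true)

# Hypothetical height-for-age Z-scores; search for the best constant at tau = 0.5.
z_scores = [-3.0, -2.0, -1.0, 0.0, 1.0]
best = min(z_scores, key=lambda c: pinball_loss(z_scores, [c] * len(z_scores), 0.5))
print(best)
```

    At tau = 0.5 the minimizing constant is the sample median; other values of tau shift the minimizer toward the corresponding quantile.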

  13. Understanding child stunting in India: a comprehensive analysis of socio-economic, nutritional and environmental determinants using additive quantile regression.

    PubMed

    Fenske, Nora; Burns, Jacob; Hothorn, Torsten; Rehfuess, Eva A

    2013-01-01

    Most attempts to address undernutrition, responsible for one third of global child deaths, have fallen behind expectations. This suggests that the assumptions underlying current modelling and intervention practices should be revisited. We undertook a comprehensive analysis of the determinants of child stunting in India, and explored whether the established focus on linear effects of single risks is appropriate. Using cross-sectional data for children aged 0-24 months from the Indian National Family Health Survey for 2005/2006, we populated an evidence-based diagram of immediate, intermediate and underlying determinants of stunting. We modelled linear, non-linear, spatial and age-varying effects of these determinants using additive quantile regression for four quantiles of the Z-score of standardized height-for-age and logistic regression for stunting and severe stunting. At least one variable within each of eleven groups of determinants was significantly associated with height-for-age in the 35% Z-score quantile regression. The non-modifiable risk factors child age and sex, and the protective factors household wealth, maternal education and BMI showed the largest effects. Being a twin or multiple birth was associated with dramatically decreased height-for-age. Maternal age, maternal BMI, birth order and number of antenatal visits influenced child stunting in non-linear ways. Findings across the four quantile and two logistic regression models were largely comparable. Our analysis confirms the multifactorial nature of child stunting. It emphasizes the need to pursue a systems-based approach and to consider non-linear effects, and suggests that differential effects across the height-for-age distribution do not play a major role.

  14. Socio-economic and lifestyle parameters associated with diet quality of children and adolescents using classification and regression tree analysis: the DIATROFI study.

    PubMed

    Yannakoulia, Mary; Lykou, Anastasia; Kastorini, Christina Maria; Saranti Papasaranti, Eirini; Petralias, Athanassios; Veloudaki, Afroditi; Linos, Athena

    2016-02-01

    To explore factors affecting children's and adolescents' diet quality, in the framework of a food aid and promotion of healthy nutrition programme implemented in areas of low socio-economic status of Greece, during the current financial crisis. From a total of 162 schools participating in the programme during 2012-2013, we gathered 15 897 questionnaires recording sociodemographic characteristics, lifestyle parameters and dietary habits of children and their families. As a measure of socio-economic status, the Family Affluence Scale (FAS) was used; whereas for the assessment of diet quality, the KIDMED score was computed. Associations between KIDMED and FAS, physical activity and socio-economic parameters were examined using regression and classification-regression tree analysis (CART). The higher the FAS score, the greater the percentage of children and adolescents who reported to consume, on a daily basis, fruits and vegetables, dairy products and breakfast (P<0·001). Results from CART showed that children and adolescents in the medium or high FAS groups had higher KIDMED score, compared with those in the low FAS group. For those in the low FAS group, KIDMED score is expected to increase by 12·4 % when they spend more than 0·25 h/week in sports activities. The respective threshold for the medium and high FAS groups is 1·75 h/week, while education of the mother and father affected KIDMED score significantly as well. Diet quality is strongly influenced by socio-economic parameters in children and adolescents living in economically disadvantaged areas of Greece, so that lower family affluence is associated with worse diet quality.

  15. Benefits of tree mixes in carbon plantings

    NASA Astrophysics Data System (ADS)

    Hulvey, Kristin B.; Hobbs, Richard J.; Standish, Rachel J.; Lindenmayer, David B.; Lach, Lori; Perring, Michael P.

    2013-10-01

    Increasingly governments and the private sector are using planted forests to offset carbon emissions. Few studies, however, examine how tree diversity -- defined here as species richness and/or stand composition -- affects carbon storage in these plantings. Using aboveground tree biomass as a proxy for carbon storage, we used meta-analysis to compare carbon storage in tree mixtures with monoculture plantings. Tree mixes stored at least as much carbon as monocultures consisting of the mixture's most productive species and at times outperformed monoculture plantings. In mixed-species stands, individual species, and in particular nitrogen-fixing trees, increased stand biomass. Further motivations for incorporating tree richness into planted forests include the contribution of diversity to total forest carbon-pool development, carbon-pool stability and the provision of extra ecosystem services. Our findings suggest a two-pronged strategy for designing carbon plantings including: (1) increased tree species richness; and (2) the addition of species that contribute to carbon storage and other target functions.

  16. Hierarchical Matching and Regression with Application to Photometric Redshift Estimation

    NASA Astrophysics Data System (ADS)

    Murtagh, Fionn

    2017-06-01

    This work emphasizes that heterogeneity, diversity, discontinuity, and discreteness in data are to be exploited in classification and regression problems. A global a priori model may not be desirable. For data analytics in cosmology, this is motivated by the variety of cosmological objects such as elliptical, spiral, active, and merging galaxies at a wide range of redshifts. Our aim is matching and similarity-based analytics that takes account of discrete relationships in the data. The information structure of the data is represented by a hierarchy or tree where the branch structure, rather than just the proximity, is important. The representation is related to p-adic number theory. The clustering or binning of the data values, related to the precision of the measurements, has a central role in this methodology. If used for regression, our approach is a method of cluster-wise regression, generalizing nearest neighbour regression. Both to exemplify this analytics approach, and to demonstrate computational benefits, we address the well-known photometric redshift or 'photo-z' problem, seeking to match Sloan Digital Sky Survey (SDSS) spectroscopic and photometric redshifts.

  17. Delayed conifer tree mortality following fire in California

    Treesearch

    Sharon M. Hood; Sheri L. Smith; Daniel R. Cluck

    2007-01-01

    Fire injury was characterized and survival monitored for 5,246 trees from five wildfires in California that occurred between 1999 and 2002. Logistic regression models for predicting the probability of mortality were developed for incense-cedar, Jeffrey pine, ponderosa pine, red fir and white fir. Two-year post-fire preliminary models were developed for incense-cedar,...

  18. Category of trees in representation theory of quantum algebras

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Moskaliuk, N. M.; Moskaliuk, S. S.

    2013-10-15

    New applications of categorical methods are connected with new additional structures on categories. One such structure in the representation theory of quantum algebras, the category of Kuznetsov-Smorodinsky-Vilenkin-Smirnov (KSVS) trees, is constructed; its objects are finite rooted KSVS trees and its morphisms are generated by the transition from one KSVS tree to another.

  19. Tree thinning as an option to increase herbaceous yield of an encroached semi-arid savanna in South Africa

    PubMed Central

    Smit, Gert N

    2005-01-01

    Background: The investigation was conducted in a savanna area covered by what was considered an undesirably dense stand of Colophospermum mopane trees, mainly because such a dense stand of trees often results in the suppression of herbaceous plants. The objectives of this study were to determine the influence of intensity of tree thinning on the dry matter yield of herbaceous plants (notably grasses) and to investigate differences in herbaceous species composition between defined subhabitats (under tree canopies, between tree canopies and where trees have been removed). Seven plots (65 × 180 m) were subjected to different intensities of tree thinning, ranging from a totally cleared plot (0%) to plots thinned to the equivalent of 10%, 20%, 35%, 50% and 75% of the leaf biomass of a control plot (100%) with a tree density of 2711 plants ha⁻¹. The establishment of herbaceous plants (grasses and forbs) in response to reduced competition from the woody plants was measured during three full growing seasons following the thinning treatments. Results: The grass component reacted positively to the tree thinning in terms of total dry matter (DM) yield, but forbs were negatively influenced. Rainfall interacted with tree density and the differences between grass DM yields in thinned plots during years of below average rainfall were substantially higher than those of the control. At high tree densities, yields differed little between seasons of varying rainfall. The relation between grass DM yield and tree biomass was curvilinear, best described by the exponential regression equation. Subhabitat differentiation by C. mopane trees did provide some qualitative benefits, with certain desirable grass species showing a preference for the subhabitat under tree canopies. Conclusion: While it can be concluded from this study that high tree densities suppress herbaceous production, the decision to clear/thin the C. mopane trees should include additional considerations. Thinning of C

  20. Exposure and effects of perfluoroalkyl substances in tree ...

    EPA Pesticide Factsheets

    The exposure and effects of perfluoroalkyl substances (PFASs) were studied at eight locations in Minnesota and Wisconsin between 2007 and 2011 using tree swallows (Tachycineta bicolor) as sentinel species. These eight sites covered a range of possible exposure pathways and ecological settings. Concentrations in various swallow tissues were quantified as were reproductive success endpoints. The sample egg method was used wherein an egg sample is collected and the hatching success of the remaining eggs in the nest is assessed. The association between PFAS exposure and reproductive success was assessed by site comparisons, logistic regression analysis, and multistate modeling, a technique that has not previously been used in this context. There was a negative association between concentrations of PFASs in eggs and hatching success; this is the second field study in which a negative association was found. The concentration at which effects became evident (150–200 ng/g wet wt.) was far below effect levels found in laboratory feeding trials or egg injection studies on other avian species. This discrepancy was likely because behavioral effects and other extrinsic factors are not accounted for in these laboratory studies; further, there is a mixture of PFASs in field studies rather than a single-contaminant used in laboratory studies, and the possibility that tree swallows are unusually sensitive to PFASs. Additional field effect studies on other avian species

  1. What Makes a Tree a Tree?

    ERIC Educational Resources Information Center

    NatureScope, 1986

    1986-01-01

    Provides: (1) background information on trees, focusing on the parts of trees and how they differ from other plants; (2) eight activities; and (3) ready-to-copy pages dealing with tree identification and tree rings. Activities include objective(s), recommended age level(s), subject area(s), list of materials needed, and procedures. (JN)

  2. TreeCmp: Comparison of Trees in Polynomial Time

    PubMed Central

    Bogdanowicz, Damian; Giaro, Krzysztof; Wróbel, Borys

    2012-01-01

    When a phylogenetic reconstruction does not result in one tree but in several, tree metrics permit finding out how far the reconstructed trees are from one another. They also permit to assess the accuracy of a reconstruction if a true tree is known. TreeCmp implements eight metrics that can be calculated in polynomial time for arbitrary (not only bifurcating) trees: four for unrooted (Matching Split metric, which we have recently proposed, Robinson-Foulds, Path Difference, Quartet) and four for rooted trees (Matching Cluster, Robinson-Foulds cluster, Nodal Splitted and Triple). TreeCmp is the first implementation of Matching Split/Cluster metrics and the first efficient and convenient implementation of Nodal Splitted. It allows to compare relatively large trees. We provide an example of the application of TreeCmp to compare the accuracy of ten approaches to phylogenetic reconstruction with trees up to 5000 external nodes, using a measure of accuracy based on normalized similarity between trees.
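
    As a rough illustration of the Robinson-Foulds cluster idea for rooted trees named in this record, the sketch below counts the clusters (leaf sets under internal nodes) that appear in one tree but not the other. It is not TreeCmp's code: the nested-tuple tree encoding and the names `clusters` and `rf_cluster_distance` are assumptions made for this example, and conventions differ on whether the count is halved.

```python
def clusters(tree):
    """Collect the leaf set under every internal node.

    A tree is a nested tuple whose leaves are strings (hypothetical encoding).
    """
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        leaves = frozenset().union(*(walk(child) for child in node))
        found.add(leaves)
        return leaves

    walk(tree)
    return found


def rf_cluster_distance(tree_a, tree_b):
    """Number of clusters present in exactly one of the two rooted trees."""
    return len(clusters(tree_a) ^ clusters(tree_b))


if __name__ == "__main__":
    t1 = ((("a", "b"), "c"), "d")
    t2 = ((("a", "c"), "b"), "d")
    print(rf_cluster_distance(t1, t2))
```

    The two example trees differ only in which pair of leaves is grouped first, so exactly one cluster from each tree is unmatched.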

  3. IND - THE IND DECISION TREE PACKAGE

    NASA Technical Reports Server (NTRS)

    Buntine, W.

    1994-01-01

    A common approach to supervised classification and prediction in artificial intelligence and statistical pattern recognition is the use of decision trees. A tree is "grown" from data using a recursive partitioning algorithm to create a tree which has good prediction of classes on new data. Standard algorithms are CART (by Breiman, Friedman, Olshen and Stone) and ID3 and its successor C4 (by Quinlan). As well as reimplementing parts of these algorithms and offering experimental control suites, IND also introduces Bayesian and MML methods and more sophisticated search in growing trees. These produce more accurate class probability estimates that are important in applications like diagnosis. IND is applicable to most data sets consisting of independent instances, each described by a fixed length vector of attribute values. An attribute value may be a number, one of a set of attribute specific symbols, or it may be omitted. One of the attributes is delegated the "target" and IND grows trees to predict the target. Prediction can then be done on new data or the decision tree printed out for inspection. IND provides a range of features and styles with convenience for the casual user as well as fine-tuning for the advanced user or those interested in research. IND can be operated in a CART-like mode (but without regression trees, surrogate splits or multivariate splits), and in a mode like the early version of C4. Advanced features allow more extensive search, interactive control and display of tree growing, and Bayesian and MML algorithms for tree pruning and smoothing. These often produce more accurate class probability estimates at the leaves. IND also comes with a comprehensive experimental control suite. IND consists of four basic kinds of routines: data manipulation routines, tree generation routines, tree testing routines, and tree display routines. The data manipulation routines are used to partition a single large data set into smaller training and test sets. The
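
    The recursive partitioning mentioned in this record can be illustrated by its basic step: choosing the single threshold split that most reduces class impurity. The sketch below is not IND's code; it assumes numeric attributes, uses Gini impurity (the criterion usually associated with CART), and the names `gini` and `best_split` are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty or pure list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_impurity) of the best split.

    rows is a list of equal-length numeric feature vectors. Candidate
    thresholds are midpoints between consecutive distinct feature values.
    Returns None when no feature has two distinct values.
    """
    best = None
    n = len(rows)
    for j in range(len(rows[0])):
        values = sorted({row[j] for row in rows})
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            left = [y for row, y in zip(rows, labels) if row[j] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[j] > threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (j, threshold, score)
    return best


if __name__ == "__main__":
    rows = [[1.0, 7.0], [2.0, 3.0], [3.0, 8.0], [4.0, 2.0]]
    labels = ["a", "a", "b", "b"]
    print(best_split(rows, labels))
```

    A full tree grower would apply `best_split` recursively to the left and right subsets until a stopping rule is met, then prune; those steps are omitted here.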

  4. Beating the Odds: Trees to Success in Different Countries

    ERIC Educational Resources Information Center

    Finch, W. Holmes; Marchant, Gregory J.

    2017-01-01

    A recursive partitioning model approach in the form of classification and regression trees (CART) was used with 2012 PISA data for five countries (Canada, Finland, Germany, Singapore-China, and the United States). The objective of the study was to determine demographic and educational variables that differentiated between low SES students that were…

  5. Updated generalized biomass equations for North American tree species

    Treesearch

    David C. Chojnacky; Linda S. Heath; Jennifer C. Jenkins

    2014-01-01

    Historically, tree biomass at large scales has been estimated by applying dimensional analysis techniques and field measurements such as diameter at breast height (dbh) in allometric regression equations. Equations often have been developed using differing methods and applied only to certain species or isolated areas. We previously had compiled and combined (in meta-...

  6. Exposure and effects of perfluoroalkyl substances in tree swallows nesting in Minnesota and Wisconsin, USA

    USGS Publications Warehouse

    Custer, Christine M.; Custer, Thomas W.; Dummer, Paul; Etterson, Matthew A.; Thogmartin, Wayne E.; Wu, Qian; Kannan, Kurunthachalam; Trowbridge, Annette; McKann, Patrick C.

    2013-01-01

    The exposure and effects of perfluoroalkyl substances (PFASs) were studied at eight locations in Minnesota and Wisconsin between 2007 and 2011 using tree swallows (Tachycineta bicolor). Concentrations of PFASs were quantified as were reproductive success end points. The sample egg method was used wherein an egg sample is collected, and the hatching success of the remaining eggs in the nest is assessed. The association between PFAS exposure and reproductive success was assessed by site comparisons, logistic regression analysis, and multistate modeling, a technique not previously used in this context. There was a negative association between concentrations of perfluorooctane sulfonate (PFOS) in eggs and hatching success. The concentration at which effects became evident (150–200 ng/g wet weight) was far lower than effect levels found in laboratory feeding trials or egg-injection studies of other avian species. This discrepancy was likely because behavioral effects and other extrinsic factors are not accounted for in these laboratory studies and the possibility that tree swallows are unusually sensitive to PFASs. The results from multistate modeling and simple logistic regression analyses were nearly identical. Multistate modeling provides a better method to examine possible effects of additional covariates and assessment of models using Akaike information criteria analyses. There was a credible association between PFOS concentrations in plasma and eggs, so extrapolation between these two commonly sampled tissues can be performed.

  7. Coping with Multicollinearity: An Example on Application of Principal Components Regression in Dendroecology

    Treesearch

    B. Desta Fekedulegn; J.J. Colbert; R.R., Jr. Hicks; Michael E. Schuckers

    2002-01-01

    The theory and application of principal components regression, a method for coping with multicollinearity among independent variables in analyzing ecological data, is exhibited in detail. A concrete example of the complex procedures that must be carried out in developing a diagnostic growth-climate model is provided. We use tree radial increment data taken from breast...

  8. Size matters a lot: tree height and prior growth predict drought-induced tree death in Italian oak forests

    NASA Astrophysics Data System (ADS)

    Ripullone, F.; Colangelo, M.; Camarero, J. J.; Gazol, A.; Borghetti, M.; Gentilesca, T.

    2016-12-01

    Climate warming is expected to amplify drought stress resulting in the occurrence of more widespread dieback episodes and increasing mortality rates. This has pushed the search for reliable and robust early-warning indicators of impending drought-triggered tree death. Recent studies highlight how level of defoliation or age of trees strictly coact with drought in leading to forest decline. In addition, tree size and the tree-to-tree competition for water could also contribute to tree death in drought-prone sites. In this regard, it has been predicted that tall trees with isohydric stomatal regulation are most likely to die due to drought stress. Here, we test this hypothesis by analyzing size, age, competition and growth data in a Mediterranean oak species characterized by anisohydric behaviour, showing recent drought-induced mortality in two Italian forest sites. At both study sites, tree height was associated with the probability of dying. However, this association was opposite to published predictions because living trees were taller than dead trees at both sites. Neither age nor competition intensity played significant roles as drivers of tree mortality. Regarding growth data, trends in basal area increment were significantly smaller in dead than in living trees. Differences were more marked at mid-term (15 years prior to death) than at short-term (10 years) or long-term (35 years) scales. This is probably not related to intrinsic growth features of the study species but can be explained by the fact that the most severe drought since 1950 occurred in 2000 in the study area, i.e. 15 years prior to the increase of tree mortality and when growth of living and dead trees started diverging. Lastly, we discuss potential factors which may explain why smaller individuals of anisohydric tree species such as Mediterranean oaks are prone to drought-induced tree death.

  9. Using tree diversity to compare phylogenetic heuristics.

    PubMed

    Sul, Seung-Jin; Matthews, Suzanne; Williams, Tiffani L

    2009-04-29

    Evolutionary trees are family trees that represent the relationships between a group of organisms. Phylogenetic heuristics are used to search stochastically for the best-scoring trees in tree space. Given that better tree scores are believed to be better approximations of the true phylogeny, traditional evaluation techniques have used tree scores to determine the heuristics that find the best scores in the fastest time. We develop new techniques to evaluate phylogenetic heuristics based on both tree scores and topologies to compare Pauprat and Rec-I-DCM3, two popular Maximum Parsimony search algorithms. Our results show that although Pauprat and Rec-I-DCM3 find the trees with the same best scores, topologically these trees are quite different. Furthermore, the Rec-I-DCM3 trees cluster distinctly from the Pauprat trees. In addition to our heatmap visualizations of using parsimony scores and the Robinson-Foulds distance to compare best-scoring trees found by the two heuristics, we also develop entropy-based methods to show the diversity of the trees found. Overall, Pauprat identifies more diverse trees than Rec-I-DCM3. Our work shows that there is value in comparing heuristics beyond the parsimony scores that they find. Pauprat is a slower heuristic than Rec-I-DCM3. However, our work shows that there is tremendous value in using Pauprat to reconstruct trees, especially since it finds identically scoring but topologically distinct trees. Hence, instead of discounting Pauprat, effort should go into improving its implementation. Ultimately, improved performance measures lead to better phylogenetic heuristics and will result in better approximations of the true evolutionary history of the organisms of interest.

  10. Modifiable risk factors predicting major depressive disorder at four year follow-up: a decision tree approach.

    PubMed

    Batterham, Philip J; Christensen, Helen; Mackinnon, Andrew J

    2009-11-22

    Relative to physical health conditions such as cardiovascular disease, little is known about risk factors that predict the prevalence of depression. The present study investigates the expected effects of a reduction of these risks over time, using the decision tree method favoured in assessing cardiovascular disease risk. The PATH through Life cohort was used for the study, comprising 2,105 20-24 year olds, 2,323 40-44 year olds and 2,177 60-64 year olds sampled from the community in the Canberra region, Australia. A decision tree methodology was used to predict the presence of major depressive disorder after four years of follow-up. The decision tree was compared with a logistic regression analysis using ROC curves. The decision tree was found to distinguish and delineate a wide range of risk profiles. Previous depressive symptoms were most highly predictive of depression after four years; however, modifiable risk factors such as substance use and employment status played significant roles in assessing the risk of depression. The decision tree was found to have better sensitivity and specificity than a logistic regression using identical predictors. The decision tree method was useful in assessing the risk of major depressive disorder over four years. Application of the model to the development of a predictive tool for tailored interventions is discussed.

  11. Reconstructing Unrooted Phylogenetic Trees from Symbolic Ternary Metrics.

    PubMed

    Grünewald, Stefan; Long, Yangjing; Wu, Yaokun

    2018-03-09

    Böcker and Dress (Adv Math 138:105-125, 1998) presented a 1-to-1 correspondence between symbolically dated rooted trees and symbolic ultrametrics. We consider the corresponding problem for unrooted trees. More precisely, given a tree T with leaf set X and a proper vertex coloring of its interior vertices, we can map every triple of three different leaves to the color of its median vertex. We characterize all ternary maps that can be obtained in this way in terms of 4- and 5-point conditions, and we show that the corresponding tree and its coloring can be reconstructed from a ternary map that satisfies those conditions. Further, we give an additional condition that characterizes whether the tree is binary, and we describe an algorithm that reconstructs general trees in a bottom-up fashion.

  12. Measuring urban tree loss dynamics across residential landscapes.

    PubMed

    Ossola, Alessandro; Hopton, Matthew E

    2018-01-15

    The spatial arrangement of urban vegetation depends on urban morphology and socio-economic settings. Urban vegetation changes over time because of human management. Urban trees are removed due to hazard prevention or aesthetic preferences. Previous research attributed tree loss to decreases in canopy cover. However, this provides little information about location and structural characteristics of trees lost, as well as environmental and social factors affecting tree loss dynamics. This is particularly relevant in residential landscapes where access to residential parcels for field surveys is limited. We tested whether multi-temporal airborne LiDAR and multi-spectral imagery collected at a 5-year interval can be used to investigate urban tree loss dynamics across residential landscapes in Denver, CO and Milwaukee, WI, covering 400,705 residential parcels in 444 census tracts. Position and stem height of trees lost were extracted from canopy height models calculated as the difference between final (year 5) and initial (year 0) vegetation height derived from LiDAR. Multivariate regression models were used to predict number and height of tree stems lost in residential parcels in each census tract based on urban morphological and socio-economic variables. A total of 28,427 stems were lost from residential parcels in Denver and Milwaukee over 5 years. Overall, 7% of residential parcels lost one stem, averaging 90.87 stems per km². Average stem height was 10.16 m, though trees lost in Denver were taller compared to Milwaukee. The number of stems lost was higher in neighborhoods with higher canopy cover and developed before the 1970s. However, socio-economic characteristics had little effect on tree loss dynamics. The study provides a simple method for measuring urban tree loss dynamics within and across entire cities, and represents a further step toward high resolution assessments of the three-dimensional change of urban vegetation at large spatial scales.

  13. Selective Tree-ring Models: A Novel Method for Reconstructing Streamflow Using Tree Rings

    NASA Astrophysics Data System (ADS)

    Foard, M. B.; Nelson, A. S.; Harley, G. L.

    2017-12-01

    Surface water is among the most instrumental and vulnerable resources in the Northwest United States (NW). Recent observations show that overall water quantity is declining in streams across the region, while extreme flooding events occur more frequently. Historical streamflow models inform probabilities of extreme flow events (flood or drought) by describing frequency and duration of past events. There are numerous examples of tree-rings being utilized to reconstruct streamflow in the NW. These models confirm that tree-rings are highly accurate at predicting streamflow, however there are many nuances that limit their applicability through time and space. For example, most models predict streamflow from hydrologically altered rivers (e.g. dammed, channelized) which may hinder our ability to predict natural prehistoric flow. They also have a tendency to over/under-predict extreme flow events. Moreover, they often neglect to capture the changing relationships between tree-growth and streamflow over time and space. To address these limitations, we utilized national tree-ring and streamflow archives to investigate the relationships between the growth of multiple coniferous species and free-flowing streams across the NW using novel species- and site-specific streamflow models - a term we coined "selective tree-ring models." Correlation function analysis and regression modeling were used to evaluate the strengths and directions of the flow-growth relationships. Species with significant relationships in the same direction were identified as strong candidates for selective models. Temporal and spatial patterns of these relationships were examined using running correlations and inverse distance weighting interpolation, respectively. Our early results indicate that (1) species adapted to extreme climates (e.g. hot-dry, cold-wet) exhibit the most consistent relationships across space, (2) these relationships weaken in locations with mild climatic variability, and (3) some

  14. Using Evidence-Based Decision Trees Instead of Formulas to Identify At-Risk Readers. REL 2014-036

    ERIC Educational Resources Information Center

    Koon, Sharon; Petscher, Yaacov; Foorman, Barbara R.

    2014-01-01

    This study examines whether the classification and regression tree (CART) model improves the early identification of students at risk for reading comprehension difficulties compared with the more difficult to interpret logistic regression model. CART is a type of predictive modeling that relies on nonparametric techniques. It presents results in…

  15. The Inference of Gene Trees with Species Trees

    PubMed Central

    Szöllősi, Gergely J.; Tannier, Eric; Daubin, Vincent; Boussau, Bastien

    2015-01-01

    This article reviews the various models that have been used to describe the relationships between gene trees and species trees. Molecular phylogeny has focused mainly on improving models for the reconstruction of gene trees based on sequence alignments. Yet, most phylogeneticists seek to reveal the history of species. Although the histories of genes and species are tightly linked, they are seldom identical, because genes duplicate, are lost or horizontally transferred, and because alleles can coexist in populations for periods that may span several speciation events. Building models describing the relationship between gene and species trees can thus improve the reconstruction of gene trees when a species tree is known, and vice versa. Several approaches have been proposed to solve the problem in one direction or the other, but in general neither gene trees nor species trees are known. Only a few studies have attempted to jointly infer gene trees and species trees. These models account for gene duplication and loss, transfer or incomplete lineage sorting. Some of them consider several types of events together, but none exists currently that considers the full repertoire of processes that generate gene trees along the species tree. Simulations as well as empirical studies on genomic data show that combining gene tree–species tree models with models of sequence evolution improves gene tree reconstruction. In turn, these better gene trees provide a more reliable basis for studying genome evolution or reconstructing ancestral chromosomes and ancestral gene sequences. We predict that gene tree–species tree methods that can deal with genomic data sets will be instrumental to advancing our understanding of genomic evolution. PMID:25070970

  16. Boosted Regression Trees Outperforms Support Vector Machines in Predicting (Regional) Yields of Winter Wheat from Single and Cumulated Dekadal Spot-VGT Derived Normalized Difference Vegetation Indices

    NASA Astrophysics Data System (ADS)

    Stas, Michiel; Dong, Qinghan; Heremans, Stien; Zhang, Beier; Van Orshoven, Jos

    2016-08-01

    This paper compares two machine learning techniques to predict regional winter wheat yields. The models, based on Boosted Regression Trees (BRT) and Support Vector Machines (SVM), are constructed of Normalized Difference Vegetation Indices (NDVI) derived from low resolution SPOT VEGETATION satellite imagery. Three types of NDVI-related predictors were used: Single NDVI, Incremental NDVI and Targeted NDVI. BRT and SVM were first used to select features with high relevance for predicting the yield. Although the exact selections differed between the prefectures, certain periods with high influence scores for multiple prefectures could be identified. The same period of high influence stretching from March to June was detected by both machine learning methods. After feature selection, BRT and SVM models were applied to the subset of selected features for actual yield forecasting. Whereas both machine learning methods returned very low prediction errors, BRT seems to slightly but consistently outperform SVM.

  17. Influence of Tree-Scale Environmental Variability on Tree-Ring Reconstructions of Temperature at Sonora Pass, CA

    NASA Astrophysics Data System (ADS)

    Ma, L.; Stine, A.

    2016-12-01

    Tree-ring widths from treeline environments tend to covary with local interannual temperature variability. However, other environmental factors such as moisture and light availability may further modulate tree growth in cold climates. We investigate the influence of various environmental factors on a tree-ring record from a research plot near Sonora Pass, CA (38.32N, 119.64W; elev. 3130 m). This treeline ecotone is dominated by whitebark pine (Pinus albicaulis) growing as individuals and as stands, and at the transition between tree form and krummholz. We surveyed all trees in the 160 m x 90 m site, mapping and coring all trees with a diameter at breast height greater than 10 cm. We use survey data to test for an influence of inter-tree competition on growth. We also test for modulation of growth by variation in distance from surface water, aspect and slope, and soil types. Initial results show a relationship between tree ring width and local May-July temperature (R = 0.33, p < 0.01), suggesting summer temperature as a large-scale control on growth. Incorporating the tree-level metadata, we test for the effect of spatial variability on mean growth rate and on reconstructed temperatures. Trees that have larger or closer neighboring trees experience greater competition, and we hypothesize that competition will be inversely related to average growth rate. Further, we test the sensitivity of ring-width interannual variability to other non-temperature environmental drivers such as moisture availability, light competition, and spatial relations in the microenvironment. We hypothesize that trees that have ready access to light and water will likely produce ring records more closely correlated with the temperature record, and thus will produce a temperature reconstruction with a higher signal-to-noise ratio; whereas trees that experience more microenvironment limitations or competition will produce ring records resembling temperature and additional environmental factors or

  18. Regression analysis of clustered failure time data with informative cluster size under the additive transformation models.

    PubMed

    Chen, Ling; Feng, Yanqin; Sun, Jianguo

    2017-10-01

    This paper discusses regression analysis of clustered failure time data, which occur when the failure times of interest are collected from clusters. In particular, we consider the situation where the correlated failure times of interest may be related to cluster sizes. For inference, we present two estimation procedures, the weighted estimating equation-based method and the within-cluster resampling-based method, when the correlated failure times of interest arise from a class of additive transformation models. The former makes use of the inverse of cluster sizes as weights in the estimating equations, while the latter can be easily implemented by using the existing software packages for right-censored failure time data. An extensive simulation study is conducted and indicates that the proposed approaches work well in both the situations with and without informative cluster size. They are applied to a dental study that motivated this study.
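
    The two estimation ideas named in this record can be illustrated in a much simpler setting than the paper's. The sketch below is not the authors' procedure for additive transformation models with censored failure times: it assumes uncensored outcomes and estimates only a marginal mean, and the names `weighted_mean` and `wcr_mean` are hypothetical. It shows that weighting each observation by the inverse of its cluster size and repeatedly resampling one observation per cluster target the same quantity.

```python
import random


def weighted_mean(clusters):
    """Inverse-cluster-size weighted estimate of the mean.

    Solving sum_i (1/n_i) * sum_j (x_ij - mu) = 0 for mu gives the
    average of the per-cluster means.
    """
    return sum(sum(c) / len(c) for c in clusters) / len(clusters)


def wcr_mean(clusters, n_resamples=4000, seed=0):
    """Within-cluster resampling estimate of the mean.

    Each resample draws one observation at random from every cluster,
    so the resampled data are independent; the per-resample means are
    then averaged.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_resamples):
        draw = [rng.choice(c) for c in clusters]
        total += sum(draw) / len(draw)
    return total / n_resamples


if __name__ == "__main__":
    # Toy data: the larger cluster has smaller values (informative size).
    data = [[1.0, 2.0, 3.0], [10.0]]
    pooled = sum(x for c in data for x in c) / sum(len(c) for c in data)
    print(pooled, weighted_mean(data), wcr_mean(data))
```

    In the toy data the pooled mean is pulled toward the larger cluster, while both cluster-level estimates treat each cluster as one unit.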

  19. Long tree-ring chronologies provide evidence of recent tree growth decrease in a Central African tropical forest.

    PubMed

    Battipaglia, Giovanna; Zalloni, Enrica; Castaldi, Simona; Marzaioli, Fabio; Cazzolla-Gatti, Roberto; Lasserre, Bruno; Tognetti, Roberto; Marchetti, Marco; Valentini, Riccardo

    2015-01-01

    It is still unclear whether the exponential rise of atmospheric CO2 concentration has produced a fertilization effect on tropical forests, thus incrementing their growth rate, in the last two centuries. As many factors affect tree growth patterns, short-term studies might be influenced by the confounding effect of several interacting environmental variables on plant growth. Long-term analyses of tree growth can elucidate long-term trends of plant growth response to dominant drivers. The study of annual rings, applied to long tree-ring chronologies in tropical forest trees enables such analysis. Long-term tree-ring chronologies of three widespread African species were measured in Central Africa to analyze the growth of trees over the last two centuries. Growth trends were correlated to changes in global atmospheric CO2 concentration and local variations in the main climatic drivers, temperature and rainfall. Our results provided no evidence for a fertilization effect of CO2 on tree growth. On the contrary, an overall growth decline was observed for all three species in the last century, which appears to be significantly correlated to the increase in local temperature. These findings provide additional support to the global observations of a slowing down of C sequestration in the trunks of forest trees in recent decades. Data indicate that the CO2 increase alone has not been sufficient to obtain a tree growth increase in tropical trees. The effect of other changing environmental factors, like temperature, may have overridden the fertilization effect of CO2.

  20. Long Tree-Ring Chronologies Provide Evidence of Recent Tree Growth Decrease in a Central African Tropical Forest

    PubMed Central

    Battipaglia, Giovanna; Zalloni, Enrica; Castaldi, Simona; Marzaioli, Fabio; Cazzolla-Gatti, Roberto; Lasserre, Bruno; Tognetti, Roberto; Marchetti, Marco; Valentini, Riccardo

    2015-01-01

    It is still unclear whether the exponential rise of atmospheric CO2 concentration has produced a fertilization effect on tropical forests, thus incrementing their growth rate, in the last two centuries. As many factors affect tree growth patterns, short-term studies might be influenced by the confounding effect of several interacting environmental variables on plant growth. Long-term analyses of tree growth can elucidate long-term trends of plant growth response to dominant drivers. The study of annual rings, applied to long tree-ring chronologies in tropical forest trees enables such analysis. Long-term tree-ring chronologies of three widespread African species were measured in Central Africa to analyze the growth of trees over the last two centuries. Growth trends were correlated to changes in global atmospheric CO2 concentration and local variations in the main climatic drivers, temperature and rainfall. Our results provided no evidence for a fertilization effect of CO2 on tree growth. On the contrary, an overall growth decline was observed for all three species in the last century, which appears to be significantly correlated to the increase in local temperature. These findings provide additional support to the global observations of a slowing down of C sequestration in the trunks of forest trees in recent decades. Data indicate that the CO2 increase alone has not been sufficient to obtain a tree growth increase in tropical trees. The effect of other changing environmental factors, like temperature, may have overridden the fertilization effect of CO2. PMID:25806946

  1. The View from the Trees: Nocturnal Bull Ants, Myrmecia midas, Use the Surrounding Panorama While Descending from Trees

    PubMed Central

    Freas, Cody A.; Wystrach, Antione; Narendra, Ajay; Cheng, Ken

    2018-01-01

    Solitary foraging ants commonly use visual cues from their environment for navigation. Foragers are known to store visual scenes from the surrounding panorama for later guidance to known resources and to return successfully back to the nest. Several ant species travel not only on the ground, but also climb trees to locate resources. The navigational information that guides animals back home during their descent, while their body is perpendicular to the ground, is largely unknown. Here, we investigate in a nocturnal ant, Myrmecia midas, whether foragers travelling down a tree use visual information to return home. These ants establish nests at the base of a tree on which they forage and in addition, they also forage on nearby trees. We collected foragers and placed them on the trunk of the nest tree or a foraging tree in multiple compass directions. Regardless of the displacement location, upon release ants immediately moved to the side of the trunk facing the nest during their descent. When ants were released on non-foraging trees near the nest, displaced foragers again travelled around the tree to the side facing the nest. All the displaced foragers reached the correct side of the tree well before reaching the ground. However, when the terrestrial cues around the tree were blocked, foragers were unable to orient correctly, suggesting that the surrounding panorama is critical to successful orientation on the tree. Through analysis of panoramic pictures, we show that views acquired at the base of the foraging tree nest can provide reliable nest-ward orientation up to 1.75 m above the ground. We discuss how animals descending from trees compare their current scene to a memorised scene and report on the similarities in visually guided behaviour while navigating on the ground and descending from trees. PMID:29422880

  2. The View from the Trees: Nocturnal Bull Ants, Myrmecia midas, Use the Surrounding Panorama While Descending from Trees.

    PubMed

    Freas, Cody A; Wystrach, Antione; Narendra, Ajay; Cheng, Ken

    2018-01-01

    Solitary foraging ants commonly use visual cues from their environment for navigation. Foragers are known to store visual scenes from the surrounding panorama for later guidance to known resources and to return successfully back to the nest. Several ant species travel not only on the ground, but also climb trees to locate resources. The navigational information that guides animals back home during their descent, while their body is perpendicular to the ground, is largely unknown. Here, we investigate in a nocturnal ant, Myrmecia midas, whether foragers travelling down a tree use visual information to return home. These ants establish nests at the base of a tree on which they forage and in addition, they also forage on nearby trees. We collected foragers and placed them on the trunk of the nest tree or a foraging tree in multiple compass directions. Regardless of the displacement location, upon release ants immediately moved to the side of the trunk facing the nest during their descent. When ants were released on non-foraging trees near the nest, displaced foragers again travelled around the tree to the side facing the nest. All the displaced foragers reached the correct side of the tree well before reaching the ground. However, when the terrestrial cues around the tree were blocked, foragers were unable to orient correctly, suggesting that the surrounding panorama is critical to successful orientation on the tree. Through analysis of panoramic pictures, we show that views acquired at the base of the foraging tree nest can provide reliable nest-ward orientation up to 1.75 m above the ground. We discuss how animals descending from trees compare their current scene to a memorised scene and report on the similarities in visually guided behaviour while navigating on the ground and descending from trees.

  3. Weighted linear regression using D2H and D2 as the independent variables

    Treesearch

    Hans T. Schreuder; Michael S. Williams

    1998-01-01

    Several error structures for weighted regression equations used for predicting volume were examined for 2 large data sets of felled and standing loblolly pine trees (Pinus taeda L.). The generally accepted model with variance of error proportional to the value of the covariate squared (D2H = diameter squared times height or D...
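
As an illustration of the weighted regression idea in this abstract, the following minimal Python sketch fits a straight line by weighted least squares with weights 1/x², which corresponds to an error variance proportional to the squared covariate. The function name, the data values, and the volume equation are hypothetical and are not taken from the paper.

```python
def weighted_linear_fit(x, y, w):
    """Fit y = a + b*x by weighted least squares and return (a, b).

    Each observation i contributes w[i] * (y[i] - a - b*x[i])**2
    to the objective that is minimised.
    """
    sw = sum(w)
    # Weighted means of the covariate and the response.
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    # Weighted sums of squares and cross-products about the means.
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    a = ybar - b * xbar
    return a, b


# Hypothetical covariate values standing in for D2H (not real tree data).
d2h = [1.0, 2.0, 4.0, 8.0]
# Hypothetical volumes lying exactly on the line 0.5 + 0.3 * D2H.
volume = [0.5 + 0.3 * x for x in d2h]
# Assumption: Var(error) is proportional to x**2, so weight = 1 / x**2.
weights = [1.0 / x ** 2 for x in d2h]

intercept, slope = weighted_linear_fit(d2h, volume, weights)
print(intercept, slope)
```

Other error structures would be tried by changing only the `weights` line, for example `1.0 / x` for a variance proportional to the covariate itself.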

  4. Using Classification Trees to Predict Alumni Giving for Higher Education

    ERIC Educational Resources Information Center

    Weerts, David J.; Ronca, Justin M.

    2009-01-01

    As the relative level of public support for higher education declines, colleges and universities aim to maximize alumni-giving to keep their programs competitive. Anchored in a utility maximization framework, this study employs the classification and regression tree methodology to examine characteristics of alumni donors and non-donors at a…

  5. Photosynthetic and Growth Response of Sugar Maple (Acer saccharum Marsh.) Mature Trees and Seedlings to Calcium, Magnesium, and Nitrogen Additions in the Catskill Mountains, NY, USA.

    PubMed

    Momen, Bahram; Behling, Shawna J; Lawrence, Greg B; Sullivan, Joseph H

    2015-01-01

    Decline of sugar maple in North American forests has been attributed to changes in soil calcium (Ca) and nitrogen (N) by acidic precipitation. Although N is an essential and usually a limiting factor in forests, atmospheric N deposition may cause N-saturation leading to loss of soil Ca. Such changes can affect carbon gain and growth of sugar maple trees and seedlings. We applied a 2² factorial arrangement of N and dolomitic limestone containing Ca and magnesium (Mg) to 12 forest plots in the Catskill Mountain region of NY, USA. To quantify the short-term effects, we measured photosynthetic-light responses of sugar maple mature trees and seedlings two or three times during two summers. We estimated maximum net photosynthesis (An-max) and its related light intensity (PAR at An-max), apparent quantum efficiency (Aqe), and light compensation point (LCP). To quantify the long-term effects, we measured basal area of living mature trees before and 4 and 8 years after treatment applications. Soil and foliar chemistry variables were also measured. Dolomitic limestone increased Ca, Mg, and pH in the soil Oe horizon. Mg was increased in the B horizon when comparing the plots receiving N with those receiving CaMg. In mature trees, foliar Ca and Mg concentrations were higher in the CaMg and N+CaMg plots than in the reference or N plots; foliar Ca concentration was higher in the N+CaMg plots compared with the CaMg plots, foliar Mg was higher in the CaMg plots than the N+CaMg plots; An-max was maximized due to N+CaMg treatment; Aqe decreased by N addition; and PAR at An-max increased by N or CaMg treatments alone, but the increase was maximized by their combination. No treatment effect was detected on basal areas of living mature trees four or eight years after treatment applications. In seedlings, An-max was increased by N+CaMg addition. The reference plots had an open herbaceous layer, but the plots receiving N had a dense monoculture of common woodfern in the forest floor

  6. Photosynthetic and Growth Response of Sugar Maple (Acer saccharum Marsh.) Mature Trees and Seedlings to Calcium, Magnesium, and Nitrogen Additions in the Catskill Mountains, NY, USA

    PubMed Central

    Momen, Bahram; Behling, Shawna J.; Lawrence, Greg B.; Sullivan, Joseph H.

    2015-01-01

    Decline of sugar maple in North American forests has been attributed to changes in soil calcium (Ca) and nitrogen (N) by acidic precipitation. Although N is an essential and usually a limiting factor in forests, atmospheric N deposition may cause N-saturation leading to loss of soil Ca. Such changes can affect carbon gain and growth of sugar maple trees and seedlings. We applied a 2² factorial arrangement of N and dolomitic limestone containing Ca and magnesium (Mg) to 12 forest plots in the Catskill Mountain region of NY, USA. To quantify the short-term effects, we measured photosynthetic-light responses of sugar maple mature trees and seedlings two or three times during two summers. We estimated maximum net photosynthesis (An-max) and its related light intensity (PAR at An-max), apparent quantum efficiency (Aqe), and light compensation point (LCP). To quantify the long-term effects, we measured basal area of living mature trees before and 4 and 8 years after treatment applications. Soil and foliar chemistry variables were also measured. Dolomitic limestone increased Ca, Mg, and pH in the soil Oe horizon. Mg was increased in the B horizon when comparing the plots receiving N with those receiving CaMg. In mature trees, foliar Ca and Mg concentrations were higher in the CaMg and N+CaMg plots than in the reference or N plots; foliar Ca concentration was higher in the N+CaMg plots compared with the CaMg plots, foliar Mg was higher in the CaMg plots than the N+CaMg plots; An-max was maximized due to N+CaMg treatment; Aqe decreased by N addition; and PAR at An-max increased by N or CaMg treatments alone, but the increase was maximized by their combination. No treatment effect was detected on basal areas of living mature trees four or eight years after treatment applications. In seedlings, An-max was increased by N+CaMg addition. The reference plots had an open herbaceous layer, but the plots receiving N had a dense monoculture of common woodfern in the forest floor

  7. Photosynthetic and growth response of sugar maple (Acer saccharum Marsh.) mature trees and seedlings to calcium, magnesium, and nitrogen additions in the Catskill Mountains, NY, USA

    USGS Publications Warehouse

    Momen, Bahram; Behling, Shawna J; Lawrence, Gregory B.; Sullivan, Joseph H

    2015-01-01

    Decline of sugar maple in North American forests has been attributed to changes in soil calcium (Ca) and nitrogen (N) by acidic precipitation. Although N is an essential and usually a limiting factor in forests, atmospheric N deposition may cause N-saturation leading to loss of soil Ca. Such changes can affect carbon gain and growth of sugar maple trees and seedlings. We applied a 2² factorial arrangement of N and dolomitic limestone containing Ca and magnesium (Mg) to 12 forest plots in the Catskill Mountain region of NY, USA. To quantify the short-term effects, we measured photosynthetic-light responses of sugar maple mature trees and seedlings two or three times during two summers. We estimated maximum net photosynthesis (An-max) and its related light intensity (PAR at An-max), apparent quantum efficiency (Aqe), and light compensation point (LCP). To quantify the long-term effects, we measured basal area of living mature trees before and 4 and 8 years after treatment applications. Soil and foliar chemistry variables were also measured. Dolomitic limestone increased Ca, Mg, and pH in the soil Oe horizon. Mg was increased in the B horizon when comparing the plots receiving N with those receiving CaMg. In mature trees, foliar Ca and Mg concentrations were higher in the CaMg and N+CaMg plots than in the reference or N plots; foliar Ca concentration was higher in the N+CaMg plots compared with the CaMg plots, foliar Mg was higher in the CaMg plots than the N+CaMg plots; An-max was maximized due to N+CaMg treatment; Aqe decreased by N addition; and PAR at An-max increased by N or CaMg treatments alone, but the increase was maximized by their combination. No treatment effect was detected on basal areas of living mature trees four or eight years after treatment applications. In seedlings, An-max was increased by N+CaMg addition. The reference plots had an open herbaceous layer, but the plots receiving N had a dense monoculture of common woodfern in the

  8. Beyond Tree Throw: Wind, Water, Rock and the Mechanics of Tree-Driven Bedrock Physical Weathering

    NASA Astrophysics Data System (ADS)

    Marshall, J. A.; Anderson, R. S.; Dawson, T. E.; Dietrich, W. E.; Minear, J. T.

    2017-12-01

    Tree throw is often invoked as the dominant process in converting bedrock to soil and thus helping to build the Critical Zone (CZ). In addition, observations of tree roots lifting sidewalk slabs, occupying cracks, and prying slabs of rock from cliff faces have led to a general belief in the power of plant growth forces. These common observations have led to conceptual models with trees at the center of the soil genesis process. This is despite the observation that tree throw is rare in many forested settings, and a dearth of field measurements that quantify the magnitude of growth forces. While few trees blow down, every tree grows roots, inserting many tens of percent of its mass below ground. Yet we lack data quantifying the role of trees in both damaging bedrock and detaching it (and thus producing soil). By combining force measurements at the tree-bedrock interface with precipitation, solar radiation, wind speed, and wind-driven tree sway data we quantified the magnitude and frequency of tree-driven soil-production mechanisms from two contrasting climatic and lithologic regimes (Boulder and Eel Creek CZ Observatories). Preliminary data suggests that in settings with relatively thin soils, trees can damage and detach rock due to diurnal fluctuations, wind response and rainfall events. Surprisingly, our data suggests that forces from roots and trunks growing against bedrock are insufficient to pry rock apart or damage bedrock although much more work is needed in this area. The frequency, magnitude and style of wind-driven tree forces at the bedrock interface varies considerably from one species to another. This suggests that tree properties such as mass, elasticity, stiffness and branch structure determine whether trees respond to gusts big or small, move at the same frequency as large wind gusts, or are able to self-dampen near-ground sway response to extended wind forces. Our measurements of precipitation-driven and daily fluctuations in root pressures exerted on

  9. Using GA-Ridge regression to select hydro-geological parameters influencing groundwater pollution vulnerability.

    PubMed

    Ahn, Jae Joon; Kim, Young Min; Yoo, Keunje; Park, Joonhong; Oh, Kyong Joo

    2012-11-01

    For groundwater conservation and management, it is important to accurately assess groundwater pollution vulnerability. This study proposed an integrated model using ridge regression and a genetic algorithm (GA) to effectively select the major hydro-geological parameters influencing groundwater pollution vulnerability in an aquifer. The GA-Ridge regression method determined that depth to water, net recharge, topography, and the impact of vadose zone media were the hydro-geological parameters that influenced trichloroethene pollution vulnerability in a Korean aquifer. When using these selected hydro-geological parameters, the accuracy was improved for various statistical nonlinear and artificial intelligence (AI) techniques, such as multinomial logistic regression, decision trees, artificial neural networks, and case-based reasoning. These results provide a proof of concept that the GA-Ridge regression is effective at determining influential hydro-geological parameters for the pollution vulnerability of an aquifer, and in turn, improves the AI performance in assessing groundwater pollution vulnerability.

  10. Landslide susceptibility mapping using decision-tree based CHi-squared automatic interaction detection (CHAID) and Logistic regression (LR) integration

    NASA Astrophysics Data System (ADS)

    Althuwaynee, Omar F.; Pradhan, Biswajeet; Ahmad, Noordin

    2014-06-01

    This article uses a methodology based on chi-squared automatic interaction detection (CHAID), a multivariate method with an automatic classification capacity, to analyse large numbers of landslide conditioning factors. This new algorithm was developed to overcome the subjectivity of the manual categorization of scale data of landslide conditioning factors, and to produce a rainfall-induced landslide susceptibility map for Kuala Lumpur city and surrounding areas using a geographic information system (GIS). The main objective of this article is to use the CHAID method to find the best classification fit for each conditioning factor and then combine it with logistic regression (LR). The LR model was used to find the corresponding coefficients of the best-fitting function that assesses the optimal terminal nodes. A cluster pattern of landslide locations was extracted in a previous study using the nearest neighbor index (NNI), which was then used to identify the range of clustered landslide locations. Clustered locations were used as model training data with 14 landslide conditioning factors, such as topographically derived parameters, lithology, NDVI, and land use and land cover maps. The Pearson chi-squared value was used to find the best classification fit between the dependent variable and the conditioning factors. Finally, the relationships between conditioning factors were assessed and the landslide susceptibility map (LSM) was produced. The area under the curve (AUC) was used to test the model's reliability and prediction capability with the training and validation landslide locations, respectively. This study proved the efficiency and reliability of the decision tree (DT) model in landslide susceptibility mapping. It also provided a valuable scientific basis for spatial decision making in planning and urban management studies.

  11. GIS-based groundwater potential analysis using novel ensemble weights-of-evidence with logistic regression and functional tree models.

    PubMed

    Chen, Wei; Li, Hui; Hou, Enke; Wang, Shengquan; Wang, Guirong; Panahi, Mahdi; Li, Tao; Peng, Tao; Guo, Chen; Niu, Chao; Xiao, Lele; Wang, Jiale; Xie, Xiaoshen; Ahmad, Baharin Bin

    2018-09-01

    The aim of the current study was to produce groundwater spring potential maps using novel ensemble weights-of-evidence (WoE) with logistic regression (LR) and functional tree (FT) models. First, a total of 66 springs were identified by field surveys, out of which 70% of the spring locations were used for training the models and 30% of the spring locations were employed for the validation process. Second, a total of 14 affecting factors including aspect, altitude, slope, plan curvature, profile curvature, stream power index (SPI), topographic wetness index (TWI), sediment transport index (STI), lithology, normalized difference vegetation index (NDVI), land use, soil, distance to roads, and distance to streams was used to analyze the spatial relationship between these affecting factors and spring occurrences. Multicollinearity analysis and feature selection of the correlation attribute evaluation (CAE) method were employed to optimize the affecting factors. Subsequently, the novel ensembles of the WoE, LR, and FT models were constructed using the training dataset. Finally, the receiver operating characteristic (ROC) curves, standard error, confidence interval (CI) at 95%, and significance level P were employed to validate and compare the performance of three models. Overall, all three models performed well for groundwater spring potential evaluation. The prediction capability of the FT model, with the highest AUC values, the smallest standard errors, the narrowest CIs, and the smallest P values for the training and validation datasets, is better compared to those of other models. The groundwater spring potential maps can be adopted for the management of water resources and land use by planners and engineers. Copyright © 2018 Elsevier B.V. All rights reserved.
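
The abstract above ranks models by the area under the ROC curve (AUC). As a minimal sketch of what that number measures, the Python function below computes AUC as the fraction of (positive, negative) pairs in which the positive case receives the higher score, counting ties as half. The function name and the scores are hypothetical; the paper does not say how its AUC values were computed, so this only illustrates the metric.

```python
def auc_from_scores(pos_scores, neg_scores):
    """Return AUC as the share of positive/negative pairs ranked correctly.

    A pair counts 1 when the positive score is higher, 0.5 on a tie,
    and 0 otherwise (the Mann-Whitney formulation of AUC).
    """
    if not pos_scores or not neg_scores:
        raise ValueError("need at least one positive and one negative score")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical model scores for validation cells with and without a spring.
spring_scores = [0.9, 0.8]
non_spring_scores = [0.1, 0.85]

print(auc_from_scores(spring_scores, non_spring_scores))
```

The double loop is quadratic in the number of cases, which is acceptable for a few dozen spring locations; a rank-based formula would be preferable for large rasters.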

  12. Can tree species diversity be assessed with Landsat data in a temperate forest?

    PubMed

    Arekhi, Maliheh; Yılmaz, Osman Yalçın; Yılmaz, Hatice; Akyüz, Yaşar Feyza

    2017-10-28

    The diversity of forest trees as an indicator of ecosystem health can be assessed using the spectral characteristics of plant communities through remote sensing data. The objectives of this study were to investigate alpha and beta tree diversity using Landsat data for six dates in the Gönen dam watershed of Turkey. We used richness and the Shannon and Simpson diversity indices to calculate tree alpha diversity. We also represented the relationship between beta diversity and remotely sensed data using species composition similarity and spectral distance similarity of sampling plots via quantile regression. A total of 99 sampling units, each 20 m × 20 m, were selected using a geographically stratified random sampling method. Within each plot, the tree species were identified, and all of the trees with a diameter at breast height (dbh) larger than 7 cm were measured. Presence/absence and abundance data (tree species number and tree species basal area) of tree species were used to determine the relationship between richness and the Shannon and Simpson diversity indices, which were computed with ground field data, and spectral variables derived (2 × 2 pixels and 3 × 3 pixels) from Landsat 8 OLI data. The Shannon-Wiener index had the highest correlation. For all six dates, NDVI (normalized difference vegetation index) was the spectral variable most strongly correlated with the Shannon index and the tree diversity variables. The ratio of green to red (VI) was the spectral variable least correlated with the tree diversity variables and the Shannon basal area. In both beta diversity curves, the slope of the OLS regression was low, while in the upper quantile, it was approximately twice that of the lower quantiles. The Jaccard index is close to one with little difference in both beta diversity approaches. This result is due to increasing the similarity between the sampling plots when they are located close to each other. The intercept differences between two
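
For readers unfamiliar with the indices named in this abstract, the following Python sketch shows one common way to compute the Shannon index, the Simpson index, and the Jaccard similarity from plot data. The function names and the plot values are hypothetical, and the Simpson index is written in its Gini-Simpson form (1 minus the sum of squared proportions), which is an assumption because the abstract does not say which variant was used.

```python
import math


def shannon_index(abundances):
    """Shannon index H = -sum(p * ln p) over species with non-zero abundance."""
    total = sum(abundances)
    return -sum((n / total) * math.log(n / total) for n in abundances if n > 0)


def simpson_index(abundances):
    """Gini-Simpson index 1 - sum(p**2); assumed variant, see lead-in."""
    total = sum(abundances)
    return 1.0 - sum((n / total) ** 2 for n in abundances)


def jaccard_similarity(species_a, species_b):
    """Shared species divided by all species present in either plot."""
    a, b = set(species_a), set(species_b)
    union = a | b
    # Two empty plots are treated as identical by convention here.
    return len(a & b) / len(union) if union else 1.0


# Hypothetical stem counts per species in one 20 m x 20 m plot.
plot_counts = [10, 10]
print(shannon_index(plot_counts))
print(simpson_index(plot_counts))

# Hypothetical species lists for two plots (presence/absence data).
print(jaccard_similarity({"oak", "beech"}, {"beech", "pine"}))
```

The same functions accept basal area per species in place of stem counts, since only the proportions matter.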

  13. Static terrestrial laser scanning of juvenile understory trees for field phenotyping

    NASA Astrophysics Data System (ADS)

    Wang, Huanhuan; Lin, Yi

    2014-11-01

    This study attempted to apply the cutting-edge 3D remote sensing technique of static terrestrial laser scanning (TLS) to the parametric 3D reconstruction of juvenile understory trees. The test data were collected with a Leica HDS6100 TLS system in single-scan mode. The geometrical structures of juvenile understory trees are extracted by model fitting. Cones are used to model trunks and branches. Principal component analysis (PCA) is adopted to calculate their major axes. Coordinate transformation and orthogonal projection are used to estimate the parameters of the cones. Then, AutoCAD is utilized to simulate the morphological characteristics of the understory trees, and to add secondary branches and leaves in a random way. Comparison of the reference values and the estimated values gives the regression equation and shows that the proposed parameter-extraction algorithm is credible. The results have basically verified the applicability of TLS for field phenotyping of juvenile understory trees.

  14. ETE: a python Environment for Tree Exploration.

    PubMed

    Huerta-Cepas, Jaime; Dopazo, Joaquín; Gabaldón, Toni

    2010-01-13

    Many bioinformatics analyses, ranging from gene clustering to phylogenetics, produce hierarchical trees as their main result. These are used to represent the relationships among different biological entities, thus facilitating their analysis and interpretation. A number of standalone programs are available that focus on tree visualization or that perform specific analyses on them. However, such applications are rarely suitable for large-scale surveys, in which a higher level of automation is required. Currently, many genome-wide analyses rely on tree-like data representation and hence there is a growing need for scalable tools to handle tree structures at large scale. Here we present the Environment for Tree Exploration (ETE), a python programming toolkit that assists in the automated manipulation, analysis and visualization of hierarchical trees. ETE libraries provide a broad set of tree handling options as well as specific methods to analyze phylogenetic and clustering trees. Among other features, ETE allows for the independent analysis of tree partitions, has support for the extended newick format, provides an integrated node annotation system and permits linking trees to external data such as multiple sequence alignments or numerical arrays. In addition, ETE implements a number of built-in analytical tools, including phylogeny-based orthology prediction and cluster validation techniques. Finally, ETE's programmable tree drawing engine can be used to automate the graphical rendering of trees with customized node-specific visualizations. ETE provides a complete set of methods to manipulate tree data structures that extends current functionality in other bioinformatic toolkits of a more general purpose. ETE is free software and can be downloaded from http://ete.cgenomics.org.

  15. ETE: a python Environment for Tree Exploration

    PubMed Central

    2010-01-01

    Background: Many bioinformatics analyses, ranging from gene clustering to phylogenetics, produce hierarchical trees as their main result. These are used to represent the relationships among different biological entities, thus facilitating their analysis and interpretation. A number of standalone programs are available that focus on tree visualization or that perform specific analyses on them. However, such applications are rarely suitable for large-scale surveys, in which a higher level of automation is required. Currently, many genome-wide analyses rely on tree-like data representation and hence there is a growing need for scalable tools to handle tree structures at large scale. Results: Here we present the Environment for Tree Exploration (ETE), a python programming toolkit that assists in the automated manipulation, analysis and visualization of hierarchical trees. ETE libraries provide a broad set of tree handling options as well as specific methods to analyze phylogenetic and clustering trees. Among other features, ETE allows for the independent analysis of tree partitions, has support for the extended newick format, provides an integrated node annotation system and permits linking trees to external data such as multiple sequence alignments or numerical arrays. In addition, ETE implements a number of built-in analytical tools, including phylogeny-based orthology prediction and cluster validation techniques. Finally, ETE's programmable tree drawing engine can be used to automate the graphical rendering of trees with customized node-specific visualizations. Conclusions: ETE provides a complete set of methods to manipulate tree data structures that extends current functionality in other bioinformatic toolkits of a more general purpose. ETE is free software and can be downloaded from http://ete.cgenomics.org. PMID:20070885

  16. Random forests and stochastic gradient boosting for predicting tree canopy cover: Comparing tuning processes and model performance

    Treesearch

    E. Freeman; G. Moisen; J. Coulston; B. Wilson

    2014-01-01

    Random forests (RF) and stochastic gradient boosting (SGB), both involving an ensemble of classification and regression trees, are compared for modeling tree canopy cover for the 2011 National Land Cover Database (NLCD). The objectives of this study were twofold. First, sensitivity of RF and SGB to choices in tuning parameters was explored. Second, performance of the...

  17. Multiple tree-ring isotopes as environmental indicators of diffuse atmospheric pollution in a peri-urban area

    NASA Astrophysics Data System (ADS)

    Doucet, A.; Savard, M. M.; Bégin, C.; Ouarda, T. B.; Marion, J.

    2010-12-01

    The combined analyses of tree-ring δ13C, δ18O, δ15N, 206Pb/207Pb, 206Pb/204Pb and 206Pb/208Pb isotope ratios of three red spruce specimens from the Tantaré ecological reserve located 40 km northwest of Québec City (Canada) were studied with the aim of reconstructing environmental conditions and unravelling past air-quality changes of the 1880-2007 period. To separate the tree-ring δ18O and δ13C patterns induced by natural conditions from those generated by anthropogenic perturbations, a linear regression was applied between the most explicative meteorological parameters and the isotopic series for the period of low pollution (1880 to 1909). The model equations were then applied to the most recent part of the series (1910-2007) to verify if climatic conditions have remained the main driver of the tree-ring isotopic variations. The good fit between the modeled and measured δ18O series for the entire studied period suggests that the assimilation of oxygen by red spruce trees is not significantly affected by pollution stress near Québec City. However, the deviation between the measured and modeled δ13C values for the 1944-2007 period indicates that diffuse pollution affected carbon assimilation by the investigated trees. To independently validate if atmospheric pollution could have generated the deviation between the measured and the estimated δ13C values, a linear regression was applied between the portion of the residual δ13C values and atmospheric pollution (Canadian fossil fuel proxy from 1958 to 2000). The nice fit between the modeled δ13C values from the combination of the two regression analyses based on climate and emission proxy strongly supports the hypothesis that there is a natural and an anthropogenic portion in the δ13C variations of the studied specimens. The short-term variations of the red spruce δ15N series are correlated with the instrumentally measured amounts of provincial N emissions for the 1990 to 2006 period (longest measurements

  18. Estimating the prevalence of 26 health-related indicators at neighbourhood level in the Netherlands using structured additive regression.

    PubMed

    van de Kassteele, Jan; Zwakhals, Laurens; Breugelmans, Oscar; Ameling, Caroline; van den Brink, Carolien

    2017-07-01

    Local policy makers increasingly need information on health-related indicators at smaller geographic levels like districts or neighbourhoods. Although more large data sources have become available, direct estimates of the prevalence of a health-related indicator cannot be produced for neighbourhoods for which only small samples or no samples are available. Small area estimation provides a solution, but unit-level models for binary-valued outcomes that can handle both non-linear effects of the predictors and spatially correlated random effects in a unified framework are rarely encountered. We used data on 26 binary-valued health-related indicators collected on 387,195 persons in the Netherlands. We associated the health-related indicators at the individual level with a set of 12 predictors obtained from national registry data. We formulated a structured additive regression model for small area estimation. The model captured potential non-linear relations between the predictors and the outcome through additive terms in a functional form using penalized splines and included a term that accounted for spatially correlated heterogeneity between neighbourhoods. The registry data were used to predict individual outcomes which in turn are aggregated into higher geographical levels, i.e. neighbourhoods. We validated our method by comparing the estimated prevalences with observed prevalences at the individual level and by comparing the estimated prevalences with direct estimates obtained by weighting methods at municipality level. We estimated the prevalence of the 26 health-related indicators for 415 municipalities, 2599 districts and 11,432 neighbourhoods in the Netherlands. We illustrate our method on overweight data and show that there are distinct geographic patterns in the overweight prevalence. Calibration plots show that the estimated prevalences agree very well with observed prevalences at the individual level. The estimated prevalences agree reasonably well with the

  19. The space of ultrametric phylogenetic trees.

    PubMed

    Gavryushkin, Alex; Drummond, Alexei J

    2016-08-21

    The reliability of a phylogenetic inference method from genomic sequence data is ensured by its statistical consistency. Bayesian inference methods produce a sample of phylogenetic trees from the posterior distribution given sequence data. Hence the question of statistical consistency of such methods is equivalent to the consistency of the summary of the sample. More generally, statistical consistency is ensured by the tree space used to analyse the sample. In this paper, we consider two standard parameterisations of phylogenetic time-trees used in evolutionary models: inter-coalescent interval lengths and absolute times of divergence events. For each of these parameterisations we introduce a natural metric space on ultrametric phylogenetic trees. We compare the introduced spaces with existing models of tree space and formulate several formal requirements that a metric space on phylogenetic trees must possess in order to be a satisfactory space for statistical analysis, and justify them. We show that only a few known constructions of the space of phylogenetic trees satisfy these requirements. However, our results suggest that these basic requirements are not enough to distinguish between the two metric spaces we introduce and that the choice between metric spaces requires additional properties to be considered. In particular, the summary tree minimising the square distance to the trees from the sample might be different for different parameterisations. This suggests that further fundamental insight is needed into the problem of statistical consistency of phylogenetic inference methods. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.

  20. PARAMETRIC AND NON PARAMETRIC (MARS: MULTIVARIATE ADDITIVE REGRESSION SPLINES) LOGISTIC REGRESSIONS FOR PREDICTION OF A DICHOTOMOUS RESPONSE VARIABLE WITH AN EXAMPLE FOR PRESENCE/ABSENCE OF AMPHIBIANS

    EPA Science Inventory

    The purpose of this report is to provide a reference manual that could be used by investigators for making informed use of logistic regression using two methods (standard logistic regression and MARS). The details for analyses of relationships between a dependent binary response ...

  1. ColorTree: a batch customization tool for phylogenic trees

    PubMed Central

    Chen, Wei-Hua; Lercher, Martin J

    2009-01-01

    Background: Genome sequencing projects and comparative genomics studies typically aim to trace the evolutionary history of large gene sets, often requiring human inspection of hundreds of phylogenetic trees. If trees are checked for compatibility with an explicit null hypothesis (e.g., the monophyly of certain groups), this daunting task is greatly facilitated by an appropriate coloring scheme. Findings: In this note, we introduce ColorTree, a simple yet powerful batch customization tool for phylogenic trees. Based on pattern matching rules, ColorTree applies a set of customizations to an input tree file, e.g., coloring labels or branches. The customized trees are saved to an output file, which can then be viewed and further edited by Dendroscope (a freely available tree viewer). ColorTree runs on any Perl installation as a stand-alone command line tool, and its application can thus be easily automated. This way, hundreds of phylogenic trees can be customized for easy visual inspection in a matter of minutes. Conclusion: ColorTree allows efficient and flexible visual customization of large tree sets through the application of a user-supplied configuration file to multiple tree files. PMID:19646243

  2. ColorTree: a batch customization tool for phylogenic trees.

    PubMed

    Chen, Wei-Hua; Lercher, Martin J

    2009-07-31

    Genome sequencing projects and comparative genomics studies typically aim to trace the evolutionary history of large gene sets, often requiring human inspection of hundreds of phylogenetic trees. If trees are checked for compatibility with an explicit null hypothesis (e.g., the monophyly of certain groups), this daunting task is greatly facilitated by an appropriate coloring scheme. In this note, we introduce ColorTree, a simple yet powerful batch customization tool for phylogenic trees. Based on pattern matching rules, ColorTree applies a set of customizations to an input tree file, e.g., coloring labels or branches. The customized trees are saved to an output file, which can then be viewed and further edited by Dendroscope (a freely available tree viewer). ColorTree runs on any Perl installation as a stand-alone command line tool, and its application can thus be easily automated. This way, hundreds of phylogenic trees can be customized for easy visual inspection in a matter of minutes. ColorTree allows efficient and flexible visual customization of large tree sets through the application of a user-supplied configuration file to multiple tree files.

  3. Understanding how roadside concentrations of NOx are influenced by the background levels, traffic density, and meteorological conditions using Boosted Regression Trees

    NASA Astrophysics Data System (ADS)

    Sayegh, Arwa; Tate, James E.; Ropkins, Karl

    2016-02-01

    Oxides of nitrogen (NOx) are a major component of photochemical smog, and their constituents are considered principal traffic-related pollutants affecting human health. This study investigates the influence of background concentrations of NOx, traffic density, and prevailing meteorological conditions on roadside concentrations of NOx at UK urban, open motorway, and motorway tunnel sites using the statistical approach Boosted Regression Trees (BRT). BRT models have been fitted using hourly concentration, traffic, and meteorological data for each site. The models predict, rank, and visualise the relationship between model variables and roadside NOx concentrations. A strong relationship between roadside NOx and monitored local background concentrations is demonstrated. Relationships between roadside NOx and other model variables have been shown to be strongly influenced by the quality and resolution of background concentrations of NOx, i.e. whether they were based on monitored data or modelled predictions. The paper proposes a direct method of using site-specific fundamental diagrams for splitting traffic data into four traffic states: free-flow, busy-flow, congested, and severely congested. Using BRT models, the density of traffic (vehicles per kilometre) was observed to have a proportional influence on the concentrations of roadside NOx, with different fitted regression line slopes for the different traffic states. When other influences are conditioned out, the relationship between roadside concentrations and ambient air temperature suggests NOx concentrations reach a minimum at around 22 °C, with high concentrations at low ambient air temperatures, which could be associated with restricted atmospheric dispersion and/or with changes in road traffic exhaust emission characteristics at low ambient air temperatures. This paper uses BRT models to study how different critical factors, and their relative importance, influence the variation of roadside NOx concentrations. The paper

  4. bcgTree: automatized phylogenetic tree building from bacterial core genomes.

    PubMed

    Ankenbrand, Markus J; Keller, Alexander

    2016-10-01

    The need for multi-gene analyses in scientific fields such as phylogenetics and DNA barcoding has increased in recent years. In particular, these approaches are increasingly important for differentiating bacterial species, where reliance on the standard 16S rDNA marker can result in poor resolution. Additionally, the assembly of bacterial genomes has become a standard task due to advances in next-generation sequencing technologies. We created a bioinformatic pipeline, bcgTree, which uses assembled bacterial genomes, either from databases or from the user's own sequencing results, to reconstruct their phylogenetic history. The pipeline automatically extracts 107 essential single-copy core genes, found in a majority of bacteria, using hidden Markov models and performs a partitioned maximum-likelihood analysis. Here, we describe the workflow of bcgTree and, as a proof-of-concept, its usefulness in resolving the phylogeny of 293 publicly available bacterial strains of the genus Lactobacillus. We also evaluate its performance in both low- and high-level taxonomy test sets. The tool is freely available at GitHub ( https://github.com/iimog/bcgTree ) and our institutional homepage ( http://www.dna-analytics.biozentrum.uni-wuerzburg.de ).

  5. Influence of meteorological variables on rainfall partitioning for deciduous and coniferous tree species in urban area

    NASA Astrophysics Data System (ADS)

    Zabret, Katarina; Rakovec, Jože; Šraj, Mojca

    2018-03-01

    Rainfall partitioning is an important part of the ecohydrological cycle, influenced by numerous variables. Rainfall partitioning for pine (Pinus nigra Arnold) and birch (Betula pendula Roth.) trees was measured from January 2014 to June 2017 in an urban area of Ljubljana, Slovenia. 180 events from more than three years of observations were analyzed, focusing on 13 meteorological variables, including the number of raindrops, their diameter, and velocity. Regression tree and boosted regression tree analyses were performed to evaluate the influence of the variables on rainfall interception loss, throughfall, and stemflow in different phenoseasons. The amount of rainfall was recognized as the most influential variable, followed by rainfall intensity and the number of raindrops. Higher rainfall amount, intensity, and number of drops decreased the percentage of rainfall interception loss. Rainfall amount and intensity were the most influential on interception loss by birch and pine trees during the leafed and leafless periods, respectively. Lower wind speed was found to increase throughfall, whereas wind direction had no significant influence. Consideration of drop size spectrum properties proved to be important, since the number of drops, drop diameter, and median volume diameter were often recognized as important influential variables.

  6. Automatic localization of bifurcations and vessel crossings in digital fundus photographs using location regression

    NASA Astrophysics Data System (ADS)

    Niemeijer, Meindert; Dumitrescu, Alina V.; van Ginneken, Bram; Abrámoff, Michael D.

    2011-03-01

    Parameters extracted from the vasculature on the retina are correlated with various conditions such as diabetic retinopathy and cardiovascular diseases such as stroke. Segmentation of the vasculature on the retina has been a topic that has received much attention in the literature over the past decade. Analysis of the segmentation result, however, has only received limited attention with most works describing methods to accurately measure the width of the vessels. Analyzing the connectedness of the vascular network is an important step towards the characterization of the complete vascular tree. The retinal vascular tree, from an image interpretation point of view, originates at the optic disc and spreads out over the retina. The tree bifurcates and the vessels also cross each other. The points where this happens form the key to determining the connectedness of the complete tree. We present a supervised method to detect the bifurcations and crossing points of the vasculature of the retina. The method uses features extracted from the vasculature as well as the image in a location regression approach to find those locations of the segmented vascular tree where the bifurcation or crossing occurs (from here, POI, points of interest). We evaluate the method on the publicly available DRIVE database in which an ophthalmologist has marked the POI.

  7. How tree roots respond to drought

    PubMed Central

    Brunner, Ivano; Herzog, Claude; Dawes, Melissa A.; Arend, Matthias; Sperisen, Christoph

    2015-01-01

    The ongoing climate change is characterized by increased temperatures and altered precipitation patterns. In addition, there has been an increase in both the frequency and intensity of extreme climatic events such as drought. Episodes of drought induce a series of interconnected effects, all of which have the potential to alter the carbon balance of forest ecosystems profoundly at different scales of plant organization and ecosystem functioning. During recent years, considerable progress has been made in the understanding of how aboveground parts of trees respond to drought and how these responses affect carbon assimilation. In contrast, processes of belowground parts are relatively underrepresented in research on climate change. In this review, we describe current knowledge about responses of tree roots to drought. Tree roots are capable of responding to drought through a variety of strategies that enable them to avoid and tolerate stress. Responses include root biomass adjustments, anatomical alterations, and physiological acclimations. The molecular mechanisms underlying these responses are characterized to some extent, and involve stress signaling and the induction of numerous genes, leading to the activation of tolerance pathways. In addition, mycorrhizas seem to play important protective roles. The current knowledge compiled in this review supports the view that tree roots are well equipped to withstand drought situations and maintain morphological and physiological functions as long as possible. Further, the reviewed literature demonstrates the important role of tree roots in the functioning of forest ecosystems and highlights the need for more research in this emerging field. PMID:26284083

  8. Identifying the critical success factors in the coverage of low vision services using the classification analysis and regression tree methodology.

    PubMed

    Chiang, Peggy Pei-Chia; Xie, Jing; Keeffe, Jill Elizabeth

    2011-04-25

    To identify the critical success factors (CSF) associated with coverage of low vision services. Data were collected from a survey distributed to Vision 2020 contacts, government, and non-government organizations (NGOs) in 195 countries. The Classification and Regression Tree Analysis (CART) was used to identify the critical success factors of low vision service coverage. Independent variables were sourced from the survey: policies, epidemiology, provision of services, equipment and infrastructure, barriers to services, human resources, and monitoring and evaluation. Socioeconomic and demographic independent variables: health expenditure, population statistics, development status, and human resources in general, were sourced from the World Health Organization (WHO), World Bank, and the United Nations (UN). The findings identified that having >50% of children obtaining devices when prescribed (χ² = 44; P < 0.000), multidisciplinary care (χ² = 14.54; P = 0.002), >3 rehabilitation workers per 10 million of population (χ² = 4.50; P = 0.034), higher percentage of population urbanized (χ² = 14.54; P = 0.002), a level of private investment (χ² = 14.55; P = 0.015), and being fully funded by government (χ² = 6.02; P = 0.014), are critical success factors associated with coverage of low vision services. This study identified the most important predictors for countries with better low vision coverage. The CART is a useful and suitable methodology in survey research and is a novel way to simplify a complex global public health issue in eye care.

  9. Estimating forest crown area removed by selection cutting: a linked regression-GIS approach based on stump diameters

    USGS Publications Warehouse

    Anderson, S.C.; Kupfer, J.A.; Wilson, R.R.; Cooper, R.J.

    2000-01-01

    The purpose of this research was to develop a model that could be used to provide a spatial representation of the effects of uneven-aged silvicultural treatments on forest crown area. We began by developing species-specific linear regression equations relating tree DBH to crown area for eight bottomland tree species at White River National Wildlife Refuge, Arkansas, USA. The relationships were highly significant for all species, with coefficients of determination (r²) ranging from 0.37 for Ulmus crassifolia to nearly 0.80 for Quercus nuttallii and Taxodium distichum. We next located and measured the diameters of more than 4000 stumps from a single tree-group selection timber harvest. Stump locations were recorded with respect to an established grid point system and entered into a Geographic Information System (ARC/INFO). The area occupied by the crown of each logged individual was then estimated by using the stump dimensions (adjusted to DBHs) and the regression equations relating tree DBH to crown area. Our model projected that the selection cuts removed roughly 300 m² of basal area from the logged sites, resulting in the loss of approximately 55 000 m² of crown area. The model developed in this research represents a tool that can be used in conjunction with remote sensing applications to assist in forest inventory and management, as well as to estimate the impacts of selective timber harvest on wildlife.

  10. Biomass equations for major tree species of the Northeast

    Treesearch

    Louise M. Tritton; James W. Hornbeck

    1982-01-01

    Regression equations are used in both forestry and ecosystem studies to estimate tree biomass from field measurements of dbh (diameter at breast height) or a combination of dbh and height. Literature on biomass is reviewed, and 178 sets of published equations for 25 species common to the Northeastern United States are listed. On the basis of these equations, estimates of...

  11. Association between split selection instability and predictive error in survival trees.

    PubMed

    Radespiel-Tröger, M; Gefeller, O; Rabenstein, T; Hothorn, T

    2006-01-01

    To evaluate split selection instability in six survival tree algorithms and its relationship with predictive error by means of a bootstrap study. We study the following algorithms: logrank statistic with multivariate p-value adjustment without pruning (LR), Kaplan-Meier distance of survival curves (KM), martingale residuals (MR), Poisson regression for censored data (PR), within-node impurity (WI), and exponential log-likelihood loss (XL). With the exception of LR, initial trees are pruned by using split-complexity, and final trees are selected by means of cross-validation. We employ a real dataset from a clinical study of patients with gallbladder stones. The predictive error is evaluated using the integrated Brier score for censored data. The relationship between split selection instability and predictive error is evaluated by means of box-percentile plots, covariate and cutpoint selection entropy, and cutpoint selection coefficients of variation, respectively, in the root node. We found a positive association between covariate selection instability and predictive error in the root node. LR yields the lowest predictive error, while KM and MR yield the highest predictive error. The predictive error of survival trees is related to split selection instability. Based on the low predictive error of LR, we recommend the use of this algorithm for the construction of survival trees. Unpruned survival trees with multivariate p-value adjustment can perform equally well compared to pruned trees. The analysis of split selection instability can be used to communicate the results of tree-based analyses to clinicians and to support the application of survival trees.

  12. Strengthening the Regression Discontinuity Design Using Additional Design Elements: A Within-Study Comparison

    ERIC Educational Resources Information Center

    Wing, Coady; Cook, Thomas D.

    2013-01-01

    The sharp regression discontinuity design (RDD) has three key weaknesses compared to the randomized clinical trial (RCT). It has lower statistical power, it is more dependent on statistical modeling assumptions, and its treatment effect estimates are limited to the narrow subpopulation of cases immediately around the cutoff, which is rarely of…

  13. Modeling non-linear growth responses to temperature and hydrology in wetland trees

    NASA Astrophysics Data System (ADS)

    Keim, R.; Allen, S. T.

    2016-12-01

    Growth responses of wetland trees to flooding and climate variations are difficult to model because they depend on multiple, apparently interacting factors, but are a critical link in hydrological control of wetland carbon budgets. To more generally understand tree growth responses to hydrological forcing, we modeled non-linear responses of tree ring growth to flooding and climate at sub-annual time steps, using Vaganov-Shashkin response functions. We calibrated the model to six baldcypress tree-ring chronologies from two hydrologically distinct sites in southern Louisiana, and tested several hypotheses of plasticity in wetland tree responses to interacting environmental variables. The model outperformed traditional multiple linear regression. More importantly, optimized response parameters were generally similar among sites with varying hydrological conditions, suggesting generality to the functions. Model forms that included interacting responses to multiple forcing factors were more effective than single response functions, indicating that the principle of a single limiting factor is not correct in wetlands and that both climatic and hydrological variables must be considered in predicting responses to hydrological or climate change.

  14. Tree Colors: Color Schemes for Tree-Structured Data.

    PubMed

    Tennekes, Martijn; de Jonge, Edwin

    2014-12-01

    We present a method to map tree structures to colors from the Hue-Chroma-Luminance color model, which is known for its well balanced perceptual properties. The Tree Colors method can be tuned with several parameters, whose effect on the resulting color schemes is discussed in detail. We provide a free and open source implementation with sensible parameter defaults. Categorical data are very common in statistical graphics, and often these categories form a classification tree. We evaluate applying Tree Colors to tree structured data with a survey on a large group of users from a national statistical institute. Our user study suggests that Tree Colors are useful, not only for improving node-link diagrams, but also for unveiling tree structure in non-hierarchical visualizations.

  15. Generalized linear and generalized additive models in studies of species distributions: Setting the scene

    USGS Publications Warehouse

    Guisan, Antoine; Edwards, T.C.; Hastie, T.

    2002-01-01

    An important statistical development of the last 30 years has been the advance in regression analysis provided by generalized linear models (GLMs) and generalized additive models (GAMs). Here we introduce a series of papers prepared within the framework of an international workshop entitled: Advances in GLMs/GAMs modeling: from species distribution to environmental management, held in Riederalp, Switzerland, 6-11 August 2001. We first discuss some general uses of statistical models in ecology, as well as provide a short review of several key examples of the use of GLMs and GAMs in ecological modeling efforts. We next present an overview of GLMs and GAMs, and discuss some of their related statistics used for predictor selection, model diagnostics, and evaluation. Included is a discussion of several new approaches applicable to GLMs and GAMs, such as ridge regression, an alternative to stepwise selection of predictors, and methods for the identification of interactions by a combined use of regression trees and several other approaches. We close with an overview of the papers and how we feel they advance our understanding of their application to ecological modeling. © 2002 Elsevier Science B.V. All rights reserved.

  16. Tree diversity and the role of non-host neighbour tree species in reducing fungal pathogen infestation

    PubMed Central

    Hantsch, Lydia; Bien, Steffen; Radatz, Stine; Braun, Uwe; Auge, Harald; Bruelheide, Helge

    2014-01-01

    The degree to which plant pathogen infestation occurs in a host plant is expected to be strongly influenced by the level of species diversity among neighbouring host and non-host plant species. Since pathogen infestation can negatively affect host plant performance, it can mediate the effects of local biodiversity on ecosystem functioning. We tested the effects of tree diversity and the proportion of neighbouring host and non-host species with respect to the foliar fungal pathogens of Tilia cordata and Quercus petraea in the Kreinitz tree diversity experiment in Germany. We hypothesized that fungal pathogen richness increases while infestation decreases with increasing local tree diversity. In addition, we tested whether fungal pathogen richness and infestation are dependent on the proportion of host plant species present or on the proportion of particular non-host neighbouring tree species. Leaves of the two target species were sampled across three consecutive years with visible foliar fungal pathogens on the leaf surface being identified macro- and microscopically. Effects of diversity among neighbouring trees were analysed: (i) for total fungal species richness and fungal infestation on host trees and (ii) for infestation by individual fungal species. We detected four and five fungal species on T. cordata and Q. petraea, respectively. High local tree diversity reduced (i) total fungal species richness and infestation of T. cordata and fungal infestation of Q. petraea and (ii) infestation by three host-specialized fungal pathogen species. These effects were brought about by local tree diversity and were independent of host species proportion. In general, host species proportion had almost no effect on fungal species richness and infestation. Strong effects associated with the proportion of particular non-host neighbouring tree species on fungal species richness and infestation were, however, recorded. Synthesis. For the first time, we experimentally

  17. Tree diversity and the role of non-host neighbour tree species in reducing fungal pathogen infestation.

    PubMed

    Hantsch, Lydia; Bien, Steffen; Radatz, Stine; Braun, Uwe; Auge, Harald; Bruelheide, Helge

    2014-11-01

    The degree to which plant pathogen infestation occurs in a host plant is expected to be strongly influenced by the level of species diversity among neighbouring host and non-host plant species. Since pathogen infestation can negatively affect host plant performance, it can mediate the effects of local biodiversity on ecosystem functioning. We tested the effects of tree diversity and the proportion of neighbouring host and non-host species with respect to the foliar fungal pathogens of Tilia cordata and Quercus petraea in the Kreinitz tree diversity experiment in Germany. We hypothesized that fungal pathogen richness increases while infestation decreases with increasing local tree diversity. In addition, we tested whether fungal pathogen richness and infestation are dependent on the proportion of host plant species present or on the proportion of particular non-host neighbouring tree species. Leaves of the two target species were sampled across three consecutive years with visible foliar fungal pathogens on the leaf surface being identified macro- and microscopically. Effects of diversity among neighbouring trees were analysed: (i) for total fungal species richness and fungal infestation on host trees and (ii) for infestation by individual fungal species. We detected four and five fungal species on T. cordata and Q. petraea, respectively. High local tree diversity reduced (i) total fungal species richness and infestation of T. cordata and fungal infestation of Q. petraea and (ii) infestation by three host-specialized fungal pathogen species. These effects were brought about by local tree diversity and were independent of host species proportion. In general, host species proportion had almost no effect on fungal species richness and infestation. Strong effects associated with the proportion of particular non-host neighbouring tree species on fungal species richness and infestation were, however, recorded. Synthesis. For the first time, we experimentally

  18. TreePOD: Sensitivity-Aware Selection of Pareto-Optimal Decision Trees.

    PubMed

    Muhlbacher, Thomas; Linhardt, Lorenz; Moller, Torsten; Piringer, Harald

    2018-01-01

    Balancing accuracy gains with other objectives such as interpretability is a key challenge when building decision trees. However, this process is difficult to automate because it involves know-how about the domain as well as the purpose of the model. This paper presents TreePOD, a new approach for sensitivity-aware model selection along trade-offs. TreePOD is based on exploring a large set of candidate trees generated by sampling the parameters of tree construction algorithms. Based on this set, visualizations of quantitative and qualitative tree aspects provide a comprehensive overview of possible tree characteristics. Along trade-offs between two objectives, TreePOD provides efficient selection guidance by focusing on Pareto-optimal tree candidates. TreePOD also conveys the sensitivities of tree characteristics to variations of selected parameters by extending the tree generation process with a full-factorial sampling. We demonstrate how TreePOD supports a variety of tasks involved in decision tree selection and describe its integration in a holistic workflow for building and selecting decision trees. For evaluation, we illustrate a case study for predicting critical power grid states, and we report qualitative feedback from domain experts in the energy sector. This feedback suggests that TreePOD enables users with and without a statistical background to identify suitable decision trees confidently and efficiently.
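
    The idea of focusing on Pareto-optimal candidates along a two-objective trade-off can be illustrated with a minimal sketch. This is not TreePOD's implementation: the function name `pareto_front`, the choice of objectives (maximise accuracy, minimise leaf count), and the candidate values are all hypothetical and made up for illustration.

```python
def pareto_front(candidates):
    """Return the candidates that no other candidate dominates.

    Each candidate is a tuple (name, accuracy, n_leaves). Higher accuracy
    and fewer leaves are assumed to be better (illustrative objectives).
    """
    def dominates(a, b):
        # a dominates b if it is at least as good on both objectives
        # and strictly better on at least one of them.
        no_worse = a[1] >= b[1] and a[2] <= b[2]
        strictly_better = a[1] > b[1] or a[2] < b[2]
        return no_worse and strictly_better

    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Hypothetical candidate trees: (name, accuracy, number of leaves).
trees = [
    ("t1", 0.80, 4),
    ("t2", 0.85, 8),
    ("t3", 0.84, 12),   # dominated by t2 (less accurate and larger)
    ("t4", 0.90, 20),
    ("t5", 0.80, 6),    # dominated by t1 (same accuracy, larger)
]
front = pareto_front(trees)
print([name for name, _, _ in front])
```

    The pairwise check is quadratic in the number of candidates, which is adequate for a sketch; a sort-based sweep would likely be preferable for large candidate sets.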

  19. Wrong Signs in Regression Coefficients

    NASA Technical Reports Server (NTRS)

    McGee, Holly

    1999-01-01

    When using parametric cost estimation, it is important to note the possibility of the regression coefficients having the wrong sign. A wrong sign is defined as a sign on the regression coefficient opposite to the researcher's intuition and experience. Some possible causes for the wrong sign discussed in this paper are a small range of x's, leverage points, missing variables, multicollinearity, and computational error. Additionally, techniques for determining the cause of the wrong sign are given.

  20. Calcium addition at the Hubbard Brook Experimental Forest increases the capacity for stress tolerance and carbon capture in red spruce (Picea rubens) trees during the cold season

    Treesearch

    Paul G. Schaberg; Rakesh Minocha; Stephanie Long; Joshua M. Halman; Gary J. Hawley; Christopher Eagar

    2011-01-01

    Red spruce (Picea rubens Sarg.) trees are uniquely vulnerable to foliar freezing injury during the cold season (fall and winter), but are also capable of photosynthetic activity if temperatures moderate. To evaluate the influence of calcium (Ca) addition on the physiology of red spruce during the cold season, we measured concentrations of foliar...

  1. Two Trees: Migrating Fault Trees to Decision Trees for Real Time Fault Detection on International Space Station

    NASA Technical Reports Server (NTRS)

    Lee, Charles; Alena, Richard L.; Robinson, Peter

    2004-01-01

    Starting from an example of ISS fault trees, we present a method for converting fault trees to decision trees. The method shows that visualizing the root cause of a fault becomes easier and that tree manipulation becomes more programmatic via available decision tree programs. The decision tree visualization for diagnostics is straightforward and easy to understand. For ISS real-time fault diagnostics, the status of the systems can be shown by passing the signals through the trees and seeing where they stop. Another advantage of using decision trees is that the trees can learn fault patterns and predict future faults from historical data. Learning is not limited to static data sets but can also be online: by accumulating real-time data sets, the decision trees can acquire and store fault patterns and recognize them when they recur.

  2. The Fault Tree Compiler (FTC): Program and mathematics

    NASA Technical Reports Server (NTRS)

    Butler, Ricky W.; Martensen, Anna L.

    1989-01-01

    The Fault Tree Compiler Program is a new reliability tool used to predict the top-event probability for a fault tree. Five different gate types are allowed in the fault tree: AND, OR, EXCLUSIVE OR, INVERT, and m OF n gates. The high-level input language is easy to understand and use when describing the system tree. In addition, the use of the hierarchical fault tree capability can simplify the tree description and decrease program execution time. The current solution technique provides an answer that is precise (within the limits of double precision floating point arithmetic) to a user-specified number of digits of accuracy. The user may vary one failure rate or failure probability over a range of values and plot the results for sensitivity analyses. The solution technique is implemented in FORTRAN; the remaining program code is implemented in Pascal. The program is written to run on a Digital Equipment Corporation (DEC) VAX computer with the VMS operating system.
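
    The five gate types named above can be sketched as probability combinators. This is a minimal illustration, not the FTC solution technique (which the abstract does not describe): it assumes all gate inputs are statistically independent, it reads "m OF n" as "at least m of the n inputs occur", and the function names and example probabilities are hypothetical.

```python
import math

# Gate probabilities under the assumption of independent inputs.
# These helpers are illustrative; they are not part of the FTC program.

def p_and(ps):
    """All inputs occur."""
    return math.prod(ps)

def p_or(ps):
    """At least one input occurs."""
    return 1.0 - math.prod(1.0 - p for p in ps)

def p_xor(p, q):
    """Exactly one of two inputs occurs."""
    return p * (1.0 - q) + q * (1.0 - p)

def p_invert(p):
    """The input does not occur."""
    return 1.0 - p

def p_m_of_n(m, ps):
    """At least m of the inputs occur (assumed reading of an m OF n gate)."""
    dist = [1.0]  # dist[k] = probability that exactly k inputs have occurred
    for p in ps:
        new = [0.0] * (len(dist) + 1)
        for k, w in enumerate(dist):
            new[k] += w * (1.0 - p)
            new[k + 1] += w * p
        dist = new
    return sum(dist[m:])

# Hypothetical tree: TOP = OR( AND(a, b), 2-of-3(c, d, e) )
top = p_or([p_and([0.1, 0.2]), p_m_of_n(2, [0.1, 0.1, 0.1])])
print(f"top-event probability: {top:.5f}")
```

    The independence assumption breaks down when the same basic event feeds more than one gate, so a real solver has to handle shared events explicitly.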

  3. The relationship between tree canopy and crime rates across an urban-rural gradient in the greater Baltimore region

    Treesearch

    Austin Troy; J. Morgan Grove; Jarlath O' Neill-Dunne

    2012-01-01

    The extent to which urban tree cover influences crime is in debate in the literature. This research took advantage of geocoded crime point data and high resolution tree canopy data to address this question in Baltimore City and County, MD, an area that includes a significant urban-rural gradient. Using ordinary least squares and spatially adjusted regression and...

  4. Bark flammability as a fire-response trait for subalpine trees

    PubMed Central

    Frejaville, Thibaut; Curt, Thomas; Carcaillet, Christopher

    2013-01-01

    Relationships between the flammability properties of a given plant and its chances of survival after a fire still remain unknown. We hypothesize that the bark flammability of a tree reduces the potential for tree survival following surface fires, and that if tree resistance to fire is provided by a thick insulating bark, the latter must be poorly flammable. We test, on subalpine tree species, the relationship between the flammability of bark and its insulating ability, identify the biological traits that determine bark flammability, and assess the relative susceptibility of these species to surface fires from their bark properties. The experimental set of burning properties was analyzed by Principal Component Analysis to assess the bark flammability. Bark insulating ability was expressed by the critical time to cambium kill computed from bark thickness. Log-linear regressions indicated that bark flammability varies with the bark thickness and the density of wood under bark and that the most flammable barks have poor insulating ability. Susceptibility to surface fires increases from gymnosperm to angiosperm subalpine trees. The co-dominant subalpine species Larix decidua (Mill.) and Pinus cembra (L.) exhibit large differences in both flammability and insulating ability of the bark that should partly explain their contrasting responses to fires in the past. PMID:24324473

  5. Regression methods for spatially correlated data: an example using beetle attacks in a seed orchard

    Treesearch

    Preisler Haiganoush; Nancy G. Rappaport; David L. Wood

    1997-01-01

    We present a statistical procedure for studying the simultaneous effects of observed covariates and unmeasured spatial variables on responses of interest. The procedure uses regression type analyses that can be used with existing statistical software packages. An example using the rate of twig beetle attacks on Douglas-fir trees in a seed orchard illustrates the...

  6. Classification and regression trees

    Treesearch

    G. G. Moisen

    2008-01-01

    Frequently, ecologists are interested in exploring ecological relationships, describing patterns and processes, or making spatial or temporal predictions. These purposes often can be addressed by modeling the relationship between some outcome or response and a set of features or explanatory variables.

  7. Phytoforensics—Using trees to find contamination

    USGS Publications Warehouse

    Wilson, Jordan L.

    2017-09-28

    The water we drink, air we breathe, and soil we come into contact with have the potential to adversely affect our health because of contaminants in the environment. Environmental samples can characterize the extent of potential contamination, but traditional methods for collecting water, air, and soil samples below the ground (for example, well drilling or direct-push soil sampling) are expensive and time consuming. Trees are closely connected to the subsurface and sampling tree trunks can indicate subsurface pollutants, a process called phytoforensics. Scientists at the Missouri Water Science Center were among the first to use phytoforensics to screen sites for contamination before using traditional sampling methods, to guide additional sampling, and to show the large cost savings associated with tree sampling compared to traditional methods. 

  8. Increased spruce tree growth in Central Europe since 1960s.

    PubMed

    Cienciala, Emil; Altman, Jan; Doležal, Jiří; Kopáček, Jiří; Štěpánek, Petr; Ståhl, Göran; Tumajer, Jan

    2018-04-01

    Tree growth response to recent environmental changes is of key interest for forest ecology. This study addressed the following questions with respect to Norway spruce (Picea abies, L. Karst.) in Central Europe: Has tree growth accelerated during the last five decades? What are the main environmental drivers of the observed tree radial stem growth and how much variability can be explained by them? Using a nationwide dendrochronological sampling of Norway spruce in the Czech Republic (1246 trees, 266 plots), novel regional tree-ring width chronologies for 40(±10)- and 60(±10)-year old trees were assembled, averaged across three elevation zones (break points at 500 and 700 m). Correspondingly averaged drivers, including temperature, precipitation, nitrogen (N) deposition and ambient CO2 concentration, were used in a general linear model (GLM) to analyze the contribution of these in explaining tree ring width variability for the period from 1961 to 2013. Spruce tree radial stem growth responded strongly to the changing environment in Central Europe during the period, with a mean tree ring width increase of 24 and 32% for the 40- and 60-year old trees, respectively. The indicative General Linear Model analysis identified CO2, precipitation during the vegetation season, spring air temperature (March-May) and N-deposition as the significant covariates of growth, with the latter including interactions with elevation zones. The regression models explained 57% and 55% of the variability in the two tree ring width chronologies, respectively. Growth response to N-deposition showed the highest variability along the elevation gradient with growth stimulation/limitation at sites below/above 700 m. A strong sensitivity of stem growth to CO2 was also indicated, suggesting that the effect of rising ambient CO2 concentration (direct or indirect by increased water use efficiency) should be considered in analyses of long-term growth together with climatic factors and N

  9. Modelling the ecological consequences of whole tree harvest for bioenergy production

    NASA Astrophysics Data System (ADS)

    Skår, Silje; Lange, Holger; Sogn, Trine

    2013-04-01

    There is an increasing demand for energy from biomass as a substitute for fossil fuels worldwide, and the Norwegian government plans to double the production of bioenergy to 9% of the national energy production or to 28 TWh per year by 2020. A large part of this increase may come from forests, which have a great potential with respect to biomass supply as forest growth increasingly has exceeded harvest in the last decades. One feasible option is the utilization of forest residues (needles, twigs and branches) in addition to stems, known as Whole Tree Harvest (WTH). As opposed to WTH, the residues are traditionally left in the forest with Conventional Timber Harvesting (CH). However, the residues contain a large share of the tree's nutrients, indicating that WTH may possibly alter the supply of nutrients and organic matter to the soil and the forest ecosystem. This may potentially lead to reduced tree growth. Other implications can be nutrient imbalance, loss of carbon from the soil and changes in species composition and diversity. This study aims to identify key factors and appropriate strategies for ecologically sustainable WTH in Norway spruce (Picea abies) and Scots pine (Pinus sylvestris) forest stands in Norway. We focus on identifying key factors driving soil organic matter, nutrients, biomass, biodiversity etc. Simulations of the effect on the carbon and nitrogen budget with the two harvesting methods will also be conducted. Data from field trials and long-term manipulation experiments are used to obtain a first overview of key variables. The relationships between the variables are hitherto unknown, but it is by no means obvious that they could be assumed as linear; thus, an ordinary multiple linear regression approach is expected to be insufficient. Here we apply two advanced and highly flexible modelling frameworks which have hardly been used in the context of tree growth, nutrient balances and biomass removal so far: Generalized Additive Models (GAMs) and

  10. CartograTree: connecting tree genomes, phenotypes and environment.

    PubMed

    Vasquez-Gross, Hans A; Yu, John J; Figueroa, Ben; Gessler, Damian D G; Neale, David B; Wegrzyn, Jill L

    2013-05-01

    Today, researchers spend a tremendous amount of time gathering, formatting, filtering and visualizing data collected from disparate sources. Under the umbrella of forest tree biology, we seek to provide a platform and leverage modern technologies to connect biotic and abiotic data. Our goal is to provide an integrated web-based workspace that connects environmental, genomic and phenotypic data via geo-referenced coordinates. Here, we connect the genomic query web-based workspace, DiversiTree and a novel geographical interface called CartograTree to data housed on the TreeGenes database. To accomplish this goal, we implemented Simple Semantic Web Architecture and Protocol to enable the primary genomics database, TreeGenes, to communicate with semantic web services regardless of platform or back-end technologies. The novelty of CartograTree lies in the interactive workspace that allows for geographical visualization and engagement of high performance computing (HPC) resources. The application provides a unique tool set to facilitate research on the ecology, physiology and evolution of forest tree species. CartograTree can be accessed at: http://dendrome.ucdavis.edu/cartogratree. © 2013 Blackwell Publishing Ltd.

  11. Using decision trees to understand structure in missing data

    PubMed Central

    Tierney, Nicholas J; Harden, Fiona A; Harden, Maurice J; Mengersen, Kerrie L

    2015-01-01

    Objectives: Demonstrate the application of decision trees, namely classification and regression trees (CARTs) and their cousins, boosted regression trees (BRTs), to understand structure in missing data. Setting: Data taken from employees at 3 different industrial sites in Australia. Participants: 7915 observations were included. Materials and methods: The approach was evaluated using an occupational health data set comprising results of questionnaires, medical tests and environmental monitoring. Statistical methods included standard statistical tests and the ‘rpart’ and ‘gbm’ packages for CART and BRT analyses, respectively, from the statistical software ‘R’. A simulation study was conducted to explore the capability of decision tree models in describing data with missingness artificially introduced. Results: CART and BRT models were effective in highlighting a missingness structure in the data, related to the type of data (medical or environmental), the site in which it was collected, the number of visits, and the presence of extreme values. The simulation study revealed that CART models were able to identify variables and values responsible for inducing missingness. There was greater variation in variable importance for unstructured as compared to structured missingness. Discussion: Both CART and BRT models were effective in describing structural missingness in data. CART models may be preferred over BRT models for exploratory analysis of missing data, and selecting variables important for predicting missingness. BRT models can show how values of other variables influence missingness, which may prove useful for researchers. Conclusions: Researchers are encouraged to use CART and BRT models to explore and understand missing data. PMID:26124509
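
    The core idea, treating "is this value missing?" as a binary target and letting a tree find which other variables explain it, can be sketched with a single CART-style split. This is a plain-Python stand-in under stated assumptions, not the study's ‘rpart’ or ‘gbm’ analysis: the records, field names (site, visits, lung_test) and helper functions are hypothetical, and the split criterion is weighted Gini impurity.

```python
# Hypothetical records: lung_test is missing (None) for every site-3 record.
records = [
    {"site": 1, "visits": 1, "lung_test": 3.2},
    {"site": 1, "visits": 4, "lung_test": 2.9},
    {"site": 2, "visits": 2, "lung_test": 3.5},
    {"site": 2, "visits": 5, "lung_test": 3.1},
    {"site": 3, "visits": 1, "lung_test": None},
    {"site": 3, "visits": 3, "lung_test": None},
    {"site": 3, "visits": 5, "lung_test": None},
    {"site": 1, "visits": 3, "lung_test": 3.0},
]

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_split(rows, target, features):
    """Return (weighted impurity, feature, threshold) of the best single split."""
    best = None
    for feat in features:
        values = sorted({r[feat] for r in rows})
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [t for r, t in zip(rows, target) if r[feat] <= thr]
            right = [t for r, t in zip(rows, target) if r[feat] > thr]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, feat, thr)
    return best

# Binary target: 1 where the measurement is missing, 0 otherwise.
is_missing = [1 if r["lung_test"] is None else 0 for r in records]
score, feature, threshold = best_split(records, is_missing, ["site", "visits"])
print(f"best split: {feature} <= {threshold} (weighted Gini {score:.2f})")
```

    A full CART would apply the same search recursively to each side of the split, and a boosted model would fit many such trees in sequence.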

  12. Interaction Models for Functional Regression.

    PubMed

    Usset, Joseph; Staicu, Ana-Maria; Maity, Arnab

    2016-02-01

    A functional regression model with a scalar response and multiple functional predictors is proposed that accommodates two-way interactions in addition to their main effects. The proposed estimation procedure models the main effects using penalized regression splines, and the interaction effect by a tensor product basis. Extensions to generalized linear models and data observed on sparse grids or with measurement error are presented. A hypothesis testing procedure for the functional interaction effect is described. The proposed method can be easily implemented through existing software. Numerical studies show that fitting an additive model in the presence of interaction leads to both poor estimation performance and lost prediction power, while fitting an interaction model where there is in fact no interaction leads to negligible losses. The methodology is illustrated on the AneuRisk65 study data.

  13. Tree Nut Allergies

    MedlinePlus

    ... Blog Vision Awards Common Allergens Tree Nut Allergy Tree Nut Allergy Learn about tree nut allergy, how ... a Tree Nut Label card . Allergic Reactions to Tree Nuts Tree nuts can cause a severe and ...

  14. Analyzing and synthesizing phylogenies using tree alignment graphs.

    PubMed

    Smith, Stephen A; Brown, Joseph W; Hinchliff, Cody E

    2013-01-01

    Phylogenetic trees are used to analyze and visualize evolution. However, trees can be imperfect datatypes when summarizing multiple trees. This is especially problematic when accommodating for biological phenomena such as horizontal gene transfer, incomplete lineage sorting, and hybridization, as well as topological conflict between datasets. Additionally, researchers may want to combine information from sets of trees that have partially overlapping taxon sets. To address the problem of analyzing sets of trees with conflicting relationships and partially overlapping taxon sets, we introduce methods for aligning, synthesizing and analyzing rooted phylogenetic trees within a graph, called a tree alignment graph (TAG). The TAG can be queried and analyzed to explore uncertainty and conflict. It can also be synthesized to construct trees, presenting an alternative to supertrees approaches. We demonstrate these methods with two empirical datasets. In order to explore uncertainty, we constructed a TAG of the bootstrap trees from the Angiosperm Tree of Life project. Analysis of the resulting graph demonstrates that areas of the dataset that are unresolved in majority-rule consensus tree analyses can be understood in more detail within the context of a graph structure, using measures incorporating node degree and adjacency support. As an exercise in synthesis (i.e., summarization of a TAG constructed from the alignment trees), we also construct a TAG consisting of the taxonomy and source trees from a recent comprehensive bird study. We synthesized this graph into a tree that can be reconstructed in a repeatable fashion and where the underlying source information can be updated. The methods presented here are tractable for large scale analyses and serve as a basis for an alternative to consensus tree and supertree methods. Furthermore, the exploration of these graphs can expose structures and patterns within the dataset that are otherwise difficult to observe.

  15. Analyzing and Synthesizing Phylogenies Using Tree Alignment Graphs

    PubMed Central

    Smith, Stephen A.; Brown, Joseph W.; Hinchliff, Cody E.

    2013-01-01

    Phylogenetic trees are used to analyze and visualize evolution. However, trees can be imperfect datatypes when summarizing multiple trees. This is especially problematic when accommodating for biological phenomena such as horizontal gene transfer, incomplete lineage sorting, and hybridization, as well as topological conflict between datasets. Additionally, researchers may want to combine information from sets of trees that have partially overlapping taxon sets. To address the problem of analyzing sets of trees with conflicting relationships and partially overlapping taxon sets, we introduce methods for aligning, synthesizing and analyzing rooted phylogenetic trees within a graph, called a tree alignment graph (TAG). The TAG can be queried and analyzed to explore uncertainty and conflict. It can also be synthesized to construct trees, presenting an alternative to supertrees approaches. We demonstrate these methods with two empirical datasets. In order to explore uncertainty, we constructed a TAG of the bootstrap trees from the Angiosperm Tree of Life project. Analysis of the resulting graph demonstrates that areas of the dataset that are unresolved in majority-rule consensus tree analyses can be understood in more detail within the context of a graph structure, using measures incorporating node degree and adjacency support. As an exercise in synthesis (i.e., summarization of a TAG constructed from the alignment trees), we also construct a TAG consisting of the taxonomy and source trees from a recent comprehensive bird study. We synthesized this graph into a tree that can be reconstructed in a repeatable fashion and where the underlying source information can be updated. The methods presented here are tractable for large scale analyses and serve as a basis for an alternative to consensus tree and supertree methods. Furthermore, the exploration of these graphs can expose structures and patterns within the dataset that are otherwise difficult to observe. 
    PMID:24086118

  16. Trends and Tipping Points of Drought-induced Tree Mortality

    NASA Astrophysics Data System (ADS)

    Huang, K.; Yi, C.; Wu, D.; Zhou, T.; Zhao, X.; Blanford, W. J.; Wei, S.; Wu, H.; Du, L.

    2014-12-01

    Drought-induced tree mortality worldwide was recently reported in a literature review by Allen et al. (2010). However, a quantitative relationship between widespread forest loss from mortality and drought remains a key knowledge gap. Specifically, the field lacks quantitative knowledge of the tipping point at which trees can no longer cope with water stress, which hinders assessments of how climate change affects forest ecosystems. We investigate the statistical relationships between Ring Width Index (RWI) and Standardized Precipitation Evapotranspiration Index (SPEI) for seven conifer species, based on 411 chronologies from the International Tree-Ring Data Bank across 11 states of the western United States. We found robust species-specific relationships between RWI and SPEI for all seven conifer species under dry conditions. The regression models show that RWI decreases as SPEI decreases (drying) and that more than 76% of the variation in tree growth (RWI) can be explained by the drought index (SPEI). However, when soil water is sufficient (i.e., SPEI>SPEIu), soil water is no longer a restrictive factor for tree growth and, therefore, RWI shows a weak correlation with SPEI. Based on the statistical models, we derived the tipping point of SPEI (SPEItp) where RWI equals 0, which means the carbon efflux by tree respiration equals carbon influx by tree photosynthesis. When the severity of drought exceeds this tipping point (i.e., SPEI<SPEItp), trees might not be able to sustain their lives, as the carbon assimilated by photosynthesis could not meet the minimum needs of tree maintenance respiration. The tipping points for the seven species range between -2.45 and -1.40. A lower tipping point value represents a stronger ability to endure drought. The predicted tipping points can be used as a reference for tree mortality when assessing forest mortality risk under climate change. This work was supported by the Fund for
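
    The tipping-point definition above (the SPEI at which the fitted RWI reaches 0) reduces to solving a fitted line for its root. The sketch below assumes a simple linear model RWI = a + b * SPEI in the dry regime; the abstract does not give the model form or any coefficients, so the data points and the names fit_line and spei_tp are synthetic and hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sxx = sum((u - mx) ** 2 for u in x)
    b = sxy / sxx
    a = my - b * mx
    return a, b

# Synthetic dry-regime observations (not data from the study).
spei = [-1.5, -1.0, -0.5, 0.0]
rwi = [0.9 + 0.5 * s for s in spei]   # ring width shrinks as drought deepens

a, b = fit_line(spei, rwi)
spei_tp = -a / b                      # SPEI at which the fitted RWI equals 0
print(f"RWI = {a:.2f} + {b:.2f} * SPEI, tipping point SPEItp = {spei_tp:.2f}")
```

    Because the root lies beyond the observed SPEI range in this toy example, the tipping point is an extrapolation, which is worth keeping in mind when interpreting such estimates.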

  17. A combined M5P tree and hazard-based duration model for predicting urban freeway traffic accident durations.

    PubMed

    Lin, Lei; Wang, Qian; Sadek, Adel W

    2016-06-01

    The duration of freeway traffic accidents is an important factor that affects traffic congestion, environmental pollution, and secondary accidents. Among previous studies, the M5P algorithm has been shown to be an effective tool for predicting incident duration. M5P builds a tree-based model, like the traditional classification and regression tree (CART) method, but with multiple linear regression models as its leaves. The problem with M5P for accident duration prediction, however, is that whereas linear regression assumes that the conditional distribution of accident durations is normal, the distribution for a "time-to-event" is almost certainly nonsymmetrical. A hazard-based duration model (HBDM) is a better choice for this kind of "time-to-event" modeling scenario, and given this, HBDMs have been previously applied to analyze and predict traffic accident durations. Previous research, however, has not yet applied HBDMs for accident duration prediction in association with clustering or classification of the dataset to minimize data heterogeneity. The current paper proposes a novel approach for accident duration prediction, which improves on the original M5P tree algorithm through the construction of an M5P-HBDM model, in which the leaves of the M5P tree model are HBDMs instead of linear regression models. Such a model offers the advantage of minimizing data heterogeneity through dataset classification, and avoids the need for the incorrect assumption of normality for traffic accident durations. The proposed model was then tested on two freeway accident datasets. For each dataset, the first 500 records were used to train the following three models: (1) an M5P tree; (2) an HBDM; and (3) the proposed M5P-HBDM, and the remainder of the data were used for testing. The results show that the proposed M5P-HBDM managed to identify more significant and meaningful variables than either M5P or HBDMs. Moreover, the M5P-HBDM had the lowest overall mean

  18. Detecting treatment-subgroup interactions in clustered data with generalized linear mixed-effects model trees.

    PubMed

    Fokkema, M; Smits, N; Zeileis, A; Hothorn, T; Kelderman, H

    2017-10-25

    Identification of subgroups of patients for whom treatment A is more effective than treatment B, and vice versa, is of key importance to the development of personalized medicine. Tree-based algorithms are helpful tools for the detection of such interactions, but none of the available algorithms allow for taking into account clustered or nested dataset structures, which are particularly common in psychological research. Therefore, we propose the generalized linear mixed-effects model tree (GLMM tree) algorithm, which allows for the detection of treatment-subgroup interactions, while accounting for the clustered structure of a dataset. The algorithm uses model-based recursive partitioning to detect treatment-subgroup interactions, and a GLMM to estimate the random-effects parameters. In a simulation study, GLMM trees show higher accuracy in recovering treatment-subgroup interactions, higher predictive accuracy, and lower type II error rates than linear-model-based recursive partitioning and mixed-effects regression trees. Also, GLMM trees show somewhat higher predictive accuracy than linear mixed-effects models with pre-specified interaction effects, on average. We illustrate the application of GLMM trees on an individual patient-level data meta-analysis on treatments for depression. We conclude that GLMM trees are a promising exploratory tool for the detection of treatment-subgroup interactions in clustered datasets.

  19. Displayed Trees Do Not Determine Distinguishability Under the Network Multispecies Coalescent

    PubMed Central

    Zhu, Sha; Degnan, James H.

    2017-01-01

    Recent work in estimating species relationships from gene trees has included inferring networks assuming that past hybridization has occurred between species. Probabilistic models using the multispecies coalescent can be used in this framework for likelihood-based inference of both network topologies and parameters, including branch lengths and hybridization parameters. A difficulty for such methods is that it is not always clear whether, or to what extent, networks are identifiable, that is, whether there could be two distinct networks that lead to the same distribution of gene trees. For cases in which incomplete lineage sorting occurs in addition to hybridization, we demonstrate a new representation of the species network likelihood that expresses the probability distribution of the gene tree topologies as a linear combination of gene tree distributions given a set of species trees. This representation makes it clear that in some cases in which two distinct networks give the same distribution of gene trees when sampling one allele per species, the two networks can be distinguished theoretically when multiple individuals are sampled per species. This result means that network identifiability is not only a function of the trees displayed by the networks but also depends on allele sampling within species. We additionally give an example in which two networks that display exactly the same trees can be distinguished from their gene trees even when there is only one lineage sampled per species. PMID:27780899

  20. Big trees, old trees, and growth factor tables

    Treesearch

    Kevin T. Smith

    2018-01-01

    The potential for a tree to reach a great size and to live a long life frequently captures the public's imagination. Sometimes the desire to know the age of an impressively large tree is simple curiosity. For others, the date-of-tree establishment can make a big difference for management, particularly for trees at historic sites or those mentioned in property...

  1. Properties of wood-plastic composites: effect of inorganic additives

    NASA Astrophysics Data System (ADS)

    Bakraji, Elias Hanna; Salman, Numan

    2003-01-01

    Wood-plastic composites from Syrian tree species (white poplar, cypress tree, and white willow) were prepared using gamma-ray irradiation. Dry wood was impregnated with acrylamide or butylmethacrylate at various methanol compositions as the swelling solvent. The effect of inorganic additives and co-additives such as lithium nitrate (LiNO3), copper sulfate (CuSO4) and sulfuric acid (H2SO4), used at a very low concentration (1%), on the polymer loading (PL) and the compression strength (CS) was also investigated. It has been found that all the additives and co-additives, except Cu2+, increase the PL values and only Li+ has a positive effect on CS.

  2. Random forests of interaction trees for estimating individualized treatment effects in randomized trials.

    PubMed

    Su, Xiaogang; Peña, Annette T; Liu, Lei; Levine, Richard A

    2018-04-29

    Assessing heterogeneous treatment effects is a growing interest in advancing precision medicine. Individualized treatment effects (ITEs) play a critical role in such an endeavor. Concerning experimental data collected from randomized trials, we put forward a method, termed random forests of interaction trees (RFIT), for estimating ITE on the basis of interaction trees. To this end, we propose a smooth sigmoid surrogate method, as an alternative to greedy search, to speed up tree construction. The RFIT outperforms the "separate regression" approach in estimating ITE. Furthermore, standard errors for the estimated ITE via RFIT are obtained with the infinitesimal jackknife method. We assess and illustrate the use of RFIT via both simulation and the analysis of data from an acupuncture headache trial. Copyright © 2018 John Wiley & Sons, Ltd.

  3. Unsupervised individual tree crown detection in high-resolution satellite imagery

    DOE PAGES

    Skurikhin, Alexei N.; McDowell, Nate G.; Middleton, Richard S.

    2016-01-26

    Rapidly and accurately detecting individual tree crowns in satellite imagery is a critical need for monitoring and characterizing forest resources. We present a two-stage semiautomated approach for detecting individual tree crowns using high spatial resolution (0.6 m) satellite imagery. First, active contours are used to recognize tree canopy areas in a normalized difference vegetation index image. Given the image areas corresponding to tree canopies, we then identify individual tree crowns as local extrema points in the Laplacian of Gaussian scale-space pyramid. The approach simultaneously detects tree crown centers and estimates tree crown sizes, parameters critical to multiple ecosystem models. As a demonstration, we used a ground validated, 0.6 m resolution QuickBird image of a sparse forest site. The two-stage approach produced a tree count estimate with an accuracy of 78% for a naturally regenerating forest with irregularly spaced trees, a success rate equivalent to or better than existing approaches. In addition, our approach detects tree canopy areas and individual tree crowns in an unsupervised manner and helps identify overlapping crowns. Furthermore, the method also demonstrates significant potential for further improvement.

  4. Unsupervised individual tree crown detection in high-resolution satellite imagery

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Skurikhin, Alexei N.; McDowell, Nate G.; Middleton, Richard S.

    Rapidly and accurately detecting individual tree crowns in satellite imagery is a critical need for monitoring and characterizing forest resources. We present a two-stage semiautomated approach for detecting individual tree crowns using high spatial resolution (0.6 m) satellite imagery. First, active contours are used to recognize tree canopy areas in a normalized difference vegetation index image. Given the image areas corresponding to tree canopies, we then identify individual tree crowns as local extrema points in the Laplacian of Gaussian scale-space pyramid. The approach simultaneously detects tree crown centers and estimates tree crown sizes, parameters critical to multiple ecosystem models. As a demonstration, we used a ground validated, 0.6 m resolution QuickBird image of a sparse forest site. The two-stage approach produced a tree count estimate with an accuracy of 78% for a naturally regenerating forest with irregularly spaced trees, a success rate equivalent to or better than existing approaches. In addition, our approach detects tree canopy areas and individual tree crowns in an unsupervised manner and helps identify overlapping crowns. Furthermore, the method also demonstrates significant potential for further improvement.

  5. Tree allometry and improved estimation of carbon stocks and balance in tropical forests.

    PubMed

    Chave, J; Andalo, C; Brown, S; Cairns, M A; Chambers, J Q; Eamus, D; Fölster, H; Fromard, F; Higuchi, N; Kira, T; Lescure, J-P; Nelson, B W; Ogawa, H; Puig, H; Riéra, B; Yamakura, T

    2005-08-01

    Tropical forests hold large stores of carbon, yet uncertainty remains regarding their quantitative contribution to the global carbon cycle. One approach to quantifying carbon biomass stores consists in inferring changes from long-term forest inventory plots. Regression models are used to convert inventory data into an estimate of aboveground biomass (AGB). We provide a critical reassessment of the quality and the robustness of these models across tropical forest types, using a large dataset of 2,410 trees ≥ 5 cm diameter, directly harvested in 27 study sites across the tropics. Proportional relationships between aboveground biomass and the product of wood density, trunk cross-sectional area, and total height are constructed. We also develop a regression model involving wood density and stem diameter only. Our models were tested for secondary and old-growth forests, for dry, moist and wet forests, for lowland and montane forests, and for mangrove forests. The most important predictors of AGB of a tree were, in decreasing order of importance, its trunk diameter, wood specific gravity, total height, and forest type (dry, moist, or wet). Overestimates prevailed, giving a bias of 0.5-6.5% when errors were averaged across all stands. Our regression models can be used reliably to predict aboveground tree biomass across a broad range of tropical forests. Because they are based on an unprecedented dataset, these models should improve the quality of tropical biomass estimates, and bring consensus about the contribution of the tropical forest biome and tropical deforestation to the global carbon cycle.

  6. Can dendrochronology procedures estimate historical Tree Water Footprint?

    NASA Astrophysics Data System (ADS)

    Fernandes, Tarcísio J. G.; Del Campo, Antonio D.; Molina, Antonio J.

    2013-04-01

    transformed into tree transpiration using sapwood area, obtaining 6,768 and 5,844 litres per tree, respectively. BAI-i and vs were significantly related. The Pearson correlation was higher and positive when the growth from the rings formed during the span of sap flow measurement was considered, i.e., the 2009 and 2010 rings. An empirical model was fitted for the BAI-i and vs, allowing a preliminary reconstruction of the stand's transpiration history. Linear regressions between vs and BAI-i were significant (R² ≈ 0.65). Applying the linear equation to each BAI-i over time (1960-2010), it was possible to reconstruct water use per tree, sometimes defined as the "green" water footprint. In conclusion, dendrochronology methods can be used to estimate the Tree-Water-Footprint, and more experimental data should be used for better accuracy.

  7. Spatial trends in leaf size of Amazonian rainforest trees

    NASA Astrophysics Data System (ADS)

    Malhado, A. C. M.; Malhi, Y.; Whittaker, R. J.; Ladle, R. J.; Ter Steege, H.; Phillips, O. L.; Butt, N.; Aragão, L. E. O. C.; Quesada, C. A.; Araujo-Murakami, A.; Arroyo, L.; Peacock, J.; Lopez-Gonzalez, G.; Baker, T. R.; Anderson, L. O.; Almeida, S.; Higuchi, N.; Killeen, T. J.; Monteagudo, A.; Neill, D.; Pitman, N.; Prieto, A.; Salomão, R. P.; Vásquez-Martínez, R.; Laurance, W. F.

    2009-08-01

    Leaf size influences many aspects of tree function such as rates of transpiration and photosynthesis and, consequently, often varies in a predictable way in response to environmental gradients. The recent development of pan-Amazonian databases based on permanent botanical plots has now made it possible to assess trends in leaf size across environmental gradients in Amazonia. Previous plot-based studies have shown that the community structure of Amazonian trees breaks down into at least two major ecological gradients corresponding with variations in soil fertility (decreasing from southwest to northeast) and length of the dry season (increasing from northwest to south and east). Here we describe the geographic distribution of leaf size categories based on 121 plots distributed across eight South American countries. We find that the Amazon forest is predominantly populated by tree species and individuals in the mesophyll size class (20.25-182.25 cm²). The geographic distribution of species and individuals with large leaves (>20.25 cm²) is complex but is generally characterized by a higher proportion of such trees in the northwest of the region. Spatially corrected regressions reveal weak correlations between the proportion of large-leaved species and metrics of water availability. We also find a significant negative relationship between leaf size and wood density.

  8. TreeNetViz: revealing patterns of networks over tree structures.

    PubMed

    Gou, Liang; Zhang, Xiaolong Luke

    2011-12-01

    Network data often contain important attributes from various dimensions such as social affiliations and areas of expertise in a social network. If such attributes exhibit a tree structure, visualizing a compound graph consisting of tree and network structures becomes complicated. How to visually reveal patterns of a network over a tree has not been fully studied. In this paper, we propose a compound graph model, TreeNet, to support visualization and analysis of a network at multiple levels of aggregation over a tree. We also present a visualization design, TreeNetViz, to offer the multiscale and cross-scale exploration and interaction of a TreeNet graph. TreeNetViz uses a Radial, Space-Filling (RSF) visualization to represent the tree structure, a circle layout with novel optimization to show aggregated networks derived from TreeNet, and an edge bundling technique to reduce visual complexity. Our circular layout algorithm reduces both total edge-crossings and edge length and also considers hierarchical structure constraints and edge weight in a TreeNet graph. These experiments illustrate that the algorithm can reduce visual cluttering in TreeNet graphs. Our case study also shows that TreeNetViz has the potential to support the analysis of a compound graph by revealing multiscale and cross-scale network patterns. © 2011 IEEE

  9. Foliar ozone injury on different-sized Prunus serotina Ehrh. trees

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fredericksen, T.S.; Skelly, J.M.; Steiner, K.C.

    1995-06-01

    Black cherry (Prunus serotina Ehrh.) is a common tree species in the eastern U.S. that is highly sensitive to ozone relative to other associated deciduous tree species. Because of difficulties in conducting exposure-response experiments on large trees, air pollution studies have often utilized seedlings and extrapolated the results to predict the potential response of larger forest trees. However, physiological differences between seedlings and mature forest trees may alter responses to air pollutants. A comparative study of seedling, sapling, and canopy black cherry trees was conducted to determine the response of different-sized trees to known ozone exposures and amounts of ozone uptake. Apparent foliar sensitivity to ozone, observed as a dark adaxial leaf stipple, decreased with increasing tree size. An average of 46% of seedling leaf area was symptomatic by early September, compared to 15% - 20% for saplings and canopy trees. In addition to visible symptoms, seedlings also appeared to have greater rates of early leaf abscission than larger trees. Greater sensitivity (i.e., foliar symptoms) per unit exposure with decreasing tree size was closely correlated with rates of stomatal conductance. However, after accounting for differences in stomatal conductance, sensitivity appeared to increase with tree size.

  10. Technical Tree Climbing.

    ERIC Educational Resources Information Center

    Jenkins, Peter

    Tree climbing offers a safe, inexpensive adventure sport that can be performed almost anywhere. Using standard procedures practiced in tree surgery or rock climbing, almost any tree can be climbed. Tree climbing provides challenge and adventure as well as a vigorous upper-body workout. Tree Climbers International classifies trees using a system…

  11. Robust best linear estimator for Cox regression with instrumental variables in whole cohort and surrogates with additive measurement error in calibration sample.

    PubMed

    Wang, Ching-Yun; Song, Xiao

    2016-11-01

    Biomedical researchers are often interested in estimating the effect of an environmental exposure in relation to a chronic disease endpoint. However, the exposure variable of interest may be measured with errors. In a subset of the whole cohort, a surrogate variable is available for the true unobserved exposure variable. The surrogate variable satisfies an additive measurement error model, but it may not have repeated measurements. The subset in which the surrogate variables are available is called a calibration sample. In addition to the surrogate variables that are available among the subjects in the calibration sample, we consider the situation when there is an instrumental variable available for all study subjects. An instrumental variable is correlated with the unobserved true exposure variable, and hence can be useful in the estimation of the regression coefficients. In this paper, we propose a nonparametric method for Cox regression using the observed data from the whole cohort. The nonparametric estimator is the best linear combination of a nonparametric correction estimator from the calibration sample and the difference of the naive estimators from the calibration sample and the whole cohort. The asymptotic distribution is derived, and the finite sample performance of the proposed estimator is examined via intensive simulation studies. The methods are applied to the Nutritional Biomarkers Study of the Women's Health Initiative. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  12. Robust best linear estimator for Cox regression with instrumental variables in whole cohort and surrogates with additive measurement error in calibration sample

    PubMed Central

    Wang, Ching-Yun; Song, Xiao

    2017-01-01

    SUMMARY Biomedical researchers are often interested in estimating the effect of an environmental exposure in relation to a chronic disease endpoint. However, the exposure variable of interest may be measured with errors. In a subset of the whole cohort, a surrogate variable is available for the true unobserved exposure variable. The surrogate variable satisfies an additive measurement error model, but it may not have repeated measurements. The subset in which the surrogate variables are available is called a calibration sample. In addition to the surrogate variables that are available among the subjects in the calibration sample, we consider the situation when there is an instrumental variable available for all study subjects. An instrumental variable is correlated with the unobserved true exposure variable, and hence can be useful in the estimation of the regression coefficients. In this paper, we propose a nonparametric method for Cox regression using the observed data from the whole cohort. The nonparametric estimator is the best linear combination of a nonparametric correction estimator from the calibration sample and the difference of the naive estimators from the calibration sample and the whole cohort. The asymptotic distribution is derived, and the finite sample performance of the proposed estimator is examined via intensive simulation studies. The methods are applied to the Nutritional Biomarkers Study of the Women’s Health Initiative. PMID:27546625

  13. Predictors of success of external cephalic version and cephalic presentation at birth among 1253 women with non-cephalic presentation using logistic regression and classification tree analyses.

    PubMed

    Hutton, Eileen K; Simioni, Julia C; Thabane, Lehana

    2017-08-01

    Among women with a fetus with a non-cephalic presentation, external cephalic version (ECV) has been shown to reduce the rate of breech presentation at birth and cesarean birth. Compared with ECV at term, beginning ECV prior to 37 weeks' gestation decreases the number of infants in a non-cephalic presentation at birth. The purpose of this secondary analysis was to investigate factors associated with a successful ECV procedure and to present this in a clinically useful format. Data were collected as part of the Early ECV Pilot and Early ECV2 Trials, which randomized 1776 women with a fetus in breech presentation to either early ECV (34-36 weeks' gestation) or delayed ECV (at or after 37 weeks). The outcome of interest was successful ECV, defined as the fetus being in a cephalic presentation immediately following the procedure, as well as at the time of birth. The importance of several factors in predicting successful ECV was investigated using two statistical methods: logistic regression and classification and regression tree (CART) analyses. Among nulliparas, non-engagement of the presenting part and an easily palpable fetal head were independently associated with success. Among multiparas, non-engagement of the presenting part, gestation less than 37 weeks and an easily palpable fetal head were found to be independent predictors of success. These findings were consistent with results of the CART analyses. Regardless of parity, descent of the presenting part was the most discriminating factor in predicting successful ECV and cephalic presentation at birth. © 2017 Nordic Federation of Societies of Obstetrics and Gynecology.

  14. Northern Arkansas Spring Precipitation Reconstructed from Tree Rings, 1023-1992 A.D.

    Treesearch

    Malcolm K. Cleaveland

    2001-01-01

    Three baldcypress (Taxodium distichum (L.) Rich.) tree-ring chronologies in northeastern Arkansas and southeastern Missouri respond strongly to April-June (spring) rainfall in northern Arkansas. I used regression to reconstruct an average of spring rainfall in the three climatic divisions of northern Arkansas since 1023 A.D. The reconstruction was...

  15. Monitoring individual tree-based change with airborne lidar.

    PubMed

    Duncanson, Laura; Dubayah, Ralph

    2018-05-01

    Understanding the carbon flux of forests is critical for constraining the global carbon cycle and managing forests to mitigate climate change. Monitoring forest growth and mortality rates is critical to this effort, but has been limited in the past, with estimates relying primarily on field surveys. Advances in remote sensing enable the potential to monitor tree growth and mortality across landscapes. This work presents an approach to measure tree growth and loss using multidate lidar campaigns in a high-biomass forest in California, USA. Individual tree crowns were delineated in 2008 and again in 2013 using a 3D crown segmentation algorithm, with derived heights and crown radii extracted and used to estimate individual tree aboveground biomass. Tree growth, loss, and aboveground biomass were analyzed with respect to tree height and crown radius. Both tree growth and loss rates decrease with increasing tree height, following the expectation that trees slow in growth rate as they age. Additionally, our aboveground biomass analysis suggests that, while the system is a net source of aboveground carbon, these carbon dynamics are governed by size class with the largest sources coming from the loss of a relatively small number of large individuals. This study demonstrates that monitoring individual tree-based growth and loss can be conducted with multidate airborne lidar, but these methods remain relatively immature. Disparities between lidar acquisitions were particularly difficult to overcome and decreased the sample of trees analyzed for growth rate in this study to 21% of the full number of delineated crowns. However, this study illuminates the potential of airborne remote sensing for ecologically meaningful forest monitoring at an individual tree level. As methods continue to improve, airborne multidate lidar will enable a richer understanding of the drivers of tree growth, loss, and aboveground carbon flux.

  16. STRIDE: Species Tree Root Inference from Gene Duplication Events.

    PubMed

    Emms, David M; Kelly, Steven

    2017-12-01

    The correct interpretation of any phylogenetic tree is dependent on that tree being correctly rooted. We present STRIDE, a fast, effective, and outgroup-free method for identification of gene duplication events and species tree root inference in large-scale molecular phylogenetic analyses. STRIDE identifies sets of well-supported in-group gene duplication events from a set of unrooted gene trees, and analyses these events to infer a probability distribution over an unrooted species tree for the location of its root. We show that STRIDE correctly identifies the root of the species tree in multiple large-scale molecular phylogenetic data sets spanning a wide range of timescales and taxonomic groups. We demonstrate that the novel probability model implemented in STRIDE can accurately represent the ambiguity in species tree root assignment for data sets where information is limited. Furthermore, application of STRIDE to outgroup-free inference of the origin of the eukaryotic tree resulted in a root probability distribution that provides additional support for leading hypotheses for the origin of the eukaryotes. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  17. Estimating Leaf Water Potential of Giant Sequoia Trees from Airborne Hyperspectral Imagery

    NASA Astrophysics Data System (ADS)

    Francis, E. J.; Asner, G. P.

    2015-12-01

    Recent drought-induced forest dieback events have motivated research on the mechanisms of tree survival and mortality during drought. Leaf water potential, a measure of the force exerted by the evaporation of water from the leaf surface, is an indicator of plant water stress and can help predict tree mortality in response to drought. Scientists have traditionally measured water potentials on a tree-by-tree basis, but have not been able to produce maps of tree water potential at the scale of a whole forest, leaving forest managers unaware of forest drought stress patterns and their ecosystem-level consequences. Imaging spectroscopy, a technique for remote measurement of chemical properties, has been used to successfully estimate leaf water potentials in wheat and maize crops and pinyon-pine and juniper trees, but these estimates have never been scaled to the canopy level. We used hyperspectral reflectance data collected by the Carnegie Airborne Observatory (CAO) to map leaf water potentials of giant sequoia trees (Sequoiadendron giganteum) in an 800-hectare grove in Sequoia National Park. During the current severe drought in California, we measured predawn and midday leaf water potentials of 48 giant sequoia trees, using the pressure bomb method on treetop foliage samples collected with tree-climbing techniques. The CAO collected hyperspectral reflectance data at 1-meter resolution from the same grove within 1-2 weeks of the tree-level measurements. A partial least squares regression was used to correlate reflectance data extracted from the 48 focal trees with their water potentials, producing a model that predicts water potential of giant sequoia trees. Results show that giant sequoia trees can be mapped in the imagery with a classification accuracy of 0.94, and we predicted the water potential of the mapped trees to assess 1) similarities and differences between a leaf water potential map and a canopy water content map produced from airborne hyperspectral data, 2

  18. Urban trees and the risk of poor birth outcomes

    Treesearch

    Geoffrey H. Donovan; Yvonne L. Michael; David T. Butry; Amy D. Sullivan; John M. Chase

    2011-01-01

    This paper investigated whether greater tree-canopy cover is associated with reduced risk of poor birth outcomes in Portland, Oregon. Residential addresses were geocoded and linked to classified-aerial imagery to calculate tree-canopy cover in 50, 100, and 200 m buffers around each home in our sample (n=5696). Detailed data on maternal characteristics and additional...

  19. Sampling strategies for efficient estimation of tree foliage biomass

    Treesearch

    Hailemariam Temesgen; Vicente Monleon; Aaron Weiskittel; Duncan Wilson

    2011-01-01

    Conifer crowns can be highly variable both within and between trees, particularly with respect to foliage biomass and leaf area. A variety of sampling schemes have been used to estimate biomass and leaf area at the individual tree and stand scales. Rarely has the effectiveness of these sampling schemes been compared across stands or even across species. In addition,...

  20. A novel dendrochronological approach reveals drivers of carbon sequestration in tree species of riparian forests across spatiotemporal scales.

    PubMed

    Rieger, Isaak; Kowarik, Ingo; Cherubini, Paolo; Cierjacks, Arne

    2017-01-01

    Aboveground carbon (C) sequestration in trees is important in global C dynamics, but reliable techniques for its modeling in highly productive and heterogeneous ecosystems are limited. We applied an extended dendrochronological approach to disentangle the functioning of drivers from the atmosphere (temperature, precipitation), the lithosphere (sedimentation rate), the hydrosphere (groundwater table, river water level fluctuation), the biosphere (tree characteristics), and the anthroposphere (dike construction). Carbon sequestration in aboveground biomass of riparian Quercus robur L. and Fraxinus excelsior L. was modeled (1) over time using boosted regression tree analysis (BRT) on cross-datable trees characterized by equal annual growth ring patterns and (2) across space using a subsequent classification and regression tree analysis (CART) on cross-datable and not cross-datable trees. While C sequestration of cross-datable Q. robur responded to precipitation and temperature, cross-datable F. excelsior also responded to a low Danube river water level. However, CART revealed that C sequestration over time is governed by tree height and parameters that vary over space (magnitude of fluctuation in the groundwater table, vertical distance to mean river water level, and longitudinal distance to upstream end of the study area). Thus, a uniform response to climatic drivers of aboveground C sequestration in Q. robur was only detectable in trees of an intermediate height class and in taller trees (>21.8m) on sites where the groundwater table fluctuated little (≤0.9m). The detection of climatic drivers and the river water level in F. excelsior depended on sites at lower altitudes above the mean river water level (≤2.7m) and along a less dynamic downstream section of the study area. Our approach indicates unexploited opportunities of understanding the interplay of different environmental drivers in aboveground C sequestration. Results may support species-specific and

  1. Tree mortality in response to typhoon-induced floods and mudslides is determined by tree species, size, and position in a riparian Formosan gum forest in subtropical Taiwan

    PubMed Central

    Tzeng, Hsy-Yu; Wang, Wei; Tseng, Yen-Hsueh; Chiu, Ching-An; Kuo, Chu-Chia

    2018-01-01

    Global warming-induced extreme climatic changes have increased the frequency of severe typhoons bringing heavy rains; this has considerably affected the stability of the forest ecosystems. Since the Taiwan 921 earthquake occurred on 21 September 1999, the mountain geology of the Island of Taiwan has become unstable and typhoon-induced floods and mudslides have changed the topography and geomorphology of the area; this has further affected the stability and functions of the riparian ecosystem. In this study, the vegetation of the unique Aowanda Formosan gum forest in Central Taiwan was monitored for 3 years after the occurrence of floods and mudslides during 2009–2011. Tree growth and survival, effects of floods and mudslides, and factors influencing tree survival were investigated. We hypothesized that (1) the effects of floods on the survival are significantly different for each tree species; (2) tree diameter at breast height (DBH) affects tree survival–i.e., the larger the DBH, the higher the survival rate; and (3) the relative position of trees affects tree survival after disturbances by floods and mudslides–the farther trees are from the river, the higher is their survival rate. Our results showed that after floods and mudslides, the lifespans of the major tree species varied significantly. Liquidambar formosana displayed the highest flood tolerance, and the trunks of Lagerstroemia subcostata began rooting after disturbances. Multiple regression analysis indicated that factors such as species, DBH, distance from sampled tree to the above boundary of sample plot (far from the riverbank), and distance from the upstream of the river affected the lifespans of trees; the three factors affected each tree species to different degrees. Furthermore, we showed that insect infestation had a critical role in determining tree survival rate. Our 3-year monitoring investigation revealed that severe typhoon-induced floods and mudslides disturbed the riparian vegetation

  2. Tree mortality in response to typhoon-induced floods and mudslides is determined by tree species, size, and position in a riparian Formosan gum forest in subtropical Taiwan.

    PubMed

    Tzeng, Hsy-Yu; Wang, Wei; Tseng, Yen-Hsueh; Chiu, Ching-An; Kuo, Chu-Chia; Tsai, Shang-Te

    2018-01-01

    Global warming-induced extreme climatic changes have increased the frequency of severe typhoons bringing heavy rains; this has considerably affected the stability of the forest ecosystems. Since the Taiwan 921 earthquake occurred on 21 September 1999, the mountain geology of the Island of Taiwan has become unstable and typhoon-induced floods and mudslides have changed the topography and geomorphology of the area; this has further affected the stability and functions of the riparian ecosystem. In this study, the vegetation of the unique Aowanda Formosan gum forest in Central Taiwan was monitored for 3 years after the occurrence of floods and mudslides during 2009-2011. Tree growth and survival, effects of floods and mudslides, and factors influencing tree survival were investigated. We hypothesized that (1) the effects of floods on the survival are significantly different for each tree species; (2) tree diameter at breast height (DBH) affects tree survival-i.e., the larger the DBH, the higher the survival rate; and (3) the relative position of trees affects tree survival after disturbances by floods and mudslides-the farther trees are from the river, the higher is their survival rate. Our results showed that after floods and mudslides, the lifespans of the major tree species varied significantly. Liquidambar formosana displayed the highest flood tolerance, and the trunks of Lagerstroemia subcostata began rooting after disturbances. Multiple regression analysis indicated that factors such as species, DBH, distance from sampled tree to the above boundary of sample plot (far from the riverbank), and distance from the upstream of the river affected the lifespans of trees; the three factors affected each tree species to different degrees. Furthermore, we showed that insect infestation had a critical role in determining tree survival rate. Our 3-year monitoring investigation revealed that severe typhoon-induced floods and mudslides disturbed the riparian vegetation in the

  3. Protection of individual ash trees from emerald ash borer (Coleoptera: Buprestidae) with basal soil applications of imidacloprid.

    PubMed

    Smitley, D R; Rebek, E J; Royalty, R N; Davis, T W; Newhouse, K F

    2010-02-01

    We conducted field trials at five different locations over a period of 6 yr to investigate the efficacy of imidacloprid applied each spring as a basal soil drench for protection against emerald ash borer, Agrilus planipennis Fairmaire (Coleoptera: Buprestidae). Canopy thinning and emerald ash borer larval density were used to evaluate efficacy for 3-4 yr at each location while treatments continued. Test sites included small urban trees (5-15 cm diameter at breast height [dbh]), medium to large (15-65 cm dbh) trees at golf courses, and medium to large street trees. Annual basal drenches with imidacloprid gave complete protection of small ash trees for three years. At three sites where the size of trees ranged from 23 to 37 cm dbh, we successfully protected all ash trees beginning the test with <60% canopy thinning. Regression analysis of data from two sites reveals that tree size explains 46% of the variation in efficacy of imidacloprid drenches. The smallest trees (<30 cm dbh) remained in excellent condition for 3 yr, whereas most of the largest trees (>38 cm dbh) declined to a weakened state and undesirable appearance. The five-fold increase in trunk and branch surface area of ash trees as the tree dbh doubles may account for reduced efficacy on larger trees, and suggests a need to increase treatment rates for larger trees.
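
    The scaling statement in the abstract above (a five-fold increase in trunk and branch surface area as dbh doubles) can be read as a power law, area ∝ dbh^k. That functional form is an assumption made here for illustration; the abstract does not state it. Under that assumption, the short Python sketch below recovers the implied exponent and the relative surface area for any two diameters. The function names are hypothetical.

```python
import math


def implied_exponent(area_ratio: float, size_ratio: float) -> float:
    """Exponent k such that size_ratio ** k == area_ratio (assumed power law)."""
    return math.log(area_ratio) / math.log(size_ratio)


def relative_area(dbh_cm: float, ref_dbh_cm: float, k: float) -> float:
    """Surface area of a tree relative to a reference tree, under area ~ dbh ** k."""
    return (dbh_cm / ref_dbh_cm) ** k


# Five-fold area increase per doubling of dbh, as quoted in the abstract.
k = implied_exponent(area_ratio=5.0, size_ratio=2.0)

# Illustrative diameters only: a 38 cm tree compared with a 30 cm tree.
ratio_38_vs_30 = relative_area(38.0, 30.0, k)
```

    If treatment rates were scaled with surface area, a ratio such as `ratio_38_vs_30` would be one candidate multiplier; whether that is appropriate is not addressed by the abstract.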

  4. 36 CFR 223.4 - Exchange of trees or portions of trees.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 36 Parks, Forests, and Public Property 2 2011-07-01 2011-07-01 false Exchange of trees or portions of trees. 223.4 Section 223.4 Parks, Forests, and Public Property FOREST SERVICE, DEPARTMENT OF... PRODUCTS General Provisions § 223.4 Exchange of trees or portions of trees. Trees or portions of trees may...

  5. 36 CFR 223.4 - Exchange of trees or portions of trees.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... 36 Parks, Forests, and Public Property 2 2012-07-01 2012-07-01 false Exchange of trees or portions of trees. 223.4 Section 223.4 Parks, Forests, and Public Property FOREST SERVICE, DEPARTMENT OF... PRODUCTS General Provisions § 223.4 Exchange of trees or portions of trees. Trees or portions of trees may...

  6. 36 CFR 223.4 - Exchange of trees or portions of trees.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... 36 Parks, Forests, and Public Property 2 2014-07-01 2014-07-01 false Exchange of trees or portions of trees. 223.4 Section 223.4 Parks, Forests, and Public Property FOREST SERVICE, DEPARTMENT OF... PRODUCTS General Provisions § 223.4 Exchange of trees or portions of trees. Trees or portions of trees may...

  7. 36 CFR 223.4 - Exchange of trees or portions of trees.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... 36 Parks, Forests, and Public Property 2 2013-07-01 2013-07-01 false Exchange of trees or portions of trees. 223.4 Section 223.4 Parks, Forests, and Public Property FOREST SERVICE, DEPARTMENT OF... PRODUCTS General Provisions § 223.4 Exchange of trees or portions of trees. Trees or portions of trees may...

  8. Quantifying post-fire fallen trees using multi-temporal lidar

    NASA Astrophysics Data System (ADS)

    Bohlin, Inka; Olsson, Håkan; Bohlin, Jonas; Granström, Anders

    2017-12-01

    Massive tree-felling due to root damage is a common fire effect on burnt areas in Scandinavia, but has so far not been analyzed in detail. Here we explore if pre- and post-fire lidar data can be used to estimate the proportion of fallen trees. The study was carried out within a large (14,000 ha) area in central Sweden burnt in August 2014, where we had access to airborne lidar data from both 2011 and 2015. Three data-sets of predictor variables were tested: POST (post-fire lidar metrics), DIF (difference between post- and pre-fire lidar metrics) and combination of those two (POST_DIF). Fractional logistic regression was used to predict the proportion of fallen trees. Training data consisted of 61 plots, where the number of fallen and standing trees was calculated both in the field and with interpretation of drone images. The accuracy of the best model was tested based on 100 randomly selected validation plots with a size of 25 × 25 m. Our results showed that multi-temporal lidar together with field-collected training data can be used for quantifying post-fire tree felling over large areas. Several height-, density- and intensity metrics correlated with the proportion of fallen trees. The best model combined metrics from both datasets (POST_DIF), resulting in an RMSE of 0.11. Results were slightly poorer in the validation plots with RMSE of 0.18 using pixel size of 12.5 m and RMSE of 0.15 using pixel size of 6.25 m. Our model performed least well for stands that had been exposed to high-intensity crown fire. This was likely due to the low amount of echoes from the standing black tree skeletons. Wall-to-wall maps produced with this model can be used for landscape level analysis of fire effects and to explore the relationship between fallen trees and forest structure, soil type, fire intensity or topography.
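
    For readers unfamiliar with the technique named in the abstract above, the following is a minimal sketch of fractional logistic regression: a logit-link model fitted to proportions in [0, 1] by maximizing the Bernoulli quasi-likelihood. It is not the authors' implementation. The single predictor, the synthetic data, and all function names are assumptions made for illustration, and the fit uses plain Newton iterations on two parameters.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def fit_fractional_logit(x, y, iterations=50, tol=1e-10):
    """Fit p = sigmoid(b0 + b1 * x) to proportions y in [0, 1] by Newton's method."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = sigmoid(b0 + b1 * xi)
            w = p * (1.0 - p)
            # Score (gradient of the quasi-log-likelihood)
            g0 += yi - p
            g1 += (yi - p) * xi
            # Negative Hessian entries
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


def rmse(observed, predicted):
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


# Synthetic, noise-free example: x stands in for one hypothetical lidar metric,
# y for the proportion of fallen trees per plot.
x = [0.5 * i for i in range(9)]
y = [sigmoid(-1.0 + 0.8 * xi) for xi in x]

b0, b1 = fit_fractional_logit(x, y)
fitted = [sigmoid(b0 + b1 * xi) for xi in x]
error = rmse(y, fitted)
```

    Because the synthetic proportions are generated without noise from the same model, the fit should recover the generating coefficients; with real plot data the RMSE would be computed on held-out validation plots, as the abstract describes.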

  9. Climate change accelerates growth of urban trees in metropolises worldwide.

    PubMed

    Pretzsch, Hans; Biber, Peter; Uhl, Enno; Dahlhausen, Jens; Schütze, Gerhard; Perkins, Diana; Rötzer, Thomas; Caldentey, Juan; Koike, Takayoshi; Con, Tran van; Chavanne, Aurélia; Toit, Ben du; Foster, Keith; Lefer, Barry

    2017-11-13

    Despite the importance of urban trees, their growth reaction to climate change and to the urban heat island effect has not yet been investigated with an international scope. While we are well informed about forest growth under recent conditions, it is unclear if this knowledge can be simply transferred to urban environments. Based on tree ring analyses in ten metropolises worldwide, we show that, in general, urban trees have undergone accelerated growth since the 1960s. In addition, urban trees tend to grow more quickly than their counterparts in the rural surroundings. However, our analysis shows that climate change seems to enhance the growth of rural trees more than that of urban trees. The benefits of growing in an urban environment seem to outweigh known negative effects; however, accelerated growth may also mean more rapid ageing and shortened lifetime. Thus, city planners should adapt to the changed dynamics in order to secure the ecosystem services provided by urban trees.

  10. Estimating leaf area and leaf biomass of open-grown deciduous urban trees

    Treesearch

    David J. Nowak

    1996-01-01

    Logarithmic regression equations were developed to predict leaf area and leaf biomass for open-grown deciduous urban trees based on stem diameter and crown parameters. Equations based on crown parameters produced more reliable estimates. The equations can be used to help quantify forest structure and functions, particularly in urbanizing and urban/suburban areas.

  11. Modeling potential future individual tree-species distributions in the eastern United States under a climate change scenario: a case study with Pinus virginiana

    Treesearch

    Louis R. Iverson; Anantha Prasad; Mark W. Schwartz

    1999-01-01

    We are using a deterministic regression tree analysis model (DISTRIB) and a stochastic migration model (SHIFT) to examine potential distributions of ~66 individual species of eastern US trees under a 2 x CO2 climate change scenario. This process is demonstrated for Virginia pine (Pinus virginiana).

  12. Retro-regression--another important multivariate regression improvement.

    PubMed

    Randić, M

    2001-01-01

    We review the serious problem associated with instabilities of the coefficients of regression equations, referred to as the MRA (multivariate regression analysis) "nightmare of the first kind". This is manifested when in a stepwise regression a descriptor is included or excluded from a regression. The consequence is an unpredictable change of the coefficients of the descriptors that remain in the regression equation. We follow with consideration of an even more serious problem, referred to as the MRA "nightmare of the second kind", arising when optimal descriptors are selected from a large pool of descriptors. This process typically causes at different steps of the stepwise regression a replacement of several previously used descriptors by new ones. We describe a procedure that resolves these difficulties. The approach is illustrated on boiling points of nonanes which are considered (1) by using an ordered connectivity basis; (2) by using an ordering resulting from application of greedy algorithm; and (3) by using an ordering derived from an exhaustive search for optimal descriptors. A novel variant of multiple regression analysis, called retro-regression (RR), is outlined showing how it resolves the ambiguities associated with both "nightmares" of the first and the second kind of MRA.

  13. Modified Regression Correlation Coefficient for Poisson Regression Model

    NASA Astrophysics Data System (ADS)

    Kaengthong, Nattacha; Domthong, Uthumporn

    2017-09-01

    This study examines widely used indicators of predictive power for the Generalized Linear Model (GLM), which often carry restrictions. We focus on the regression correlation coefficient for a Poisson regression model, a measure of predictive power defined by the relationship between the dependent variable (Y) and its expected value given the independent variables [E(Y|X)], where the dependent variable follows a Poisson distribution. The purpose of this research was to modify the regression correlation coefficient for the Poisson regression model. We also compare the proposed modified coefficient with the traditional regression correlation coefficient in the case of two or more independent variables and in the presence of multicollinearity among them. The results show that the proposed coefficient performs better than the traditional one in terms of bias and root mean square error (RMSE).
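
    The traditional coefficient described here can be sketched as follows, assuming that the "relationship" between Y and E(Y|X) is measured by the Pearson correlation. The data are hypothetical, the Newton-Raphson fit handles one predictor only, and the modified coefficient proposed in the study is not reproduced because the abstract does not define it.

```python
import math

def fit_poisson(x, y, iters=25):
    """Newton-Raphson fit of log E[y] = b0 + b1*x (Poisson regression, one predictor)."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0
    for _ in range(iters):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector (gradient of the log-likelihood).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Fisher information matrix entries.
        h00 = sum(mu)
        h01 = sum(mi * xi for xi, mi in zip(x, mu))
        h11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

def pearson(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    va = sum((ai - ma) ** 2 for ai in a)
    vb = sum((bi - mb) ** 2 for bi in b)
    return cov / math.sqrt(va * vb)

# Hypothetical predictor values and observed counts.
x = [0, 1, 2, 3, 4, 5]
y = [1, 1, 2, 4, 6, 11]

b0, b1 = fit_poisson(x, y)
mu_hat = [math.exp(b0 + b1 * xi) for xi in x]   # estimated E(Y|X)
r = pearson(y, mu_hat)                          # regression correlation coefficient
print("regression correlation coefficient:", round(r, 3))
```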

  14. Seedling growth responses to phosphorus reflect adult distribution patterns of tropical trees.

    PubMed

    Zalamea, Paul-Camilo; Turner, Benjamin L; Winter, Klaus; Jones, F Andrew; Sarmiento, Carolina; Dalling, James W

    2016-10-01

    Soils influence tropical forest composition at regional scales. In Panama, data on tree communities and underlying soils indicate that species frequently show distributional associations to soil phosphorus. To understand how these associations arise, we combined a pot experiment to measure seedling responses of 15 pioneer species to phosphorus addition with an analysis of the phylogenetic structure of phosphorus associations of the entire tree community. Growth responses of pioneers to phosphorus addition revealed a clear tradeoff: species from high-phosphorus sites grew fastest in the phosphorus-addition treatment, while species from low-phosphorus sites grew fastest in the low-phosphorus treatment. Traits associated with growth performance remain unclear: biomass allocation, phosphatase activity and phosphorus-use efficiency did not correlate with phosphorus associations; however, phosphatase activity was most strongly down-regulated in response to phosphorus addition in species from high-phosphorus sites. Phylogenetic analysis indicated that pioneers occur more frequently in clades where phosphorus associations are overdispersed as compared with the overall tree community, suggesting that selection on phosphorus acquisition and use may be strongest for pioneer species with high phosphorus demand. Our results show that phosphorus-dependent growth rates provide an additional explanation for the regional distribution of tree species in Panama, and possibly elsewhere. © 2016 The Authors. New Phytologist © 2016 New Phytologist Trust.

  15. [Prediction and spatial distribution of recruitment trees of natural secondary forest based on geographically weighted Poisson model].

    PubMed

    Zhang, Ling Yu; Liu, Zhao Gang

    2017-12-01

    Based on the data collected from 108 permanent plots of the forest resources survey in Maoershan Experimental Forest Farm during 2004-2016, this study investigated the spatial distribution of recruitment trees in natural secondary forest by global Poisson regression and geographically weighted Poisson regression (GWPR) with four bandwidths of 2.5, 5, 10 and 15 km. The simulation effects of the 5 regressions and the factors influencing the recruitment trees in stands were analyzed, and the spatial autocorrelation of the regression residuals was described at global and local levels using Moran's I. The results showed that the spatial distribution of the number of natural secondary forest recruitment was significantly influenced by stands and topographic factors, especially average DBH. The GWPR model with small scale (2.5 km) had high accuracy of model fitting, a large range of model parameter estimates was generated, and the localized spatial distribution effect of the model parameters was obtained. The GWPR model at small scale (2.5 and 5 km) produced a small range of model residuals, and the stability of the model was improved. The global spatial auto-correlation of the GWPR model residual at the small scale (2.5 km) was the lowest, and the local spatial auto-correlation was significantly reduced, in which an ideal spatial distribution pattern of small clusters with different observations was formed. The local model at small scale (2.5 km) was much better than the global model in the simulation effect on the spatial distribution of recruitment tree number.
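
    The global Moran's I used above to describe residual autocorrelation follows the standard formula I = (n / W) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2, where W is the sum of all weights. The sketch below is a minimal illustration; the four-plot chain of neighbours and the residual values are hypothetical, not the study's plots or bandwidths.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a square spatial weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    num = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / w_sum) * num / den

# Hypothetical layout: four plots in a row, each a neighbour of the adjacent plot(s).
W = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

trending = [1.0, 2.0, 3.0, 4.0]       # similar values sit next to each other
alternating = [1.0, -1.0, 1.0, -1.0]  # neighbours always differ

print(morans_i(trending, W))     # positive: spatially clustered residuals
print(morans_i(alternating, W))  # negative: spatially dispersed residuals
```

    Values near zero indicate little spatial structure left in the residuals, which is the outcome the abstract reports for the 2.5 km GWPR model.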

  16. Nonparametric instrumental regression with non-convex constraints

    NASA Astrophysics Data System (ADS)

    Grasmair, M.; Scherzer, O.; Vanhems, A.

    2013-03-01

    This paper considers the nonparametric regression model with an additive error that is dependent on the explanatory variables. As is common in empirical studies in epidemiology and economics, it also supposes that valid instrumental variables are observed. A classical example in microeconomics considers the consumer demand function as a function of the price of goods and the income, both variables often considered as endogenous. In this framework, the economic theory also imposes shape restrictions on the demand function, such as integrability conditions. Motivated by this illustration in microeconomics, we study an estimator of a nonparametric constrained regression function using instrumental variables by means of Tikhonov regularization. We derive rates of convergence for the regularized model both in a deterministic and stochastic setting under the assumption that the true regression function satisfies a projected source condition including, because of the non-convexity of the imposed constraints, an additional smallness condition.

  17. There is no temperature dependence of net biochemical fractionation of hydrogen and oxygen isotopes in tree-ring cellulose.

    PubMed

    Roden, J S; Ehleringer, J R

    2000-01-01

    The isotopic composition of tree-ring cellulose was obtained over a two-year period from small diameter, riparian zone trees along an elevational transect in Big Cottonwood Canyon, Utah, USA to test for a possible temperature dependence of net biological fractionation during cellulose synthesis. The isotope ratios of stream water varied by only 3.6% and 0.2% in δD and δ18O, respectively, over an elevation change of 810 m. The similarity in stream water and macroenvironment over the short (13 km) transect produced nearly constant stem and leaf water δD and δ18O values. In addition, what few seasonal variations were observed in the isotopic composition of source water and atmospheric water vapor or in leaf water evaporative enrichment were experienced equally by all sites along the elevational transect. The temperature at each site along the transect spanned a range of ≥ 5 °C as calculated using the adiabatic lapse rate. Since the δD and δ18O values of stem and leaf water varied little for these trees over this elevation/temperature transect, any differences in tree-ring cellulose δD and δ18O values should have been associated with temperature effects on net biological fractionation. However, the slopes of the regressions of elevation versus the δD and δ18O values of tree-ring cellulose were not significantly different from zero, indicating little or no temperature dependence of net biological fractionation. Therefore, cross-site climatic reconstruction studies using the isotope ratios of cellulose need not be concerned that temperatures during the growing season have influenced results.

  18. No evidence for consistent long-term growth stimulation of 13 tropical tree species: results from tree-ring analysis.

    PubMed

    Groenendijk, Peter; van der Sleen, Peter; Vlam, Mart; Bunyavejchewin, Sarayudh; Bongers, Frans; Zuidema, Pieter A

    2015-10-01

    The important role of tropical forests in the global carbon cycle makes it imperative to assess changes in their carbon dynamics for accurate projections of future climate-vegetation feedbacks. Forest monitoring studies conducted over the past decades have found evidence for both increasing and decreasing growth rates of tropical forest trees. The limited duration of these studies restrained analyses to decadal scales, and it is still unclear whether growth changes occurred over longer time scales, as would be expected if CO2-fertilization stimulated tree growth. Furthermore, studies have so far dealt with changes in biomass gain at forest-stand level, but insights into species-specific growth changes - that ultimately determine community-level responses - are lacking. Here, we analyse species-specific growth changes on a centennial scale, using growth data from tree-ring analysis for 13 tree species (~1300 trees), from three sites distributed across the tropics. We used an established (regional curve standardization) and a new (size-class isolation) growth-trend detection method and explicitly assessed the influence of biases on the trend detection. In addition, we assessed whether aggregated trends were present within and across study sites. We found evidence for decreasing growth rates over time for 8-10 species, whereas increases were noted for two species and one showed no trend. Additionally, we found evidence for weak aggregated growth decreases at the site in Thailand and when analysing all sites simultaneously. The observed growth reductions suggest deteriorating growth conditions, perhaps due to warming. However, other causes cannot be excluded, such as recovery from large-scale disturbances or changing forest dynamics. Our findings contrast with the growth patterns that would be expected if elevated CO2 stimulated tree growth. These results suggest that commonly assumed growth increases of tropical forests may not occur, which could lead to erroneous

  19. Synthesis of phylogeny and taxonomy into a comprehensive tree of life.

    PubMed

    Hinchliff, Cody E; Smith, Stephen A; Allman, James F; Burleigh, J Gordon; Chaudhary, Ruchi; Coghill, Lyndon M; Crandall, Keith A; Deng, Jiabin; Drew, Bryan T; Gazis, Romina; Gude, Karl; Hibbett, David S; Katz, Laura A; Laughinghouse, H Dail; McTavish, Emily Jane; Midford, Peter E; Owen, Christopher L; Ree, Richard H; Rees, Jonathan A; Soltis, Douglas E; Williams, Tiffani; Cranston, Karen A

    2015-10-13

    Reconstructing the phylogenetic relationships that unite all lineages (the tree of life) is a grand challenge. The paucity of homologous character data across disparately related lineages currently renders direct phylogenetic inference untenable. To reconstruct a comprehensive tree of life, we therefore synthesized published phylogenies, together with taxonomic classifications for taxa never incorporated into a phylogeny. We present a draft tree containing 2.3 million tips-the Open Tree of Life. Realization of this tree required the assembly of two additional community resources: (i) a comprehensive global reference taxonomy and (ii) a database of published phylogenetic trees mapped to this taxonomy. Our open source framework facilitates community comment and contribution, enabling the tree to be continuously updated when new phylogenetic and taxonomic data become digitally available. Although data coverage and phylogenetic conflict across the Open Tree of Life illuminate gaps in both the underlying data available for phylogenetic reconstruction and the publication of trees as digital objects, the tree provides a compelling starting point for community contribution. This comprehensive tree will fuel fundamental research on the nature of biological diversity, ultimately providing up-to-date phylogenies for downstream applications in comparative biology, ecology, conservation biology, climate change, agriculture, and genomics.

  20. Estimation of carbon storage based on individual tree detection in Pinus densiflora stands using a fusion of aerial photography and LiDAR data.

    PubMed

    Kim, So-Ra; Kwak, Doo-Ahn; Lee, Woo-Kyun; Son, Yowhan; Bae, Sang-Won; Kim, Choonsig; Yoo, Seongjin

    2010-07-01

    The objective of this study was to estimate the carbon storage capacity of Pinus densiflora stands using remotely sensed data by combining digital aerial photography with light detection and ranging (LiDAR) data. A digital canopy model (DCM), generated from the LiDAR data, was combined with aerial photography for segmenting crowns of individual trees. To eliminate over- and under-segmentation errors, the combined image was smoothed using a Gaussian filtering method. The processed image was then segmented into individual trees using a marker-controlled watershed segmentation method. After measuring the crown area from the segmented individual trees, the individual tree diameter at breast height (DBH) was estimated using a regression function developed from the relationship observed between the field-measured DBH and crown area. The above-ground biomass of individual trees was then calculated from the image-derived DBH using a regression function developed by the Korea Forest Research Institute. The carbon storage, based on individual trees, was estimated by simple multiplication using the carbon conversion index (0.5), as suggested in guidelines from the Intergovernmental Panel on Climate Change. The mean carbon storage per individual tree was estimated and then compared with the field-measured value. This study suggested that the biomass and carbon storage in a large forest area can be effectively estimated using aerial photographs and LiDAR data.
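
    The estimation chain in this abstract (crown area to DBH, DBH to above-ground biomass, biomass to carbon) can be sketched as three small functions. Only the carbon conversion index of 0.5 comes from the abstract. The linear and power-law forms and every coefficient below are placeholders chosen for illustration; they are not the regression functions used in the study or published by the Korea Forest Research Institute.

```python
CARBON_CONVERSION_INDEX = 0.5  # stated in the abstract (IPCC guideline value)

def dbh_from_crown_area(crown_area_m2, intercept=5.0, slope=1.2):
    """Hypothetical linear regression: DBH (cm) from segmented crown area (m^2)."""
    return intercept + slope * crown_area_m2

def biomass_from_dbh(dbh_cm, scale=0.1, exponent=2.4):
    """Hypothetical allometric regression: above-ground biomass (kg) from DBH (cm)."""
    return scale * dbh_cm ** exponent

def carbon_per_tree(crown_area_m2):
    """Carbon storage (kg) for one tree, following the chain described in the abstract."""
    dbh = dbh_from_crown_area(crown_area_m2)
    biomass = biomass_from_dbh(dbh)
    return CARBON_CONVERSION_INDEX * biomass

# Hypothetical crown areas (m^2) for three segmented trees.
crown_areas = [8.0, 10.0, 15.0]
carbon = [carbon_per_tree(a) for a in crown_areas]
mean_carbon = sum(carbon) / len(carbon)
print("mean carbon per tree (kg):", round(mean_carbon, 1))
```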

  1. Decision tree analysis to stratify risk of de novo non-melanoma skin cancer following liver transplantation.

    PubMed

    Tanaka, Tomohiro; Voigt, Michael D

    2018-03-01

    Non-melanoma skin cancer (NMSC) is the most common de novo malignancy in liver transplant (LT) recipients; it behaves more aggressively and it increases mortality. We used decision tree analysis to develop a tool to stratify and quantify risk of NMSC in LT recipients. We performed Cox regression analysis to identify which predictive variables to enter into the decision tree analysis. Data were from the Organ Procurement Transplant Network (OPTN) STAR files of September 2016 (n = 102984). NMSC developed in 4556 of the 105984 recipients, a mean of 5.6 years after transplant. The 5/10/20-year rates of NMSC were 2.9/6.3/13.5%, respectively. Cox regression identified male gender, Caucasian race, age, body mass index (BMI) at LT, and sirolimus use as key predictive or protective factors for NMSC. These factors were entered into a decision tree analysis. The final tree stratified non-Caucasians as low risk (0.8%), and Caucasian males > 47 years, BMI < 40 who did not receive sirolimus, as high risk (7.3% cumulative incidence of NMSC). The predictions in the derivation set were almost identical to those in the validation set (r² = 0.971, p < 0.0001). Cumulative incidence of NMSC in low, moderate and high risk groups at 5/10/20 year was 0.5/1.2/3.3, 2.1/4.8/11.7 and 5.6/11.6/23.1% (p < 0.0001). The decision tree model accurately stratifies the risk of developing NMSC in the long-term after LT.

  2. Improving medical diagnosis reliability using Boosted C5.0 decision tree empowered by Particle Swarm Optimization.

    PubMed

    Pashaei, Elnaz; Ozen, Mustafa; Aydin, Nizamettin

    2015-08-01

    Improving the accuracy of supervised classification algorithms in biomedical applications is an active area of research. In this study, we improve the performance of Particle Swarm Optimization (PSO) combined with C4.5 decision tree (PSO+C4.5) classifier by applying Boosted C5.0 decision tree as the fitness function. To evaluate the effectiveness of our proposed method, it is implemented on 1 microarray dataset and 5 different medical data sets obtained from UCI machine learning databases. Moreover, the results of PSO + Boosted C5.0 implementation are compared to eight well-known benchmark classification methods (PSO+C4.5, support vector machine under the kernel of Radial Basis Function, Classification And Regression Tree (CART), C4.5 decision tree, C5.0 decision tree, Boosted C5.0 decision tree, Naive Bayes and Weighted K-Nearest neighbor). Repeated five-fold cross-validation was used to assess the performance of the classifiers. Experimental results show that our proposed method not only improves the performance of PSO+C4.5 but also obtains higher classification accuracy compared to the other classification methods.

  3. Classification of driving workload affected by highway alignment conditions based on classification and regression tree algorithm.

    PubMed

    Hu, Jiangbi; Wang, Ronghua

    2018-02-17

    Guaranteeing a safe and comfortable driving workload can contribute to reducing traffic injuries. In order to provide safe and comfortable threshold values, this study attempted to classify driving workload from the aspects of human factors mainly affected by highway geometric conditions and to determine the thresholds of different workload classifications. This article stated a hypothesis that the values of driver workload change within a certain range. Driving workload scales were stated based on a comprehensive literature review. Through comparative analysis of different psychophysiological measures, heart rate variability (HRV) was chosen as the representative measure for quantifying driving workload by field experiments. Seventy-two participants (36 car drivers and 36 large truck drivers) and 6 highways with different geometric designs were selected to conduct field experiments. A wearable wireless dynamic multiparameter physiological detector (KF-2) was employed to detect physiological data that were simultaneously correlated to the speed changes recorded by a Global Positioning System (GPS) (testing time, driving speeds, running track, and distance). Through performing statistical analyses, including the distribution of HRV during the flat, straight segments and P-P plots of modified HRV, a driving workload calculation model was proposed. Integrating driving workload scales with values, the threshold of each scale of driving workload was determined by classification and regression tree (CART) algorithms. The driving workload calculation model was suitable for driving speeds in the range of 40 to 120 km/h. The experimental data of 72 participants revealed that driving workload had a significant effect on modified HRV, revealing a change in driving speed. When the driving speed was between 100 and 120 km/h, drivers showed an apparent increase in the corresponding modified HRV. The threshold value of the normal driving workload K was between -0.0011 and 0
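
    The way a classification and regression tree (CART) derives a threshold can be illustrated with a single regression split: every candidate cut point on one predictor is tried, and the one that minimises the summed squared error of the two resulting groups is kept. The speeds and workload values below are hypothetical and are not the study's measurements; a full CART procedure would apply this search recursively and over all predictors.

```python
def sse(values):
    """Sum of squared deviations from the mean of a group."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split(x, y):
    """Return (threshold, total_sse) for the best single split of y on x."""
    pairs = sorted(zip(x, y))
    best_threshold, best_error = None, float("inf")
    for k in range(1, len(pairs)):
        if pairs[k][0] == pairs[k - 1][0]:
            continue  # no cut point between identical predictor values
        left = [p[1] for p in pairs[:k]]
        right = [p[1] for p in pairs[k:]]
        error = sse(left) + sse(right)
        if error < best_error:
            # Place the threshold midway between the two neighbouring values.
            best_threshold = (pairs[k - 1][0] + pairs[k][0]) / 2
            best_error = error
    return best_threshold, best_error

# Hypothetical driving speeds (km/h) and a workload measure for each observation.
speed = [40, 60, 80, 100, 110, 120]
workload = [0.10, 0.12, 0.11, 0.30, 0.32, 0.31]

threshold, error = best_split(speed, workload)
print("split speed at", threshold, "km/h")
```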

  4. Analysis of the impact of recreational trail usage for prioritising management decisions: a regression tree approach

    NASA Astrophysics Data System (ADS)

    Tomczyk, Aleksandra; Ewertowski, Marek; White, Piran; Kasprzak, Leszek

    2016-04-01

    The dual role of many Protected Natural Areas in providing benefits for both conservation and recreation poses challenges for management. Although recreation-based damage to ecosystems can occur very quickly, restoration can take many years. The protection of conservation interests at the same time as providing for recreation requires decisions to be made about how to prioritise and direct management actions. Trails are commonly used to divert visitors from the most important areas of a site, but high visitor pressure can lead to increases in trail width and a concomitant increase in soil erosion. Here we use detailed field data on condition of recreational trails in Gorce National Park, Poland, as the basis for a regression tree analysis to determine the factors influencing trail deterioration, and link specific trail impacts with environmental, use related and managerial factors. We distinguished 12 types of trails, characterised by four levels of degradation: (1) trails with an acceptable level of degradation; (2) threatened trails; (3) damaged trails; and (4) heavily damaged trails. Damaged trails were the most vulnerable of all trails and should be prioritised for appropriate conservation and restoration. We also proposed five types of monitoring of recreational trail conditions: (1) rapid inventory of negative impacts; (2) monitoring visitor numbers and variation in type of use; (3) change-oriented monitoring focusing on sections of trail which were subjected to changes in type or level of use or subjected to extreme weather events; (4) monitoring of dynamics of trail conditions; and (5) full assessment of trail conditions, to be carried out every 10-15 years. The application of the proposed framework can enhance the ability of Park managers to prioritise their trail management activities, enhancing trail conditions and visitor safety, while minimising adverse impacts on the conservation value of the ecosystem. A.M.T. was supported by the Polish Ministry of

  5. Spatial trends in leaf size of Amazonian rainforest trees

    NASA Astrophysics Data System (ADS)

    Malhado, A. C. M.; Malhi, Y.; Whittaker, R. J.; Ladle, R. J.; Ter Steege, H.; Aragão, L. E. O. C.; Quesada, C. A.; Araujo-Murakami, A.; Phillips, O. L.; Peacock, J.; Lopez-Gonzalez, G.; Baker, T. R.; Butt, N.; Anderson, L. O.; Arroyo, L.; Almeida, S.; Higuchi, N.; Killeen, T. J.; Monteagudo, A.; Neill, D.; Pitman, N.; Prieto, A.; Salomão, R. P.; Silva, N.; Vásquez-Martínez, R.; Laurance, W. F.

    2009-02-01

    Leaf size influences many aspects of tree function such as rates of transpiration and photosynthesis and, consequently, often varies in a predictable way in response to environmental gradients. The recent development of pan-Amazonian databases based on permanent botanical plots (e.g. RAINFOR, ATDN) has now made it possible to assess trends in leaf size across environmental gradients in Amazonia. Previous plot-based studies have shown that the community structure of Amazonian trees breaks down into at least two major ecological gradients corresponding with variations in soil fertility (decreasing south to northeast) and length of the dry season (increasing from northwest to south and east). Here we describe the results of the geographic distribution of leaf size categories based on 121 plots distributed across eight South American countries. We find that, as predicted, the Amazon forest is predominantly populated by tree species and individuals in the mesophyll size class (20.25-182.25 cm2). The geographic distribution of species and individuals with large leaves (>20.25 cm2) is complex but is generally characterized by a higher proportion of such trees in the north-west of the region. Spatially corrected regressions reveal weak correlations between the proportion of large-leaved species and metrics of water availability. We also find a significant negative relationship between leaf size and wood density.

  6. On defining a unique phylogenetic tree with homoplastic characters.

    PubMed

    Goloboff, Pablo A; Wilkinson, Mark

    2018-05-01

    This paper discusses the problem of whether creating a matrix with all the character state combinations that have a fixed number of steps (or extra steps) on a given tree T, produces the same tree T when analyzed with maximum parsimony or maximum likelihood. Exhaustive enumeration of cases up to 20 taxa for binary characters, and up to 12 taxa for 4-state characters, shows that the same tree is recovered (as unique most likely or most parsimonious tree) as long as the number of extra steps is within 1/4 of the number of taxa. This dependence, 1/4 of the number of taxa, is discussed with a general argumentation, in terms of the spread of the character changes on the tree used to select character state distributions. The present finding allows creating matrices which have as much homoplasy as possible for the most parsimonious or likely tree to be predictable, and examination of these matrices with hill-climbing search algorithms provides additional evidence on the (lack of a) necessary relationship between homoplasy and the ability of search methods to find optimal trees. Copyright © 2018 Elsevier Inc. All rights reserved.

  7. Tree species, tree genotypes and tree genotypic diversity levels affect microbe-mediated soil ecosystem functions in a subtropical forest.

    PubMed

    Purahong, Witoon; Durka, Walter; Fischer, Markus; Dommert, Sven; Schöps, Ricardo; Buscot, François; Wubet, Tesfaye

    2016-11-18

    Tree species identity and tree genotypes contribute to the shaping of soil microbial communities. However, knowledge about how these two factors influence soil ecosystem functions is still lacking. Furthermore, in forest ecosystems tree genotypes co-occur and interact with each other, thus the effects of tree genotypic diversity on soil ecosystem functions merit attention. Here we investigated the effects of tree species, tree genotypes and genotypic diversity levels, alongside soil physicochemical properties, on the overall and specific soil enzyme activity patterns. Our results indicate that tree species identity, tree genotypes and genotypic diversity level have significant influences on overall and specific soil enzyme activity patterns. These three factors influence soil enzyme patterns partly through effects on soil physicochemical properties and substrate quality. Variance partitioning showed that tree species identity, genotypic diversity level, pH and water content together explained ~30% of the variation in the overall patterns of soil enzymes. However, we also found that the responses of soil ecosystem functions to tree genotypes and genotypic diversity are complex, being dependent on tree species identity and controlled by multiple factors. Our study highlights the importance of inter- and intra-specific variations in tree species in shaping soil ecosystem functions in a subtropical forest.

  8. Tree species, tree genotypes and tree genotypic diversity levels affect microbe-mediated soil ecosystem functions in a subtropical forest

    PubMed Central

    Purahong, Witoon; Durka, Walter; Fischer, Markus; Dommert, Sven; Schöps, Ricardo; Buscot, François; Wubet, Tesfaye

    2016-01-01

    Tree species identity and tree genotypes contribute to the shaping of soil microbial communities. However, knowledge about how these two factors influence soil ecosystem functions is still lacking. Furthermore, in forest ecosystems tree genotypes co-occur and interact with each other, thus the effects of tree genotypic diversity on soil ecosystem functions merit attention. Here we investigated the effects of tree species, tree genotypes and genotypic diversity levels, alongside soil physicochemical properties, on the overall and specific soil enzyme activity patterns. Our results indicate that tree species identity, tree genotypes and genotypic diversity level have significant influences on overall and specific soil enzyme activity patterns. These three factors influence soil enzyme patterns partly through effects on soil physicochemical properties and substrate quality. Variance partitioning showed that tree species identity, genotypic diversity level, pH and water content together explained ~30% of the variation in the overall patterns of soil enzymes. However, we also found that the responses of soil ecosystem functions to tree genotypes and genotypic diversity are complex, being dependent on tree species identity and controlled by multiple factors. Our study highlights the importance of inter- and intra-specific variations in tree species in shaping soil ecosystem functions in a subtropical forest. PMID:27857198

  9. Mastication and prescribed fire influences on tree mortality and predicted fire behavior in ponderosa pine

    Treesearch

    Alicia L. Reiner; Nicole M. Vaillant; Scott N. Dailey

    2012-01-01

    The purpose of this study was to provide land managers with information on potential wildfire behavior and tree mortality associated with mastication and masticated/fire treatments in a plantation. Additionally, the effect of pulling fuels away from tree boles before applying fire treatment was studied in relation to tree mortality. Fuel characteristics and tree...

  10. A method to study response of large trees to different amounts of available soil water

    Treesearch

    D.H. Marx; Shi-Jean S. Sung; J.S. Cunningham; M.D. Thompson; L.M. White

    1995-01-01

    A method was developed to manipulate available soil water on large trees by intercepting thrufall with gutters placed under tree canopies and irrigating the intercepted thrufall onto other trees. With this design, trees were exposed for 2 years to either 25% less thrufall, normal thrufall, or 25% additional thrufall. Undercanopy construction in these plots moderately...

  11. Tree Testing of Hierarchical Menu Structures for Health Applications

    PubMed Central

    Le, Thai; Chaudhuri, Shomir; Chung, Jane; Thompson, Hilaire J; Demiris, George

    2014-01-01

    To address the need for greater evidence-based evaluation of Health Information Technology (HIT) systems we introduce a method of usability testing termed tree testing. In a tree test, participants are presented with an abstract hierarchical tree of the system taxonomy and asked to navigate through the tree in completing representative tasks. We apply tree testing to a commercially available health application, demonstrating a use case and providing a comparison with more traditional in-person usability testing methods. Online tree tests (N=54) and in-person usability tests (N=15) were conducted from August to September 2013. Tree testing provided a method to quantitatively evaluate the information structure of a system using various navigational metrics including completion time, task accuracy, and path length. The results of the analyses compared favorably to the results seen from the traditional usability test. Tree testing provides a flexible, evidence-based approach for researchers to evaluate the information structure of HITs. In addition, remote tree testing provides a quick, flexible, and high volume method of acquiring feedback in a structured format that allows for quantitative comparisons. With the diverse nature and often large quantities of health information available, addressing issues of terminology and concept classifications during the early development process of a health information system will improve navigation through the system and save future resources. Tree testing is a usability method that can be used to quickly and easily assess information hierarchy of health information systems. PMID:24582924

  12. Geospatial relationships of tree species damage caused by Hurricane Katrina in south Mississippi

    Treesearch

    Mark W. Garrigues; Zhaofei Fan; David L. Evans; Scott D. Roberts; William H. Cooke III

    2012-01-01

    Hurricane Katrina generated substantial impacts on the forests and biological resources of the affected area in Mississippi. This study seeks to use classification tree analysis (CTA) to determine which variables are significant in predicting hurricane damage (shear or windthrow) in the Southeast Mississippi Institute for Forest Inventory District. Logistic regressions...

  13. SEMIPARAMETRIC ADDITIVE RISKS REGRESSION FOR TWO-STAGE DESIGN SURVIVAL STUDIES

    PubMed Central

    Li, Gang; Wu, Tong Tong

    2011-01-01

    In this article we study a semiparametric additive risks model (McKeague and Sasieni (1994)) for two-stage design survival data where accurate information is available only on second stage subjects, a subset of the first stage study. We derive two-stage estimators by combining data from both stages. Large sample inferences are developed. As a by-product, we also obtain asymptotic properties of the single stage estimators of McKeague and Sasieni (1994) when the semiparametric additive risks model is misspecified. The proposed two-stage estimators are shown to be asymptotically more efficient than the second stage estimators. They also demonstrate smaller bias and variance for finite samples. The developed methods are illustrated using small intestine cancer data from the SEER (Surveillance, Epidemiology, and End Results) Program. PMID:21931467

  14. Spatial and seasonal distribution of adult Oithona similis in the Southern Ocean: Predictions using boosted regression trees

    NASA Astrophysics Data System (ADS)

    Pinkerton, Matt H.; Smith, Adam N. H.; Raymond, Ben; Hosie, Graham W.; Sharp, Ben; Leathwick, John R.; Bradford-Grieve, Janet M.

    2010-04-01

    We applied a multivariate statistical modelling technique called boosted regression trees to derive relationships between environmental conditions and the distribution of the adult stage of the cyclopoid copepod Oithona similis in the Southern Ocean. Nearly 20 000 samples from the Southern Ocean Continuous Plankton Recorder survey (87% from East Antarctica) were used to model the probability of detection (presence) and relative abundance of adults of this zooplankton species in surface waters. We demonstrate that it is possible to obtain reasonable models for both the presence (area under the Receiver Operating Characteristic curve of 0.77) and relative abundance (28-35% variance explained) of adult O. similis between November and March in much of the Southern Ocean. No investigation was possible where the environmental characteristics were not well represented by the SO-CPR dataset, namely, the Argentine shelf, Weddell Sea, and the frontal region north of the Amundsen Sea, or under sea-ice. Our analyses support the hypothesis that adult O. similis abundance is related to environmental conditions in a broadly similar way throughout the Southern Ocean. Compared to a compilation of net-haul data from the literature, the abundance model explained 34% of the variance in surface concentrations of adult stages of this species, and 23-59% of the variance in depth-integrated abundance of copepodite and adult stages combined. The models show higher occurrence and elevated abundances in a broad circumpolar band between the Antarctic Polar Front and the southern boundary of the Antarctic Circumpolar Current (approximately 54-64°S). Evidence of diel vertical migration by adults of this species north of 65°S was found, with surface abundances 20% higher at night than during the day. There was no evidence of diel migration south of 65°S. Five potential "hotspots" of adult O. similis were identified: in the southern Scotia Sea, two areas off east Antarctica, in the frontal

  15. TreeVector: scalable, interactive, phylogenetic trees for the web.

    PubMed

    Pethica, Ralph; Barker, Gary; Kovacs, Tim; Gough, Julian

    2010-01-28

    Phylogenetic trees are complex data forms that need to be graphically displayed to be human-readable. Traditional techniques of plotting phylogenetic trees focus on rendering a single static image, but increases in the production of biological data and large-scale analyses demand scalable, browsable, and interactive trees. We introduce TreeVector, a Scalable Vector Graphics- and Java-based method that allows trees to be integrated and viewed seamlessly in standard web browsers with no extra software required, and can be modified and linked using standard web technologies. There are now many bioinformatics servers and databases with a range of dynamic processes and updates to cope with the increasing volume of data. TreeVector is designed as a framework to integrate with these processes and produce user-customized phylogenies automatically. We also address the strengths of phylogenetic trees as part of a linked-in browsing process rather than an end graphic for print. TreeVector is fast and easy to use and is available to download precompiled, but is also open source. It can also be run from the web server listed below or the user's own web server. It has already been deployed on two recognized and widely used database Web sites.

  16. Evaluating the High Risk Groups for Suicide: A Comparison of Logistic Regression, Support Vector Machine, Decision Tree and Artificial Neural Network

    PubMed Central

    AMINI, Payam; AHMADINIA, Hasan; POOROLAJAL, Jalal; MOQADDASI AMIRI, Mohammad

    2016-01-01

    Background: We aimed to assess the high-risk group for suicide using different classification methods including logistic regression (LR), decision tree (DT), artificial neural network (ANN), and support vector machine (SVM). Methods: We used the dataset of a study conducted to predict risk factors of completed suicide in Hamadan Province, the west of Iran, in 2010. To evaluate the high-risk groups for suicide, LR, SVM, DT and ANN were performed. The applied methods were compared using sensitivity, specificity, positive predictive value, negative predictive value, accuracy and the area under curve. Cochran-Q test was applied to check differences in proportion among methods. To assess the association between the observed and predicted values, Ø coefficient, contingency coefficient, and Kendall tau-b were calculated. Results: Gender, age, and job were the most important risk factors for fatal suicide attempts in common for four methods. SVM method showed the highest accuracy 0.68 and 0.67 for training and testing sample, respectively. However, this method resulted in the highest specificity (0.67 for training and 0.68 for testing sample) and the highest sensitivity for training sample (0.85), but the lowest sensitivity for the testing sample (0.53). Cochran-Q test resulted in differences between proportions in different methods (P<0.001). The association of SVM predictions and observed values, Ø coefficient, contingency coefficient, and Kendall tau-b were 0.239, 0.232 and 0.239, respectively. Conclusion: SVM had the best performance to classify fatal suicide attempts compared with DT, LR and ANN. PMID:27957463
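
    The comparison measures named in this abstract can all be derived from a 2x2 confusion matrix. The following is a minimal sketch of that arithmetic, not the authors' analysis code; the function name `binary_metrics` and the example counts are hypothetical and do not come from the study.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Compute common diagnostic measures from 2x2 confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives and
    true negatives. All marginal totals are assumed to be non-zero.
    """
    total = tp + fp + fn + tn
    # Phi coefficient: correlation between observed and predicted labels.
    phi_denominator = math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
        "accuracy": (tp + tn) / total,
        "phi": (tp * tn - fp * fn) / phi_denominator,
    }


# Hypothetical counts, chosen only to illustrate the calculation.
metrics = binary_metrics(tp=40, fp=20, fn=10, tn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

    Sensitivity and specificity can move in opposite directions between training and testing samples, which is why the abstract reports them separately rather than relying on accuracy alone.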

  17. IcyTree: rapid browser-based visualization for phylogenetic trees and networks

    PubMed Central

    2017-01-01

    Abstract Summary: IcyTree is an easy-to-use application which can be used to visualize a wide variety of phylogenetic trees and networks. While numerous phylogenetic tree viewers exist already, IcyTree distinguishes itself by being a purely online tool, having a responsive user interface, supporting phylogenetic networks (ancestral recombination graphs in particular), and efficiently drawing trees that include information such as ancestral locations or trait values. IcyTree also provides intuitive panning and zooming utilities that make exploring large phylogenetic trees of many thousands of taxa feasible. Availability and Implementation: IcyTree is a web application and can be accessed directly at http://tgvaughan.github.com/icytree. Currently supported web browsers include Mozilla Firefox and Google Chrome. IcyTree is written entirely in client-side JavaScript (no plugin required) and, once loaded, does not require network access to run. IcyTree is free software, and the source code is made available at http://github.com/tgvaughan/icytree under version 3 of the GNU General Public License. Contact: tgvaughan@gmail.com PMID:28407035

  18. IcyTree: rapid browser-based visualization for phylogenetic trees and networks.

    PubMed

    Vaughan, Timothy G

    2017-08-01

    IcyTree is an easy-to-use application which can be used to visualize a wide variety of phylogenetic trees and networks. While numerous phylogenetic tree viewers exist already, IcyTree distinguishes itself by being a purely online tool, having a responsive user interface, supporting phylogenetic networks (ancestral recombination graphs in particular), and efficiently drawing trees that include information such as ancestral locations or trait values. IcyTree also provides intuitive panning and zooming utilities that make exploring large phylogenetic trees of many thousands of taxa feasible. IcyTree is a web application and can be accessed directly at http://tgvaughan.github.com/icytree . Currently supported web browsers include Mozilla Firefox and Google Chrome. IcyTree is written entirely in client-side JavaScript (no plugin required) and, once loaded, does not require network access to run. IcyTree is free software, and the source code is made available at http://github.com/tgvaughan/icytree under version 3 of the GNU General Public License. tgvaughan@gmail.com. © The Author(s) 2017. Published by Oxford University Press.

  19. Exact solutions for species tree inference from discordant gene trees.

    PubMed

    Chang, Wen-Chieh; Górecki, Paweł; Eulenstein, Oliver

    2013-10-01

    Phylogenetic analysis has to overcome the grand challenge of inferring accurate species trees from evolutionary histories of gene families (gene trees) that are discordant with the species tree along whose branches they have evolved. Two well studied approaches to cope with this challenge are to solve either biologically informed gene tree parsimony (GTP) problems under gene duplication, gene loss, and deep coalescence, or the classic RF supertree problem that does not rely on any biological model. Despite the potential of these problems to infer credible species trees, they are NP-hard. Therefore, these problems are addressed by heuristics that typically lack any provable accuracy and precision. We describe fast dynamic programming algorithms that solve the GTP problems and the RF supertree problem exactly, and demonstrate that our algorithms can solve instances with data sets consisting of as many as 22 taxa. Extensions of our algorithms can also report the number of all optimal species trees, as well as the trees themselves. To better assess the quality of the resulting species trees that best fit the given gene trees, we also compute the worst case species trees, their numbers, and optimization score for each of the computational problems. Finally, we demonstrate the performance of our exact algorithms using empirical and simulated data sets, and analyze the quality of heuristic solutions for the studied problems by contrasting them with our exact solutions.
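
    The RF supertree problem scores a candidate species tree by its Robinson-Foulds (RF) distance to the input gene trees. As a rough illustration of that score only, and not of the dynamic programming algorithms described in the abstract, the sketch below computes a clade-based RF distance between two rooted trees on the same taxa. The nested-tuple tree encoding, the function names and the example trees are assumptions made for this sketch.

```python
def clades(tree):
    """Return the leaf set of every internal node of a rooted tree.

    A tree is a nested tuple whose leaves are taxon names (strings),
    e.g. (("A", "B"), ("C", "D")). This encoding is an assumption of
    the sketch, not a format used by the cited work.
    """
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        leaves = frozenset().union(*(visit(child) for child in node))
        found.add(leaves)
        return leaves

    visit(tree)
    return found


def rf_distance(tree_a, tree_b):
    """Count the clades that appear in exactly one of the two trees."""
    return len(clades(tree_a) ^ clades(tree_b))


# Hypothetical four-taxon example: a balanced and a ladder-like tree.
species_tree = (("A", "B"), ("C", "D"))
gene_tree = ((("A", "B"), "C"), "D")
print(rf_distance(species_tree, gene_tree))  # clades {C,D} and {A,B,C} differ
```

    An exact solver has to find the species tree minimising the sum of such distances over all gene trees, which is the step that makes the problem NP-hard.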

  20. A Method to Study Response of Large Trees to Different Amounts of Available Soil Water

    Treesearch

    Donald H. Marx; Shi-jean S. Sung; James S. Cunningham; Michael D. Thompson; Linda M. White

    1995-01-01

    A method was developed to manipulate available soil water on large trees by intercepting thrufall with gutters placed under tree canopies and irrigating the intercepted thrufall onto other trees. With this design, trees were exposed for 2 years to either 25 percent less thrufall, normal thrufall, or 25 percent additional thrufall. Undercanopy construction in these plots...

  1. The special features of tree ring gas chronologies

    NASA Astrophysics Data System (ADS)

    Ageev, Boris G.; Gruzdev, Aleksandr N.; Sapozhnikova, Valeria A.

    2015-11-01

    Stem wood is known to contain significant amounts of gases. However, literature data on the functional role of the gases are lacking. The results of our experiments show that porous wood structure is capable of annual accumulation (sorption) of the stem gas components that include H2O vapor and plant cell-respired CO2. This allows for development of additional chronologies to be used for gaining a deeper insight into the behavior of the stem gases. An analysis of the vacuum-extracted wood tree ring CO2 and H2O has revealed that the CO2 and H2O chronologies are associated with interannual variations in the total pressure of the gas components in the tree rings and are characterized by short-period cycles independent of tree age and by long-period variations with tree age. Our investigations led us to propose a procedure for using the CO2 content as a marker of year-to-year variations in the total pressure of the residual gas components found in wood tree rings.

  2. Interactions between CO2 enhancement and N addition on net primary productivity and water-use efficiency in a mesocosm with multiple subtropical tree species.

    PubMed

    Yan, Junhua; Zhang, Deqiang; Liu, Juxiu; Zhou, Guoyi

    2014-07-01

    Carbon dioxide (CO2) enhancement (eCO2) and N addition (aN) have been shown to increase net primary production (NPP) and to affect water-use efficiency (WUE) for many temperate ecosystems, but few studies have been made on subtropical tree species. This study compared the responses of NPP and WUE from a mesocosm comprising five subtropical tree species to eCO2 (700 ppm), aN (10 g N m(-2) yr(-1)) and eCO2 × aN using open-top chambers. Our results showed that mean annual ecosystem NPP did not change significantly under eCO2, but increased by 56% under aN and by 64% under eCO2 × aN. Ecosystem WUE increased by 14%, 55%, and 61% under eCO2, aN and eCO2 × aN, respectively. We found that the observed responses of ecosystem WUE were largely driven by the responses of ecosystem NPP. Statistical analysis showed that there were no significant interactions between eCO2 and aN on ecosystem NPP (P = 0.731) or WUE (P = 0.442). Our results showed that increasing N deposition was likely to have much stronger effects on ecosystem NPP and WUE than increasing CO2 concentration for the subtropical forests. However, different tree species responded quite differently. aN significantly increased annual NPP of the fast-growing species (Schima superba). Nitrogen-fixing species (Ormosia pinnata) grew significantly faster only under eCO2 × aN. eCO2 had no effects on annual NPP of those two species but significantly increased annual NPP of the other two species (Castanopsis hystrix and Acmena acuminatissima). Differential responses of the NPP among different tree species to eCO2 and aN will likely have significant implications on the species composition of subtropical forests under future global change. © 2013 John Wiley & Sons Ltd.

  3. Estimating phylogenetic relationships despite discordant gene trees across loci: the species tree of a diverse species group of feather mites (Acari: Proctophyllodidae).

    PubMed

    Knowles, Lacey L; Klimov, Pavel B

    2011-11-01

    With the increased availability of multilocus sequence data, the lack of concordance of gene trees estimated for independent loci has focused attention on both the biological processes producing the discord and the methodologies used to estimate phylogenetic relationships. What has emerged is a suite of new analytical tools for phylogenetic inference--species tree approaches. In contrast to traditional phylogenetic methods that are stymied by the idiosyncrasies of gene trees, approaches for estimating species trees explicitly take into account the cause of discord among loci and, in the process, provide a direct estimate of phylogenetic history (i.e. the history of species divergence, not divergence of specific loci). We illustrate the utility of species tree estimates with an analysis of a diverse group of feather mites, the pinnatus species group (genus Proctophyllodes). Discord among four sequenced nuclear loci is consistent with theoretical expectations, given the short time separating speciation events (as evidenced by short internodes relative to terminal branch lengths in the trees). Nevertheless, many of the relationships are well resolved in a Bayesian estimate of the species tree; the analysis also highlights ambiguous aspects of the phylogeny that require additional loci. The broad utility of species tree approaches is discussed, and specifically, their application to groups with high speciation rates--a history of diversification with particular prevalence in host/parasite systems where species interactions can drive rapid diversification.

  4. Urban tree crown health assessment system: a tool for communities and citizen foresters

    Treesearch

    Matthew F. Winn; Sang-Mook Lee; Philip A. Araman

    2007-01-01

    Trees are important assets to urban communities. In addition to the aesthetic values that urban trees provide, they also aid in such things as erosion control, pollution removal, and rainfall interception. The urban environment, however, can often produce stresses to these trees. Soil compaction, limited root growth, and groundwater contamination are just a few of the...

  5. Mapping tree and impervious cover using Ikonos imagery: links with water quality and stream health

    NASA Astrophysics Data System (ADS)

    Wright, R.; Goetz, S. J.; Smith, A.; Zinecker, E.

    2002-12-01

    Precision georeferenced Ikonos satellite imagery was used to map tree cover and impervious surface area in Montgomery County, Maryland. The derived maps were used to assess riparian zone stream buffer tree cover and to predict, with multivariate logistic regression, stream health ratings across 246 small watersheds averaging 472 km2 in size. Stream health was assessed by state and county experts using a combination of physical measurements (e.g., dissolved oxygen) and biological indicators (e.g., benthic macroinvertebrates). We found it possible to create highly accurate (90+ per cent) maps of tree and impervious cover using decision tree classifiers, provided extensive field data were available for algorithm training. Impervious surface area was found to be the primary predictor of stream health, followed by tree cover in riparian buffers, and total tree cover within entire watersheds. A number of issues associated with mapping using Ikonos imagery were encountered, including differences in phenological and atmospheric conditions, shadowing within canopies and between scene elements, and limited spectral discrimination of cover types. We report on both the capabilities and limitations of Ikonos imagery for these applications, and considerations for extending these analyses to other areas.

  6. Tree diversity and species identity effects on soil fungi, protists and animals are context dependent.

    PubMed

    Tedersoo, Leho; Bahram, Mohammad; Cajthaml, Tomáš; Põlme, Sergei; Hiiesalu, Indrek; Anslan, Sten; Harend, Helery; Buegger, Franz; Pritsch, Karin; Koricheva, Julia; Abarenkov, Kessy

    2016-02-01

    Plant species richness and the presence of certain influential species (sampling effect) drive the stability and functionality of ecosystems as well as primary production and biomass of consumers. However, little is known about these floristic effects on richness and community composition of soil biota in forest habitats owing to methodological constraints. We developed a DNA metabarcoding approach to identify the major eukaryote groups directly from soil with roughly species-level resolution. Using this method, we examined the effects of tree diversity and individual tree species on soil microbial biomass and taxonomic richness of soil biota in two experimental study systems in Finland and Estonia and accounted for edaphic variables and spatial autocorrelation. Our analyses revealed that the effects of tree diversity and individual species on soil biota are largely context dependent. Multiple regression and structural equation modelling suggested that biomass, soil pH, nutrients and tree species directly affect richness of different taxonomic groups. The community composition of most soil organisms was strongly correlated due to similar response to environmental predictors rather than causal relationships. On a local scale, soil resources and tree species have stronger effect on diversity of soil biota than tree species richness per se.

  7. TreeScaper: Visualizing and Extracting Phylogenetic Signal from Sets of Trees.

    PubMed

    Huang, Wen; Zhou, Guifang; Marchand, Melissa; Ash, Jeremy R; Morris, David; Van Dooren, Paul; Brown, Jeremy M; Gallivan, Kyle A; Wilgenbusch, Jim C

    2016-12-01

    Modern phylogenomic analyses often result in large collections of phylogenetic trees representing uncertainty in individual gene trees, variation across genes, or both. Extracting phylogenetic signal from these tree sets can be challenging, as they are difficult to visualize, explore, and quantify. To overcome some of these challenges, we have developed TreeScaper, an application for tree set visualization as well as the identification of distinct phylogenetic signals. GUI and command-line versions of TreeScaper and a manual with tutorials can be downloaded from https://github.com/whuang08/TreeScaper/releases. TreeScaper is distributed under the GNU General Public License. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  8. Detection of epistatic effects with logic regression and a classical linear regression model.

    PubMed

    Malina, Magdalena; Ickstadt, Katja; Schwender, Holger; Posch, Martin; Bogdan, Małgorzata

    2014-02-01

    To locate multiple interacting quantitative trait loci (QTL) influencing a trait of interest within experimental populations, methods such as Cockerham's model are usually applied. Within this framework, interactions are understood as the part of the joint effect of several genes which cannot be explained as the sum of their additive effects. However, if a change in the phenotype (such as disease) is caused by Boolean combinations of genotypes of several QTLs, Cockerham's approach is often not capable of identifying them properly. To detect such interactions more efficiently, we propose a logic regression framework. Even though with the logic regression approach a larger number of models has to be considered (requiring more stringent multiple testing correction), the efficient representation of higher order logic interactions in logic regression models leads to a significant increase of power to detect such interactions as compared to Cockerham's approach. The increase in power is demonstrated analytically for a simple two-way interaction model and illustrated in more complex settings with a simulation study and real data analysis.

  9. Tree Data (TD)

    Treesearch

    Robert E. Keane

    2006-01-01

    The Tree Data (TD) methods are used to sample individual live and dead trees on a fixed-area plot to estimate tree density, size, and age class distributions before and after fire in order to assess tree survival and mortality rates. This method can also be used to sample individual shrubs if they are over 4.5 ft tall. When trees are larger than the user-specified...

  10. Tree-space statistics and approximations for large-scale analysis of anatomical trees.

    PubMed

    Feragen, Aasa; Owen, Megan; Petersen, Jens; Wille, Mathilde M W; Thomsen, Laura H; Dirksen, Asger; de Bruijne, Marleen

    2013-01-01

    Statistical analysis of anatomical trees is hard to perform due to differences in the topological structure of the trees. In this paper we define statistical properties of leaf-labeled anatomical trees with geometric edge attributes by considering the anatomical trees as points in the geometric space of leaf-labeled trees. This tree-space is a geodesic metric space where any two trees are connected by a unique shortest path, which corresponds to a tree deformation. However, tree-space is not a manifold, and the usual strategy of performing statistical analysis in a tangent space and projecting onto tree-space is not available. Using tree-space and its shortest paths, a variety of statistical properties, such as mean, principal component, hypothesis testing and linear discriminant analysis can be defined. For some of these properties it is still an open problem how to compute them; others (like the mean) can be computed, but efficient alternatives are helpful in speeding up algorithms that use means iteratively, like hypothesis testing. In this paper, we take advantage of a very large dataset (N = 8016) to obtain computable approximations, under the assumption that the data trees parametrize the relevant parts of tree-space well. Using the developed approximate statistics, we illustrate how the structure and geometry of airway trees vary across a population and show that airway trees with Chronic Obstructive Pulmonary Disease come from a different distribution in tree-space than healthy ones. Software is available from http://image.diku.dk/aasa/software.php.

  11. A novel prediction approach for antimalarial activities of Trimethoprim, Pyrimethamine, and Cycloguanil analogues using extremely randomized trees.

    PubMed

    Nattee, Cholwich; Khamsemanan, Nirattaya; Lawtrakul, Luckhana; Toochinda, Pisanu; Hannongbua, Supa

    2017-01-01

    Malaria is still one of the most serious diseases in tropical regions. This is due in part to the high resistance against available drugs for the inhibition of parasites, Plasmodium, the cause of the disease. New potent compounds with high clinical utility are urgently needed. In this work, we created a novel model using a regression tree to study structure-activity relationships and predict the inhibition constant, Ki, of three different antimalarial analogues (Trimethoprim, Pyrimethamine, and Cycloguanil) based on their molecular descriptors. To the best of our knowledge, this work is the first attempt to study the structure-activity relationships of all three analogues combined. The most relevant descriptors and appropriate parameters of the regression tree are harvested using extremely randomized trees. These descriptors are water accessible surface area, Log of the aqueous solubility, total hydrophobic van der Waals surface area, and molecular refractivity. Out of all possible combinations of these selected parameters and descriptors, the tree with the strongest coefficient of determination is selected to be our prediction model. Predicted Ki values from the proposed model show a strong coefficient of determination, R² = 0.996, to experimental Ki values. From the structure of the regression tree, compounds with high accessible surface area of all hydrophobic atoms (ASA_H) and low aqueous solubility of inhibitors (Log S) generally possess low Ki values. Our prediction model can also be utilized as a screening test for new antimalarial drug compounds which may reduce the time and expenses for new drug development. New compounds with high predicted Ki should be excluded from further drug development. It is also our inference that a threshold of ASA_H greater than 575.80 and Log S less than or equal to -4.36 is a sufficient condition for a new compound to possess a low Ki. Copyright © 2016 Elsevier Inc. All rights reserved.
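
    The threshold rule stated at the end of this abstract can be written as a simple screening predicate. This is only a sketch of the stated inequality, not the authors' regression tree; the function name `meets_low_ki_rule`, the constant names and the example descriptor values are hypothetical.

```python
# Thresholds quoted in the abstract.
ASA_H_THRESHOLD = 575.80   # accessible surface area of hydrophobic atoms
LOG_S_THRESHOLD = -4.36    # log of aqueous solubility


def meets_low_ki_rule(asa_h, log_s):
    """Return True when a compound satisfies the abstract's stated rule.

    The rule is described as a sufficient condition for a low Ki, so a
    False result says nothing about whether Ki is high or low.
    """
    return asa_h > ASA_H_THRESHOLD and log_s <= LOG_S_THRESHOLD


# Hypothetical descriptor values, for illustration only.
candidates = {
    "compound_1": (600.0, -5.0),
    "compound_2": (600.0, -4.0),
    "compound_3": (500.0, -5.0),
}
for name, (asa_h, log_s) in candidates.items():
    print(name, meets_low_ki_rule(asa_h, log_s))
```

    Because the condition is one-directional, compounds that fail it would still need the full model's predicted Ki before being ruled in or out.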

  12. Genetic Gains Through Testing and Crossing Longleaf Pine Plus Trees

    Treesearch

    Calvin F. Bey; E. Bayne Snyder

    1978-01-01

    A progeny test of 226 superior tree selections from nine geographic sources across the South confirmed earlier results that showed the Gulf Coast source superior in survival and growth. Family variation within a region was large and provided additional genetic gain. Control-pollinated tests of elite x elite trees yielded even more gains. Progeny of the elite x elite...

  13. Audubon Tree Study Program.

    ERIC Educational Resources Information Center

    National Audubon Society, New York, NY.

    Included are an illustrated student reader, "The Story of Trees," a leaders' guide, and a large tree chart with 37 colored pictures. The student reader reviews several aspects of trees: a definition of a tree; where and how trees grow; flowers, pollination and seed production; how trees make their food; how to recognize trees; seasonal changes;…

  14. Synthesis of phylogeny and taxonomy into a comprehensive tree of life

    PubMed Central

    Hinchliff, Cody E.; Smith, Stephen A.; Allman, James F.; Burleigh, J. Gordon; Chaudhary, Ruchi; Coghill, Lyndon M.; Crandall, Keith A.; Deng, Jiabin; Drew, Bryan T.; Gazis, Romina; Gude, Karl; Hibbett, David S.; Katz, Laura A.; Laughinghouse, H. Dail; McTavish, Emily Jane; Midford, Peter E.; Owen, Christopher L.; Ree, Richard H.; Rees, Jonathan A.; Soltis, Douglas E.; Williams, Tiffani; Cranston, Karen A.

    2015-01-01

    Reconstructing the phylogenetic relationships that unite all lineages (the tree of life) is a grand challenge. The paucity of homologous character data across disparately related lineages currently renders direct phylogenetic inference untenable. To reconstruct a comprehensive tree of life, we therefore synthesized published phylogenies, together with taxonomic classifications for taxa never incorporated into a phylogeny. We present a draft tree containing 2.3 million tips—the Open Tree of Life. Realization of this tree required the assembly of two additional community resources: (i) a comprehensive global reference taxonomy and (ii) a database of published phylogenetic trees mapped to this taxonomy. Our open source framework facilitates community comment and contribution, enabling the tree to be continuously updated when new phylogenetic and taxonomic data become digitally available. Although data coverage and phylogenetic conflict across the Open Tree of Life illuminate gaps in both the underlying data available for phylogenetic reconstruction and the publication of trees as digital objects, the tree provides a compelling starting point for community contribution. This comprehensive tree will fuel fundamental research on the nature of biological diversity, ultimately providing up-to-date phylogenies for downstream applications in comparative biology, ecology, conservation biology, climate change, agriculture, and genomics. PMID:26385966

  15. Relative Suffix Trees.

    PubMed

    Farruggia, Andrea; Gagie, Travis; Navarro, Gonzalo; Puglisi, Simon J; Sirén, Jouni

    2018-05-01

    Suffix trees are one of the most versatile data structures in stringology, with many applications in bioinformatics. Their main drawback is their size, which can be tens of times larger than the input sequence. Much effort has been put into reducing the space usage, leading ultimately to compressed suffix trees. These compressed data structures can efficiently simulate the suffix tree, while using space proportional to a compressed representation of the sequence. In this work, we take a new approach to compressed suffix trees for repetitive sequence collections, such as collections of individual genomes. We compress the suffix trees of individual sequences relative to the suffix tree of a reference sequence. These relative data structures provide competitive time/space trade-offs, being almost as small as the smallest compressed suffix trees for repetitive collections, and competitive in time with the largest and fastest compressed suffix trees.

  16. Relative Suffix Trees

    PubMed Central

    Farruggia, Andrea; Gagie, Travis; Navarro, Gonzalo; Puglisi, Simon J; Sirén, Jouni

    2018-01-01

    Abstract Suffix trees are one of the most versatile data structures in stringology, with many applications in bioinformatics. Their main drawback is their size, which can be tens of times larger than the input sequence. Much effort has been put into reducing the space usage, leading ultimately to compressed suffix trees. These compressed data structures can efficiently simulate the suffix tree, while using space proportional to a compressed representation of the sequence. In this work, we take a new approach to compressed suffix trees for repetitive sequence collections, such as collections of individual genomes. We compress the suffix trees of individual sequences relative to the suffix tree of a reference sequence. These relative data structures provide competitive time/space trade-offs, being almost as small as the smallest compressed suffix trees for repetitive collections, and competitive in time with the largest and fastest compressed suffix trees. PMID:29795706

  17. Forest Management Intensity Affects Aquatic Communities in Artificial Tree Holes.

    PubMed

    Petermann, Jana S; Rohland, Anja; Sichardt, Nora; Lade, Peggy; Guidetti, Brenda; Weisser, Wolfgang W; Gossner, Martin M

    2016-01-01

    Forest management could potentially affect organisms in all forest habitats. However, aquatic communities in water-filled tree-holes may be especially sensitive because of small population sizes, the risk of drought and potential dispersal limitation. We set up artificial tree holes in forest stands subject to different management intensities in two regions in Germany and assessed the influence of local environmental properties (tree-hole opening type, tree diameter, water volume and water temperature) as well as regional drivers (forest management intensity, tree-hole density) on tree-hole insect communities (not considering other organisms such as nematodes or rotifers), detritus content, oxygen and nutrient concentrations. In addition, we compared data from artificial tree holes with data from natural tree holes in the same area to evaluate the methodological approach of using tree-hole analogues. We found that forest management had strong effects on communities in artificial tree holes in both regions and across the season. Abundance and species richness declined, community composition shifted and detritus content declined with increasing forest management intensity. Environmental variables, such as tree-hole density and tree diameter partly explained these changes. However, dispersal limitation, indicated by effects of tree-hole density, generally showed rather weak impacts on communities. Artificial tree holes had higher water temperatures (on average 2°C higher) and oxygen concentrations (on average 25% higher) than natural tree holes. The abundance of organisms was higher but species richness was lower in artificial tree holes. Community composition differed between artificial and natural tree holes. Negative management effects were detectable in both tree-hole systems, despite their abiotic and biotic differences. Our results indicate that forest management has substantial and pervasive effects on tree-hole communities and may alter their structure and

  18. Forest Management Intensity Affects Aquatic Communities in Artificial Tree Holes

    PubMed Central

    Petermann, Jana S.; Rohland, Anja; Sichardt, Nora; Lade, Peggy; Guidetti, Brenda; Weisser, Wolfgang W.; Gossner, Martin M.

    2016-01-01

    Forest management could potentially affect organisms in all forest habitats. However, aquatic communities in water-filled tree-holes may be especially sensitive because of small population sizes, the risk of drought and potential dispersal limitation. We set up artificial tree holes in forest stands subject to different management intensities in two regions in Germany and assessed the influence of local environmental properties (tree-hole opening type, tree diameter, water volume and water temperature) as well as regional drivers (forest management intensity, tree-hole density) on tree-hole insect communities (not considering other organisms such as nematodes or rotifers), detritus content, oxygen and nutrient concentrations. In addition, we compared data from artificial tree holes with data from natural tree holes in the same area to evaluate the methodological approach of using tree-hole analogues. We found that forest management had strong effects on communities in artificial tree holes in both regions and across the season. Abundance and species richness declined, community composition shifted and detritus content declined with increasing forest management intensity. Environmental variables, such as tree-hole density and tree diameter partly explained these changes. However, dispersal limitation, indicated by effects of tree-hole density, generally showed rather weak impacts on communities. Artificial tree holes had higher water temperatures (on average 2°C higher) and oxygen concentrations (on average 25% higher) than natural tree holes. The abundance of organisms was higher but species richness was lower in artificial tree holes. Community composition differed between artificial and natural tree holes. Negative management effects were detectable in both tree-hole systems, despite their abiotic and biotic differences. Our results indicate that forest management has substantial and pervasive effects on tree-hole communities and may alter their structure and

  19. Topographic influences on vegetation mosaics and tree diversity in the Chihuahuan Desert Borderlands.

    PubMed

    Poulos, Helen M; Camp, Ann E

    2010-04-01

    The abundance and distribution of species reflect how the niche requirements of species and the dynamics of populations interact with spatial and temporal variation in the environment. This study investigated the influence of geographical variation in environmental site conditions on tree dominance and diversity patterns in three topographically dissected mountain ranges in west Texas, USA, and northern Mexico. We measured tree abundance and basal area using a systematic sampling design across the forested areas of three mountain ranges and related these data to a suite of environmental parameters derived from field and digital elevation model data. We employed cluster analysis, classification and regression trees (CART), and rarefaction to identify (1) the dominant forest cover types across the three study sites and (2) environmental influences on tree distribution and diversity patterns. Elevation, topographic position, and incident solar radiation were the major influences on tree dominance and diversity. Mesic valley bottoms hosted high-diversity vegetation types, while hotter and drier mid-slopes and ridgetops supported lower tree diversity. Valley bottoms and other topographic positions shared few species, indicating high species turnover at the landscape scale. Mountain ranges with high topographic complexity also had higher species richness, suggesting that geographical variability in environmental conditions was a major influence on tree diversity. This study stressed the importance of landscape- and regional-scale topographic variability as a key factor controlling vegetation pattern and diversity in southwestern North America.

  20. History of Tree Growth Declines Recorded in Old Trees at Two Sacred Sites in Northern China

    PubMed Central

    Li, Yan; Zhang, Qi-Bin

    2017-01-01

    Old forests are an important component in sacred sites, yet they are at risk of growth decline from ongoing global warming and increased human activities. Growth decline, characterized by chronic loss of tree vigor, is not a recent phenomenon. Knowledge of past occurrence of declines is useful for preparing conservation plans because it helps understand if present day forests are outside the natural range of variation in tree health. We report a dendroecological study of growth decline events in the past two centuries at two sacred sites, Hengshan and Wutaishan, in Shanxi province of northern China. Tree rings collected at both sites show distinct periods of declining growth evident as narrow rings. These occurred in the 1830s in both sites, in the 1920s in Wutaishan and in the 2000s in Hengshan. By comparing the pattern of growth declines at the two sites, we hypothesize that resistance of tree growth to external disturbances is forest size dependent, and increased human activity might be a factor additional to climatic droughts in causing the recent strong growth decline at Hengshan Park. Despite these past declines, the forests at both sites have high resilience to disturbances as evidenced by the ability of trees to recover their growth rates to levels comparable to the pre-decline period. Managers should consider reducing fragmentation and restoring natural habitat of old forests, especially in areas on dry sites. PMID:29163557

  1. History of Tree Growth Declines Recorded in Old Trees at Two Sacred Sites in Northern China.

    PubMed

    Li, Yan; Zhang, Qi-Bin

    2017-01-01

    Old forests are an important component in sacred sites, yet they are at risk of growth decline from ongoing global warming and increased human activities. Growth decline, characterized by chronic loss of tree vigor, is not a recent phenomenon. Knowledge of past occurrence of declines is useful for preparing conservation plans because it helps understand if present day forests are outside the natural range of variation in tree health. We report a dendroecological study of growth decline events in the past two centuries at two sacred sites, Hengshan and Wutaishan, in Shanxi province of northern China. Tree rings collected at both sites show distinct periods of declining growth evident as narrow rings. These occurred in the 1830s in both sites, in the 1920s in Wutaishan and in the 2000s in Hengshan. By comparing the pattern of growth declines at the two sites, we hypothesize that resistance of tree growth to external disturbances is forest size dependent, and increased human activity might be a factor additional to climatic droughts in causing the recent strong growth decline at Hengshan Park. Despite these past declines, the forests at both sites have high resilience to disturbances as evidenced by the ability of trees to recover their growth rates to levels comparable to the pre-decline period. Managers should consider reducing fragmentation and restoring natural habitat of old forests, especially in areas on dry sites.

  2. Nitrogen deposition outweighs climatic variability in driving annual growth rate of canopy beech trees: Evidence from long-term growth reconstruction across a geographic gradient.

    PubMed

    Gentilesca, Tiziana; Rita, Angelo; Brunetti, Michele; Giammarchi, Francesco; Leonardi, Stefano; Magnani, Federico; van Noije, Twan; Tonon, Giustino; Borghetti, Marco

    2018-07-01

    In this study, we investigated the role of climatic variability and atmospheric nitrogen deposition in driving long-term tree growth in canopy beech trees along a geographic gradient in the montane belt of the Italian peninsula, from the Alps to the southern Apennines. We sampled dominant trees at different developmental stages (from young to mature tree cohorts, with tree ages spanning from 35 to 160 years) and used stem analysis to infer historic reconstruction of tree volume and dominant height. Annual growth volume (G_V) and height (G_H) variability were related to annual variability in model simulated atmospheric nitrogen deposition and site-specific climatic variables (i.e., mean annual temperature, total annual precipitation, mean growing period temperature, total growing period precipitation, and standard precipitation evapotranspiration index) and atmospheric CO2 concentration, including tree cambial age among growth predictors. Generalized additive models (GAM), linear mixed-effects models (LMM), and Bayesian regression models (BRM) were independently employed to assess explanatory variables. The main results from our study were as follows: (i) tree age was the main explanatory variable for long-term growth variability; (ii) GAM, LMM, and BRM results consistently indicated climatic variables and CO2 effects on G_V and G_H were weak, therefore evidence of recent climatic variability influence on beech annual growth rates was limited in the montane belt of the Italian peninsula; (iii) instead, significant positive nitrogen deposition (N_dep) effects were repeatedly observed in G_V and G_H; the positive effects of N_dep on canopy height growth rates, which tended to level off at N_dep values greater than approximately 1.0 g m⁻² y⁻¹, were interpreted as positive impacts on forest stand above-ground net productivity at the selected study sites. © 2018 John Wiley & Sons Ltd.

  3. Predicting species' range limits from functional traits for the tree flora of North America.

    PubMed

    Stahl, Ulrike; Reu, Björn; Wirth, Christian

    2014-09-23

    Using functional traits to explain species' range limits is a promising approach in functional biogeography. It replaces the idiosyncrasy of species-specific climate ranges with a generic trait-based predictive framework. In addition, it has the potential to shed light on specific filter mechanisms creating large-scale vegetation patterns. However, its application to a continental flora, spanning large climate gradients, has been hampered by a lack of trait data. Here, we explore whether five key plant functional traits (seed mass, wood density, specific leaf area (SLA), maximum height, and longevity of a tree)--indicative of life history, mechanical, and physiological adaptations--explain the climate ranges of 250 North American tree species distributed from the boreal to the subtropics. Although the relationship between traits and the median climate across a species range is weak, quantile regressions revealed strong effects on range limits. Wood density and seed mass were strongly related to the lower but not upper temperature range limits of species. Maximum height affects the species range limits in both dry and humid climates, whereas SLA and longevity do not show clear relationships. These results allow the definition and delineation of climatic "no-go areas" for North American tree species based on key traits. As some of these key traits serve as important parameters in recent vegetation models, the implementation of trait-based climatic constraints has the potential to predict both range shifts and ecosystem consequences on a more functional basis. Moreover, for future trait-based vegetation models our results provide a benchmark for model evaluation.

  4. New flux based dose-response relationships for ozone for European forest tree species.

    PubMed

    Büker, P; Feng, Z; Uddling, J; Briolat, A; Alonso, R; Braun, S; Elvira, S; Gerosa, G; Karlsson, P E; Le Thiec, D; Marzuoli, R; Mills, G; Oksanen, E; Wieser, G; Wilkinson, M; Emberson, L D

    2015-11-01

    To derive O3 dose-response relationships (DRR) for five European forest tree species and broadleaf deciduous and needleleaf tree plant functional types (PFTs), phytotoxic O3 doses (PODy) were related to biomass reductions. PODy was calculated using a stomatal flux model with a range of cut-off thresholds (y) indicative of varying detoxification capacities. Linear regression analysis showed that DRR for PFT and individual tree species differed in their robustness. A simplified parameterisation of the flux model was tested and showed that for most non-Mediterranean tree species, this simplified model led to similarly robust DRR as compared to a species- and climate region-specific parameterisation. Experimentally induced soil water stress was not found to substantially reduce PODy, mainly due to the short duration of soil water stress periods. This study validates the stomatal O3 flux concept and represents a step forward in predicting O3 damage to forests in a spatially and temporally varying climate. Crown Copyright © 2015. Published by Elsevier Ltd. All rights reserved.

  5. Steiner trees and spanning trees in six-pin soap films

    NASA Astrophysics Data System (ADS)

    Dutta, Prasun; Khastgir, S. Pratik; Roy, Anushree

    2010-02-01

    The problem of finding minimum (local as well as absolute) path lengths joining given points (or terminals) on a plane is known as the Steiner problem. The Steiner problem arises in finding the minimum total road length joining several towns and cities. We study the Steiner tree problem using six-pin soap films. Experimentally, we observe spanning trees as well as Steiner trees partly by varying the pin diameter. We propose a possibly exact expression for the length of a spanning tree or a Steiner tree, which fails mysteriously in certain cases.
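
    The difference between a spanning tree and a Steiner tree can be seen in a small worked example. The sketch below is not the six-pin configuration or the length expression from this record; it assumes three terminals at the corners of a unit equilateral triangle, where the single Steiner point coincides with the centroid. All function names are illustrative.

```python
import math


def dist(p, q):
    """Euclidean distance between two points in the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def mst_length(points):
    """Total length of a minimum spanning tree (Prim's algorithm, complete graph)."""
    in_tree = {0}
    total = 0.0
    while len(in_tree) < len(points):
        # Pick the shortest edge leaving the current tree.
        d, j = min(
            (dist(points[i], points[k]), k)
            for i in in_tree
            for k in range(len(points))
            if k not in in_tree
        )
        total += d
        in_tree.add(j)
    return total


def star_length(points, hub):
    """Total length of the tree joining every terminal to one extra point."""
    return sum(dist(p, hub) for p in points)


# Terminals at the corners of an equilateral triangle with side length 1.
triangle = [(0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3) / 2)]
# For an equilateral triangle the Steiner (Fermat) point is the centroid.
centroid = (
    sum(p[0] for p in triangle) / 3,
    sum(p[1] for p in triangle) / 3,
)

if __name__ == "__main__":
    print(f"spanning tree length: {mst_length(triangle):.4f}")
    print(f"Steiner tree length:  {star_length(triangle, centroid):.4f}")
```

    In this configuration the spanning tree uses two sides of the triangle (length 2), while the Steiner tree has length sqrt(3), so the added junction point shortens the network by roughly 13%.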

  6. Why do trees die? Characterizing the drivers of background tree mortality

    USGS Publications Warehouse

    Das, Adrian J.; Stephenson, Nathan L.; Davis, Kristin P.

    2016-01-01

    The drivers of background tree mortality rates—the typical low rates of tree mortality found in forests in the absence of acute stresses like drought—are central to our understanding of forest dynamics, the effects of ongoing environmental changes on forests, and the causes and consequences of geographical gradients in the nature and strength of biotic interactions. To shed light on factors contributing to background tree mortality, we analyzed detailed pathological data from 200,668 tree-years of observation and 3,729 individual tree deaths, recorded over a 13-yr period in a network of old-growth forest plots in California's Sierra Nevada mountain range. We found that: (1) Biotic mortality factors (mostly insects and pathogens) dominated (58%), particularly in larger trees (86%). Bark beetles were the most prevalent (40%), even though there were no outbreaks during the study period; in contrast, the contribution of defoliators was negligible. (2) Relative occurrences of broad classes of mortality factors (biotic, 58%; suppression, 51%; and mechanical, 25%) are similar among tree taxa, but may vary with tree size and growth rate. (3) We found little evidence of distinct groups of mortality factors that predictably occur together on trees. Our results have at least three sets of implications. First, rather than being driven by abiotic factors such as lightning or windstorms, the “ambient” or “random” background mortality that many forest models presume to be independent of tree growth rate is instead dominated by biotic agents of tree mortality, with potentially critical implications for forecasting future mortality. Mechanistic models of background mortality, even for healthy, rapidly growing trees, must therefore include the insects and pathogens that kill trees. Second, the biotic agents of tree mortality, instead of occurring in a few predictable combinations, may generally act opportunistically and with a relatively large degree of independence from

  7. Assessing and Improving Student Understanding of Tree-Thinking

    NASA Astrophysics Data System (ADS)

    Kummer, Tyler A.

    Evolution is the unifying theory of biology. The importance of understanding evolution by those who study the origins, diversification and diversity of life cannot be overstated. Because of its importance, in addition to a scientific study of evolution, many researchers have spent time studying the acceptance and the teaching of evolution. Phylogenetic Systematics is the field of study developed to understand the evolutionary history of organisms, traits, and genes. Tree-thinking is the term by which we identify concepts related to the evolutionary history of organisms. It is vital that those who undertake a study of biology be able to understand and interpret what information these phylogenies are meant to convey. In this project, we evaluated the current impact a traditional study of biology has on the misconceptions students hold by comparing tree-thinking in freshman biology students with that of students nearing the end of their studies. We found that the impact of studying biology was varied, with some misconceptions changing significantly while others persisted. Despite the importance of tree-thinking, no appropriately developed concept inventory exists to measure student understanding of these important concepts. We developed a concept inventory capable of filling this important need and provide evidence to support its use among undergraduate students. Finally, we developed and modified activities as well as courses based on best practices to improve teaching and learning of tree-thinking and organismal diversity. We accomplished this by focusing on two key questions. First, how do we best introduce students to tree-thinking, and second, does tree-thinking as a course theme enhance student understanding of not only tree-thinking but also organismal diversity? We found important evidence suggesting that introducing students to tree-thinking via building evolutionary trees was less successful than introducing the concept via tree interpretation and may have in fact introduced or

  8. Investigating how students communicate tree-thinking

    NASA Astrophysics Data System (ADS)

    Boyce, Carrie Jo

    Learning is often an active endeavor that requires students to work at building conceptual understandings of complex topics. Personal experiences, ideas, and communication all play large roles in developing knowledge of and understanding complex topics. Sometimes these experiences can promote formation of scientifically inaccurate or incomplete ideas. Representations are tools used to help individuals understand complex topics. In biology, one way that educators help people understand evolutionary histories of organisms is by using representations called phylogenetic trees. In order to understand phylogenetic trees, individuals need to understand the conventions associated with phylogenies. My dissertation, supported by the Tree-Thinking Representational Competence and Word Association frameworks, is a mixed-methods study investigating the changes in students' tree-reading, representational competence and mental association of phylogenetic terminology after participation in varied instruction. Participants included 128 introductory biology majors from a mid-sized southern research university. Participants were enrolled in either Introductory Biology I, where they were not taught phylogenetics, or Introductory Biology II, where they were explicitly taught phylogenetics. I collected data using a pre- and post-assessment consisting of a word association task and tree-thinking diagnostic (n=128). Additionally, I recruited a subset of students from both courses (n=37) to complete a computer simulation designed to teach students about phylogenetic trees. I then conducted semi-structured interviews consisting of a word association exercise with card sort task, a retrospective pre-assessment discussion, a post-assessment discussion, and interview questions. I found that students who received explicit lecture instruction had a significantly higher increase in scores on a tree-thinking diagnostic than students who did not receive lecture instruction. Students who received both

  9. Selective logging in tropical forests decreases the robustness of liana-tree interaction networks to the loss of host tree species.

    PubMed

    Magrach, Ainhoa; Senior, Rebecca A; Rogers, Andrew; Nurdin, Deddy; Benedick, Suzan; Laurance, William F; Santamaria, Luis; Edwards, David P

    2016-03-16

    Selective logging is one of the major drivers of tropical forest degradation, causing important shifts in species composition. Whether such changes modify interactions between species and the networks in which they are embedded remain fundamental questions to assess the 'health' and ecosystem functionality of logged forests. We focus on interactions between lianas and their tree hosts within primary and selectively logged forests in the biodiversity hotspot of Malaysian Borneo. We found that lianas were more abundant, had higher species richness, and different species compositions in logged than in primary forests. Logged forests showed heavier liana loads disparately affecting slow-growing tree species, which could exacerbate the loss of timber value and carbon storage already associated with logging. Moreover, simulation scenarios of host tree local species loss indicated that logging might decrease the robustness of liana-tree interaction networks if heavily infested trees (i.e. the most connected ones) were more likely to disappear. This effect is partially mitigated in the short term by the colonization of host trees by a greater diversity of liana species within logged forests, yet this might not compensate for the loss of preferred tree hosts in the long term. As a consequence, species interaction networks may show a lagged response to disturbance, which may trigger sudden collapses in species richness and ecosystem function in response to additional disturbances, representing a new type of 'extinction debt'. © 2016 The Author(s).

  10. Selective logging in tropical forests decreases the robustness of liana–tree interaction networks to the loss of host tree species

    PubMed Central

    Magrach, Ainhoa; Senior, Rebecca A.; Rogers, Andrew; Nurdin, Deddy; Benedick, Suzan; Laurance, William F.; Santamaria, Luis; Edwards, David P.

    2016-01-01

    Selective logging is one of the major drivers of tropical forest degradation, causing important shifts in species composition. Whether such changes modify interactions between species and the networks in which they are embedded remain fundamental questions to assess the ‘health’ and ecosystem functionality of logged forests. We focus on interactions between lianas and their tree hosts within primary and selectively logged forests in the biodiversity hotspot of Malaysian Borneo. We found that lianas were more abundant, had higher species richness, and different species compositions in logged than in primary forests. Logged forests showed heavier liana loads disparately affecting slow-growing tree species, which could exacerbate the loss of timber value and carbon storage already associated with logging. Moreover, simulation scenarios of host tree local species loss indicated that logging might decrease the robustness of liana–tree interaction networks if heavily infested trees (i.e. the most connected ones) were more likely to disappear. This effect is partially mitigated in the short term by the colonization of host trees by a greater diversity of liana species within logged forests, yet this might not compensate for the loss of preferred tree hosts in the long term. As a consequence, species interaction networks may show a lagged response to disturbance, which may trigger sudden collapses in species richness and ecosystem function in response to additional disturbances, representing a new type of ‘extinction debt’. PMID:26936241

  11. Belowground Microbiota and the Health of Tree Crops.

    PubMed

    Mercado-Blanco, Jesús; Abrantes, Isabel; Barra Caracciolo, Anna; Bevivino, Annamaria; Ciancio, Aurelio; Grenni, Paola; Hrynkiewicz, Katarzyna; Kredics, László; Proença, Diogo N

    2018-01-01

    Trees are crucial for sustaining life on our planet. Forests and land devoted to tree crops do not only supply essential edible products to humans and animals, but also additional goods such as paper or wood. They also prevent soil erosion, support microbial, animal, and plant biodiversity, play key roles in nutrient and water cycling processes, and mitigate the effects of climate change acting as carbon dioxide sinks. Hence, the health of forests and tree cropping systems is of particular significance. In particular, soil/rhizosphere/root-associated microbial communities (known as microbiota) are decisive to sustain the fitness, development, and productivity of trees. These benefits rely on processes aiming to enhance nutrient assimilation efficiency (plant growth promotion) and/or to protect against a number of (a)biotic constraints. Moreover, specific members of the microbial communities associated with perennial tree crops interact with soil invertebrate food webs, underpinning many density regulation mechanisms. This review discusses belowground microbiota interactions influencing the growth of tree crops. The study of tree-(micro)organism interactions taking place at the belowground level is crucial to understand how they contribute to processes like carbon sequestration, regulation of ecosystem functioning, and nutrient cycling. A comprehensive understanding of the relationship between roots and their associate microbiota can also facilitate the design of novel sustainable approaches for the benefit of these relevant agro-ecosystems. Here, we summarize the methodological approaches to unravel the composition and function of belowground microbiota, the factors influencing their interaction with tree crops, their benefits and harms, with a focus on representative examples of Biological Control Agents (BCA) used against relevant biotic constraints of tree crops. Finally, we add some concluding remarks and suggest future perspectives concerning the microbiota

  12. Belowground Microbiota and the Health of Tree Crops

    PubMed Central

    Mercado-Blanco, Jesús; Abrantes, Isabel; Barra Caracciolo, Anna; Bevivino, Annamaria; Ciancio, Aurelio; Grenni, Paola; Hrynkiewicz, Katarzyna; Kredics, László; Proença, Diogo N.

    2018-01-01

    Trees are crucial for sustaining life on our planet. Forests and land devoted to tree crops do not only supply essential edible products to humans and animals, but also additional goods such as paper or wood. They also prevent soil erosion, support microbial, animal, and plant biodiversity, play key roles in nutrient and water cycling processes, and mitigate the effects of climate change acting as carbon dioxide sinks. Hence, the health of forests and tree cropping systems is of particular significance. In particular, soil/rhizosphere/root-associated microbial communities (known as microbiota) are decisive to sustain the fitness, development, and productivity of trees. These benefits rely on processes aiming to enhance nutrient assimilation efficiency (plant growth promotion) and/or to protect against a number of (a)biotic constraints. Moreover, specific members of the microbial communities associated with perennial tree crops interact with soil invertebrate food webs, underpinning many density regulation mechanisms. This review discusses belowground microbiota interactions influencing the growth of tree crops. The study of tree-(micro)organism interactions taking place at the belowground level is crucial to understand how they contribute to processes like carbon sequestration, regulation of ecosystem functioning, and nutrient cycling. A comprehensive understanding of the relationship between roots and their associate microbiota can also facilitate the design of novel sustainable approaches for the benefit of these relevant agro-ecosystems. Here, we summarize the methodological approaches to unravel the composition and function of belowground microbiota, the factors influencing their interaction with tree crops, their benefits and harms, with a focus on representative examples of Biological Control Agents (BCA) used against relevant biotic constraints of tree crops. Finally, we add some concluding remarks and suggest future perspectives concerning the microbiota

  13. Tree Classification Software

    NASA Technical Reports Server (NTRS)

    Buntine, Wray

    1993-01-01

    This paper introduces the IND Tree Package to prospective users. IND does supervised learning using classification trees. This learning task is a basic tool used in the development of diagnosis, monitoring and expert systems. The IND Tree Package was developed as part of a NASA project to semi-automate the development of data analysis and modelling algorithms using artificial intelligence techniques. The IND Tree Package integrates features from CART and C4 with newer Bayesian and minimum encoding methods for growing classification trees and graphs. The IND Tree Package also provides an experimental control suite on top. The newer features give improved probability estimates often required in diagnostic and screening tasks. The package comes with a manual, Unix 'man' entries, and a guide to tree methods and research. The IND Tree Package is implemented in C under Unix and was beta-tested at university and commercial research laboratories in the United States.

  14. The prediction of intelligence in preschool children using alternative models to regression.

    PubMed

    Finch, W Holmes; Chang, Mei; Davis, Andrew S; Holden, Jocelyn E; Rothlisberg, Barbara A; McIntosh, David E

    2011-12-01

    Statistical prediction of an outcome variable using multiple independent variables is a common practice in the social and behavioral sciences. For example, neuropsychologists are sometimes called upon to provide predictions of preinjury cognitive functioning for individuals who have suffered a traumatic brain injury. Typically, these predictions are made using standard multiple linear regression models with several demographic variables (e.g., gender, ethnicity, education level) as predictors. Prior research has shown conflicting evidence regarding the ability of such models to provide accurate predictions of outcome variables such as full-scale intelligence (FSIQ) test scores. The present study had two goals: (1) to demonstrate the utility of a set of alternative prediction methods that have been applied extensively in the natural sciences and business but have not been frequently explored in the social sciences and (2) to develop models that can be used to predict premorbid cognitive functioning in preschool children. Prediction of Stanford-Binet 5 FSIQ scores for preschool-aged children is used to compare the performance of a multiple regression model with several of these alternative methods. Results demonstrate that classification and regression trees provided more accurate predictions of FSIQ scores than did the more traditional regression approach. Implications of these results are discussed.

  15. [RS estimation of inventory parameters and carbon storage of moso bamboo forest based on synergistic use of object-based image analysis and decision tree].

    PubMed

    Du, Hua Qiang; Sun, Xiao Yan; Han, Ning; Mao, Fang Jie

    2017-10-01

    By synergistically using the object-based image analysis (OBIA) and the classification and regression tree (CART) methods, the distribution information, the indexes (including diameter at breast height, tree height, and crown closure), and the aboveground carbon storage (AGC) of moso bamboo forest in Shanchuan Town, Anji County, Zhejiang Province were investigated. The results showed that the moso bamboo forest could be accurately delineated by integrating the multi-scale image segmentation in the OBIA technique with CART, which connected the image objects at various scales, with a producer's accuracy of 89.1%. The indexes estimated by a regression tree model built on features extracted from the image objects reached normal or better accuracy, with the crown closure model achieving the best estimating accuracy of 67.9%. The estimating accuracy of diameter at breast height and tree height was relatively low, which was consistent with the conclusion that estimating diameter at breast height and tree height using optical remote sensing could not achieve satisfactory results. Estimation of AGC reached relatively high accuracy, and accuracy in the region of high values was above 80%.

  16. MODIS Tree Cover Validation for the Circumpolar Taiga-Tundra Transition Zone

    NASA Technical Reports Server (NTRS)

    Montesano, P. M.; Nelson, R.; Sun, G.; Margolis, H.; Kerber, A.; Ranson, K. J.

    2009-01-01

    A validation of the 2005 500m MODIS vegetation continuous fields (VCF) tree cover product in the circumpolar taiga-tundra ecotone was performed using high resolution Quickbird imagery. Assessing the VCF's performance near the northern limits of the boreal forest can help quantify the accuracy of the product within this vegetation transition area. The circumpolar region was divided into longitudinal zones and validation sites were selected in areas of varying tree cover where Quickbird imagery is available in Google Earth. Each site was linked to the corresponding VCF pixel and overlaid with a regular dot grid within the VCF pixel's boundary to estimate percent tree crown cover in the area. Percent tree crown cover was estimated using Quickbird imagery for 396 sites throughout the circumpolar region and related to the VCF's estimates of canopy cover for 2000-2005. Regression results of VCF inter-annual comparisons (2000-2005) and VCF-Quickbird image-interpreted estimates indicate that: (1) Pixel-level, inter-annual comparisons of VCF estimates of percent canopy cover were linearly related (mean R² = 0.77) and exhibited an average root mean square error (RMSE) of 10.1% and an average root mean square difference (RMSD) of 7.3%. (2) Image-interpreted percent tree crown cover estimates based on dot counts on Quickbird color images by two different interpreters were more variable (R² = 0.73, RMSE = 14.8%, RMSD = 18.7%) than VCF inter-annual comparisons. (3) Across the circumpolar boreal region, 2005 VCF-Quickbird comparisons were linearly related, with an R² = 0.57, an RMSE = 13.4% and an RMSD = 21.3%, with a tendency to over-estimate areas of low percent tree cover and anomalous VCF results in Scandinavia. The relationship of the VCF estimates and ground reference indicate to potential users that the VCF's tree cover values for individual pixels, particularly those below 20% tree cover, may not be precise enough to monitor 500m pixel

  17. Urban climate modifies tree growth in Berlin

    NASA Astrophysics Data System (ADS)

    Dahlhausen, Jens; Rötzer, Thomas; Biber, Peter; Uhl, Enno; Pretzsch, Hans

    2017-12-01

    Climate, e.g., air temperature and precipitation, differs strongly between urban and peripheral areas, which causes diverse life conditions for trees. In order to compare tree growth, we sampled in total 252 small-leaved lime trees (Tilia cordata Mill) in the city of Berlin along a gradient from the city center to the surroundings. By means of increment cores, we are able to trace back their growth for the last 50 to 100 years. A general growth trend can be shown by comparing recent basal area growth with estimates from extrapolating a growth function that had been fitted with growth data from earlier years. Estimating a linear model, we show that air temperature and precipitation significantly influence tree growth within the last 20 years. Under consideration of housing density, the results reveal that higher air temperature and less precipitation led to higher growth rates in high-dense areas, but not in low-dense areas. In addition, our data reveal a significantly higher variance of the ring width index in areas with medium housing density compared to low housing density, but no temporal trend. Transferring the results to forest stands, climate change is expected to lead to higher tree growth rates.

  18. Urban climate modifies tree growth in Berlin

    NASA Astrophysics Data System (ADS)

    Dahlhausen, Jens; Rötzer, Thomas; Biber, Peter; Uhl, Enno; Pretzsch, Hans

    2018-05-01

    Climate, e.g., air temperature and precipitation, differs strongly between urban and peripheral areas, which causes diverse life conditions for trees. In order to compare tree growth, we sampled in total 252 small-leaved lime trees (Tilia cordata Mill) in the city of Berlin along a gradient from the city center to the surroundings. By means of increment cores, we are able to trace back their growth for the last 50 to 100 years. A general growth trend can be shown by comparing recent basal area growth with estimates from extrapolating a growth function that had been fitted with growth data from earlier years. Estimating a linear model, we show that air temperature and precipitation significantly influence tree growth within the last 20 years. Under consideration of housing density, the results reveal that higher air temperature and less precipitation led to higher growth rates in high-dense areas, but not in low-dense areas. In addition, our data reveal a significantly higher variance of the ring width index in areas with medium housing density compared to low housing density, but no temporal trend. Transferring the results to forest stands, climate change is expected to lead to higher tree growth rates.

  19. Urban climate modifies tree growth in Berlin.

    PubMed

    Dahlhausen, Jens; Rötzer, Thomas; Biber, Peter; Uhl, Enno; Pretzsch, Hans

    2018-05-01

    Climate, e.g., air temperature and precipitation, differs strongly between urban and peripheral areas, which causes diverse life conditions for trees. In order to compare tree growth, we sampled in total 252 small-leaved lime trees (Tilia cordata Mill) in the city of Berlin along a gradient from the city center to the surroundings. By means of increment cores, we are able to trace back their growth for the last 50 to 100 years. A general growth trend can be shown by comparing recent basal area growth with estimates from extrapolating a growth function that had been fitted with growth data from earlier years. Estimating a linear model, we show that air temperature and precipitation significantly influence tree growth within the last 20 years. Under consideration of housing density, the results reveal that higher air temperature and less precipitation led to higher growth rates in high-dense areas, but not in low-dense areas. In addition, our data reveal a significantly higher variance of the ring width index in areas with medium housing density compared to low housing density, but no temporal trend. Transferring the results to forest stands, climate change is expected to lead to higher tree growth rates.

  20. Decision tree analysis in subarachnoid hemorrhage: prediction of outcome parameters during the course of aneurysmal subarachnoid hemorrhage using decision tree analysis.

    PubMed

    Hostettler, Isabel Charlotte; Muroi, Carl; Richter, Johannes Konstantin; Schmid, Josef; Neidert, Marian Christoph; Seule, Martin; Boss, Oliver; Pangalu, Athina; Germans, Menno Robbert; Keller, Emanuela

    2018-01-19

    OBJECTIVE The aim of this study was to create prediction models for outcome parameters by decision tree analysis based on clinical and laboratory data in patients with aneurysmal subarachnoid hemorrhage (aSAH). METHODS The database consisted of clinical and laboratory parameters of 548 patients with aSAH who were admitted to the Neurocritical Care Unit, University Hospital Zurich. To examine the model performance, the cohort was randomly divided into a derivation cohort (60% [n = 329]; training data set) and a validation cohort (40% [n = 219]; test data set). The classification and regression tree prediction algorithm was applied to predict death, functional outcome, and ventriculoperitoneal (VP) shunt dependency. Chi-square automatic interaction detection was applied to predict delayed cerebral infarction on days 1, 3, and 7. RESULTS The overall mortality was 18.4%. The accuracy of the decision tree models was good for survival on day 1 and favorable functional outcome at all time points, with a difference between the training and test data sets of < 5%. Prediction accuracy for survival on day 1 was 75.2%. The most important differentiating factor was the interleukin-6 (IL-6) level on day 1. Favorable functional outcome, defined as Glasgow Outcome Scale scores of 4 and 5, was observed in 68.6% of patients. Favorable functional outcome at all time points had a prediction accuracy of 71.1% in the training data set, with procalcitonin on day 1 being the most important differentiating factor at all time points. A total of 148 patients (27%) developed VP shunt dependency. The most important differentiating factor was hyperglycemia on admission. CONCLUSIONS The multiple variable analysis capability of decision trees enables exploration of dependent variables in the context of multiple changing influences over the course of an illness. The decision tree currently generated increases awareness of the early systemic stress response, which is seemingly pertinent for

  1. Liana competition with tropical trees varies seasonally but not with tree species identity.

    PubMed

    Leonor, Alvarez-Cansino; Schnitzer, Stefan A; Reid, Joseph P; Powers, Jennifer S

    2015-01-01

    Lianas in tropical forests compete intensely with trees for above- and belowground resources and limit tree growth and regeneration. Liana competition with adult canopy trees may be particularly strong, and, if lianas compete more intensely with some tree species than others, they may influence tree species composition. We performed the first systematic, large-scale liana removal experiment to assess the competitive effects of lianas on multiple tropical tree species by measuring sap velocity and growth in a lowland tropical forest in Panama. Tree sap velocity increased 60% soon after liana removal compared to control trees, and tree diameter growth increased 25% after one year. Although tree species varied in their response to lianas, this variation was not significant, suggesting that lianas competed similarly with all tree species examined. The effect of lianas on tree sap velocity was particularly strong during the dry season, when soil moisture was low, suggesting that lianas compete intensely with trees for water. Under the predicted global change scenario of increased temperature and drought intensity, competition from lianas may become more prevalent in seasonal tropical forests, which, according to our data, should have a negative effect on most tropical tree species.

  2. Informing tree-ring reconstructions with automated dendrometer data: the case of single-leaf pinyon (Pinus monophylla) from Great Basin National Park, Nevada, USA

    NASA Astrophysics Data System (ADS)

    Biondi, F.

    2012-12-01

    One of the most pressing issues in modern tree-ring science is to reduce uncertainty of reconstructions while emphasizing that the composition and dynamics of modern ecosystems cannot be understood from the present alone. I present here the latest results from research on the environmental factors that control radial growth of single-leaf pinyon (Pinus monophylla) in the Great Basin of North America using dendrometer data collected at half-hour intervals during two full growing seasons, 2010 and 2011. Automated (solar-powered) sensors at the site consisted of 8 point dendrometers installed on 7 trees to measure stem size, together with environmental probes that recorded air temperature, soil temperature and soil moisture. Additional meteorological variables at hourly timesteps were available from the EPA-CASTNET station located within 100 m of the dendrometer site. Daily cycles of stem expansion and contraction were quantified using the approach of Deslauriers et al. (2011), and the amount of daily radial stem increment was regressed against environmental variables. Graphical and numerical results showed that tree growth is relatively insensitive to surface soil moisture during the growing season. This finding corroborates empirical dendroclimatic results that showed how tree-ring chronologies of single-leaf pinyon are mostly a proxy for the balance between winter-spring precipitation supply and growing season evapotranspiration demand, thereby making it an ideal species for drought reconstructions.

  3. Ghost-tree: creating hybrid-gene phylogenetic trees for diversity analyses.

    PubMed

    Fouquier, Jennifer; Rideout, Jai Ram; Bolyen, Evan; Chase, John; Shiffer, Arron; McDonald, Daniel; Knight, Rob; Caporaso, J Gregory; Kelley, Scott T

    2016-02-24

    Fungi play critical roles in many ecosystems, cause serious diseases in plants and animals, and pose significant threats to human health and structural integrity problems in built environments. While most fungal diversity remains unknown, the development of PCR primers for the internal transcribed spacer (ITS) combined with next-generation sequencing has substantially improved our ability to profile fungal microbial diversity. Although the high sequence variability in the ITS region facilitates more accurate species identification, it also makes multiple sequence alignment and phylogenetic analysis unreliable across evolutionarily distant fungi because the sequences are hard to align accurately. To address this issue, we created ghost-tree, a bioinformatics tool that integrates sequence data from two genetic markers into a single phylogenetic tree that can be used for diversity analyses. Our approach starts with a "foundation" phylogeny based on one genetic marker whose sequences can be aligned across organisms spanning divergent taxonomic groups (e.g., fungal families). Then, "extension" phylogenies are built for more closely related organisms (e.g., fungal species or strains) using a second more rapidly evolving genetic marker. These smaller phylogenies are then grafted onto the foundation tree by mapping taxonomic names such that each corresponding foundation-tree tip would branch into its new "extension tree" child. We applied ghost-tree to graft fungal extension phylogenies derived from ITS sequences onto a foundation phylogeny derived from fungal 18S sequences. Our analysis of simulated and real fungal ITS data sets found that phylogenetic distances between fungal communities computed using ghost-tree phylogenies explained significantly more variance than non-phylogenetic distances. The phylogenetic metrics also improved our ability to distinguish small differences (effect sizes) between microbial communities, though results were similar to non

  4. Trees are good, but…

    Treesearch

    E.G. McPherson; F. Ferrini

    2010-01-01

    We know that “trees are good,” and most people believe this to be true. But if this is so, why are so many trees neglected, and so many tree wells empty? An individual’s attitude toward trees may result from their firsthand encounters with specific trees. Understanding how attitudes about trees are shaped, particularly aversion to trees, is critical to the business of...

  5. Use of the "Tree" Analogy in Evolution Teaching by Biology Teachers

    ERIC Educational Resources Information Center

    Marcelos, Maria Fatima; Nagem, Ronaldo Luiz

    2012-01-01

    This work discusses the use of Darwin's "Tree of Life" as a didactic analogy and metaphor in teaching evolution. It investigates whether biology teachers of pupils from 17 to 18 years old know Darwin's text "Tree of Life". In addition, it examines whether those teachers systematically employ either the analogies present in that…

  6. A Comparison of Logistic Regression, Neural Networks, and Classification Trees Predicting Success of Actuarial Students

    ERIC Educational Resources Information Center

    Schumacher, Phyllis; Olinsky, Alan; Quinn, John; Smith, Richard

    2010-01-01

    The authors extended previous research by 2 of the authors who conducted a study designed to predict the successful completion of students enrolled in an actuarial program. They used logistic regression to determine the probability of an actuarial student graduating in the major or dropping out. They compared the results of this study with those…

  7. Cavity tree selection by red-cockaded woodpeckers in relation to tree age

    Treesearch

    D. Craig Rudolph; Richard N. Conner

    1991-01-01

    We aged over 1350 Red-cockaded Woodpecker (Picoides borealis) cavity trees and a comparable number of randomly selected trees. Resulting data strongly support the hypothesis that Red-cockaded Woodpeckers preferentially select older trees. Ages of recently initiated cavity trees in the Texas study areas generally were similar to those of cavity trees...

  8. Why do trees die? Characterizing the drivers of background tree mortality.

    PubMed

    Das, Adrian J; Stephenson, Nathan L; Davis, Kristin P

    2016-10-01

    The drivers of background tree mortality rates (the typical low rates of tree mortality found in forests in the absence of acute stresses like drought) are central to our understanding of forest dynamics, the effects of ongoing environmental changes on forests, and the causes and consequences of geographical gradients in the nature and strength of biotic interactions. To shed light on factors contributing to background tree mortality, we analyzed detailed pathological data from 200,668 tree-years of observation and 3,729 individual tree deaths, recorded over a 13-yr period in a network of old-growth forest plots in California's Sierra Nevada mountain range. We found that: (1) Biotic mortality factors (mostly insects and pathogens) dominated (58%), particularly in larger trees (86%). Bark beetles were the most prevalent (40%), even though there were no outbreaks during the study period; in contrast, the contribution of defoliators was negligible. (2) Relative occurrences of broad classes of mortality factors (biotic, 58%; suppression, 51%; and mechanical, 25%) are similar among tree taxa, but may vary with tree size and growth rate. (3) We found little evidence of distinct groups of mortality factors that predictably occur together on trees. Our results have at least three sets of implications. First, rather than being driven by abiotic factors such as lightning or windstorms, the "ambient" or "random" background mortality that many forest models presume to be independent of tree growth rate is instead dominated by biotic agents of tree mortality, with potentially critical implications for forecasting future mortality. Mechanistic models of background mortality, even for healthy, rapidly growing trees, must therefore include the insects and pathogens that kill trees. Second, the biotic agents of tree mortality, instead of occurring in a few predictable combinations, may generally act opportunistically and with a relatively large degree of independence from one another

  9. Tree attenuation at 20 GHz: Foliage effects

    NASA Technical Reports Server (NTRS)

    Vogel, Wolfhard J.; Goldhirsh, Julius

    1993-01-01

    Static tree attenuation measurements at 20 GHz (K-Band) on a 30 deg slant path through a mature Pecan tree with and without leaves showed median fades exceeding approximately 23 dB and 7 dB, respectively. The corresponding 1% probability fades were 43 dB and 25 dB. Previous 1.6 GHz (L-Band) measurements for the bare tree case showed fades larger than those at K-Band by 3.4 dB for the median and smaller by approximately 7 dB at the 1% probability. While the presence of foliage had only a small effect on fading at L-Band (approximately 1 dB additional for the median to 1% probability range), the attenuation increase was significant at K-Band, where it increased by about 17 dB over the same probability range.

  10. Tree attenuation at 20 GHz: Foliage effects

    NASA Astrophysics Data System (ADS)

    Vogel, Wolfhard J.; Goldhirsh, Julius

    1993-08-01

    Static tree attenuation measurements at 20 GHz (K-Band) on a 30 deg slant path through a mature Pecan tree with and without leaves showed median fades exceeding approximately 23 dB and 7 dB, respectively. The corresponding 1% probability fades were 43 dB and 25 dB. Previous 1.6 GHz (L-Band) measurements for the bare tree case showed fades larger than those at K-Band by 3.4 dB for the median and smaller by approximately 7 dB at the 1% probability. While the presence of foliage had only a small effect on fading at L-Band (approximately 1 dB additional for the median to 1% probability range), the attenuation increase was significant at K-Band, where it increased by about 17 dB over the same probability range.

  11. Hydrocarbon emissions from twelve urban shade trees of the Los Angeles, California, Air Basin

    NASA Astrophysics Data System (ADS)

    Corchnoy, Stephanie B.; Arey, Janet; Atkinson, Roger

    The large-scale planting of shade trees in urban areas to counteract heat-island effects and to minimize energy use is currently being discussed. Among the costs to be considered in a cost/benefit analysis of such a program is the potential for additional reactive organic compounds in the atmosphere due to emissions from these trees. In this program, 15 species of potential shade trees for the Los Angeles Air Basin were studied and emission rates were determined for 11 of these trees, with one further tree (Crape myrtle) exhibiting no detectable emissions. The emission rates normalized to dry leaf weight and corrected to 30°C were (in μg g⁻¹ h⁻¹), ranked from lowest to highest emission rate: Crape myrtle, none detected; Camphor, 0.03; Aleppo pine, 0.15; Deodar cedar, 0.29; Italian Stone pine, 0.42; Monterey pine, 0.90; Brazilian pepper, 1.3; Canary Island pine, 1.7; Ginkgo, 3.0; California pepper, 3.7; Liquidambar, 37; Carrotwood, 49. In addition to the emission rates per unit biomass, the biomass per tree must be factored into any assessment of the relative merits of the various trees, since some trees have higher biomass constants than others. The present data show that there are large differences in emission rates among different tree species and this should be factored into decision-making as to which shade trees to plant. Based solely on the presently determined emission rates, the Crape myrtle and Camphor tree are good choices for large-scale planting, while the Carrotwood tree and Liquidambar are poor choices due to their high isoprene emission rates.

  12. Intermediate tree cover can maximize groundwater recharge in the seasonally dry tropics

    PubMed Central

    Ilstedt, U.; Bargués Tobella, A.; Bazié, H. R.; Bayala, J.; Verbeeten, E.; Nyberg, G.; Sanou, J.; Benegas, L.; Murdiyarso, D.; Laudon, H.; Sheil, D.; Malmer, A.

    2016-01-01

    Water scarcity contributes to the poverty of around one-third of the world’s people. Despite many benefits, tree planting in dry regions is often discouraged by concerns that trees reduce water availability. Yet relevant studies from the tropics are scarce, and the impacts of intermediate tree cover remain unexplored. We developed and tested an optimum tree cover theory in which groundwater recharge is maximized at an intermediate tree density. Below this optimal tree density the benefits from any additional trees on water percolation exceed their extra water use, leading to increased groundwater recharge, while above the optimum the opposite occurs. Our results, based on groundwater budgets calibrated with measurements of drainage and transpiration in a cultivated woodland in West Africa, demonstrate that groundwater recharge was maximised at intermediate tree densities. In contrast to the prevailing view, we therefore find that moderate tree cover can increase groundwater recharge, and that tree planting and various tree management options can improve groundwater resources. We evaluate the necessary conditions for these results to hold and suggest that they are likely to be common in the seasonally dry tropics, offering potential for widespread tree establishment and increased benefits for hundreds of millions of people. PMID:26908158

  13. Intermediate tree cover can maximize groundwater recharge in the seasonally dry tropics

    NASA Astrophysics Data System (ADS)

    Ilstedt, U.; Bargués Tobella, A.; Bazié, H. R.; Bayala, J.; Verbeeten, E.; Nyberg, G.; Sanou, J.; Benegas, L.; Murdiyarso, D.; Laudon, H.; Sheil, D.; Malmer, A.

    2016-02-01

    Water scarcity contributes to the poverty of around one-third of the world’s people. Despite many benefits, tree planting in dry regions is often discouraged by concerns that trees reduce water availability. Yet relevant studies from the tropics are scarce, and the impacts of intermediate tree cover remain unexplored. We developed and tested an optimum tree cover theory in which groundwater recharge is maximized at an intermediate tree density. Below this optimal tree density the benefits from any additional trees on water percolation exceed their extra water use, leading to increased groundwater recharge, while above the optimum the opposite occurs. Our results, based on groundwater budgets calibrated with measurements of drainage and transpiration in a cultivated woodland in West Africa, demonstrate that groundwater recharge was maximised at intermediate tree densities. In contrast to the prevailing view, we therefore find that moderate tree cover can increase groundwater recharge, and that tree planting and various tree management options can improve groundwater resources. We evaluate the necessary conditions for these results to hold and suggest that they are likely to be common in the seasonally dry tropics, offering potential for widespread tree establishment and increased benefits for hundreds of millions of people.

  14. A survival tree method for the analysis of discrete event times in clinical and epidemiological studies.

    PubMed

    Schmid, Matthias; Küchenhoff, Helmut; Hoerauf, Achim; Tutz, Gerhard

    2016-02-28

    Survival trees are a popular alternative to parametric survival modeling when there are interactions between the predictor variables or when the aim is to stratify patients into prognostic subgroups. A limitation of classical survival tree methodology is that most algorithms for tree construction are designed for continuous outcome variables. Hence, classical methods might not be appropriate if failure time data are measured on a discrete time scale (as is often the case in longitudinal studies where data are collected, e.g., quarterly or yearly). To address this issue, we develop a method for discrete survival tree construction. The proposed technique is based on the result that the likelihood of a discrete survival model is equivalent to the likelihood of a regression model for binary outcome data. Hence, we modify tree construction methods for binary outcomes such that they result in optimized partitions for the estimation of discrete hazard functions. By applying the proposed method to data from a randomized trial in patients with filarial lymphedema, we demonstrate how discrete survival trees can be used to identify clinically relevant patient groups with similar survival behavior. Copyright © 2015 John Wiley & Sons, Ltd.
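
    The equivalence mentioned in this abstract, between a discrete survival likelihood and a binary regression likelihood, is usually exploited by expanding each subject into one binary row per period at risk. The following is a minimal sketch of that expansion, not the authors' implementation. The function names and the censoring convention (a censored subject counts as at risk, without an event, through its last observed period) are assumptions made for illustration.

```python
from collections import defaultdict

def to_person_period(times, events):
    """Expand (time, event) pairs into binary person-period rows.

    times  : discrete observed times (positive integers), one per subject
    events : 1 if the event occurred at that time, 0 if censored
    Returns a list of (subject, period, y) tuples, where y = 1 only in the
    period in which the subject experienced the event.
    """
    rows = []
    for subject, (t, d) in enumerate(zip(times, events)):
        for period in range(1, t + 1):
            y = 1 if (d == 1 and period == t) else 0
            rows.append((subject, period, y))
    return rows

def hazard_by_period(rows):
    """Covariate-free estimate of the discrete hazard: events / at-risk rows."""
    at_risk = defaultdict(int)
    failed = defaultdict(int)
    for _, period, y in rows:
        at_risk[period] += 1
        failed[period] += y
    return {p: failed[p] / at_risk[p] for p in sorted(at_risk)}

rows = to_person_period(times=[1, 3, 2], events=[1, 0, 1])
hazards = hazard_by_period(rows)
```

    Any method for binary outcomes, including a classification tree, can then be fitted to the y column of the expanded rows. Each fitted probability is read as a discrete hazard for that period.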

  15. Trees in the Landscape.

    ERIC Educational Resources Information Center

    Webb, Richard; Forbatha, Ann

    1982-01-01

    Strategies for using trees in classroom instruction are provided. Includes: (1) activities (such as tree identification, mapping, measuring tree height/width); (2) list of aesthetic, architectural, engineering, climate, and wildlife functions of trees; (3) tree discussion questions; and (4) references. (JN)

  16. Inferring species trees from incongruent multi-copy gene trees using the Robinson-Foulds distance

    PubMed Central

    2013-01-01

    Background: Constructing species trees from multi-copy gene trees remains a challenging problem in phylogenetics. One difficulty is that the underlying genes can be incongruent due to evolutionary processes such as gene duplication and loss, deep coalescence, or lateral gene transfer. Gene tree estimation errors may further exacerbate the difficulties of species tree estimation. Results: We present a new approach for inferring species trees from incongruent multi-copy gene trees that is based on a generalization of the Robinson-Foulds (RF) distance measure to multi-labeled trees (mul-trees). We prove that it is NP-hard to compute the RF distance between two mul-trees; however, it is easy to calculate this distance between a mul-tree and a singly-labeled species tree. Motivated by this, we formulate the RF problem for mul-trees (MulRF) as follows: Given a collection of multi-copy gene trees, find a singly-labeled species tree that minimizes the total RF distance from the input mul-trees. We develop and implement a fast SPR-based heuristic algorithm for the NP-hard MulRF problem. We compare the performance of the MulRF method (available at http://genome.cs.iastate.edu/CBL/MulRF/) with several gene tree parsimony approaches using gene tree simulations that incorporate gene tree error, gene duplications and losses, and/or lateral transfer. The MulRF method produces more accurate species trees than gene tree parsimony approaches. We also demonstrate that the MulRF method infers in minutes a credible plant species tree from a collection of nearly 2,000 gene trees. Conclusions: Our new phylogenetic inference method, based on a generalized RF distance, makes it possible to quickly estimate species trees from large genomic data sets. Since the MulRF method, unlike gene tree parsimony, is based on a generic tree distance measure, it is appealing for analyses of genomic data sets, in which many processes such as deep coalescence, recombination, gene duplication and losses as
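
    For orientation, the classic RF distance that this record generalizes can be sketched for two singly-labeled rooted trees as the number of clades found in one tree but not the other. This is only the standard singly-labeled case, not the MulRF generalization or the authors' SPR-based heuristic. The nested-tuple tree encoding and the function names are assumptions made for illustration.

```python
def clades(tree):
    """Return the non-trivial clades of a rooted tree as frozensets of leaves.

    A tree is a nested tuple whose leaves are strings, e.g. (("A", "B"), "C").
    Single leaves and the full leaf set are excluded, since every tree on the
    same leaves shares them.
    """
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(leaves(child) for child in node))
        found.add(below)
        return below

    all_leaves = leaves(tree)
    found.discard(all_leaves)
    return found

def rf_distance(tree_a, tree_b):
    """Rooted RF distance: size of the symmetric difference of clade sets."""
    return len(clades(tree_a) ^ clades(tree_b))

t1 = ((("A", "B"), "C"), "D")
t2 = ((("A", "C"), "B"), "D")
d = rf_distance(t1, t2)
```

    In this example the trees share the clade {A, B, C} and differ in {A, B} versus {A, C}, so the distance is 2.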

  17. Looking for trees in the forest: summary tree from posterior samples.

    PubMed

    Heled, Joseph; Bouckaert, Remco R

    2013-10-04

    Bayesian phylogenetic analysis generates a set of trees which are often condensed into a single tree representing the whole set. Many methods exist for selecting a representative topology for a set of unrooted trees, few exist for assigning branch lengths to a fixed topology, and even fewer for simultaneously setting the topology and branch lengths. However, there is very little research into locating a good representative for a set of rooted time trees like the ones obtained from a BEAST analysis. We empirically compare new and known methods for generating a summary tree. Some new methods are motivated by mathematical constructions such as tree metrics, while the rest employ tree concepts which work well in practice. These use more of the posterior than existing methods, which discard information not directly mapped to the chosen topology. Using results from a large number of simulations we assess the quality of a summary tree, measuring (a) how well it explains the sequence data under the model and (b) how close it is to the "truth", i.e., to the tree used to generate the sequences. Our simulations indicate that no single method is "best". Methods producing good divergence time estimates have poor branch lengths and lower model fit, and vice versa. Using the results presented here, a user can choose the appropriate method based on the purpose of the summary tree.

  18. Dating tree mortality using log decay in the White Mountains of New Hampshire

    Treesearch

    Andrew J. Fast; Mark J. Ducey; Jeffrey H. Gove; William B. Leak

    2008-01-01

    Coarse woody material (CWM) is an important component of forest ecosystems. To meet specific CWM management objectives, it is important to understand rates of decay. We present results from a silvicultural trial at the Bartlett Experimental Forest, in which time of death is known for a large sample of trees. Either a simple table or regression equations that use...

  19. Plant-soil feedback in East-African savanna trees.

    PubMed

    Rutten, Gemma; Prati, Daniel; Hemp, Andreas; Fischer, Markus

    2016-02-01

    Research in savannas has focused on tree-grass interactions, whereas tree species coexistence received little attention. A leading hypothesis to explain tree coexistence is the Janzen-Connell model, which proposes an accumulation of host-specific enemies, e.g., soil organisms. While it has been shown in several non-savanna case studies that seedlings dispersed away from the mother perform better than seedlings that stay close (home-away effect), few studies tested whether foreign seedling species can replace own seedlings under conspecific adults (replacement effect). Some studies additionally tested for negative effects of conspecific biota (conspecific effect) to demonstrate the accumulation of enemies. We tested these effects by reciprocally growing seedlings of four tree species on soil collected beneath adults of all species, with and without applying a soil sterilization treatment. We found negative home-away effects suggesting that dispersal is advantageous and negative replacement effects suggesting species replacement under adults. While negative conspecific effects indicate accumulated enemies, positive heterospecific effects indicate an accumulation of mutualists rather than enemies for some species. We suggest that plant-soil feedbacks may well contribute to tree coexistence in savannas due to both negative conspecific and positive heterospecific feedbacks.

  20. Trees

    ERIC Educational Resources Information Center

    Al-Khaja, Nawal

    2007-01-01

    This is a thematic lesson plan for young learners about palm trees and the importance of taking care of them. The two-part lesson teaches listening, reading, and speaking skills. The lesson includes parts of a tree; the modal auxiliary "can"; dialogues; and a role-play activity.