Science.gov

Sample records for additive regression trees

  1. Subgroup finding via Bayesian additive regression trees.

    PubMed

    Sivaganesan, Siva; Müller, Peter; Huang, Bin

    2017-03-09

    We provide a Bayesian decision-theoretic approach to finding subgroups that have elevated treatment effects. Our approach separates the modeling of the response variable from the task of subgroup finding and allows a flexible modeling of the response variable irrespective of potential subgroups of interest. We use Bayesian additive regression trees to model the response variable and use a utility function defined in terms of a candidate subgroup and the predicted response for that subgroup. Subgroups are identified by maximizing the expected utility where the expectation is taken with respect to the posterior predictive distribution of the response, and the maximization is carried out over an a priori specified set of candidate subgroups. Our approach allows subgroups based on both quantitative and categorical covariates. We illustrate the approach using a simulation study and a real data set. Copyright © 2017 John Wiley & Sons, Ltd.
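
    The selection step described in this abstract can be sketched as follows. This is a minimal illustration, assuming posterior predictive draws of each subject's treatment effect are already available from a fitted model (the model fit itself is not shown). The utility function used here, the subgroup's mean effect minus a penalty for small subgroups, is hypothetical, as are all names and numbers; the abstract does not specify the authors' utility.

```python
from statistics import mean

def expected_utility(draws, members, size_penalty=0.5):
    """Average a hypothetical utility over posterior draws.

    draws:   list of posterior draws; each draw is a list of per-subject
             treatment effects (assumed to come from a fitted model).
    members: indices of the subjects in the candidate subgroup.
    """
    utilities = []
    for draw in draws:
        effect = mean(draw[i] for i in members)
        # Hypothetical utility: subgroup effect minus a penalty that
        # shrinks as the subgroup grows.
        utilities.append(effect - size_penalty / len(members))
    return mean(utilities)

def best_subgroup(draws, candidates):
    """Return the name of the candidate with maximal expected utility."""
    return max(candidates, key=lambda name: expected_utility(draws, candidates[name]))

# Toy posterior draws for 4 subjects (made-up numbers).
draws = [
    [2.0, 1.8, 0.1, 0.0],
    [2.2, 1.6, -0.1, 0.2],
]
# The candidate set is specified a priori, as in the abstract.
candidates = {"all": [0, 1, 2, 3], "first_two": [0, 1], "last_two": [2, 3]}
chosen = best_subgroup(draws, candidates)
```

    With these toy draws the subgroup containing the first two subjects is selected, because its average effect outweighs the small-subgroup penalty.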

  2. A multiple additive regression tree analysis of three exposure measures during Hurricane Katrina.

    PubMed

    Curtis, Andrew; Li, Bin; Marx, Brian D; Mills, Jacqueline W; Pine, John

    2011-01-01

    This paper analyses structural and personal exposure to Hurricane Katrina. Structural exposure is measured by flood height and building damage; personal exposure is measured by the locations of 911 calls made during the response. Using these variables, this paper characterises the geography of exposure and also demonstrates the utility of a robust analytical approach in understanding health-related challenges to disadvantaged populations during recovery. Analysis is conducted using a contemporary statistical approach, a multiple additive regression tree (MART), which displays considerable improvement over traditional regression analysis. By using MART, the percentage of improvement in R-squares over standard multiple linear regression ranges from about 62 to more than 100 per cent. The most revealing finding is the modelled verification that African Americans experienced disproportionate exposure in both structural and personal contexts. Given the impact of exposure to health outcomes, this finding has implications for understanding the long-term health challenges facing this population.
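
    To make the reported range concrete, the arithmetic can be sketched as below, assuming the improvement is measured as a relative change in R-squared (the abstract does not state the formula). The function name and the example values are hypothetical and are not taken from the paper.

```python
def pct_improvement(r2_baseline, r2_new):
    """Relative improvement of r2_new over r2_baseline, in per cent."""
    return (r2_new - r2_baseline) / r2_baseline * 100.0

# Hypothetical values: a linear model with R-squared 0.30 and a
# tree ensemble with R-squared 0.63.
example = pct_improvement(0.30, 0.63)
```

    Under this reading, an improvement of more than 100 per cent means the R-squared more than doubled.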

  3. Structural regression trees

    SciTech Connect

    Kramer, S.

    1996-12-31

    In many real-world domains the task of machine learning algorithms is to learn a theory for predicting numerical values. In particular several standard test domains used in Inductive Logic Programming (ILP) are concerned with predicting numerical values from examples and relational and mostly non-determinate background knowledge. However, so far no ILP algorithm except one can predict numbers and cope with nondeterminate background knowledge. (The only exception is a covering algorithm called FORS.) In this paper we present Structural Regression Trees (SRT), a new algorithm which can be applied to the above class of problems. SRT integrates the statistical method of regression trees into ILP. It constructs a tree containing a literal (an atomic formula or its negation) or a conjunction of literals in each node, and assigns a numerical value to each leaf. SRT provides more comprehensible results than purely statistical methods, and can be applied to a class of problems most other ILP systems cannot handle. Experiments in several real-world domains demonstrate that the approach is competitive with existing methods, indicating that the advantages are not at the expense of predictive accuracy.

  4. Additive Similarity Trees

    ERIC Educational Resources Information Center

    Sattath, Shmuel; Tversky, Amos

    1977-01-01

    Tree representations of similarity data are investigated. Hierarchical clustering is critically examined, and a more general procedure, called the additive tree, is presented. The additive tree representation is then compared to multidimensional scaling. (Author/JKS)

  5. Inferring gene regression networks with model trees

    PubMed Central

    2010-01-01

    Background: Novel strategies are required in order to handle the huge amount of data produced by microarray technologies. To infer gene regulatory networks, the first step is to find direct regulatory relationships between genes by building so-called gene co-expression networks. They are typically generated using correlation statistics as pairwise similarity measures. Correlation-based methods are very useful for determining whether two genes have a strong global similarity but do not detect local similarities. Results: We propose model trees as a method to identify gene interaction networks. While correlation-based methods analyze each pair of genes, in our approach we generate a single regression tree for each gene from the remaining genes. Finally, a graph of all the relationships among output and input genes is built, taking into account whether each pair of genes is statistically significant. For this reason we apply a statistical procedure to control the false discovery rate. The performance of our approach, named REGNET, is experimentally tested on two well-known data sets: the Saccharomyces cerevisiae and E. coli data sets. First, the biological coherence of the results is tested. Second, the E. coli transcriptional network (in the Regulon database) is used as a control to compare the results to those of a correlation-based method. This experiment shows that REGNET performs more accurately at detecting true gene associations than the Pearson and Spearman zeroth and first-order correlation-based methods. Conclusions: REGNET generates gene association networks from gene expression data, and differs from correlation-based methods in that the relationship between one gene and others is calculated simultaneously. Model trees are very useful techniques to estimate the numerical values for the target genes by linear regression functions. They are very often more precise than linear regression models because they can add just different linear regressions to separate
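
    The abstract does not name the procedure used to control the false discovery rate. As one common possibility, the Benjamini-Hochberg step-up procedure is sketched below; its use here is an assumption, and the p-values (one per candidate gene pair) are made up for illustration.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return booleans: True where the hypothesis is rejected at FDR level alpha."""
    m = len(p_values)
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected

# Hypothetical p-values for five gene pairs.
p = [0.001, 0.045, 0.025, 0.20, 0.008]
keep = benjamini_hochberg(p, alpha=0.05)
```

    Only the pairs flagged True would be kept as edges in the resulting graph under this assumed procedure.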

  6. Boosted Regression Tree Models to Explain Watershed ...

    EPA Pesticide Factsheets

    Boosted regression tree (BRT) models were developed to quantify the nonlinear relationships between landscape variables and nutrient concentrations in a mesoscale mixed land cover watershed during base-flow conditions. Factors that affect instream biological components, based on the Index of Biotic Integrity (IBI), were also analyzed. Seasonal BRT models at two spatial scales (watershed and riparian buffered area [RBA]) for nitrite-nitrate (NO2-NO3), total Kjeldahl nitrogen, and total phosphorus (TP) and annual models for the IBI score were developed. Two primary factors — location within the watershed (i.e., geographic position, stream order, and distance to a downstream confluence) and percentage of urban land cover (both scales) — emerged as important predictor variables. Latitude and longitude interacted with other factors to explain the variability in summer NO2-NO3 concentrations and IBI scores. BRT results also suggested that location might be associated with indicators of sources (e.g., land cover), runoff potential (e.g., soil and topographic factors), and processes not easily represented by spatial data indicators. Runoff indicators (e.g., Hydrological Soil Group D and Topographic Wetness Indices) explained a substantial portion of the variability in nutrient concentrations as did point sources for TP in the summer months. The results from our BRT approach can help prioritize areas for nutrient management in mixed-use and heavily impacted watershed

  7. Scalable Regression Tree Learning on Hadoop using OpenPlanet

    SciTech Connect

    Yin, Wei; Simmhan, Yogesh; Prasanna, Viktor

    2012-06-18

    As scientific and engineering domains attempt to effectively analyze the deluge of data arriving from sensors and instruments, machine learning is becoming a key data mining tool to build prediction models. The regression tree is a popular learning model that combines decision trees and linear regression to forecast numerical target variables based on a set of input features. MapReduce is well suited for addressing such data-intensive learning applications, and a proprietary regression tree algorithm, PLANET, using MapReduce has been proposed earlier. In this paper, we describe an open source implementation of this algorithm, OpenPlanet, on the Hadoop framework using a hybrid approach. Further, we evaluate the performance of OpenPlanet using real-world datasets from the Smart Power Grid domain to perform energy use forecasting, and propose tuning strategies of Hadoop parameters to improve the performance of the default configuration by 75% for a training dataset of 17 million tuples on a 64-core Hadoop cluster on FutureGrid.
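
    One reason regression tree training fits the MapReduce model is that the statistics needed to score a candidate split are additive across data partitions. The sketch below illustrates that idea on a single machine. It is not the PLANET or OpenPlanet code; all function names and data are hypothetical.

```python
def split_stats(rows, threshold):
    """Map step: summarize one data partition for a candidate split.

    rows: list of (x, y) pairs. Returns [n, sum_y, sum_y2] for the left
    side (x <= threshold) and for the right side.
    """
    left, right = [0, 0.0, 0.0], [0, 0.0, 0.0]
    for x, y in rows:
        side = left if x <= threshold else right
        side[0] += 1
        side[1] += y
        side[2] += y * y
    return left, right

def merge(a, b):
    """Reduce step: the statistics are additive across partitions."""
    return [u + v for u, v in zip(a, b)]

def sse(stats):
    """Sum of squared errors around the mean, from [n, sum_y, sum_y2]."""
    n, s, s2 = stats
    return s2 - s * s / n if n else 0.0

def best_threshold(partitions, thresholds):
    """Pick the threshold that minimizes the total SSE of the two children."""
    best = None
    for t in thresholds:
        left, right = [0, 0.0, 0.0], [0, 0.0, 0.0]
        for part in partitions:
            l, r = split_stats(part, t)
            left, right = merge(left, l), merge(right, r)
        score = sse(left) + sse(right)
        if best is None or score < best[1]:
            best = (t, score)
    return best

# Two toy partitions, as if held by two different workers.
partitions = [[(1, 10.0), (2, 11.0)], [(3, 30.0), (4, 31.0)]]
threshold, score = best_threshold(partitions, thresholds=[1.5, 2.5, 3.5])
```

    Because only three numbers per side are exchanged for each candidate split, the raw tuples never need to leave their partition in this sketch.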

  8. Comparative study of biodegradability prediction of chemicals using decision trees, functional trees, and logistic regression.

    PubMed

    Chen, Guangchao; Li, Xuehua; Chen, Jingwen; Zhang, Ya-Nan; Peijnenburg, Willie J G M

    2014-12-01

    Biodegradation is the principal environmental dissipation process of chemicals. As such, it is a dominant factor determining the persistence and fate of organic chemicals in the environment, and is therefore of critical importance to chemical management and regulation. In the present study, the authors developed in silico methods assessing biodegradability based on a large heterogeneous set of 825 organic compounds, using the techniques of the C4.5 decision tree, the functional inner regression tree, and logistic regression. External validation was subsequently carried out by 2 independent test sets of 777 and 27 chemicals. As a result, the functional inner regression tree exhibited the best predictability with predictive accuracies of 81.5% and 81.0%, respectively, on the training set (825 chemicals) and test set I (777 chemicals). Performance of the developed models on the 2 test sets was subsequently compared with that of the Estimation Program Interface (EPI) Suite Biowin 5 and Biowin 6 models, which also showed a better predictability of the functional inner regression tree model. The model built in the present study exhibits a reasonable predictability compared with existing models while possessing a transparent algorithm. Interpretation of the mechanisms of biodegradation was also carried out based on the models developed.

  9. Capacitance Regression Modelling Analysis on Latex from Selected Rubber Tree Clones

    NASA Astrophysics Data System (ADS)

    Rosli, A. D.; Hashim, H.; Khairuzzaman, N. A.; Mohd Sampian, A. F.; Baharudin, R.; Abdullah, N. E.; Sulaiman, M. S.; Kamaru'zzaman, M.

    2015-11-01

    This paper investigates the capacitance regression modelling performance of latex for various rubber tree clones, namely clones 2002, 2008, 2014 and 3001. Conventionally, rubber tree clone identification is based on observation of tree features such as leaf shape, trunk, branching habit and seed texture pattern. This conventional method requires experts and is very time-consuming. Currently, there is no sensing device based on electrical properties that can be employed to distinguish different clones from latex samples. Hence, with a hypothesis that the dielectric constant of each clone varies, this paper discusses the development of a capacitance sensor based on a Capacitance Comparison Bridge (referred to as the capacitance sensor) to measure the output voltage of different latex samples. The proposed sensor is initially tested with 30 ml of latex sample prior to the gradual addition of dilution water. The output voltage and capacitance obtained from the test are recorded and analyzed using a Simple Linear Regression (SLR) model. The outcome of this work indicates that latex clone 2002 produced the most reliable linear regression line, with the highest determination coefficient of 91.24%. In addition, the study also found that the capacitive elements in latex samples deteriorate when the samples are diluted with a higher volume of water.
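
    For readers unfamiliar with the determination coefficient reported here, a minimal sketch of a simple linear regression fit and its R-squared follows. The data points are invented for illustration and are not measurements from the study.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x, plus R-squared."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared: share of the variance in y explained by the fitted line.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared

# Hypothetical readings: x could be capacitance, y output voltage.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]
slope, intercept, r_squared = fit_line(xs, ys)
```

    A determination coefficient of 91.24% corresponds to an R-squared of 0.9124 on this scale.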

  10. Comparing Methodologies for Developing an Early Warning System: Classification and Regression Tree Model versus Logistic Regression. REL 2015-077

    ERIC Educational Resources Information Center

    Koon, Sharon; Petscher, Yaacov

    2015-01-01

    The purpose of this report was to explicate the use of logistic regression and classification and regression tree (CART) analysis in the development of early warning systems. It was motivated by state education leaders' interest in maintaining high classification accuracy while simultaneously improving practitioner understanding of the rules by…

  11. The process and utility of classification and regression tree methodology in nursing research

    PubMed Central

    Kuhn, Lisa; Page, Karen; Ward, John; Worrall-Carter, Linda

    2014-01-01

    Aim: This paper presents a discussion of classification and regression tree analysis and its utility in nursing research. Background: Classification and regression tree analysis is an exploratory research method used to illustrate associations between variables not suited to traditional regression analysis. Complex interactions are demonstrated between covariates and variables of interest in inverted tree diagrams. Design: Discussion paper. Data sources: English language literature was sourced from eBooks, Medline Complete and CINAHL Plus databases, Google and Google Scholar, hard copy research texts and retrieved reference lists for terms including classification and regression tree* and derivatives and recursive partitioning from 1984–2013. Discussion: Classification and regression tree analysis is an important method used to identify previously unknown patterns amongst data. Whilst there are several reasons to embrace this method as a means of exploratory quantitative research, issues regarding quality of data as well as the usefulness and validity of the findings should be considered. Implications for Nursing Research: Classification and regression tree analysis is a valuable tool to guide nurses to reduce gaps in the application of evidence to practice. With the ever-expanding availability of data, it is important that nurses understand the utility and limitations of the research method. Conclusion: Classification and regression tree analysis is an easily interpreted method for modelling interactions between health-related variables that would otherwise remain obscured. Knowledge is presented graphically, providing insightful understanding of complex and hierarchical relationships in an accessible and useful way to nursing and other health professions. PMID:24237048

  12. Response of the regression tree model to high resolution remote sensing data for predicting percent tree cover in a Mediterranean ecosystem.

    PubMed

    Donmez, Cenk; Berberoglu, Suha; Erdogan, Mehmet Akif; Tanriover, Anil Akin; Cilek, Ahmet

    2015-02-01

    Percent tree cover is the percentage of the ground surface area covered by a vertical projection of the outermost perimeter of the plants. It is an important indicator of the condition of forest systems and is of significant importance for ecosystem models as a main input. The aim of this study is to estimate the percent tree cover of various forest stands in a Mediterranean environment based on an empirical relationship between tree coverage and remotely sensed data in the Goksu Watershed, located on the Eastern Mediterranean coast of Turkey. A regression tree algorithm was used to simulate spatial fractions of Pinus nigra, Cedrus libani, Pinus brutia, Juniperus excelsa and Quercus cerris using multi-temporal LANDSAT TM/ETM data as predictor variables and land cover information. Two scenes of high resolution GeoEye-1 images were employed for training and testing the model. The predictor variables were incorporated in addition to biophysical variables estimated from the LANDSAT TM/ETM data. Additionally, the normalised difference vegetation index (NDVI) was incorporated into the LANDSAT TM/ETM band settings as a biophysical variable. Stepwise linear regression (SLR) was applied to select the relevant bands to employ in the regression tree process. SLR-selected variables produced accurate results in the model with a high correlation coefficient of 0.80. The output values ranged from 0 to 100%. The different tree species were mapped at 30 m resolution with respect to elevation. The percent tree cover map as a final output was derived using the LANDSAT TM/ETM image over the Goksu Watershed and the biophysical variables. The results were tested using high spatial resolution GeoEye-1 images. Thus, the combination of the RT algorithm and higher resolution data for percent tree cover mapping was tested and examined in a complex Mediterranean environment.

  13. Boosted Regression Tree Models to Explain Watershed Nutrient Concentrations and Biological Condition

    EPA Science Inventory

    Boosted regression tree (BRT) models were developed to quantify the nonlinear relationships between landscape variables and nutrient concentrations in a mesoscale mixed land cover watershed during base-flow conditions. Factors that affect instream biological components, based on ...

  14. The identification of complex interactions in epidemiology and toxicology: a simulation study of boosted regression trees

    PubMed Central

    2014-01-01

    Background: There is a need to evaluate complex interaction effects on human health, such as those induced by mixtures of environmental contaminants. The usual approach is to formulate an additive statistical model and check for departures using product terms between the variables of interest. In this paper, we present an approach to search for interaction effects among several variables using boosted regression trees. Methods: We simulate a continuous outcome from real data on 27 environmental contaminants, some of which are correlated, and test the method’s ability to uncover the simulated interactions. The simulated outcome contains one four-way interaction, one non-linear effect and one interaction between a continuous variable and a binary variable. Four scenarios reflecting different strengths of association are simulated. We illustrate the method using real data. Results: The method succeeded in identifying the true interactions in all scenarios except where the association was weakest. Some spurious interactions were also found, however. The method was also capable of identifying interactions in the real data set. Conclusions: We conclude that boosted regression trees can be used to uncover complex interaction effects in epidemiological studies. PMID:24993424

  15. Application of Boosting Regression Trees to Preliminary Cost Estimation in Building Construction Projects.

    PubMed

    Shin, Yoonseok

    2015-01-01

    Among the recent data mining techniques available, the boosting approach has attracted a great deal of attention because of its effective learning algorithm and strong boundaries in terms of its generalization performance. However, although it has been actively utilized in other domains, the boosting approach has yet to be used in regression problems within the construction domain, including cost estimations. Therefore, a boosting regression tree (BRT) is applied to cost estimations at the early stage of a construction project to examine the applicability of the boosting approach to a regression problem within the construction domain. To evaluate the BRT model, its performance was compared with that of a neural network (NN) model, which has been proven to have a high performance in cost estimation domains. The BRT model has shown results similar to those of the NN model using 234 actual cost datasets of a building construction project. In addition, the BRT model can provide additional information such as the importance plot and structure model, which can support estimators in comprehending the decision making process. Consequently, the boosting approach has potential applicability in preliminary cost estimations in a building construction project.
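
    The general idea of boosting regression trees can be sketched as below: each round fits a small tree to the current residuals and adds a damped copy of it to the model. This is a generic squared-error illustration using one-split trees (stumps) on a single feature, with invented data. It is not the model or software used in the study, and all names and parameter values are assumptions.

```python
def fit_stump(xs, residuals):
    """Best one-split regression tree on a single feature, by squared error."""
    best = None
    # Candidate thresholds: every distinct x except the largest,
    # so that both children are non-empty.
    for t in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        err = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or err < best[0]:
            best = (err, t, left_mean, right_mean)
    _, t, left_mean, right_mean = best
    return lambda x: left_mean if x <= t else right_mean

def boost(xs, ys, rounds=50, learning_rate=0.1):
    """Fit an additive model of stumps to the residuals, round by round."""
    base = sum(ys) / len(ys)
    stumps = []
    preds = [base] * len(ys)
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump(x) for x, p in zip(xs, preds)]
    return lambda x: base + learning_rate * sum(s(x) for s in stumps)

# Invented data with a jump between x = 3 and x = 4.
xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.2, 0.9, 3.1, 2.9, 3.0]
model = boost(xs, ys)
mean_y = sum(ys) / len(ys)
baseline_mse = sum((y - mean_y) ** 2 for y in ys) / len(ys)
mse = sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
```

    The learning rate damps each tree's contribution, so many small trees together approximate the target; the training error of the boosted model here falls well below that of predicting the mean.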

  16. A hybrid approach of stepwise regression, logistic regression, support vector machine, and decision tree for forecasting fraudulent financial statements.

    PubMed

    Chen, Suduan; Goo, Yeong-Jia James; Shen, Zone-De

    2014-01-01

    As fraudulent financial statements by enterprises become an increasingly serious problem, establishing a valid model for forecasting fraudulent financial statements has become an important question for academic research and financial practice. After screening the important variables using stepwise regression, the study applies logistic regression, support vector machine, and decision tree methods to construct classification models and compares them. The study adopts financial and nonfinancial variables to assist in establishing the forecasting model. The research subjects are companies with fraudulent and nonfraudulent financial statements between 1998 and 2012. The findings are that financial and nonfinancial information can be used effectively to distinguish fraudulent financial statements, and that the C5.0 decision tree has the best classification performance, at 85.71%.

  17. A Hybrid Approach of Stepwise Regression, Logistic Regression, Support Vector Machine, and Decision Tree for Forecasting Fraudulent Financial Statements

    PubMed Central

    Goo, Yeong-Jia James; Shen, Zone-De

    2014-01-01

    As fraudulent financial statements by enterprises become an increasingly serious problem, establishing a valid model for forecasting fraudulent financial statements has become an important question for academic research and financial practice. After screening the important variables using stepwise regression, the study applies logistic regression, support vector machine, and decision tree methods to construct classification models and compares them. The study adopts financial and nonfinancial variables to assist in establishing the forecasting model. The research subjects are companies with fraudulent and nonfraudulent financial statements between 1998 and 2012. The findings are that financial and nonfinancial information can be used effectively to distinguish fraudulent financial statements, and that the C5.0 decision tree has the best classification performance, at 85.71%. PMID:25302338

  18. Methods for estimating population density in data-limited areas: evaluating regression and tree-based models in Peru.

    PubMed

    Anderson, Weston; Guikema, Seth; Zaitchik, Ben; Pan, William

    2014-01-01

    Obtaining accurate small area estimates of population is essential for policy and health planning but is often difficult in countries with limited data. In lieu of available population data, small area estimate models draw information from previous time periods or from similar areas. This study focuses on model-based methods for estimating population when no direct samples are available in the area of interest. To explore the efficacy of tree-based models for estimating population density, we compare six different model structures including Random Forest and Bayesian Additive Regression Trees. Results demonstrate that without information from prior time periods, non-parametric tree-based models produced more accurate predictions than did conventional regression methods. Improving estimates of population density in non-sampled areas is important for regions with incomplete census data and has implications for economic, health and development policies.

  19. Methods for Estimating Population Density in Data-Limited Areas: Evaluating Regression and Tree-Based Models in Peru

    PubMed Central

    Anderson, Weston; Guikema, Seth; Zaitchik, Ben; Pan, William

    2014-01-01

    Obtaining accurate small area estimates of population is essential for policy and health planning but is often difficult in countries with limited data. In lieu of available population data, small area estimate models draw information from previous time periods or from similar areas. This study focuses on model-based methods for estimating population when no direct samples are available in the area of interest. To explore the efficacy of tree-based models for estimating population density, we compare six different model structures including Random Forest and Bayesian Additive Regression Trees. Results demonstrate that without information from prior time periods, non-parametric tree-based models produced more accurate predictions than did conventional regression methods. Improving estimates of population density in non-sampled areas is important for regions with incomplete census data and has implications for economic, health and development policies. PMID:24992657

  20. Evaluating multimedia chemical persistence: Classification and regression tree analysis

    SciTech Connect

    Bennett, D.H.; McKone, T.E.; Kastenberg, W.E.

    2000-04-01

    For the thousands of chemicals continuously released into the environment, it is desirable to make prospective assessments of those likely to be persistent. Widely distributed persistent chemicals are impossible to remove from the environment and remediation by natural processes may take decades, which is problematic if adverse health or ecological effects are discovered after prolonged release into the environment. A tiered approach using a classification scheme and a multimedia model for determining persistence is presented. Using specific criteria for persistence, a classification tree is developed to classify a chemical as persistent or nonpersistent based on the chemical properties. In this approach, the classification is derived from the results of a standardized unit world multimedia model. Thus, the classifications are more robust for multimedia pollutants than classifications using a single medium half-life. The method can be readily implemented and provides insight without requiring extensive and often unavailable data. This method can be used to classify chemicals when only a few properties are known and can be used to direct further data collection. Case studies are presented to demonstrate the advantages of the approach.

  1. Short Term Forecasts of Volcanic Activity Using An Event Tree Analysis System and Logistic Regression

    NASA Astrophysics Data System (ADS)

    Junek, W. N.; Jones, W. L.; Woods, M. T.

    2011-12-01

    An automated event tree analysis system for estimating the probability of short term volcanic activity is presented. The algorithm is driven by a suite of empirical statistical models that are derived through logistic regression. Each model is constructed from a multidisciplinary dataset that was assembled from a collection of historic volcanic unrest episodes. The dataset consists of monitoring measurements (e.g. InSAR, seismic), source modeling results, and historic eruption activity. This provides a simple mechanism for simultaneously accounting for the geophysical changes occurring within the volcano and the historic behavior of analog volcanoes. The algorithm is extensible and can be easily recalibrated to include new or additional monitoring, modeling, or historic information. Standard cross validation techniques are employed to optimize its forecasting capabilities. Analysis results from several recent volcanic unrest episodes are presented.

  2. Factors Influencing Drug Injection History among Prisoners: A Comparison between Classification and Regression Trees and Logistic Regression Analysis

    PubMed Central

    Rastegari, Azam; Haghdoost, Ali Akbar; Baneshi, Mohammad Reza

    2013-01-01

    Background: Due to the importance of medical studies, researchers in this field should be familiar with various types of statistical analyses to select the most appropriate method based on the characteristics of their data sets. Classification and regression trees (CARTs) can be complementary to regression models. We compared the performance of a logistic regression model and a CART in predicting drug injection among prisoners. Methods: Data on 2720 Iranian prisoners were studied to determine the factors influencing drug injection. The collected data were divided into two groups, training and testing. A logistic regression model and a CART were applied to the training data. The performance of the two models was then evaluated on the testing data. Findings: The regression model and the CART had 8 and 4 significant variables, respectively. Overall, heroin use, history of imprisonment, age at first drug use, and marital status were important factors in determining the history of drug injection. Subjects without a history of heroin use and heroin users with short-term imprisonment were at lower risk of drug injection. Among heroin addicts with long-term imprisonment, individuals with higher age at first drug use and married subjects were at lower risk of drug injection. Although the logistic regression model was more sensitive than the CART, the two models had the same levels of specificity and classification accuracy. Conclusion: In this study, both sensitivity and specificity were important. While the logistic regression model had better performance, the graphical presentation of the CART simplifies the interpretation of the results. In general, a combination of different analytical methods is recommended to explore the effects of variables. PMID:24494152
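
    The three measures used to compare the models in this abstract follow the standard definitions, which can be computed as in the sketch below. The labels are invented for illustration (1 for a positive case, 0 for a negative one) and do not come from the study.

```python
def classification_metrics(actual, predicted):
    """Sensitivity, specificity and accuracy for binary labels (1 = positive)."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # true positives
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # false negatives
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # true negatives
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # false positives
    return {
        "sensitivity": tp / (tp + fn),   # share of positives detected
        "specificity": tn / (tn + fp),   # share of negatives detected
        "accuracy": (tp + tn) / len(pairs),
    }

# Hypothetical held-out labels and model predictions.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
metrics = classification_metrics(actual, predicted)
```

    Two models can share the same specificity and accuracy while differing in sensitivity when the number of positive and negative cases differs and their errors fall on different classes.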

  3. Data mining in psychological treatment research: a primer on classification and regression trees.

    PubMed

    King, Matthew W; Resick, Patricia A

    2014-10-01

    Data mining of treatment study results can reveal unforeseen but critical insights, such as who receives the most benefit from treatment and under what circumstances. The usefulness and legitimacy of exploratory data analysis have received relatively little recognition, however, and analytic methods well suited to the task are not widely known in psychology. With roots in computer science and statistics, statistical learning approaches offer a credible option: These methods take a more inductive approach to building a model than is done in traditional regression, allowing the data a greater role in suggesting the correct relationships between variables rather than imposing them a priori. Classification and regression trees are presented as a powerful, flexible exemplar of statistical learning methods. Trees allow researchers to efficiently identify useful predictors of an outcome and discover interactions between predictors without the need to anticipate and specify these in advance, making them ideal for revealing patterns that inform hypotheses about treatment effects. Trees can also provide a predictive model for forecasting outcomes as an aid to clinical decision making. This primer describes how tree models are constructed, how the results are interpreted and evaluated, and how trees overcome some of the complexities of traditional regression. Examples are drawn from randomized clinical trial data and highlight some interpretations of particular interest to treatment researchers. The limitations of tree models are discussed, and suggestions for further reading and choices in software are offered.

  4. Reconstructing missing daily precipitation data using regression trees and artificial neural networks

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Missing meteorological data have to be estimated for agricultural and environmental modeling. The objective of this work was to develop a technique to reconstruct the missing daily precipitation data in the central part of the Chesapeake Bay Watershed using regression trees (RT) and artificial neura...

  5. Reconstructing missing daily precipitation data using regression trees and artificial neural networks

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Incomplete meteorological data has been a problem in environmental modeling studies. The objective of this work was to develop a technique to reconstruct missing daily precipitation data in the central part of Chesapeake Bay Watershed using regression trees (RT) and artificial neural networks (ANN)....

  6. Risk Factors of Falls in Community-Dwelling Older Adults: Logistic Regression Tree Analysis

    ERIC Educational Resources Information Center

    Yamashita, Takashi; Noe, Douglas A.; Bailer, A. John

    2012-01-01

    Purpose of the Study: A novel logistic regression tree-based method was applied to identify fall risk factors and possible interaction effects of those risk factors. Design and Methods: A nationally representative sample of American older adults aged 65 years and older (N = 9,592) in the Health and Retirement Study 2004 and 2006 modules was used.…

  7. What Satisfies Students?: Mining Student-Opinion Data with Regression and Decision Tree Analysis

    ERIC Educational Resources Information Center

    Thomas, Emily H.; Galambos, Nora

    2004-01-01

    To investigate how students' characteristics and experiences affect satisfaction, this study uses regression and decision tree analysis with the CHAID algorithm to analyze student-opinion data. A data mining approach identifies the specific aspects of students' university experience that most influence three measures of general satisfaction. The…

  8. Using the PDD Behavior Inventory as a Level 2 Screener: A Classification and Regression Trees Analysis

    ERIC Educational Resources Information Center

    Cohen, Ira L.; Liu, Xudong; Hudson, Melissa; Gillis, Jennifer; Cavalari, Rachel N. S.; Romanczyk, Raymond G.; Karmel, Bernard Z.; Gardner, Judith M.

    2016-01-01

    In order to improve discrimination accuracy between Autism Spectrum Disorder (ASD) and similar neurodevelopmental disorders, a data mining procedure, Classification and Regression Trees (CART), was used on a large multi-site sample of PDD Behavior Inventory (PDDBI) forms on children with and without ASD. Discrimination accuracy exceeded 80%,…

  9. Predicting the limits to tree height using statistical regressions of leaf traits.

    PubMed

    Burgess, Stephen S O; Dawson, Todd E

    2007-01-01

    Leaf morphology and physiological functioning demonstrate considerable plasticity within tree crowns, with various leaf traits often exhibiting pronounced vertical gradients in very tall trees. It has been proposed that the trajectory of these gradients, as determined by regression methods, could be used in conjunction with theoretical biophysical limits to estimate the maximum height to which trees can grow. Here, we examined this approach using published and new experimental data from tall conifer and angiosperm species. We showed that height predictions were sensitive to tree-to-tree variation in the shape of the regression and to the biophysical endpoints selected. We examined the suitability of proposed end-points and their theoretical validity. We also noted that site and environment influenced height predictions considerably. Use of leaf mass per unit area or leaf water potential coupled with vulnerability of twigs to cavitation poses a number of difficulties for predicting tree height. Photosynthetic rate and carbon isotope discrimination show more promise, but in the second case, the complex relationship between light, water availability, photosynthetic capacity and internal conductance to CO(2) must first be characterized.

  10. Nitrogen Addition Enhances Drought Sensitivity of Young Deciduous Tree Species

    PubMed Central

    Dziedek, Christoph; Härdtle, Werner; von Oheimb, Goddert; Fichtner, Andreas

    2016-01-01

    Understanding how trees respond to global change drivers is central to predicting changes in forest structure and functions. Although there is evidence on the mode of nitrogen (N) and drought (D) effects on tree growth, our understanding of the interplay of these factors is still limited. Simultaneously, as mixtures are expected to be less sensitive to global change than monocultures, we aimed to investigate the combined effects of N addition and D on the productivity of three tree species (Fagus sylvatica, Quercus petraea, Pseudotsuga menziesii) in relation to functionally diverse species mixtures using data from a 4-year field experiment in Northwest Germany. Here we show that species mixing can mitigate the negative effects of combined N fertilization and D events, but the community response is mainly driven by the combination of certain traits rather than the tree species richness of a community. For beech, we found that negative effects of D on growth rates were amplified by N fertilization (i.e., combined treatment effects were non-additive), while for oak and fir, the simultaneous effects of N and D were additive. Beech and oak were identified as most sensitive to combined N+D effects with a strong size-dependency observed for beech, suggesting that the negative impact of N+D becomes stronger with time as beech grows larger. As a consequence, the net biodiversity effect declined at the community level, which can be mainly assigned to a distinct loss of complementarity in beech-oak mixtures. This pattern, however, was not evident in the other species-mixtures, indicating that neighborhood composition (i.e., trait combination), but not tree species richness, mediated the relationship between tree diversity and treatment effects on tree growth. Our findings point to the importance of the qualitative role (‘trait portfolio’) that biodiversity plays in determining resistance of diverse tree communities to environmental changes. As such, they provide

  11. GIS-based groundwater potential mapping using boosted regression tree, classification and regression tree, and random forest machine learning models in Iran.

    PubMed

    Naghibi, Seyed Amir; Pourghasemi, Hamid Reza; Dixon, Barnali

    2016-01-01

    Groundwater is considered one of the most valuable fresh water resources. The main objective of this study was to produce groundwater spring potential maps in the Koohrang Watershed, Chaharmahal-e-Bakhtiari Province, Iran, using three machine learning models: boosted regression tree (BRT), classification and regression tree (CART), and random forest (RF). Thirteen hydrological-geological-physiographical (HGP) factors that influence locations of springs were considered in this research. These factors include slope degree, slope aspect, altitude, topographic wetness index (TWI), slope length (LS), plan curvature, profile curvature, distance to rivers, distance to faults, lithology, land use, drainage density, and fault density. Subsequently, groundwater spring potential was modeled and mapped using CART, RF, and BRT algorithms. The predicted results from the three models were validated using the receiver operating characteristics curve (ROC). From 864 springs identified, 605 (≈70 %) locations were used for the spring potential mapping, while the remaining 259 (≈30 %) springs were used for the model validation. The area under the curve (AUC) for the BRT model was calculated as 0.8103, and for CART and RF the AUCs were 0.7870 and 0.7119, respectively. Therefore, it was concluded that the BRT model produced the best results for predicting spring locations, followed by the CART and RF models, respectively. Geospatially integrated BRT, CART, and RF methods proved to be useful in generating the spring potential map (SPM) with reasonable accuracy.
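
    The validation step described above can be sketched in a few lines. The split sizes come from the abstract; everything else is an assumption for illustration: the scores are made up, the `auc` helper is hypothetical, and the abstract does not say how non-spring locations were chosen for the ROC analysis. AUC is computed here as the fraction of (spring, non-spring) pairs in which the spring receives the higher score, with ties counted as one half.

```python
def auc(scores_pos, scores_neg):
    """Probability that a random positive outscores a random negative.

    Ties contribute 0.5. This pairwise form is equivalent to the area under
    the ROC curve, but it is quadratic in sample size, so it suits small examples.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Split sizes reported in the abstract.
n_total, n_train, n_valid = 864, 605, 259
train_share = n_train / n_total

# Hypothetical model scores for validation locations (not from the study).
spring_scores = [0.9, 0.8, 0.6, 0.4]
non_spring_scores = [0.7, 0.5, 0.3, 0.2]
example_auc = auc(spring_scores, non_spring_scores)
print(round(train_share, 3), example_auc)
```

    An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the reported values of 0.8103, 0.7870, and 0.7119 should be read.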

  12. Hyperspectral analysis of soil nitrogen, carbon, carbonate, and organic matter using regression trees.

    PubMed

    Gmur, Stephan; Vogt, Daniel; Zabowski, Darlene; Moskal, L Monika

    2012-01-01

    The characterization of soil attributes using hyperspectral sensors has revealed patterns in soil spectra that are known to respond to mineral composition, organic matter, soil moisture and particle size distribution. Soil samples from different soil horizons of replicated soil series from sites located within Washington and Oregon were analyzed with the FieldSpec Spectroradiometer to measure their spectral signatures across the electromagnetic range of 400 to 1,000 nm. Similarity rankings of individual soil samples reveal differences between replicate series as well as samples within the same replicate series. Using classification and regression tree statistical methods, regression trees were fitted to each spectral response using concentrations of nitrogen, carbon, carbonate and organic matter as the response variables. Statistics resulting from fitted trees were: nitrogen R(2) 0.91 (p < 0.01) at 403, 470, 687, and 846 nm spectral band widths, carbonate R(2) 0.95 (p < 0.01) at 531 and 898 nm band widths, total carbon R(2) 0.93 (p < 0.01) at 400, 409, 441 and 907 nm band widths, and organic matter R(2) 0.98 (p < 0.01) at 300, 400, 441, 832 and 907 nm band widths. Use of the 400 to 1,000 nm electromagnetic range utilizing regression trees provided a powerful, rapid and inexpensive method for assessing nitrogen, carbon, carbonate and organic matter for upper soil horizons in a nondestructive method.

  13. Regression Trees Identify Relevant Interactions: Can This Improve the Predictive Performance of Risk Adjustment?

    PubMed

    Buchner, Florian; Wasem, Jürgen; Schillo, Sonja

    2017-01-01

    Risk equalization formulas have been refined since their introduction about two decades ago. Because of the complexity and the abundance of possible interactions between the variables used, hardly any interactions are considered. A regression tree is used to systematically search for interactions, a methodologically new approach in risk equalization. Analyses are based on a data set of nearly 2.9 million individuals from a major German social health insurer. A two-step approach is applied: In the first step a regression tree is built on the basis of the learning data set. Terminal nodes characterized by more than one morbidity-group-split represent interaction effects of different morbidity groups. In the second step the 'traditional' weighted least squares regression equation is expanded by adding interaction terms for all interactions detected by the tree, and regression coefficients are recalculated. The resulting risk adjustment formula shows an improvement in the adjusted R(2) from 25.43% to 25.81% on the evaluation data set. Predictive ratios are calculated for subgroups affected by the interactions. The R(2) improvement detected is only marginal. According to the sample-level performance measures used, omitting a considerable number of morbidity interactions causes no relevant loss in accuracy. Copyright © 2015 John Wiley & Sons, Ltd.

  14. Factor complexity of crash occurrence: An empirical demonstration using boosted regression trees.

    PubMed

    Chung, Yi-Shih

    2013-12-01

    Factor complexity is a characteristic of traffic crashes. This paper proposes a novel method, namely boosted regression trees (BRT), to investigate the complex and nonlinear relationships in high-variance traffic crash data. The Taiwanese 2004-2005 single-vehicle motorcycle crash data are used to demonstrate the utility of BRT. Traditional logistic regression and classification and regression tree (CART) models are also used to compare their estimation results and external validities. Both the in-sample cross-validation and out-of-sample validation results show that an increase in tree complexity provides improved, although declining, classification performance, indicating a limited factor complexity of single-vehicle motorcycle crashes. The effects of crucial variables including geographical, time, and sociodemographic factors explain some fatal crashes. Relatively unique fatal crashes are better approximated by interactive terms, especially combinations of behavioral factors. BRT models generally provide better transferability than conventional logistic regression and CART models. This study also discusses the implications of the results for devising safety policies.

  15. Prioritizing Highway Safety Manual's crash prediction variables using boosted regression trees.

    PubMed

    Saha, Dibakar; Alluri, Priyanka; Gan, Albert

    2015-06-01

    The Highway Safety Manual (HSM) recommends using the empirical Bayes (EB) method with locally derived calibration factors to predict an agency's safety performance. However, the data needs for deriving these local calibration factors are significant, requiring very detailed roadway characteristics information. Many of the data variables identified in the HSM are currently unavailable in the states' databases. Moreover, the process of collecting and maintaining all the HSM data variables is cost-prohibitive. Prioritization of the variables based on their impact on crash predictions would, therefore, help to identify influential variables for which data could be collected and maintained for continued updates. This study aims to determine the impact of each independent variable identified in the HSM on crash predictions. A relatively recent data mining approach called boosted regression trees (BRT) is used to investigate the association between the variables and crash predictions. The BRT method can effectively handle different types of predictor variables, identify very complex and non-linear associations among variables, and compute variable importance. Five years of crash data from 2008 to 2012 on two urban and suburban facility types, two-lane undivided arterials and four-lane divided arterials, were analyzed for estimating the influence of variables on crash predictions. Variables were found to exhibit non-linear and sometimes complex relationships with predicted crash counts. In addition, only a few variables were found to explain most of the variation in the crash data.

  16. [Hyperspectral Estimation of Apple Tree Canopy LAI Based on SVM and RF Regression].

    PubMed

    Han, Zhao-ying; Zhu, Xi-cun; Fang, Xian-yi; Wang, Zhuo-yuan; Wang, Ling; Zhao, Geng-Xing; Jiang, Yuan-mao

    2016-03-01

    Leaf area index (LAI) is a dynamic index of crop population size. Hyperspectral technology can be used to estimate apple canopy LAI rapidly and nondestructively, and it can provide a reference for monitoring tree growth and estimating yield. Red Fuji apple trees in the full fruit-bearing period were the research objects. Canopy spectral reflectance and LAI values of ninety apple trees were measured with the ASD Fieldspec3 spectrometer and the LAI-2200 in thirty orchards over two consecutive years in the Qixia research area of Shandong Province. The optimal vegetation indices were selected by correlation analysis of the original spectral reflectance and vegetation indices. Models for predicting LAI were built with the multivariate regression methods of support vector machine (SVM) and random forest (RF). The new vegetation indices GNDVI527, NDVI676, RVI682, FD-NVI656 and GRVI517 and the two main previous vegetation indices, NDVI670 and NDVI705, are in accordance with LAI. In the RF regression model, the calibration set coefficient of determination C-R2 of 0.920 and the validation set coefficient of determination V-R2 of 0.889 are higher than those of the SVM regression model by 0.045 and 0.033, respectively. The root mean square error of the calibration set, C-RMSE, of 0.249 and the root mean square error of the validation set, V-RMSE, of 0.236 are lower than those of the SVM regression model by 0.054 and 0.058, respectively. The relative analysis errors of the calibration set (C-RPD) and the validation set (V-RPD) reached 3.363 and 2.520, which were higher than those of the SVM regression model by 0.598 and 0.262, respectively. The slopes of the trend lines of measured versus predicted values for the calibration set and validation set, C-S and V-S, are close to 1. The estimation result of the RF regression model is better than that of the SVM. The RF regression model can be used to estimate the LAI of Red Fuji apple trees in the full fruit period.
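
    The three fit statistics compared in this abstract can be computed as in the sketch below. The definitions are assumptions, since the abstract does not state them: R2 is taken as 1 minus the ratio of residual to total sum of squares, and RPD as the sample standard deviation of the observed values divided by the RMSE, which is one common convention. The data are invented for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    return sqrt(mean((o - p) ** 2 for o, p in zip(observed, predicted)))


def r_squared(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    m = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - m) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


def rpd(observed, predicted):
    """RPD, assumed here to be SD(observed) / RMSE."""
    return stdev(observed) / rmse(observed, predicted)


# Hypothetical LAI values (not from the study).
observed = [2.0, 3.0, 4.0, 5.0]
predicted = [2.5, 2.5, 4.5, 4.5]
print(rmse(observed, predicted), r_squared(observed, predicted), rpd(observed, predicted))
```

    Under these definitions a better model has higher R2, lower RMSE, and higher RPD, which matches the direction of every comparison the abstract reports for RF against SVM.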

  17. Classification and regression tree analysis of acute-on-chronic hepatitis B liver failure: Seeing the forest for the trees.

    PubMed

    Shi, K-Q; Zhou, Y-Y; Yan, H-D; Li, H; Wu, F-L; Xie, Y-Y; Braddock, M; Lin, X-Y; Zheng, M-H

    2017-02-01

    At present, there is no ideal model for predicting the short-term outcome of patients with acute-on-chronic hepatitis B liver failure (ACHBLF). This study aimed to establish and validate a prognostic model using classification and regression tree (CART) analysis. A total of 1047 patients with suspected ACHBLF from two separate medical centres were screened in the study and served as the derivation cohort and validation cohort, respectively. CART analysis was applied to predict the 3-month mortality of patients with ACHBLF. The accuracy of the CART model was tested using the area under the receiver operating characteristic curve, which was compared with that of the model for end-stage liver disease (MELD) score and a new logistic regression model. CART analysis identified four variables as prognostic factors of ACHBLF: total bilirubin, age, serum sodium and INR, and three distinct risk groups: low risk (4.2%), intermediate risk (30.2%-53.2%) and high risk (81.4%-96.9%). The new logistic regression model was constructed with four independent factors, including age, total bilirubin, serum sodium and prothrombin activity by multivariate logistic regression analysis. The performance of the CART model (0.896), similar to that of the logistic regression model (0.914, P=.382), exceeded that of the MELD score (0.667, P<.001). The results were confirmed in the validation cohort. We have developed and validated a novel CART model superior to MELD for predicting three-month mortality of patients with ACHBLF. Thus, the CART model could facilitate medical decision-making and provide clinicians with a validated practical bedside tool for ACHBLF risk stratification.

  18. Stressor-response modeling using the 2D water quality model and regression trees to predict chlorophyll-a in a reservoir system

    NASA Astrophysics Data System (ADS)

    Park, Yongeun; Pachepsky, Yakov A.; Cho, Kyung Hwa; Jeon, Dong Jin; Kim, Joon Ha

    2015-10-01

    To control algal blooms, the stressor-response relationships between water quality metrics, environmental variables, and algal growth need to be better understood and modeled. Machine-learning methods have been suggested as means to express the stressor-response relationships that are found when applying mechanistic water quality models. The objective of this work was to evaluate the efficiency of regression trees in the development of a stressor-response model for chlorophyll-a (Chl-a) concentrations, using the results from site-specific mechanistic water quality modeling. The 2-dimensional hydrodynamic and water quality model (CE-QUAL-W2) was applied to simulate water quality using four-year observational data and additional scenarios of air temperature increases for the Yeongsan Reservoir in South Korea. Regression tree modeling was applied to the results of these simulations. Given the well-expressed seasonality in the simulated Chl-a dynamics, separate regression trees were developed for months from May to September. The regression trees provided a reasonably accurate representation of the stressor-response dependence generated by the CE-QUAL-W2 model. Different stressors were then selected as split variables for different months, and, in most cases, splits by the same stressor variable yielded the same correlation sign between the variable and the Chl-a concentration. Compared to physical variables, nutrient content appeared to better predict Chl-a responses. The highest Chl-a temperature sensitivities were found for May and June. Regression tree splits based on ammonium concentration resulted in a consistent trend of greater sensitivity in the groups of samples with higher ammonium concentrations. Regression tree models provided a transparent visual representation of the stressor-response relationships for Chl-a and its sensitivity. Overall, the representation of relationships using classification and regression tools can be considered a useful

  19. Perceived Organizational Support for Enhancing Welfare at Work: A Regression Tree Model.

    PubMed

    Giorgi, Gabriele; Dubin, David; Perez, Javier Fiz

    2016-01-01

    When trying to examine outcomes such as welfare and well-being, research tends to focus on main effects and take into account limited numbers of variables at a time. There are a number of techniques that may help address this problem. For example, many statistical packages available in R provide easy-to-use methods of modeling complicated analyses such as classification and regression trees (i.e., recursive partitioning). The present research illustrates the value of recursive partitioning in the prediction of perceived organizational support in a sample of more than 6000 Italian bankers. Utilizing the tree function of the party package in R, we estimated a regression tree model predicting perceived organizational support from a multitude of job characteristics including job demand, lack of job control, lack of supervisor support, training, etc. The resulting model appears particularly helpful in pointing out several interactions in the prediction of perceived organizational support. In particular, training is the dominant factor. Another dimension that seems to influence organizational support is reporting (perceived communication about safety and stress concerns). Results are discussed from a theoretical and methodological point of view.

  20. Perceived Organizational Support for Enhancing Welfare at Work: A Regression Tree Model

    PubMed Central

    Giorgi, Gabriele; Dubin, David; Perez, Javier Fiz

    2016-01-01

    When trying to examine outcomes such as welfare and well-being, research tends to focus on main effects and take into account limited numbers of variables at a time. There are a number of techniques that may help address this problem. For example, many statistical packages available in R provide easy-to-use methods of modeling complicated analyses such as classification and regression trees (i.e., recursive partitioning). The present research illustrates the value of recursive partitioning in the prediction of perceived organizational support in a sample of more than 6000 Italian bankers. Utilizing the tree function of the party package in R, we estimated a regression tree model predicting perceived organizational support from a multitude of job characteristics including job demand, lack of job control, lack of supervisor support, training, etc. The resulting model appears particularly helpful in pointing out several interactions in the prediction of perceived organizational support. In particular, training is the dominant factor. Another dimension that seems to influence organizational support is reporting (perceived communication about safety and stress concerns). Results are discussed from a theoretical and methodological point of view. PMID:28082924

  1. Landsat 8 six spectral band data and MODIS NDVI data for assessing the optimal regression tree models

    USGS Publications Warehouse

    Gu, Yingxin; Wylie, Bruce K.; Boyte, Stephen

    2016-01-01

    In this study, we developed a method that identifies an optimal sample data usage strategy and rule numbers that minimize over- and underfitting effects in regression tree mapping models. A LANDFIRE tile (r04c03, located mainly in northeastern Nevada), which is a composite of multiple Landsat 8 scenes for a target date, was selected for the study. To minimize any cloud and bad detection effects in the original Landsat 8 data, the compositing approach used cosine-similarity-combined pixels from multiple observations based on data quality and temporal proximity to a target date. Julian date 212, which yielded relatively few “no data and/or cloudy” pixels, was used as the target date with Landsat 8 observations from days 140–240 in 2013. The 30-m Landsat 8 composited data were then upscaled to 250 m using a spatial averaging method. Six Landsat 8 spectral bands (bands 1–6) at 250-m resolution were used as independent variables for developing the piecewise regression-tree models to predict the 250-m eMODIS NDVI (dependent variable). Furthermore, to ensure the high quality of the derived 250-m Landsat 8 data, and avoid any additional cloud and atmospheric effects, the percentage of 30-m pixels with “0” values within a 250-m pixel was calculated. Only those 250-m pixels with 0% of “0” values (i.e., none of the 30-m pixels within a 250-m pixel has a zero value) were selected to develop the regression-tree model. The 7-day maximum value composites of 250-m MODIS NDVI for the year 2013 were obtained from the USGS expedited MODIS (eMODIS) data archive (https://lta.cr.usgs.gov/emodis). Pixels with bad quality, negative values, clouds, snow cover, and low view angles were filtered out based on the MODIS quality assurance data to ensure high quality eMODIS NDVI data. The 2013 weekly NDVI data were then stacked and temporally smoothed using a weighted least-squares approach to reduce additional atmospheric noise. Temporal smoothing helps to ensure reliable

  2. Regression models for estimating leaf area of seedlings and adult individuals of Neotropical rainforest tree species.

    PubMed

    Brito-Rocha, E; Schilling, A C; Dos Anjos, L; Piotto, D; Dalmolin, A C; Mielke, M S

    2016-01-01

    Individual leaf area (LA) is a key variable in studies of tree ecophysiology because it directly influences light interception, photosynthesis and evapotranspiration of adult trees and seedlings. We analyzed the leaf dimensions (length, L, and width, W) of seedlings and adults of seven Neotropical rainforest tree species (Brosimum rubescens, Manilkara maxima, Pouteria caimito, Pouteria torta, Psidium cattleyanum, Symphonia globulifera and Tabebuia stenocalyx) with the objective of testing the feasibility of single regression models to estimate LA of both adults and seedlings. In southern Bahia, Brazil, a first set of data was collected between March and October 2012. Of the seven species analyzed, only two (P. cattleyanum and T. stenocalyx) had very similar relationships between LW and LA in both ontogenetic stages. For these two species, a second set of data was collected in August 2014 in order to validate the single models encompassing adults and seedlings. Our results show that it is possible to develop models for predicting individual leaf area that encompass different ontogenetic stages for tropical tree species. The development of these models depended more on the species than on the differences in leaf size between seedlings and adults.
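
    A regression of leaf area on the product of length and width can be sketched with ordinary least squares. The linear form LA = b0 + b1 * (L * W) is an assumption suggested by the abstract's reference to the relationship between LW and LA; the study's actual model forms are not given here. The measurements and the shape factor of 0.7 are invented, and `fit_line` is a hypothetical helper.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical leaves: length and width in cm, area in cm^2.
lengths = [4.0, 6.0, 8.0, 10.0]
widths = [2.0, 3.0, 4.0, 5.0]
lw = [l * w for l, w in zip(lengths, widths)]
areas = [0.7 * v for v in lw]  # made-up shape factor of 0.7, no noise

intercept, slope = fit_line(lw, areas)
predicted_area = intercept + slope * (7.0 * 3.5)  # a new 7.0 x 3.5 cm leaf
print(round(intercept, 6), round(slope, 6), round(predicted_area, 3))
```

    Fitting seedlings and adults separately and comparing the two slopes is one simple way to judge whether a single model can serve both ontogenetic stages.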

  3. Industrial and occupational ergonomics in the petrochemical process industry: a regression trees approach.

    PubMed

    Bevilacqua, M; Ciarapica, F E; Giacchetta, G

    2008-07-01

    This work is an attempt to apply classification tree methods to data regarding accidents in a medium-sized refinery, so as to identify the important relationships between the variables, which can be considered as decision-making rules when adopting any measures for improvement. The results obtained using the CART (Classification And Regression Trees) method proved to be the most precise and, in general, they are encouraging concerning the use of tree diagrams as preliminary explorative techniques for the assessment of the ergonomic, management and operational parameters which influence high accident risk situations. The Occupational Injury analysis carried out in this paper was planned as a dynamic process and can be repeated systematically. The CART technique, which considers a very wide set of objective and predictive variables, shows new cause-effect correlations in occupational safety which had never been previously described, highlighting possible injury risk groups and supporting decision-making in these areas. The use of classification trees must not, however, be seen as an attempt to supplant other techniques, but as a complementary method which can be integrated into traditional types of analysis.

  4. Identifying population groups with low palliative care program enrolment using classification and regression tree analysis.

    PubMed

    Gao, Jun; Johnston, Grace M; Lavergne, M Ruth; McIntyre, Paul

    2011-01-01

    Classification and regression tree (CART) analysis was used to identify subpopulations with lower palliative care program (PCP) enrolment rates. CART analysis uses recursive partitioning to group predictors. The PCP enrolment rate was 72 percent for the 6,892 adults who died of cancer between 2000 and 2005 in two counties in Nova Scotia, Canada. The lowest PCP enrolment rates were for nursing home residents over 82 years (27 percent), a group residing more than 43 kilometres from the PCP (31 percent), and another group living less than two weeks after their cancer diagnosis (37 percent). The highest rate (86 percent) was for the 2,118 persons who received palliative radiation. Findings from multiple logistic regression (MLR) were provided for comparison. CART findings identified low PCP enrolment subpopulations that were defined by interactions among demographic, social, medical, and health system predictors.

  5. Delaware River Streamflow Reconstruction using Tree Rings: Exploration of Hierarchical Bayesian Regression

    NASA Astrophysics Data System (ADS)

    Devineni, N.; Lall, U.; Cook, E.; Pederson, N.

    2011-12-01

    We present the application of a linear model in a Hierarchical Bayesian Regression (HBR) framework for reconstructing the summer seasonal averaged streamflow at five stations in the Delaware River Basin using eight newly developed regional tree ring chronologies. This technique directly provides estimates of the posterior probability distribution of each reconstructed streamflow value, considering model parameter uncertainty. The methodology also allows us to shrink the model parameters towards a common mean to incorporate the predictive ability of each tree chronology on multiple stations. We present the results from HBR analysis along with the results from traditional Point by Point Regression (PPR) analysis to demonstrate the benefits of developing the reconstructions under a Bayesian modeling framework. Further, we also present the comparative results of the model validation using various performance evaluation metrics such as reduction in error (RE) and coefficient of efficiency (CE). The reconstructed streamflow at various stations can be utilized to examine the frequency and recurrence attributes of extreme droughts in the region and their potential connections to known low frequency climate modes.
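
    The two validation metrics named above can be sketched as follows. The definitions used are the ones common in tree-ring reconstruction work and are an assumption here, since the abstract does not state them: both equal 1 minus the ratio of the squared prediction error to the squared error of a constant reference, where RE uses the calibration-period mean as the reference and CE uses the validation-period mean. The flows and the `skill` helper are hypothetical.

```python
from statistics import mean


def skill(observed, predicted, reference_mean):
    """1 - SSE(prediction) / SSE(constant reference) over the validation period."""
    sse_model = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sse_reference = sum((o - reference_mean) ** 2 for o in observed)
    return 1 - sse_model / sse_reference


# Hypothetical seasonal streamflow values (arbitrary units).
calibration_obs = [10.0, 12.0, 14.0, 16.0]
validation_obs = [9.0, 11.0, 13.0]
validation_pred = [10.0, 11.0, 12.0]

# RE: the reference is the calibration-period mean.
re_score = skill(validation_obs, validation_pred, mean(calibration_obs))
# CE: the reference is the validation-period mean.
ce_score = skill(validation_obs, validation_pred, mean(validation_obs))
print(re_score, ce_score)
```

    Positive values indicate that the reconstruction beats the constant reference; CE can never exceed RE because the validation mean is the best constant predictor of the validation data.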

  6. Comparison of universal kriging and regression tree modelling for soil property mapping

    NASA Astrophysics Data System (ADS)

    Kempen, Bas

    2013-04-01

    Geostatistical modelling approaches have been dominating the field of digital soil mapping (DSM) since its inception in the early 1980s. In recent years, however, machine learning methods such as classification and regression trees, random forests, and neural networks have quickly gained popularity among researchers in the DSM community. The increased use of these methods has largely come at the expense of geostatistical approaches. Despite the apparent shift in the application of DSM methods from geostatistics to machine learning, quantitative comparisons of the prediction performance of these methods are largely lacking. The aims of this research, therefore, are: i) to map two soil properties (topsoil organic matter content and thickness of the peat layer in the soil profile) using regression tree (RT) modelling and universal kriging (UK), and ii) to compare the prediction performance of these methods with independent data obtained by probability sampling. Using such data for validation not only yields statistically valid and unbiased estimates of the map accuracy, but also allows a statistical comparison of the accuracies of the maps generated by the two methods. The topsoil organic matter content and the thickness of the peat layer were mapped for a 14,000 ha area in the province of Drenthe, The Netherlands. The calibration dataset contained soil property observations at 1,715 sites. The covariates used include layers derived from soil and paleogeography maps, land cover, relative elevation, drainage class, land reclamation period, elevation change, and historic land use. The validation dataset contained 125 observations selected by stratified simple random sampling of the study area. The root mean squared error (RMSE) of the soil organic matter map obtained by RT modelling was 0.603 log(%), that of the map obtained by UK 0.595 log(%). The difference in map accuracy was not significant (p = 0.377). The RMSE of the peat thickness map obtained by RT

  7. Classification and regression trees for epidemiologic research: an air pollution example

    PubMed Central

    2014-01-01

    Background: Identifying and characterizing how mixtures of exposures are associated with health endpoints is challenging. We demonstrate how classification and regression trees can be used to generate hypotheses regarding joint effects from exposure mixtures. Methods: We illustrate the approach by investigating the joint effects of CO, NO2, O3, and PM2.5 on emergency department visits for pediatric asthma in Atlanta, Georgia. Pollutant concentrations were categorized as quartiles. Days when all pollutants were in the lowest quartile were held out as the referent group (n = 131) and the remaining 3,879 days were used to estimate the regression tree. Pollutants were parameterized as dichotomous variables representing each ordinal split of the quartiles (e.g. comparing CO quartile 1 vs. CO quartiles 2–4) and considered one at a time in a Poisson case-crossover model with control for confounding. The pollutant-split resulting in the smallest P-value was selected as the first split and the dataset was partitioned accordingly. This process repeated for each subset of the data until the P-values for the remaining splits were not below a given alpha, resulting in the formation of a “terminal node”. We used the case-crossover model to estimate the adjusted risk ratio for each terminal node compared to the referent group, as well as the likelihood ratio test for the inclusion of the terminal nodes in the final model. Results: The largest risk ratio corresponded to days when PM2.5 was in the highest quartile and NO2 was in the lowest two quartiles (RR: 1.10, 95% CI: 1.05, 1.16). A simultaneous Wald test for the inclusion of all terminal nodes in the model was significant, with a chi-square statistic of 34.3 (p = 0.001, with 13 degrees of freedom). Conclusions: Regression trees can be used to hypothesize about joint effects of exposure mixtures and may be particularly useful in the field of air pollution epidemiology for gaining a better understanding of complex
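The dichotomous parameterization described in this record (each ordinal split of a pollutant's quartiles, such as quartile 1 vs. quartiles 2–4) can be illustrated with a minimal sketch. The function name and the `above_q*` keys are hypothetical and are not taken from the study; the Poisson case-crossover model and the P-value-based split selection are not reproduced here.

```python
def ordinal_split_indicators(quartile):
    """Return the three dichotomous splits for a quartile value in 1..4.

    Split k compares quartiles 1..k (coded 0) against quartiles k+1..4 (coded 1).
    The key names are illustrative only.
    """
    if quartile not in (1, 2, 3, 4):
        raise ValueError("quartile must be 1, 2, 3 or 4")
    return {f"above_q{k}": int(quartile > k) for k in (1, 2, 3)}


# A day whose CO concentration falls in quartile 3 (hypothetical value).
co_indicators = ordinal_split_indicators(3)
print(co_indicators)
```

Each indicator could then be tested one at a time as a candidate split, which is the step the abstract describes.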

  8. Using Classification and Regression Trees (CART) and random forests to analyze attrition: Results from two simulations.

    PubMed

    Hayes, Timothy; Usami, Satoshi; Jacobucci, Ross; McArdle, John J

    2015-12-01

    In this article, we describe a recent development in the analysis of attrition: using classification and regression trees (CART) and random forest methods to generate inverse sampling weights. These flexible machine learning techniques have the potential to capture complex nonlinear, interactive selection models, yet to our knowledge, their performance in the missing data analysis context has never been evaluated. To assess the potential benefits of these methods, we compare their performance with commonly employed multiple imputation and complete case techniques in 2 simulations. These initial results suggest that weights computed from pruned CART analyses performed well in terms of both bias and efficiency when compared with other methods. We discuss the implications of these findings for applied researchers.
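One plausible way to turn a fitted tree into inverse weights is sketched below, assuming that each case has already been assigned to a terminal node and that the weight for a retained case is the reciprocal of the retention rate in its node. The abstract does not spell out the authors' procedure, so the function name, the toy data, and the weighting rule are all assumptions made for illustration.

```python
from collections import defaultdict


def inverse_weights(leaf_ids, retained):
    """Weight each retained case by 1 / P(retained | terminal node).

    leaf_ids: terminal-node label for each case (hypothetical tree output).
    retained: 1 if the case stayed in the study, 0 if it dropped out.
    Dropped cases get weight 0.0 because they contribute no outcome data.
    """
    totals = defaultdict(int)
    stayed = defaultdict(int)
    for leaf, r in zip(leaf_ids, retained):
        totals[leaf] += 1
        stayed[leaf] += int(r)
    weights = []
    for leaf, r in zip(leaf_ids, retained):
        # A retained case guarantees stayed[leaf] >= 1, so no division by zero.
        weights.append(totals[leaf] / stayed[leaf] if r else 0.0)
    return weights


# Toy example: node "A" kept 2 of 4 cases, node "B" kept 2 of 2.
leaf_ids = ["A", "A", "A", "A", "B", "B"]
retained = [1, 1, 0, 0, 1, 1]
weights = inverse_weights(leaf_ids, retained)
print(weights)
```

Cases from nodes with heavy attrition receive larger weights, so the retained sample is rebalanced toward the original one.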

  9. A classification and regression tree model of controls on dissolved inorganic nitrogen leaching from European forests.

    PubMed

    Rothwell, James J; Futter, Martyn N; Dise, Nancy B

    2008-11-01

    Often, there is a non-linear relationship between atmospheric dissolved inorganic nitrogen (DIN) input and DIN leaching that is poorly captured by existing models. We present the first application of the non-parametric classification and regression tree approach to evaluate the key environmental drivers controlling DIN leaching from European forests. DIN leaching was classified as low (<3), medium (3-15) or high (>15kg N ha(-1) year(-1)) at 215 sites across Europe. The analysis identified throughfall NO(3)(-) deposition, acid deposition, hydrology, soil type, the carbon content of the soil, and the legacy of historic N deposition as the dominant drivers of DIN leaching for these forests. Ninety four percent of sites were successfully classified into the appropriate leaching category. This approach shows promise for understanding complex ecosystem responses to a wide range of anthropogenic stressors as well as an improved method for identifying risk and targeting pollution mitigation strategies in forest ecosystems.

  10. Combinations of Stressors in Midlife: Examining Role and Domain Stressors Using Regression Trees and Random Forests

    PubMed Central

    2013-01-01

    Objectives. Global perceptions of stress (GPS) have major implications for mental and physical health, and stress in midlife may influence adaptation in later life. Thus, it is important to determine the unique and interactive effects of diverse influences of role stress (at work or in personal relationships), loneliness, life events, time pressure, caregiving, finances, discrimination, and neighborhood circumstances on these GPS. Method. Exploratory regression trees and random forests were used to examine complex interactions among myriad events and chronic stressors in middle-aged participants’ (N = 410; mean age = 52.12) GPS. Results. Different role and domain stressors were influential at high and low levels of loneliness. Varied combinations of these stressors resulting in similar levels of perceived stress are also outlined as examples of equifinality. Loneliness emerged as an important predictor across trees. Discussion. Exploring multiple stressors simultaneously provides insights into the diversity of stressor combinations across individuals—even those with similar levels of global perceived stress—and answers theoretical mandates to better understand the influence of stress by sampling from many domain and role stressors. Further, the unique influences of each predictor relative to the others inform theory and applied work. Finally, examples of equifinality and multifinality call for targeted interventions. PMID:23341437

  11. Prediction of Wind Speeds Based on Digital Elevation Models Using Boosted Regression Trees

    NASA Astrophysics Data System (ADS)

    Fischer, P.; Etienne, C.; Tian, J.; Krauß, T.

    2015-12-01

    In this paper a new approach is presented to predict maximum wind speeds using Gradient Boosted Regression Trees (GBRT). GBRT are a non-parametric regression technique used in various applications, suitable for making predictions without in-depth a priori knowledge about the functional dependencies between the predictors and the response variables. Our aim is to predict maximum wind speeds based on predictors which are derived from a digital elevation model (DEM). The predictors describe the orography of the Area-of-Interest (AoI) by various means, such as first and second order derivatives of the DEM, but also more sophisticated classifications describing exposure and shelterness of the terrain to wind flux. In order to take into account the different scales which probably influence the streams and turbulences of wind flow over complex terrain, the predictors are computed on different spatial resolutions ranging from 30 m up to 2000 m. The geographic area used for examination of the approach is Switzerland, a mountainous region in the heart of Europe, dominated by the Alps, but also covering large valleys. The full workflow is described in this paper, which consists of data preparation using image processing techniques, model training using a state-of-the-art machine learning algorithm, in-depth analysis of the trained model, validation of the model and application of the model to generate a wind speed map.
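For readers unfamiliar with gradient boosted regression trees, the following is a minimal sketch of the general idea using one-split trees (stumps) and squared-error loss on a single predictor. It is not the authors' implementation; the toy values for `slope` and `wind` and the parameters `n_trees` and `learning_rate` are assumptions chosen for illustration.

```python
def fit_stump(x, residuals):
    """Find the threshold on x that minimises squared error of the residuals."""
    best = None
    # Excluding the largest value keeps the right-hand side non-empty.
    for threshold in sorted(set(x))[:-1]:
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


def boost(x, y, n_trees=50, learning_rate=0.1):
    """Fit stumps sequentially, each one to the current residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        threshold, left_mean, right_mean = fit_stump(x, residuals)
        stumps.append((threshold, left_mean, right_mean))
        pred = [p + learning_rate * (left_mean if xi <= threshold else right_mean)
                for xi, p in zip(x, pred)]
    return base, stumps


def predict(base, stumps, xi, learning_rate=0.1):
    """Sum the shrunken contributions of all stumps for one input value."""
    total = base
    for threshold, left_mean, right_mean in stumps:
        total += learning_rate * (left_mean if xi <= threshold else right_mean)
    return total


def mse(y, preds):
    return sum((yi - pi) ** 2 for yi, pi in zip(y, preds)) / len(y)


# Hypothetical toy data: one terrain predictor and a wind-speed response.
slope = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
wind = [1.0, 1.2, 0.9, 3.1, 2.9, 3.0]
base, stumps = boost(slope, wind)
fitted = [predict(base, stumps, s) for s in slope]
print(mse(wind, [base] * len(wind)), mse(wind, fitted))
```

Each stump corrects part of the error left by its predecessors, and the learning rate shrinks every correction so that no single tree dominates.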

  12. Weighing risk factors associated with bee colony collapse disorder by classification and regression tree analysis.

    PubMed

    VanEngelsdorp, Dennis; Speybroeck, Niko; Evans, Jay D; Nguyen, Bach Kim; Mullin, Chris; Frazier, Maryann; Frazier, Jim; Cox-Foster, Diana; Chen, Yanping; Tarpy, David R; Haubruge, Eric; Pettis, Jeffrey S; Saegerman, Claude

    2010-10-01

    Colony collapse disorder (CCD), a syndrome whose defining trait is the rapid loss of adult worker honey bees, Apis mellifera L., is thought to be responsible for a minority of the large overwintering losses experienced by U.S. beekeepers since the winter 2006-2007. Using the same data set developed to perform a monofactorial analysis (PloS ONE 4: e6481, 2009), we conducted a classification and regression tree (CART) analysis in an attempt to better understand the relative importance and interrelations among different risk variables in explaining CCD. Fifty-five exploratory variables were used to construct two CART models: one model with and one model without a cost of misclassifying a CCD-diagnosed colony as a non-CCD colony. The resulting model tree that permitted for misclassification had a sensitivity and specificity of 85 and 74%, respectively. Although factors measuring colony stress (e.g., adult bee physiological measures, such as fluctuating asymmetry or mass of head) were important discriminating values, six of the 19 variables having the greatest discriminatory value were pesticide levels in different hive matrices. Notably, coumaphos levels in brood (a miticide commonly used by beekeepers) had the highest discriminatory value and were highest in control (healthy) colonies. Our CART analysis provides evidence that CCD is probably the result of several factors acting in concert, making afflicted colonies more susceptible to disease. This analysis highlights several areas that warrant further attention, including the effect of sublethal pesticide exposure on pathogen prevalence and the role of variability in bee tolerance to pesticides on colony survivorship.
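The sensitivity and specificity reported for the classification tree follow the usual definitions, which the short sketch below computes from actual and predicted labels. The labels are made up for illustration and do not come from the study's data.

```python
def sensitivity_specificity(actual, predicted):
    """Return (sensitivity, specificity) for binary labels coded 1/0."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a and p)          # affected, flagged
    fn = sum(1 for a, p in pairs if a and not p)      # affected, missed
    tn = sum(1 for a, p in pairs if not a and not p)  # healthy, cleared
    fp = sum(1 for a, p in pairs if not a and p)      # healthy, flagged
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical labels: 1 = CCD colony, 0 = control colony.
actual = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(actual, predicted)
print(sens, spec)
```

Attaching a misclassification cost to one kind of error, as in one of the two models described, typically trades one of these measures against the other.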

  13. Using boosted regression trees to predict the near-saturated hydraulic conductivity of undisturbed soils

    NASA Astrophysics Data System (ADS)

    Koestel, John; Bechtold, Michel; Jorda, Helena; Jarvis, Nicholas

    2015-04-01

    The saturated and near-saturated hydraulic conductivity of soil is of key importance for modelling water and solute fluxes in the vadose zone. Hydraulic conductivity measurements are cumbersome at the Darcy scale and practically impossible at larger scales where water and solute transport models are mostly applied. Hydraulic conductivity must therefore be estimated from proxy variables. Such pedotransfer functions are known to work decently well for e.g. water retention curves but rather poorly for near-saturated and saturated hydraulic conductivities. Recently, Weynants et al. (2009, Revisiting Vereecken pedotransfer functions: Introducing a closed-form hydraulic model. Vadose Zone Journal, 8, 86-95) reported a coefficient of determination of 0.25 (validation with an independent data set) for the saturated hydraulic conductivity from lab-measurements of Belgian soil samples. In our study, we trained boosted regression trees on a global meta-database containing tension-disk infiltrometer data (see Jarvis et al. 2013. Influence of soil, land use and climatic factors on the hydraulic conductivity of soil. Hydrology & Earth System Sciences, 17, 5185-5195) to predict the saturated hydraulic conductivity (Ks) and the conductivity at a tension of 10 cm (K10). We found coefficients of determination of 0.39 and 0.62 under a simple 10-fold cross-validation for Ks and K10. When carrying out the validation folded over the data-sources, i.e. the source publications, we found that the corresponding coefficients of determination reduced to 0.15 and 0.36, respectively. We conclude that the stricter source-wise cross-validation should be applied in future pedotransfer studies to prevent overly optimistic validation results. The boosted regression trees also allowed for an investigation of relevant predictors for estimating the near-saturated hydraulic conductivity. We found that land use and bulk density were most important to predict Ks. We also observed that Ks is large in fine
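The difference between simple and source-wise cross-validation lies in how the folds are formed. A minimal sketch of source-wise fold assignment is given below, assuming each sample carries a label for its source publication; the function name, the round-robin assignment, and the `paper_*` labels are hypothetical and are not taken from the study.

```python
def source_wise_folds(sources, n_folds):
    """Build (train, test) index lists so no source appears in both.

    sources: one source label per sample (e.g. the publication it came from).
    Whole sources are assigned to folds round-robin, in sorted order.
    """
    unique_sources = sorted(set(sources))
    fold_of_source = {s: i % n_folds for i, s in enumerate(unique_sources)}
    folds = []
    for k in range(n_folds):
        test_idx = [i for i, s in enumerate(sources) if fold_of_source[s] == k]
        train_idx = [i for i, s in enumerate(sources) if fold_of_source[s] != k]
        folds.append((train_idx, test_idx))
    return folds


# Hypothetical data set: six samples drawn from four source publications.
sources = ["paper_a", "paper_a", "paper_b", "paper_c", "paper_c", "paper_d"]
folds = source_wise_folds(sources, n_folds=2)
for train_idx, test_idx in folds:
    print(train_idx, test_idx)
```

Because samples from one publication tend to share measurement conditions, keeping them together in a fold stops the model from being scored on near-duplicates of its training data, which is consistent with the lower coefficients of determination reported under the stricter scheme.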

  14. Risk Profiles for Weight Gain among Postmenopausal Women: A Classification and Regression Tree Analysis Approach

    PubMed Central

    Jung, Su Yon; Vitolins, Mara Z.; Fenton, Jenifer; Frazier-Wood, Alexis C.; Hursting, Stephen D.; Chang, Shine

    2015-01-01

    Purpose: Risk factors for obesity and weight gain are typically evaluated individually while “adjusting for” the influence of other confounding factors, and few studies, if any, have created risk profiles by clustering risk factors. We identified subgroups of postmenopausal women homogeneous in their clustered modifiable and non-modifiable risk factors for gaining ≥ 3% weight. Methods: This study included 612 postmenopausal women 50–79 years old, enrolled in an ancillary study of the Women's Health Initiative Observational Study between February 1995 and July 1998. Classification and regression tree and stepwise regression models were built and compared. Results: Of 27 selected variables, the factors significantly related to ≥ 3% weight gain were weight change in the past 2 years, age at menopause, dietary fiber, fat, alcohol intake, and smoking. In women younger than 65 years, less than 4 kg weight change in the past 2 years sufficiently reduced risk of ≥ 3% weight gain. Different combinations of risk factors related to weight gain were reported for subgroups of women: women 65 years or older (essential factor: < 9.8 g/day dietary factor), African Americans (essential factor: currently smoking), and white women (essential factor: ≥ 5 kg weight change for the past 2 years). Conclusions: Our findings suggest specific characteristics for particular subgroups of postmenopausal women that may be useful for identifying those at risk for weight gain. The study results may be useful for targeting efforts to promote strategies to reduce the risk of obesity and weight gain in subgroups of postmenopausal women and maximize the effect of weight control by decreasing obesity-relevant adverse health outcomes. PMID:25822239

  15. Identification of Sexually Abused Female Adolescents at Risk for Suicidal Ideations: A Classification and Regression Tree Analysis

    ERIC Educational Resources Information Center

    Brabant, Marie-Eve; Hebert, Martine; Chagnon, Francois

    2013-01-01

    This study explored the clinical profiles of 77 female teenager survivors of sexual abuse and examined the association of abuse-related and personal variables with suicidal ideations. Analyses revealed that 64% of participants experienced suicidal ideations. Findings from classification and regression tree analysis indicated that depression,…

  16. What Satisfies Students? Mining Student-Opinion Data with Regression and Decision-Tree Analysis. AIR 2002 Forum Paper.

    ERIC Educational Resources Information Center

    Thomas, Emily H.; Galambos, Nora

    To investigate how students' characteristics and experiences affect satisfaction, this study used regression and decision-tree analysis with the CHAID algorithm to analyze student opinion data from a sample of 1,783 college students. A data-mining approach identifies the specific aspects of students' university experience that most influence three…

  17. Improving Automatic English Writing Assessment Using Regression Trees and Error-Weighting

    NASA Astrophysics Data System (ADS)

    Lee, Kong-Joo; Kim, Jee-Eun

    The proposed automated scoring system for English writing tests provides an assessment result, including a score and diagnostic feedback, to test-takers without human effort. The system analyzes an input sentence and detects errors related to spelling, syntax and content similarity. The scoring model has adopted one of the statistical approaches, a regression tree. A scoring model in general calculates a score based on the count and the types of automatically detected errors. Accordingly, a system with higher accuracy in detecting errors raises the accuracy in scoring a test. The accuracy of the system, however, cannot be fully guaranteed for several reasons, such as parsing failure, incompleteness of knowledge bases, and the ambiguous nature of natural language. In this paper, we introduce an error-weighting technique, which is similar to the term-weighting widely used in information retrieval. The error-weighting technique is applied to judge the reliability of the errors detected by the system. The score calculated with the technique is shown to be more accurate than the score without it.

  18. Differential Diagnosis of Erythmato-Squamous Diseases Using Classification and Regression Tree

    PubMed Central

    Maghooli, Keivan; Langarizadeh, Mostafa; Shahmoradi, Leila; Habibi-koolaee, Mahdi; Jebraeily, Mohamad; Bouraghi, Hamid

    2016-01-01

    Introduction: Differential diagnosis of Erythmato-Squamous Diseases (ESD) is a major challenge in the field of dermatology. The ESD diseases are placed into six different classes. Data mining is the process of detecting hidden patterns. In the case of ESD, data mining helps us to predict the diseases. Different algorithms were developed for this purpose. Objective: We aimed to use the Classification and Regression Tree (CART) to predict differential diagnosis of ESD. Methods: We used the Cross Industry Standard Process for Data Mining (CRISP-DM) methodology. For this purpose, the dermatology data set from the UCI machine learning repository was obtained. The Clementine 12.0 software from IBM Company was used for modelling. To evaluate the model, we calculated its accuracy, sensitivity and specificity. Results: The proposed model had an accuracy of 94.84% (standard deviation: 24.42) in correctly predicting the ESD disease. Conclusions: Results indicated that using this classifier could be useful. However, it is strongly suggested that a combination of machine learning methods could be more useful for predicting ESD. PMID:28077889

  19. Prediction of cadmium enrichment in reclaimed coastal soils by classification and regression tree

    NASA Astrophysics Data System (ADS)

    Ru, Feng; Yin, Aijing; Jin, Jiaxin; Zhang, Xiuying; Yang, Xiaohui; Zhang, Ming; Gao, Chao

    2016-08-01

    Reclamation of coastal land is one of the most common ways to obtain land resources in China. However, it has long been acknowledged that the artificial interference with coastal land has disadvantageous effects, such as heavy metal contamination. This study aimed to develop a prediction model for cadmium enrichment levels and assess the importance of affecting factors in typical reclaimed land in Eastern China (DFCL: Dafeng Coastal Land). Two hundred and twenty seven surficial soil/sediment samples were collected and analyzed to identify the enrichment levels of cadmium and the possible affecting factors in soils and sediments. The classification and regression tree (CART) model was applied in this study to predict cadmium enrichment levels. The prediction results showed that cadmium enrichment levels assessed by the CART model had an accuracy of 78.0%. The CART model could extract more information on factors affecting the environmental behavior of cadmium than correlation analysis. The integration of correlation analysis and the CART model showed that fertilizer application and organic carbon accumulation were the most important factors affecting soil/sediment cadmium enrichment levels, followed by particle size effects (Al2O3, TFe2O3 and SiO2), contents of Cl and S, surrounding construction areas and reclamation history.

  20. Multicenter study on caries risk assessment in adults using survival Classification and Regression Trees

    PubMed Central

    Arino, Masumi; Ito, Ataru; Fujiki, Shozo; Sugiyama, Seiichi; Hayashi, Mikako

    2016-01-01

    Dental caries is an important public health problem worldwide. This study aims to prove how preventive therapies reduce the onset of caries in adult patients, and to identify patients with high or low risk of caries by using Classification and Regression Trees based survival analysis (survival CART). A clinical data set of 732 patients aged 20 to 64 years in nine Japanese general practices was analyzed with the following parameters: age, DMFT, number of mutans streptococci (SM) and Lactobacilli (LB), secretion rate and buffer capacity of saliva, and compliance with a preventive program. Results showed the incidence of primary carious lesion was affected by SM, LB and compliance with a preventive program; secondary carious lesion was affected by DMFT, SM and LB. Survival CART identified high-risk patients for primary carious lesion according to their poor compliance with a preventive program and SM (≥10^6 CFU/ml) with a hazard ratio of 3.66 (p = 0.0002). In the case of secondary caries, patients with LB (≥10^5 CFU/ml) and DMFT (>15) were identified as high risk with a hazard ratio of 3.50 (p < 0.0001). We conclude that preventive programs can be effective in limiting the incidence of primary carious lesion. PMID:27381750

  1. Prognostic transcriptional association networks: a new supervised approach based on regression trees

    PubMed Central

    Nepomuceno-Chamorro, Isabel; Azuaje, Francisco; Devaux, Yvan; Nazarov, Petr V.; Muller, Arnaud; Aguilar-Ruiz, Jesús S.; Wagner, Daniel R.

    2011-01-01

    Motivation: The application of information encoded in molecular networks for prognostic purposes is a crucial objective of systems biomedicine. This approach has not been widely investigated in the cardiovascular research area. Within this area, the prediction of clinical outcomes after suffering a heart attack would represent a significant step forward. We developed a new quantitative prediction-based method for this prognostic problem based on the discovery of clinically relevant transcriptional association networks. This method integrates regression trees and clinical class-specific networks, and can be applied to other clinical domains. Results: Before analyzing our cardiovascular disease dataset, we tested the usefulness of our approach on a benchmark dataset with control and disease patients. We also compared it to several algorithms to infer transcriptional association networks and classification models. Comparative results provided evidence of the prediction power of our approach. Next, we discovered new models for predicting good and bad outcomes after myocardial infarction. Using blood-derived gene expression data, our models reported areas under the receiver operating characteristic curve above 0.70. Our model could also outperform different techniques based on co-expressed gene modules. We also predicted processes that may represent novel therapeutic targets for heart disease, such as the synthesis of leucine and isoleucine. Availability: The SATuRNo software is freely available at http://www.lsi.us.es/isanepo/toolsSaturno/. Contact: inepomuceno@us.es Supplementary information: Supplementary data are available at Bioinformatics online. PMID:21098433

  2. Assessment of land use factors associated with dengue cases in Malaysia using Boosted Regression Trees.

    PubMed

    Cheong, Yoon Ling; Leitão, Pedro J; Lakes, Tobia

    2014-07-01

    The transmission of dengue disease is influenced by complex interactions among vector, host and virus. Land use such as water bodies or certain agricultural practices have been identified as likely risk factors for dengue because of the provision of suitable habitats for the vector. Many studies have focused on the land use factors of dengue vector abundance in small areas but have not yet studied the relationship between land use factors and dengue cases for large regions. This study aims to clarify if land use factors other than human settlements, e.g. different types of agricultural land use, water bodies and forest are associated with reported dengue cases from 2008 to 2010 in the state of Selangor, Malaysia. From the correlative relationship, we aim to generate a prediction risk map. We used Boosted Regression Trees (BRT) to account for nonlinearities and interactions between the factors with high predictive accuracies. Our model with a cross-validated performance score (Area Under the Receiver Operator Characteristic Curve, ROC AUC) of 0.81 showed that the most important land use factors are human settlements (model importance of 39.2%), followed by water bodies (16.1%), mixed horticulture (8.7%), open land (7.5%) and neglected grassland (6.7%). A risk map after 100 model runs with a cross-validated ROC AUC mean of 0.81 (±0.001 s.d.) is presented. Our findings may be an important asset for improving surveillance and control interventions for dengue.

  3. Exploring the link between drought indicators and impacts through data visualization and regression trees

    NASA Astrophysics Data System (ADS)

    Bachmair, Sophie; Stahl, Kerstin; Blauhut, Veit; Kohn, Irene

    2014-05-01

    impact occurrence. The applied data visualization and regression tree approach proved to be a valuable methodology for exploring the link between indicators and impacts. Nevertheless, the results are influenced by the uncertainty of identifying and quantifying drought impacts and vulnerability factors at a suitable spatial and temporal scale. This calls for more research on methodological issues of drought impact and vulnerability assessment, as well as for further developing impact inventories and exploiting the link between drought indicators and impacts.

  4. Simulating California Reservoir Operation Using the Classification and Regression Tree Algorithm Combined with a Shuffled Cross-Validation Scheme

    NASA Astrophysics Data System (ADS)

    Yang, T.; Gao, X.; Sorooshian, S.; Li, X.

    2015-12-01

    The controlled outflows from a reservoir or dam are highly dependent on the decisions made by the reservoir operators, instead of a natural hydrological process. Differences exist between the natural upstream inflows to reservoirs and the controlled outflows from reservoirs that supply the downstream users. With the decision maker's awareness of changing climate, reservoir management requires adaptable means to incorporate more information into decision making, such as the consideration of policy and regulation, environmental constraints, dry/wet conditions, etc. In this paper, a reservoir outflow simulation model is presented, which incorporates one of the well-developed data-mining models (Classification and Regression Tree) to predict the complicated human-controlled reservoir outflows and extract the reservoir operation patterns. A shuffled cross-validation approach is further implemented to improve the model's predictive performance. An application study of 9 major reservoirs in California is carried out and the simulated results from different decision tree approaches, including original CART and Random Forest, are compared with observations. The statistical measurements show that CART combined with the shuffled cross-validation scheme gives better predictive performance than the other two methods, especially in simulating the peak flows. The results for simulated controlled outflow, storage changes and storage trajectories also show that the proposed model is able to consistently and reasonably predict human reservoir operation decisions. In addition, we found that the operations of Trinity Lake, Oroville Lake and Shasta Lake are greatly influenced by policy and regulation, while low elevation reservoirs are more sensitive to inflow amount than others.

  5. Risk assessment of dengue fever in Zhongshan, China: a time-series regression tree analysis.

    PubMed

    Liu, K-K; Wang, T; Huang, X-D; Wang, G-L; Xia, Y; Zhang, Y-T; Jing, Q-L; Huang, J-W; Liu, X-X; Lu, J-H; Hu, W-B

    2017-02-01

    Dengue fever (DF) is the most prevalent and rapidly spreading mosquito-borne disease globally. Control of DF is limited by barriers to vector control and integrated management approaches. This study aimed to explore the potential risk factors for autochthonous DF transmission and to estimate the threshold effects of high-order interactions among risk factors. A time-series regression tree model was applied to estimate the hierarchical relationship between reported autochthonous DF cases and the potential risk factors including the timeliness of DF surveillance systems (median time interval between symptom onset date and diagnosis date, MTIOD), mosquito density, imported cases and meteorological factors in Zhongshan, China from 2001 to 2013. We found that MTIOD was the most influential factor in autochthonous DF transmission. Monthly autochthonous DF incidence rate increased by 36·02-fold [relative risk (RR) 36·02, 95% confidence interval (CI) 25·26-46·78, compared to the average DF incidence rate during the study period] when the 2-month lagged moving average of MTIOD was >4·15 days and the 3-month lagged moving average of the mean Breteau Index (BI) was ⩾16·57. If the 2-month lagged moving average MTIOD was between 1·11 and 4·15 days and the monthly maximum diurnal temperature range at a lag of 1 month was <9·6 °C, the monthly mean autochthonous DF incidence rate increased by 14·67-fold (RR 14·67, 95% CI 8·84-20·51, compared to the average DF incidence rate during the study period). This study demonstrates that the timeliness of DF surveillance systems, mosquito density and diurnal temperature range play critical roles in the autochthonous DF transmission in Zhongshan. Better assessment and prediction of the risk of DF transmission is beneficial for establishing scientific strategies for DF early warning surveillance and control.

  6. Chronic subdural hematoma: Surgical management and outcome in 986 cases: A classification and regression tree approach

    PubMed Central

    Rovlias, Aristedis; Theodoropoulos, Spyridon; Papoutsakis, Dimitrios

    2015-01-01

    Background: Chronic subdural hematoma (CSDH) is one of the most common clinical entities in daily neurosurgical practice which carries a most favorable prognosis. However, because of the advanced age and medical problems of patients, surgical therapy is frequently associated with various complications. This study evaluated the clinical features, radiological findings, and neurological outcome in a large series of patients with CSDH. Methods: A classification and regression tree (CART) technique was employed in the analysis of data from 986 patients who were operated on at Asclepeion General Hospital of Athens from January 1986 to December 2011. Burr hole evacuation with closed system drainage has been the operative technique of first choice at our institution for 29 consecutive years. A total of 27 prognostic factors were examined to predict the outcome at 3 months postoperatively. Results: Our results indicated that neurological status on admission was the best predictor of outcome. With regard to the other data, age, brain atrophy, thickness and density of hematoma, subdural accumulation of air, and antiplatelet and anticoagulant therapy were found to correlate significantly with prognosis. The overall cross-validated predictive accuracy of the CART model was 85.34%, with a cross-validated relative error of 0.326. Conclusions: Methodologically, the CART technique is quite different from the more commonly used methods, with the primary benefit of illustrating the important prognostic variables as related to outcome. Since the ideal therapy for the treatment of CSDH is still under debate, this technique may prove useful in developing new therapeutic strategies and approaches for patients with CSDH. PMID:26257985

  7. Regression Tree-Based Methodology for Customizing Building Energy Benchmarks to Individual Commercial Buildings

    NASA Astrophysics Data System (ADS)

    Kaskhedikar, Apoorva Prakash

    According to the U.S. Energy Information Administration, commercial buildings represent about 40% of the United States' energy consumption, of which office buildings consume a major portion. Gauging the extent to which an individual building consumes energy in excess of its peers is the first step in initiating energy efficiency improvement. Energy benchmarking offers initial building energy performance assessment without rigorous evaluation. Energy benchmarking tools based on the Commercial Buildings Energy Consumption Survey (CBECS) database are investigated in this thesis. This study proposes a new benchmarking methodology based on decision trees, where a relationship between the energy use intensities (EUI) and building parameters (continuous and categorical) is developed for different building types. This methodology was applied to medium office and school building types contained in the CBECS database. The Random Forest technique was used to find the most influential parameters that impact building energy use intensities. Subsequently, significant correlations were identified between EUIs and CBECS variables. Other than floor area, some of the important variables were number of workers, location, number of PCs and main cooling equipment. The coefficient of variation was used to evaluate the effectiveness of the new model. The customization technique proposed in this thesis was compared with another benchmarking model that is widely used by building owners and designers, namely the ENERGY STAR's Portfolio Manager. This tool relies on standard Linear Regression methods, which are only able to handle continuous variables. The model proposed uses a data mining technique and was found to perform slightly better than the Portfolio Manager. The broader impact of the new benchmarking methodology proposed is that it allows for identifying important categorical variables, and then incorporating them in a local, as against a global, model framework for EUI

  8. Bacillary dysentery and meteorological factors in northeastern China: a historical review based on classification and regression trees.

    PubMed

    Guan, Peng; Huang, Desheng; Guo, Junqiao; Wang, Ping; Zhou, Baosen

    2008-09-01

    The relationship between the incidence of bacillary dysentery and meteorological factors was investigated. Data on bacillary dysentery incidence in Shenyang from 1990 to 1996 were obtained from Liaoning Provincial Center for Disease Control and Prevention, and meteorological data such as atmospheric pressure, air temperature, precipitation, evaporation, wind speed, and the amount of solar radiation were obtained from Shenyang Meteorological Bureau. Kendall and Spearman correlations were used to analyze the relationship between bacillary dysentery and meteorological factors. The incidence of bacillary dysentery was treated as the response variable, and meteorological factors were treated as predictor variables. R 2.3.1 was used to fit the classification and regression trees (CART). The model improved the accuracy of the fitting results. The residual sum of squared errors of the regression tree model was 53.9, while that of the multivariate linear regression model was 107.2. Among all the meteorological indexes, relative humidity, minimum temperature, and pressure one month prior were statistically influential factors in the multivariate regression tree model. CART may be a useful tool for dealing with heterogeneous data, as it can serve as a decision support tool and is notable for its simplicity and ease.
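
The comparison of residual sums of squares described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' R analysis: the data are invented (a made-up humidity predictor with a step-like response), and `fit_line` and `fit_tree` are hypothetical helper names. It fits a one-predictor least-squares line and a small CART-style regression tree, then computes the in-sample residual sum of squares for each.

```python
# Illustrative only: synthetic data, NOT the Shenyang records.

def rss(y, pred):
    """Residual sum of squares between observed and fitted values."""
    return sum((a - b) ** 2 for a, b in zip(y, pred))

def fit_line(x, y):
    """Ordinary least squares with one predictor; returns a predict function."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return lambda v: my + slope * (v - mx)

def fit_tree(x, y, depth=2, min_leaf=2):
    """CART-style regression tree on one predictor.

    Each split is chosen greedily to minimise the summed squared error
    of the two child means; leaves predict the mean of their responses.
    """
    mean = sum(y) / len(y)
    if depth == 0 or len(y) < 2 * min_leaf:
        return lambda v: mean
    pairs = sorted(zip(x, y))
    best = None
    for i in range(min_leaf, len(pairs) - min_leaf + 1):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical predictor values
        left = [p[1] for p in pairs[:i]]
        right = [p[1] for p in pairs[i:]]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        cost = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
        if best is None or cost < best[0]:
            best = (cost, (pairs[i - 1][0] + pairs[i][0]) / 2, i)
    if best is None:
        return lambda v: mean
    _, cut, i = best
    lo = fit_tree([p[0] for p in pairs[:i]], [p[1] for p in pairs[:i]], depth - 1, min_leaf)
    hi = fit_tree([p[0] for p in pairs[i:]], [p[1] for p in pairs[i:]], depth - 1, min_leaf)
    return lambda v: lo(v) if v <= cut else hi(v)

# Hypothetical predictor (e.g. relative humidity, %) and step-like response.
x = [40, 45, 50, 55, 60, 65, 70, 75, 80, 85]
y = [2, 3, 2, 3, 2, 9, 10, 9, 10, 9]

line = fit_line(x, y)
tree = fit_tree(x, y)
line_rss = rss(y, [line(v) for v in x])
tree_rss = rss(y, [tree(v) for v in x])
print(f"linear RSS = {line_rss:.2f}, tree RSS = {tree_rss:.2f}")
```

A lower in-sample residual sum of squares is expected from the more flexible tree on threshold-like data; by itself it says nothing about out-of-sample accuracy.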

  9. Structured additive distributional regression for analysing landings per unit effort in fisheries research.

    PubMed

    Mamouridis, Valeria; Klein, Nadja; Kneib, Thomas; Cadarso Suarez, Carmen; Maynou, Francesc

    2017-01-01

    We analysed the landings per unit effort (LPUE) from the Barcelona trawl fleet targeting the red shrimp (Aristeus antennatus) using novel Bayesian structured additive distributional regression to gain a better understanding of the dynamics and determinants of variation in LPUE. The data set, covering a time span of 17 years, includes fleet-dependent variables (e.g. the number of trips performed by vessels), temporal variables (inter- and intra-annual variability) and environmental variables (the North Atlantic Oscillation index). Based on structured additive distributional regression, we evaluate (i) the gain in replacing purely linear predictors by additive predictors including nonlinear effects of continuous covariates, (ii) the inclusion of vessel-specific effects based on either fixed or random effects, (iii) different types of distributions for the response, and (iv) the potential gain in not only modelling the location but also the scale/shape parameter of these distributions. Our findings support that flexible model variants are indeed able to improve the fit considerably and that additional insights can be gained. Tools to select within several model specifications and assumptions are discussed in detail as well.

  10. [Application of regression tree in analyzing the effects of climate factors on NDVI in loess hilly area of Shaanxi Province].

    PubMed

    Liu, Yang; Lü, Yi-he; Zheng, Hai-feng; Chen, Li-ding

    2010-05-01

    Based on the 10-day SPOT VEGETATION NDVI data and the daily meteorological data from 1998 to 2007 in Yan'an City, the main meteorological variables affecting the annual and interannual variations of NDVI were determined using a regression tree. It was found that the effects of the tested meteorological variables on the variability of NDVI differed with seasons and time lags. Temperature and precipitation were the most important meteorological variables affecting the annual variation of NDVI, and the average highest temperature was the most important meteorological variable affecting the interannual variation of NDVI. The regression tree was very powerful in determining the key meteorological variables affecting NDVI variation, but could not build quantitative relations between NDVI and meteorological variables, which limited its further and wider application.

  11. Analysis of the impact of recreational trail usage for prioritising management decisions: a regression tree approach

    NASA Astrophysics Data System (ADS)

    Tomczyk, Aleksandra; Ewertowski, Marek; White, Piran; Kasprzak, Leszek

    2016-04-01

    The dual role of many Protected Natural Areas in providing benefits for both conservation and recreation poses challenges for management. Although recreation-based damage to ecosystems can occur very quickly, restoration can take many years. The protection of conservation interests at the same time as providing for recreation requires decisions to be made about how to prioritise and direct management actions. Trails are commonly used to divert visitors from the most important areas of a site, but high visitor pressure can lead to increases in trail width and a concomitant increase in soil erosion. Here we use detailed field data on the condition of recreational trails in Gorce National Park, Poland, as the basis for a regression tree analysis to determine the factors influencing trail deterioration, and link specific trail impacts with environmental, use-related and managerial factors. We distinguished 12 types of trails, characterised by four levels of degradation: (1) trails with an acceptable level of degradation; (2) threatened trails; (3) damaged trails; and (4) heavily damaged trails. Damaged trails were the most vulnerable of all trails and should be prioritised for appropriate conservation and restoration. We also proposed five types of monitoring of recreational trail conditions: (1) rapid inventory of negative impacts; (2) monitoring visitor numbers and variation in type of use; (3) change-oriented monitoring focusing on sections of trail which were subjected to changes in type or level of use or subjected to extreme weather events; (4) monitoring of dynamics of trail conditions; and (5) full assessment of trail conditions, to be carried out every 10-15 years. The application of the proposed framework can enhance the ability of Park managers to prioritise their trail management activities, enhancing trail conditions and visitor safety, while minimising adverse impacts on the conservation value of the ecosystem. A.M.T. was supported by the Polish Ministry of

  12. Classification of the PALMS single particle mass spectral data from Atlanta by regression tree analysis

    NASA Astrophysics Data System (ADS)

    Middlebrook, A. M.; Murphy, D. M.; Lee, S.; Thomson, D. S.

    2001-12-01

    During the Atlanta Supersites project in August 1999, the PALMS (Particle Analysis by Laser Mass Spectrometry) instrument collected over 500,000 individual particle spectra. The Atlanta data were originally analyzed by examining combinations of peaks and relative peak areas [Lee et al., 2001a,b], and a wide range of particle components such as sulfate, nitrate, mineral species, metals, organic species, and elemental carbon were detected. To further study the dataset, a classification program using regression tree analysis was developed and applied. Spectral data were compressed into a lower resolution spectrum (every 0.25 mass units) of the raw data and a list of peak areas (every mass unit). Each spectrum started as a normalized classification vector by itself. If the dot product of two classification vectors was within a certain threshold, they were combined into a new classification. The new classification vector was a normalized running average of the classifications being combined. In subsequent steps, the threshold for combining classifications was continuously lowered until a reasonable number of classifications remained. After the final iteration, each spectrum was compared individually with the entire set of classification vectors. Classifications were also combined manually. The classification results from the Atlanta data are generally consistent with those determined by peak identification. However, the classification program identified specific patterns in the mass spectra that were not found by peak identification and generated new particle types. Furthermore, rare particle types that may affect human health were studied in more detail. A description of the classification program as well as the results for the Atlanta data will be presented. Lee, S.-H., D. M. Murphy, D. S. Thomson, and A. M. Middlebrook, Chemical components of single particles measured with particle analysis by laser mass spectrometry (PALMS) during the Atlanta Supersites Project
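
The merging procedure described in this abstract can be sketched roughly as below. This is a toy reconstruction under stated assumptions, not the PALMS classification program: "within a certain threshold" is read here as "dot product at or above a similarity threshold", the "normalized running average" is taken to be a member-count-weighted mean that is rescaled to unit length, and the function names, spectra and threshold values are all invented for illustration.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def merge_pass(classes, threshold):
    """Merge pairs of classes until no pair has dot product >= threshold.

    Each class is a (unit_vector, member_count) tuple. A merged class gets
    the count-weighted average of the two vectors, renormalised.
    """
    merged = True
    while merged:
        merged = False
        for i in range(len(classes)):
            for j in range(i + 1, len(classes)):
                (u, n), (v, m) = classes[i], classes[j]
                if dot(u, v) >= threshold:
                    avg = [(n * a + m * b) / (n + m) for a, b in zip(u, v)]
                    classes[i] = (normalize(avg), n + m)
                    del classes[j]
                    merged = True
                    break
            if merged:
                break  # restart the scan with the updated class list
    return classes

def classify(spectra, thresholds):
    """Start with one class per spectrum, then merge at each threshold in turn."""
    classes = [(normalize(s), 1) for s in spectra]
    for t in thresholds:  # assumed to be given in decreasing order
        classes = merge_pass(classes, t)
    return classes

# Invented three-channel "spectra": two similar pairs and one outlier.
spectra = [[10, 1, 0], [9, 2, 0], [0, 1, 10], [0, 2, 9], [5, 5, 5]]
classes = classify(spectra, thresholds=[0.98, 0.9])
counts = [n for _, n in classes]
print(counts)
```

In this sketch the threshold schedule is fixed in advance; the abstract instead describes lowering the threshold until a reasonable number of classifications remains, which would replace the fixed list with a stopping rule on `len(classes)`.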

  13. Variances in the projections, resulting from CLIMEX, Boosted Regression Trees and Random Forests techniques

    NASA Astrophysics Data System (ADS)

    Shabani, Farzin; Kumar, Lalit; Solhjouy-fard, Samaneh

    2016-05-01

    The aim of this study was to compare and evaluate the capabilities of correlative and mechanistic modeling processes applied to the projection of future distributions of date palm in novel environments, and to establish a method of minimizing uncertainty in the projections of differing techniques. The study area comprises Middle Eastern countries. We compared the mechanistic model CLIMEX (CL) with the correlative models MaxEnt (MX), Boosted Regression Trees (BRT), and Random Forests (RF) to project current and future distributions of date palm (Phoenix dactylifera L.). The Global Climate Model (GCM) CSIRO-Mk3.0 (CS), using the A2 emissions scenario, was selected for making projections. Both indigenous and alien distribution data of the species were utilized in the modeling process. The common areas predicted by MX, BRT, RF, and CL from the CS GCM were extracted and compared to ascertain projection uncertainty levels of each individual technique. The common areas identified by all four modeling techniques were used to produce a map indicating suitable and unsuitable areas for date palm cultivation for Middle Eastern countries, for the present and the year 2100. The four different modeling approaches predict fairly different distributions. Projections from CL were more conservative than from MX. The BRT and RF were the most conservative methods in terms of projections for the current time. The combination of the final CL and MX projections for the present and 2100 provides higher certainty concerning those areas that will become highly suitable for future date palm cultivation. According to the four models, cold, hot, and wet stress, with differences on a regional basis, appear to be the major restrictions on future date palm distribution. The results demonstrate variances in the projections, resulting from different techniques. The assessment and interpretation of model projections requires reservations
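
The "common areas" step in this abstract amounts to intersecting the suitability maps of the individual models. A minimal sketch follows, assuming each model's projection has already been thresholded to a binary suitable/unsuitable grid; the grids and the `consensus` helper are hypothetical and carry no real projection data.

```python
def consensus(maps):
    """Return a grid that is True only where every model marks the cell suitable.

    `maps` is a list of equally sized 2-D grids (lists of rows) of 0/1 values.
    """
    return [[all(cells) for cells in zip(*rows)] for rows in zip(*maps)]

# Invented 2 x 3 binary suitability grids, one per modeling technique.
cl  = [[1, 1, 0], [0, 1, 1]]
mx  = [[1, 1, 1], [0, 1, 0]]
brt = [[1, 0, 0], [0, 1, 0]]
rf  = [[1, 1, 0], [1, 1, 0]]

common = consensus([cl, mx, brt, rf])
print(common)
```

Cells outside the consensus are where the techniques disagree, which is one simple way to flag projection uncertainty.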

  14. Marginal regression approach for additive hazards models with clustered current status data.

    PubMed

    Su, Pei-Fang; Chi, Yunchan

    2014-01-15

    Current status data arise naturally from tumorigenicity experiments, epidemiology studies, biomedicine, econometrics and demographic and sociology studies. Moreover, clustered current status data may occur with animals from the same litter in tumorigenicity experiments or with subjects from the same family in epidemiology studies. Because the only information extracted from current status data is whether the survival times are before or after the monitoring or censoring times, the nonparametric maximum likelihood estimator of survival function converges at a rate of n^(1/3) to a complicated limiting distribution. Hence, semiparametric regression models such as the additive hazards model have been extended for independent current status data to derive the test statistics, whose distributions converge at a rate of n^(1/2), for testing the regression parameters. However, a straightforward application of these statistical methods to clustered current status data is not appropriate because intracluster correlation needs to be taken into account. Therefore, this paper proposes two estimating functions for estimating the parameters in the additive hazards model for clustered current status data. The comparative results from simulation studies are presented, and the application of the proposed estimating functions to one real data set is illustrated.

  15. Predicting the occurrence of wildfires with binary structured additive regression models.

    PubMed

    Ríos-Pena, Laura; Kneib, Thomas; Cadarso-Suárez, Carmen; Marey-Pérez, Manuel

    2017-02-01

    Wildfires are one of the main environmental problems facing societies today, and in the case of Galicia (north-west Spain), they are the main cause of forest destruction. This paper used binary structured additive regression (STAR) for modelling the occurrence of wildfires in Galicia. Binary STAR models are a recent contribution to the classical logistic regression and binary generalized additive models. Their main advantage lies in their flexibility for modelling non-linear effects, while simultaneously incorporating spatial and temporal variables directly, thereby making it possible to reveal possible relationships among the variables considered. The results showed that the occurrence of wildfires depends on many covariates which display variable behaviour across space and time, and which largely determine the likelihood of ignition of a fire. The joint possibility of working on spatial scales with a resolution of 1 × 1 km cells and mapping predictions in a colour range makes STAR models a useful tool for plotting and predicting wildfire occurrence. Lastly, it will facilitate the development of fire behaviour models, which can be invaluable when it comes to drawing up fire-prevention and firefighting plans.

  16. Additive hazards regression and partial likelihood estimation for ecological monitoring data across space.

    PubMed

    Lin, Feng-Chang; Zhu, Jun

    2012-01-01

    We develop continuous-time models for the analysis of environmental or ecological monitoring data such that subjects are observed at multiple monitoring time points across space. Of particular interest are additive hazards regression models where the baseline hazard function can take on flexible forms. We consider time-varying covariates and take into account spatial dependence via autoregression in space and time. We develop statistical inference for the regression coefficients via partial likelihood. Asymptotic properties, including consistency and asymptotic normality, are established for parameter estimates under suitable regularity conditions. Feasible algorithms utilizing existing statistical software packages are developed for computation. We also consider a simpler additive hazards model with homogeneous baseline hazard and develop hypothesis testing for homogeneity. A simulation study demonstrates that the statistical inference using partial likelihood has sound finite-sample properties and offers a viable alternative to maximum likelihood estimation. For illustration, we analyze data from an ecological study that monitors bark beetle colonization of red pines in a plantation of Wisconsin.

  17. Regression analysis of mixed recurrent-event and panel-count data with additive rate models.

    PubMed

    Zhu, Liang; Zhao, Hui; Sun, Jianguo; Leisenring, Wendy; Robison, Leslie L

    2015-03-01

    Event-history studies of recurrent events are often conducted in fields such as demography, epidemiology, medicine, and social sciences (Cook and Lawless, 2007, The Statistical Analysis of Recurrent Events. New York: Springer-Verlag; Zhao et al., 2011, Test 20, 1-42). For such analysis, two types of data have been extensively investigated: recurrent-event data and panel-count data. However, in practice, one may face a third type of data, mixed recurrent-event and panel-count data or mixed event-history data. Such data occur if some study subjects are monitored or observed continuously and thus provide recurrent-event data, while the others are observed only at discrete times and hence give only panel-count data. A more general situation is that each subject is observed continuously over certain time periods but only at discrete times over other time periods. There exists little literature on the analysis of such mixed data except that published by Zhu et al. (2013, Statistics in Medicine 32, 1954-1963). In this article, we consider the regression analysis of mixed data using the additive rate model and develop some estimating equation-based approaches to estimate the regression parameters of interest. Both finite sample and asymptotic properties of the resulting estimators are established, and the numerical studies suggest that the proposed methodology works well for practical situations. The approach is applied to a Childhood Cancer Survivor Study that motivated this study.

  18. Tree Biomass Allocation and Its Model Additivity for Casuarina equisetifolia in a Tropical Forest of Hainan Island, China

    PubMed Central

    Xue, Yang; Yang, Zhongyang; Wang, Xiaoyan; Lin, Zhipan; Li, Dunxi; Su, Shaofeng

    2016-01-01

    Casuarina equisetifolia is commonly planted and used in the construction of coastal shelterbelt protection in Hainan Island. Thus, it is critical to accurately estimate the tree biomass of Casuarina equisetifolia L. for forest managers to evaluate the biomass stock in Hainan. The data for this work consisted of 72 trees, which were divided into three age groups: young forest, middle-aged forest, and mature forest. The proportion of biomass from the trunk significantly increased with age (P<0.05). However, the biomass of the branch and leaf decreased, and the biomass of the root did not change. To test whether the crown radius (CR) can improve biomass estimates of C. equisetifolia, we introduced CR into the biomass models. Here, six models were used to estimate the biomass of each component, including the trunk, the branch, the leaf, and the root. In each group, we selected one model among these six models for each component. The results showed that including the CR greatly improved the model performance and reduced the error, especially for the young and mature forests. In addition, to ensure biomass additivity, the selected equation for each component was fitted as a system of equations using seemingly unrelated regression (SUR). The SUR method not only gave efficient and accurate estimates but also achieved the logical additivity. The results in this study provide a robust estimation of tree biomass components and total biomass over three groups of C. equisetifolia. PMID:27002822

  19. Tree Biomass Allocation and Its Model Additivity for Casuarina equisetifolia in a Tropical Forest of Hainan Island, China.

    PubMed

    Xue, Yang; Yang, Zhongyang; Wang, Xiaoyan; Lin, Zhipan; Li, Dunxi; Su, Shaofeng

    2016-01-01

    Casuarina equisetifolia is commonly planted and used in the construction of coastal shelterbelt protection in Hainan Island. Thus, it is critical to accurately estimate the tree biomass of Casuarina equisetifolia L. for forest managers to evaluate the biomass stock in Hainan. The data for this work consisted of 72 trees, which were divided into three age groups: young forest, middle-aged forest, and mature forest. The proportion of biomass from the trunk significantly increased with age (P<0.05). However, the biomass of the branch and leaf decreased, and the biomass of the root did not change. To test whether the crown radius (CR) can improve biomass estimates of C. equisetifolia, we introduced CR into the biomass models. Here, six models were used to estimate the biomass of each component, including the trunk, the branch, the leaf, and the root. In each group, we selected one model among these six models for each component. The results showed that including the CR greatly improved the model performance and reduced the error, especially for the young and mature forests. In addition, to ensure biomass additivity, the selected equation for each component was fitted as a system of equations using seemingly unrelated regression (SUR). The SUR method not only gave efficient and accurate estimates but also achieved the logical additivity. The results in this study provide a robust estimation of tree biomass components and total biomass over three groups of C. equisetifolia.

  20. Multiple Additive Regression Trees a Methodology for Predictive Data Mining for Fraud Detection

    DTIC Science & Technology

    2002-09-01

    The Defense Finance Accounting Service DFAS-Operation Mongoose (Internal Review - Seaside) is using new and innovative techniques for fraud detection...

  1. Use of generalized regression tree models to characterize vegetation favoring Anopheles albimanus breeding.

    PubMed

    Hernandez, J E; Epstein, L D; Rodriguez, M H; Rodriguez, A D; Rejmankova, E; Roberts, D R

    1997-03-01

    We propose the use of generalized tree models (GTMs) to analyze data from entomological field studies. Generalized tree models can be used to characterize environments with different mosquito breeding capacity. A GTM simultaneously analyzes a set of predictor variables (e.g., vegetation coverage) in relation to a response variable (e.g., counts of Anopheles albimanus larvae), and how it varies with respect to a set of criterion variables (e.g., presence of predators). The algorithm produces a treelike graphical display with its root at the top and 2 branches stemming down from each node. At each node, conditions on the value of predictors partition the observations into subgroups (environments) in which the relation between response and criterion variables is most homogeneous.

  2. A Comparison of Logistic Regression, Neural Networks, and Classification Trees Predicting Success of Actuarial Students

    ERIC Educational Resources Information Center

    Schumacher, Phyllis; Olinsky, Alan; Quinn, John; Smith, Richard

    2010-01-01

    The authors extended previous research by 2 of the authors who conducted a study designed to predict the successful completion of students enrolled in an actuarial program. They used logistic regression to determine the probability of an actuarial student graduating in the major or dropping out. They compared the results of this study with those…

  3. Boosted structured additive regression for Escherichia coli fed-batch fermentation modeling.

    PubMed

    Melcher, Michael; Scharl, Theresa; Luchner, Markus; Striedner, Gerald; Leisch, Friedrich

    2017-02-01

    The quality of biopharmaceuticals and patients' safety are of highest priority and there are tremendous efforts to replace empirical production process designs by knowledge-based approaches. The main challenge in this context is that real-time access to process variables related to product quality and quantity is severely limited. To date, comprehensive on- and offline monitoring platforms are used to generate process data sets that allow for development of mechanistic and/or data-driven models for real-time prediction of these important quantities. The ultimate goal is to implement model-based feedback control loops that facilitate online control of product quality. In this contribution, we explore structured additive regression (STAR) models in combination with boosting as a variable selection tool for modeling the cell dry mass, product concentration, and optical density on the basis of online available process variables and two-dimensional fluorescence spectroscopic data. STAR models are powerful extensions of linear models allowing for inclusion of smooth effects or interactions between predictors. Boosting constructs the final model in a stepwise manner and provides a variable importance measure via predictor selection frequencies. Our results show that the cell dry mass can be modeled with a relative error of about ±3%, the optical density with ±6%, the soluble protein with ±16%, and the insoluble product with an accuracy of ±12%. Biotechnol. Bioeng. 2017;114: 321-334. © 2016 Wiley Periodicals, Inc.
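
The idea of boosting as a stepwise variable selector with selection frequencies can be sketched as follows. This is a simplified illustration, not the authors' STAR implementation: it uses component-wise L2 boosting with one linear base learner per predictor (no smooth effects or interactions), and the data, step count and step length `nu` are invented.

```python
def componentwise_boost(X, y, steps=50, nu=0.1):
    """Component-wise L2 boosting with one linear base learner per predictor.

    X is a list of centred predictor columns. At every step, each predictor is
    fitted to the current residuals on its own, the one that reduces the
    residual sum of squares most is selected, and only a fraction `nu` of its
    fitted coefficient is added to the model.
    Returns (offset, coefficients, selection counts).
    """
    n = len(y)
    offset = sum(y) / n
    resid = [v - offset for v in y]
    coef = [0.0] * len(X)
    picks = [0] * len(X)
    for _ in range(steps):
        best_j, best_b, best_rss = None, 0.0, float("inf")
        for j, col in enumerate(X):
            sxx = sum(c * c for c in col)
            b = sum(c * r for c, r in zip(col, resid)) / sxx
            rss = sum((r - b * c) ** 2 for c, r in zip(col, resid))
            if rss < best_rss:
                best_j, best_b, best_rss = j, b, rss
        coef[best_j] += nu * best_b
        picks[best_j] += 1
        resid = [r - nu * best_b * c for c, r in zip(X[best_j], resid)]
    return offset, coef, picks

# Invented data: x1 has a strong effect, x2 a weak one.
x1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
x2 = [1.0, -1.0, 0.0, -1.0, 1.0]
y = [10 + 3 * a + 0.2 * b for a, b in zip(x1, x2)]

offset, coef, picks = componentwise_boost([x1, x2], y)
freq = [p / sum(picks) for p in picks]  # selection frequencies
print(coef, freq)
```

Predictors that are selected more often have contributed more to reducing the residuals, which is the sense in which selection frequency serves as an importance measure here.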

  4. Grassland and cropland net ecosystem production of the U.S. Great Plains: Regression tree model development and comparative analysis

    USGS Publications Warehouse

    Wylie, Bruce K.; Howard, Daniel; Dahal, Devendra; Gilmanov, Tagir; Ji, Lei; Zhang, Li; Smith, Kelcy

    2016-01-01

    This paper presents the methodology and results of two ecological-based net ecosystem production (NEP) regression tree models capable of upscaling measurements made at various flux tower sites throughout the U.S. Great Plains. Separate grassland and cropland NEP regression tree models were trained using various remote sensing data and other biogeophysical data, along with 15 flux towers contributing to the grassland model and 15 flux towers for the cropland model. The models yielded weekly mean daily grassland and cropland NEP maps of the U.S. Great Plains at 250 m resolution for 2000–2008. The grassland and cropland NEP maps were spatially summarized and statistically compared. The results of this study indicate that grassland and cropland ecosystems generally performed as weak net carbon (C) sinks, absorbing more C from the atmosphere than they released from 2000 to 2008. Grasslands demonstrated higher carbon sink potential (139 g C·m−2·year−1) than non-irrigated croplands. A closer look into the weekly time series reveals the C fluctuation through time and space for each land cover type.

  5. Comparative Analysis of Decision Trees with Logistic Regression in Predicting Fault-Prone Classes

    NASA Astrophysics Data System (ADS)

    Singh, Yogesh; Takkar, Arvinder Kaur; Malhotra, Ruchika

    Metrics are available for predicting fault-prone classes, which may help software organizations plan and perform testing activities. This may be possible due to proper allocation of resources to fault-prone parts of the design and code of the software. Hence, the importance and usefulness of such metrics is understandable, but empirical validation of these metrics is always a great challenge. Decision Tree (DT) methods have been successfully applied for solving classification problems in many applications. This paper evaluates the capability of three DT methods and compares their performance with that of a statistical method in predicting fault-prone software classes using a publicly available NASA data set. The results indicate that the prediction performance of DT is generally better than that of the statistical model. However, similar studies are required in order to establish the acceptability of the DT models.

  6. A Comparative Assessment of the Influences of Human Impacts on Soil Cd Concentrations Based on Stepwise Linear Regression, Classification and Regression Tree, and Random Forest Models

    PubMed Central

    Qiu, Lefeng; Wang, Kai; Long, Wenli; Wang, Ke; Hu, Wei; Amable, Gabriel S.

    2016-01-01

    Soil cadmium (Cd) contamination has attracted a great deal of attention because of its detrimental effects on animals and humans. This study aimed to develop and compare the performances of stepwise linear regression (SLR), classification and regression tree (CART) and random forest (RF) models in the prediction and mapping of the spatial distribution of soil Cd and to identify likely sources of Cd accumulation in Fuyang County, eastern China. Soil Cd data from 276 topsoil (0–20 cm) samples were collected and randomly divided into calibration (222 samples) and validation datasets (54 samples). Auxiliary data, including detailed land use information, soil organic matter, soil pH, and topographic data, were incorporated into the models to simulate the soil Cd concentrations and further identify the main factors influencing soil Cd variation. The predictive models for soil Cd concentration exhibited acceptable overall accuracies (72.22% for SLR, 70.37% for CART, and 75.93% for RF). The SLR model exhibited the largest predicted deviation, with a mean error (ME) of 0.074 mg/kg, a mean absolute error (MAE) of 0.160 mg/kg, and a root mean squared error (RMSE) of 0.274 mg/kg, and the RF model produced the results closest to the observed values, with an ME of 0.002 mg/kg, an MAE of 0.132 mg/kg, and an RMSE of 0.198 mg/kg. The RF model also exhibited the greatest R2 value (0.772). The CART model predictions closely followed, with ME, MAE, RMSE, and R2 values of 0.013 mg/kg, 0.154 mg/kg, 0.230 mg/kg and 0.644, respectively. The three prediction maps generally exhibited similar and realistic spatial patterns of soil Cd contamination. The heavily Cd-affected areas were primarily located in the alluvial valley plain of the Fuchun River and its tributaries because of the dramatic industrialization and urbanization processes that have occurred there. The most important variable for explaining high levels of soil Cd accumulation was the presence of metal smelting industries. The
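
The validation statistics quoted in this abstract (ME, MAE, RMSE and R2) can be computed as in the sketch below. The sign convention for ME (predicted minus observed) and the definition of R2 as one minus the ratio of residual to total sum of squares are assumptions, since the abstract does not state them; the numbers are invented and are not the Fuyang County samples.

```python
import math

def error_metrics(observed, predicted):
    """Return mean error, mean absolute error, RMSE and R2 for paired values."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]  # assumed sign convention
    mean_obs = sum(observed) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return {
        "ME": sum(errors) / n,                    # bias: positive means over-prediction
        "MAE": sum(abs(e) for e in errors) / n,
        "RMSE": math.sqrt(ss_res / n),
        "R2": 1 - ss_res / ss_tot,                # assumed definition
    }

# Invented validation values in mg/kg.
observed = [0.2, 0.4, 0.6, 0.8]
predicted = [0.3, 0.3, 0.7, 0.7]
metrics = error_metrics(observed, predicted)
print(metrics)
```

ME close to zero with a clearly non-zero MAE, as in this toy case, indicates errors that cancel on average rather than an accurate model, which is why the abstract reports all of these measures together.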

  7. Drought forecasting in eastern Australia using multivariate adaptive regression spline, least square support vector machine and M5Tree model

    NASA Astrophysics Data System (ADS)

    Deo, Ravinesh C.; Kisi, Ozgur; Singh, Vijay P.

    2017-02-01

    Drought forecasting using standardized metrics of rainfall is a core task in hydrology and water resources management. Standardized Precipitation Index (SPI) is a rainfall-based metric that caters for different time-scales at which the drought occurs, and due to its standardization, is well-suited for forecasting drought at different periods in climatically diverse regions. This study advances drought modelling using multivariate adaptive regression splines (MARS), least square support vector machine (LSSVM), and M5Tree models by forecasting SPI in eastern Australia. The MARS model incorporated rainfall as mandatory predictor with month (periodicity), Southern Oscillation Index, Pacific Decadal Oscillation Index and Indian Ocean Dipole, ENSO Modoki and Nino 3.0, 3.4 and 4.0 data added gradually. The performance was evaluated with root mean square error (RMSE), mean absolute error (MAE), and coefficient of determination (r2). The best MARS model required different input combinations, where rainfall, sea surface temperature and periodicity were used for all stations, but ENSO Modoki and Pacific Decadal Oscillation indices were not required for Bathurst, Collarenebri and Yamba, and the Southern Oscillation Index was not required for Collarenebri. Inclusion of periodicity increased the r2 value by 0.5-8.1% and reduced RMSE by 3.0-178.5%. Comparisons showed that MARS superseded the performance of the other counterparts for three out of five stations with lower MAE by 15.0-73.9% and 7.3-42.2%, respectively. For the other stations, M5Tree was better than MARS/LSSVM with lower MAE by 13.8-13.4% and 25.7-52.2%, respectively, and for Bathurst, LSSVM yielded more accurate result. For droughts identified by SPI ≤ -0.5, accurate forecasts were attained by MARS/M5Tree for Bathurst, Yamba and Peak Hill, whereas for Collarenebri and Barraba, M5Tree was better than LSSVM/MARS. Seasonal analysis revealed disparate results where MARS/M5Tree was better than LSSVM. The results highlight the

  8. Further Insight and Additional Inference Methods for Polynomial Regression Applied to the Analysis of Congruence

    ERIC Educational Resources Information Center

    Cohen, Ayala; Nahum-Shani, Inbal; Doveh, Etti

    2010-01-01

    In their seminal paper, Edwards and Parry (1993) presented the polynomial regression as a better alternative to applying difference score in the study of congruence. Although this method is increasingly applied in congruence research, its complexity relative to other methods for assessing congruence (e.g., difference score methods) was one of the…

  9. Strengthening the Regression Discontinuity Design Using Additional Design Elements: A Within-Study Comparison

    ERIC Educational Resources Information Center

    Wing, Coady; Cook, Thomas D.

    2013-01-01

    The sharp regression discontinuity design (RDD) has three key weaknesses compared to the randomized clinical trial (RCT). It has lower statistical power, it is more dependent on statistical modeling assumptions, and its treatment effect estimates are limited to the narrow subpopulation of cases immediately around the cutoff, which is rarely of…

  10. An Additional Measure of Overall Effect Size for Logistic Regression Models

    ERIC Educational Resources Information Center

    Allen, Jeff; Le, Huy

    2008-01-01

    Users of logistic regression models often need to describe the overall predictive strength, or effect size, of the model's predictors. Analogs of R[superscript 2] have been developed, but none of these measures are interpretable on the same scale as effects of individual predictors. Furthermore, R[superscript 2] analogs are not invariant to the…

  11. Analysis of the effect of evergreen and deciduous trees on urban nitrogen dioxide levels in the U.S. using land-use regression

    NASA Astrophysics Data System (ADS)

    Rao, M.; George, L. A.

    2012-12-01

    Nitrogen dioxide (NO2), an atmospheric pollutant generated primarily by anthropogenic combustion processes, is typically found at higher concentrations in urban areas compared to non-urbanized environments. Elevated NO2 levels have multiple ecosystem effects at different spatial scales. At the local scale, elevated levels affect human health directly and through the formation of secondary pollutants such as ozone and aerosols; at the regional scale secondary pollutants such as nitric acid and organic nitrates have deleterious effects on non-urbanized areas; and, at the global scale, nitrogen oxide emissions significantly alter the natural biogeochemical nitrogen cycle. As cities globally become larger and larger sources of nitrogen oxide emissions, it is important to assess possible mitigation strategies to reduce the impact of emissions locally, regionally and globally. In this study, we build a national land-use regression (LUR) model to compare the impacts of deciduous and evergreen trees on urban NO2 levels in the United States. We use the EPA monitoring network values of NO2 levels for 2006, the 2006 NLCD tree canopy data for deciduous and evergreen canopies, and the US Census Bureau's TIGER shapefiles for roads, railroads, impervious area and population density as proxies for the NO2 sources on-road traffic, railroad traffic, off-road and area sources, respectively. Our preliminary LUR model corroborates previous LUR studies showing that the presence of trees is associated with reduced urban NO2 levels. Additionally, our model indicates that deciduous and evergreen trees reduce NO2 to different extents, and that the amount of NO2 reduced varies seasonally. The model indicates that every square kilometer of deciduous canopy within a 2 km buffer is associated with a reduction in ambient NO2 levels of 0.64 ppb in summer and 0.46 ppb in winter. Similarly, every square kilometer of evergreen tree canopy within a 2 km buffer is associated with a reduction in ambient NO2 by

  12. Additional results on 'Reducing geometric dilution of precision using ridge regression'

    NASA Astrophysics Data System (ADS)

    Kelly, Robert J.

    1990-07-01

    Kelly (1990) presented preliminary results on the feasibility of using ridge regression (RR) to reduce the effects of geometric dilution of precision (GDOP) error inflation in position-fix navigation systems. Recent results indicate that RR will not reduce GDOP bias inflation when biaslike measurement errors last much longer than the aircraft guidance-loop response time. This conclusion precludes the use of RR on navigation systems whose dominant error sources are biaslike; e.g., the GPS selective-availability error source. The simulation results given by Kelly are, however, valid for the conditions defined. Although RR has not yielded a satisfactory solution to the general GDOP problem, it has illuminated the role that multicollinearity plays in navigation signal processors such as the Kalman filter. Bias inflation, initial position guess errors, ridge-parameter selection methodology, and the recursive ridge filter are discussed.

  13. An optimal sample data usage strategy to minimize overfitting and underfitting effects in regression tree models based on remotely-sensed data

    USGS Publications Warehouse

    Gu, Yingxin; Wylie, Bruce K.; Boyte, Stephen; Picotte, Joshua J.; Howard, Danny; Smith, Kelcy; Nelson, Kurtis

    2016-01-01

    Regression tree models have been widely used for remote sensing-based ecosystem mapping. Improper use of the sample data (model training and testing data) may cause overfitting and underfitting effects in the model. The goal of this study is to develop an optimal sampling data usage strategy for any dataset and identify an appropriate number of rules in the regression tree model that will improve its accuracy and robustness. Landsat 8 data and Moderate-Resolution Imaging Spectroradiometer-scaled Normalized Difference Vegetation Index (NDVI) were used to develop regression tree models. A Python procedure was designed to generate random replications of model parameter options across a range of model development data sizes and rule number constraints. The mean absolute difference (MAD) between the predicted and actual NDVI (scaled NDVI, value from 0–200) and its variability across the different randomized replications were calculated to assess the accuracy and stability of the models. In our case study, a six-rule regression tree model developed from 80% of the sample data had the lowest MAD (MADtraining = 2.5 and MADtesting = 2.4), which was suggested as the optimal model. This study demonstrates how the training data and rule number selections impact model accuracy and provides important guidance for future remote-sensing-based ecosystem modeling.
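
    The replication procedure described in this abstract can be sketched in Python as repeated random splits of the sample data, with the mean absolute difference (MAD) and its spread recorded for each training fraction. This is an illustration under stated assumptions, not the authors' procedure: the data are synthetic, the function names are hypothetical, and a simple least-squares line stands in for the regression tree to keep the example short.

```python
import random
import statistics


def mean_absolute_difference(actual, predicted):
    """MAD between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def fit_line(xs, ys):
    """Least-squares line; a stand-in for the regression tree model."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def replicate_mad(xs, ys, train_fraction, n_replications, seed=0):
    """Repeat random train/test splits and summarise training and testing MAD."""
    rng = random.Random(seed)
    indices = list(range(len(xs)))
    n_train = int(round(train_fraction * len(xs)))
    train_mads, test_mads = [], []
    for _ in range(n_replications):
        rng.shuffle(indices)
        train, test = indices[:n_train], indices[n_train:]
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        train_mads.append(mean_absolute_difference(
            [ys[i] for i in train], [slope * xs[i] + intercept for i in train]))
        test_mads.append(mean_absolute_difference(
            [ys[i] for i in test], [slope * xs[i] + intercept for i in test]))
    return {
        "train_mean": statistics.mean(train_mads),
        "test_mean": statistics.mean(test_mads),
        "test_sd": statistics.stdev(test_mads),  # stability across replications
    }


# Synthetic predictor and scaled-NDVI-like response, for illustration only.
data_rng = random.Random(42)
xs = [float(i) for i in range(50)]
ys = [100.0 + 1.5 * x + data_rng.gauss(0.0, 3.0) for x in xs]

for fraction in (0.5, 0.6, 0.7, 0.8, 0.9):
    summary = replicate_mad(xs, ys, fraction, n_replications=20)
    print(fraction, {k: round(v, 2) for k, v in summary.items()})
```

    Comparing the testing MAD and its standard deviation across fractions is the kind of evidence the abstract uses to pick a training share; a second loop over a model-complexity setting would play the role of the rule-number constraint.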

  14. Using hierarchical tree-based regression model to predict train-vehicle crashes at passive highway-rail grade crossings.

    PubMed

    Yan, Xuedong; Richards, Stephen; Su, Xiaogang

    2010-01-01

    This paper applies a nonparametric statistical method, hierarchical tree-based regression (HTBR), to explore train-vehicle crash prediction and analysis at passive highway-rail grade crossings. Using the Federal Railroad Administration (FRA) database, the research focuses on 27 years of train-vehicle accident history in the United States from 1980 through 2006. A cross-sectional statistical analysis based on HTBR is conducted for public highway-rail grade crossings that were upgraded from crossbuck-only to stop signs without involvement of other traffic-control devices or automatic countermeasures. In this study, HTBR models are developed to predict train-vehicle crash frequencies for passive grade crossings controlled by crossbucks only and crossbucks combined with stop signs respectively, and assess how the crash frequencies change after the stop-sign treatment is applied at the crossbuck-only-controlled crossings. The study results indicate that stop-sign treatment is an effective engineering countermeasure to improve safety at the passive grade crossings. Decision makers and traffic engineers can use the HTBR models to examine train-vehicle crash frequency at passive crossings and assess the potential effectiveness of stop-sign treatment based on specific attributes of the given crossings.

  15. Simulating California reservoir operation using the classification and regression-tree algorithm combined with a shuffled cross-validation scheme

    NASA Astrophysics Data System (ADS)

    Yang, Tiantian; Gao, Xiaogang; Sorooshian, Soroosh; Li, Xin

    2016-03-01

    The controlled outflows from a reservoir or dam are highly dependent on the decisions made by the reservoir operators, rather than on natural hydrological processes. A difference exists between the natural upstream inflows to reservoirs and the controlled outflows from reservoirs that supply the downstream users. With the decision maker's awareness of changing climate, reservoir management requires adaptable means to incorporate more information into decision making, such as water delivery requirement, environmental constraints, dry/wet conditions, etc. In this paper, a robust reservoir outflow simulation model is presented, which incorporates one of the well-developed data-mining models (Classification and Regression Tree) to predict the complicated human-controlled reservoir outflows and extract the reservoir operation patterns. A shuffled cross-validation approach is further implemented to improve CART's predictive performance. An application study of nine major reservoirs in California is carried out. Results produced by the enhanced CART, original CART, and random forest are compared with observation. The statistical measurements show that the enhanced CART and random forest outperform the CART control run in general, and the enhanced CART algorithm gives a better predictive performance than random forest in simulating the peak flows. The results also show that the proposed model is able to consistently and reasonably predict the expert release decisions. Experiments indicate that the release operation at Lake Oroville is significantly dominated by SWP allocation amount and that reservoirs with low elevation are more sensitive to inflow amount than others.

  16. Estimating Dbh of Trees Employing Multiple Linear Regression of the best Lidar-Derived Parameter Combination Automated in Python in a Natural Broadleaf Forest in the Philippines

    NASA Astrophysics Data System (ADS)

    Ibanez, C. A. G.; Carcellar, B. G., III; Paringit, E. C.; Argamosa, R. J. L.; Faelga, R. A. G.; Posilero, M. A. V.; Zaragosa, G. P.; Dimayacyac, N. A.

    2016-06-01

    Diameter-at-breast-height (DBH) estimation is a prerequisite in various allometric equations estimating important forestry indices like stem volume, basal area, biomass and carbon stock. LiDAR technology provides a means of directly obtaining different forest parameters, except DBH, from the behavior and characteristics of point clouds unique to different forest classes. An extensive tree inventory was done on a two-hectare established sample plot in Mt. Makiling, Laguna for a natural growth forest. Coordinates, height, and canopy cover were measured and types of species were identified to compare to LiDAR derivatives. Multiple linear regression was used to get LiDAR-derived DBH by integrating field-derived DBH and 27 LiDAR-derived parameters at 20 m, 10 m, and 5 m grid resolutions. To find the best combination of parameters for DBH estimation, all possible combinations of parameters were generated automatically using Python scripts, with additional regression-related libraries such as NumPy, SciPy, and scikit-learn. The combination that yields the highest r-squared (coefficient of determination) and the lowest AIC (Akaike's Information Criterion) and BIC (Bayesian Information Criterion) was determined to be the best equation. The equation is at its best using 11 parameters at a 10 m grid size, with an r-squared of 0.604, an AIC of 154.04 and a BIC of 175.08. The combination of parameters may differ among forest classes in further studies. Additional statistical tests, such as the Kaiser-Meyer-Olkin (KMO) coefficient and Bartlett's Test of Sphericity (BTS), can be used to help determine the correlation among parameters.
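
    The exhaustive search over parameter combinations described in this abstract can be sketched in plain Python. This is an illustration under stated assumptions rather than the authors' script: the three predictors and the DBH values are synthetic, all names are hypothetical, and AIC and BIC use the common least-squares forms n·ln(RSS/n) + 2k and n·ln(RSS/n) + k·ln(n), which may differ by a constant from the values a particular library reports.

```python
import itertools
import math
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def ols_rss(columns, y):
    """Fit y on an intercept plus the given columns; return the residual sum of squares."""
    design = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(design[0])
    xtx = [[sum(row[p] * row[q] for row in design) for q in range(k)] for p in range(k)]
    xty = [sum(row[p] * y[i] for i, row in enumerate(design)) for p in range(k)]
    beta = solve_linear_system(xtx, xty)
    return sum((y[i] - sum(b * v for b, v in zip(beta, row))) ** 2
               for i, row in enumerate(design))


def score_subsets(predictors, y):
    """Score every non-empty predictor subset by r-squared, AIC and BIC."""
    n = len(y)
    mean_y = sum(y) / n
    tss = sum((v - mean_y) ** 2 for v in y)
    names = sorted(predictors)
    results = []
    for size in range(1, len(names) + 1):
        for subset in itertools.combinations(names, size):
            rss = ols_rss([predictors[name] for name in subset], y)
            k = size + 1  # slopes plus intercept
            results.append({
                "subset": subset,
                "r2": 1.0 - rss / tss,
                "aic": n * math.log(rss / n) + 2 * k,
                "bic": n * math.log(rss / n) + k * math.log(n),
            })
    return results


# Synthetic stand-ins for LiDAR-derived parameters and field DBH (illustration only).
rng = random.Random(1)
n_trees = 40
predictors = {
    "height": [rng.uniform(5, 30) for _ in range(n_trees)],
    "crown_width": [rng.uniform(1, 8) for _ in range(n_trees)],
    "intensity": [rng.uniform(0, 1) for _ in range(n_trees)],
}
dbh = [5 + 2 * h + 0.5 * c + rng.gauss(0, 1)
       for h, c in zip(predictors["height"], predictors["crown_width"])]

results = score_subsets(predictors, dbh)
best = min(results, key=lambda r: r["bic"])
print(best["subset"], round(best["r2"], 3), round(best["aic"], 2), round(best["bic"], 2))
```

    Because r-squared never decreases when a predictor is added, the sketch ranks by BIC and reports r-squared alongside it. With 27 candidate parameters the number of subsets exceeds 134 million, so a search of this kind is usually capped at a maximum subset size.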

  17. PARAMETRIC AND NON PARAMETRIC (MARS: MULTIVARIATE ADDITIVE REGRESSION SPLINES) LOGISTIC REGRESSIONS FOR PREDICTION OF A DICHOTOMOUS RESPONSE VARIABLE WITH AN EXAMPLE FOR PRESENCE/ABSENCE OF AMPHIBIANS

    EPA Science Inventory

    The purpose of this report is to provide a reference manual that could be used by investigators for making informed use of logistic regression using two methods (standard logistic regression and MARS). The details for analyses of relationships between a dependent binary response ...

  18. Multi-scale remote sensing sagebrush characterization with regression trees over Wyoming, USA: laying a foundation for monitoring

    USGS Publications Warehouse

    Homer, Collin G.; Aldridge, Cameron L.; Meyer, Debra K.; Schell, Spencer J.

    2012-01-01

    Sagebrush ecosystems in North America have experienced extensive degradation since European settlement. Further degradation continues from exotic invasive plants, altered fire frequency, intensive grazing practices, oil and gas development, and climate change – adding urgency to the need for ecosystem-wide understanding. Remote sensing is often identified as a key information source to facilitate ecosystem-wide characterization, monitoring, and analysis; however, approaches that characterize sagebrush with sufficient and accurate local detail across large enough areas to support this paradigm are unavailable. We describe the development of a new remote sensing sagebrush characterization approach for the state of Wyoming, U.S.A. This approach integrates 2.4 m QuickBird, 30 m Landsat TM, and 56 m AWiFS imagery into the characterization of four primary continuous field components including percent bare ground, percent herbaceous cover, percent litter, and percent shrub, and four secondary components including percent sagebrush (Artemisia spp.), percent big sagebrush (Artemisia tridentata), percent Wyoming sagebrush (Artemisia tridentata Wyomingensis), and shrub height using a regression tree. According to an independent accuracy assessment, primary component root mean square error (RMSE) values ranged from 4.90 to 10.16 for 2.4 m QuickBird, 6.01 to 15.54 for 30 m Landsat, and 6.97 to 16.14 for 56 m AWiFS. Shrub and herbaceous components outperformed the current data standard called LANDFIRE, with a shrub RMSE value of 6.04 versus 12.64 and a herbaceous component RMSE value of 12.89 versus 14.63. This approach offers new advancements in sagebrush characterization from remote sensing and provides a foundation to quantitatively monitor these components into the future.

  19. Multi-scale remote sensing sagebrush characterization with regression trees over Wyoming, USA: Laying a foundation for monitoring

    NASA Astrophysics Data System (ADS)

    Homer, Collin G.; Aldridge, Cameron L.; Meyer, Debra K.; Schell, Spencer J.

    2012-02-01

    Sagebrush ecosystems in North America have experienced extensive degradation since European settlement. Further degradation continues from exotic invasive plants, altered fire frequency, intensive grazing practices, oil and gas development, and climate change - adding urgency to the need for ecosystem-wide understanding. Remote sensing is often identified as a key information source to facilitate ecosystem-wide characterization, monitoring, and analysis; however, approaches that characterize sagebrush with sufficient and accurate local detail across large enough areas to support this paradigm are unavailable. We describe the development of a new remote sensing sagebrush characterization approach for the state of Wyoming, U.S.A. This approach integrates 2.4 m QuickBird, 30 m Landsat TM, and 56 m AWiFS imagery into the characterization of four primary continuous field components including percent bare ground, percent herbaceous cover, percent litter, and percent shrub, and four secondary components including percent sagebrush (Artemisia spp.), percent big sagebrush (Artemisia tridentata), percent Wyoming sagebrush (Artemisia tridentata Wyomingensis), and shrub height using a regression tree. According to an independent accuracy assessment, primary component root mean square error (RMSE) values ranged from 4.90 to 10.16 for 2.4 m QuickBird, 6.01 to 15.54 for 30 m Landsat, and 6.97 to 16.14 for 56 m AWiFS. Shrub and herbaceous components outperformed the current data standard called LANDFIRE, with a shrub RMSE value of 6.04 versus 12.64 and a herbaceous component RMSE value of 12.89 versus 14.63. This approach offers new advancements in sagebrush characterization from remote sensing and provides a foundation to quantitatively monitor these components into the future.

  20. Trees

    ERIC Educational Resources Information Center

    Al-Khaja, Nawal

    2007-01-01

    This is a thematic lesson plan for young learners about palm trees and the importance of taking care of them. The two part lesson teaches listening, reading and speaking skills. The lesson includes parts of a tree; the modal auxiliary, can; dialogues and a role play activity.

  1. Predicting tree species presence and basal area in Utah: A comparison of stochastic gradient boosting, generalized additive models, and tree-based methods

    USGS Publications Warehouse

    Moisen, G.G.; Freeman, E.A.; Blackard, J.A.; Frescino, T.S.; Zimmermann, N.E.; Edwards, T.C.

    2006-01-01

    Many efforts are underway to produce broad-scale forest attribute maps by modelling forest class and structure variables collected in forest inventories as functions of satellite-based and biophysical information. Typically, variants of classification and regression trees implemented in Rulequest's See5 and Cubist (for binary and continuous responses, respectively) are the tools of choice in many of these applications. These tools are widely used in large remote sensing applications, but are not easily interpretable, do not have ties with survey estimation methods, and use proprietary unpublished algorithms. Consequently, three alternative modelling techniques were compared for mapping presence and basal area of 13 species located in the mountain ranges of Utah, USA. The modelling techniques compared included the widely used See5/Cubist, generalized additive models (GAMs), and stochastic gradient boosting (SGB). Model performance was evaluated using independent test data sets. Evaluation criteria for mapping species presence included specificity, sensitivity, Kappa, and area under the curve (AUC). Evaluation criteria for the continuous basal area variables included correlation and relative mean squared error. For predicting species presence (setting thresholds to maximize Kappa), SGB had higher values for the majority of the species for specificity and Kappa, while GAMs had higher values for the majority of the species for sensitivity. In evaluating resultant AUC values, GAM and/or SGB models had significantly better results than the See5 models where significant differences could be detected between models. For nine out of 13 species, basal area prediction results for all modelling techniques were poor (correlations less than 0.5 and relative mean squared errors greater than 0.8), but SGB provided the most stable predictions in these instances. SGB and Cubist performed equally well for modelling basal area for three species with moderate prediction success
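
    The presence/absence criteria named in this abstract (sensitivity, specificity and Kappa, with the threshold set to maximize Kappa) can be illustrated with a short Python sketch. The observations, predicted probabilities, candidate thresholds and function names are hypothetical and are not taken from the study.

```python
def presence_metrics(observed, predicted):
    """Sensitivity, specificity and Cohen's Kappa for 0/1 presence data."""
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o == 1 and p == 1)
    tn = sum(1 for o, p in pairs if o == 0 and p == 0)
    fp = sum(1 for o, p in pairs if o == 0 and p == 1)
    fn = sum(1 for o, p in pairs if o == 1 and p == 0)
    n = tp + tn + fp + fn
    agreement = (tp + tn) / n
    # Agreement expected by chance from the row and column totals.
    chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "kappa": (agreement - chance) / (1.0 - chance),
    }


def best_kappa_threshold(observed, probabilities, thresholds):
    """Return the candidate threshold whose 0/1 predictions give the highest Kappa."""
    def kappa_at(threshold):
        predicted = [1 if p >= threshold else 0 for p in probabilities]
        return presence_metrics(observed, predicted)["kappa"]
    return max(thresholds, key=kappa_at)


# Hypothetical test plots: observed presence and modelled probability of presence.
observed = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
probabilities = [0.9, 0.8, 0.6, 0.4, 0.3, 0.2, 0.1, 0.1, 0.7, 0.2]

threshold = best_kappa_threshold(observed, probabilities, [0.3, 0.4, 0.5, 0.6, 0.7])
predicted = [1 if p >= threshold else 0 for p in probabilities]
print(threshold, presence_metrics(observed, predicted))
```

    The sketch assumes both presences and absences occur in the observed data; otherwise sensitivity or specificity is undefined.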

  2. Understanding Child Stunting in India: A Comprehensive Analysis of Socio-Economic, Nutritional and Environmental Determinants Using Additive Quantile Regression

    PubMed Central

    Fenske, Nora; Burns, Jacob; Hothorn, Torsten; Rehfuess, Eva A.

    2013-01-01

    Background Most attempts to address undernutrition, responsible for one third of global child deaths, have fallen behind expectations. This suggests that the assumptions underlying current modelling and intervention practices should be revisited. Objective We undertook a comprehensive analysis of the determinants of child stunting in India, and explored whether the established focus on linear effects of single risks is appropriate. Design Using cross-sectional data for children aged 0–24 months from the Indian National Family Health Survey for 2005/2006, we populated an evidence-based diagram of immediate, intermediate and underlying determinants of stunting. We modelled linear, non-linear, spatial and age-varying effects of these determinants using additive quantile regression for four quantiles of the Z-score of standardized height-for-age and logistic regression for stunting and severe stunting. Results At least one variable within each of eleven groups of determinants was significantly associated with height-for-age in the 35% Z-score quantile regression. The non-modifiable risk factors child age and sex, and the protective factors household wealth, maternal education and BMI showed the largest effects. Being a twin or multiple birth was associated with dramatically decreased height-for-age. Maternal age, maternal BMI, birth order and number of antenatal visits influenced child stunting in non-linear ways. Findings across the four quantile and two logistic regression models were largely comparable. Conclusions Our analysis confirms the multifactorial nature of child stunting. It emphasizes the need to pursue a systems-based approach and to consider non-linear effects, and suggests that differential effects across the height-for-age distribution do not play a major role. PMID:24223839

  3. Regionalization of meso-scale physically based nitrogen modeling outputs to the macro-scale by the use of regression trees

    NASA Astrophysics Data System (ADS)

    Künne, A.; Fink, M.; Kipka, H.; Krause, P.; Flügel, W.-A.

    2012-06-01

    In this paper, a method is presented to estimate excess nitrogen on large scales considering single field processes. The approach was implemented by using the physically based model J2000-S to simulate the nitrogen balance as well as the hydrological dynamics within meso-scale test catchments. The model input data, the parameterization, the results and a detailed system understanding were used to generate the regression tree models with GUIDE (Loh, 2002). For each landscape type in the federal state of Thuringia a regression tree was calibrated and validated using the model data and results of excess nitrogen from the test catchments. Hydrological parameters such as precipitation and evapotranspiration were also used to predict excess nitrogen by the regression tree model. Hence they had to be calculated and regionalized as well for the state of Thuringia. Here the model J2000g was used to simulate the water balance on the macro scale. With the regression trees the excess nitrogen was regionalized for each landscape type of Thuringia. The approach allows calculating the potential nitrogen input into the streams of the drainage area. The results show that the applied methodology was able to transfer the detailed model results of the meso-scale catchments to the entire state of Thuringia with low computing time and without losing the detailed knowledge from the nitrogen transport modeling. This was validated with modeling results from Fink (2004) in a catchment lying in the regionalization area. The regionalized and modeled excess nitrogen values show a correspondence of 94%. The study was conducted within the framework of a project in collaboration with the Thuringian Environmental Ministry, whose overall aim was to assess the effect of agro-environmental measures regarding load reduction in the water bodies of Thuringia to fulfill the requirements of the European Water Framework Directive (Bäse et al., 2007; Fink, 2006; Fink et al., 2007).

  4. Effects of Additives, Photodegradation, and Water-tree Degradation on the Photoluminescence in Polyethylene and Polypropylene

    NASA Astrophysics Data System (ADS)

    Ito, Toshihide; Fuse, Norikazu; Ohki, Yoshimichi

    Photoluminescence (PL) spectra induced by irradiation of ultraviolet photons are compared among low-density polyethylene (LDPE), crosslinked polyethylene (XLPE), and polypropylene (PP). Three PL bands appear around 4.2, 3.6, and 3.1 eV in LDPE and XLPE, while three similar PL bands are observed at similar energies in PP. The PL spectra and their decay profiles are independent of the presence of additives and are also independent of whether the samples were crosslinked or not. These results indicate that neither the additives nor the crosslinking has any significant effect on the three respective PL bands in PE and PP. When the sample was pre-irradiated by the ultraviolet photons under different atmospheres (air, O2, and vacuum), all the PL intensities decreased with the progress of the pre-irradiation regardless of whether the sample was PE or PP. Therefore, all the PL bands are considered to result from impurities. In all the pre-irradiated samples, a new PL band appears at 2.9 eV, whose intensity is stronger when the oxygen partial pressure during the pre-irradiation was lower. This PL is considered to be due to photo-induced conjugated double bonds. It has also been confirmed that water-tree degradation in LDPE or in XLPE does not contribute to PL.

  5. Separating the effects of water physicochemistry and sediment contamination on Chironomus tepperi (Skuse) survival, growth and development: a boosted regression tree approach.

    PubMed

    Hale, Robin; Marshall, Stephen; Jeppe, Katherine; Pettigrove, Vincent

    2014-07-01

    More comprehensive ecological risk assessment procedures are needed as the unprecedented rate of anthropogenic disturbances to aquatic ecosystems continues. Identifying the effects of pollutants on aquatic ecosystems is difficult, requiring the individual and joint effects of a range of natural and anthropogenic factors to be isolated, often via the analysis of large, complicated datasets. Ecotoxicologists have traditionally used multiple regression to analyse such datasets, but there are inherent problems with this approach and a need to consider other potentially more suitable methods. Sediment pollution can cause a range of negative effects on aquatic animals, and these are used as the basis for toxicity bioassays to measure the biological impact of pollution and the success of remediation efforts. However, experimental artefacts can also lead to sediments being incorrectly classed as toxic in such studies. Understanding the influence of potentially confounding factors will help more accurate assessments of sediment pollution. In this study, we analysed standardised sediment bioassays conducted using the chironomid Chironomus tepperi, with the aim of modelling the impact of sediment toxicants and water physico-chemistry on four endpoints (survival, growth, median emergence day, and number of emerging adults). We used boosted regression trees (BRT), a method that has a number of advantages over multiple regression, to model bioassay endpoints as a function of water chemistry, sediment quality and underlying geology. Endpoints were generally influenced most strongly by water quality parameters and nutrients, although some metals negatively influenced emergence endpoints. Sub-lethal endpoints were generally better predicted than lethal endpoints; median emergence day was the most sensitive endpoint examined in this study, while the number of emerging adults was the least sensitive. We tested our modelling results by experimentally manipulating sediment and

  6. Contrasting regional and national mechanisms for predicting elevated arsenic in private wells across the United States using classification and regression trees.

    PubMed

    Frederick, Logan; VanDerslice, James; Taddie, Marissa; Malecki, Kristen; Gregg, Josh; Faust, Nicholas; Johnson, William P

    2016-03-15

    Arsenic contamination in groundwater is a public health and environmental concern in the United States (U.S.), particularly where monitoring is not required under the Safe Drinking Water Act. Previous studies suggest the influence of regional mechanisms for arsenic mobilization into groundwater; however, no study has examined how influencing parameters change at a continental scale spanning multiple regions. We herein examine covariates for groundwater in the western, central and eastern U.S. regions representing mechanisms associated with arsenic concentrations exceeding the U.S. Environmental Protection Agency maximum contaminant level (MCL) of 10 parts per billion (ppb). Statistically significant covariates were identified via classification and regression tree (CART) analysis, and included hydrometeorological and groundwater chemical parameters. The CART analyses were performed at two scales, national and regional, for which three physiographic regions located in the western (Payette Section and the Snake River Plain), central (Osage Plains of the Central Lowlands), and eastern (Embayed Section of the Coastal Plains) U.S. were examined. Validity of each of the three regional CART models was indicated by values >85% for the area under the receiver-operating characteristic curve. Aridity (precipitation minus potential evapotranspiration) was identified as the primary covariate associated with elevated arsenic at the national scale. At the regional scale, aridity and pH were the major covariates in the arid to semi-arid (western) region; whereas dissolved iron (taken to represent chemically reducing conditions) and pH were major covariates in the temperate (eastern) region, although additional important covariates emerged, including elevated phosphate. Analysis in the central U.S. region indicated that elevated arsenic concentrations were driven by a mixture of those observed in the western and eastern regions.

  7. Trees

    NASA Astrophysics Data System (ADS)

    Epstein, Henri

    2016-11-01

    An algebraic formalism, developed with V. Glaser and R. Stora for the study of the generalized retarded functions of quantum field theory, is used to prove a factorization theorem which provides a complete description of the generalized retarded functions associated with any tree graph. Integrating over the variables associated to internal vertices to obtain the perturbative generalized retarded functions for interacting fields arising from such graphs is shown to be possible for a large category of space-times.

  8. Decomposition of conifer tree bark under field conditions: effects of nitrogen and phosphorus additions

    NASA Astrophysics Data System (ADS)

    Lopes de Gerenyu, Valentin; Kurganova, Irina; Kapitsa, Ekaterina; Shorokhova, Ekaterina

    2016-04-01

    In forest ecosystems, the processes of decomposition of coarse woody debris (CWD) can contribute significantly to the emission component of the carbon (C) cycle and thus accelerate the greenhouse effect and global climate change. A better understanding of decomposition of CWD is required to refine estimates of the C balance in forest ecosystems and improve biogeochemical models. These estimates will in turn contribute to assessing the role of forests in maintaining their long-term productivity and other ecosystem services. We examined the decomposition rate of coniferous bark with added nitrogen (N) and phosphorus (P) fertilizers in an experiment under field conditions. The experiment was carried out in 2015 over 17 weeks in the Moscow region (54°50'N, 37°36'E) under continental-temperate climatic conditions. The conifer tree bark mixture (ca. 70% Norway spruce and 30% Scots pine) was combined with soil and placed in piles of soil-bark substrate (SBS) with a height of ca. 60 cm and a surface area of ca. 3 m². The dry mass ratio of bark to soil was 10:1. The experimental design included the following treatments: (1) soil (Luvisols Haplic) without bark (S), (2) pure SBS, (3) SBS with N addition in the amount of 1% of total dry bark mass (SBS-N), and (4) SBS with N and P addition in the amount of 1% of total dry bark mass for each element (SBS-NP). The decomposition rate, expressed as CO2 emission flux (g C/m²/h), was measured using the closed chamber method 1-3 times per week from July to early November using a LiCor 6400 (Nebraska, USA). During the experiment, we also monitored soil temperature at depths of 5, 20, 40, and 60 cm below the surface of the SBS using iButton thermochrons (DS1921G, USA). The pattern of CO2 emission rate from SBS depended strongly on fertilization. The highest decomposition rates (DecR) of 2.8-5.6 g C/m²/h were observed in the SBS-NP treatment during the first 6 weeks of the experiment. The decay process of bark was less active in the treatment with only N addition. In this

  9. An Introduction to Recursive Partitioning: Rationale, Application, and Characteristics of Classification and Regression Trees, Bagging, and Random Forests

    ERIC Educational Resources Information Center

    Strobl, Carolin; Malley, James; Tutz, Gerhard

    2009-01-01

    Recursive partitioning methods have become popular and widely used tools for nonparametric regression and classification in many scientific fields. Especially random forests, which can deal with large numbers of predictor variables even in the presence of complex interactions, have been applied successfully in genetics, clinical medicine, and…

  10. Responses of Acer saccharum canopy trees and saplings to P, K and lime additions under high N deposition.

    PubMed

    Gradowski, Tomasz; Thomas, Sean C

    2008-02-01

    Heavy atmospheric nitrogen (N) deposition has been associated with altered nutrient cycling, and even N saturation, in forest ecosystems previously thought to be N-limited. This observation has prompted application to such forests of non-N mineral nutrients as a mitigation measure. We examined leaf gas-exchange, leaf chemistry and leaf and shoot morphological responses of Acer saccharum Marsh. saplings and mature trees to experimental additions of non-nitrogenous mineral nutrients (dolomitic lime, phosphorus + potassium (P + K) and lime plus P + K) over 2 years in the Haliburton region of central Ontario, which receives some of the largest annual N inputs in North America. Nutrients were adsorbed in the mineral soil and taken up by A. saccharum trees within 1 year of fertilizer application; however, contrary to expectation, liming had no effect on soil P availability. Saplings and canopy trees showed significant responses to both P + K fertilization and liming, including increased foliar nutrient concentration, leaf size and shoot extension growth; however, no treatment effects on leaf gas-exchange parameters were detected. Increases in shoot extension preceded increases in diameter growth in saplings and canopy trees. Vector analysis of shoot extension growth and nutrient content was consistent with sufficiency of N but marked limitation of P, with co-limitation by calcium (Ca) in saplings and by Ca, Mg and K in canopy trees.

  11. A community-level, mesoscale analysis of fish assemblage structure in shoreline habitats of a large river using multivariate regression trees

    NASA Astrophysics Data System (ADS)

    Wilkes, Martin; Maddock, Ian; Link, Oscar; Habit, Evelyn

    2015-04-01

    Despite the numerous advantages over traditional methods ascribed to community-level analyses, including the ability to rapidly predict the abundance of multiple species and the integration of complex biological interactions, very few applications to the mesoscale of river habitats can be found in the extant literature. Most previous work has been based on single species, species-by-species modelling or reduced dimensionality approaches. Community-level analyses have especially good properties for improving the understanding of habitat associations in large rivers where biological interactions are most intense and applications of the mesohabitat concept relatively sparse. This chapter seeks to identify quantitative relationships between key environmental variables and community structure using a particular type of community-level technique known as multivariate regression trees in order to test the ecological basis for applications of the mesohabitat concept in large rivers. Mesohabitats were mapped and their environmental characteristics recorded along a reach of the San Pedro River, Chile, which is inhabited by a highly endemic fish community. A representative portion of the mesohabitats were selected for fish sampling and multivariate regression trees produced to predict community structure based on combinations of environmental variables. The analyses showed that fish assemblages were distinct at the mesoscale, with flow depth, bank materials, cover and woody debris the key predictor variables. The results support the application of the mesohabitat concept in this geographical context and establish a basis for predicting the community structure of any mesohabitat along the reach.

  12. Partitioning of multivariate phenotypes using regression trees reveals complex patterns of adaptation to climate across the range of black cottonwood (Populus trichocarpa)

    PubMed Central

    Oubida, Regis W.; Gantulga, Dashzeveg; Zhang, Man; Zhou, Lecong; Bawa, Rajesh; Holliday, Jason A.

    2015-01-01

    Local adaptation to climate in temperate forest trees involves the integration of multiple physiological, morphological, and phenological traits. Latitudinal clines are frequently observed for these traits, but environmental constraints also track longitude and altitude. We combined extensive phenotyping of 12 candidate adaptive traits, multivariate regression trees, quantitative genetics, and a genome-wide panel of SNP markers to better understand the interplay among geography, climate, and adaptation to abiotic factors in Populus trichocarpa. Heritabilities were low to moderate (0.13–0.32) and population differentiation for many traits exceeded the 99th percentile of the genome-wide distribution of FST, suggesting local adaptation. When climate variables were taken as predictors and the 12 traits as response variables in a multivariate regression tree analysis, evapotranspiration (Eref) explained the most variation, with subsequent splits related to mean temperature of the warmest month, frost-free period (FFP), and mean annual precipitation (MAP). These groupings matched relatively well the splits using geographic variables as predictors: the northernmost groups (short FFP and low Eref) had the lowest growth and the lowest cold injury index; the southern British Columbia group (low Eref and intermediate temperatures) had average growth and cold injury index; the group from the coast of California and Oregon (high Eref and FFP) had the highest growth performance and the highest cold injury index; and the southernmost, high-altitude group (with high Eref and low FFP) performed poorly, had a high cold injury index, and had lower water use efficiency. Taken together, these results suggest that variation in both temperature and water availability across the range shapes multivariate adaptive traits in poplar. PMID:25870603

  13. Partitioning of multivariate phenotypes using regression trees reveals complex patterns of adaptation to climate across the range of black cottonwood (Populus trichocarpa).

    PubMed

    Oubida, Regis W; Gantulga, Dashzeveg; Zhang, Man; Zhou, Lecong; Bawa, Rajesh; Holliday, Jason A

    2015-01-01

    Local adaptation to climate in temperate forest trees involves the integration of multiple physiological, morphological, and phenological traits. Latitudinal clines are frequently observed for these traits, but environmental constraints also track longitude and altitude. We combined extensive phenotyping of 12 candidate adaptive traits, multivariate regression trees, quantitative genetics, and a genome-wide panel of SNP markers to better understand the interplay among geography, climate, and adaptation to abiotic factors in Populus trichocarpa. Heritabilities were low to moderate (0.13-0.32) and population differentiation for many traits exceeded the 99th percentile of the genome-wide distribution of FST, suggesting local adaptation. When climate variables were taken as predictors and the 12 traits as response variables in a multivariate regression tree analysis, evapotranspiration (Eref) explained the most variation, with subsequent splits related to mean temperature of the warmest month, frost-free period (FFP), and mean annual precipitation (MAP). These groupings matched relatively well the splits using geographic variables as predictors: the northernmost groups (short FFP and low Eref) had the lowest growth and the lowest cold injury index; the southern British Columbia group (low Eref and intermediate temperatures) had average growth and cold injury index; the group from the coast of California and Oregon (high Eref and FFP) had the highest growth performance and the highest cold injury index; and the southernmost, high-altitude group (with high Eref and low FFP) performed poorly, had a high cold injury index, and had lower water use efficiency. Taken together, these results suggest that variation in both temperature and water availability across the range shapes multivariate adaptive traits in poplar.

  14. Binary Logistic Regression Versus Boosted Regression Trees in Assessing Landslide Susceptibility for Multiple-Occurring Regional Landslide Events: Application to the 2009 Storm Event in Messina (Sicily, southern Italy).

    NASA Astrophysics Data System (ADS)

    Lombardo, L.; Cama, M.; Maerker, M.; Parisi, L.; Rotigliano, E.

    2014-12-01

    This study aims at comparing the performances of Binary Logistic Regression (BLR) and Boosted Regression Trees (BRT) methods in assessing landslide susceptibility for multiple-occurrence regional landslide events within the Mediterranean region. A test area was selected in the north-eastern sector of Sicily (southern Italy), corresponding to the catchments of the Briga and the Giampilieri streams, both stretching for a few kilometres from the Peloritan ridge (eastern Sicily, Italy) to the Ionian sea. This area was struck on 1 October 2009 by an extreme climatic event resulting in thousands of rapid shallow landslides, mainly of the debris flow and debris avalanche types, involving the weathered layer of a low to high grade metamorphic bedrock. Exploiting the same set of predictors and the 2009 landslide archive, BLR- and BRT-based susceptibility models were obtained for the two catchments separately, adopting a random partition (RP) technique for validation; in addition, the models trained in one of the two catchments (Briga) were tested in predicting the landslide distribution in the other (Giampilieri), adopting a spatial partition (SP) based validation procedure. All the validation procedures were based on multi-fold tests so as to evaluate and compare the reliability of the fitting, the prediction skill, the coherence in the predictor selection and the precision of the susceptibility estimates. All the obtained models for the two methods produced very high predictive performances, with a general congruence between BLR and BRT in the predictor importance. In particular, the research highlighted that BRT models reached a higher prediction performance than BLR models for RP-based modelling, whilst for the SP-based models the difference in predictive skill between the two methods dropped drastically, converging to an analogous excellent performance. 
However, when looking at the precision of the probability estimates, BLR demonstrated to produce more robust

  15. The relation of student behavior, peer status, race, and gender to decisions about school discipline using CHAID decision trees and regression modeling.

    PubMed

    Horner, Stacy B; Fireman, Gary D; Wang, Eugene W

    2010-04-01

    Peer nominations and demographic information were collected from a diverse sample of 1493 elementary school participants to examine behavior (overt and relational aggression, impulsivity, and prosociality), context (peer status), and demographic characteristics (race and gender) as predictors of teacher and administrator decisions about discipline. Exploratory results using classification tree analyses indicated students nominated as average or highly overtly aggressive were more likely to be disciplined than others. Among these students, race was the most significant predictor, with African American students more likely to be disciplined than Caucasians, Hispanics, or Others. Among the students nominated as low in overt aggression, a lack of prosocial behavior was the most significant predictor. Confirmatory analysis using hierarchical logistic regression supported the exploratory results. Similarities with other biased referral patterns, proactive classroom management strategies, and culturally sensitive recommendations are discussed.

  16. Analysis of the Importance of Oxides and Clays in Cd, Cr, Cu, Ni, Pb and Zn Adsorption and Retention with Regression Trees

    PubMed Central

    González-Costa, Juan José; Reigosa, Manuel Joaquín; Matías, José María; Fernández-Covelo, Emma

    2017-01-01

    This study determines the influence of the different soil components and of the cation-exchange capacity on the adsorption and retention of different heavy metals: cadmium, chromium, copper, nickel, lead and zinc. To do so, regression models were created through decision trees and the importance of soil components was assessed. The variables used were humified organic matter, specific cation-exchange capacity, percentages of sand and silt, proportions of Mn, Fe and Al oxides and hematite, the proportions of quartz, plagioclase and mica, and the proportions of the different clays: kaolinite, vermiculite, gibbsite and chlorite. The most important components in the obtained models were vermiculite and gibbsite, especially for the adsorption of cadmium and zinc, while clays were less relevant. Oxides are less important than clays, especially for the adsorption of chromium and lead and the retention of chromium, copper and lead. PMID:28072849

  17. Analysis of the Importance of Oxides and Clays in Cd, Cr, Cu, Ni, Pb and Zn Adsorption and Retention with Regression Trees.

    PubMed

    González-Costa, Juan José; Reigosa, Manuel Joaquín; Matías, José María; Fernández-Covelo, Emma

    2017-01-01

    This study determines the influence of the different soil components and of the cation-exchange capacity on the adsorption and retention of different heavy metals: cadmium, chromium, copper, nickel, lead and zinc. To do so, regression models were created through decision trees and the importance of soil components was assessed. The variables used were humified organic matter, specific cation-exchange capacity, percentages of sand and silt, proportions of Mn, Fe and Al oxides and hematite, the proportions of quartz, plagioclase and mica, and the proportions of the different clays: kaolinite, vermiculite, gibbsite and chlorite. The most important components in the obtained models were vermiculite and gibbsite, especially for the adsorption of cadmium and zinc, while clays were less relevant. Oxides are less important than clays, especially for the adsorption of chromium and lead and the retention of chromium, copper and lead.

  18. Boosted Regression Trees Outperforms Support Vector Machines in Predicting (Regional) Yields of Winter Wheat from Single and Cumulated Dekadal Spot-VGT Derived Normalized Difference Vegetation Indices

    NASA Astrophysics Data System (ADS)

    Stas, Michiel; Dong, Qinghan; Heremans, Stien; Zhang, Beier; Van Orshoven, Jos

    2016-08-01

    This paper compares two machine learning techniques to predict regional winter wheat yields. The models, based on Boosted Regression Trees (BRT) and Support Vector Machines (SVM), are constructed of Normalized Difference Vegetation Indices (NDVI) derived from low resolution SPOT VEGETATION satellite imagery. Three types of NDVI-related predictors were used: Single NDVI, Incremental NDVI and Targeted NDVI. BRT and SVM were first used to select features with high relevance for predicting the yield. Although the exact selections differed between the prefectures, certain periods with high influence scores for multiple prefectures could be identified. The same period of high influence stretching from March to June was detected by both machine learning methods. After feature selection, BRT and SVM models were applied to the subset of selected features for actual yield forecasting. Whereas both machine learning methods returned very low prediction errors, BRT seems to slightly but consistently outperform SVM.
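
The boosting idea behind BRT can be sketched in a few lines. The code below is an illustrative, single-feature, squared-error version that uses depth-one trees (stumps); it is not the software or the configuration used in the study, and the names `fit_stump`, `fit_boosted_stumps`, and `predict`, as well as the default `n_rounds` and `learning_rate`, are assumptions made for this example. Each round fits a stump to the current residuals and adds a shrunken copy of it to the running prediction.

```python
from typing import List, Tuple

Stump = Tuple[float, float, float]  # (threshold, value_if_left, value_if_right)


def _mean(values: List[float]) -> float:
    return sum(values) / len(values)


def fit_stump(x: List[float], r: List[float]) -> Stump:
    """Fit a least-squares stump to residuals r using one feature x."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    best = None  # (squared_error, threshold, left_mean, right_mean)
    for k in range(1, len(order)):
        lo, hi = x[order[k - 1]], x[order[k]]
        if lo == hi:
            continue  # equal x values cannot be separated by a threshold
        left = [r[i] for i in order[:k]]
        right = [r[i] for i in order[k:]]
        lm, rm = _mean(left), _mean(right)
        err = sum((v - lm) ** 2 for v in left) + sum((v - rm) ** 2 for v in right)
        if best is None or err < best[0]:
            best = (err, (lo + hi) / 2, lm, rm)
    if best is None:  # all x identical: fall back to a constant prediction
        m = _mean(r)
        return (float("inf"), m, m)
    return best[1:]


def stump_predict(stump: Stump, xi: float) -> float:
    threshold, left_value, right_value = stump
    return left_value if xi <= threshold else right_value


def fit_boosted_stumps(x: List[float], y: List[float],
                       n_rounds: int = 30, learning_rate: float = 0.5):
    """Least-squares boosting: start from the mean, then fit stumps to residuals."""
    base = _mean(y)
    pred = [base] * len(y)
    stumps: List[Stump] = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        pred = [pi + learning_rate * stump_predict(stump, xi)
                for pi, xi in zip(pred, x)]
    return base, learning_rate, stumps


def predict(model, xi: float) -> float:
    base, learning_rate, stumps = model
    return base + learning_rate * sum(stump_predict(s, xi) for s in stumps)
```

With made-up data such as `x = [1, 2, 3, 4, 5, 6]` and `y = [1, 1, 1, 5, 5, 5]`, `predict(fit_boosted_stumps(x, y), 2.0)` moves toward 1 as rounds accumulate. Real BRT applications use deeper trees, many predictors, and tuned shrinkage and tree counts.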

  19. Effects of multiple chronic conditions on health care costs: an analysis based on an advanced tree-based regression model

    PubMed Central

    2013-01-01

    Background: To analyze the impact of multimorbidity (MM) on health care costs taking into account data heterogeneity. Methods: Data come from a multicenter prospective cohort study of 1,050 randomly selected primary care patients aged 65 to 85 years suffering from MM in Germany. MM was defined as co-occurrence of ≥3 conditions from a list of 29 chronic diseases. A conditional inference tree (CTREE) algorithm was used to detect the underlying structure and most influential variables on costs of inpatient care, outpatient care, medications as well as formal and informal nursing care. Results: Irrespective of the number and combination of co-morbidities, a limited number of factors influential on costs were detected. Parkinson’s disease (PD) and cardiac insufficiency (CI) were the most influential variables for total costs. Compared to patients not suffering from any of the two conditions, PD increases predicted mean total costs 3.5-fold to approximately € 11,000 per 6 months, and CI two-fold to approximately € 6,100. The high total costs of PD are largely due to costs of nursing care. Costs of inpatient care were significantly influenced by cerebral ischemia/chronic stroke, whereas medication costs were associated with COPD, insomnia, PD and diabetes. Except for costs of nursing care, socio-demographic variables did not significantly influence costs. Conclusions: Irrespective of any combination and number of co-occurring diseases, PD and CI appear to be most influential on total health care costs in elderly patients with MM, and only a limited number of factors significantly influenced cost. Trial registration: Current Controlled Trials ISRCTN89818205. PMID:23768192

  20. An Introduction to Recursive Partitioning: Rationale, Application and Characteristics of Classification and Regression Trees, Bagging and Random Forests

    PubMed Central

    Strobl, Carolin; Malley, James; Tutz, Gerhard

    2010-01-01

    Recursive partitioning methods have become popular and widely used tools for non-parametric regression and classification in many scientific fields. Especially random forests, that can deal with large numbers of predictor variables even in the presence of complex interactions, have been applied successfully in genetics, clinical medicine and bioinformatics within the past few years. High dimensional problems are common not only in genetics, but also in some areas of psychological research, where only few subjects can be measured due to time or cost constraints, yet a large amount of data is generated for each subject. Random forests have been shown to achieve a high prediction accuracy in such applications, and provide descriptive variable importance measures reflecting the impact of each variable in both main effects and interactions. The aim of this work is to introduce the principles of the standard recursive partitioning methods as well as recent methodological improvements, to illustrate their usage for low and high dimensional data exploration, but also to point out limitations of the methods and potential pitfalls in their practical application. Application of the methods is illustrated using freely available implementations in the R system for statistical computing. PMID:19968396
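
As a rough illustration of the core step in recursive partitioning, the sketch below searches one numeric predictor for the binary split that minimizes the within-node sum of squared errors, which is the CART-style criterion for regression trees. It is a minimal Python sketch and not the R implementations the article refers to; the function names `sse` and `best_split` are assumptions made for this example. A full tree would apply the same search recursively to each resulting subset and across all predictors.

```python
from typing import List, Optional, Tuple


def sse(values: List[float]) -> float:
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(x: List[float], y: List[float]) -> Optional[Tuple[float, float]]:
    """Return (threshold, total_sse) for the best binary split of y on x.

    Candidate thresholds are midpoints between consecutive distinct x values.
    Returns None when x has fewer than two distinct values.
    """
    pairs = sorted(zip(x, y))
    best: Optional[Tuple[float, float]] = None
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold can separate equal x values
        threshold = (pairs[i][0] + pairs[i - 1][0]) / 2
        left = [yy for _, yy in pairs[:i]]
        right = [yy for _, yy in pairs[i:]]
        total = sse(left) + sse(right)
        if best is None or total < best[1]:
            best = (threshold, total)
    return best
```

For the made-up data `x = [1, 2, 3, 10, 11, 12]` and `y = [1, 1, 1, 5, 5, 5]`, the search returns a threshold of 6.5 with zero remaining error, since the two groups are perfectly separated.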

  1. Performance of seedlings of a shade-tolerant tropical tree species after moderate addition of N and P

    NASA Astrophysics Data System (ADS)

    Cárate Tandalla, Daisy; Leuschner, Christoph; Homeier, Jürgen

    2015-12-01

    Nitrogen deposition to tropical forests is predicted to increase in the future in many regions due to agricultural intensification. We conducted a seedling transplantation experiment in a tropical premontane forest in Ecuador with a locally abundant late-successional tree species (Pouteria torta, Sapotaceae), aimed at detecting species-specific responses to moderate N and P addition and at understanding how increasing nutrient availability will affect regeneration. From locally collected seeds, 320 seedlings were produced and transplanted to the plots of the Ecuadorian Nutrient Manipulation Experiment (NUMEX) with three treatments (moderate N addition: 50 kg N ha⁻¹ yr⁻¹, moderate P addition: 10 kg P ha⁻¹ yr⁻¹, and combined N and P addition) and a control (80 plants per treatment). After 12 months, mortality, relative growth rate, leaf nutrient content and leaf herbivory rate were measured. N and NP addition significantly increased the mortality rate (70 % vs. 54 % in the control). However, N and P addition also increased the diameter growth rate of the surviving seedlings. N and P addition did not alter foliar nutrient concentrations and leaf N:P ratio, but N addition decreased the leaf C:N ratio and increased SLA. P addition (but not N addition) resulted in higher leaf area loss to herbivore consumption and also shifted carbon allocation to root growth. This fertilization experiment with a common rainforest tree species conducted in old-growth forest shows that even moderate doses of added N and P affect seedling performance, which most likely will have consequences for the competitive strength in the understory and the recruitment success of P. torta. Simultaneous increases in growth, herbivory and mortality rates make it difficult to assess the species' overall performance and predict how a future increase in nutrient deposition will alter the abundance of this species in the Andean tropical montane forests.

  2. Understanding how roadside concentrations of NOx are influenced by the background levels, traffic density, and meteorological conditions using Boosted Regression Trees

    NASA Astrophysics Data System (ADS)

    Sayegh, Arwa; Tate, James E.; Ropkins, Karl

    2016-02-01

    Oxides of nitrogen (NOx) are a major component of photochemical smog, and their constituents are considered principal traffic-related pollutants affecting human health. This study investigates the influence of background concentrations of NOx, traffic density, and prevailing meteorological conditions on roadside concentrations of NOx at UK urban, open motorway, and motorway tunnel sites using the statistical approach Boosted Regression Trees (BRT). BRT models have been fitted using hourly concentration, traffic, and meteorological data for each site. The models predict, rank, and visualise the relationship between model variables and roadside NOx concentrations. A strong relationship between roadside NOx and monitored local background concentrations is demonstrated. Relationships between roadside NOx and other model variables have been shown to be strongly influenced by the quality and resolution of background concentrations of NOx, i.e. whether they were based on monitored data or modelled predictions. The paper proposes a direct method of using site-specific fundamental diagrams for splitting traffic data into four traffic states: free-flow, busy-flow, congested, and severely congested. Using BRT models, the density of traffic (vehicles per kilometre) was observed to have a proportional influence on the concentrations of roadside NOx, with different fitted regression line slopes for the different traffic states. When other influences are conditioned out, the relationship between roadside concentrations and ambient air temperature suggests NOx concentrations reach a minimum at around 22 °C, with high concentrations at low ambient air temperatures, which could be associated with restricted atmospheric dispersion and/or with changes in road traffic exhaust emission characteristics at low ambient air temperatures. This paper uses BRT models to study how different critical factors, and their relative importance, influence the variation of roadside NOx concentrations. The paper

  3. Thin cloud removal from remote sensing images using multidirectional dual tree complex wavelet transform and transfer least square support vector regression

    NASA Astrophysics Data System (ADS)

    Hu, Gensheng; Li, Xiaoyi; Liang, Dong

    2015-01-01

    The existence of clouds affects the interpretation and utilization of remote sensing images. A thin cloud removal algorithm for cloud-contaminated remote sensing images is proposed by combining a multidirectional dual tree complex wavelet transform (M-DTCWT) with domain adaptation transfer least square support vector regression (T-LSSVR). First, M-DTCWT is constructed by using the hourglass filter bank in combination with DTCWT, which is used to decompose remote sensing images into multiscale and multidirectional subbands. Then the low-frequency subband coefficients of the cloud-free regions on target images and source domain images are used as samples for a T-LSSVR model, which can be used to predict those of the cloud regions on cloud-contaminated images. Finally, by enhancing the high-frequency coefficients and replacing the low-frequency coefficients, the thin clouds on cloud-contaminated images are removed. Experimental results show that M-DTCWT helps preserve the details of ground objects in cloud-contaminated images, and that the T-LSSVR model can effectively learn the contour information from multisource and multitemporal images; the proposed method therefore removes thin clouds effectively.

  4. [Evaluation of chemotherapy for stage IV non-small cell lung cancer employing a regression tree type method for quality-adjusted survival analysis to determine prognostic factors].

    PubMed

    Fujita, A; Takabatake, H; Tagaki, S; Sohda, T; Sekine, K

    1996-03-01

    To evaluate the effect of chemotherapy on QOL, the survival period was divided into 3 intervals: time in the hospital for chemotherapy (TOX), time as an outpatient (TWiST, Time without Symptom and Toxicity), and time in the hospital for conservative therapy (REL). Coefficients expressing the QOL level of each interval were denoted ut, uw and ur. If uw was set to 1 and ut and ur were set to values less than 1, ut·TOX + uw·TWiST + ur·REL could be taken as a quality-adjusted value relative to TWiST (Q-TWiST). One hundred five patients with stage IV non-small cell lung cancer were included. Sixty-five were given chemotherapy, and the other 40 were not. The observation period was 2 years. Q-TWiST values for age, sex, PS, histology and chemotherapy were calculated. Their quantification was performed employing a regression tree type method. Chemotherapy contributed to Q-TWiST when ut approached 1 (i.e., no side effect was supposed). When ut was less than 0.5, PS and sex had an appreciable role.
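
The weighted sum described in this abstract can be written out as a short worked example. The sketch below computes Q-TWiST as the utility-weighted sum of the three intervals, with the TWiST utility fixed at 1 by default as in the abstract. The function name `q_twist`, the range check on the utilities, and the durations used in the usage note are assumptions for illustration, not values from the study.

```python
def q_twist(tox: float, twist: float, rel: float,
            u_t: float, u_r: float, u_w: float = 1.0) -> float:
    """Quality-adjusted survival: u_t*TOX + u_w*TWiST + u_r*REL.

    tox, twist and rel are durations in the same time unit; u_t, u_w and u_r
    are utility weights, assumed here to lie between 0 and 1.
    """
    for name, u in (("u_t", u_t), ("u_w", u_w), ("u_r", u_r)):
        if not 0.0 <= u <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {u}")
    return u_t * tox + u_w * twist + u_r * rel
```

For a hypothetical patient with 2 months in TOX, 10 months in TWiST and 4 months in REL, `q_twist(2, 10, 4, u_t=0.5, u_r=0.5)` gives 13 quality-adjusted months out of 16 months of survival. Raising `u_t` toward 1 corresponds to the abstract's case in which no side effect is supposed.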

  5. Identifying changes in dissolved organic matter content and characteristics by fluorescence spectroscopy coupled with self-organizing map and classification and regression tree analysis during wastewater treatment.

    PubMed

    Yu, Huibin; Song, Yonghui; Liu, Ruixia; Pan, Hongwei; Xiang, Liancheng; Qian, Feng

    2014-10-01

    The stabilization of latent tracers of dissolved organic matter (DOM) of wastewater was analyzed by three-dimensional excitation-emission matrix (EEM) fluorescence spectroscopy coupled with self-organizing map and classification and regression tree analysis (CART) in wastewater treatment performance. DOM of water samples collected from primary sedimentation, anaerobic, anoxic, oxic and secondary sedimentation tanks in a large-scale wastewater treatment plant contained four fluorescence components: tryptophan-like (C1), tyrosine-like (C2), microbial humic-like (C3) and fulvic-like (C4) materials extracted by self-organizing map. These components showed good positive linear correlations with dissolved organic carbon of DOM. C1 and C2 were representative components in the wastewater, and they were removed to a higher extent than those of C3 and C4 in the treatment process. C2 was a latent parameter determined by CART to differentiate water samples of oxic and secondary sedimentation tanks from the successive treatment units, indirectly proving that most of tyrosine-like material was degraded by anaerobic microorganisms. C1 was an accurate parameter to comprehensively separate the samples of the five treatment units from each other, indirectly indicating that tryptophan-like material was decomposed by anaerobic and aerobic bacteria. EEM fluorescence spectroscopy in combination with self-organizing map and CART analysis can be a nondestructive effective method for characterizing structural component of DOM fractions and monitoring organic matter removal in wastewater treatment process.

  6. Additives

    NASA Technical Reports Server (NTRS)

    Smalheer, C. V.

    1973-01-01

    The chemistry of lubricant additives is discussed to show what the additives are chemically and what functions they perform in the lubrication of various kinds of equipment. Current theories regarding the mode of action of lubricant additives are presented. The additive groups discussed include the following: (1) detergents and dispersants, (2) corrosion inhibitors, (3) antioxidants, (4) viscosity index improvers, (5) pour point depressants, and (6) antifouling agents.

  7. Quantifying mineral abundances of complex mixtures by coupling spectral deconvolution of SWIR spectra (2.1-2.4 μm) and regression tree analysis

    USGS Publications Warehouse

    Mulder, V.L.; Plotze, Michael; de Bruin, Sytze; Schaepman, Michael E.; Mavris, C.; Kokaly, Raymond F.; Egli, Markus

    2013-01-01

    This paper presents a methodology for assessing mineral abundances of mixtures having more than two constituents using absorption features in the 2.1-2.4 μm wavelength region. In the first step, the absorption behaviour of mineral mixtures is parameterised by exponential Gaussian optimisation. Next, mineral abundances are predicted by regression tree analysis using these parameters as inputs. The approach is demonstrated on a range of prepared samples with known abundances of kaolinite, dioctahedral mica, smectite, calcite and quartz and on a set of field samples from Morocco. The latter contained varying quantities of other minerals, some of which did not have diagnostic absorption features in the 2.1-2.4 μm region. Cross validation showed that the prepared samples of kaolinite, dioctahedral mica, smectite and calcite were predicted with a root mean square error (RMSE) less than 9 wt.%. For the field samples, the RMSE was less than 8 wt.% for calcite, dioctahedral mica and kaolinite abundances. Smectite could not be well predicted, which was attributed to spectral variation of the cations within the dioctahedral layered smectites. Substitution of part of the quartz by chlorite at the prediction phase hardly affected the accuracy of the predicted mineral content; this suggests that the method is robust in handling the omission of minerals during the training phase. The degree of expression of absorption components was different between the field sample and the laboratory mixtures. This demonstrates that the method should be calibrated and trained on local samples. Our method allows the simultaneous quantification of more than two minerals within a complex mixture and thereby enhances the perspectives of spectral analysis for mineral abundances.

  8. Evaluating the High Risk Groups for Suicide: A Comparison of Logistic Regression, Support Vector Machine, Decision Tree and Artificial Neural Network

    PubMed Central

    AMINI, Payam; AHMADINIA, Hasan; POOROLAJAL, Jalal; MOQADDASI AMIRI, Mohammad

    2016-01-01

    Background: We aimed to assess the high-risk group for suicide using different classification methods including logistic regression (LR), decision tree (DT), artificial neural network (ANN), and support vector machine (SVM). Methods: We used the dataset of a study conducted to predict risk factors of completed suicide in Hamadan Province, in the west of Iran, in 2010. To evaluate the high-risk groups for suicide, LR, SVM, DT and ANN were performed. The applied methods were compared using sensitivity, specificity, positive predictive value, negative predictive value, accuracy and the area under the curve. The Cochran Q test was applied to check differences in proportions among methods. To assess the association between the observed and predicted values, the Ø coefficient, contingency coefficient, and Kendall tau-b were calculated. Results: Gender, age, and job were the most important risk factors for fatal suicide attempts common to the four methods. The SVM method showed the highest accuracy, 0.68 and 0.67 for the training and testing samples, respectively. This method also yielded the highest specificity (0.67 for the training and 0.68 for the testing sample) and the highest sensitivity for the training sample (0.85), but the lowest sensitivity for the testing sample (0.53). The Cochran Q test indicated differences between proportions across methods (P<0.001). For the association between SVM predictions and observed values, the Ø coefficient, contingency coefficient, and Kendall tau-b were 0.239, 0.232 and 0.239, respectively. Conclusion: SVM had the best performance in classifying fatal suicide attempts compared to DT, LR and ANN. PMID:27957463
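
Several of the comparison measures named in this abstract follow directly from a 2x2 confusion matrix. The sketch below shows the standard definitions of sensitivity, specificity, positive and negative predictive value, accuracy, and the phi coefficient (written Ø in the abstract). The function name `binary_metrics` and the counts in the usage note are assumptions for illustration and are not taken from the study.

```python
from math import sqrt
from typing import Dict


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard 2x2 confusion-matrix measures.

    Assumes every row and column total is non-zero; otherwise some ratios
    are undefined and a ZeroDivisionError is raised.
    """
    return {
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "phi": (tp * tn - fp * fn)
               / sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    }
```

With hypothetical counts `tp=40, fp=10, tn=30, fn=20`, the function gives a sensitivity of about 0.67, a specificity of 0.75 and an accuracy of 0.70. Computing the same measures on separate training and testing samples, as the abstract does, helps reveal whether a classifier's sensitivity holds up on unseen data.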

  9. Responses of nitrous oxide emissions to nitrogen and phosphorus additions in two tropical plantations with N-fixing vs. non-N-fixing tree species

    NASA Astrophysics Data System (ADS)

    Zhang, W.; Zhu, X.; Luo, Y.; Rafique, R.; Chen, H.; Huang, J.; Mo, J.

    2014-01-01

    Leguminous tree plantations at phosphorus (P) limited sites may result in higher rates of nitrous oxide (N2O) emissions; however, the effects of nitrogen (N) and P applications on soil N2O emissions from plantations with N-fixing vs. non-N-fixing tree species have rarely been studied in the field. We conducted an experimental manipulation of N and P additions in two tropical plantations with Acacia auriculiformis (AA) and Eucalyptus urophylla (EU) tree species in South China. The objective was to determine the effects of N- or P-addition alone, as well as NP application together on soil N2O emissions from tropical plantations with N-fixing vs. non-N-fixing tree species. We found that the average N2O emission from control was greater in AA (2.26 ± 0.06 kg N2O-N ha-1 yr-1) than in EU plantation (1.87 ± 0.05 kg N2O-N ha-1 yr-1). For the AA plantation, N-addition stimulated the N2O emission from soil while P-addition did not. Applications of N with P together significantly decreased N2O emission compared to N-addition alone, especially in high level treatment plots (decreased by 18%). In the EU plantation, N2O emissions significantly decreased in P-addition plots compared with the controls; however, N- and NP-additions did not. The differing response of N2O emissions to N- or P-addition was attributed to the higher initial soil N status in the AA than that of the EU plantation, due to symbiotic N fixation in the former. Our results suggest that atmospheric N deposition potentially stimulates N2O emissions from leguminous tree plantations in the tropics, whereas P fertilization has the potential to mitigate N deposition-induced N2O emissions from such plantations.

  10. Growth enhancement of Picea abies trees under long-term, low-dose N addition is due to morphological more than to physiological changes.

    PubMed

    Krause, Kim; Cherubini, Paolo; Bugmann, Harald; Schleppi, Patrick

    2012-12-01

    Human activities have drastically increased nitrogen (N) inputs into natural and near-natural terrestrial ecosystems such that critical loads are now being exceeded in many regions of the world. This implies that these ecosystems are shifting from natural N limitation to eutrophication or even N saturation. This process is expected to modify the growth of forests and thus, along with management, to affect their carbon (C) sequestration. However, knowledge of the physiological mechanisms underlying tree response to N inputs, especially in the long term, is still lacking. In this study, we used tree-ring patterns and a dual stable isotope approach (δ(13)C and δ(18)O) to investigate tree growth responses and the underlying physiological reactions in a long-term, low-dose N addition experiment (+23 kg N ha(-1) a(-1)). This experiment has been conducted for 14 years in a mountain Picea abies (L.) Karst. forest in Alptal, Switzerland, using a paired-catchment design. Tree stem C sequestration increased by ∼22%, with an N use efficiency (NUE) of ca. 8 kg additional C in tree stems per kg of N added. Neither earlywood nor latewood δ(13)C values changed significantly compared with the control, indicating that the intrinsic water use efficiency (WUE(i)) (A/g(s)) did not change due to N addition. Further, the isotopic signal of δ(18)O in early- and latewood showed no significant response to the treatment, indicating that neither stomatal conductance nor leaf-level photosynthesis changed significantly. Foliar analyses showed that needle N concentration significantly increased in the fourth to seventh treatment year, accompanied by increased dry mass and area per needle, and by increased tree height growth. Later, N concentration and height growth returned to nearly background values, while dry mass and area per needle remained high. Our results support the hypothesis that enhanced stem growth caused by N addition is mainly due to an increased leaf area index (LAI

  11. Responses of nitrous oxide emissions to nitrogen and phosphorus additions in two tropical plantations with N-fixing vs. non-N-fixing tree species

    NASA Astrophysics Data System (ADS)

    Zhang, W.; Zhu, X.; Luo, Y.; Rafique, R.; Chen, H.; Huang, J.; Mo, J.

    2014-09-01

    Leguminous tree plantations at phosphorus (P) limited sites may result in excess nitrogen (N) and higher rates of nitrous oxide (N2O) emissions. However, the effects of N and P applications on soil N2O emissions from plantations with N-fixing vs. non-N-fixing tree species have rarely been studied in the field. We conducted an experimental manipulation of N and/or P additions in two plantations with Acacia auriculiformis (AA, N-fixing) and Eucalyptus urophylla (EU, non-N-fixing) in South China. The objective was to determine the effects of N or P addition alone, as well as NP application together on soil N2O emissions from these tropical plantations. We found that the average N2O emission from control was greater in the AA (2.3 ± 0.1 kg N2O-N ha-1 yr-1) than in EU plantation (1.9 ± 0.1 kg N2O-N ha-1 yr-1). For the AA plantation, N addition stimulated N2O emission from the soil while P addition did not. Applications of N with P together significantly decreased N2O emission compared to N addition alone, especially in the high-level treatments (decreased by 18%). In the EU plantation, N2O emissions significantly decreased in P-addition plots compared with the controls; however, N and NP additions did not. The different response of N2O emission to N or P addition was attributed to the higher initial soil N status in the AA than that of EU plantation, due to symbiotic N fixation in the former. Our result suggests that atmospheric N deposition potentially stimulates N2O emissions from leguminous tree plantations in the tropics, whereas P fertilization has the potential to mitigate N-deposition-induced N2O emissions from such plantations.

  12. Photosynthetic and Growth Response of Sugar Maple (Acer saccharum Marsh.) Mature Trees and Seedlings to Calcium, Magnesium, and Nitrogen Additions in the Catskill Mountains, NY, USA

    PubMed Central

    Momen, Bahram; Behling, Shawna J.; Lawrence, Greg B.; Sullivan, Joseph H.

    2015-01-01

    Decline of sugar maple in North American forests has been attributed to changes in soil calcium (Ca) and nitrogen (N) by acidic precipitation. Although N is an essential and usually a limiting factor in forests, atmospheric N deposition may cause N-saturation leading to loss of soil Ca. Such changes can affect carbon gain and growth of sugar maple trees and seedlings. We applied a 2² factorial arrangement of N and dolomitic limestone containing Ca and magnesium (Mg) to 12 forest plots in the Catskill Mountain region of NY, USA. To quantify the short-term effects, we measured photosynthetic-light responses of sugar maple mature trees and seedlings two or three times during two summers. We estimated maximum net photosynthesis (An-max) and its related light intensity (PAR at An-max), apparent quantum efficiency (Aqe), and light compensation point (LCP). To quantify the long-term effects, we measured basal area of living mature trees before and 4 and 8 years after treatment applications. Soil and foliar chemistry variables were also measured. Dolomitic limestone increased Ca, Mg, and pH in the soil Oe horizon. Mg was increased in the B horizon when comparing the plots receiving N with those receiving CaMg. In mature trees, foliar Ca and Mg concentrations were higher in the CaMg and N+CaMg plots than in the reference or N plots; foliar Ca concentration was higher in the N+CaMg plots compared with the CaMg plots, foliar Mg was higher in the CaMg plots than the N+CaMg plots; An-max was maximized due to N+CaMg treatment; Aqe decreased by N addition; and PAR at An-max increased by N or CaMg treatments alone, but the increase was maximized by their combination. No treatment effect was detected on basal areas of living mature trees four or eight years after treatment applications. In seedlings, An-max was increased by N+CaMg addition. The reference plots had an open herbaceous layer, but the plots receiving N had a dense monoculture of common woodfern in the forest floor

  13. Photosynthetic and growth response of sugar maple (Acer saccharum Marsh.) mature trees and seedlings to calcium, magnesium, and nitrogen additions in the Catskill Mountains, NY, USA

    USGS Publications Warehouse

    Momen, Bahram; Behling, Shawna J; Lawrence, Gregory B.; Sullivan, Joseph H

    2015-01-01

    Decline of sugar maple in North American forests has been attributed to changes in soil calcium (Ca) and nitrogen (N) by acidic precipitation. Although N is an essential and usually a limiting factor in forests, atmospheric N deposition may cause N-saturation leading to loss of soil Ca. Such changes can affect carbon gain and growth of sugar maple trees and seedlings. We applied a 2² factorial arrangement of N and dolomitic limestone containing Ca and magnesium (Mg) to 12 forest plots in the Catskill Mountain region of NY, USA. To quantify the short-term effects, we measured photosynthetic-light responses of sugar maple mature trees and seedlings two or three times during two summers. We estimated maximum net photosynthesis (An-max) and its related light intensity (PAR at An-max), apparent quantum efficiency (Aqe), and light compensation point (LCP). To quantify the long-term effects, we measured basal area of living mature trees before and 4 and 8 years after treatment applications. Soil and foliar chemistry variables were also measured. Dolomitic limestone increased Ca, Mg, and pH in the soil Oe horizon. Mg was increased in the B horizon when comparing the plots receiving N with those receiving CaMg. In mature trees, foliar Ca and Mg concentrations were higher in the CaMg and N+CaMg plots than in the reference or N plots; foliar Ca concentration was higher in the N+CaMg plots compared with the CaMg plots, foliar Mg was higher in the CaMg plots than the N+CaMg plots; An-max was maximized due to N+CaMg treatment; Aqe decreased by N addition; and PAR at An-max increased by N or CaMg treatments alone, but the increase was maximized by their combination. No treatment effect was detected on basal areas of living mature trees four or eight years after treatment applications. In seedlings, An-max was increased by N+CaMg addition. The reference plots had an open herbaceous layer, but the plots receiving N had a dense monoculture of common woodfern in the

  14. Photosynthetic and Growth Response of Sugar Maple (Acer saccharum Marsh.) Mature Trees and Seedlings to Calcium, Magnesium, and Nitrogen Additions in the Catskill Mountains, NY, USA.

    PubMed

    Momen, Bahram; Behling, Shawna J; Lawrence, Greg B; Sullivan, Joseph H

    2015-01-01

    Decline of sugar maple in North American forests has been attributed to changes in soil calcium (Ca) and nitrogen (N) by acidic precipitation. Although N is an essential and usually a limiting factor in forests, atmospheric N deposition may cause N-saturation leading to loss of soil Ca. Such changes can affect carbon gain and growth of sugar maple trees and seedlings. We applied a 2² factorial arrangement of N and dolomitic limestone containing Ca and magnesium (Mg) to 12 forest plots in the Catskill Mountain region of NY, USA. To quantify the short-term effects, we measured photosynthetic-light responses of sugar maple mature trees and seedlings two or three times during two summers. We estimated maximum net photosynthesis (An-max) and its related light intensity (PAR at An-max), apparent quantum efficiency (Aqe), and light compensation point (LCP). To quantify the long-term effects, we measured basal area of living mature trees before and 4 and 8 years after treatment applications. Soil and foliar chemistry variables were also measured. Dolomitic limestone increased Ca, Mg, and pH in the soil Oe horizon. Mg was increased in the B horizon when comparing the plots receiving N with those receiving CaMg. In mature trees, foliar Ca and Mg concentrations were higher in the CaMg and N+CaMg plots than in the reference or N plots; foliar Ca concentration was higher in the N+CaMg plots compared with the CaMg plots, foliar Mg was higher in the CaMg plots than the N+CaMg plots; An-max was maximized due to N+CaMg treatment; Aqe decreased by N addition; and PAR at An-max increased by N or CaMg treatments alone, but the increase was maximized by their combination. No treatment effect was detected on basal areas of living mature trees four or eight years after treatment applications. In seedlings, An-max was increased by N+CaMg addition. The reference plots had an open herbaceous layer, but the plots receiving N had a dense monoculture of common woodfern in the forest floor

  15. The integration of geophysical and enhanced Moderate Resolution Imaging Spectroradiometer Normalized Difference Vegetation Index data into a rule-based, piecewise regression-tree model to estimate cheatgrass beginning of spring growth

    USGS Publications Warehouse

    Boyte, Stephen P.; Wylie, Bruce K.; Major, Donald J.; Brown, Jesslyn F.

    2015-01-01

    Cheatgrass exhibits spatial and temporal phenological variability across the Great Basin as described by ecological models formed using remote sensing and other spatial data-sets. We developed a rule-based, piecewise regression-tree model trained on 99 points that used three data-sets – latitude, elevation, and start of season time based on remote sensing input data – to estimate cheatgrass beginning of spring growth (BOSG) in the northern Great Basin. The model was then applied to map the location and timing of cheatgrass spring growth for the entire area. The model was strong (R² = 0.85) and predicted an average cheatgrass BOSG across the study area of 29 March–4 April. Of early cheatgrass BOSG areas, 65% occurred at elevations below 1452 m. The highest proportion of cheatgrass BOSG occurred between mid-April and late May. Predicted cheatgrass BOSG in this study matched well with previous Great Basin cheatgrass green-up studies.

  16. Levels and determinants of tree pollen in New York City.

    PubMed

    Weinberger, Kate R; Kinney, Patrick L; Robinson, Guy S; Sheehan, Daniel; Kheirbek, Iyad; Matte, Thomas D; Lovasi, Gina S

    2016-12-21

    Exposure to allergenic tree pollen is a risk factor for multiple allergic disease outcomes. Little is known about how tree pollen levels vary within cities and whether such variation affects the development or exacerbation of allergic disease. Accordingly, we collected integrated pollen samples at uniform height at 45 sites across New York City during the 2013 pollen season. We used these monitoring results in combination with adjacent land use data to develop a land use regression model for tree pollen. We evaluated four types of land use variables for inclusion in the model: tree canopy, distributed building height (a measure of building volume density), elevation, and distance to water. When included alone in the model, percent tree canopy cover within a 0.5 km radial buffer explained 39% of the variance in tree pollen (1.9% increase in tree pollen per one-percentage point increase in tree canopy cover, P<0.0001). The inclusion of additional variables did not improve model fit. We conclude that intra-urban variation in tree canopy is an important driver of tree pollen exposure. Land use regression models can be used to incorporate spatial variation in tree pollen exposure in studies of allergic disease outcomes. Journal of Exposure Science and Environmental Epidemiology advance online publication, 21 December 2016; doi:10.1038/jes.2016.72.

  17. Astronomical Methods for Nonparametric Regression

    NASA Astrophysics Data System (ADS)

    Steinhardt, Charles L.; Jermyn, Adam

    2017-01-01

    I will discuss commonly used techniques for nonparametric regression in astronomy. We find that several of them, particularly running averages and running medians, are generically biased, asymmetric between dependent and independent variables, and perform poorly in recovering the underlying function, even when errors are present only in one variable. We then examine less-commonly used techniques such as Multivariate Adaptive Regression Splines and Boosted Trees and find them superior in bias, asymmetry, and variance both theoretically and in practice under a wide range of numerical benchmarks. In this context the chief advantage of the common techniques is runtime, which even for large datasets is now measured in microseconds compared with milliseconds for the more statistically robust techniques. This points to a tradeoff between bias, variance, and computational resources which in recent years has shifted heavily in favor of the more advanced methods, primarily driven by Moore's Law. Along these lines, we also propose a new algorithm which has better overall statistical properties than all techniques examined thus far, at the cost of significantly worse runtime, in addition to providing guidance on choosing the nonparametric regression technique most suitable to any specific problem. We then examine the more general problem of errors in both variables and provide a new algorithm which performs well in most cases and lacks the clear asymmetry of existing non-parametric methods, which fail to account for errors in both variables.

  18. Functional relationships between leaf hydraulics and leaf economic traits in response to nutrient addition in subtropical tree species.

    PubMed

    Villagra, Mariana; Campanello, Paula I; Bucci, Sandra J; Goldstein, Guillermo

    2013-12-01

    Leaves can be both a hydraulic bottleneck and a safety valve against hydraulic catastrophic dysfunctions, and thus changes in traits related to water movement in leaves and associated costs may be critical for the success of plant growth. A 4-year fertilization experiment with nitrogen (N) and phosphorus (P) addition was done in a semideciduous Atlantic forest in northeastern Argentina. Saplings of five dominant canopy species were grown in similar gaps inside the forests (five control and five N + P addition plots). Leaf lifespan (LL), leaf mass per unit area (LMA), leaf and stem vulnerability to cavitation, leaf hydraulic conductance (K(leaf_area) and K(leaf_mass)) and leaf turgor loss point (TLP) were measured in the five species and in both treatments. Leaf lifespan tended to decrease with the addition of fertilizers, and LMA was significantly higher in plants with nutrient addition compared with individuals in control plots. The vulnerability to cavitation of leaves (P50(leaf)) either increased or decreased with the nutrient treatment depending on the species, but the average P50(leaf) did not change with nutrient addition. The P50(leaf) decreased linearly with increasing LMA and LL across species and treatments. These trade-offs have an important functional significance because more expensive (higher LMA) and less vulnerable leaves (lower P50(leaf)) are retained for a longer period of time. Osmotic potentials at TLP and at full turgor became more negative with decreasing P50(leaf) regardless of nutrient treatment. The K(leaf) on a mass basis was negatively correlated with LMA and LL, indicating that there is a carbon cost associated with increased water transport that is compensated by a longer LL. The vulnerability to cavitation of stems and leaves were similar, particularly in fertilized plants. Leaves in the species studied may not function as safety valves at low water potentials to protect the hydraulic pathway from water stress-induced cavitation

  19. Logistic Regression

    NASA Astrophysics Data System (ADS)

    Grégoire, G.

    2014-12-01

    Logistic regression was originally intended to explain the relationship between the probability of an event and a set of covariates. The model's coefficients can be interpreted via the odds and odds ratio, which are presented in the introduction of the chapter. When the observations are recorded individually, we speak of binary logistic regression; when they are grouped, the logistic regression is said to be binomial. In our presentation we mainly focus on the binary case. For statistical inference the main tool is the maximum likelihood methodology: we present the Wald, Rao and likelihood ratio results and their use to compare nested models. The problems we intend to deal with are essentially the same as in multiple linear regression: testing global effect, individual effect, selection of variables to build a model, measuring the fit of the model, and prediction of new values. The methods are demonstrated on data sets using R. Finally we briefly consider the binomial case and the situation where we are interested in several events, that is the polytomous (multinomial) logistic regression and the particular case of ordinal logistic regression.
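
    The odds-ratio reading of a logistic coefficient mentioned in the abstract above can be checked numerically. The chapter itself uses R; the sketch below uses Python only to keep the examples on this page in one language, and the coefficient values are hypothetical.

```python
import math


def probability(x, beta0, beta1):
    """Event probability under a one-covariate logistic model."""
    return 1.0 / (1.0 + math.exp(-(beta0 + beta1 * x)))


def odds(p):
    """Odds corresponding to a probability p (p must be below 1)."""
    return p / (1.0 - p)


# Hypothetical coefficients, for illustration only.
beta0, beta1 = -1.0, 0.5

# Raising the covariate by one unit multiplies the odds by exp(beta1),
# whatever the starting value of the covariate.
odds_ratio = odds(probability(3.0, beta0, beta1)) / odds(probability(2.0, beta0, beta1))
print(odds_ratio, math.exp(beta1))
```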

  20. The Application of Classification and Regression Trees for the Triage of Women for Referral to Colposcopy and the Estimation of Risk for Cervical Intraepithelial Neoplasia: A Study Based on 1625 Cases with Incomplete Data from Molecular Tests

    PubMed Central

    Pouliakis, Abraham; Karakitsou, Efrossyni; Chrelias, Charalampos; Pappas, Asimakis; Panayiotides, Ioannis; Valasoulis, George; Kyrgiou, Maria; Paraskevaidis, Evangelos; Karakitsos, Petros

    2015-01-01

    Objective. Numerous ancillary techniques detecting HPV DNA and mRNA now compete with cytology; however, no perfect test exists. In this study we evaluated classification and regression trees (CARTs) for producing triage rules and estimating the risk for cervical intraepithelial neoplasia (CIN) in cases with ASCUS+ in cytology. Study Design. We used 1625 cases. In contrast to other approaches we included cases with missing data to increase the data volume, obtain more accurate results, and simulate real conditions in the everyday practice of gynecologic clinics and laboratories. The proposed CART was based on the cytological result, HPV DNA typing, HPV mRNA detection based on NASBA and flow cytometry, p16 immunocytochemical expression, and finally age and parous status. Results. Algorithms useful for the triage of women were produced; gynecologists could apply these in conjunction with available examination results and arrive at an estimate of the risk that a woman harbors CIN, expressed as a probability. Conclusions. The most important test was the cytological examination; however, the CART handled cases with inadequate cytological outcome and increased the diagnostic accuracy by exploiting the results of ancillary techniques even when some data were inadequate or missing. The CART performance was better than any other single test involved in this study. PMID:26339651

  1. Morse–Smale Regression

    SciTech Connect

    Gerber, Samuel; Rubel, Oliver; Bremer, Peer -Timo; Pascucci, Valerio; Whitaker, Ross T.

    2012-01-19

    This paper introduces a novel partition-based regression approach that incorporates topological information. Partition-based regression typically introduces a quality-of-fit-driven decomposition of the domain. The emphasis in this work is on a topologically meaningful segmentation. Thus, the proposed regression approach is based on a segmentation induced by a discrete approximation of the Morse–Smale complex. This yields a segmentation with partitions corresponding to regions of the function with a single minimum and maximum that are often well approximated by a linear model. This approach yields regression models that are amenable to interpretation and have good predictive capacity. Typically, regression estimates are quantified by their geometrical accuracy. For the proposed regression, an important aspect is the quality of the segmentation itself. Thus, this article introduces a new criterion that measures the topological accuracy of the estimate. The topological accuracy provides a complementary measure to the classical geometrical error measures and is very sensitive to overfitting. The Morse–Smale regression is compared to state-of-the-art approaches in terms of geometry and topology and yields comparable or improved fits in many cases. Finally, a detailed study on climate-simulation data demonstrates the application of the Morse–Smale regression. Supplementary Materials are available online and contain an implementation of the proposed approach in the R package msr, an analysis and simulations on the stability of the Morse–Smale complex approximation, and additional tables for the climate-simulation study.

  2. Morse-Smale Regression

    PubMed Central

    Gerber, Samuel; Rübel, Oliver; Bremer, Peer-Timo; Pascucci, Valerio; Whitaker, Ross T.

    2012-01-01

    This paper introduces a novel partition-based regression approach that incorporates topological information. Partition-based regression typically introduces a quality-of-fit-driven decomposition of the domain. The emphasis in this work is on a topologically meaningful segmentation. Thus, the proposed regression approach is based on a segmentation induced by a discrete approximation of the Morse-Smale complex. This yields a segmentation with partitions corresponding to regions of the function with a single minimum and maximum that are often well approximated by a linear model. This approach yields regression models that are amenable to interpretation and have good predictive capacity. Typically, regression estimates are quantified by their geometrical accuracy. For the proposed regression, an important aspect is the quality of the segmentation itself. Thus, this paper introduces a new criterion that measures the topological accuracy of the estimate. The topological accuracy provides a complementary measure to the classical geometrical error measures and is very sensitive to over-fitting. The Morse-Smale regression is compared to state-of-the-art approaches in terms of geometry and topology and yields comparable or improved fits in many cases. Finally, a detailed study on climate-simulation data demonstrates the application of the Morse-Smale regression. Supplementary materials are available online and contain an implementation of the proposed approach in the R package msr, an analysis and simulations on the stability of the Morse-Smale complex approximation and additional tables for the climate-simulation study. PMID:23687424

  3. Under which conditions, additional monitoring data are worth gathering for improving decision making? Application of the VOI theory in the Bayesian Event Tree eruption forecasting framework

    NASA Astrophysics Data System (ADS)

    Loschetter, Annick; Rohmer, Jérémy

    2016-04-01

    Standard and new-generation monitoring observations provide important information, in almost real time, about the evolution of the volcanic system. These observations are used to update the model, contribute to a better hazard assessment, and support decision making concerning potential evacuation. The BET_EF framework (based on the Bayesian Event Tree) developed by INGV integrates information from monitoring with a view to decision making. Using this framework, the objectives of the present work are i. to propose a method to assess the added value of information (within the Value Of Information (VOI) theory) from monitoring; ii. to perform sensitivity analysis on the different parameters that influence the VOI from monitoring. VOI consists in assessing the possible increase in expected value provided by gathering information, for instance through monitoring. Basically, the VOI is the difference between the value with information and the value without additional information in a Cost-Benefit approach. This theory is well suited to deal with situations that can be represented in the form of a decision tree such as the BET_EF tool. Reference values and ranges of variation (for sensitivity analysis) were defined for input parameters, based on data from the MESIMEX exercise (performed at Vesuvio volcano in 2006). Complementary methods for sensitivity analyses were implemented: local, global using Sobol' indices and regional using Contribution to Sample Mean and Variance plots. The results (specific to the case considered) obtained with the different techniques are in good agreement and enable answering the following questions: i. Which characteristics of monitoring are important for early warning (reliability)? ii. How do experts' opinions influence the hazard assessment and thus the decision? Concerning the characteristics of monitoring, the most influential parameters are the means rather than the variances for the case considered
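
    The definition in the abstract above, VOI as the value with information minus the value without it, can be worked through on a toy two-action decision. The Python sketch below is not the BET_EF tool: the probabilities, costs, the single alarm/no-alarm monitoring signal, and the assumption that evacuation fully avoids the loss are all hypothetical simplifications.

```python
def best_expected_cost(p_eruption, cost_evacuate, loss_if_unprepared):
    """Cheaper of the two actions: evacuate (fixed cost) or wait (risk the loss)."""
    return min(cost_evacuate, p_eruption * loss_if_unprepared)


def value_of_information(prior, p_alarm_if_eruption, p_alarm_if_quiet,
                         cost_evacuate, loss_if_unprepared):
    """Expected cost saved by deciding after an imperfect alarm/no-alarm signal."""
    without_info = best_expected_cost(prior, cost_evacuate, loss_if_unprepared)

    p_alarm = prior * p_alarm_if_eruption + (1.0 - prior) * p_alarm_if_quiet
    with_info = 0.0
    # Average the best achievable cost over the two possible monitoring outcomes.
    for p_signal, likelihood in ((p_alarm, p_alarm_if_eruption),
                                 (1.0 - p_alarm, 1.0 - p_alarm_if_eruption)):
        if p_signal > 0.0:
            posterior = prior * likelihood / p_signal  # Bayes' rule
            with_info += p_signal * best_expected_cost(
                posterior, cost_evacuate, loss_if_unprepared)
    return without_info - with_info


# Hypothetical inputs: 10% prior eruption probability, a monitoring signal that
# alarms in 90% of eruptions and 20% of quiet periods, costs in arbitrary units.
voi = value_of_information(0.1, 0.9, 0.2, cost_evacuate=10.0,
                           loss_if_unprepared=200.0)
print(round(voi, 3))
```

    With these inputs, a signal that alarms equally often with or without an eruption would yield a VOI of zero, since it never changes the preferred action.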

  4. Boosted Beta Regression

    PubMed Central

    Schmid, Matthias; Wickler, Florian; Maloney, Kelly O.; Mitchell, Richard; Fenske, Nora; Mayr, Andreas

    2013-01-01

    Regression analysis with a bounded outcome is a common problem in applied statistics. Typical examples include regression models for percentage outcomes and the analysis of ratings that are measured on a bounded scale. In this paper, we consider beta regression, which is a generalization of logit models to situations where the response is continuous on the interval (0,1). Consequently, beta regression is a convenient tool for analyzing percentage responses. The classical approach to fit a beta regression model is to use maximum likelihood estimation with subsequent AIC-based variable selection. As an alternative to this established - yet unstable - approach, we propose a new estimation technique called boosted beta regression. With boosted beta regression, estimation and variable selection can be carried out simultaneously in a highly efficient way. Additionally, both the mean and the variance of a percentage response can be modeled using flexible nonlinear covariate effects. As a consequence, the new method accounts for common problems such as overdispersion and non-binomial variance structures. PMID:23626706

  5. Fault-Tree Compiler

    NASA Technical Reports Server (NTRS)

    Butler, Ricky W.; Boerschlein, David P.

    1993-01-01

    Fault-Tree Compiler (FTC) program is software tool used to calculate probability of top event in fault tree. Gates of five different types allowed in fault tree: AND, OR, EXCLUSIVE OR, INVERT, and M OF N. High-level input language easy to understand and use. In addition, program supports hierarchical fault-tree definition feature, which simplifies tree-description process and reduces execution time. Set of programs created forming basis for reliability-analysis workstation: SURE, ASSIST, PAWS/STEM, and FTC fault-tree tool (LAR-14586). Written in PASCAL, ANSI-compliant C language, and FORTRAN 77. Other versions available upon request.
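
    The five gate types listed above each map to a simple probability rule when the inputs are statistically independent. The Python sketch below illustrates those rules on a small tree; it is not FTC's input language or its solution method, the event probabilities are hypothetical, and reading M OF N as "at least M of the N inputs occur" is an assumption.

```python
def p_and(probs):
    """All inputs occur (independent inputs assumed throughout)."""
    result = 1.0
    for p in probs:
        result *= p
    return result


def p_or(probs):
    """At least one input occurs."""
    none_occur = 1.0
    for p in probs:
        none_occur *= 1.0 - p
    return 1.0 - none_occur


def p_xor(p1, p2):
    """Exactly one of two inputs occurs."""
    return p1 * (1.0 - p2) + p2 * (1.0 - p1)


def p_invert(p):
    """The input does not occur."""
    return 1.0 - p


def p_m_of_n(m, probs):
    """At least m of the inputs occur (assumed meaning of M OF N)."""
    dist = [1.0]  # dist[k] = probability that exactly k inputs so far occurred
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for k, q in enumerate(dist):
            nxt[k] += q * (1.0 - p)
            nxt[k + 1] += q * p
        dist = nxt
    return sum(dist[m:])


# Hypothetical tree: top = OR(AND(a, b), 2-of-3(c, d, e)).
both_pumps_fail = p_and([0.1, 0.2])
two_of_three_sensors_fail = p_m_of_n(2, [0.1, 0.1, 0.1])
top_event = p_or([both_pumps_fail, two_of_three_sensors_fail])
print(round(top_event, 5))
```

    These rules stop being exact when the same basic event feeds more than one branch, because the gate inputs are then no longer independent.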

  6. Understanding Poisson regression.

    PubMed

    Hayat, Matthew J; Higgins, Melinda

    2014-04-01

    Nurse investigators often collect study data in the form of counts. Traditional methods of data analysis have historically approached analysis of count data either as if the count data were continuous and normally distributed or with dichotomization of the counts into the categories of occurred or did not occur. These outdated methods for analyzing count data have been replaced with more appropriate statistical methods that make use of the Poisson probability distribution, which is useful for analyzing count data. The purpose of this article is to provide an overview of the Poisson distribution and its use in Poisson regression. Assumption violations for the standard Poisson regression model are addressed with alternative approaches, including addition of an overdispersion parameter or negative binomial regression. An illustrative example is presented with an application from the ENSPIRE study, and regression modeling of comorbidity data is included for illustrative purposes.
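
    To make the modeling idea in the abstract above concrete, the Python sketch below fits a one-covariate Poisson regression with a log link by Newton-Raphson and then computes the Pearson dispersion statistic, a common check for the overdispersion the abstract mentions. The data are hypothetical and unrelated to the ENSPIRE study; in practice a statistics package would be used instead of this hand-rolled fit.

```python
import math


def fit_poisson(xs, ys, iterations=25):
    """Maximum likelihood fit of log(mu_i) = b0 + b1 * x_i by Newton-Raphson."""
    b0 = math.log(sum(ys) / len(ys))  # start from the intercept-only fit
    b1 = 0.0
    for _ in range(iterations):
        mus = [math.exp(b0 + b1 * x) for x in xs]
        # Score vector (gradient of the log-likelihood).
        g0 = sum(y - m for y, m in zip(ys, mus))
        g1 = sum(x * (y - m) for x, y, m in zip(xs, ys, mus))
        # Fisher information matrix entries.
        h00 = sum(mus)
        h01 = sum(x * m for x, m in zip(xs, mus))
        h11 = sum(x * x * m for x, m in zip(xs, mus))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information) * step = score.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def pearson_dispersion(xs, ys, b0, b1):
    """Pearson chi-square divided by residual degrees of freedom (n - 2)."""
    mus = [math.exp(b0 + b1 * x) for x in xs]
    chi_square = sum((y - m) ** 2 / m for y, m in zip(ys, mus))
    return chi_square / (len(ys) - 2)


# Hypothetical counts observed at six covariate values.
x_values = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
counts = [1, 2, 2, 4, 6, 9]
b0_hat, b1_hat = fit_poisson(x_values, counts)
dispersion = pearson_dispersion(x_values, counts, b0_hat, b1_hat)
print(b0_hat, b1_hat, dispersion)
```

    A dispersion value well above 1 would suggest overdispersion, in which case the alternatives named in the abstract (an overdispersion parameter or negative binomial regression) become relevant.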

  7. The effectiveness of selected feed and water additives for reducing Salmonella spp. of public health importance in broiler chickens: a systematic review, meta-analysis, and meta-regression approach.

    PubMed

    Totton, Sarah C; Farrar, Ashley M; Wilkins, Wendy; Bucher, Oliver; Waddell, Lisa A; Wilhelm, Barbara J; McEwen, Scott A; Rajić, Andrijana

    2012-10-01

    Eating inappropriately prepared poultry meat is a major cause of foodborne salmonellosis. Our objectives were to determine the efficacy of feed and water additives (other than competitive exclusion and antimicrobials) on reducing Salmonella prevalence or concentration in broiler chickens using systematic review-meta-analysis and to explore sources of heterogeneity found in the meta-analysis through meta-regression. Six electronic databases were searched (Current Contents (1999-2009), Agricola (1924-2009), MEDLINE (1860-2009), Scopus (1960-2009), Centre for Agricultural Bioscience (CAB) (1913-2009), and CAB Global Health (1971-2009)), five topic experts were contacted, and the bibliographies of review articles and a topic-relevant textbook were manually searched to identify all relevant research. Study inclusion criteria comprised: English-language primary research investigating the effects of feed and water additives on the Salmonella prevalence or concentration in broiler chickens. Data extraction and study methodological assessment were conducted by two reviewers independently using pretested forms. Seventy challenge studies (n=910 unique treatment-control comparisons), seven controlled studies (n=154), and one quasi-experiment (n=1) met the inclusion criteria. Compared to an assumed control group prevalence of 44 of 1000 broilers, random-effects meta-analysis indicated that the Salmonella cecal colonization in groups with prebiotics (fructooligosaccharide, lactose, whey, dried milk, lactulose, lactosucrose, sucrose, maltose, mannanoligosaccharide) added to feed or water was 15 out of 1000 broilers; with lactose added to feed or water it was 10 out of 1000 broilers; with experimental chlorate product (ECP) added to feed or water it was 21 out of 1000. For ECP the concentration of Salmonella in the ceca was decreased by 0.61 log(10)cfu/g in the treated group compared to the control group. Significant heterogeneity (Cochran's Q-statistic p≤0.10) was observed

  8. Data mining methods in the prediction of Dementia: A real-data comparison of the accuracy, sensitivity and specificity of linear discriminant analysis, logistic regression, neural networks, support vector machines, classification trees and random forests

    PubMed Central

    2011-01-01

    Background: Dementia and cognitive impairment associated with aging are a major medical and social concern. Neuropsychological testing is a key element in the diagnostic procedures of Mild Cognitive Impairment (MCI), but has presently a limited value in the prediction of progression to dementia. We advance the hypothesis that newer statistical classification methods derived from data mining and machine learning methods like Neural Networks, Support Vector Machines and Random Forests can improve accuracy, sensitivity and specificity of predictions obtained from neuropsychological testing. Seven nonparametric classifiers derived from data mining methods (Multilayer Perceptrons Neural Networks, Radial Basis Function Neural Networks, Support Vector Machines, CART, CHAID and QUEST Classification Trees and Random Forests) were compared to three traditional classifiers (Linear Discriminant Analysis, Quadratic Discriminant Analysis and Logistic Regression) in terms of overall classification accuracy, specificity, sensitivity, Area under the ROC curve and Press' Q. Model predictors were 10 neuropsychological tests currently used in the diagnosis of dementia. Statistical distributions of classification parameters obtained from a 5-fold cross-validation were compared using Friedman's nonparametric test. Results: Press' Q test showed that all classifiers performed better than chance alone (p < 0.05). Support Vector Machines showed the largest overall classification accuracy (Median (Me) = 0.76) and area under the ROC (Me = 0.90). However, this method showed high specificity (Me = 1.0) but low sensitivity (Me = 0.3). Random Forest ranked second in overall accuracy (Me = 0.73), with high area under the ROC (Me = 0.73), specificity (Me = 0.73) and sensitivity (Me = 0.64). Linear Discriminant Analysis also showed acceptable overall accuracy (Me = 0.66), with acceptable area under the ROC (Me = 0.72), specificity (Me = 0.66) and sensitivity (Me = 0.64). The remaining classifiers showed

  9. Atlas of relations between climatic parameters and distributions of important trees and shrubs in North America; additional conifers, hardwoods, and monocots

    USGS Publications Warehouse

    Thompson, Robert S.; Anderson, Katherine H.; Bartlein, Patrick J.; Smith, Sharon A.

    2000-01-01

    This volume explores the continental-scale relations between climate and the geographic ranges of woody plant species in North America. A 25-km equal-area grid of modern climatic and bioclimatic parameters for North America was constructed from instrumental weather records. The geographic distributions of selected tree and shrub species were digitized, and the presence or absence of each species was determined for each cell on the 25-km grid, thus providing a basis for comparing climatic data and species' distribution.

  10. DIF Trees: Using Classification Trees to Detect Differential Item Functioning

    ERIC Educational Resources Information Center

    Vaughn, Brandon K.; Wang, Qiu

    2010-01-01

    A nonparametric tree classification procedure is used to detect differential item functioning for items that are dichotomously scored. Classification trees are shown to be an alternative procedure to detect differential item functioning other than the use of traditional Mantel-Haenszel and logistic regression analysis. A nonparametric…

  11. Regression: A Bibliography.

    ERIC Educational Resources Information Center

    Pedrini, D. T.; Pedrini, Bonnie C.

    Regression, another mechanism studied by Sigmund Freud, has had much research, e.g., hypnotic regression, frustration regression, schizophrenic regression, and infra-human-animal regression (often directly related to fixation). Many investigators worked with hypnotic age regression, which has a long history, going back to Russian reflexologists.…

  12. A polyethylenimine-modified carboxyl-poly(styrene/acrylamide) copolymer nanosphere for co-delivering of CpG and TGF-β receptor I inhibitor with remarkable additive tumor regression effect against liver cancer in mice

    PubMed Central

    Liang, Shuyan; Hu, Jun; Xie, Yuanyuan; Zhou, Qing; Zhu, Yanhong; Yang, Xiangliang

    2016-01-01

    Cancer immunotherapy based on nanodelivery systems has shown potential for treatment of various malignancies, owing to the benefits of tumor targeting of nanoparticles. However, induction of a potent T-cell immune response against tumors still remains a challenge. In this study, polyethylenimine-modified carboxyl-styrene/acrylamide (PS) copolymer nano-spheres were developed as a delivery system of unmethylated cytosine-phosphate-guanine (CpG) oligodeoxynucleotides and transforming growth factor-beta (TGF-β) receptor I inhibitors for cancer immunotherapy. TGF-β receptor I inhibitors (LY2157299, LY) were encapsulated to the PS via hydrophobic interaction, while CpG oligodeoxynucleotides were loaded onto the PS through electrostatic interaction. Compared to the control group, tumor inhibition in the PS-LY/CpG group was up to 99.7% without noticeable toxicity. The tumor regression may be attributed to T-cell activation and amplification in mouse models. The results highlight the additive effect of CpG and TGF-β receptor I inhibitors co-delivered in cancer immunotherapy. PMID:28008250

  13. Tree-structured supervised learning and the genetics of hypertension.

    PubMed

    Huang, Jing; Lin, Alfred; Narasimhan, Balasubramanian; Quertermous, Thomas; Hsiung, C Agnes; Ho, Low-Tone; Grove, John S; Olivier, Michael; Ranade, Koustubh; Risch, Neil J; Olshen, Richard A

    2004-07-20

    This paper is about an algorithm, FlexTree, for general supervised learning. It extends the binary tree-structured approach (Classification and Regression Trees, CART) although it differs greatly in its selection and combination of predictors. It is particularly applicable to assessing interactions: gene by gene and gene by environment as they bear on complex disease. One model for predisposition to complex disease involves many genes. Of them, most are pure noise; each of the values that is not the prevalent genotype for the minority of genes that contribute to the signal carries a "score." Scores add. Individuals with scores above an unknown threshold are predisposed to the disease. For the additive score problem and simulated data, FlexTree has cross-validated risk better than many cutting-edge technologies to which it was compared when small fractions of candidate genes carry the signal. For the model where only a precise list of aberrant genotypes is predisposing, there is not a systematic pattern of absolute superiority; however, overall, FlexTree seems better than the other technologies. We tried the algorithm on data from 563 Chinese women, 206 hypotensive, 357 hypertensive, with information on ethnicity, menopausal status, insulin-resistant status, and 21 loci. FlexTree and Logic Regression appear better than the others in terms of Bayes risk. However, the differences are not significant in the usual statistical sense.

  14. Ridge Regression: A Panacea?

    ERIC Educational Resources Information Center

    Walton, Joseph M.; And Others

    1978-01-01

    Ridge regression is an approach to the problem of large standard errors of regression estimates of intercorrelated regressors. The effect of ridge regression on the estimated squared multiple correlation coefficient is discussed and illustrated. (JKS)
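
    For orientation, the ridge estimator this abstract refers to has the standard closed form below. This is the textbook definition, not a formula quoted from the article; the design matrix X, response y and ridge constant k are notation assumed here.

```latex
% Ridge estimator with ridge constant k >= 0; k = 0 recovers ordinary least squares.
\[
\hat{\beta}_{\mathrm{ridge}}(k) = \left( X^{\top} X + k I \right)^{-1} X^{\top} y ,
\qquad k \ge 0 .
\]
```

    Adding kI to X'X moves its eigenvalues away from zero, which is why the estimates are more stable when the regressors are strongly intercorrelated.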

  15. Our Air: Unfit for Trees.

    ERIC Educational Resources Information Center

    Dochinger, Leon S.

    To help urban, suburban, and rural tree owners know about air pollution's effects on trees and their tolerance and intolerance to pollutants, the USDA Forest Service has prepared this booklet. It answers the following questions about atmospheric pollution: Where does it come from? What can it do to trees? and What can we do about it? In addition,…

  16. Wrong Signs in Regression Coefficients

    NASA Technical Reports Server (NTRS)

    McGee, Holly

    1999-01-01

    When using parametric cost estimation, it is important to note the possibility of the regression coefficients having the wrong sign. A wrong sign is defined as a sign on the regression coefficient opposite to the researcher's intuition and experience. Some possible causes for the wrong sign discussed in this paper are a small range of x's, leverage points, missing variables, multicollinearity, and computational error. Additionally, techniques for determining the cause of the wrong sign are given.

  17. Fragmentation of random trees

    NASA Astrophysics Data System (ADS)

    Kalay, Z.; Ben-Naim, E.

    2015-01-01

    We study fragmentation of a random recursive tree into a forest by repeated removal of nodes. The initial tree consists of N nodes and it is generated by sequential addition of nodes with each new node attaching to a randomly-selected existing node. As nodes are removed from the tree, one at a time, the tree dissolves into an ensemble of separate trees, namely, a forest. We study statistical properties of trees and nodes in this heterogeneous forest, and find that the fraction of remaining nodes m characterizes the system in the limit $N \to \infty$. We obtain analytically the size density $\phi_s$ of trees of size s. The size density has power-law tail $\phi_s \sim s^{-\alpha}$ with exponent $\alpha = 1 + \frac{1}{m}$. Therefore, the tail becomes steeper as further nodes are removed, and the fragmentation process is unusual in that exponent α increases continuously with time. We also extend our analysis to the case where nodes are added as well as removed, and obtain the asymptotic size density for growing trees.
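
    The process described in this abstract is simple to simulate. The sketch below is an illustration only, not the authors' code: it assumes nodes are removed uniformly at random (so keeping a uniformly random subset is equivalent to removing one node at a time), and the function names random_recursive_tree and fragment are hypothetical.

```python
import random
from collections import Counter


def random_recursive_tree(n, rng):
    """Grow a random recursive tree: node i attaches to a uniformly chosen earlier node."""
    parent = [-1] * n  # node 0 is the root and has no parent
    for i in range(1, n):
        parent[i] = rng.randrange(i)
    return parent


def fragment(parent, keep_fraction, rng):
    """Keep a random fraction of nodes and return the sizes of the surviving trees.

    Assumption: nodes are removed uniformly at random.
    """
    n = len(parent)
    alive = [False] * n
    for v in rng.sample(range(n), round(keep_fraction * n)):
        alive[v] = True

    # Because parent[v] < v, the component root of a parent is already final
    # by the time its child is visited in increasing order.
    root = list(range(n))
    sizes = Counter()
    for v in range(n):
        if not alive[v]:
            continue
        p = parent[v]
        if p >= 0 and alive[p]:
            root[v] = root[p]  # the edge to the parent survives
        sizes[root[v]] += 1
    return sorted(sizes.values(), reverse=True)


rng = random.Random(0)
tree = random_recursive_tree(10_000, rng)
forest = fragment(tree, 0.5, rng)
print(len(forest), "trees; largest has", forest[0], "nodes")
```

    A histogram of the returned sizes over many runs could be compared with the power-law tail stated in the abstract; the sketch itself makes no claim about how closely a finite simulation matches the asymptotic result.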

  18. Regressive systemic sclerosis.

    PubMed Central

    Black, C; Dieppe, P; Huskisson, T; Hart, F D

    1986-01-01

    Systemic sclerosis is a disease which usually progresses or reaches a plateau with persistence of symptoms and signs. Regression is extremely unusual. Four cases of established scleroderma are described in which regression is well documented. The significance of this observation and possible mechanisms of disease regression are discussed. PMID:3718012

  19. NCCS Regression Test Harness

    SciTech Connect

    Tharrington, Arnold N.

    2015-09-09

    The NCCS Regression Test Harness is a software package that provides a framework to perform regression and acceptance testing on NCCS High Performance Computers. The package is written in Python and has only the dependency of a Subversion repository to store the regression tests.

  20. Unitary Response Regression Models

    ERIC Educational Resources Information Center

    Lipovetsky, S.

    2007-01-01

    The dependent variable in a regular linear regression is a numerical variable, and in a logistic regression it is a binary or categorical variable. In these models the dependent variable has varying values. However, there are problems yielding an identity output of a constant value which can also be modelled in a linear or logistic regression with…

  1. Assessing visual green effects of individual urban trees using airborne Lidar data.

    PubMed

    Chen, Ziyue; Xu, Bing; Gao, Bingbo

    2015-12-01

    Urban trees benefit people's daily life in terms of air quality, local climate, recreation and aesthetics. Among these functions, a growing number of studies have been conducted to understand the relationship between residents' preference towards local environments and visual green effects of urban greenery. However, except for on-site photography, there are few quantitative methods to calculate green visibility, especially tree green visibility, from viewers' perspectives. To fill this research gap, a case study was conducted in the city of Cambridge, which has a diversity of tree species, sizes and shapes. Firstly, a photograph-based survey was conducted to approximate the actual value of visual green effects of individual urban trees. In addition, small footprint airborne Lidar (Light detection and ranging) data was employed to measure the size and shape of individual trees. Next, correlations between visual tree green effects and tree structural parameters were examined. Through experiments and gradual refinement, a regression model with satisfactory R² and limited large errors is proposed. Considering the diversity of sample trees and the result of cross-validation, this model has the potential to be applied to other study sites. This research provides urban planners and decision makers with an innovative method to analyse and evaluate landscape patterns in terms of tree greenness.

  2. Fully Regressive Melanoma

    PubMed Central

    Ehrsam, Eric; Kallini, Joseph R.; Lebas, Damien; Modiano, Philippe; Cotten, Hervé

    2016-01-01

    Fully regressive melanoma is a phenomenon in which the primary cutaneous melanoma becomes completely replaced by fibrotic components as a result of host immune response. Although 10 to 35 percent of cases of cutaneous melanomas may partially regress, fully regressive melanoma is very rare; only 47 cases have been reported in the literature to date. All of the cases of fully regressive melanoma reported in the literature were diagnosed in conjunction with metastasis in the patient. The authors describe a case of fully regressive melanoma without any metastases at the time of its diagnosis. Characteristic findings on dermoscopy, as well as the absence of melanoma on final biopsy, confirmed the diagnosis. PMID:27672418

  3. Neem-tree (Azadirachta indica Juss.) extract as a feed additive against the American dog tick (Dermacentor variabilis) in sheep (Ovis aries).

    PubMed

    Landau, S Y; Provenza, F D; Gardner, D R; Pfister, J A; Knoppel, E L; Peterson, C; Kababya, D; Needham, G R; Villalba, J J

    2009-11-12

    Acaricides can be conveyed to ticks via the blood of their hosts. As fruit and kernel extracts from the Meliaceae family, and in particular the tetranortriterpenoid azadirachtin (AZA), inhibit tick egg production and embryogenesis in Ixodidae ticks, we investigated the effects of Neem Azal, an extract containing 43% AZA, given as a feed additive to lambs artificially infested with engorging adult Dermacentor variabilis ticks. After tick attachment, the lambs were allotted to three dietary treatments: AZA0 (control, n=10), AZA0.3 (n=5), and AZA0.6 (n=5), with feed containing 0%, 0.3%, and 0.6% AZA on DM basis, respectively. In half of the AZA0 lambs, ticks were sprayed on day 4 after attachment with an ethanol:water:soap emulsion containing 0.6% AZA (AZA0S). In spite of its very pungent odor, the neem extract was well accepted by all but one lamb. No differences were found between treatment groups in liver enzymes in blood, and there was no indication of toxicity. The plasma AZA concentrations after 7 and 14 days of feeding AZA were 4.81 and 4.35 microg/mL for the AZA0.6 and 3.32 and 1.88 microg/mL for the AZA0.3 treatments, respectively (P<0.0001). Treatments were not lethal to ticks, but tick weights at detachment were 0.64, 0.56, 0.48, and 0.37 g for ticks from the AZA0, AZA0.3, AZA0S, and AZA0.6 treatments (P<0.04), respectively, suggesting that blood AZA impaired blood-feeding. The highest mortality rate after detachment was for AZA0.6 (P<0.09). As AZA affects embryo development and ticks at the molting stages, we expect that following treatments of hosts for longer periods, one-host ticks will be more affected than the three-host tick D. variabilis.

  4. Phylogenetic trees in bioinformatics

    SciTech Connect

    Burr, Tom L

    2008-01-01

    Genetic data is often used to infer evolutionary relationships among a collection of viruses, bacteria, animal or plant species, or other operational taxonomic units (OTU). A phylogenetic tree depicts such relationships and provides a visual representation of the estimated branching order of the OTUs. Tree estimation is unique for several reasons, including: the types of data used to represent each OTU; the use of probabilistic nucleotide substitution models; the inference goals involving both tree topology and branch length; and the huge number of possible trees for a given sample of a very modest number of OTUs, which implies that finding the best tree(s) to describe the genetic data for each OTU is computationally demanding. Bioinformatics is too large a field to review here. We focus on that aspect of bioinformatics that includes study of similarities in genetic data from multiple OTUs. Although research questions are diverse, a common underlying challenge is to estimate the evolutionary history of the OTUs. Therefore, this paper reviews the role of phylogenetic tree estimation in bioinformatics, available methods and software, and identifies areas for additional research and development.

  5. Talking Trees

    ERIC Educational Resources Information Center

    Tolman, Marvin

    2005-01-01

    Students love outdoor activities and will love them even more when they build confidence in their tree identification and measurement skills. Through these activities, students will learn to identify the major characteristics of trees and discover how the pace--a nonstandard measuring unit--can be used to estimate not only distances but also the…

  6. Bayesian Evidence Framework for Decision Tree Learning

    NASA Astrophysics Data System (ADS)

    Chatpatanasiri, Ratthachat; Kijsirikul, Boonserm

    2005-11-01

    This work is primarily interested in the problem of selecting a single decision (or classification) tree given the observed data. Although a single decision tree has a high risk of overfitting, the induced tree is easily interpreted. Researchers have invented various methods such as tree pruning or tree averaging for preventing the induced tree from overfitting (and from underfitting) the data. In this paper, instead of using those conventional approaches, we apply the Bayesian evidence framework of Gull, Skilling and MacKay to a process of selecting a decision tree. We derive a formal function to measure 'the fitness' for each decision tree given a set of observed data. Our method, in fact, is analogous to a well-known Bayesian model selection method for interpolating noisy continuous-value data. As in regression problems, given reasonable assumptions, this derived score function automatically quantifies the principle of Ockham's razor, and hence reasonably deals with the issue of underfitting-overfitting tradeoff.

  7. Geometric tree kernels: classification of COPD from airway tree geometry.

    PubMed

    Feragen, Aasa; Petersen, Jens; Grimm, Dominik; Dirksen, Asger; Pedersen, Jesper Holst; Borgwardt, Karsten; de Bruijne, Marleen

    2013-01-01

    Methodological contributions: This paper introduces a family of kernels for analyzing (anatomical) trees endowed with vector valued measurements made along the tree. While state-of-the-art graph and tree kernels use combinatorial tree/graph structure with discrete node and edge labels, the kernels presented in this paper can include geometric information such as branch shape, branch radius or other vector valued properties. In addition to being flexible in their ability to model different types of attributes, the presented kernels are computationally efficient and some of them can easily be computed for large datasets (N ~ 10,000) of trees with 30-600 branches. Combining the kernels with standard machine learning tools enables us to analyze the relation between disease and anatomical tree structure and geometry. Experimental results: The kernels are used to compare airway trees segmented from low-dose CT, endowed with branch shape descriptors and airway wall area percentage measurements made along the tree. Using kernelized hypothesis testing we show that the geometric airway trees are significantly differently distributed in patients with Chronic Obstructive Pulmonary Disease (COPD) than in healthy individuals. The geometric tree kernels also give a significant increase in the classification accuracy of COPD from geometric tree structure endowed with airway wall thickness measurements in comparison with state-of-the-art methods, giving further insight into the relationship between airway wall thickness and COPD. Software: Software for computing kernels and statistical tests is available at http://image.diku.dk/aasa/software.php.

  8. Geometric classification of double-ringed galaxies using the CART (Classification And Regression Trees) method

    NASA Astrophysics Data System (ADS)

    Ormeño, M. I.; Faúndez-Abans, M.; Cavada, G.

    2003-08-01

    The importance of this work lies in the selection of objects not yet treated specifically as a family and in the use of a robust statistical procedure that requires no assumptions or boundary conditions. It thus contributes to a better understanding of the place of Ringed Galaxies in the Hubble diagram through classification and the study of subclasses. One hundred galaxies with two rings were selected from the Catalog of Southern Ringed Galaxies compiled by Ronald Buta, so as to build a sample that is complete in terms of knowledge of the semi-axes of the inner and outer rings projected on the plane of the sky. Aiming at a possible classification of these normal ringed galaxies into families according to the geometric characteristics of the rings, Cluster Analysis (a classification tool: similarity measurements in a two-dimensional space) was first used to explore the possible existence of families. The variables analyzed were: the inner minor d(I) and major D(I) diameters, the outer minor d(E) and major D(E) diameters, and the inclination angles of the inner q(I) and outer q(E) semi-major axes of the rings. As the discrimination methodology, Classification Trees were constructed. Classification trees are a discrimination method alternative to classical models such as Discriminant Analysis and Logistic Regression, in which a database is divided into partitions (subgroups) of the tree by the action of a predictor (a specific variable). The statistical packages used to process the information were SAS version 8.0 (Statistical Analysis System) and CART version 3.6.3. This statistical analysis suggests the existence of three possible families of double-ringed galaxies, based only on the geometry of the rings. As an initial exploration of this result, the construction of a diagram of BT (total magnitude) versus the

  9. Improved Regression Calibration

    ERIC Educational Resources Information Center

    Skrondal, Anders; Kuha, Jouni

    2012-01-01

    The likelihood for generalized linear models with covariate measurement error cannot in general be expressed in closed form, which makes maximum likelihood estimation taxing. A popular alternative is regression calibration which is computationally efficient at the cost of inconsistent estimation. We propose an improved regression calibration…

  10. The limits to tree height.

    PubMed

    Koch, George W; Sillett, Stephen C; Jennings, Gregory M; Davis, Stephen D

    2004-04-22

    Trees grow tall where resources are abundant, stresses are minor, and competition for light places a premium on height growth. The height to which trees can grow and the biophysical determinants of maximum height are poorly understood. Some models predict heights of up to 120 m in the absence of mechanical damage, but there are historical accounts of taller trees. Current hypotheses of height limitation focus on increasing water transport constraints in taller trees and the resulting reductions in leaf photosynthesis. We studied redwoods (Sequoia sempervirens), including the tallest known tree on Earth (112.7 m), in wet temperate forests of northern California. Our regression analyses of height gradients in leaf functional characteristics estimate a maximum tree height of 122-130 m barring mechanical damage, similar to the tallest recorded trees of the past. As trees grow taller, increasing leaf water stress due to gravity and path length resistance may ultimately limit leaf expansion and photosynthesis for further height growth, even with ample soil moisture.

  11. Rate of tree carbon accumulation increases continuously with tree size.

    PubMed

    Stephenson, N L; Das, A J; Condit, R; Russo, S E; Baker, P J; Beckman, N G; Coomes, D A; Lines, E R; Morris, W K; Rüger, N; Alvarez, E; Blundo, C; Bunyavejchewin, S; Chuyong, G; Davies, S J; Duque, A; Ewango, C N; Flores, O; Franklin, J F; Grau, H R; Hao, Z; Harmon, M E; Hubbell, S P; Kenfack, D; Lin, Y; Makana, J-R; Malizia, A; Malizia, L R; Pabst, R J; Pongpattananurak, N; Su, S-H; Sun, I-F; Tan, S; Thomas, D; van Mantgem, P J; Wang, X; Wiser, S K; Zavala, M A

    2014-03-06

    Forests are major components of the global carbon cycle, providing substantial feedback to atmospheric greenhouse gas concentrations. Our ability to understand and predict changes in the forest carbon cycle--particularly net primary productivity and carbon storage--increasingly relies on models that represent biological processes across several scales of biological organization, from tree leaves to forest stands. Yet, despite advances in our understanding of productivity at the scales of leaves and stands, no consensus exists about the nature of productivity at the scale of the individual tree, in part because we lack a broad empirical assessment of whether rates of absolute tree mass growth (and thus carbon accumulation) decrease, remain constant, or increase as trees increase in size and age. Here we present a global analysis of 403 tropical and temperate tree species, showing that for most species mass growth rate increases continuously with tree size. Thus, large, old trees do not act simply as senescent carbon reservoirs but actively fix large amounts of carbon compared to smaller trees; at the extreme, a single big tree can add the same amount of carbon to the forest within a year as is contained in an entire mid-sized tree. The apparent paradoxes of individual tree growth increasing with tree size despite declining leaf-level and stand-level productivity can be explained, respectively, by increases in a tree's total leaf area that outpace declines in productivity per unit of leaf area and, among other factors, age-related reductions in population density. Our results resolve conflicting assumptions about the nature of tree growth, inform efforts to understand and model forest carbon dynamics, and have additional implications for theories of resource allocation and plant senescence.

  12. Interaction Models for Functional Regression

    PubMed Central

    USSET, JOSEPH; STAICU, ANA-MARIA; MAITY, ARNAB

    2015-01-01

    A functional regression model with a scalar response and multiple functional predictors is proposed that accommodates two-way interactions in addition to their main effects. The proposed estimation procedure models the main effects using penalized regression splines, and the interaction effect by a tensor product basis. Extensions to generalized linear models and data observed on sparse grids or with measurement error are presented. A hypothesis testing procedure for the functional interaction effect is described. The proposed method can be easily implemented through existing software. Numerical studies show that fitting an additive model in the presence of interaction leads to both poor estimation performance and lost prediction power, while fitting an interaction model where there is in fact no interaction leads to negligible losses. The methodology is illustrated on the AneuRisk65 study data. PMID:26744549

  13. Phylogenetic trees and Euclidean embeddings.

    PubMed

    Layer, Mark; Rhodes, John A

    2017-01-01

    It was recently observed by de Vienne et al. (Syst Biol 60(6):826-832, 2011) that a simple square root transformation of distances between taxa on a phylogenetic tree allowed for an embedding of the taxa into Euclidean space. While the justification for this was based on a diffusion model of continuous character evolution along the tree, here we give a direct and elementary explanation for it that provides substantial additional insight. We use this embedding to reinterpret the differences between the NJ and BIONJ tree building algorithms, providing one illustration of how this embedding reflects tree structures in data.
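
    One elementary way to see why square-rooted tree distances fit in Euclidean space (a sketch under assumptions, not necessarily the argument given in the paper) is to give every edge its own coordinate axis and place each taxon at the square root of the edge length on every edge along its path to the root. Two taxa then differ only on the edges that separate them, so their squared Euclidean distance equals their tree distance. The small tree, its edge lengths and all function names below are hypothetical.

```python
import math
from itertools import combinations

# Hypothetical rooted tree: child -> (parent, edge length).
EDGES = {
    "X": ("root", 1.0),
    "A": ("root", 2.0),
    "B": ("X", 3.0),
    "C": ("X", 4.0),
}
AXES = sorted(EDGES)  # one coordinate axis per edge, named by its child node


def path_to_root(node):
    """Map each edge on the path from node to the root to its length."""
    path = {}
    while node in EDGES:
        parent, length = EDGES[node]
        path[node] = length
        node = parent
    return path


def tree_distance(a, b):
    """Sum the lengths of the edges that lie on exactly one of the two root paths."""
    pa, pb = path_to_root(a), path_to_root(b)
    return sum(length for edge, length in pa.items() if edge not in pb) + sum(
        length for edge, length in pb.items() if edge not in pa
    )


def embed(node):
    """Coordinate sqrt(edge length) on each edge of the root path, 0 elsewhere."""
    path = path_to_root(node)
    return [math.sqrt(path.get(edge, 0.0)) for edge in AXES]


def euclidean(u, v):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


for a, b in combinations(["A", "B", "C"], 2):
    d_tree = tree_distance(a, b)
    d_euclid = euclidean(embed(a), embed(b))
    print(a, b, d_tree, d_euclid ** 2)
```

    The construction uses one dimension per edge, so it shows that an embedding exists rather than giving a low-dimensional one.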

  14. Rate of tree carbon accumulation increases continuously with tree size

    USGS Publications Warehouse

    Stephenson, N.L.; Das, A.J.; Condit, R.; Russo, S.E.; Baker, P.J.; Beckman, N.G.; Coomes, D.A.; Lines, E.R.; Morris, W.K.; Rüger, N.; Álvarez, E.; Blundo, C.; Bunyavejchewin, S.; Chuyong, G.; Davies, S.J.; Duque, Á.; Ewango, C.N.; Flores, O.; Franklin, J.F.; Grau, H.R.; Hao, Z.; Harmon, M.E.; Hubbell, S.P.; Kenfack, D.; Lin, Y.; Makana, J.-R.; Malizia, A.; Malizia, L.R.; Pabst, R.J.; Pongpattananurak, N.; Su, S.-H.; Sun, I-F.; Tan, S.; Thomas, D.; van Mantgem, P.J.; Wang, X.; Wiser, S.K.; Zavala, M.A.

    2014-01-01

    Forests are major components of the global carbon cycle, providing substantial feedback to atmospheric greenhouse gas concentrations. Our ability to understand and predict changes in the forest carbon cycle, particularly net primary productivity and carbon storage, increasingly relies on models that represent biological processes across several scales of biological organization, from tree leaves to forest stands. Yet, despite advances in our understanding of productivity at the scales of leaves and stands, no consensus exists about the nature of productivity at the scale of the individual tree, in part because we lack a broad empirical assessment of whether rates of absolute tree mass growth (and thus carbon accumulation) decrease, remain constant, or increase as trees increase in size and age. Here we present a global analysis of 403 tropical and temperate tree species, showing that for most species mass growth rate increases continuously with tree size. Thus, large, old trees do not act simply as senescent carbon reservoirs but actively fix large amounts of carbon compared to smaller trees; at the extreme, a single big tree can add the same amount of carbon to the forest within a year as is contained in an entire mid-sized tree. The apparent paradoxes of individual tree growth increasing with tree size despite declining leaf-level and stand-level productivity can be explained, respectively, by increases in a tree’s total leaf area that outpace declines in productivity per unit of leaf area and, among other factors, age-related reductions in population density. Our results resolve conflicting assumptions about the nature of tree growth, inform efforts to understand and model forest carbon dynamics, and have additional implications for theories of resource allocation and plant senescence.

  15. George: Gaussian Process regression

    NASA Astrophysics Data System (ADS)

    Foreman-Mackey, Daniel

    2015-11-01

    George is a fast and flexible library, implemented in C++ with Python bindings, for Gaussian Process regression useful for accounting for correlated noise in astronomical datasets, including those for transiting exoplanet discovery and characterization and stellar population modeling.

  16. The fault-tree compiler

    NASA Technical Reports Server (NTRS)

    Martensen, Anna L.; Butler, Ricky W.

    1987-01-01

    The Fault Tree Compiler Program is a new reliability tool used to predict the top event probability for a fault tree. Five different gate types are allowed in the fault tree: AND, OR, EXCLUSIVE OR, INVERT, and M OF N gates. The high-level input language is easy to understand and use when describing the system tree. In addition, the use of the hierarchical fault tree capability can simplify the tree description and decrease program execution time. The current solution technique provides an answer that is precise to five digits (within the limits of double-precision floating-point arithmetic). The user may vary one failure rate or failure probability over a range of values and plot the results for sensitivity analyses. The solution technique is implemented in FORTRAN; the remaining program code is implemented in Pascal. The program is written to run on a Digital Equipment Corporation VAX with the VMS operating system.
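
    As a rough illustration of what "top event probability" means for the five gate types named above, the following Python sketch evaluates each gate under the assumption that all inputs are statistically independent. It is not the Fault Tree Compiler's input language or solution technique; the function names, the reading of M OF N as "at least M of N", and the example system are all hypothetical.

```python
from itertools import combinations
from math import prod


def gate_and(probs):
    """Probability that all inputs occur (inputs assumed independent)."""
    return prod(probs)


def gate_or(probs):
    """Probability that at least one input occurs."""
    return 1.0 - prod(1.0 - p for p in probs)


def gate_xor(p_a, p_b):
    """Probability that exactly one of two inputs occurs."""
    return p_a * (1.0 - p_b) + p_b * (1.0 - p_a)


def gate_invert(p):
    """Probability that the input does not occur."""
    return 1.0 - p


def gate_m_of_n(m, probs):
    """Probability that at least m inputs occur (assumed reading of M OF N)."""
    n = len(probs)
    total = 0.0
    for k in range(m, n + 1):
        for occurred in combinations(range(n), k):
            hit = set(occurred)
            total += prod(p if i in hit else 1.0 - p
                          for i, p in enumerate(probs))
    return total


# Hypothetical system: the top event occurs if both power supplies fail,
# or if at least two of three sensors fail.
power_loss = gate_and([0.1, 0.2])
sensor_loss = gate_m_of_n(2, [0.1, 0.1, 0.1])
top_event = gate_or([power_loss, sensor_loss])
print(f"top event probability: {top_event:.5f}")
```

    Composing gates this way is only valid when no basic event feeds more than one branch; shared events break the independence assumption and need a different treatment.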

  17. Audubon Tree Study Program.

    ERIC Educational Resources Information Center

    National Audubon Society, New York, NY.

    Included are an illustrated student reader, "The Story of Trees," a leaders' guide, and a large tree chart with 37 colored pictures. The student reader reviews several aspects of trees: a definition of a tree; where and how trees grow; flowers, pollination and seed production; how trees make their food; how to recognize trees; seasonal changes;…

  18. Regression based modeling of vegetation and climate variables for the Amazon rainforests

    NASA Astrophysics Data System (ADS)

    Kodali, A.; Khandelwal, A.; Ganguly, S.; Bongard, J.; Das, K.

    2015-12-01

    Both short-term (weather) and long-term (climate) variations in the atmosphere directly impact various ecosystems on earth. Forest ecosystems, especially tropical forests, are crucial as they hold the largest reserves of terrestrial carbon. For example, the Amazon forests are a critical component of the global carbon cycle, storing about 100 billion tons of carbon in their woody biomass. There is a growing concern that these forests could succumb to precipitation reduction in a progressively warming climate, leading to the release of a significant amount of carbon into the atmosphere. Therefore, there is a need to accurately quantify the dependence of vegetation growth on different climate variables and obtain better estimates of drought-induced changes to atmospheric CO2. The availability of globally consistent climate and earth observation datasets has allowed global-scale monitoring of various climate and vegetation variables such as precipitation, radiation, surface greenness, etc. Using these diverse datasets, we aim to quantify the magnitude and extent of ecosystem exposure, sensitivity and resilience to droughts in forests. The Amazon rainforests have undergone severe droughts twice in the last decade (2005 and 2010), which makes them an ideal candidate for regional-scale analysis. Current studies on vegetation and climate relationships have mostly explored linear dependence due to computational and domain knowledge constraints. We explore a modeling technique called symbolic regression, based on evolutionary computation, that allows discovery of the dependency structure without any prior assumptions. In symbolic regression the population of possible solutions is defined via tree structures. Each tree represents a mathematical expression that includes pre-defined functions (mathematical operators) and terminal sets (independent variables from data). Selection of these sets is critical to computational efficiency and model accuracy. In this work we investigate

  19. Factors Governing Stemflow Production from Plantation Grown Teak Trees in Thailand

    NASA Astrophysics Data System (ADS)

    Tanaka, N.; Levia, D. F., Jr.; Igarashi, Y.; Yoshifuji, N.; Tanaka, K.; Chatchai, T.; Nanko, K.; Suzuki, M.; Kumagai, T.

    2015-12-01

    Stemflow (SF) is recognized as an important process delivering water, solute, and particulate fluxes to spatially localized areas of the forest floor. Using both long-term SF data from nine even-aged deciduous teak trees grown in the same plantation and meteorological data from a nearby tower, this study seeks to better understand how: (1) specific biotic and abiotic factors control stand-scale SF production of teak; and (2) various biotic and abiotic factors affect tree-to-tree variations in teak SF production. A conventional regression analysis of SF volume against rainfall indicates that, for five individuals among the nine, SF was more efficiently produced in the leafless than in the leafed. However, for the other individuals, there was no such relation, suggesting tree-to-tree variation in the response of SF to canopy status. A boosted regression tree (BRT) analysis, setting daily SF funneling ratios (SFF) of the nine trees as dependent variables, indicates that SFF was intricately controlled by a variety of biotic and abiotic factors. The top six influential factors were, in descending order, rainfall duration, tree height, rainfall intensity, air temperature, wind speed, and antecedent dry period length, having positive, negative, positive, negative, positive, and negative influence on SFF, respectively. Although teak exhibits drastic intra-annual changes in leaf phenology, leaf area index (LAI) had an unexpectedly small influence on SFF on a stand scale. Additional BRT analyses focusing on individuals with the maximum and the minimum SFF values (among the nine individuals) showed that there was considerable tree-to-tree variation in an array of the influential variables for SFF, even though they were planted in the same year and grown in the same plot. In addition to this difference, the BRT analyses also showed that response of SFF to LAI differs between the two individuals. The differentiating responses to LAI depending on individuals may be the

  20. [Understanding logistic regression].

    PubMed

    El Sanharawi, M; Naudet, F

    2013-10-01

    Logistic regression is one of the most common multivariate analysis models utilized in epidemiology. It allows the measurement of the association between the occurrence of an event (qualitative dependent variable) and factors susceptible to influence it (explicative variables). The choice of explicative variables that should be included in the logistic regression model is based on prior knowledge of the disease physiopathology and the statistical association between the variable and the event, as measured by the odds ratio. The main steps for the procedure, the conditions of application, and the essential tools for its interpretation are discussed concisely. We also discuss the importance of the choice of variables that must be included and retained in the regression model in order to avoid the omission of important confounding factors. Finally, by way of illustration, we provide an example from the literature, which should help the reader test his or her knowledge.
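
    The link between the odds ratio and the logistic regression coefficient mentioned above can be made concrete with a small Python sketch. For a model with a single binary explanatory variable, the fitted coefficient equals the natural logarithm of the odds ratio computed from the 2 x 2 table. The counts below are invented for illustration and do not come from the article.

```python
from math import exp, log


def odds_ratio(exposed_cases, exposed_noncases,
               unexposed_cases, unexposed_noncases):
    """Odds of the event among the exposed divided by odds among the unexposed."""
    odds_exposed = exposed_cases / exposed_noncases
    odds_unexposed = unexposed_cases / unexposed_noncases
    return odds_exposed / odds_unexposed


# Hypothetical 2 x 2 table: 20 of 100 exposed and 10 of 100 unexposed
# subjects experienced the event.
or_hat = odds_ratio(20, 80, 10, 90)

# With one binary covariate, the logistic coefficient is log(odds ratio),
# so exponentiating the coefficient recovers the odds ratio.
beta_hat = log(or_hat)
print(f"odds ratio = {or_hat:.2f}, coefficient = {beta_hat:.3f}")
```

    With several explanatory variables the coefficients are adjusted for one another, so this shortcut no longer applies and the model has to be fitted numerically.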

  1. Tree harvesting

    SciTech Connect

    Badger, P.C.

    1995-12-31

    Short rotation intensive culture tree plantations have been a major part of biomass energy concepts since the beginning. One aspect receiving less attention than it deserves is harvesting. This article describes a method of harvesting somewhere between agricultural mowing machines and huge feller-bunchers of the pulpwood and lumber industries.

  2. Practical Session: Logistic Regression

    NASA Astrophysics Data System (ADS)

    Clausel, M.; Grégoire, G.

    2014-12-01

    An exercise is proposed to illustrate logistic regression. One investigates the different risk factors in the occurrence of coronary heart disease. It has been proposed in Chapter 5 of the book of D.G. Kleinbaum and M. Klein, "Logistic Regression", Statistics for Biology and Health, Springer Science Business Media, LLC (2010) and also by D. Chessel and A.B. Dufour in Lyon 1 (see Sect. 6 of http://pbil.univ-lyon1.fr/R/pdf/tdr341.pdf). This example is based on data given in the file evans.txt coming from http://www.sph.emory.edu/dkleinb/logreg3.htm#data.

  3. Ridge Regression: A Regression Procedure for Analyzing Correlated Independent Variables

    ERIC Educational Resources Information Center

    Rakow, Ernest A.

    1978-01-01

    Ridge regression is a technique used to ameliorate the problem of highly correlated independent variables in multiple regression analysis. This paper explains the fundamentals of ridge regression and illustrates its use. (JKS)
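
    A minimal Python sketch of the idea, assuming two centred predictors and the usual ridge estimator (X'X + kI)^-1 X'y, is given below. The function name, the data, and the ridge constant k are hypothetical and are not taken from the paper; the 2 x 2 system is solved by hand so that no external library is needed.

```python
def ridge_two_predictors(x1, x2, y, k):
    """Solve (X'X + k*I) b = X'y for two centred predictors, without intercept."""
    a = sum(u * u for u in x1) + k           # X'X[0][0] + k
    b = sum(u * v for u, v in zip(x1, x2))   # X'X[0][1]
    d = sum(v * v for v in x2) + k           # X'X[1][1] + k
    g1 = sum(u * t for u, t in zip(x1, y))   # X'y[0]
    g2 = sum(v * t for v, t in zip(x2, y))   # X'y[1]
    det = a * d - b * b
    return ((d * g1 - b * g2) / det, (a * g2 - b * g1) / det)


# Hypothetical data: x2 is almost a copy of x1, and y = x1 + x2 exactly.
x1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
x2 = [-2.1, -0.9, 0.0, 0.9, 2.1]
y = [u + v for u, v in zip(x1, x2)]

ols = ridge_two_predictors(x1, x2, y, k=0.0)    # ordinary least squares
ridge = ridge_two_predictors(x1, x2, y, k=1.0)  # shrunken coefficients
print(ols, ridge)
```

    Setting k to zero gives ordinary least squares; a positive k moves the near-singular matrix away from singularity and shrinks the coefficient vector, at the cost of some bias.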

  4. Modelling of filariasis in East Java with Poisson regression and generalized Poisson regression models

    NASA Astrophysics Data System (ADS)

    Darnah

    2016-04-01

    Poisson regression is used when the response variable is count data that follow the Poisson distribution. The Poisson distribution assumes equidispersion. In practice, count data are often overdispersed or underdispersed, so Poisson regression is inappropriate: it may underestimate the standard errors and overstate the significance of the regression parameters, and consequently give misleading inference about them. This paper suggests the generalized Poisson regression model for handling overdispersion and underdispersion in the Poisson regression model. The Poisson regression model and the generalized Poisson regression model are applied to the number of filariasis cases in East Java. Under the Poisson regression model, the factors influencing filariasis are the percentage of families who do not practice clean and healthy living and the percentage of families who do not have a healthy house. Because the Poisson regression model exhibits overdispersion, generalized Poisson regression is used. The best generalized Poisson regression model shows that the factor influencing filariasis is the percentage of families who do not have a healthy house. The model is interpreted as follows: each additional 1 percent of families who do not have a healthy house adds one filariasis patient.
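
    A crude way to see whether equal dispersion is plausible is to compare the sample variance of the counts with their mean, since the two coincide for a Poisson distribution. The Python sketch below does only that; it ignores covariates, so it is not the test used in the paper, and the counts are invented.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Sample variance divided by sample mean; close to 1 for Poisson counts."""
    return variance(counts) / mean(counts)


# Hypothetical case counts per district.
even_counts = [1, 1, 1, 1]
clustered_counts = [0, 0, 0, 10]

print(dispersion_index(even_counts))       # below 1: underdispersion
print(dispersion_index(clustered_counts))  # well above 1: overdispersion
```

    In a regression setting the comparison is made conditionally on the covariates, for example through the residual deviance relative to its degrees of freedom, so this ratio is only a first screening step.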

  5. Modern Regression Discontinuity Analysis

    ERIC Educational Resources Information Center

    Bloom, Howard S.

    2012-01-01

    This article provides a detailed discussion of the theory and practice of modern regression discontinuity (RD) analysis for estimating the effects of interventions or treatments. Part 1 briefly chronicles the history of RD analysis and summarizes its past applications. Part 2 explains how in theory an RD analysis can identify an average effect of…

  6. Multiple linear regression analysis

    NASA Technical Reports Server (NTRS)

    Edwards, T. R.

    1980-01-01

    Program rapidly selects best-suited set of coefficients. User supplies only vectors of independent and dependent data and specifies confidence level required. Program uses stepwise statistical procedure for relating minimal set of variables to set of observations; final regression contains only most statistically significant coefficients. Program is written in FORTRAN IV for batch execution and has been implemented on NOVA 1200.

  7. Explorations in Statistics: Regression

    ERIC Educational Resources Information Center

    Curran-Everett, Douglas

    2011-01-01

    Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This seventh installment of "Explorations in Statistics" explores regression, a technique that estimates the nature of the relationship between two things for which we may only surmise a mechanistic or predictive…

  8. Technical Tree Climbing.

    ERIC Educational Resources Information Center

    Jenkins, Peter

    Tree climbing offers a safe, inexpensive adventure sport that can be performed almost anywhere. Using standard procedures practiced in tree surgery or rock climbing, almost any tree can be climbed. Tree climbing provides challenge and adventure as well as a vigorous upper-body workout. Tree Climbers International classifies trees using a system…

  9. Exotic trees.

    PubMed

    Burda, Z; Erdmann, J; Petersson, B; Wattenberg, M

    2003-02-01

    We discuss the scaling properties of free branched polymers. The scaling behavior of the model is classified by the Hausdorff dimensions for the internal geometry, d(L) and d(H), and for the external one, D(L) and D(H). The dimensions d(H) and D(H) characterize the behavior for long distances, while d(L) and D(L) for short distances. We show that the internal Hausdorff dimension is d(L) = 2 for generic and scale-free trees, contrary to d(H), which is known to be equal to 2 for generic trees and to vary between 2 and infinity for scale-free trees. We show that the external Hausdorff dimension D(H) is directly related to the internal one as D(H) = alpha d(H), where alpha is the stability index of the embedding weights for the nearest-vertex interactions. The index is alpha = 2 for weights from the Gaussian domain of attraction and 0

  10. Tree thinning as an option to increase herbaceous yield of an encroached semi-arid savanna in South Africa

    PubMed Central

    Smit, Gert N

    2005-01-01

    Background The investigation was conducted in a savanna area covered by what was considered an undesirably dense stand of Colophospermum mopane trees, mainly because such a dense stand of trees often results in the suppression of herbaceous plants. The objectives of this study were to determine the influence of intensity of tree thinning on the dry matter yield of herbaceous plants (notably grasses) and to investigate differences in herbaceous species composition between defined subhabitats (under tree canopies, between tree canopies and where trees have been removed). Seven plots (65 × 180 m) were subjected to different intensities of tree thinning, ranging from a totally cleared plot (0 %) to plots thinned to the equivalent of 10 %, 20 %, 35 %, 50 % and 75 % of the leaf biomass of a control plot (100 %) with a tree density of 2711 plants ha^-1. The establishment of herbaceous plants (grasses and forbs) in response to reduced competition from the woody plants was measured during three full growing seasons following the thinning treatments. Results The grass component reacted positively to the tree thinning in terms of total dry matter (DM) yield, but forbs were negatively influenced. Rainfall interacted with tree density and the differences between grass DM yields in thinned plots during years of below average rainfall were substantially higher than those of the control. At high tree densities, yields differed little between seasons of varying rainfall. The relation between grass DM yield and tree biomass was curvilinear, best described by the exponential regression equation. Subhabitat differentiation by C. mopane trees did provide some qualitative benefits, with certain desirable grass species showing a preference for the subhabitat under tree canopies. Conclusion While it can be concluded from this study that high tree densities suppress herbaceous production, the decision to clear/thin the C. mopane trees should include additional considerations. Thinning of C

  11. MixtureTree annotator: a program for automatic colorization and visual annotation of MixtureTree.

    PubMed

    Chen, Shu-Chuan; Ogata, Aaron

    2015-01-01

    The MixtureTree Annotator, written in JAVA, allows the user to automatically color any phylogenetic tree in Newick format generated from any phylogeny reconstruction program and output the Nexus file. By providing the ability to automatically color the tree by sequence name, the MixtureTree Annotator provides a unique advantage over any other programs which perform a similar function. In addition, the MixtureTree Annotator is the only package that can efficiently annotate the output produced by MixtureTree with mutation information and coalescent time information. In order to visualize the resulting output file, a modified version of FigTree is used. Certain popular methods, which lack good built-in visualization tools, for example, MEGA, Mesquite, PHY-FI, TreeView, treeGraph and Geneious, may give results with human errors due to either manually adding colors to each node or with other limitations, for example only using color based on a number, such as branch length, or by taxonomy. In addition to allowing the user to automatically color any given Newick tree by sequence name, the MixtureTree Annotator is the only method that allows the user to automatically annotate the resulting tree created by the MixtureTree program. The MixtureTree Annotator is fast and easy-to-use, while still allowing the user full control over the coloring and annotating process.

  12. What Makes a Tree a Tree?

    ERIC Educational Resources Information Center

    NatureScope, 1986

    1986-01-01

    Provides: (1) background information on trees, focusing on the parts of trees and how they differ from other plants; (2) eight activities; and (3) ready-to-copy pages dealing with tree identification and tree rings. Activities include objective(s), recommended age level(s), subject area(s), list of materials needed, and procedures. (JN)

  13. Extensions and applications of ensemble-of-trees methods in machine learning

    NASA Astrophysics Data System (ADS)

    Bleich, Justin

    Ensemble-of-trees algorithms have come to the forefront of machine learning due to their ability to generate high forecasting accuracy for a wide array of regression and classification problems. Classic ensemble methodologies such as random forests (RF) and stochastic gradient boosting (SGB) rely on algorithmic procedures to generate fits to data. In contrast, more recent ensemble techniques such as Bayesian Additive Regression Trees (BART) and Dynamic Trees (DT) focus on an underlying Bayesian probability model to generate the fits. These new probability model-based approaches show much promise versus their algorithmic counterparts, but also offer substantial room for improvement. The first part of this thesis focuses on methodological advances for ensemble-of-trees techniques with an emphasis on the more recent Bayesian approaches. In particular, we focus on extensions of BART in four distinct ways. First, we develop a more robust implementation of BART for both research and application. We then develop a principled approach to variable selection for BART as well as the ability to naturally incorporate prior information on important covariates into the algorithm. Next, we propose a method for handling missing data that relies on the recursive structure of decision trees and does not require imputation. Last, we relax the assumption of homoskedasticity in the BART model to allow for parametric modeling of heteroskedasticity. The second part of this thesis returns to the classic algorithmic approaches in the context of classification problems with asymmetric costs of forecasting errors. First we consider the performance of RF and SGB more broadly and demonstrate their superiority to logistic regression for applications in criminology with asymmetric costs. Next, we use RF to forecast unplanned hospital readmissions upon patient discharge with asymmetric costs taken into account. Finally, we explore the construction of stable decision trees for forecasts of

  14. The gene tree delusion.

    PubMed

    Springer, Mark S; Gatesy, John

    2016-01-01

    Higher-level relationships among placental mammals are mostly resolved, but several polytomies remain contentious. Song et al. (2012) claimed to have resolved three of these using shortcut coalescence methods (MP-EST, STAR) and further concluded that these methods, which assume no within-locus recombination, are required to unravel deep-level phylogenetic problems that have stymied concatenation. Here, we reanalyze Song et al.'s (2012) data and leverage these re-analyses to explore key issues in systematics including the recombination ratchet, gene tree stoichiometry, the proportion of gene tree incongruence that results from deep coalescence versus other factors, and simulations that compare the performance of coalescence and concatenation methods in species tree estimation. Song et al. (2012) reported an average locus length of 3.1 kb for the 447 protein-coding genes in their phylogenomic dataset, but the true mean length of these loci (start codon to stop codon) is 139.6 kb. Empirical estimates of recombination breakpoints in primates, coupled with consideration of the recombination ratchet, suggest that individual coalescence genes (c-genes) approach ∼12 bp or less for Song et al.'s (2012) dataset, three to four orders of magnitude shorter than the c-genes reported by these authors. This result has general implications for the application of coalescence methods in species tree estimation. We contend that it is illogical to apply coalescence methods to complete protein-coding sequences. Such analyses amalgamate c-genes with different evolutionary histories (i.e., exons separated by >100,000 bp), distort true gene tree stoichiometry that is required for accurate species tree inference, and contradict the central rationale for applying coalescence methods to difficult phylogenetic problems. In addition, Song et al.'s (2012) dataset of 447 genes includes 21 loci with switched taxonomic names, eight duplicated loci, 26 loci with non-homologous sequences that are

  15. Understanding Boswellia papyrifera tree secondary metabolites through bark spectral analysis

    NASA Astrophysics Data System (ADS)

    Girma, Atkilt; Skidmore, Andrew K.; de Bie, C. A. J. M.; Bongers, Frans

    2015-07-01

    Decision makers are concerned whether to tap or rest Boswellia papyrifera trees. Tapping for the production of frankincense is known to deplete carbon reserves from the tree leading to production of less viable seeds, tree carbon starvation and ultimately tree mortality. Decision makers use traditional experience without considering the amount of metabolites stored or depleted from the stem-bark of the tree. This research was designed to come up with a non-destructive B. papyrifera tree metabolite estimation technique relevant for management using spectroscopy. The concentration of biochemicals (metabolites) found in the tree bark was estimated through spectral analysis. Initially, a random sample of 33 trees was selected, and the spectra of the bark were measured with an Analytical Spectral Device (ASD) spectrometer. Bark samples were air dried and ground. Then, 10 g of sample was soaked in petroleum ether to extract crude metabolites. Further chemical analysis was conducted to quantify and isolate pure metabolite compounds such as incensole acetate and boswellic acid. The crude metabolites, which relate to frankincense produce, were compared to plant properties (such as diameter and crown area) and reflectance spectra of the bark. Moreover, the extract was compared to the ASD spectra using the partial least squares regression (PLSR) technique and continuum removed spectral analysis. The continuum removed spectral analysis was performed on two wavelength regions (1275-1663 and 1836-2217) identified through PLSR, using absorption features such as band depth, area, position, asymmetry and width to characterize and find relationships with the bark extracts. The results show that tree properties such as diameter at breast height (DBH) and the crown area of untapped and healthy trees were strongly correlated to the amount of stored crude metabolites. In addition, the PLSR technique applied to the first derivative transformation of the reflectance spectrum was found to estimate the

  16. Calculating a Stepwise Ridge Regression.

    ERIC Educational Resources Information Center

    Morris, John D.

    1986-01-01

    Although methods for using ordinary least squares regression computer programs to calculate a ridge regression are available, the calculation of a stepwise ridge regression requires a special purpose algorithm and computer program. The correct stepwise ridge regression procedure is given, and a parallel FORTRAN computer program is described.…

  17. Orthogonal Regression: A Teaching Perspective

    ERIC Educational Resources Information Center

    Carr, James R.

    2012-01-01

    A well-known approach to linear least squares regression is that which involves minimizing the sum of squared orthogonal projections of data points onto the best fit line. This form of regression is known as orthogonal regression, and the linear model that it yields is known as the major axis. A similar method, reduced major axis regression, is…
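
    For readers who want to see the major axis computed, the Python sketch below fits the line that minimizes the sum of squared perpendicular distances from the points to the line, using the standard closed form based on the centred sums of squares and cross-products. The function name and the data are hypothetical, and the formula assumes a nonzero cross-product term.

```python
from math import sqrt


def major_axis_fit(xs, ys):
    """Return (slope, intercept) of the orthogonal-regression (major axis) line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # Closed-form slope of the major axis; requires sxy != 0.
    slope = (syy - sxx + sqrt((syy - sxx) ** 2 + 4.0 * sxy ** 2)) / (2.0 * sxy)
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical points lying exactly on y = 2x + 1.
slope, intercept = major_axis_fit([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(slope, intercept)
```

    Unlike ordinary least squares, this fit treats x and y symmetrically, so swapping the two variables yields the same line rather than a different regression.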

  18. Marginal longitudinal semiparametric regression via penalized splines

    PubMed Central

    Kadiri, M. Al; Carroll, R.J.; Wand, M.P.

    2010-01-01

    We study the marginal longitudinal nonparametric regression problem and some of its semiparametric extensions. We point out that, while several elaborate proposals for efficient estimation have been made, a relatively simple and straightforward one, based on penalized splines, has not. After describing our approach, we then explain how Gibbs sampling and the BUGS software can be used to achieve quick and effective implementation. Illustrations are provided for nonparametric regression and additive models. PMID:21037941

  19. Decision Trees for Prediction and Data Mining

    DTIC Science & Technology

    2005-02-10

    ironic, as research in tree-structured methods was originally motivated by the desire for an interpretable alternative to standard methods such as...multiple linear regression and neural networks. Another problem with most tree construction algorithms is that their variable selection methods are biased...software, including well-known ones such as CART (Breiman, Friedman, Olshen and Stone 1984) and M5 (Quinlan 1992). With the exception of the lesser

  20. Tree-augmented Cox proportional hazards models.

    PubMed

    Su, Xiaogang; Tsai, Chih-Ling

    2005-07-01

    We study a hybrid model that combines Cox proportional hazards regression with tree-structured modeling. The main idea is to use step functions, provided by a tree structure, to 'augment' Cox (1972) proportional hazards models. The proposed model not only provides a natural assessment of the adequacy of the Cox proportional hazards model but also improves its model fitting without loss of interpretability. Both simulations and an empirical example are provided to illustrate the use of the proposed method.

  1. Steganalysis using logistic regression

    NASA Astrophysics Data System (ADS)

    Lubenko, Ivans; Ker, Andrew D.

    2011-02-01

    We advocate Logistic Regression (LR) as an alternative to the Support Vector Machine (SVM) classifiers commonly used in steganalysis. LR offers more information than traditional SVM methods - it estimates class probabilities as well as providing a simple classification - and can be adapted more easily and efficiently for multiclass problems. Like SVM, LR can be kernelised for nonlinear classification, and it shows comparable classification accuracy to SVM methods. This work is a case study, comparing accuracy and speed of SVM and LR classifiers in detection of LSB Matching and other related spatial-domain image steganography, through the state-of-the-art 686-dimensional SPAM feature set, in three image sets.

  2. Atlas of United States Trees, Volume 2: Alaska Trees and Common Shrubs.

    ERIC Educational Resources Information Center

    Viereck, Leslie A.; Little, Elbert L., Jr.

    This volume is the second in a series of atlases describing the natural distribution or range of native tree species in the United States. The 82 species maps include 32 of trees in Alaska, 6 of shrubs rarely reaching tree size, and 44 more of common shrubs. More than 20 additional maps summarize environmental factors and furnish general…

  3. An Introduction to Tree-Structured Modeling with Application to Quality of Life (QOL) Data

    PubMed Central

    Su, Xiaogang; Azuero, Andres; Cho, June; Kvale, Elizabeth; Meneses, Karen M.; McNees, M. Patrick

    2011-01-01

    Background Investigators addressing nursing research are faced increasingly with the need to analyze data that involve variables of mixed types and are characterized by complex nonlinearity and interactions. Tree-based methods, also called recursive partitioning, are gaining popularity in various fields. In addition to efficiency and flexibility in handling multifaceted data, tree-based methods offer ease of interpretation. Objectives To introduce tree-based methods, discuss their advantages and pitfalls in application, and describe their potential use in nursing research. Method In this paper, (a) an introduction to tree-structured methods is presented, (b) the technique is illustrated via quality of life (QOL) data collected in the Breast Cancer Education Intervention (BCEI) study, and (c) implications for their potential use in nursing research are discussed. Discussion As illustrated by the QOL analysis example, tree methods generate interesting and easily understood findings that cannot be uncovered via traditional linear regression analysis. The expanding breadth and complexity of nursing research may entail the use of new tools to improve efficiency and gain new insights. In certain situations, tree-based methods offer an attractive approach that helps address such needs. PMID:21720217

  4. Interactive effects of ambient ozone and climate measured on growth of mature loblolly pine trees

    SciTech Connect

    McLaughlin, S.B.; Downing, D.J.

    1995-02-01

    Analysis of the seasonal growth patterns of mature loblolly pine trees over the interval 1988-1993 has provided the first direct measurement of reductions of stem growth of large forest trees by ambient ozone. Patterns of stem expansion and contraction of 34 trees were examined in eastern Tennessee using serial measurements with sensitive dendrometer band systems. Study sites varied in soil moisture, soil fertility, and stand density. Levels of ozone, rainfall, and temperature varied widely over the six-year study interval. Regression analysis identified statistically and biologically significant influences of ozone on stem growth. Acting either individually or in interaction with high temperature and moisture stress, higher levels of ozone were associated with reduced stem expansion of individual trees within and across years. Observed responses to ozone were relatively rapid, differed widely among trees and across years, and were significantly amplified by low soil moisture and high air temperatures. Both short-term responses, clearly tied to changing stem water status, and longer-term cumulative responses were identified. These data indicate that relatively low levels of ambient ozone can significantly reduce growth of mature forest trees and that interactions between ambient ozone and climate are likely to be important modifiers of future forest growth and function. Additional studies of mechanisms of short-term response and inter-species comparisons are clearly needed.

  5. Digression and Value Concatenation to Enable Privacy-Preserving Regression

    PubMed Central

    Li, Xiao-Bai; Sarkar, Sumit

    2015-01-01

    Regression techniques can be used not only for legitimate data analysis, but also to infer private information about individuals. In this paper, we demonstrate that regression trees, a popular data-analysis and data-mining technique, can be used to effectively reveal individuals’ sensitive data. This problem, which we call a “regression attack,” has not been addressed in the data privacy literature, and existing privacy-preserving techniques are not appropriate in coping with this problem. We propose a new approach to counter regression attacks. To protect against privacy disclosure, our approach introduces a novel measure, called digression, which assesses the sensitive value disclosure risk in the process of building a regression tree model. Specifically, we develop an algorithm that uses the measure for pruning the tree to limit disclosure of sensitive data. We also propose a dynamic value-concatenation method for anonymizing data, which better preserves data utility than a user-defined generalization scheme commonly used in existing approaches. Our approach can be used for anonymizing both numeric and categorical data. An experimental study is conducted using real-world financial, economic and healthcare data. The results of the experiments demonstrate that the proposed approach is very effective in protecting data privacy while preserving data quality for research and analysis. PMID:26752802

  6. The Tree Worker's Manual.

    ERIC Educational Resources Information Center

    Smithyman, S. J.

    This manual is designed to prepare students for entry-level positions as tree care professionals. Addressed in the individual chapters of the guide are the following topics: the tree service industry; clothing, equipment, and tools; tree workers; basic tree anatomy; techniques of pruning; procedures for climbing and working in the tree; aerial…

  7. Ridge regression processing

    NASA Technical Reports Server (NTRS)

    Kuhl, Mark R.

    1990-01-01

    Current navigation requirements depend on a geometric dilution of precision (GDOP) criterion. As long as the GDOP stays below a specific value, navigation requirements are met. The GDOP will exceed the specified value when the measurement geometry becomes too collinear. A new signal processing technique, called Ridge Regression Processing, can reduce the effects of nearly collinear measurement geometry; thereby reducing the inflation of the measurement errors. It is shown that the Ridge signal processor gives a consistently better mean squared error (MSE) in position than the Ordinary Least Mean Squares (OLS) estimator. The applicability of this technique is currently being investigated to improve the following areas: receiver autonomous integrity monitoring (RAIM), coverage requirements, availability requirements, and precision approaches.
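
    The shrinkage idea behind ridge regression can be illustrated with a minimal sketch. It uses the textbook closed-form ridge estimator on a nearly collinear design, not the navigation signal processor described in the report; the function name `ridge_fit`, the penalty value `lam`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge estimate: solve (X^T X + lam * I) beta = X^T y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

# Hypothetical, nearly collinear design: the second column almost repeats the first.
rng = np.random.default_rng(0)
x1 = rng.normal(size=50)
x2 = x1 + 0.01 * rng.normal(size=50)
X = np.column_stack([x1, x2])
y = x1 + x2 + 0.1 * rng.normal(size=50)

beta_ols = ridge_fit(X, y, lam=0.0)    # lam = 0 reduces to ordinary least squares
beta_ridge = ridge_fit(X, y, lam=1.0)  # lam > 0 shrinks the poorly determined direction
print("OLS coefficients:  ", beta_ols)
print("ridge coefficients:", beta_ridge)
```

    For any positive penalty the ridge coefficient vector has a smaller norm than the least-squares one, which is the mechanism that limits error inflation when the geometry is close to collinear.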

  8. Through bore subsea christmas trees

    SciTech Connect

    Huber, D.S.; Simmers, G.F.C.; Johnson, C.S.

    1985-01-01

    The workovers of subsea completed wells are expensive and time consuming as even the most routine tasks must be carried out by a semi-submersible. This paper describes the economic, safety and operational advantages which led to the development and successful first installation of 'through bore' subsea production trees. The conventional wet subsea trees have proved to be very reliable over the past ten years of operation in the Argyll, Duncan and Innes fields, however the completion strings require pulling on the average about once every three to five years. The conventional subsea tree/tubing hanger set up design requires the tree to be tripped and a rig BOP stack run to pull the tubing. This operation is time consuming, very weather sensitive and leaves the well temporarily without a well control stack on the wellhead. The 7 1/16'' 'through bore' subsea tree was developed to minimize the tubing pulling workover time and several trees have been run successfully since the latter part of 1984. The time saving on a tubing pulling workover is three days. In addition, the design considerably reduces the hazards and equipment damage risk inherent in the conventional design. Hamilton Brothers and National Supply Company in Aberdeen designed the equipment which must be considered a new generation of subsea production trees.

  9. Environmental conditions for alternative tree cover states in high latitudes

    NASA Astrophysics Data System (ADS)

    Abis, Beniamino; Brovkin, Victor

    2016-04-01

    Previous analysis of the vegetation cover from remote sensing revealed the existence of three alternative modes in the frequency distribution of boreal tree cover: a sparsely vegetated treeless state, a savanna-like state, and a forest state. Identifying which are the regions subject to multimodality, and assessing which are the main factors underlying their existence, is important to project future change of natural vegetation cover and its effect on climate. We study the impact on the forest cover fraction distribution of seven globally-observed environmental factors: mean annual rainfall, mean minimum temperature, growing degree days above 0, permafrost distribution, soil moisture, wildfire occurrence frequency, and thawing depth. Through the use of generalised additive models, regression trees, and conditional histograms, we find that the main factors determining the forest distribution in high latitudes are: permafrost distribution, mean annual rainfall, mean minimum temperature, soil moisture, and wildfire frequency. Additionally, we find differences between regions within the boreal area, such as Eurasia, Eastern North America, and Western North America. Furthermore, using a classification based on these factors, we show the existence and location of alternative tree cover states under the same climate conditions in the boreal region. These are areas of potential interest for a more detailed analysis of land-atmosphere interactions.

  10. Training Tree Transducers

    DTIC Science & Technology

    2004-01-01

    trees (similar to the role played by the finite-state acceptor FSA for strings). We describe the version (equivalent to TSG (Schabes, 1990)) where...strictly contained in tree sets of tree adjoining grammars (Joshi and Schabes, 1997). 4 Extended-LHS Tree Transducers (xR) Section 1 informally described...changes without modifying the training procedure, as long as we stick to tree automata. 10 Related Work Tree substitution grammars or TSG (Schabes, 1990

  11. Non-phytoseiid Mesostigmata within citrus orchards in Florida: species distribution, relative and seasonal abundance within trees, associated vines and ground cover plants and additional collection records of mites in citrus orchards.

    PubMed

    Childers, Carl C; Ueckermann, Eduard A

    2015-03-01

    Seven citrus orchards on reduced- to no-pesticide spray programs in central and south central Florida were sampled for non-phytoseiid mesostigmatid mites. Inner and outer canopy leaves, fruits, twigs and trunk scrapings were sampled monthly between August 1994 and January 1996. Open flowers were sampled in March from five of the sites. A total of 431 samples from one or more of 82 vine or ground cover plants were sampled monthly in five of the seven orchards. Two of the seven orchards (Mixon I and II) were on full herbicide programs and vines and ground cover plants were absent. A total of 2,655 mites (26 species) within the families: Ascidae, Blattisociidae, Laelapidae, Macrochelidae, Melicharidae, Pachylaelapidae and Parasitidae were identified. A total of 685 mites in the genus Asca (nine species: family Ascidae) were collected from within tree samples, 79 from vine or ground cover plants. Six species of Blattisociidae were collected: Aceodromus convolvuli, Blattisocius dentriticus, B. keegani, Cheiroseius sp. near jamaicensis, Lasioseius athiashenriotae and L. dentatus. A total of 485 Blattisociidae were collected from within tree samples compared with 167 from vine or ground cover plants. Low numbers of Laelapidae and Macrochelidae were collected from within tree samples. One Zygoseius furciger (Pachylaelapidae) was collected from Eleusine indica. Four species of Melicharidae were identified from 34 mites collected from within tree samples and 1,190 from vine or ground cover plants: Proctolaelaps lobatus was the most abundant species with 1,177 specimens collected from seven ground cover plants. One Phorytocarpais fimetorum (Parasitidae) was collected from inner leaves and four from twigs. Species of Ascidae, Blattisociidae, Melicharidae, Laelapidae and Pachylaelapidae were collected from 31 of the 82 vine or ground cover plants sampled, representing only a small fraction of the total number of Phytoseiidae collected from the same plants. Including the

  12. The influence of tree morphology on stemflow generation in a tropical lowland rainforest

    NASA Astrophysics Data System (ADS)

    Uber, Magdalena; Levia, Delphis F.; Zimmermann, Beate; Zimmermann, Alexander

    2014-05-01

    Even though stemflow usually accounts for only a small proportion of rainfall, it is an important point source of water and ion input to forest floors and may, for instance, influence soil moisture patterns and groundwater recharge. Previous studies showed that the generation of stemflow depends on a multitude of meteorological and biological factors. Interestingly, despite the tremendous progress in stemflow research during the last decades it is still largely unknown which combination of tree characteristics determines stemflow volumes in species-rich tropical forests. This knowledge gap motivated us to analyse the influence of tree characteristics on stemflow volumes in a 1 hectare plot located in a Panamanian lowland rainforest. Our study comprised stemflow measurements in six randomly selected 10 m by 10 m subplots. In each subplot we measured stemflow of all trees with a diameter at breast height (DBH) > 5 cm on an event-basis for a period of six weeks. Additionally, we identified all tree species and determined a set of tree characteristics including DBH, crown diameter, bark roughness, bark furrowing, epiphyte coverage, tree architecture, stem inclination, and crown position. During the sampling period, we collected 985 L of stemflow (0.98 % of total rainfall). Based on regression analyses and comparisons among plant functional groups we show that palms were most efficient in yielding stemflow due to their large inclined fronds. Trees with large emergent crowns also produced relatively large amounts of stemflow. Due to their abundance, understory trees contribute much to stemflow yield not on individual but on the plot scale. Even though parameters such as crown diameter, branch inclination and position of the crown influence stemflow generation to some extent, these parameters explain less than 30 % of the variation in stemflow volumes. In contrast to published results from temperate forests, we did not detect a negative correlation between bark roughness

  13. Insert tree completion system

    SciTech Connect

    Brands, K.W.; Ball, I.G.; Cegielski, E.J.; Gresham, J.S.; Saunders, D.N.

    1982-09-01

    This paper outlines the overall project for development and installation of a low-profile, caisson-installed subsea Christmas tree. After various design studies and laboratory and field tests of key components, a system for installation inside a 30-in. conductor was ordered in July 1978 from Cameron Iron Works Inc. The system is designed to have all critical-pressure-containing components below the mudline and, with the reduced profile (height) above seabed, provides for improved safety of satellite underwater wells from damage by anchors, trawl boards, and even icebergs. In addition to the innovative nature of the tree design, the completion includes improved 3 1/2-in. through flowline (TFL) pumpdown completion equipment with deep set safety valves and a dual detachable packer head for simplified workover capability. The all-hydraulic control system incorporates a new design of sequencing valve for both Christmas tree control and remote flowline connection. A semisubmersible drilling rig was used to initiate the first end flowline connection at the wellhead for subsequent tie-in to the prelaid, surface-towed, all-welded subsea pipeline bundle.

  14. Categorizing ideas about trees: a tree of trees.

    PubMed

    Fisler, Marie; Lecointre, Guillaume

    2013-01-01

    The aim of this study is to explore whether matrices and MP trees used to produce systematic categories of organisms could be useful to produce categories of ideas in history of science. We study the history of the use of trees in systematics to represent the diversity of life from 1766 to 1991. We apply to those ideas a method inspired from coding homologous parts of organisms. We discretize conceptual parts of ideas, writings and drawings about trees contained in 41 main writings; we detect shared parts among authors and code them into a 91-characters matrix and use a tree representation to show who shares what with whom. In other words, we propose a hierarchical representation of the shared ideas about trees among authors: this produces a "tree of trees." Then, we categorize schools of tree-representations. Classical schools like "cladists" and "pheneticists" are recovered but others are not: "gradists" are separated into two blocks, one of them being called here "grade theoreticians." We propose new interesting categories like the "buffonian school," the "metaphoricians," and those using "strictly genealogical classifications." We consider that networks are not useful to represent shared ideas at the present step of the study. A cladogram is made for showing who is sharing what with whom, but also heterobathmy and homoplasy of characters. The present cladogram is not modelling processes of transmission of ideas about trees, and here it is mostly used to test for proximity of ideas of the same age and for categorization.

  15. Tree Tectonics

    NASA Astrophysics Data System (ADS)

    Vogt, Peter R.

    2004-09-01

    Nature often replicates her processes at different scales of space and time in differing media. Here a tree-trunk cross section I am preparing for a dendrochronological display at the Battle Creek Cypress Swamp Nature Sanctuary (Calvert County, Maryland) dried and cracked in a way that replicates practically all the planform features found along the Mid-Oceanic Ridge (see Figure 1). The left-lateral offset of saw marks, contrasting with the right-lateral "rift" offset, even illustrates the distinction between transcurrent (strike-slip) and transform faults, the latter only recognized as a geologic feature, by J. Tuzo Wilson, in 1965. However, wood cracking is but one of many examples of natural processes that replicate one or several elements of lithospheric plate tectonics. Many of these examples occur in everyday venues and thus make great teaching aids, "teachable" from primary school to university levels. Plate tectonics, the dominant process of Earth geology, also occurs in miniature on the surface of some lava lakes, and as "ice plate tectonics" on our frozen seas and lakes. Ice tectonics also happens at larger spatial and temporal scales on the Jovian moons Europa and perhaps Ganymede. Tabletop plate tectonics, in which a molten-paraffin "asthenosphere" is surfaced by a skin of congealing wax "plates," first replicated Mid-Oceanic Ridge type seafloor spreading more than three decades ago. A seismologist (J. Brune, personal communication, 2004) discovered wax plate tectonics by casually and serendipitously pulling a stick across a container of molten wax his wife and daughters had used in making candles. Brune and his student D. Oldenburg followed up and mirabile dictu published the results in Science (178, 301-304).

  16. Does Gene Tree Discordance Explain the Mismatch between Macroevolutionary Models and Empirical Patterns of Tree Shape and Branching Times?

    PubMed Central

    Stadler, Tanja; Degnan, James H.; Rosenberg, Noah A.

    2016-01-01

    Classic null models for speciation and extinction give rise to phylogenies that differ in distribution from empirical phylogenies. In particular, empirical phylogenies are less balanced and have branching times closer to the root compared to phylogenies predicted by common null models. This difference might be due to null models of the speciation and extinction process being too simplistic, or due to the empirical datasets not being representative of random phylogenies. A third possibility arises because phylogenetic reconstruction methods often infer gene trees rather than species trees, producing an incongruity between models that predict species tree patterns and empirical analyses that consider gene trees. We investigate the extent to which the difference between gene trees and species trees under a combined birth–death and multispecies coalescent model can explain the difference in empirical trees and birth–death species trees. We simulate gene trees embedded in simulated species trees and investigate their difference with respect to tree balance and branching times. We observe that the gene trees are less balanced and typically have branching times closer to the root than the species trees. Empirical trees from TreeBase are also less balanced than our simulated species trees, and model gene trees can explain an imbalance increase of up to 8% compared to species trees. However, we see a much larger imbalance increase in empirical trees, about 100%, meaning that additional features must also be causing imbalance in empirical trees. This simulation study highlights the necessity of revisiting the assumptions made in phylogenetic analyses, as these assumptions, such as equating the gene tree with the species tree, might lead to a biased conclusion. PMID:26968785

  17. The Needs of Trees

    ERIC Educational Resources Information Center

    Boyd, Amy E.; Cooper, Jim

    2004-01-01

    Tree rings can be used not only to look at plant growth, but also to make connections between plant growth and resource availability. In this lesson, students in 2nd-4th grades use role-play to become familiar with basic requirements of trees and how availability of those resources is related to tree ring sizes and tree growth. These concepts can…

  18. Evaluating differential effects using regression interactions and regression mixture models

    PubMed Central

    Van Horn, M. Lee; Jaki, Thomas; Masyn, Katherine; Howe, George; Feaster, Daniel J.; Lamont, Andrea E.; George, Melissa R. W.; Kim, Minjung

    2015-01-01

    Research increasingly emphasizes understanding differential effects. This paper focuses on understanding regression mixture models, a relatively new statistical method for assessing differential effects, by comparing results to using an interactive term in linear regression. The research questions which each model answers, their formulation, and their assumptions are compared using Monte Carlo simulations and real data analysis. The capabilities of regression mixture models are described and specific issues to be addressed when conducting regression mixtures are proposed. The paper aims to clarify the role that regression mixtures can take in the estimation of differential effects and increase awareness of the benefits and potential pitfalls of this approach. Regression mixture models are shown to be a potentially effective exploratory method for finding differential effects when these effects can be defined by a small number of classes of respondents who share a typical relationship between a predictor and an outcome. It is also shown that the comparison between regression mixture models and interactions becomes substantially more complex as the number of classes increases. It is argued that regression interactions are well suited for direct tests of specific hypotheses about differential effects and regression mixtures provide a useful approach for exploring effect heterogeneity given adequate samples and study design. PMID:26556903
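
    The interaction-term side of this comparison can be sketched in a few lines. The example below is an illustration under stated assumptions, not the authors' analysis: the predictor `x`, the 0/1 group indicator `g`, and the noiseless outcome are hypothetical, and the model is fitted by ordinary least squares with NumPy. Regression mixture models are not shown.

```python
import numpy as np

# Hypothetical data: x is a predictor, g a 0/1 group indicator.
x = np.linspace(0.0, 1.0, 20)
g = np.tile([0.0, 1.0], 10)
y = 1.0 + 2.0 * x + 0.5 * g + 3.0 * x * g   # noiseless; the slope differs by group

# Design matrix with intercept, both main effects, and the x*g interaction column.
X = np.column_stack([np.ones_like(x), x, g, x * g])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
b0, b1, b2, b3 = coef

print("slope in group 0:", b1)
print("slope in group 1:", b1 + b3)
```

    The interaction coefficient `b3` is the difference in slopes between the two groups, which is why an interaction model suits a direct test of a differential effect across groups that are known in advance.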

  19. Large unbalanced credit scoring using Lasso-logistic regression ensemble.

    PubMed

    Wang, Hong; Xu, Qingsong; Zhou, Lifeng

    2015-01-01

    Recently, various ensemble learning methods with different base classifiers have been proposed for credit scoring problems. However, for various reasons, there has been little research using logistic regression as the base classifier. In this paper, given large unbalanced data, we consider the plausibility of ensemble learning using regularized logistic regression as the base classifier to deal with credit scoring problems. In this research, the data is first balanced and diversified by clustering and bagging algorithms. Then we apply a Lasso-logistic regression learning ensemble to evaluate the credit risks. We show that the proposed algorithm outperforms popular credit scoring models such as decision tree, Lasso-logistic regression and random forests in terms of AUC and F-measure. We also provide two importance measures for the proposed model to identify important variables in the data.

  20. Large Unbalanced Credit Scoring Using Lasso-Logistic Regression Ensemble

    PubMed Central

    Wang, Hong; Xu, Qingsong; Zhou, Lifeng

    2015-01-01

    Recently, various ensemble learning methods with different base classifiers have been proposed for credit scoring problems. However, for various reasons, there has been little research using logistic regression as the base classifier. In this paper, given large unbalanced data, we consider the plausibility of ensemble learning using regularized logistic regression as the base classifier to deal with credit scoring problems. In this research, the data is first balanced and diversified by clustering and bagging algorithms. Then we apply a Lasso-logistic regression learning ensemble to evaluate the credit risks. We show that the proposed algorithm outperforms popular credit scoring models such as decision tree, Lasso-logistic regression and random forests in terms of AUC and F-measure. We also provide two importance measures for the proposed model to identify important variables in the data. PMID:25706988

  1. Censored partial regression.

    PubMed

    Orbe, Jesus; Ferreira, Eva; Núñez-Antón, Vicente

    2003-01-01

    In this work we study the effect of several covariates on a censored response variable with unknown probability distribution. A semiparametric model is proposed to consider situations where the functional form of the effect of one or more covariates is unknown, as is the case in the application presented in this work. We provide its estimation procedure and, in addition, a bootstrap technique to make inference on the parameters. A simulation study has been carried out to show the good performance of the proposed estimation process and to analyse the effect of the censorship. Finally, we present the results when the methodology is applied to AIDS diagnosed patients.

  2. 77 FR 22663 - Asian Longhorned Beetle; Additions to Quarantined Areas in Massachusetts

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-04-17

    ... trees. It attacks many healthy hardwood trees, including maple, horse chestnut, birch, poplar, willow, and elm. In addition, nursery stock, logs, green lumber, firewood, stumps, roots, branches, and...

  3. Tea tree oil.

    PubMed

    Hartford, Orville; Zug, Kathryn A

    2005-09-01

    Tea tree oil is a popular ingredient in many over-the-counter healthcare and cosmetic products. With the explosion of the natural and alternative medicine industry, more and more people are using products containing tea tree oil. This article reviews basic information about tea tree oil and contact allergy, including sources of tea tree oil, chemical composition, potential cross reactions, reported cases of allergic contact dermatitis, allergenic compounds in tea tree oil, practical patch testing information, and preventive measures.

  4. Evaluating Differential Effects Using Regression Interactions and Regression Mixture Models

    ERIC Educational Resources Information Center

    Van Horn, M. Lee; Jaki, Thomas; Masyn, Katherine; Howe, George; Feaster, Daniel J.; Lamont, Andrea E.; George, Melissa R. W.; Kim, Minjung

    2015-01-01

    Research increasingly emphasizes understanding differential effects. This article focuses on understanding regression mixture models, which are relatively new statistical methods for assessing differential effects by comparing results to using an interactive term in linear regression. The research questions which each model answers, their…

  5. Error bounds in cascading regressions

    USGS Publications Warehouse

    Karlinger, M.R.; Troutman, B.M.

    1985-01-01

    Cascading regressions is a technique for predicting a value of a dependent variable when no paired measurements exist to perform a standard regression analysis. Biases in coefficients of a cascaded-regression line as well as error variance of points about the line are functions of the correlation coefficient between dependent and independent variables. Although this correlation cannot be computed because of the lack of paired data, bounds can be placed on errors through the required properties of the correlation coefficient. The potential mean squared error of a cascaded-regression prediction can be large, as illustrated through an example using geomorphologic data. © 1985 Plenum Publishing Corporation.

  6. The Allometry of Coarse Root Biomass: Log-Transformed Linear Regression or Nonlinear Regression?

    PubMed Central

    Lai, Jiangshan; Yang, Bo; Lin, Dunmei; Kerkhoff, Andrew J.; Ma, Keping

    2013-01-01

    Precise estimation of root biomass is important for understanding carbon stocks and dynamics in forests. Traditionally, biomass estimates are based on allometric scaling relationships between stem diameter and coarse root biomass calculated using linear regression (LR) on log-transformed data. Recently, it has been suggested that nonlinear regression (NLR) is a preferable fitting method for scaling relationships. But while this claim has been contested on both theoretical and empirical grounds, and statistical methods have been developed to aid in choosing between the two methods in particular cases, few studies have examined the ramifications of erroneously applying NLR. Here, we use direct measurements of 159 trees belonging to three locally dominant species in east China to compare the LR and NLR models of diameter-root biomass allometry. We then contrast model predictions by estimating stand coarse root biomass based on census data from the nearby 24-ha Gutianshan forest plot and by testing the ability of the models to predict known root biomass values measured on multiple tropical species at the Pasoh Forest Reserve in Malaysia. Based on likelihood estimates for model error distributions, as well as the accuracy of extrapolative predictions, we find that LR on log-transformed data is superior to NLR for fitting diameter-root biomass scaling models. More importantly, inappropriately using NLR leads to grossly inaccurate stand biomass estimates, especially for stands dominated by smaller trees. PMID:24116197
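
    The LR approach named in this abstract, fitting a power law by linear regression on log-transformed data, can be sketched as follows. The function name `fit_power_law_loglog` and the diameter and biomass values are hypothetical and noiseless, chosen only so that the recovered parameters are easy to check; they are not the study's measurements. The NLR alternative is not shown.

```python
import numpy as np

def fit_power_law_loglog(d, m):
    """Fit m = a * d**b by least squares on log(m) = log(a) + b * log(d)."""
    b, log_a = np.polyfit(np.log(d), np.log(m), 1)
    return np.exp(log_a), b

# Hypothetical diameters (cm) and noiseless biomass values (kg), for illustration only.
diameter = np.array([5.0, 10.0, 20.0, 40.0, 80.0])
biomass = 0.02 * diameter ** 2.3

a_hat, b_hat = fit_power_law_loglog(diameter, biomass)
print("a =", a_hat, "b =", b_hat)
```

    Fitting on the log scale treats errors as multiplicative, so small and large trees carry comparable weight; a nonlinear fit on the original scale is instead dominated by the largest trees.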

  7. Trees grow on money: urban tree canopy cover and environmental justice.

    PubMed

    Schwarz, Kirsten; Fragkias, Michail; Boone, Christopher G; Zhou, Weiqi; McHale, Melissa; Grove, J Morgan; O'Neil-Dunne, Jarlath; McFadden, Joseph P; Buckley, Geoffrey L; Childers, Dan; Ogden, Laura; Pincetl, Stephanie; Pataki, Diane; Whitmer, Ali; Cadenasso, Mary L

    2015-01-01

    This study examines the distributional equity of urban tree canopy (UTC) cover for Baltimore, MD, Los Angeles, CA, New York, NY, Philadelphia, PA, Raleigh, NC, Sacramento, CA, and Washington, D.C. using high spatial resolution land cover data and census data. Data are analyzed at the Census Block Group levels using Spearman's correlation, ordinary least squares regression (OLS), and a spatial autoregressive model (SAR). Across all cities there is a strong positive correlation between UTC cover and median household income. Negative correlations between race and UTC cover exist in bivariate models for some cities, but they are generally not observed using multivariate regressions that include additional variables on income, education, and housing age. SAR models result in higher r-square values compared to the OLS models across all cities, suggesting that spatial autocorrelation is an important feature of our data. Similarities among cities can be found based on shared characteristics of climate, race/ethnicity, and size. Our findings suggest that a suite of variables, including income, contribute to the distribution of UTC cover. These findings can help target simultaneous strategies for UTC goals and environmental justice concerns.

  8. Trees Grow on Money: Urban Tree Canopy Cover and Environmental Justice

    PubMed Central

    Schwarz, Kirsten; Fragkias, Michail; Boone, Christopher G.; Zhou, Weiqi; McHale, Melissa; Grove, J. Morgan; O’Neil-Dunne, Jarlath; McFadden, Joseph P.; Buckley, Geoffrey L.; Childers, Dan; Ogden, Laura; Pincetl, Stephanie; Pataki, Diane; Whitmer, Ali; Cadenasso, Mary L.

    2015-01-01

    This study examines the distributional equity of urban tree canopy (UTC) cover for Baltimore, MD, Los Angeles, CA, New York, NY, Philadelphia, PA, Raleigh, NC, Sacramento, CA, and Washington, D.C. using high spatial resolution land cover data and census data. Data are analyzed at the Census Block Group levels using Spearman’s correlation, ordinary least squares regression (OLS), and a spatial autoregressive model (SAR). Across all cities there is a strong positive correlation between UTC cover and median household income. Negative correlations between race and UTC cover exist in bivariate models for some cities, but they are generally not observed using multivariate regressions that include additional variables on income, education, and housing age. SAR models result in higher r-square values compared to the OLS models across all cities, suggesting that spatial autocorrelation is an important feature of our data. Similarities among cities can be found based on shared characteristics of climate, race/ethnicity, and size. Our findings suggest that a suite of variables, including income, contribute to the distribution of UTC cover. These findings can help target simultaneous strategies for UTC goals and environmental justice concerns. PMID:25830303

  9. National assessment of Tree City USA participation

    EPA Pesticide Factsheets

    Tree City USA is a national program that recognizes municipal commitment to community forestry. In return for meeting program requirements, Tree City USA participants expect social, economic, and/or environmental benefits. Understanding the geographic distribution and socioeconomic characteristics of Tree City USA communities at the national scale can offer insights into the motivations or barriers to program participation, and provide context for community forestry research at finer scales. In this study, researchers assessed patterns in Tree City USA participation for all U.S. communities with more than 2,500 people according to geography, community population size, and socioeconomic characteristics, such as income, education, and race. Nationally, 23.5% of communities studied were Tree City USA participants, and this accounted for 53.9% of the total population in these communities. Tree City USA participation rates varied substantially by U.S. region, but in each region participation rates were higher in larger communities, and long-term participants tended to be larger communities than more recent enrollees. In logistic regression models, owner occupancy rates were significant negative predictors of Tree City USA participation, education and percent white population were positive predictors in many U.S. regions, and inconsistent patterns were observed for income and population age. The findings indicate that communities with smaller populations, lower educat

  10. On regression adjustment for the propensity score.

    PubMed

    Vansteelandt, S; Daniel, R M

    2014-10-15

    Propensity scores are widely adopted in observational research because they enable adjustment for high-dimensional confounders without requiring models for their association with the outcome of interest. The results of statistical analyses based on stratification, matching or inverse weighting by the propensity score are therefore less susceptible to model extrapolation than those based solely on outcome regression models. This is attractive because extrapolation in outcome regression models may be alarming, yet difficult to diagnose, when the exposed and unexposed individuals have very different covariate distributions. Standard regression adjustment for the propensity score forms an alternative to the aforementioned propensity score methods, but the benefits of this are less clear because it still involves modelling the outcome in addition to the propensity score. In this article, we develop novel insights into the properties of this adjustment method. We demonstrate that standard tests of the null hypothesis of no exposure effect (based on robust variance estimators), as well as particular standardised effects obtained from such adjusted regression models, are robust against misspecification of the outcome model when a propensity score model is correctly specified; they are thus not vulnerable to the aforementioned problem of extrapolation. We moreover propose efficient estimators for these standardised effects, which retain a useful causal interpretation even when the propensity score model is misspecified, provided the outcome regression model is correctly specified.

  11. Logistic Regression: Concept and Application

    ERIC Educational Resources Information Center

    Cokluk, Omay

    2010-01-01

    The main focus of logistic regression analysis is classification of individuals in different groups. The aim of the present study is to explain basic concepts and processes of binary logistic regression analysis intended to determine the combination of independent variables which best explain the membership in certain groups called dichotomous…

  12. Precision Efficacy Analysis for Regression.

    ERIC Educational Resources Information Center

    Brooks, Gordon P.

    When multiple linear regression is used to develop a prediction model, sample size must be large enough to ensure stable coefficients. If the derivation sample size is inadequate, the model may not predict well for future subjects. The precision efficacy analysis for regression (PEAR) method uses a cross-validity approach to select sample sizes…

  13. Forest Management Intensity Affects Aquatic Communities in Artificial Tree Holes

    PubMed Central

    Petermann, Jana S.; Rohland, Anja; Sichardt, Nora; Lade, Peggy; Guidetti, Brenda; Weisser, Wolfgang W.; Gossner, Martin M.

    2016-01-01

    Forest management could potentially affect organisms in all forest habitats. However, aquatic communities in water-filled tree-holes may be especially sensitive because of small population sizes, the risk of drought and potential dispersal limitation. We set up artificial tree holes in forest stands subject to different management intensities in two regions in Germany and assessed the influence of local environmental properties (tree-hole opening type, tree diameter, water volume and water temperature) as well as regional drivers (forest management intensity, tree-hole density) on tree-hole insect communities (not considering other organisms such as nematodes or rotifers), detritus content, oxygen and nutrient concentrations. In addition, we compared data from artificial tree holes with data from natural tree holes in the same area to evaluate the methodological approach of using tree-hole analogues. We found that forest management had strong effects on communities in artificial tree holes in both regions and across the season. Abundance and species richness declined, community composition shifted and detritus content declined with increasing forest management intensity. Environmental variables, such as tree-hole density and tree diameter partly explained these changes. However, dispersal limitation, indicated by effects of tree-hole density, generally showed rather weak impacts on communities. Artificial tree holes had higher water temperatures (on average 2°C higher) and oxygen concentrations (on average 25% higher) than natural tree holes. The abundance of organisms was higher but species richness was lower in artificial tree holes. Community composition differed between artificial and natural tree holes. Negative management effects were detectable in both tree-hole systems, despite their abiotic and biotic differences. Our results indicate that forest management has substantial and pervasive effects on tree-hole communities and may alter their structure and

  14. Forest Management Intensity Affects Aquatic Communities in Artificial Tree Holes.

    PubMed

    Petermann, Jana S; Rohland, Anja; Sichardt, Nora; Lade, Peggy; Guidetti, Brenda; Weisser, Wolfgang W; Gossner, Martin M

    2016-01-01

    Forest management could potentially affect organisms in all forest habitats. However, aquatic communities in water-filled tree-holes may be especially sensitive because of small population sizes, the risk of drought and potential dispersal limitation. We set up artificial tree holes in forest stands subject to different management intensities in two regions in Germany and assessed the influence of local environmental properties (tree-hole opening type, tree diameter, water volume and water temperature) as well as regional drivers (forest management intensity, tree-hole density) on tree-hole insect communities (not considering other organisms such as nematodes or rotifers), detritus content, oxygen and nutrient concentrations. In addition, we compared data from artificial tree holes with data from natural tree holes in the same area to evaluate the methodological approach of using tree-hole analogues. We found that forest management had strong effects on communities in artificial tree holes in both regions and across the season. Abundance and species richness declined, community composition shifted and detritus content declined with increasing forest management intensity. Environmental variables, such as tree-hole density and tree diameter partly explained these changes. However, dispersal limitation, indicated by effects of tree-hole density, generally showed rather weak impacts on communities. Artificial tree holes had higher water temperatures (on average 2°C higher) and oxygen concentrations (on average 25% higher) than natural tree holes. The abundance of organisms was higher but species richness was lower in artificial tree holes. Community composition differed between artificial and natural tree holes. Negative management effects were detectable in both tree-hole systems, despite their abiotic and biotic differences. Our results indicate that forest management has substantial and pervasive effects on tree-hole communities and may alter their structure and

  15. Categorizing Ideas about Trees: A Tree of Trees

    PubMed Central

    Fisler, Marie; Lecointre, Guillaume

    2013-01-01

    The aim of this study is to explore whether matrices and MP trees used to produce systematic categories of organisms could be useful to produce categories of ideas in history of science. We study the history of the use of trees in systematics to represent the diversity of life from 1766 to 1991. We apply to those ideas a method inspired from coding homologous parts of organisms. We discretize conceptual parts of ideas, writings and drawings about trees contained in 41 main writings; we detect shared parts among authors and code them into a 91-characters matrix and use a tree representation to show who shares what with whom. In other words, we propose a hierarchical representation of the shared ideas about trees among authors: this produces a “tree of trees.” Then, we categorize schools of tree-representations. Classical schools like “cladists” and “pheneticists” are recovered but others are not: “gradists” are separated into two blocks, one of them being called here “grade theoreticians.” We propose new interesting categories like the “buffonian school,” the “metaphoricians,” and those using “strictly genealogical classifications.” We consider that networks are not useful to represent shared ideas at the present step of the study. A cladogram is made for showing who is sharing what with whom, but also heterobathmy and homoplasy of characters. The present cladogram is not modelling processes of transmission of ideas about trees, and here it is mostly used to test for proximity of ideas of the same age and for categorization. PMID:23950877

  16. Methane Emissions from Upland Trees

    NASA Astrophysics Data System (ADS)

    Pitz, S.; Megonigal, P.; Schile, L. M.; Szlavecz, K. A.; King, K.

    2013-12-01

    Most work on methane (CH4) emissions from natural ecosystems has focused on wetlands and wetland soils because they are predictable emitters and relatively simple to quantify. Less attention has been directed toward upland ecosystems that cover far larger areas, but are assumed to be too dry to emit CH4. There is abundant evidence that upland ecosystems emit small amounts of CH4 during hot moments that collectively constitute a significant source in the global budget of this potent greenhouse gas. We have established two transects across natural moisture gradients in two forests near Annapolis, Maryland. Both tree and soil methane fluxes were measured using chamber methods. Each tree chamber was custom fit to the stem near the base. In addition, porewater methane concentrations were collected at multiple depths near trees. Abiotic parameters such as soil temperature, soil moisture, water potential, and depth to groundwater were monitored using a wireless sensor network. Upland emissions from tree stems were as high as 14.6 µmol CH4 m⁻² hr⁻¹ while the soil uptake was -1.5 µmol CH4 m⁻² hr⁻¹. These results demonstrate that tree methane emissions and soil methane uptake can occur simultaneously in a mesic forest. Factors controlling methane emissions were soil temperature, soil moisture, and depth to groundwater. Based on our preliminary data, tree-mediated methane emissions may be offsetting the soil methane sink of upland forests by 20 to 30%. Future methane budgets and climate models will need to include tree fluxes and the parameters that control methane emissions for accurate accounting and predictions.

  17. Category of trees in representation theory of quantum algebras

    SciTech Connect

    Moskaliuk, N. M.; Moskaliuk, S. S.

    2013-10-15

    New applications of categorical methods are connected with new additional structures on categories. One of such structures in representation theory of quantum algebras, the category of Kuznetsov-Smorodinsky-Vilenkin-Smirnov (KSVS) trees, is constructed, whose objects are finite rooted KSVS trees and morphisms generated by the transition from a KSVS tree to another one.

  18. Rank regression: an alternative regression approach for data with outliers.

    PubMed

    Chen, Tian; Tang, Wan; Lu, Ying; Tu, Xin

    2014-10-01

    Linear regression models are widely used in mental health and related health services research. However, the classic linear regression analysis assumes that the data are normally distributed, an assumption that is not met by the data obtained in many studies. One method of dealing with this problem is to use semi-parametric models, which do not require that the data be normally distributed. But semi-parametric models are quite sensitive to outlying observations, so the generated estimates are unreliable when study data includes outliers. In this situation, some researchers trim the extreme values prior to conducting the analysis, but the ad-hoc rules used for data trimming are based on subjective criteria so different methods of adjustment can yield different results. Rank regression provides a more objective approach to dealing with non-normal data that includes outliers. This paper uses simulated and real data to illustrate this useful regression approach for dealing with outliers and compares it to the results generated using classical regression models and semi-parametric regression models.

  19. Evolution of tree nutrition.

    PubMed

    Raven, John A; Andrews, Mitchell

    2010-09-01

    Using a broad definition of trees, the evolutionary origins of trees in a nutritional context are considered using data from the fossil record and molecular phylogeny. Trees are first known from the Late Devonian, about 380 million years ago, and originated polyphyletically at the pteridophyte grade of organization; the earliest gymnosperms were trees, and trees are polyphyletic in the angiosperms. Nutrient transporters, assimilatory pathways, homoiohydry (cuticle, intercellular gas spaces, stomata, endohydric water transport systems including xylem and phloem-like tissue) and arbuscular mycorrhizas preceded the origin of trees. Nutritional innovations that began uniquely in trees were the seed habit and, certainly (but not necessarily uniquely) in trees, ectomycorrhizas, cyanobacterial, actinorhizal and rhizobial (Parasponia, some legumes) diazotrophic symbioses and cluster roots.

  20. Tree Classification Software

    NASA Technical Reports Server (NTRS)

    Buntine, Wray

    1993-01-01

    This paper introduces the IND Tree Package to prospective users. IND does supervised learning using classification trees. This learning task is a basic tool used in the development of diagnosis, monitoring and expert systems. The IND Tree Package was developed as part of a NASA project to semi-automate the development of data analysis and modelling algorithms using artificial intelligence techniques. The IND Tree Package integrates features from CART and C4 with newer Bayesian and minimum encoding methods for growing classification trees and graphs. The IND Tree Package also provides an experimental control suite on top. The newer features give improved probability estimates often required in diagnostic and screening tasks. The package comes with a manual, Unix 'man' entries, and a guide to tree methods and research. The IND Tree Package is implemented in C under Unix and was beta-tested at university and commercial research laboratories in the United States.

  1. Decision Tree Approach for Soil Liquefaction Assessment

    PubMed Central

    Gandomi, Amir H.; Fridline, Mark M.; Roke, David A.

    2013-01-01

    In the current study, the performances of some decision tree (DT) techniques are evaluated for postearthquake soil liquefaction assessment. A database containing 620 records of seismic parameters and soil properties is used in this study. Three decision tree techniques are used here in two different ways, considering statistical and engineering points of view, to develop decision rules. The DT results are compared to the logistic regression (LR) model. The results of this study indicate that the DTs not only successfully predict liquefaction but they can also outperform the LR model. The best DT models are interpreted and evaluated based on an engineering point of view. PMID:24489498

  2. Decision tree approach for soil liquefaction assessment.

    PubMed

    Gandomi, Amir H; Fridline, Mark M; Roke, David A

    2013-01-01

    In the current study, the performances of some decision tree (DT) techniques are evaluated for postearthquake soil liquefaction assessment. A database containing 620 records of seismic parameters and soil properties is used in this study. Three decision tree techniques are used here in two different ways, considering statistical and engineering points of view, to develop decision rules. The DT results are compared to the logistic regression (LR) model. The results of this study indicate that the DTs not only successfully predict liquefaction but they can also outperform the LR model. The best DT models are interpreted and evaluated based on an engineering point of view.

  3. Practical Session: Simple Linear Regression

    NASA Astrophysics Data System (ADS)

    Clausel, M.; Grégoire, G.

    2014-12-01

    Two exercises are proposed to illustrate simple linear regression. The first one is based on Galton's famous data set on heredity. We use the lm R command and get coefficient estimates, the residual standard error, R², residuals … In the second example, devoted to data related to the vapor tension of mercury, we fit a simple linear regression, predict values, and anticipate multiple linear regression. This practical session is an excerpt from practical exercises proposed by A. Dalalyan at EPNC (see Exercises 1 and 2 of http://certis.enpc.fr/~dalalyan/Download/TP_ENPC_4.pdf).
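
    The quantities listed in this abstract (coefficient estimates, residual standard error, R-squared, residuals) can be computed directly from the least-squares formulas. The sketch below uses Python with NumPy rather than R's lm, and synthetic data rather than Galton's data set; the function name `simple_linear_regression` and the simulated values are assumptions for illustration.

```python
import numpy as np

def simple_linear_regression(x, y):
    """Fit y = a + b*x by ordinary least squares and return summary statistics."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    x_mean, y_mean = x.mean(), y.mean()
    sxx = np.sum((x - x_mean) ** 2)
    slope = np.sum((x - x_mean) * (y - y_mean)) / sxx
    intercept = y_mean - slope * x_mean
    residuals = y - (intercept + slope * x)
    rss = np.sum(residuals ** 2)          # residual sum of squares
    tss = np.sum((y - y_mean) ** 2)       # total sum of squares
    return {
        "intercept": intercept,
        "slope": slope,
        "residual_se": np.sqrt(rss / (n - 2)),  # n - 2 degrees of freedom
        "r_squared": 1.0 - rss / tss,
        "residuals": residuals,
    }

# Synthetic illustration data (NOT Galton's data): y = 2 + 0.5 x + noise.
rng = np.random.default_rng(0)
x = rng.uniform(60.0, 75.0, size=200)
y = 2.0 + 0.5 * x + rng.normal(scale=1.0, size=200)

fit = simple_linear_regression(x, y)
print(fit["intercept"], fit["slope"], fit["residual_se"], fit["r_squared"])

# Prediction at a new x value, as in the second exercise.
x_new = 70.0
print(fit["intercept"] + fit["slope"] * x_new)
```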

  4. A tutorial on Bayesian Normal linear regression

    NASA Astrophysics Data System (ADS)

    Klauenberg, Katy; Wübbeler, Gerd; Mickan, Bodo; Harris, Peter; Elster, Clemens

    2015-12-01

    Regression is a common task in metrology and often applied to calibrate instruments, evaluate inter-laboratory comparisons or determine fundamental constants, for example. Yet, a regression model cannot be uniquely formulated as a measurement function, and consequently the Guide to the Expression of Uncertainty in Measurement (GUM) and its supplements are not applicable directly. Bayesian inference, however, is well suited to regression tasks, and has the advantage of accounting for additional a priori information, which typically robustifies analyses. Furthermore, it is anticipated that future revisions of the GUM shall also embrace the Bayesian view. Guidance on Bayesian inference for regression tasks is largely lacking in metrology. For linear regression models with Gaussian measurement errors this tutorial gives explicit guidance. Divided into three steps, the tutorial first illustrates how a priori knowledge, which is available from previous experiments, can be translated into prior distributions from a specific class. These prior distributions have the advantage of yielding analytical, closed form results, thus avoiding the need to apply numerical methods such as Markov Chain Monte Carlo. Secondly, formulas for the posterior results are given, explained and illustrated, and software implementations are provided. In the third step, Bayesian tools are used to assess the assumptions behind the suggested approach. These three steps (prior elicitation, posterior calculation, and robustness to prior uncertainty and model adequacy) are critical to Bayesian inference. The general guidance given here for Normal linear regression tasks is accompanied by a simple, but real-world, metrological example. The calibration of a flow device serves as a running example and illustrates the three steps. It is shown that prior knowledge from previous calibrations of the same sonic nozzle enables robust predictions even for extrapolations.
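
    A closed-form posterior of the kind this abstract mentions can be sketched with the standard normal-inverse-gamma conjugate prior. Whether this matches the tutorial's exact prior class and parametrization is an assumption; the function name `nig_posterior`, the hyperparameter values, and the synthetic straight-line data are hypothetical and stand in for the flow-device calibration example.

```python
import numpy as np

def nig_posterior(X, y, m0, V0, a0, b0):
    """Conjugate update for y = X beta + e, e ~ N(0, sigma2 * I), with prior
    beta | sigma2 ~ N(m0, sigma2 * V0) and sigma2 ~ InvGamma(a0, b0)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    V0_inv = np.linalg.inv(V0)
    Vn_inv = V0_inv + X.T @ X
    Vn = np.linalg.inv(Vn_inv)
    mn = Vn @ (V0_inv @ m0 + X.T @ y)
    an = a0 + 0.5 * len(y)
    bn = b0 + 0.5 * (y @ y + m0 @ V0_inv @ m0 - mn @ Vn_inv @ mn)
    return mn, Vn, an, bn

# Synthetic straight-line calibration data (hypothetical values).
rng = np.random.default_rng(3)
x = np.linspace(0.0, 1.0, 50)
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x + rng.normal(scale=0.1, size=x.size)

m0 = np.zeros(2)           # vague prior mean (assumption)
V0 = 100.0 * np.eye(2)     # vague prior scale (assumption)
a0, b0 = 2.0, 0.02         # weak prior on the noise variance (assumption)

mn, Vn, an, bn = nig_posterior(X, y, m0, V0, a0, b0)
sigma2_mean = bn / (an - 1.0)     # posterior mean of sigma2, valid for an > 1
beta_cov = sigma2_mean * Vn       # posterior covariance of beta, valid for an > 1
print("posterior mean of beta:", mn)
print("posterior std of beta:", np.sqrt(np.diag(beta_cov)))
```

    With a vague prior the posterior mean stays close to the ordinary least-squares estimate; an informative prior from earlier calibrations would enter through `m0` and `V0`. No Markov Chain Monte Carlo is needed because every step is analytical.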

  5. Modeling confounding by half-sibling regression

    PubMed Central

    Schölkopf, Bernhard; Hogg, David W.; Wang, Dun; Foreman-Mackey, Daniel; Janzing, Dominik; Simon-Gabriel, Carl-Johann; Peters, Jonas

    2016-01-01

    We describe a method for removing the effect of confounders to reconstruct a latent quantity of interest. The method, referred to as “half-sibling regression,” is inspired by recent work in causal inference using additive noise models. We provide a theoretical justification, discussing both independent and identically distributed as well as time series data, respectively, and illustrate the potential of the method in a challenging astronomy application. PMID:27382154

  6. Modeling confounding by half-sibling regression.

    PubMed

    Schölkopf, Bernhard; Hogg, David W; Wang, Dun; Foreman-Mackey, Daniel; Janzing, Dominik; Simon-Gabriel, Carl-Johann; Peters, Jonas

    2016-07-05

    We describe a method for removing the effect of confounders to reconstruct a latent quantity of interest. The method, referred to as "half-sibling regression," is inspired by recent work in causal inference using additive noise models. We provide a theoretical justification, discussing both independent and identically distributed as well as time series data, respectively, and illustrate the potential of the method in a challenging astronomy application.
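
    The idea can be sketched as follows, assuming the additive setup in which the observed target is a latent signal plus a contribution from a shared confounder, and other observed quantities (the "half-siblings") depend on the same confounder but not on the signal. The estimate is then the target minus its regression on the half-siblings. The linear regression, the function name `half_sibling_regression`, and the simulated data are assumptions for illustration, not the astronomy application.

```python
import numpy as np

def half_sibling_regression(y, X):
    """Estimate the latent signal as y - E[y | X], here with a linear fit.
    The result is centered, so the signal is recovered only up to its mean."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return y - A @ coef

rng = np.random.default_rng(1)
n = 2000
confounder = rng.normal(size=n)          # shared systematic effect
signal = rng.normal(size=n)              # latent quantity of interest
y = signal + 1.5 * confounder            # observed, confounded target

# Half-siblings: driven by the same confounder, independent of the signal.
X = np.column_stack([
    0.8 * confounder + 0.1 * rng.normal(size=n),
    -1.2 * confounder + 0.1 * rng.normal(size=n),
])

signal_hat = half_sibling_regression(y, X)
corr_before = np.corrcoef(y, signal)[0, 1]
corr_after = np.corrcoef(signal_hat, signal)[0, 1]
print(corr_before, corr_after)
```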

  7. Illumination Under Trees

    SciTech Connect

    Max, N

    2002-08-19

    This paper is a survey of the author's work on illumination and shadows under trees, including the effects of sky illumination, sun penumbras, scattering in a misty atmosphere below the trees, and multiple scattering and transmission between leaves. It also describes a hierarchical image-based rendering method for trees.

  8. The Wish Tree Project

    ERIC Educational Resources Information Center

    Brooks, Sarah DeWitt

    2010-01-01

    This article describes the author's experience in implementing a Wish Tree project in her school in an effort to bring the school community together with a positive art-making experience during a potentially stressful time. The concept of a wish tree is simple: plant a tree; provide tags and pencils for writing wishes; and encourage everyone to…

  9. Diary of a Tree.

    ERIC Educational Resources Information Center

    Srulowitz, Frances

    1992-01-01

    Describes an activity to develop students' skills of observation and recordkeeping by studying the growth of a tree's leaves during the spring. Children monitor the growth of 11 trees over a 2-month period, draw pictures of the tree at different stages of growth, and write diaries of the tree's growth. (MDH)

  10. Minnesota's Forest Trees. Revised.

    ERIC Educational Resources Information Center

    Miles, William R.; Fuller, Bruce L.

    This bulletin describes 46 of the more common trees found in Minnesota's forests and windbreaks. The bulletin contains two tree keys, a summer key and a winter key, to help the reader identify these trees. Besides the two keys, the bulletin includes an introduction, instructions for key use, illustrations of leaf characteristics and twig…

  11. Winter Birch Trees

    ERIC Educational Resources Information Center

    Sweeney, Debra; Rounds, Judy

    2011-01-01

    Trees are great inspiration for artists. Many art teachers find themselves inspired and maybe somewhat obsessed with the natural beauty and elegance of the lofty tree, and how it changes through the seasons. One such tree that grows in several regions and always looks magnificent, regardless of the time of year, is the birch. In this article, the…

  12. Food additives

    PubMed Central

    Spencer, Michael

    1974-01-01

    Food additives are discussed from the food technology point of view. The reasons for their use are summarized: (1) to protect food from chemical and microbiological attack; (2) to even out seasonal supplies; (3) to improve their eating quality; (4) to improve their nutritional value. The various types of food additives are considered, e.g. colours, flavours, emulsifiers, bread and flour additives, preservatives, and nutritional additives. The paper concludes with consideration of those circumstances in which the use of additives is (a) justified and (b) unjustified. PMID:4467857

  13. Multiple Regression and Its Discontents

    ERIC Educational Resources Information Center

    Snell, Joel C.; Marsh, Mitchell

    2012-01-01

    Multiple regression is part of a larger statistical strategy originated by Gauss. The authors raise questions about the theory and suggest some changes that would make room for Mandelbrot and Serendipity.

  14. Model building in nonproportional hazard regression.

    PubMed

    Rodríguez-Girondo, Mar; Kneib, Thomas; Cadarso-Suárez, Carmen; Abu-Assi, Emad

    2013-12-30

    Recent developments of statistical methods allow for a very flexible modeling of covariates affecting survival times via the hazard rate, including also the inspection of possible time-dependent associations. Despite their immediate appeal in terms of flexibility, these models typically introduce additional difficulties when a subset of covariates and the corresponding modeling alternatives have to be chosen, that is, for building the most suitable model for given data. This is particularly true when potentially time-varying associations are given. We propose to conduct a piecewise exponential representation of the original survival data to link hazard regression with estimation schemes based on the Poisson likelihood to make recent advances for model building in exponential family regression accessible also in the nonproportional hazard regression context. A two-stage stepwise selection approach, an approach based on doubly penalized likelihood, and a componentwise functional gradient descent approach are adapted to the piecewise exponential regression problem. These three techniques were compared via an intensive simulation study. An application to prognosis after discharge for patients who suffered a myocardial infarction supplements the simulation to demonstrate the pros and cons of the approaches in real data analyses.
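
    The piecewise exponential representation mentioned above amounts to splitting each subject's follow-up at a set of cut points, so that every (subject, interval) row carries an exposure time and an event indicator that can be treated as a Poisson count with a log-exposure offset. The sketch below shows only this data expansion and the covariate-free hazard estimate (events divided by exposure); the function name `expand_piecewise`, the cut points, and the toy subjects are assumptions, and none of the three selection techniques is implemented.

```python
def expand_piecewise(time, event, cuts):
    """Split one subject's follow-up into interval rows.

    cuts are increasing interior cut points, e.g. [1.0, 2.0] gives the
    intervals (0, 1], (1, 2], (2, inf). Returns a list of
    (interval_index, exposure_time, event_in_interval) tuples."""
    edges = [0.0] + list(cuts) + [float("inf")]
    rows = []
    for j in range(len(edges) - 1):
        start, stop = edges[j], edges[j + 1]
        if time <= start:
            break                                   # subject no longer at risk
        exposure = min(time, stop) - start
        died_here = int(event == 1 and time <= stop)
        rows.append((j, exposure, died_here))
    return rows

# Toy data: (follow-up time, event indicator), 1 = event, 0 = censored.
subjects = [(2.5, 1), (1.0, 1), (0.4, 0), (3.0, 0)]
cuts = [1.0, 2.0]

events = [0] * (len(cuts) + 1)
exposure = [0.0] * (len(cuts) + 1)
for t, d in subjects:
    for j, e, dj in expand_piecewise(t, d, cuts):
        events[j] += dj
        exposure[j] += e

# Without covariates, the Poisson maximum-likelihood hazard per interval
# is the number of events divided by the total exposure time.
hazards = [d / e for d, e in zip(events, exposure)]
print(events, exposure, hazards)
```

    With covariates, the same expanded rows would be passed to a Poisson regression with log(exposure) as an offset, which is what makes exponential-family model-building tools applicable.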

  15. Distributed Contour Trees

    SciTech Connect

    Morozov, Dmitriy; Weber, Gunther H.

    2014-03-31

    Topological techniques provide robust tools for data analysis. They are used, for example, for feature extraction, for data de-noising, and for comparison of data sets. This chapter concerns contour trees, a topological descriptor that records the connectivity of the isosurfaces of scalar functions. These trees are fundamental to analysis and visualization of physical phenomena modeled by real-valued measurements. We study the parallel analysis of contour trees. After describing a particular representation of a contour tree, called local-global representation, we illustrate how different problems that rely on contour trees can be solved in parallel with minimal communication.

  16. Functional Generalized Additive Models.

    PubMed

    McLean, Mathew W; Hooker, Giles; Staicu, Ana-Maria; Scheipl, Fabian; Ruppert, David

    2014-01-01

    We introduce the functional generalized additive model (FGAM), a novel regression model for association studies between a scalar response and a functional predictor. We model the link-transformed mean response as the integral with respect to t of F{X(t), t} where F(·,·) is an unknown regression function and X(t) is a functional covariate. Rather than having an additive model in a finite number of principal components as in Müller and Yao (2008), our model incorporates the functional predictor directly and thus our model can be viewed as the natural functional extension of generalized additive models. We estimate F(·,·) using tensor-product B-splines with roughness penalties. A pointwise quantile transformation of the functional predictor is also considered to ensure each tensor-product B-spline has observed data on its support. The methods are evaluated using simulated data and their predictive performance is compared with other competing scalar-on-function regression alternatives. We illustrate the usefulness of our approach through an application to brain tractography, where X(t) is a signal from diffusion tensor imaging at position, t, along a tract in the brain. In one example, the response is disease-status (case or control) and in a second example, it is the score on a cognitive test. R code for performing the simulations and fitting the FGAM can be found in supplemental materials available online.

  17. Trees, soils, and food security

    PubMed Central

    Sanchez, P. A.; Buresh, R. J.; Leakey, R. R. B.

    1997-01-01

    Trees have a different impact on soil properties than annual crops, because of their longer residence time, larger biomass accumulation, and longer-lasting, more extensive root systems. In natural forests nutrients are efficiently cycled with very small inputs and outputs from the system. In most agricultural systems the opposite happens. Agroforestry encompasses the continuum between these extremes, and emerging hard data is showing that successful agroforestry systems increase nutrient inputs, enhance internal flows, decrease nutrient losses and provide environmental benefits when the competition for growth resources between the tree and the crop component is well managed. The three main determinants for overcoming rural poverty in Africa are (i) reversing soil fertility depletion, (ii) intensifying and diversifying land use with high-value products, and (iii) providing an enabling policy environment for the smallholder farming sector. Agroforestry practices can improve food production in a sustainable way through their contribution to soil fertility replenishment. The use of organic inputs as a source of biologically-fixed nitrogen, together with deep nitrate that is captured by trees, plays a major role in nitrogen replenishment. The combination of commercial phosphorus fertilizers with available organic resources may be the key to increasing and sustaining phosphorus capital. High-value trees, 'Cinderella' species, can fit in specific niches on farms, thereby making the system ecologically stable and more rewarding economically, in addition to diversifying and increasing rural incomes and improving food security. In the most heavily populated areas of East Africa, where farm size is extremely small, the number of trees on farms is increasing as farmers seek to reduce labour demands, compatible with the drift of some members of the family into the towns to earn off-farm income. Contrary to the concept that population pressure promotes deforestation, there is

  18. Growth of a Pine Tree

    ERIC Educational Resources Information Center

    Rollinson, Susan Wells

    2012-01-01

    The growth of a pine tree is examined by preparing "tree cookies" (cross-sectional disks) between whorls of branches. The use of Christmas trees allows the tree cookies to be obtained with inexpensive, commonly available tools. Students use the tree cookies to investigate the annual growth of the tree and how it corresponds to the number of whorls…

  19. Efficient tree codes on SIMD computer architectures

    NASA Astrophysics Data System (ADS)

    Olson, Kevin M.

    1996-11-01

    This paper describes changes made to a previous implementation of an N-body tree code developed for a fine-grained, SIMD computer architecture. These changes include (1) switching from a balanced binary tree to a balanced oct tree, (2) addition of quadrupole corrections, and (3) having the particles search the tree in groups rather than individually. An algorithm for limiting errors is also discussed. In aggregate, these changes have led to a performance increase of over a factor of 10 compared to the previous code. For problems several times larger than the processor array, the code now achieves performance levels of ~ 1 Gflop on the Maspar MP-2 or roughly 20% of the quoted peak performance of this machine. This percentage is competitive with other parallel implementations of tree codes on MIMD architectures. This is significant, considering the low relative cost of SIMD architectures.

  20. Estimating peak flow characteristics at ungaged sites by ridge regression

    USGS Publications Warehouse

    Tasker, Gary D.

    1982-01-01

    A regression simulation model is combined with a multisite streamflow generator to simulate a regional regression of 50-year peak discharge against a set of basin characteristics. Monte Carlo experiments are used to compare the unbiased ordinary least squares parameter estimator with Hoerl and Kennard's (1970a) ridge estimator in which the biasing parameter is that proposed by Hoerl, Kennard, and Baldwin (1975). The simulation results indicate a substantial improvement in parameter estimation using ridge regression when the correlation between basin characteristics is more than about 0.90. In addition, results indicate a strong potential for improving the mean square error of prediction of a peak-flow characteristic versus basin characteristics regression model when the basin characteristics are approximately collinear. The simulation covers a range of regression parameters, streamflow statistics, and basin characteristics commonly found in regional regression studies.
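
    A small Monte Carlo comparison in the same spirit can be sketched as follows. It is a simplified stand-in, not the study's simulation: there is no streamflow generator, the two predictors are plain correlated Gaussians with correlation 0.95, there is no intercept, and the sample size, true coefficients, and function names are assumptions. The biasing parameter is computed as k = p s² / (b'b) from the least-squares fit, the form commonly attributed to Hoerl, Kennard, and Baldwin (1975).

```python
import numpy as np

def ols(X, y):
    """Ordinary least squares estimate (no intercept)."""
    return np.linalg.solve(X.T @ X, X.T @ y)

def ridge(X, y, k):
    """Ridge estimate (X'X + k I)^{-1} X'y with biasing parameter k."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)

def hkb_k(X, y):
    """Biasing parameter k = p * s^2 / (b'b) computed from the OLS fit."""
    n, p = X.shape
    b = ols(X, y)
    s2 = np.sum((y - X @ b) ** 2) / (n - p)
    return p * s2 / (b @ b)

rng = np.random.default_rng(42)
n, rho, reps = 30, 0.95, 500
beta = np.array([1.0, 1.0])                         # true coefficients (assumption)
chol = np.linalg.cholesky(np.array([[1.0, rho], [rho, 1.0]]))

err_ols = 0.0
err_ridge = 0.0
for _ in range(reps):
    X = rng.normal(size=(n, 2)) @ chol.T            # highly correlated predictors
    y = X @ beta + rng.normal(size=n)
    err_ols += np.sum((ols(X, y) - beta) ** 2)
    err_ridge += np.sum((ridge(X, y, hkb_k(X, y)) - beta) ** 2)

mse_ols = err_ols / reps
mse_ridge = err_ridge / reps
print("parameter MSE, OLS:", mse_ols, "ridge:", mse_ridge)
```

    Lowering `rho` in this sketch shrinks the gap between the two estimators, in line with the abstract's observation that the gain appears mainly when predictors are strongly correlated.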

  1. IND - THE IND DECISION TREE PACKAGE

    NASA Technical Reports Server (NTRS)

    Buntine, W.

    1994-01-01

    A common approach to supervised classification and prediction in artificial intelligence and statistical pattern recognition is the use of decision trees. A tree is "grown" from data using a recursive partitioning algorithm to create a tree which has good prediction of classes on new data. Standard algorithms are CART (by Breiman Friedman, Olshen and Stone) and ID3 and its successor C4 (by Quinlan). As well as reimplementing parts of these algorithms and offering experimental control suites, IND also introduces Bayesian and MML methods and more sophisticated search in growing trees. These produce more accurate class probability estimates that are important in applications like diagnosis. IND is applicable to most data sets consisting of independent instances, each described by a fixed length vector of attribute values. An attribute value may be a number, one of a set of attribute specific symbols, or it may be omitted. One of the attributes is delegated the "target" and IND grows trees to predict the target. Prediction can then be done on new data or the decision tree printed out for inspection. IND provides a range of features and styles with convenience for the casual user as well as fine-tuning for the advanced user or those interested in research. IND can be operated in a CART-like mode (but without regression trees, surrogate splits or multivariate splits), and in a mode like the early version of C4. Advanced features allow more extensive search, interactive control and display of tree growing, and Bayesian and MML algorithms for tree pruning and smoothing. These often produce more accurate class probability estimates at the leaves. IND also comes with a comprehensive experimental control suite. IND consists of four basic kinds of routines: data manipulation routines, tree generation routines, tree testing routines, and tree display routines. The data manipulation routines are used to partition a single large data set into smaller training and test sets. 
The
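
    The recursive partitioning described in this record can be illustrated with a minimal sketch. It is not IND's code: it assumes numeric attributes only, uses the Gini impurity associated with CART as the split criterion, and reports raw leaf frequencies rather than the Bayesian or MML smoothed estimates the abstract mentions. All function names are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels):
    """Return (attribute index, threshold) that lowers weighted Gini most, or None."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for j in range(len(rows[0])):
        for t in sorted({r[j] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (j, t), score
    return best


def grow(rows, labels, depth=0, max_depth=3):
    """Recursively partition the data; each leaf stores its class counts."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return {"leaf": Counter(labels)}
    j, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[j] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[j] > t]
    return {
        "attr": j,
        "thr": t,
        "left": grow([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
        "right": grow([r for r, _ in right], [y for _, y in right], depth + 1, max_depth),
    }


def predict_proba(tree, row):
    """Unsmoothed class probability estimates from the counts at the reached leaf."""
    while "leaf" not in tree:
        tree = tree["left"] if row[tree["attr"]] <= tree["thr"] else tree["right"]
    counts = tree["leaf"]
    total = sum(counts.values())
    return {c: k / total for c, k in counts.items()}
```

    Growing a tree on a training partition and calling predict_proba on held-out rows mirrors, in a very reduced form, the generate-then-test workflow the record outlines.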

  2. XRA image segmentation using regression

    NASA Astrophysics Data System (ADS)

    Jin, Jesse S.

    1996-04-01

    Segmentation is an important step in image analysis. Thresholding is one of the most important approaches. There are several difficulties in segmentation, such as automatic threshold selection, dealing with intensity distortion, and noise removal. We have developed an adaptive segmentation scheme by applying the Central Limit Theorem in regression. A Gaussian regression is used to separate the distribution of background from foreground in a single peak histogram. The separation will help to automatically determine the threshold. A small 3 by 3 window is applied and the mode of the local histogram is used to overcome noise. Thresholding is based on local weighting, where regression is used again for parameter estimation. A connectivity test is applied to the final results to remove impulse noise. We have applied the algorithm to x-ray angiogram images to extract brain arteries. The algorithm works well for single peak distribution where there is no valley in the histogram. The regression provides a method to apply knowledge in clustering. Extending regression for multiple-level segmentation needs further investigation.

  3. Demonstration of a Fiber Optic Regression Probe

    NASA Technical Reports Server (NTRS)

    Korman, Valentin; Polzin, Kurt A.

    2010-01-01

    The capability to provide localized, real-time monitoring of material regression rates in various applications has the potential to provide a new stream of data for development testing of various components and systems, as well as serving as a monitoring tool in flight applications. These applications include, but are not limited to, the regression of a combusting solid fuel surface, the ablation of the throat in a chemical rocket or the heat shield of an aeroshell, and the monitoring of erosion in long-life plasma thrusters. The rate of regression in the first application is very fast, while the second and third are increasingly slower. A recent fundamental sensor development effort has led to a novel regression, erosion, and ablation sensor technology (REAST). The REAST sensor allows for measurement of real-time surface erosion rates at a discrete surface location. The sensor is optical, using two different, co-located fiber-optics to perform the regression measurement. The disparate optical transmission properties of the two fiber-optics make it possible to measure the regression rate by monitoring the relative light attenuation through the fibers. As the fibers regress along with the parent material in which they are embedded, the relative light intensities through the two fibers change, providing a measure of the regression rate. The optical nature of the system makes it relatively easy to use in a variety of harsh, high temperature environments, and it is also unaffected by the presence of electric and magnetic fields. In addition, the sensor could be used to perform optical spectroscopy on the light emitted by a process and collected by fibers, giving localized measurements of various properties. The capability to perform an in-situ measurement of material regression rates is useful in addressing a variety of physical issues in various applications.
An in-situ measurement allows for real-time data regarding the erosion rates, providing a quick method for

  4. Bayesian Ensemble Trees (BET) for Clustering and Prediction in Heterogeneous Data

    PubMed Central

    Duan, Leo L.; Clancy, John P.; Szczesniak, Rhonda D.

    2016-01-01

    We propose a novel “tree-averaging” model that utilizes the ensemble of classification and regression trees (CART). Each constituent tree is estimated with a subset of similar data. We treat this grouping of subsets as Bayesian Ensemble Trees (BET) and model them as a Dirichlet process. We show that BET determines the optimal number of trees by adapting to the data heterogeneity. Compared with the other ensemble methods, BET requires far fewer trees and shows equivalent prediction accuracy using weighted averaging. Moreover, each tree in BET provides a variable selection criterion and interpretation for each subset. We developed an efficient estimating procedure with improved estimation strategies in both CART and mixture models. We demonstrate these advantages of BET with simulations and illustrate the approach with a real-world data example involving regression of lung function measurements obtained from patients with cystic fibrosis. Supplemental materials are available online. PMID:27524872

  5. Effects of irrigation deprivation during the harvest period on yield determinants in mature almond trees.

    PubMed

    Esparza, G; DeJong, T M; Weinbaum, S A; Klein, I

    2001-09-01

    Effects of irrigation deprivation during the harvest period on yield determinants in mature almond (Prunus dulcis (Mill.) D.A. Webb cv. Nonpareil) trees were investigated during a 3-year field experiment. Return bloom and fruit set were measured on 2185 individually tagged spurs. Water stress resulting from irrigation deprivation during the harvest period, which purportedly coincides with the time of flower initiation, had no effect on the percentage of spurs that flowered or set fruit during subsequent years. Although water stress had no apparent effect on spur mortality, 66% of the tagged spurs died within 3 years. In addition, many spurs were vegetative by the third year, indicating the importance of spur renewal for sustained fruit production. Reductions in nut yield were evident after two successive years of irrigation deprivation during the harvest period. Regression analysis indicated a loss in yield of 7.7 kg per tree in response to each 1 MPa decrease in stem water potential below -1.2 MPa during the previous seasons. The number of fruiting positions per tree (estimated indirectly for whole trees based on weight of current-year shoots > 5 cm in length) was negatively associated with water stress. Yield reduction in response to water stress during harvest appears to be a compound, multiyear effect, associated with reduced annual growth and renewal of fruiting positions.

  6. Survival Data and Regression Models

    NASA Astrophysics Data System (ADS)

    Grégoire, G.

    2014-12-01

    We start this chapter by introducing some basic elements for the analysis of censored survival data. Then we focus on right censored data and develop two types of regression models. The first one concerns the so-called accelerated failure time models (AFT), which are parametric models where a function of a parameter depends linearly on the covariables. The second one is a semiparametric model, where the covariables enter in a multiplicative form in the expression of the hazard rate function. The main statistical tool for analysing these regression models is the maximum likelihood methodology and, although we recall some essential results about the ML theory, we refer to the chapter "Logistic Regression" for a more detailed presentation.
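
    For orientation, the two model families named in this record are commonly written as below. This is the standard textbook notation, not necessarily the chapter's own; the symbols T_i (survival time), x_i (covariables), \beta, \sigma, \varepsilon_i and the baseline hazard h_0 are assumptions introduced here.

```latex
% Accelerated failure time (log-linear form): covariables shift log survival time
\[ \log T_i = x_i^\top \beta + \sigma\,\varepsilon_i \]
% Semiparametric proportional hazards: covariables act multiplicatively on the hazard
\[ h(t \mid x_i) = h_0(t)\,\exp\!\left(x_i^\top \beta\right) \]
```

    In the first form the error distribution is specified parametrically; in the second, h_0 is left unspecified, which is what makes the model semiparametric.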

  7. Regressive evolution in Astyanax cavefish.

    PubMed

    Jeffery, William R

    2009-01-01

    A diverse group of animals, including members of most major phyla, have adapted to life in the perpetual darkness of caves. These animals are united by the convergence of two regressive phenotypes, loss of eyes and pigmentation. The mechanisms of regressive evolution are poorly understood. The teleost Astyanax mexicanus is of special significance in studies of regressive evolution in cave animals. This species includes an ancestral surface dwelling form and many conspecific cave-dwelling forms, some of which have evolved their regressive phenotypes independently. Recent advances in Astyanax development and genetics have provided new information about how eyes and pigment are lost during cavefish evolution; namely, they have revealed some of the molecular and cellular mechanisms involved in trait modification, the number and identity of the underlying genes and mutations, the molecular basis of parallel evolution, and the evolutionary forces driving adaptation to the cave environment.

  8. Food additives

    MedlinePlus

    ... or natural. Natural food additives include: Herbs or spices to add flavor to foods Vinegar for pickling ... Certain colors improve the appearance of foods. Many spices, as well as natural and man-made flavors, ...

  9. A classification tree approach to the development of actuarial violence risk assessment tools.

    PubMed

    Steadman, H J; Silver, E; Monahan, J; Appelbaum, P S; Robbins, P C; Mulvey, E P; Grisso, T; Roth, L H; Banks, S

    2000-02-01

    Since the 1970s, a wide body of research has suggested that the accuracy of clinical risk assessments of violence might be increased if clinicians used actuarial tools. Despite considerable progress in recent years in the development of such tools for violence risk assessment, they remain primarily research instruments, largely ignored in daily clinical practice. We argue that because most existing actuarial tools are based on a main effects regression approach, they do not adequately reflect the contingent nature of the clinical assessment processes. To enhance the use of actuarial violence risk assessment tools, we propose a classification tree rather than a main effects regression approach. In addition, we suggest that by employing two decision thresholds for identifying high- and low-risk cases--instead of the standard single threshold--the use of actuarial tools to make dichotomous risk classification decisions may be further enhanced. These claims are supported with empirical data from the MacArthur Violence Risk Assessment Study.
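
    The two-threshold idea in this record can be sketched as follows. This is an illustration only: the cut-off values and the function name are hypothetical and are not taken from the MacArthur study.

```python
def classify_risk(score, low_cut, high_cut):
    """Assign a case to 'low', 'high', or 'unclassified' using two decision thresholds.

    A single threshold would force every case into one of two groups; with two
    thresholds, cases falling between them are left unclassified.
    """
    if low_cut >= high_cut:
        raise ValueError("low_cut must be below high_cut")
    if score <= low_cut:
        return "low"
    if score >= high_cut:
        return "high"
    return "unclassified"
```

    For example, with hypothetical cut-offs of 0.1 and 0.3 on an estimated probability, a case scored 0.2 would be left unclassified rather than forced into either group.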

  10. Cactus: An Introduction to Regression

    ERIC Educational Resources Information Center

    Hyde, Hartley

    2008-01-01

    When the author first used "VisiCalc," the author thought it a very useful tool when he had the formulas. But how could he design a spreadsheet if there was no known formula for the quantities he was trying to predict? A few months later, the author relates he learned to use multiple linear regression software and suddenly it all clicked into…

  11. Multiple Regression: A Leisurely Primer.

    ERIC Educational Resources Information Center

    Daniel, Larry G.; Onwuegbuzie, Anthony J.

    Multiple regression is a useful statistical technique when the researcher is considering situations in which variables of interest are theorized to be multiply caused. It may also be useful in those situations in which the researcher is interested in studies of predictability of phenomena of interest. This paper provides an introduction to…

  12. Weighting Regressions by Propensity Scores

    ERIC Educational Resources Information Center

    Freedman, David A.; Berk, Richard A.

    2008-01-01

    Regressions can be weighted by propensity scores in order to reduce bias. However, weighting is likely to increase random error in the estimates, and to bias the estimated standard errors downward, even when selection mechanisms are well understood. Moreover, in some cases, weighting will increase the bias in estimated causal parameters. If…

  13. Quantile Regression with Censored Data

    ERIC Educational Resources Information Center

    Lin, Guixian

    2009-01-01

    The Cox proportional hazards model and the accelerated failure time model are frequently used in survival data analysis. They are powerful, yet have limitations due to their model assumptions. Quantile regression offers a semiparametric approach to model data with possible heterogeneity. It is particularly powerful for censored responses, where the…

  14. Tree growth response to ENSO in Durango, Mexico

    NASA Astrophysics Data System (ADS)

    Pompa-García, Marin; Miranda-Aragón, Liliana; Aguirre-Salado, Carlos Arturo

    2015-01-01

    The dynamics of forest ecosystems worldwide have been driven largely by climatic teleconnections. El Niño-Southern Oscillation (ENSO) is the strongest interannual variation of the Earth's climate, affecting the regional climatic regime. These teleconnections may impact plant phenology, growth rate, forest extent, and other gradual changes in forest ecosystems. The objective of this study was to investigate how Pinus cooperi populations face the influence of ENSO and regional microclimates in five ecozones in northwestern Mexico. Using standard dendrochronological techniques, tree-ring chronologies (TRI) were generated. TRI, ENSO, and climate relationships were correlated from 1950-2010. Additionally, multiple regressions were conducted in order to detect those ENSO months with direct relations in TRI (p < 0.1). The five chronologies showed similar trends during the period they overlapped, indicating that the P. cooperi populations shared an interannual growth variation. In general, ENSO index showed correspondences with tree-ring growth in synchronous periods. We concluded that ENSO had connectivity with regional climate in northern Mexico and radial growth of P. cooperi populations has been driven largely by positive ENSO values (El Niño episodes).

  15. Exposure and effects of perfluoroalkyl substances in tree ...

    EPA Pesticide Factsheets

    The exposure and effects of perfluoroalkyl substances (PFASs) were studied at eight locations in Minnesota and Wisconsin between 2007 and 2011 using tree swallows (Tachycineta bicolor) as sentinel species. These eight sites covered a range of possible exposure pathways and ecological settings. Concentrations in various swallow tissues were quantified as were reproductive success endpoints. The sample egg method was used wherein an egg sample is collected and the hatching success of the remaining eggs in the nest is assessed. The association between PFAS exposure and reproductive success was assessed by site comparisons, logistic regression analysis, and multistate modeling, a technique that has not previously been used in this context. There was a negative association between concentrations of PFASs in eggs and hatching success; this is the second field study in which a negative association was found. The concentration at which effects became evident (150-200 ng/g wet wt.) was far below effect levels found in laboratory feeding trials or egg injection studies on other avian species. This discrepancy was likely because behavioral effects and other extrinsic factors are not accounted for in these laboratory studies; further, there is a mixture of PFASs in field studies rather than a single contaminant used in laboratory studies, and the possibility that tree swallows are unusually sensitive to PFASs. Additional field effect studies on other avian species

  16. Quantum decision tree classifier

    NASA Astrophysics Data System (ADS)

    Lu, Songfeng; Braunstein, Samuel L.

    2013-11-01

    We study the quantum version of a decision tree classifier to fill the gap between quantum computation and machine learning. The quantum entropy impurity criterion which is used to determine which node should be split is presented in the paper. By using the quantum fidelity measure between two quantum states, we cluster the training data into subclasses so that the quantum decision tree can manipulate quantum states. We also propose algorithms constructing the quantum decision tree and searching for a target class over the tree for a new quantum object.

  17. Fragmentation of random trees

    NASA Astrophysics Data System (ADS)

    Kalay, Ziya; Ben-Naim, Eli

    2015-03-01

    We investigate the fragmentation of a random recursive tree by repeated removal of nodes, resulting in a forest of disjoint trees. The initial tree is generated by sequentially attaching new nodes to randomly chosen existing nodes until the tree contains N nodes. As nodes are removed, one at a time, the tree dissolves into an ensemble of separate trees, namely a forest. We study the statistical properties of trees and nodes in this heterogeneous forest. In the limit $N \to \infty$, we find that the system is characterized by a single parameter: the fraction of remaining nodes $m$. We obtain analytically the size density $\phi_s$ of trees of size $s$, which has a power-law tail $\phi_s \sim s^{-\alpha}$, with exponent $\alpha = 1 + 1/m$. Therefore, the tail becomes steeper as further nodes are removed, producing an unusual scaling exponent that increases continuously with time. Furthermore, we investigate the fragment size distribution in a growing tree, where nodes are added as well as removed, and find that the distribution for this case is much narrower.
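
    The process in this record can be explored numerically with a small simulation sketch. It assumes that removed nodes are chosen uniformly at random and that removing a node deletes all of its edges; the function names are hypothetical, and the code makes no attempt to reproduce the paper's analytical results.

```python
import random
from collections import Counter


def random_recursive_tree(n, rng):
    """Node 0 is the root; each later node i attaches to a uniformly chosen earlier node."""
    return [None] + [rng.randrange(i) for i in range(1, n)]


def fragment_sizes(parent, removed):
    """Sorted sizes of the trees that remain once the nodes in `removed` are deleted."""
    n = len(parent)
    root = list(range(n))  # union-find forest over the nodes

    def find(x):
        while root[x] != x:
            root[x] = root[root[x]]  # path halving
            x = root[x]
        return x

    for child in range(1, n):
        p = parent[child]
        # An edge survives only if both of its endpoints survive.
        if child not in removed and p not in removed:
            root[find(child)] = find(p)
    sizes = Counter(find(v) for v in range(n) if v not in removed)
    return sorted(sizes.values())


def simulate(n, m, seed=0):
    """Remove uniformly chosen nodes until a fraction m remains; return fragment sizes."""
    rng = random.Random(seed)
    parent = random_recursive_tree(n, rng)
    removed = set(rng.sample(range(n), n - round(m * n)))
    return fragment_sizes(parent, removed)
```

    Tallying the sizes returned by simulate over many seeds, for a large n, gives an empirical size distribution whose tail could be compared with the power law quoted in the abstract.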

  18. arb_tree_32

    SciTech Connect

    Bavykin, Sergey; Alferov, Oleg

    2006-08-01

    The purpose of this program is to generate probes specific for the group of sequences that belong to a given phylogenetic node. For each node of the input tree, this program selects probes that are positive for all sequences that belong to this node and negative for all that do not. The program uses a condensed tree for probe representation to save computer memory. As a result of the calculation, the program prints lists for each node of the tree. Input file formats: FASTA for sequence database and ARB tree for phylogenetic organization of nodes. Output file format: text file.

  19. More Trees, More Poverty? The Socioeconomic Effects of Tree Plantations in Chile, 2001-2011.

    PubMed

    Andersson, Krister; Lawrence, Duncan; Zavaleta, Jennifer; Guariguata, Manuel R

    2016-01-01

    Tree plantations play a controversial role in many nations' efforts to balance goals for economic development, ecological conservation, and social justice. This paper seeks to contribute to this debate by analyzing the socioeconomic impact of such plantations. We focus our study on Chile, a country that has experienced extraordinary growth of industrial tree plantations. Our analysis draws on a unique dataset with longitudinal observations collected in 180 municipal territories during 2001-2011. Employing panel data regression techniques, we find that growth in plantation area is associated with higher than average rates of poverty during this period.

  20. More Trees, More Poverty? The Socioeconomic Effects of Tree Plantations in Chile, 2001-2011

    NASA Astrophysics Data System (ADS)

    Andersson, Krister; Lawrence, Duncan; Zavaleta, Jennifer; Guariguata, Manuel R.

    2016-01-01

    Tree plantations play a controversial role in many nations' efforts to balance goals for economic development, ecological conservation, and social justice. This paper seeks to contribute to this debate by analyzing the socioeconomic impact of such plantations. We focus our study on Chile, a country that has experienced extraordinary growth of industrial tree plantations. Our analysis draws on a unique dataset with longitudinal observations collected in 180 municipal territories during 2001-2011. Employing panel data regression techniques, we find that growth in plantation area is associated with higher than average rates of poverty during this period.

  1. Neither fixed nor random: weighted least squares meta-regression.

    PubMed

    Stanley, T D; Doucouliagos, Hristos

    2017-03-01

    Our study revisits and challenges two core conventional meta-regression estimators: the prevalent use of 'mixed-effects' or random-effects meta-regression analysis and the correction of standard errors that defines fixed-effects meta-regression analysis (FE-MRA). We show how and explain why an unrestricted weighted least squares MRA (WLS-MRA) estimator is superior to conventional random-effects (or mixed-effects) meta-regression when there is publication (or small-sample) bias, is as good as FE-MRA in all cases, and is better than fixed effects in most practical applications. Simulations and statistical theory show that WLS-MRA provides satisfactory estimates of meta-regression coefficients that are practically equivalent to mixed effects or random effects when there is no publication bias. When there is publication selection bias, WLS-MRA always has smaller bias than mixed effects or random effects. In practical applications, an unrestricted WLS meta-regression is likely to give practically equivalent or superior estimates to fixed-effects, random-effects, and mixed-effects meta-regression approaches. However, random-effects meta-regression remains viable and perhaps somewhat preferable if selection for statistical significance (publication bias) can be ruled out and when random, additive normal heterogeneity is known to directly affect the 'true' regression coefficient. Copyright © 2016 John Wiley & Sons, Ltd.
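
    A weighted least squares meta-regression of the kind discussed in this record can be sketched for a single moderator as below. The sketch assumes inverse-variance weights (1/SE^2) and reads "unrestricted" as estimating a multiplicative dispersion term from the residuals instead of fixing it at 1; that reading, the single-moderator restriction, and all names are assumptions made here rather than details taken from the abstract.

```python
import math


def wls_meta_regression(effects, std_errors, moderator):
    """Inverse-variance WLS fit of effect_i = b0 + b1 * moderator_i.

    Returns (intercept, slope, slope_se). The slope standard error is scaled by
    an estimated dispersion term phi, not constrained to equal 1.
    """
    n = len(effects)
    if n < 3 or len(std_errors) != n or len(moderator) != n:
        raise ValueError("need at least three studies and equal-length inputs")
    w = [1.0 / se ** 2 for se in std_errors]
    sw = sum(w)
    xbar = sum(wi * x for wi, x in zip(w, moderator)) / sw
    ybar = sum(wi * y for wi, y in zip(w, effects)) / sw
    sxx = sum(wi * (x - xbar) ** 2 for wi, x in zip(w, moderator))
    sxy = sum(wi * (x - xbar) * (y - ybar) for wi, x, y in zip(w, moderator, effects))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    # Weighted residual sum of squares and the dispersion estimate.
    rss = sum(wi * (y - intercept - slope * x) ** 2
              for wi, x, y in zip(w, moderator, effects))
    phi = rss / (n - 2)
    slope_se = math.sqrt(phi / sxx)
    return intercept, slope, slope_se
```

    Under this reading, setting phi to 1 in the last step would give the conventional fixed-effects standard error with the same point estimates.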

  2. Hierarchical Adaptive Regression Kernels for Regression with Functional Predictors

    PubMed Central

    Woodard, Dawn B.; Crainiceanu, Ciprian; Ruppert, David

    2013-01-01

    We propose a new method for regression using a parsimonious and scientifically interpretable representation of functional predictors. Our approach is designed for data that exhibit features such as spikes, dips, and plateaus whose frequency, location, size, and shape varies stochastically across subjects. We propose Bayesian inference of the joint functional and exposure models, and give a method for efficient computation. We contrast our approach with existing state-of-the-art methods for regression with functional predictors, and show that our method is more effective and efficient for data that include features occurring at varying locations. We apply our methodology to a large and complex dataset from the Sleep Heart Health Study, to quantify the association between sleep characteristics and health outcomes. Software and technical appendices are provided in online supplemental materials. PMID:24293988

  3. Stable feature selection for clinical prediction: exploiting ICD tree structure using Tree-Lasso.

    PubMed

    Kamkar, Iman; Gupta, Sunil Kumar; Phung, Dinh; Venkatesh, Svetha

    2015-02-01

    Modern healthcare is being reshaped by the growth of Electronic Medical Records (EMR). Recently, these records have been shown to be of great value for building clinical prediction models. In EMR data, patients' diseases and hospital interventions are captured through a set of diagnosis and procedure codes. These codes are usually represented in a tree form (e.g. ICD-10 tree) and the codes within a tree branch may be highly correlated. These codes can be used as features to build a prediction model and an appropriate feature selection can inform a clinician about important risk factors for a disease. Traditional feature selection methods (e.g. Information Gain, T-test, etc.) consider each variable independently and usually end up having a long feature list. Recently, Lasso and related l1-penalty based feature selection methods have become popular due to their joint feature selection property. However, Lasso is known to select one of many correlated features at random. This hinders clinicians from arriving at a stable feature set, which is crucial for the clinical decision-making process. In this paper, we solve this problem by using a recently proposed Tree-Lasso model. Since the stability behavior of Tree-Lasso is not well understood, we study the stability behavior of Tree-Lasso and compare it with other feature selection methods. Using a synthetic and two real-world datasets (Cancer and Acute Myocardial Infarction), we show that Tree-Lasso based feature selection is significantly more stable than Lasso and comparable to other methods e.g. Information Gain, ReliefF and T-test. We further show that, using different types of classifiers such as logistic regression, naive Bayes, support vector machines, decision trees and Random Forest, the classification performance of Tree-Lasso is comparable to Lasso and better than other methods. Our result has implications in identifying stable risk factors for many healthcare problems and therefore can

  4. Comprehensive Decision Tree Models in Bioinformatics

    PubMed Central

    Stiglic, Gregor; Kocbek, Simon; Pernek, Igor; Kokol, Peter

    2012-01-01

    Purpose: Classification is an important and widely used machine learning technique in bioinformatics. Researchers and other end-users of machine learning software often prefer to work with comprehensible models where knowledge extraction and explanation of reasoning behind the classification model are possible. Methods: This paper presents an extension to an existing machine learning environment and a study on visual tuning of decision tree classifiers. The motivation for this research comes from the need to build effective and easily interpretable decision tree models by the so-called one-button data mining approach where no parameter tuning is needed. To avoid bias in classification, no classification performance measure is used during the tuning of the model that is constrained exclusively by the dimensions of the produced decision tree. Results: The proposed visual tuning of decision trees was evaluated on 40 datasets containing classical machine learning problems and 31 datasets from the field of bioinformatics. Although we did not expect significant differences in classification performance, the results demonstrate a significant increase of accuracy in less complex visually tuned decision trees. In contrast to classical machine learning benchmarking datasets, we observe higher accuracy gains in bioinformatics datasets. Additionally, a user study was carried out to confirm the assumption that the tree tuning times are significantly lower for the proposed method in comparison to manual tuning of the decision tree. Conclusions: The empirical results demonstrate that by building simple models constrained by predefined visual boundaries, one not only achieves good comprehensibility, but also very good classification performance that does not differ from usually more complex models built using default settings of the classical decision tree algorithm. In addition, our study demonstrates the suitability of visually tuned decision trees for datasets with binary class

  5. Is susceptibility to prenatal methylmercury exposure from fish consumption non-homogeneous? Tree-structured analysis for the Seychelles Child Development Study.

    PubMed

    Huang, Li-Shan; Myers, Gary J; Davidson, Philip W; Cox, Christopher; Xiao, Fenyuan; Thurston, Sally W; Cernichiari, Elsa; Shamlaye, Conrad F; Sloane-Reeves, Jean; Georger, Lesley; Clarkson, Thomas W

    2007-11-01

    Studies of the association between prenatal methylmercury exposure from maternal fish consumption during pregnancy and neurodevelopmental test scores in the Seychelles Child Development Study have found no consistent pattern of associations through age 9 years. The analyses for the most recent 9-year data examined the population effects of prenatal exposure, but did not address the possibility of non-homogeneous susceptibility. This paper presents a regression tree approach: covariate effects are treated non-linearly and non-additively and non-homogeneous effects of prenatal methylmercury exposure are permitted among the covariate clusters identified by the regression tree. The approach allows us to address whether children in the lower or higher ends of the developmental spectrum differ in susceptibility to subtle exposure effects. Of 21 endpoints available at age 9 years, we chose the Wechsler Full Scale IQ and its associated covariates to construct the regression tree. The prenatal mercury effect in each of the nine resulting clusters was assessed linearly and non-homogeneously. In addition we reanalyzed five other 9-year endpoints that in the linear analysis had a two-tailed p-value <0.2 for the effect of prenatal exposure. In this analysis, motor proficiency and activity level improved significantly with increasing MeHg for 53% of the children who had an average home environment. Motor proficiency significantly decreased with increasing prenatal MeHg exposure in 7% of the children whose home environment was below average. The regression tree results support previous analyses of outcomes in this cohort. However, this analysis raises the intriguing possibility that an effect may be non-homogeneous among children with different backgrounds and IQ levels.

  6. Regression Verification Using Impact Summaries

    NASA Technical Reports Server (NTRS)

    Backes, John; Person, Suzette J.; Rungta, Neha; Thachuk, Oksana

    2013-01-01

    Regression verification techniques are used to prove equivalence of syntactically similar programs. Checking equivalence of large programs, however, can be computationally expensive. Existing regression verification techniques rely on abstraction and decomposition techniques to reduce the computational effort of checking equivalence of the entire program. These techniques are sound but not complete. In this work, we propose a novel approach to improve scalability of regression verification by classifying the program behaviors generated during symbolic execution as either impacted or unimpacted. Our technique uses a combination of static analysis and symbolic execution to generate summaries of impacted program behaviors. The impact summaries are then checked for equivalence using an off-the-shelf decision procedure. We prove that our approach is both sound and complete for sequential programs, with respect to the depth bound of symbolic execution. Our evaluation on a set of sequential C artifacts shows that reducing the size of the summaries can help reduce the cost of software equivalence checking. Various reduction, abstraction, and compositional techniques have been developed to help scale software verification techniques to industrial-sized systems. Although such techniques have greatly increased the size and complexity of systems that can be checked, analysis of large software systems remains costly. Regression analysis techniques, e.g., regression testing [16], regression model checking [22], and regression verification [19], restrict the scope of the analysis by leveraging the differences between program versions. These techniques are based on the idea that if code is checked early in development, then subsequent versions can be checked against a prior (checked) version, leveraging the results of the previous analysis to reduce analysis cost of the current version.
Regression verification addresses the problem of proving equivalence of closely related program

  7. Potlining Additives

    SciTech Connect

    Rudolf Keller

    2004-08-10

    In this project, a concept to improve the performance of aluminum production cells by introducing potlining additives was examined and tested. Boron oxide was added to cathode blocks, and titanium was dissolved in the metal pool; this resulted in the formation of titanium diboride and caused the molten aluminum to wet the carbonaceous cathode surface. Such wetting reportedly leads to operational improvements and extended cell life. In addition, boron oxide suppresses cyanide formation. This final report presents and discusses the results of this project. Substantial economic benefits for the practical implementation of the technology are projected, especially for modern cells with graphitized blocks. For example, with an energy savings of about 5% and an increase in pot life from 1500 to 2500 days, a cost savings of $ 0.023 per pound of aluminum produced is projected for a 200 kA pot.

  8. Phosphazene additives

    DOEpatents

    Harrup, Mason K; Rollins, Harry W

    2013-11-26

    An additive comprising a phosphazene compound that has at least two reactive functional groups and at least one capping functional group bonded to phosphorus atoms of the phosphazene compound. One of the at least two reactive functional groups is configured to react with cellulose and the other of the at least two reactive functional groups is configured to react with a resin, such as an amine resin or a polycarboxylic acid resin. The at least one capping functional group is selected from the group consisting of a short chain ether group, an alkoxy group, or an aryloxy group. Also disclosed are an additive-resin admixture, a method of treating a wood product, and a wood product.

  9. Embedded Sensors for Measuring Surface Regression

    NASA Technical Reports Server (NTRS)

    Gramer, Daniel J.; Taagen, Thomas J.; Vermaak, Anton G.

    2006-01-01

    non-eroding end of the sensor. The sensor signal can be transmitted from inside a high-pressure chamber to the ambient environment, using commercially available feedthrough connectors. Miniaturized internal recorders or wireless data transmission could also potentially be employed to eliminate the need for producing penetrations in the chamber case. The rungs are designed so that as each successive rung is eroded away, the resistance changes by an amount that yields a readily measurable signal larger than the background noise. (In addition, signal-conditioning techniques are used in processing the resistance readings to mitigate the effect of noise.) Hence, each discrete change of resistance serves to indicate the arrival of the regressing host material front at the known depth of the affected resistor rung. The average rate of regression between two adjacent resistors can be calculated simply as the distance between the resistors divided by the time interval between their resistance jumps. Advanced data reduction techniques have also been developed to establish the instantaneous surface position and regression rate when the regressing front is between rungs.
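
    The average-rate calculation described above (rung spacing divided by the time between resistance jumps) can be sketched in a few lines of Python. This is only an illustration: the function name, the units, and the sample depths and times are assumptions made for the example, not values from the report.

```python
def average_regression_rates(rung_depths_mm, jump_times_s):
    """Average regression rate (mm/s) between each pair of adjacent rungs.

    rung_depths_mm: known depth of each resistor rung (hypothetical values).
    jump_times_s:   time at which each rung's resistance jump was observed.
    """
    if len(rung_depths_mm) != len(jump_times_s):
        raise ValueError("need one jump time per rung")
    rates = []
    for i in range(1, len(rung_depths_mm)):
        distance = rung_depths_mm[i] - rung_depths_mm[i - 1]
        interval = jump_times_s[i] - jump_times_s[i - 1]
        if interval <= 0:
            raise ValueError("jump times must be strictly increasing")
        # Rate = distance between adjacent rungs / time between their jumps.
        rates.append(distance / interval)
    return rates


# Illustrative data only: four rungs 1 mm apart, eroded at a slowing pace.
depths = [1.0, 2.0, 3.0, 4.0]
times = [0.5, 1.0, 2.0, 4.0]
rates = average_regression_rates(depths, times)
print(rates)
```

    The instantaneous position between rungs, which the abstract attributes to more advanced data reduction, is not covered by this sketch.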

  10. Tree nut oils

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The major tree nuts include almonds, Brazil nuts, cashew nuts, hazelnuts, macadamia nuts, pecans, pine nuts, pistachio nuts, and walnuts. Tree nut oils are appreciated in food applications because of their flavors and are generally more expensive than other gourmet oils. Research during the last de...

  11. Trees Are Terrific!

    ERIC Educational Resources Information Center

    Braus, Judy, Ed.

    1992-01-01

    Ranger Rick's NatureScope is a creative education series dedicated to inspiring in children an understanding and appreciation of the natural world while developing the skills they will need to make responsible decisions about the environment. Contents are organized into the following sections: (1) "What Makes a Tree a Tree?," including…

  12. CSI for Trees

    ERIC Educational Resources Information Center

    Rubino, Darrin L.; Hanson, Deborah

    2009-01-01

    The circles and patterns in a tree's stem tell a story, but that story can be a mystery. Interpreting the story of tree rings provides a way to heighten the natural curiosity of students and help them gain insight into the interaction of elements in the environment. It also represents a wonderful opportunity to incorporate the nature of science.…

  13. The Flame Tree

    ERIC Educational Resources Information Center

    Lewis, Richard

    2004-01-01

    Lewis's own experiences living in Indonesia are fertile ground for telling "a ripping good story," one found in "The Flame Tree." He hopes people will enjoy the tale and appreciate the differences of an unfamiliar culture. The excerpt from "The Flame Tree" will reel readers in quickly.

  14. Trees for Mother Earth.

    ERIC Educational Resources Information Center

    Greer, Sandy

    1993-01-01

    Describes Trees for Mother Earth, a program in which secondary students raise funds to buy fruit trees to plant during visits to the Navajo Reservation. Benefits include developing feelings of self-worth among participants, promoting cultural exchange and understanding, and encouraging self-sufficiency among the Navajo. (LP)

  15. Reclamation: what about trees

    SciTech Connect

    Kolar, C.A.; Ashby, W.C.

    1982-07-01

    A five-year research programme was started in 1978 in the Botany Department of Southern Illinois University to evaluate the effect of reclamation practices on tree survival and growth. The project was initiated as a direct result of reports from Illinois and Indiana of tree-planting failures on mined lands reclaimed to current regulation standards.

  16. Structural Equation Model Trees

    ERIC Educational Resources Information Center

    Brandmaier, Andreas M.; von Oertzen, Timo; McArdle, John J.; Lindenberger, Ulman

    2013-01-01

    In the behavioral and social sciences, structural equation models (SEMs) have become widely accepted as a modeling tool for the relation between latent and observed variables. SEMs can be seen as a unification of several multivariate analysis techniques. SEM Trees combine the strengths of SEMs and the decision tree paradigm by building tree…

  17. Trees in Our Lives.

    ERIC Educational Resources Information Center

    NatureScope, 1986

    1986-01-01

    Provides: (1) background information on how trees have influenced human history and how trees affect people today; (2) four activities dealing with these topics; and (3) a ready-to-copy page related to paper and plastics. Activities include an objective, recommended age level(s), subject area(s), list of materials needed, and procedures. (JN)

  18. Dependency Tree Annotation Software

    DTIC Science & Technology

    2015-11-01

    ...with a dependency relation label. Fig. 1 Manually created dependency tree for the sentence, “The little cat ate the pie”. The user can easily

  19. Project Learning Tree (Corporate Propaganda Tree).

    ERIC Educational Resources Information Center

    Mayer, Mike

    This document contains a critical analysis of Project Learning Tree (PLT). PLT was developed and distributed in the mid-1970s. It consists of 2 activity guides, one for grades K-6 with 89 activities and another for grades 7-12 with 88 activities. The program also provides free workshops for teachers and others. The analysis of PLT includes the…

  20. Feedback of trees on nitrogen mineralization to restrict the advance of trees in C4 savannahs.

    PubMed

    Higgins, Steven I; Keretetse, Moagi; February, Edmund C

    2015-08-01

    Remote sensing studies suggest that savannahs are transforming into more tree-dominated states; however, progressive nitrogen limitation could potentially retard this putatively CO2-driven invasion. We analysed controls on nitrogen mineralization rates in savannah by manipulating rainfall and the cover of grass and tree elements against the backdrop of the seasonal temperature and rainfall variation. We found that the seasonal pattern of nitrogen mineralization was strongly influenced by rainfall, and that manipulative increases in rainfall could boost mineralization rates. Additionally, mineralization rates were considerably higher on plots with grasses and lower on plots with trees. Our findings suggest that shifting a savannah from a grass to a tree-dominated state can substantially reduce nitrogen mineralization rates, thereby potentially creating a negative feedback on the CO2-induced invasion of savannahs by trees.

  1. Investigating the role of feedstock properties and process conditions on products formed during the hydrothermal carbonization of organics using regression techniques.

    PubMed

    Li, Liang; Flora, Joseph R V; Caicedo, Juan M; Berge, Nicole D

    2015-01-01

    The purpose of this study is to develop regression models that describe the role of process conditions and feedstock chemical properties on carbonization product characteristics. Experimental data were collected and compiled from literature-reported carbonization studies and subsequently analyzed using two statistical approaches: multiple linear regression and regression trees. Results from these analyses indicate that both the multiple linear regression and regression tree models fit the product characteristics data well. The regression tree models provide valuable insight into parameter relationships. Relative weight analyses indicate that process conditions are more influential to the solid yields and liquid and gas-phase carbon contents, while feedstock properties are more influential on the hydrochar carbon content, energy content, and the normalized carbon content of the solid.

  2. Up in the Tree – The Overlooked Richness of Bryophytes and Lichens in Tree Crowns

    PubMed Central

    Boch, Steffen; Müller, Jörg; Prati, Daniel; Blaser, Stefan; Fischer, Markus

    2013-01-01

    Assessing diversity is among the major tasks in ecology and conservation science. In ecological and conservation studies, epiphytic cryptogams are usually sampled up to accessible heights in forests. Thus, their diversity, especially of canopy specialists, is likely underestimated. If the proportion of those species differs among forest types, plot-based diversity assessments are biased and may result in misleading conservation recommendations. We sampled bryophytes and lichens in 30 forest plots of 20 m × 20 m in three German regions, considering all substrates, and including epiphytic litter fall. First, the sampling of epiphytic species was restricted to the lower 2 m of trees and shrubs. Then, on one representative tree per plot, we additionally recorded epiphytic species in the crown, using tree climbing techniques. Per tree, on average 54% of lichen and 20% of bryophyte species were overlooked if the crown was not included. After sampling all substrates per plot, including the bark of all shrubs and trees, still 38% of the lichen and 4% of the bryophyte species were overlooked if the tree crown of the sampled tree was not included. The number of overlooked lichen species varied strongly among regions. Furthermore, the number of overlooked bryophyte and lichen species per plot was higher in European beech than in coniferous stands and increased with increasing diameter at breast height of the sampled tree. Thus, our results indicate a bias of comparative studies which might have led to misleading conservation recommendations of plot-based diversity assessments. PMID:24358373

  3. Up in the tree--the overlooked richness of bryophytes and lichens in tree crowns.

    PubMed

    Boch, Steffen; Müller, Jörg; Prati, Daniel; Blaser, Stefan; Fischer, Markus

    2013-01-01

    Assessing diversity is among the major tasks in ecology and conservation science. In ecological and conservation studies, epiphytic cryptogams are usually sampled up to accessible heights in forests. Thus, their diversity, especially of canopy specialists, is likely underestimated. If the proportion of those species differs among forest types, plot-based diversity assessments are biased and may result in misleading conservation recommendations. We sampled bryophytes and lichens in 30 forest plots of 20 m × 20 m in three German regions, considering all substrates, and including epiphytic litter fall. First, the sampling of epiphytic species was restricted to the lower 2 m of trees and shrubs. Then, on one representative tree per plot, we additionally recorded epiphytic species in the crown, using tree climbing techniques. Per tree, on average 54% of lichen and 20% of bryophyte species were overlooked if the crown was not included. After sampling all substrates per plot, including the bark of all shrubs and trees, still 38% of the lichen and 4% of the bryophyte species were overlooked if the tree crown of the sampled tree was not included. The number of overlooked lichen species varied strongly among regions. Furthermore, the number of overlooked bryophyte and lichen species per plot was higher in European beech than in coniferous stands and increased with increasing diameter at breast height of the sampled tree. Thus, our results indicate a bias of comparative studies which might have led to misleading conservation recommendations of plot-based diversity assessments.

  4. Phylogenetic Tree Reconstruction Accuracy and Model Fit when Proportions of Variable Sites Change across the Tree

    PubMed Central

    Grievink, Liat Shavit; Penny, David; Hendy, Michael D.; Holland, Barbara R.

    2010-01-01

    Commonly used phylogenetic models assume a homogeneous process through time in all parts of the tree. However, it is known that these models can be too simplistic as they do not account for nonhomogeneous lineage-specific properties. In particular, it is now widely recognized that as constraints on sequences evolve, the proportion and positions of variable sites can vary between lineages, causing heterotachy. The extent to which this model misspecification affects tree reconstruction is still unknown. Here, we evaluate the effect of changes in the proportions and positions of variable sites on model fit and tree estimation. We consider 5 current models of nucleotide sequence evolution in a Bayesian Markov chain Monte Carlo framework as well as maximum parsimony (MP). We show that for a tree with 4 lineages where 2 nonsister taxa undergo a change in the proportion of variable sites, tree reconstruction under the best-fitting model, which is chosen using a relative test, often results in the wrong tree. In this case, we found that an absolute test of model fit is a better predictor of tree estimation accuracy. We also found further evidence that MP is not immune to heterotachy. In addition, we show that increased sampling of taxa that have undergone a change in proportion and positions of variable sites is critical for accurate tree reconstruction. PMID:20525636

  5. Analyzing and Synthesizing Phylogenies Using Tree Alignment Graphs

    PubMed Central

    Smith, Stephen A.; Brown, Joseph W.; Hinchliff, Cody E.

    2013-01-01

    Phylogenetic trees are used to analyze and visualize evolution. However, trees can be imperfect datatypes when summarizing multiple trees. This is especially problematic when accommodating biological phenomena such as horizontal gene transfer, incomplete lineage sorting, and hybridization, as well as topological conflict between datasets. Additionally, researchers may want to combine information from sets of trees that have partially overlapping taxon sets. To address the problem of analyzing sets of trees with conflicting relationships and partially overlapping taxon sets, we introduce methods for aligning, synthesizing and analyzing rooted phylogenetic trees within a graph, called a tree alignment graph (TAG). The TAG can be queried and analyzed to explore uncertainty and conflict. It can also be synthesized to construct trees, presenting an alternative to supertree approaches. We demonstrate these methods with two empirical datasets. In order to explore uncertainty, we constructed a TAG of the bootstrap trees from the Angiosperm Tree of Life project. Analysis of the resulting graph demonstrates that areas of the dataset that are unresolved in majority-rule consensus tree analyses can be understood in more detail within the context of a graph structure, using measures incorporating node degree and adjacency support. As an exercise in synthesis (i.e., summarization of a TAG constructed from the alignment trees), we also construct a TAG consisting of the taxonomy and source trees from a recent comprehensive bird study. We synthesized this graph into a tree that can be reconstructed in a repeatable fashion and where the underlying source information can be updated. The methods presented here are tractable for large scale analyses and serve as a basis for an alternative to consensus tree and supertree methods. Furthermore, the exploration of these graphs can expose structures and patterns within the dataset that are otherwise difficult to observe. PMID:24086118

  6. Lazy decision trees

    SciTech Connect

    Friedman, J.H.; Yun, Yeogirl; Kohavi, R.

    1996-12-31

    Lazy learning algorithms, exemplified by nearest-neighbor algorithms, do not induce a concise hypothesis from a given training set; the inductive process is delayed until a test instance is given. Algorithms for constructing decision trees, such as C4.5, ID3, and CART, create a single "best" decision tree during the training phase, and this tree is then used to classify test instances. The tests at the nodes of the constructed tree are good on average, but there may be better tests for classifying a specific instance. We propose a lazy decision tree algorithm, LazyDT, that conceptually constructs the "best" decision tree for each test instance. In practice, only a path needs to be constructed, and a caching scheme makes the algorithm fast. The algorithm is robust with respect to missing values without resorting to the complicated methods usually seen in induction of decision trees. Experiments on real and artificial problems are presented.
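
    The idea of building only the path that a given test instance would follow can be illustrated with a simplified Python sketch. It is not the LazyDT algorithm from the paper: the split criterion used here (lowest entropy of the training rows that match the instance's value), the absence of caching, and the toy data are all assumptions chosen to keep the example short.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (bits) of a non-empty list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def lazy_classify(train, instance):
    """Classify one instance by growing a single path, not a whole tree.

    train:    list of (feature_dict, label) pairs.
    instance: feature_dict; a value of None marks a missing attribute,
              which is simply never tested on this path.
    """
    rows = train
    unused = {a for a, v in instance.items() if v is not None}
    while True:
        labels = [y for _, y in rows]
        majority = Counter(labels).most_common(1)[0][0]
        if len(set(labels)) == 1 or not unused:
            return majority
        best = None  # (entropy, attribute, matching rows)
        for attr in sorted(unused):
            subset = [(x, y) for x, y in rows if x.get(attr) == instance[attr]]
            if not subset:
                continue
            h = entropy([y for _, y in subset])
            if best is None or h < best[0]:
                best = (h, attr, subset)
        if best is None:
            return majority
        # Follow only the branch that matches the test instance.
        _, chosen, rows = best
        unused.discard(chosen)


# Toy training set, invented for this example.
train = [
    ({"outlook": "sunny", "windy": False}, "play"),
    ({"outlook": "sunny", "windy": True}, "stay"),
    ({"outlook": "rain", "windy": False}, "stay"),
    ({"outlook": "rain", "windy": True}, "stay"),
]
print(lazy_classify(train, {"outlook": "sunny", "windy": False}))
print(lazy_classify(train, {"outlook": "rain", "windy": None}))
```

    Because attributes that are missing in the test instance are never selected, the sketch needs no separate missing-value handling, which loosely mirrors the robustness the abstract mentions.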

  7. Regression analysis of cytopathological data

    SciTech Connect

    Whittemore, A.S.; McLarty, J.W.; Fortson, N.; Anderson, K.

    1982-12-01

    Epithelial cells from the human body are frequently labelled according to one of several ordered levels of abnormality, ranging from normal to malignant. The label of the most abnormal cell in a specimen determines the score for the specimen. This paper presents a model for the regression of specimen scores against continuous and discrete variables, as in host exposure to carcinogens. Application to data and tests for adequacy of model fit are illustrated using sputum specimens obtained from a cohort of former asbestos workers.

  8. Mortality rates associated with crown health for eastern forest tree species.

    PubMed

    Morin, Randall S; Randolph, KaDonna C; Steinman, Jim

    2015-03-01

    The condition of tree crowns is an important indicator of tree and forest health. Crown conditions have been evaluated during inventories of the US Forest Service Forest Inventory and Analysis (FIA) program since 1999. In this study, remeasured data from 55,013 trees on 2616 FIA plots in the eastern USA were used to assess the probability of survival among various tree species using the suite of FIA crown condition variables. Logistic regression procedures were employed to develop models for predicting tree survival. Results of the regression analyses indicated that crown dieback was the most important crown condition variable for predicting tree survival for all species combined and for many of the 15 individual species in the study. The logistic models were generally successful in representing recent tree mortality responses to multiyear infestations of beech bark disease and hemlock woolly adelgid. Although our models are only applicable to trees growing in a forest setting, the utility of models that predict impending tree mortality goes beyond forest inventory or traditional forestry growth and yield models and includes any application where managers need to assess tree health or predict tree mortality including urban forest, recreation, wildlife, and pest management.

  9. Learning classification trees

    NASA Technical Reports Server (NTRS)

    Buntine, Wray

    1991-01-01

    Algorithms for learning classification trees have had successes in artificial intelligence and statistics over many years. How a tree learning algorithm can be derived from Bayesian decision theory is outlined. This introduces Bayesian techniques for splitting, smoothing, and tree averaging. The splitting rule turns out to be similar to Quinlan's information gain splitting rule, while smoothing and averaging replace pruning. Comparative experiments with reimplementations of a minimum encoding approach, Quinlan's C4, and Breiman et al.'s CART show that the full Bayesian algorithm is consistently as good as, or more accurate than, these other approaches, though at a computational price.

  10. The gravity apple tree

    NASA Astrophysics Data System (ADS)

    Espinosa Aldama, Mariana

    2015-04-01

    The gravity apple tree is a genealogical tree of the gravitation theories developed during the past century. The graphic representation is full of information such as guides in heuristic principles, names of main proponents, dates and references for original articles (See under Supplementary Data for the graphic representation). This visual presentation and its particular classification allows a quick synthetic view for a plurality of theories, many of them well validated in the Solar System domain. Its diachronic structure organizes information in a shape of a tree following similarities through a formal concept analysis. It can be used for educational purposes or as a tool for philosophical discussion.

  11. Evolutionary tree reconstruction

    NASA Technical Reports Server (NTRS)

    Cheeseman, Peter; Kanefsky, Bob

    1990-01-01

    It is described how Minimum Description Length (MDL) can be applied to the problem of DNA and protein evolutionary tree reconstruction. If there is a set of mutations that transform a common ancestor into a set of the known sequences, and this description is shorter than the information to encode the known sequences directly, then strong evidence for an evolutionary relationship has been found. A heuristic algorithm is described that searches for the simplest tree (smallest MDL) that finds close to optimal trees on the test data. Various ways of extending the MDL theory to more complex evolutionary relationships are discussed.

  12. Calibrating divergence times on species trees versus gene trees: implications for speciation history of Aphelocoma jays.

    PubMed

    McCormack, John E; Heled, Joseph; Delaney, Kathleen S; Peterson, A Townsend; Knowles, L Lacey

    2011-01-01

    Estimates of the timing of divergence are central to testing the underlying causes of speciation. Relaxed molecular clocks and fossil calibration have improved these estimates; however, these advances are implemented in the context of gene trees, which can overestimate divergence times. Here we couple recent innovations for dating speciation events with the analytical power of species trees, where multilocus data are considered in a coalescent context. Divergence times are estimated in the bird genus Aphelocoma to test whether speciation in these jays coincided with mountain uplift or glacial cycles. Gene trees and species trees show general agreement that diversification began in the Miocene amid mountain uplift. However, dates from the multilocus species tree are more recent, occurring predominately in the Pleistocene, consistent with theory that divergence times can be significantly overestimated with gene-tree based approaches that do not correct for genetic divergence that predates speciation. In addition to coalescent stochasticity, Haldane's rule could account for some differences in timing estimates between mitochondrial DNA and nuclear genes. By incorporating a fossil calibration applied to the species tree, in addition to the process of gene lineage coalescence, the present approach provides a more biologically realistic framework for dating speciation events, and hence for testing the links between diversification and specific biogeographic and geologic events.

  13. Tree-ring based May-July temperature reconstruction since AD 1630 on the Western Loess Plateau, China.

    PubMed

    Song, Huiming; Liu, Yu; Li, Qiang; Gao, Na; Ma, Yongyong; Zhang, Yanhua

    2014-01-01

    Tree-ring samples from Chinese Pine (Pinus tabulaeformis Carr.) collected at Mt. Shimen on the western Loess Plateau, China, were used to reconstruct the mean May-July temperature during AD 1630-2011. The regression model explained 48% of the adjusted variance in the instrumentally observed mean May-July temperature. The reconstruction revealed significant temperature variations at interannual to decadal scales. Cool periods observed in the reconstruction coincided with reduced solar activities. The reconstructed temperature matched well with two other tree-ring based temperature reconstructions conducted on the northern slope of the Qinling Mountains (on the southern margin of the Loess Plateau of China) for both annual and decadal scales. In addition, this study agreed well with several series derived from different proxies. This reconstruction improves upon the sparse network of high-resolution paleoclimatic records for the western Loess Plateau, China.

  14. Multiatlas segmentation as nonparametric regression.

    PubMed

    Awate, Suyash P; Whitaker, Ross T

    2014-09-01

    This paper proposes a novel theoretical framework to model and analyze the statistical characteristics of a wide range of segmentation methods that incorporate a database of label maps or atlases; such methods are termed as label fusion or multiatlas segmentation. We model these multiatlas segmentation problems as nonparametric regression problems in the high-dimensional space of image patches. We analyze the nonparametric estimator's convergence behavior that characterizes expected segmentation error as a function of the size of the multiatlas database. We show that this error has an analytic form involving several parameters that are fundamental to the specific segmentation problem (determined by the chosen anatomical structure, imaging modality, registration algorithm, and label-fusion algorithm). We describe how to estimate these parameters and show that several human anatomical structures exhibit the trends modeled analytically. We use these parameter estimates to optimize the regression estimator. We show that the expected error for large database sizes is well predicted by models learned on small databases. Thus, a few expert segmentations can help predict the database sizes required to keep the expected error below a specified tolerance level. Such cost-benefit analysis is crucial for deploying clinical multiatlas segmentation systems.

  15. Multiatlas Segmentation as Nonparametric Regression

    PubMed Central

    Awate, Suyash P.; Whitaker, Ross T.

    2015-01-01

    This paper proposes a novel theoretical framework to model and analyze the statistical characteristics of a wide range of segmentation methods that incorporate a database of label maps or atlases; such methods are termed as label fusion or multiatlas segmentation. We model these multiatlas segmentation problems as nonparametric regression problems in the high-dimensional space of image patches. We analyze the nonparametric estimator’s convergence behavior that characterizes expected segmentation error as a function of the size of the multiatlas database. We show that this error has an analytic form involving several parameters that are fundamental to the specific segmentation problem (determined by the chosen anatomical structure, imaging modality, registration algorithm, and label-fusion algorithm). We describe how to estimate these parameters and show that several human anatomical structures exhibit the trends modeled analytically. We use these parameter estimates to optimize the regression estimator. We show that the expected error for large database sizes is well predicted by models learned on small databases. Thus, a few expert segmentations can help predict the database sizes required to keep the expected error below a specified tolerance level. Such cost-benefit analysis is crucial for deploying clinical multiatlas segmentation systems. PMID:24802528

  16. Trees of trees: an approach to comparing multiple alternative phylogenies.

    PubMed

    Nye, Tom M W

    2008-10-01

    Phylogenetic analysis very commonly produces several alternative trees for a given fixed set of taxa. For example, different sets of orthologous genes may be analyzed, or the analysis may sample from a distribution of probable trees. This article describes an approach to comparing and visualizing multiple alternative phylogenies via the idea of a "tree of trees" or "meta-tree." A meta-tree clusters phylogenies with similar topologies together in the same way that a phylogeny clusters species with similar DNA sequences. Leaf nodes on a meta-tree correspond to the original set of phylogenies given by some analysis, whereas interior nodes correspond to certain consensus topologies. The construction of meta-trees is motivated by analogy with construction of a most parsimonious tree for DNA data, but instead of using DNA letters, in a meta-tree the characters are partitions or splits of the set of taxa. An efficient algorithm for meta-tree construction is described that makes use of a known relationship between the majority consensus and parsimony in terms of gain and loss of splits. To illustrate these ideas meta-trees are constructed for two datasets: a set of gene trees for species of yeast and trees from a bootstrap analysis of a set of gene trees in ray-finned fish. A software tool for constructing meta-trees and comparing alternative phylogenies is available online, and the source code can be obtained from the author.

  17. Tree-bank grammars

    SciTech Connect

    Charniak, E.

    1996-12-31

    By a "tree-bank grammar" we mean a context-free grammar created by reading the production rules directly from hand-parsed sentences in a tree bank. Common wisdom has it that such grammars do not perform well, though we know of no published data on the issue. The primary purpose of this paper is to show that the common wisdom is wrong. In particular, we present results on a tree-bank grammar based on the Penn Wall Street Journal tree bank. To the best of our knowledge, this grammar outperforms all other non-word-based statistical parsers/grammars on this corpus. That is, it outperforms parsers that consider the input as a string of tags and ignore the actual words of the corpus.
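
    Reading production rules directly from hand-parsed sentences can be sketched as follows. The bracketed tree notation, the two toy sentences, and the helper names are assumptions for illustration; the sketch only counts rules and does not reproduce the paper's parser or its results.

```python
from collections import Counter


def parse_tree(text):
    """Parse a bracketed tree, e.g. "(NP (DT the) (NN cat))".

    Returns nested (label, children) tuples; leaf words are plain strings.
    """
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def parse():
        nonlocal pos
        if tokens[pos] != "(":
            raise ValueError("expected '('")
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(parse())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1  # consume ")"
        return (label, children)

    return parse()


def productions(tree):
    """Yield one (lhs, rhs) rule for every internal node of the tree."""
    label, children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    yield (label, rhs)
    for child in children:
        if not isinstance(child, str):
            yield from productions(child)


# A two-sentence toy "tree bank", invented for this example.
treebank = [
    "(S (NP (DT the) (NN cat)) (VP (VBD ate) (NP (DT the) (NN pie))))",
    "(S (NP (DT the) (NN dog)) (VP (VBD slept)))",
]
rule_counts = Counter(
    rule for sentence in treebank for rule in productions(parse_tree(sentence))
)
for (lhs, rhs), count in sorted(rule_counts.items()):
    print(f"{lhs} -> {' '.join(rhs)}  [{count}]")
```

    The rule counts could then be normalized per left-hand side to obtain rule probabilities; that step is a common choice for statistical grammars and is mentioned here as an assumption, not as a detail taken from the abstract.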

  18. Leonardo's Tree Theory.

    ERIC Educational Resources Information Center

    Werner, Suzanne K.

    2003-01-01

    Describes a series of activities exploring Leonardo da Vinci's tree theory that are designed to strengthen 8th grade students' data collection and problem solving skills in physical science classes. (KHR)

  19. Tea tree oil.

    PubMed

    Larson, David; Jacob, Sharon E

    2012-01-01

    Tea tree oil is an increasingly popular ingredient in a variety of household and cosmetic products, including shampoos, massage oils, skin and nail creams, and laundry detergents. Known for its potential antiseptic properties, it has been shown to be active against a variety of bacteria, fungi, viruses, and mites. The oil is extracted from the leaves of the tea tree via steam distillation. This essential oil possesses a sharp camphoraceous odor followed by a menthol-like cooling sensation. Most commonly an ingredient in topical products, it is used at a concentration of 5% to 10%. Even at this concentration, it has been reported to induce contact sensitization and allergic contact dermatitis reactions. In 1999, tea tree oil was added to the North American Contact Dermatitis Group screening panel. The latest prevalence rates suggest that 1.4% of patients referred for patch testing had a positive reaction to tea tree oil.

  20. Generalized constructive tree weights

    SciTech Connect

    Rivasseau, Vincent; Tanasa, Adrian

    2014-04-15

    The Loop Vertex Expansion (LVE) is a quantum field theory (QFT) method which explicitly computes the Borel sum of Feynman perturbation series. This LVE relies in a crucial way on symmetric tree weights which define a measure on the set of spanning trees of any connected graph. In this paper we generalize this method by defining new tree weights. They depend on the choice of a partition of a set of vertices of the graph, and when the partition is non-trivial, they are no longer symmetric under permutation of vertices. Nevertheless we prove they have the required positivity property to lead to a convergent LVE; in fact we formulate this positivity property precisely for the first time. Our generalized tree weights are inspired by the Brydges-Battle-Federbush work on cluster expansions and could be particularly suited to the computation of connected functions in QFT. Several concrete examples are explicitly given.

  1. Mapping geogenic radon potential by regression kriging.

    PubMed

    Pásztor, László; Szabó, Katalin Zsuzsanna; Szatmári, Gábor; Laborczi, Annamária; Horváth, Ákos

    2016-02-15

    Radon ((222)Rn) gas is produced in the radioactive decay chain of uranium ((238)U) which is an element that is naturally present in soils. Radon is transported mainly by diffusion and convection mechanisms through the soil depending mainly on the physical and meteorological parameters of the soil and can enter and accumulate in buildings. Health risks originating from indoor radon concentration can be attributed to natural factors and are characterized by geogenic radon potential (GRP). Identification of areas with high health risks requires spatial modeling, that is, mapping of radon risk. In addition to geology and meteorology, physical soil properties play a significant role in the determination of GRP. In order to compile a reliable GRP map for a model area in Central-Hungary, spatial auxiliary information representing GRP-forming environmental factors was taken into account to support the spatial inference of the locally measured GRP values. Since the number of measured sites was limited, efficient spatial prediction methodologies were searched for to construct a reliable map for a larger area. Regression kriging (RK) was applied for the interpolation using spatially exhaustive auxiliary data on soil, geology, topography, land use and climate. RK divides the spatial inference into two parts. Firstly, the deterministic component of the target variable is determined by a regression model. The residuals of the multiple linear regression analysis represent the spatially varying but dependent stochastic component, which is interpolated by kriging. The final map is the sum of the two component predictions. Overall accuracy of the map was tested by Leave-One-Out Cross-Validation. Furthermore, the spatial reliability of the resultant map is also estimated by the calculation of the 90% prediction interval of the local prediction values. The applicability of the applied method as well as that of the map is discussed briefly.
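
    The two-part structure of regression kriging described above (a regression trend plus kriged residuals, summed) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the simple-kriging variant, and the exponential covariance with fixed `sill` and `corr_range` are assumptions made here; in practice the covariance parameters would be estimated from a residual variogram.

```python
import numpy as np

def regression_kriging(coords, X, z, coords_new, X_new, sill=1.0, corr_range=1.0):
    """Predict z at new sites as OLS trend + simple kriging of the OLS residuals.

    coords, coords_new: (n, 2) and (m, 2) site coordinates
    X, X_new:           (n, p) and (m, p) auxiliary variables
    z:                  (n,) measured target values
    """
    # Part 1: deterministic component from multiple linear regression (with intercept).
    A = np.column_stack([np.ones(len(z)), X])
    beta, *_ = np.linalg.lstsq(A, z, rcond=None)
    resid = z - A @ beta

    # Part 2: stochastic component by simple kriging of the residuals.
    # ASSUMPTION: exponential covariance C(h) = sill * exp(-h / corr_range).
    def cov(a, b):
        h = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
        return sill * np.exp(-h / corr_range)

    weights = np.linalg.solve(cov(coords, coords), cov(coords, coords_new))
    resid_new = weights.T @ resid

    # Final prediction: sum of the two component predictions.
    A_new = np.column_stack([np.ones(len(X_new)), X_new])
    return A_new @ beta + resid_new

# Hypothetical synthetic data, for illustration only.
rng = np.random.default_rng(0)
coords = rng.uniform(0.0, 10.0, size=(30, 2))
X = rng.normal(size=(30, 2))
z = 1.0 + 2.0 * X[:, 0] - X[:, 1] + np.sin(coords[:, 0])
pred_at_sites = regression_kriging(coords, X, z, coords[:5], X[:5])
```

    With no nugget term, kriging interpolates exactly, so predictions at the measured sites reproduce the measurements; a leave-one-out check would instead refit with each site withheld in turn.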

  2. Tree Topology Estimation

    PubMed Central

    Estrada, Rolando; Tomasi, Carlo; Schmidler, Scott C.; Farsiu, Sina

    2015-01-01

    Tree-like structures are fundamental in nature, and it is often useful to reconstruct the topology of a tree—what connects to what—from a two-dimensional image of it. However, the projected branches often cross in the image: the tree projects to a planar graph, and the inverse problem of reconstructing the topology of the tree from that of the graph is ill-posed. We regularize this problem with a generative, parametric tree-growth model. Under this model, reconstruction is possible in linear time if one knows the direction of each edge in the graph—which edge endpoint is closer to the root of the tree—but becomes NP-hard if the directions are not known. For the latter case, we present a heuristic search algorithm to estimate the most likely topology of a rooted, three-dimensional tree from a single two-dimensional image. Experimental results on retinal vessel, plant root, and synthetic tree datasets show that our methodology is both accurate and efficient. PMID:26353004

  3. Exposure and effects of perfluoroalkyl substances in tree swallows nesting in Minnesota and Wisconsin, USA

    USGS Publications Warehouse

    Custer, Christine M.; Custer, Thomas W.; Dummer, Paul; Etterson, Matthew A.; Thogmartin, Wayne E.; Wu, Qian; Kannan, Kurunthachalam; Trowbridge, Annette; McKann, Patrick C.

    2013-01-01

    The exposure and effects of perfluoroalkyl substances (PFASs) were studied at eight locations in Minnesota and Wisconsin between 2007 and 2011 using tree swallows (Tachycineta bicolor). Concentrations of PFASs were quantified as were reproductive success end points. The sample egg method was used wherein an egg sample is collected, and the hatching success of the remaining eggs in the nest is assessed. The association between PFAS exposure and reproductive success was assessed by site comparisons, logistic regression analysis, and multistate modeling, a technique not previously used in this context. There was a negative association between concentrations of perfluorooctane sulfonate (PFOS) in eggs and hatching success. The concentration at which effects became evident (150–200 ng/g wet weight) was far lower than effect levels found in laboratory feeding trials or egg-injection studies of other avian species. This discrepancy likely arose because behavioral effects and other extrinsic factors are not accounted for in these laboratory studies, and possibly because tree swallows are unusually sensitive to PFASs. The results from multistate modeling and simple logistic regression analyses were nearly identical. Multistate modeling provides a better method to examine possible effects of additional covariates and assessment of models using Akaike information criteria analyses. There was a credible association between PFOS concentrations in plasma and eggs, so extrapolation between these two commonly sampled tissues can be performed.

  4. Recognition of caudal regression syndrome.

    PubMed

    Boulas, Mari M

    2009-04-01

    Caudal regression syndrome, also referred to as caudal dysplasia and sacral agenesis syndrome, is a rare congenital malformation characterized by varying degrees of developmental failure early in gestation. It involves the lower extremities, the lumbar and coccygeal vertebrae, and corresponding segments of the spinal cord. This is a rare disorder, and true pathogenesis is unclear. The etiology is thought to be related to maternal diabetes, genetic predisposition, and vascular hypoperfusion, but no true causative factor has been determined. Fetal diagnostic tools allow for early recognition of the syndrome, and careful examination of the newborn is essential to determine the extent of the disorder. Associated organ system dysfunction depends on the severity of the disease. Related defects are structural, and systematic problems including respiratory, cardiac, gastrointestinal, urinary, orthopedic, and neurologic can be present in varying degrees of severity and in different combinations. A multidisciplinary approach to management is crucial. Because the primary pathology is irreversible, treatment is only supportive.

  5. Practical Session: Multiple Linear Regression

    NASA Astrophysics Data System (ADS)

    Clausel, M.; Grégoire, G.

    2014-12-01

    Three exercises are proposed to illustrate simple linear regression. The first one investigates the influence of several factors on atmospheric pollution. It was proposed by D. Chessel and A.B. Dufour in Lyon 1 (see Sect. 6 of http://pbil.univ-lyon1.fr/R/pdf/tdr33.pdf) and is based on data from 20 U.S. cities. Exercise 2 is an introduction to model selection, whereas Exercise 3 provides a first example of analysis of variance. Exercises 2 and 3 were proposed by A. Dalalyan at ENPC (see Exercises 2 and 3 of http://certis.enpc.fr/~dalalyan/Download/TP_ENPC_5.pdf).

  6. Anatomy of the Pythagoras' Tree

    ERIC Educational Resources Information Center

    Teia, Luis

    2016-01-01

    The architecture of nature can be seen at play in a tree: no two are alike. The Pythagoras' tree behaves just as a "tree" in that the root plus the same movement repeated over and over again grows from a seed, to a plant, to a tree. In human life, this movement is termed cell division. With triples, this movement is a geometrical and…

  7. How Trees Can Save Energy.

    ERIC Educational Resources Information Center

    Fazio, James R., Ed.

    1991-01-01

    This document might easily have been called "How To Use Trees To Save Energy". It presents the energy saving advantages of landscaping the home and community with trees. The discussion includes: (1) landscaping advice to obtain the benefits of tree shade; (2) the heat island phenomenon in cities; (3) how and where to properly plant trees for…

  8. State Trees and Arbor Days.

    ERIC Educational Resources Information Center

    Forest Service (USDA), Washington, DC.

    Provides information on state trees for each of the 50 states and the District of Columbia. Includes for each state: (1) year in which state tree was chosen; (2) common and scientific names of the tree; (3) arbor day observance; (4) address of state forester; and (5) drawings of the tree, leaf, and fruit or cone. (JN)

  9. Lumbar herniated disc: spontaneous regression

    PubMed Central

    Yüksel, Kasım Zafer

    2017-01-01

    Background Low back pain is a frequent condition that results in substantial disability and causes admission of patients to neurosurgery clinics. The aim was to evaluate and present the therapeutic outcomes in lumbar disc hernia (LDH) patients treated by means of a conservative approach, consisting of bed rest and medical therapy. Methods This retrospective cohort study was carried out in the neurosurgery departments of hospitals in Kahramanmaraş city and 23 patients diagnosed with LDH at the levels of L3−L4, L4−L5 or L5−S1 were enrolled. Results The average age was 38.4 ± 8.0 and the chief complaint was low back pain and sciatica radiating to one or both lower extremities. Conservative treatment was administered. Neurological examination findings, durations of treatment and intervals until symptomatic recovery were recorded. Lasègue tests and neurosensory examination revealed that mild neurological deficits existed in 16 of our patients. Previously, 5 patients had received physiotherapy and 7 patients had been on medical treatment. The numbers of patients with LDH at the level of L3−L4, L4−L5, and L5−S1 were 1, 13, and 9, respectively. All patients reported that they had benefited from medical treatment and bed rest, and radiologic improvement was observed simultaneously on MRI scans. The average duration until symptomatic recovery and/or regression of LDH symptoms was 13.6 ± 5.4 months (range: 5−22). Conclusions It should be kept in mind that lumbar disc hernias could regress with medical treatment and rest without surgery, and there should be an awareness that these patients could recover radiologically. This condition must be taken into account during decision making for surgical intervention in LDH patients devoid of indications for emergent surgery. PMID:28119770

  10. A Gibbs sampler for multivariate linear regression

    NASA Astrophysics Data System (ADS)

    Mantz, Adam B.

    2016-04-01

    Kelly described an efficient algorithm, using Gibbs sampling, for performing linear regression in the fairly general case where non-zero measurement errors exist for both the covariates and response variables, where these measurements may be correlated (for the same data point), where the response variable is affected by intrinsic scatter in addition to measurement error, and where the prior distribution of covariates is modelled by a flexible mixture of Gaussians rather than assumed to be uniform. Here, I extend the Kelly algorithm in two ways. First, the procedure is generalized to the case of multiple response variables. Secondly, I describe how to model the prior distribution of covariates using a Dirichlet process, which can be thought of as a Gaussian mixture where the number of mixture components is learned from the data. I present an example of multivariate regression using the extended algorithm, namely fitting scaling relations of the gas mass, temperature, and luminosity of dynamically relaxed galaxy clusters as a function of their mass and redshift. An implementation of the Gibbs sampler in the R language, called LRGS, is provided.

  11. Scientific Progress or Regress in Sports Physiology?

    PubMed

    Böning, Dieter

    2016-11-01

    In modern societies there is strong belief in scientific progress, but, unfortunately, a parallel partial regress occurs because of often avoidable mistakes. Mistakes are mainly forgetting, erroneous theories, errors in experiments and manuscripts, prejudice, selected publication of "positive" results, and fraud. An example of forgetting is that methods introduced decades ago are used without knowing the underlying theories: Basic articles are no longer read or cited. This omission may cause incorrect interpretation of results. For instance, false use of actual base excess instead of standard base excess for calculation of the number of hydrogen ions leaving the muscles raised the idea that an unknown fixed acid is produced in addition to lactic acid during exercise. An erroneous theory led to the conclusion that lactate is not the anion of a strong acid but a buffer. Mistakes occur after incorrect application of a method, after exclusion of unwelcome values, during evaluation of measurements by false calculations, or during preparation of manuscripts. Co-authors, as well as reviewers, do not always carefully read papers before publication. Peer reviewers might be biased against a hypothesis or an author. A general problem is selected publication of positive results. An example of fraud in sports medicine is the presence of doped subjects in groups of investigated athletes. To reduce regress, it is important that investigators search both original and recent articles on a topic and conscientiously examine the data. All co-authors and reviewers should read the text thoroughly and inspect all tables and figures in a manuscript.

  12. Tree growth and competition in an old-growth Picea abies forest of boreal Sweden: influence of tree spatial patterning

    USGS Publications Warehouse

    Fraver, Shawn; D'Amato, Anthony W.; Bradford, John B.; Jonsson, Bengt Gunnar; Jönsson, Mari; Esseen, Per-Anders

    2013-01-01

    Question: What factors best characterize tree competitive environments in this structurally diverse old-growth forest, and do these factors vary spatially within and among stands? Location: Old-growth Picea abies forest of boreal Sweden. Methods: Using long-term, mapped permanent plot data augmented with dendrochronological analyses, we evaluated the effect of neighbourhood competition on focal tree growth by means of standard competition indices, each modified to include various metrics of tree size, neighbour mortality weighting (for neighbours that died during the inventory period), and within-neighbourhood tree clustering. Candidate models were evaluated using mixed-model linear regression analyses, with mean basal area increment as the response variable. We then analysed stand-level spatial patterns of competition indices and growth rates (via kriging) to determine if the relationship between these patterns could further elucidate factors influencing tree growth. Results: Inter-tree competition clearly affected growth rates, with crown volume being the size metric most strongly influencing the neighbourhood competitive environment. Including neighbour tree mortality weightings in models only slightly improved descriptions of competitive interactions. Although the within-neighbourhood clustering index did not improve model predictions, competition intensity was influenced by the underlying stand-level tree spatial arrangement: stand-level clustering locally intensified competition and reduced tree growth, whereas in the absence of such clustering, inter-tree competition played a lesser role in constraining tree growth. Conclusions: Our findings demonstrate that competition continues to influence forest processes and structures in an old-growth system that has not experienced major disturbances for at least two centuries. The finding that the underlying tree spatial pattern influenced the competitive environment suggests caution in interpreting traditional tree

  13. Early development and regression in Rett syndrome.

    PubMed

    Lee, J Y L; Leonard, H; Piek, J P; Downs, J

    2013-12-01

    This study utilized developmental profiling to examine symptoms in 14 girls with genetically confirmed Rett syndrome and whose families were participating in the Australian Rett syndrome or InterRett database. Regression was mostly characterized by loss of hand and/or communication skills (13/14) except one girl demonstrated slowing of skill development. Social withdrawal and inconsolable crying often developed simultaneously (9/14), with social withdrawal for shorter duration than inconsolable crying. Previously acquired gross motor skills declined in just over half of the sample (8/14), mostly observed as a loss of balance. Early abnormalities such as vomiting and strabismus were also seen. Our findings provide additional insight into the early clinical profile of Rett syndrome.

  14. How tree roots respond to drought

    PubMed Central

    Brunner, Ivano; Herzog, Claude; Dawes, Melissa A.; Arend, Matthias; Sperisen, Christoph

    2015-01-01

    The ongoing climate change is characterized by increased temperatures and altered precipitation patterns. In addition, there has been an increase in both the frequency and intensity of extreme climatic events such as drought. Episodes of drought induce a series of interconnected effects, all of which have the potential to alter the carbon balance of forest ecosystems profoundly at different scales of plant organization and ecosystem functioning. During recent years, considerable progress has been made in the understanding of how aboveground parts of trees respond to drought and how these responses affect carbon assimilation. In contrast, processes of belowground parts are relatively underrepresented in research on climate change. In this review, we describe current knowledge about responses of tree roots to drought. Tree roots are capable of responding to drought through a variety of strategies that enable them to avoid and tolerate stress. Responses include root biomass adjustments, anatomical alterations, and physiological acclimations. The molecular mechanisms underlying these responses are characterized to some extent, and involve stress signaling and the induction of numerous genes, leading to the activation of tolerance pathways. In addition, mycorrhizas seem to play important protective roles. The current knowledge compiled in this review supports the view that tree roots are well equipped to withstand drought situations and maintain morphological and physiological functions as long as possible. Further, the reviewed literature demonstrates the important role of tree roots in the functioning of forest ecosystems and highlights the need for more research in this emerging field. PMID:26284083

  15. MetaTreeMap: An Alternative Visualization Method for Displaying Metagenomic Phylogenic Trees

    PubMed Central

    Taylor, Todd D.

    2016-01-01

    Metagenomic samples can contain hundreds or thousands of different species. The most common method to identify these species is to sequence the samples and then classify the reads to nodes along a phylogenic tree. Linear representations of trees with so many nodes face legibility issues. In addition, such views are not optimal for appreciating the read quantity assigned to each node. The problem is exaggerated when comparison between multiple samples is needed. MetaTreeMap adapts a visualization method that addresses these weaknesses. The tree is represented by nested rectangles that illustrate the number or percentage of assigned reads. MetaTreeMap implements various options specific to phylogenic trees that allow for quick overview and investigation of the information. More generally, the goal of this software is to provide the user with the ability to easily display phylogenic trees based on various quantities assigned to the nodes, such as read number, percentage or other values. The tool can be used online at http://metasystems.riken.jp/visualization/treemap/. PMID:27336370

  16. Molecular and physiological responses to abiotic stress in forest trees and their relevance to tree improvement.

    PubMed

    Harfouche, Antoine; Meilan, Richard; Altman, Arie

    2014-11-01

    Abiotic stresses, such as drought, salinity and cold, are the major environmental stresses that adversely affect tree growth and, thus, forest productivity, and play a major role in determining the geographic distribution of tree species. Tree responses and tolerance to abiotic stress are complex biological processes that are best analyzed at a systems level using genetic, genomic, metabolomic and phenomic approaches. This will expedite the dissection of stress-sensing and signaling networks to further support efficient genetic improvement programs. Enormous genetic diversity for stress tolerance exists within some forest-tree species, and due to advances in sequencing technologies the molecular genetic basis for this diversity has been rapidly unfolding in recent years. In addition, the use of emerging phenotyping technologies extends the suite of traits that can be measured and will provide us with a better understanding of stress tolerance. The elucidation of abiotic stress-tolerance mechanisms will allow for effective pyramiding of multiple tolerances in a single tree through genetic engineering. Here we review recent progress in the dissection of the molecular basis of abiotic stress tolerance in forest trees, with special emphasis on Populus, Pinus, Picea, Eucalyptus and Quercus spp. We also outline practices that will enable the deployment of trees engineered for abiotic stress tolerance to land owners. Finally, recommendations for future work are discussed.

  17. Tree testing of hierarchical menu structures for health applications.

    PubMed

    Le, Thai; Chaudhuri, Shomir; Chung, Jane; Thompson, Hilaire J; Demiris, George

    2014-06-01

    To address the need for greater evidence-based evaluation of Health Information Technology (HIT) systems we introduce a method of usability testing termed tree testing. In a tree test, participants are presented with an abstract hierarchical tree of the system taxonomy and asked to navigate through the tree in completing representative tasks. We apply tree testing to a commercially available health application, demonstrating a use case and providing a comparison with more traditional in-person usability testing methods. Online tree tests (N=54) and in-person usability tests (N=15) were conducted from August to September 2013. Tree testing provided a method to quantitatively evaluate the information structure of a system using various navigational metrics including completion time, task accuracy, and path length. The results of the analyses compared favorably to the results seen from the traditional usability test. Tree testing provides a flexible, evidence-based approach for researchers to evaluate the information structure of HITs. In addition, remote tree testing provides a quick, flexible, and high volume method of acquiring feedback in a structured format that allows for quantitative comparisons. With the diverse nature and often large quantities of health information available, addressing issues of terminology and concept classifications during the early development process of a health information system will improve navigation through the system and save future resources. Tree testing is a usability method that can be used to quickly and easily assess information hierarchy of health information systems.

  18. Tree Testing of Hierarchical Menu Structures for Health Applications

    PubMed Central

    Le, Thai; Chaudhuri, Shomir; Chung, Jane; Thompson, Hilaire J; Demiris, George

    2014-01-01

    To address the need for greater evidence-based evaluation of Health Information Technology (HIT) systems we introduce a method of usability testing termed tree testing. In a tree test, participants are presented with an abstract hierarchical tree of the system taxonomy and asked to navigate through the tree in completing representative tasks. We apply tree testing to a commercially available health application, demonstrating a use case and providing a comparison with more traditional in-person usability testing methods. Online tree tests (N=54) and in-person usability tests (N=15) were conducted from August to September 2013. Tree testing provided a method to quantitatively evaluate the information structure of a system using various navigational metrics including completion time, task accuracy, and path length. The results of the analyses compared favorably to the results seen from the traditional usability test. Tree testing provides a flexible, evidence-based approach for researchers to evaluate the information structure of HITs. In addition, remote tree testing provides a quick, flexible, and high volume method of acquiring feedback in a structured format that allows for quantitative comparisons. With the diverse nature and often large quantities of health information available, addressing issues of terminology and concept classifications during the early development process of a health information system will improve navigation through the system and save future resources. Tree testing is a usability method that can be used to quickly and easily assess information hierarchy of health information systems. PMID:24582924

  19. Genetics Home Reference: caudal regression syndrome

    MedlinePlus

    ... of a genetic condition? Genetic and Rare Diseases Information Center Frequency Caudal regression syndrome is estimated to occur in 1 to ... parts of the skeleton, gastrointestinal system, and genitourinary ... caudal regression syndrome results from the presence of an abnormal ...

  20. Meteorological Factors and Tree Characteristics Influencing the Initiation and Rate of Stemflow from Deciduous Trees in an Urban Park

    NASA Astrophysics Data System (ADS)

    Schooling, J. T.; Carlyle-Moses, D. E.

    2013-12-01

    Stemflow, SF, represents that portion of precipitation that is intercepted by a tree's canopy and diverted to the ground at the tree base by flowing along branches and down the bole. The focused input of water and nutrients associated with SF has been shown to be of hydrological and biogeochemical importance in a number of plant communities and forest environments. Although the concentrated water volume and the nutrient / pollutant fluxes associated with SF in urban areas may be highly relevant for stormwater quantity and quality management, they have received only minor study in built environments. In an urban park in Kamloops, British Columbia, Canada, SF volumes generated from 40 deciduous trees representing 22 species were sampled on a precipitation event basis over a period of 16 months. Using these data, we derived the threshold rainfall depth required for SF initiation from each tree by taking the absolute value of the y-intercept of the linear regression of SF volume versus rainfall depth divided by the slope of that regression. The SF discharge rate once the threshold rainfall depth had been reached was taken as the slope of the linear regression equation. Thus, a simplified SF equation was developed: SFv = QSF × (Pg − Pg'), where SFv is stemflow volume (litres), QSF is the discharge rate (litres / mm), and Pg and Pg' represent the precipitation depth and the threshold precipitation depth, respectively. We then examined the influence of meteorological factors (precipitation type [rain / snow / rain + snow], precipitation depth, rainfall intensity, wind speed and direction, and vapour pressure deficit), and tree characteristics (tree diameter at breast height, tree height, leaf size and orientation, bark roughness, crown projection area, leaf area index, canopy cover fraction, branching angle, the proportion of the crown that was comprised of branches, and overlap with other tree canopies) on QSF and Pg' in order to expand on the simplified model and
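
    The derivation of the discharge rate and threshold rainfall depth from a per-tree linear regression, as described in this abstract, can be sketched as below. The function names and the event data are hypothetical; clamping the volume to zero below the threshold is an assumption added here, not something the abstract states.

```python
import numpy as np

def stemflow_parameters(rain_mm, sf_litres):
    """Fit SF volume against rainfall depth for one tree.

    Returns (q_sf, pg_threshold): the discharge rate in litres per mm (the slope)
    and the threshold rainfall depth in mm (|y-intercept| divided by the slope).
    """
    slope, intercept = np.polyfit(rain_mm, sf_litres, 1)
    return slope, abs(intercept) / slope

def stemflow_volume(pg_mm, q_sf, pg_threshold):
    """Simplified model SFv = QSF * (Pg - Pg').

    ASSUMPTION: no stemflow is generated below the threshold, so the
    result is clamped at zero.
    """
    return q_sf * np.maximum(np.asarray(pg_mm, dtype=float) - pg_threshold, 0.0)

# Hypothetical event data for a single tree (not from the study).
rain = np.array([2.0, 4.0, 6.0, 8.0, 10.0])   # rainfall depth, mm
sf = 1.5 * (rain - 1.2)                       # stemflow volume, litres
q_sf, pg_threshold = stemflow_parameters(rain, sf)
predicted = stemflow_volume([0.5, 5.0], q_sf, pg_threshold)
```

    Because the regression line crosses zero volume at a rainfall depth of minus the intercept over the slope, the absolute value in the abstract's definition simply reflects that the intercept is expected to be negative.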

  1. The Fault Tree Compiler (FTC): Program and mathematics

    NASA Technical Reports Server (NTRS)

    Butler, Ricky W.; Martensen, Anna L.

    1989-01-01

    The Fault Tree Compiler Program is a new reliability tool used to predict the top-event probability for a fault tree. Five different gate types are allowed in the fault tree: AND, OR, EXCLUSIVE OR, INVERT, and m OF n gates. The high-level input language is easy to understand and use when describing the system tree. In addition, the use of the hierarchical fault tree capability can simplify the tree description and decrease program execution time. The current solution technique provides an answer precisely (within the limits of double precision floating point arithmetic) within a user specified number of digits accuracy. The user may vary one failure rate or failure probability over a range of values and plot the results for sensitivity analyses. The solution technique is implemented in FORTRAN; the remaining program code is implemented in Pascal. The program is written to run on a Digital Equipment Corporation (DEC) VAX computer with the VMS operating system.
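
    To make the five gate types concrete, the sketch below computes gate output probabilities and combines them into a top-event probability for a small tree. It is not the FTC solution technique, which the abstract does not describe. It assumes that all inputs to a gate are statistically independent, that EXCLUSIVE OR takes exactly two inputs, and that an m OF n gate fires when at least m of its n inputs occur; the function name and the example tree are hypothetical.

```python
from itertools import combinations
from math import prod

def gate_probability(kind, probs, m=None):
    """Output probability of one gate whose inputs are independent events."""
    if kind == "AND":
        return prod(probs)
    if kind == "OR":
        return 1.0 - prod(1.0 - p for p in probs)
    if kind == "INVERT":
        (p,) = probs
        return 1.0 - p
    if kind == "XOR":
        a, b = probs  # assumed two-input gate
        return a * (1.0 - b) + b * (1.0 - a)
    if kind == "M_OF_N":
        # Assumed meaning: at least m of the n inputs occur.
        n = len(probs)
        total = 0.0
        for k in range(m, n + 1):
            for on in combinations(range(n), k):
                total += prod(
                    probs[i] if i in on else 1.0 - probs[i] for i in range(n)
                )
        return total
    raise ValueError(f"unknown gate type: {kind}")

# Hypothetical tree: TOP = OR( AND(a, b), 2-of-3(c, d, e) ).
and_gate = gate_probability("AND", [0.1, 0.2])
vote_gate = gate_probability("M_OF_N", [0.1, 0.1, 0.1], m=2)
top_event = gate_probability("OR", [and_gate, vote_gate])
```

    Bottom-up evaluation like this is only valid when no basic event feeds more than one branch; shared events break the independence assumption and need a different method.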

  2. Semiparametric regression during 2003–2007*

    PubMed Central

    Ruppert, David; Wand, M.P.; Carroll, Raymond J.

    2010-01-01

    Semiparametric regression is a fusion between parametric regression and nonparametric regression that integrates low-rank penalized splines, mixed model and hierarchical Bayesian methodology – thus allowing more streamlined handling of longitudinal and spatial correlation. We review progress in the field over the five-year period between 2003 and 2007. We find semiparametric regression to be a vibrant field with substantial involvement and activity, continual enhancement and widespread application. PMID:20305800

  3. [Spatial interpolation of soil organic matter using regression Kriging and geographically weighted regression Kriging].

    PubMed

    Yang, Shun-hua; Zhang, Hai-tao; Guo, Long; Ren, Yan

    2015-06-01

    Relative elevation and stream power index were selected as auxiliary variables based on correlation analysis for mapping soil organic matter. Geographically weighted regression Kriging (GWRK) and regression Kriging (RK) were used for spatial interpolation of soil organic matter and compared with ordinary Kriging (OK), which acts as a control. The results indicated that soil organic matter was significantly positively correlated with relative elevation whilst it had a significantly negative correlation with stream power index. Semivariance analysis showed that both soil organic matter content and its residuals (including ordinary least square regression residual and GWR residual) had strong spatial autocorrelation. Interpolation accuracies by different methods were estimated based on a data set of 98 validation samples. Results showed that the mean error (ME), mean absolute error (MAE) and root mean square error (RMSE) of RK were respectively 39.2%, 17.7% and 20.6% lower than the corresponding values of OK, with a relative-improvement (RI) of 20.63. GWRK showed a similar tendency, having its ME, MAE and RMSE to be respectively 60.6%, 23.7% and 27.6% lower than those of OK, with a RI of 59.79. Therefore, both RK and GWRK significantly improved the accuracy of OK interpolation of soil organic matter due to their incorporation of auxiliary variables. In addition, GWRK performed obviously better than RK did in this study, and its improved performance should be attributed to the consideration of sample spatial locations.
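
    The three validation statistics named in this abstract, and the "percent lower than OK" comparison, can be computed as in the sketch below. The function names and numbers are hypothetical, the sign convention for ME (predicted minus observed) is an assumption, and the abstract's RI statistic is omitted because its formula is not given.

```python
import numpy as np

def error_metrics(observed, predicted):
    """Mean error, mean absolute error and root mean square error."""
    err = np.asarray(predicted, dtype=float) - np.asarray(observed, dtype=float)
    return {
        "ME": float(err.mean()),                    # signed bias (assumed convention)
        "MAE": float(np.abs(err).mean()),
        "RMSE": float(np.sqrt((err ** 2).mean())),
    }

def percent_reduction(baseline, candidate):
    """How much lower `candidate` is than `baseline`, in percent of `baseline`."""
    return 100.0 * (baseline - candidate) / baseline

# Hypothetical validation values, for illustration only.
metrics = error_metrics(observed=[1.0, 2.0, 3.0], predicted=[2.0, 2.0, 1.0])
rmse_gain = percent_reduction(baseline=2.0, candidate=1.5)
```

    Comparing each method's metrics against the ordinary kriging baseline in this way gives figures of the same kind as the percentage reductions quoted in the abstract.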

  4. Reinforcement Learning Trees.

    PubMed

    Zhu, Ruoqing; Zeng, Donglin; Kosorok, Michael R

    In this paper, we introduce a new type of tree-based method, reinforcement learning trees (RLT), which exhibits significantly improved performance over traditional methods such as random forests (Breiman, 2001) under high-dimensional settings. The innovations are three-fold. First, the new method implements reinforcement learning at each selection of a splitting variable during the tree construction processes. By splitting on the variable that brings the greatest future improvement in later splits, rather than choosing the one with largest marginal effect from the immediate split, the constructed tree utilizes the available samples in a more efficient way. Moreover, such an approach enables linear combination cuts at little extra computational cost. Second, we propose a variable muting procedure that progressively eliminates noise variables during the construction of each individual tree. The muting procedure also takes advantage of reinforcement learning and prevents noise variables from being considered in the search for splitting rules, so that towards terminal nodes, where the sample size is small, the splitting rules are still constructed from only strong variables. Last, we investigate asymptotic properties of the proposed method under basic assumptions and discuss rationale in general settings.

  5. Reinforcement Learning Trees

    PubMed Central

    Zhu, Ruoqing; Zeng, Donglin; Kosorok, Michael R.

    2015-01-01

    In this paper, we introduce a new type of tree-based method, reinforcement learning trees (RLT), which exhibits significantly improved performance over traditional methods such as random forests (Breiman, 2001) under high-dimensional settings. The innovations are three-fold. First, the new method implements reinforcement learning at each selection of a splitting variable during the tree construction processes. By splitting on the variable that brings the greatest future improvement in later splits, rather than choosing the one with largest marginal effect from the immediate split, the constructed tree utilizes the available samples in a more efficient way. Moreover, such an approach enables linear combination cuts at little extra computational cost. Second, we propose a variable muting procedure that progressively eliminates noise variables during the construction of each individual tree. The muting procedure also takes advantage of reinforcement learning and prevents noise variables from being considered in the search for splitting rules, so that towards terminal nodes, where the sample size is small, the splitting rules are still constructed from only strong variables. Last, we investigate asymptotic properties of the proposed method under basic assumptions and discuss rationale in general settings. PMID:26903687

  6. Bayesian Unimodal Density Regression for Causal Inference

    ERIC Educational Resources Information Center

    Karabatsos, George; Walker, Stephen G.

    2011-01-01

    Karabatsos and Walker (2011) introduced a new Bayesian nonparametric (BNP) regression model. Through analyses of real and simulated data, they showed that the BNP regression model outperforms other parametric and nonparametric regression models of common use, in terms of predictive accuracy of the outcome (dependent) variable. The other,…

  7. Developmental Regression in Autism Spectrum Disorders

    ERIC Educational Resources Information Center

    Rogers, Sally J.

    2004-01-01

    The occurrence of developmental regression in autism is one of the more puzzling features of this disorder. Although several studies have documented the validity of parental reports of regression using home videos, accumulating data suggest that most children who demonstrate regression also demonstrated previous, subtle, developmental differences.…

  8. Standards for Standardized Logistic Regression Coefficients

    ERIC Educational Resources Information Center

    Menard, Scott

    2011-01-01

    Standardized coefficients in logistic regression analysis have the same utility as standardized coefficients in linear regression analysis. Although there has been no consensus on the best way to construct standardized logistic regression coefficients, there is now sufficient evidence to suggest a single best approach to the construction of a…

  9. Regression Analysis by Example. 5th Edition

    ERIC Educational Resources Information Center

    Chatterjee, Samprit; Hadi, Ali S.

    2012-01-01

    Regression analysis is a conceptually simple method for investigating relationships among variables. Carrying out a successful application of regression analysis, however, requires a balance of theoretical results, empirical rules, and subjective judgment. "Regression Analysis by Example, Fifth Edition" has been expanded and thoroughly…

  10. Synthesizing Regression Results: A Factored Likelihood Method

    ERIC Educational Resources Information Center

    Wu, Meng-Jia; Becker, Betsy Jane

    2013-01-01

    Regression methods are widely used by researchers in many fields, yet methods for synthesizing regression results are scarce. This study proposes using a factored likelihood method, originally developed to handle missing data, to appropriately synthesize regression models involving different predictors. This method uses the correlations reported…

  11. Streamflow forecasting using functional regression

    NASA Astrophysics Data System (ADS)

    Masselot, Pierre; Dabo-Niang, Sophie; Chebana, Fateh; Ouarda, Taha B. M. J.

    2016-07-01

    Streamflow, as a natural phenomenon, is continuous in time and so are the meteorological variables which influence its variability. In practice, it can be of interest to forecast the whole flow curve instead of points (daily or hourly). To this end, this paper introduces functional linear models and adapts them to hydrological forecasting. More precisely, functional linear models are regression models based on curves instead of single values. They make it possible to consider the whole process instead of a limited number of time points or features. We apply these models to analyse the flow volume and the whole streamflow curve during a given period by using precipitation curves. The functional model is shown to lead to encouraging results. The potential of functional linear models to detect special features that would have been hard to see otherwise is pointed out. The functional model is also compared to the artificial neural network approach and the advantages and disadvantages of both models are discussed. Finally, future research directions involving the functional model in hydrology are presented.

  12. Survival analysis and Cox regression.

    PubMed

    Benítez-Parejo, N; Rodríguez del Águila, M M; Pérez-Vicente, S

    2011-01-01

    The data provided by clinical trials are often expressed in terms of survival. The analysis of survival comprises a series of statistical analytical techniques in which the measurements analysed represent the time elapsed between a given exposure and the outcome of a certain event. Despite the name of these techniques, the outcome in question does not necessarily have to be either survival or death, and may be healing versus no healing, relief versus pain, complication versus no complication, relapse versus no relapse, etc. The present article describes the analysis of survival from both a descriptive perspective, based on the Kaplan-Meier estimation method, and in terms of bivariate comparisons using the log-rank statistic. Likewise, a description is provided of the Cox regression models for the study of risk factors or covariables associated to the probability of survival. These models are defined in both simple and multiple forms, and a description is provided of how they are calculated and how the postulates for application are checked - accompanied by illustrating examples with the shareware application R.
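
The Kaplan-Meier method mentioned in the abstract above can be illustrated with a short sketch. This is a generic product-limit estimator written for illustration, not code from the article (which uses R); the function name and the toy follow-up data are assumptions.

```python
def kaplan_meier(times, events):
    """Product-limit (Kaplan-Meier) survival estimate.

    times  : follow-up time for each subject
    events : 1 if the event was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Subjects censored at time t still count as at risk at t.
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= removed
    return curve


# Hypothetical data: five subjects, two of them censored.
times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"t={t}: S(t)={s:.2f}")
```

The estimate only changes at observed event times; censored subjects reduce the number at risk without producing a step in the curve.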

  13. Estimating equivalence with quantile regression

    USGS Publications Warehouse

    Cade, B.S.

    2011-01-01

    Equivalence testing and corresponding confidence interval estimates are used to provide more enlightened statistical statements about parameter estimates by relating them to intervals of effect sizes deemed to be of scientific or practical importance rather than just to an effect size of zero. Equivalence tests and confidence interval estimates are based on a null hypothesis that a parameter estimate is either outside (inequivalence hypothesis) or inside (equivalence hypothesis) an equivalence region, depending on the question of interest and assignment of risk. The former approach, often referred to as bioequivalence testing, is often used in regulatory settings because it reverses the burden of proof compared to a standard test of significance, following a precautionary principle for environmental protection. Unfortunately, many applications of equivalence testing focus on establishing average equivalence by estimating differences in means of distributions that do not have homogeneous variances. I discuss how to compare equivalence across quantiles of distributions using confidence intervals on quantile regression estimates that detect differences in heterogeneous distributions missed by focusing on means. I used one-tailed confidence intervals based on inequivalence hypotheses in a two-group treatment-control design for estimating bioequivalence of arsenic concentrations in soils at an old ammunition testing site and bioequivalence of vegetation biomass at a reclaimed mining site. Two-tailed confidence intervals based both on inequivalence and equivalence hypotheses were used to examine quantile equivalence for negligible trends over time for a continuous exponential model of amphibian abundance. © 2011 by the Ecological Society of America.

  14. Long tree-ring chronologies provide evidence of recent tree growth decrease in a Central African tropical forest.

    PubMed

    Battipaglia, Giovanna; Zalloni, Enrica; Castaldi, Simona; Marzaioli, Fabio; Cazzolla-Gatti, Roberto; Lasserre, Bruno; Tognetti, Roberto; Marchetti, Marco; Valentini, Riccardo

    2015-01-01

    It is still unclear whether the exponential rise of atmospheric CO2 concentration has produced a fertilization effect on tropical forests, thus incrementing their growth rate, in the last two centuries. As many factors affect tree growth patterns, short-term studies might be influenced by the confounding effect of several interacting environmental variables on plant growth. Long-term analyses of tree growth can elucidate long-term trends of plant growth response to dominant drivers. The study of annual rings, applied to long tree-ring chronologies in tropical forest trees, enables such analysis. Long-term tree-ring chronologies of three widespread African species were measured in Central Africa to analyze the growth of trees over the last two centuries. Growth trends were correlated to changes in global atmospheric CO2 concentration and local variations in the main climatic drivers, temperature and rainfall. Our results provided no evidence for a fertilization effect of CO2 on tree growth. On the contrary, an overall growth decline was observed for all three species in the last century, which appears to be significantly correlated to the increase in local temperature. These findings provide additional support to the global observations of a slowing down of C sequestration in the trunks of forest trees in recent decades. Data indicate that the CO2 increase alone has not been sufficient to obtain a tree growth increase in tropical trees. The effect of other changing environmental factors, like temperature, may have overridden the fertilization effect of CO2.

  15. Long Tree-Ring Chronologies Provide Evidence of Recent Tree Growth Decrease in a Central African Tropical Forest

    PubMed Central

    Battipaglia, Giovanna; Zalloni, Enrica; Castaldi, Simona; Marzaioli, Fabio; Cazzolla-Gatti, Roberto; Lasserre, Bruno; Tognetti, Roberto; Marchetti, Marco; Valentini, Riccardo

    2015-01-01

    It is still unclear whether the exponential rise of atmospheric CO2 concentration has produced a fertilization effect on tropical forests, thus incrementing their growth rate, in the last two centuries. As many factors affect tree growth patterns, short-term studies might be influenced by the confounding effect of several interacting environmental variables on plant growth. Long-term analyses of tree growth can elucidate long-term trends of plant growth response to dominant drivers. The study of annual rings, applied to long tree-ring chronologies in tropical forest trees, enables such analysis. Long-term tree-ring chronologies of three widespread African species were measured in Central Africa to analyze the growth of trees over the last two centuries. Growth trends were correlated to changes in global atmospheric CO2 concentration and local variations in the main climatic drivers, temperature and rainfall. Our results provided no evidence for a fertilization effect of CO2 on tree growth. On the contrary, an overall growth decline was observed for all three species in the last century, which appears to be significantly correlated to the increase in local temperature. These findings provide additional support to the global observations of a slowing down of C sequestration in the trunks of forest trees in recent decades. Data indicate that the CO2 increase alone has not been sufficient to obtain a tree growth increase in tropical trees. The effect of other changing environmental factors, like temperature, may have overridden the fertilization effect of CO2. PMID:25806946

  16. Basal physiological parameters in domesticated tree shrews (Tupaia belangeri chinensis).

    PubMed

    Wang, Jing; Xu, Xin-Li; Ding, Ze-Yang; Mao, Rong-Rong; Zhou, Qi-Xin; Lü, Long-Bao; Wang, Li-Ping; Wang, Shuang; Zhang, Chen; Xu, Lin; Yang, Yue-Xiong

    2013-04-01

    Establishing non-human primate models of human diseases is an efficient way to narrow the large gap between basic studies and translational medicine. Multifold advantages such as simplicity of breeding, low cost of feeding and facility of operating make the tree shrew an ideal non-human primate model proxy. Additional features like vulnerability to stress and spontaneous diabetic characteristics also indicate that the tree shrew could be a potential new animal model of human diseases. However, basal physiological indexes of tree shrew, especially those related to human disease, have not been systematically reported. Accordingly, we established important basal physiological indexes of domesticated tree shrews including several factors: (1) body weight, (2) core body temperature and rhythm, (3) diet metabolism, (4) locomotor rhythm, (5) electroencephalogram, (6) glycometabolism and (7) serum and urinary hormone level and urinary cortisol rhythm. We compared the physiological parameters of domesticated tree shrew with that of rats and macaques. Results showed that (a) the core body temperature of the tree shrew was 39.59±0.05 ℃, which was higher than that of rats and macaques; (b) Compared with wild tree shrews, with two activity peaks, domesticated tree shrews had only one activity peak from 17:30 to 19:30; (c) Compared with rats, tree shrews had poor carbohydrate metabolism ability; and (d) Urinary cortisol rhythm indicated there were two peaks at 8:00 and 17:00 in domesticated tree shrews, which matched activity peaks in wild tree shrews. These results provided basal physiological indexes for domesticated tree shrews and laid an important foundation for diabetes and stress-related disease models established on tree shrews.

  17. Tree nut allergens.

    PubMed

    Roux, Kenneth H; Teuber, Suzanne S; Sathe, Shridhar K

    2003-08-01

    Allergic reactions to tree nuts can be serious and life threatening. Considerable research has been conducted in recent years in an attempt to characterize those allergens that are most responsible for allergy sensitization and triggering. Both native and recombinant nut allergens have been identified and characterized and, for some, the IgE-reactive epitopes described. Some allergens, such as lipid transfer proteins, profilins, and members of the Bet v 1-related family, represent minor constituents in tree nuts. These allergens are frequently cross-reactive with other food and pollen homologues, and are considered panallergens. Others, such as legumins, vicilins, and 2S albumins, represent major seed storage protein constituents of the nuts. The allergenic tree nuts discussed in this review include those most commonly responsible for allergic reactions such as hazelnut, walnut, cashew, and almond as well as those less frequently associated with allergies including pecan, chestnut, Brazil nut, pine nut, macadamia nut, pistachio, coconut, Nangai nut, and acorn.

  18. Optimal parallel evaluation of AND trees

    NASA Technical Reports Server (NTRS)

    Wah, Benjamin W.; Li, Guo-Jie

    1990-01-01

    A quantitative analysis based on both preemptive and nonpreemptive critical-path scheduling algorithms is presently conducted for the optimal degree of parallelism required in evaluating a given AND tree. The optimal degree of parallelism is found to depend on problem complexity, precedence-graph shape, and task-time distribution along each path. In addition to demonstrating the optimality of the preemptive critical-path scheduling algorithm for evaluating an arbitrary AND tree on a fixed number of processors, the possibility of efficiently ascertaining tight bounds on the number of processors for optimal processor-time efficiency is illustrated.

  19. Classification tree method for bacterial source tracking with antibiotic resistance analysis data.

    PubMed

    Price, Bertram; Venso, Elichia A; Frana, Mark F; Greenberg, Joshua; Ware, Adam; Currey, Lee

    2006-05-01

    Various statistical classification methods, including discriminant analysis, logistic regression, and cluster analysis, have been used with antibiotic resistance analysis (ARA) data to construct models for bacterial source tracking (BST). We applied the statistical method known as classification trees to build a model for BST for the Anacostia Watershed in Maryland. Classification trees have more flexibility than other statistical classification approaches based on standard statistical methods to accommodate complex interactions among ARA variables. This article describes the use of classification trees for BST and includes discussion of its principal parameters and features. Anacostia Watershed ARA data are used to illustrate the application of classification trees, and we report the BST results for the watershed.

  20. A celestial Christmas tree

    NASA Astrophysics Data System (ADS)

    Moore, S.

    2006-12-01

    Having finished decorating your terrestrial Christmas tree this year, you may care to step outside and view a celestial one. Well placed in the December night sky in the often overlooked but very rewarding constellation of Monoceros, NGC 2264, called the Christmas Tree by the American astronomer and writer Leland S. Copeland, lies due south around 1 a.m. in mid-December at an altitude of 50°. The cluster lies amid a vast area of nebulosity, well captured in the image by Gordon Rogers on the cover of this Journal.

  1. Tree biomass in the Swiss landscape: nationwide modelling for improved accounting for forest and non-forest trees.

    PubMed

    Price, B; Gomez, A; Mathys, L; Gardi, O; Schellenberger, A; Ginzler, C; Thürig, E

    2017-03-01

    Trees outside forest (TOF) can perform a variety of social, economic and ecological functions including carbon sequestration. However, detailed quantification of tree biomass is usually limited to forest areas. Taking advantage of structural information available from stereo aerial imagery and airborne laser scanning (ALS), this research models tree biomass using national forest inventory data and linear least-square regression and applies the model both inside and outside of forest to create a nationwide model for tree biomass (above ground and below ground). Validation of the tree biomass model against TOF data within settlement areas shows relatively low model performance (R² of 0.44) but still a considerable improvement on current biomass estimates used for greenhouse gas inventory and carbon accounting. We demonstrate an efficient and easily implementable approach to modelling tree biomass across a large heterogeneous nationwide area. The model offers significant opportunity for improved estimates on land use combination categories (CC) where tree biomass has either not been included or only roughly estimated until now. The ALS biomass model also offers the advantage of providing greater spatial resolution and greater within CC spatial variability compared to the current nationwide estimates.

  2. Developmental regression in autism spectrum disorder.

    PubMed

    Al Backer, Nouf Backer

    2015-01-01

    The occurrence of developmental regression in autism spectrum disorder (ASD) is one of the most puzzling phenomena of this disorder. Little is known about the nature and mechanism of developmental regression in ASD. About one-third of young children with ASD lose some skills during the preschool period, usually speech, but sometimes nonverbal communication, social or play skills are also affected. There is a lot of evidence suggesting that most children who demonstrate regression also had previous, subtle, developmental differences. It is difficult to predict the prognosis of autistic children with developmental regression. It seems that the earlier development of social, language, and attachment behaviors followed by regression does not predict the later recovery of skills or better developmental outcomes. The underlying mechanisms that lead to regression in autism are unknown. The role of subclinical epilepsy in the developmental regression of children with autism remains unclear.

  3. Optimizing a basal bark spray of dinotefuran to manage armored scales (Hemiptera: Diaspididae) in Christmas tree plantations.

    PubMed

    Cowles, Richard S

    2010-10-01

    The armored scales Fiorinia externa Ferris and Aspidiotus cryptomeriae Kuwana (Hemiptera: Diaspididae) are increasingly damaging to Christmas tree plantings in southern New England. The systemic insecticide dinotefuran was investigated for selectively suppressing armored scale populations relative to their natural enemies in cooperating growers' fields in 2008 and 2009. Banded soil application of dinotefuran resulted in poor control. However, a dinotefuran spray applied to the basal 25 cm of trunk resulted in its absorption through the bark, translocation to the foliage, and good efficacy. The basal bark spray did not significantly impact the activity of predators Chilocorus stigma (Say) or Cybocephalus nipponicus Endrödy-Younga and in 2009 showed a dosage-dependent improvement in the percentage of scales parasitized by Encarsia citrina Craw. A field dosage-response factorial experiment revealed that a 0.25% (vol:vol) addition of a surfactant with dinotefuran did not enhance insecticidal effect. Probit-transformed scale population reduction relative to the untreated check was subjected to linear regression analysis; reduction of scale populations was proportional to the log of insecticide dosage, whereas basal bark spray efficacy declined in proportion to the cube of tree height. The regression equation can be used to optimize dosage relative to tree height. Excellent efficacy resulted from basal bark spray application dates of 28 April (prebud break) to mid-June, but earlier spray timing within that treatment window had fewer crawlers discoloring new growth with their short-lived feeding. A basal bark spray of dinotefuran is well suited for integration with natural enemies to manage armored scales in Christmas tree plantations.

  4. Iris movement mediates vascular apoptosis during rat pupillary membrane regression.

    PubMed

    Morizane, Yuki; Mohri, Satoshi; Kosaka, Jun; Toné, Shigenobu; Kiyooka, Takahiko; Miyasaka, Takehiro; Shimizu, Juichiro; Ogasawara, Yasuo; Shiraga, Fumio; Minatogawa, Yohsuke; Sasaki, Junzo; Ohtsuki, Hiroshi; Kajiya, Fumihiko

    2006-03-01

    In the course of mammalian lens development, a transient capillary meshwork known as the pupillary membrane (PM) forms, which is located at the pupil area; the PM nourishes the anterior surface of the lens and then regresses to make the optical path clear. Although the involvement of apoptotic process has been reported in the PM regression, the initiating factor remains unknown. We initially found that regression of the PM coincided with the development of iris motility, and iris movement caused cessation and resumption of blood flow within the PM. Therefore, we investigated whether the development of the iris's ability to constrict and dilate functions as an essential signal that induces apoptosis in the PM. Continuous inhibition of iris movement with mydriatic agents from postnatal day 7 to day 12 suppressed apoptosis of the PM and migration of macrophages toward the PM, and resulted in the persistence of PM in rats. The distribution of apoptotic cells in the regressing PM was diffuse and showed no apparent localization. These results indicated that iris movement induced regression of the PM by changing the blood flow within it. This study suggests the importance of the physiological interactions between tissues (in this case, the iris and the PM) as a signal to advance vascular regression during organ development, and defines a novel function of the iris during ocular development in addition to the well-known function, that is, optimization of light transmission into the eye.

  5. Direction of Effects in Multiple Linear Regression Models.

    PubMed

    Wiedermann, Wolfgang; von Eye, Alexander

    2015-01-01

    Previous studies analyzed asymmetric properties of the Pearson correlation coefficient using higher than second order moments. These asymmetric properties can be used to determine the direction of dependence in a linear regression setting (i.e., establish which of two variables is more likely to be on the outcome side) within the framework of cross-sectional observational data. Extant approaches are restricted to the bivariate regression case. The present contribution extends the direction of dependence methodology to a multiple linear regression setting by analyzing distributional properties of residuals of competing multiple regression models. It is shown that, under certain conditions, the third central moments of estimated regression residuals can be used to decide upon direction of effects. In addition, three different approaches for statistical inference are discussed: a combined D'Agostino normality test, a skewness difference test, and a bootstrap difference test. Type I error and power of the procedures are assessed using Monte Carlo simulations, and an empirical example is provided for illustrative purposes. In the discussion, issues concerning the quality of psychological data, possible extensions of the proposed methods to the fourth central moment of regression residuals, and potential applications are addressed.

  6. Tree attenuation at 20 GHz: Foliage effects

    NASA Technical Reports Server (NTRS)

    Vogel, Wolfhard J.; Goldhirsh, Julius

    1993-01-01

    Static tree attenuation measurements at 20 GHz (K-Band) on a 30 deg slant path through a mature Pecan tree with and without leaves showed median fades exceeding approximately 23 dB and 7 dB, respectively. The corresponding 1% probability fades were 43 dB and 25 dB. Previous 1.6 GHz (L-Band) measurements for the bare tree case showed fades larger than those at K-Band by 3.4 dB for the median and smaller by approximately 7 dB at the 1% probability. While the presence of foliage had only a small effect on fading at L-Band (approximately 1 dB additional for the median to 1% probability range), the attenuation increase was significant at K-Band, where it increased by about 17 dB over the same probability range.
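
The fade figures quoted in the abstract above are in decibels, so the foliage contribution is a simple difference, and a decibel value maps to a linear power ratio via 10^(dB/10). The short sketch below works through that arithmetic using only the numbers stated in the abstract; the helper name is an assumption for illustration.

```python
def db_to_power_ratio(db):
    """Convert an attenuation in decibels to a linear power ratio."""
    return 10 ** (db / 10)


# K-Band (20 GHz) fades from the abstract: with leaves vs. without leaves.
median_extra_db = 23 - 7    # extra median fade attributed to foliage
p1_extra_db = 43 - 25       # extra fade at the 1% probability level

# Average over the two ends of the probability range.
mean_extra_db = (median_extra_db + p1_extra_db) / 2

print(f"median: +{median_extra_db} dB, 1%: +{p1_extra_db} dB, mean: +{mean_extra_db} dB")
print(f"power ratio for {median_extra_db} dB: {db_to_power_ratio(median_extra_db):.1f}x")
```

The two differences bracket the "about 17 dB" increase that the abstract reports over the median to 1% probability range.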

  7. Tree attenuation at 20 GHz: Foliage effects

    NASA Astrophysics Data System (ADS)

    Vogel, Wolfhard J.; Goldhirsh, Julius

    1993-08-01

    Static tree attenuation measurements at 20 GHz (K-Band) on a 30 deg slant path through a mature Pecan tree with and without leaves showed median fades exceeding approximately 23 dB and 7 dB, respectively. The corresponding 1% probability fades were 43 dB and 25 dB. Previous 1.6 GHz (L-Band) measurements for the bare tree case showed fades larger than those at K-Band by 3.4 dB for the median and smaller by approximately 7 dB at the 1% probability. While the presence of foliage had only a small effect on fading at L-Band (approximately 1 dB additional for the median to 1% probability range), the attenuation increase was significant at K-Band, where it increased by about 17 dB over the same probability range.

  8. Forward estimation for game-tree search

    SciTech Connect

    Zhang, Weixiong

    1996-12-31

    It is known that bounds on the minimax values of nodes in a game tree can be used to reduce the computational complexity of minimax search for two-player games. We describe a very simple method to estimate bounds on the minimax values of interior nodes of a game tree, and use the bounds to improve minimax search. The new algorithm, called forward estimation, does not require additional domain knowledge other than a static node evaluation function, and has small constant overhead per node expansion. We also propose a variation of forward estimation, which provides a tradeoff between computational complexity and decision quality. Our experimental results show that forward estimation outperforms alpha-beta pruning on random game trees and the game of Othello.
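
For context, the baseline that the abstract above compares against, alpha-beta pruning, can be sketched in a few lines. This is the textbook algorithm, not the paper's forward estimation method (whose details the abstract does not give); the nested-list tree encoding and the leaf counter are assumptions made for illustration.

```python
def alphabeta(node, alpha, beta, maximizing, counter):
    """Minimax value of `node` using alpha-beta pruning.

    A node is either a number (a leaf with its static evaluation)
    or a list of child nodes. counter[0] tracks leaves evaluated.
    """
    if not isinstance(node, list):
        counter[0] += 1
        return node
    if maximizing:
        value = float("-inf")
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False, counter))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # the minimizing parent will never allow this branch
        return value
    value = float("inf")
    for child in node:
        value = min(value, alphabeta(child, alpha, beta, True, counter))
        beta = min(beta, value)
        if alpha >= beta:
            break  # the maximizing parent already has a better option
    return value


# Hypothetical two-ply game tree: MAX at the root, MIN at the next level.
tree = [[3, 5], [2, 9], [0, 1]]
counter = [0]
best = alphabeta(tree, float("-inf"), float("inf"), True, counter)
print(best, counter[0])
```

On this small tree the search returns the minimax value while evaluating only four of the six leaves, which is the kind of saving that bounds on node values make possible.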

  9. The Inference of Gene Trees with Species Trees

    PubMed Central

    Szöllősi, Gergely J.; Tannier, Eric; Daubin, Vincent; Boussau, Bastien

    2015-01-01

    This article reviews the various models that have been used to describe the relationships between gene trees and species trees. Molecular phylogeny has focused mainly on improving models for the reconstruction of gene trees based on sequence alignments. Yet, most phylogeneticists seek to reveal the history of species. Although the histories of genes and species are tightly linked, they are seldom identical, because genes duplicate, are lost or horizontally transferred, and because alleles can coexist in populations for periods that may span several speciation events. Building models describing the relationship between gene and species trees can thus improve the reconstruction of gene trees when a species tree is known, and vice versa. Several approaches have been proposed to solve the problem in one direction or the other, but in general neither gene trees nor species trees are known. Only a few studies have attempted to jointly infer gene trees and species trees. These models account for gene duplication and loss, transfer or incomplete lineage sorting. Some of them consider several types of events together, but none exists currently that considers the full repertoire of processes that generate gene trees along the species tree. Simulations as well as empirical studies on genomic data show that combining gene tree–species tree models with models of sequence evolution improves gene tree reconstruction. In turn, these better gene trees provide a more reliable basis for studying genome evolution or reconstructing ancestral chromosomes and ancestral gene sequences. We predict that gene tree–species tree methods that can deal with genomic data sets will be instrumental to advancing our understanding of genomic evolution. PMID:25070970

  10. A novel tree structure based watermarking algorithm

    NASA Astrophysics Data System (ADS)

    Lin, Qiwei; Feng, Gui

    2008-03-01

    In this paper, we propose a new blind watermarking algorithm for images which is based on tree structure. The algorithm embeds the watermark in the wavelet transform domain, and the embedding positions are determined by the significant coefficients wavelet tree (SCWT) structure, which follows the same idea as the embedded zero-tree wavelet (EZW) compression technique. According to EZW concepts, we obtain coefficients that are related to each other by a tree structure. This relationship among the wavelet coefficients allows our technique to embed more watermark data. If the watermarked image is attacked such that the set of significant coefficients is changed, the tree structure allows the correlation-based watermark detector to recover synchronously. The algorithm also uses a visual adaptive scheme to insert the watermark to minimize watermark perceptibility. In addition to the watermark, a template is inserted into the watermarked image at the same time. The template contains synchronization information, allowing the detector to determine the type of geometric transformation applied to the watermarked image. Experimental results show that the proposed watermarking algorithm is robust against most signal processing attacks, such as JPEG compression, median filtering, sharpening and rotating. It is also an adaptive method that performs well at finding the best areas in which to insert a stronger watermark.

  11. A Universal Phylogenetic Tree.

    ERIC Educational Resources Information Center

    Offner, Susan

    2001-01-01

    Presents a universal phylogenetic tree suitable for use in high school and college-level biology classrooms. Illustrates the antiquity of life and that all life is related, even if it dates back 3.5 billion years. Reflects important evolutionary relationships and provides an exciting way to learn about the history of life. (SAH)

  12. Tree-Ties.

    ERIC Educational Resources Information Center

    Gresczyk, Rick

    Created to help students understand how plants were used for food, for medicine, and for arts and crafts among the Ojibwe (Chippewa) Indians, the game Tree-Ties combines earth and social sciences within a specific culture. The game requires mutual respect, understanding, and agreement to succeed. Sounding like the word "treaties", the…

  13. Christmas Tree Category Manual.

    ERIC Educational Resources Information Center

    Bowman, James S.; Turmel, Jon P.

    This manual provides information needed to meet the standards for pesticide applicator certification. Pests and diseases of Christmas tree plantations are identified and discussed. Section one deals with weeds and woody plants and the application, formulation and effects of herbicides in controlling them. Section two discusses specific diseases…

  14. Digging Deeper with Trees.

    ERIC Educational Resources Information Center

    Growing Ideas, 2001

    2001-01-01

    Describes hands-on science areas that focus on trees. A project on leaf pigmentation involves putting crushed leaves in a test tube with solvent acetone to dissolve pigment. In another project, students learn taxonomy by sorting and classifying leaves based on observable characteristics. Includes a language arts connection. (PVD)

  15. Phylogenics & Tree-Thinking

    ERIC Educational Resources Information Center

    Baum, David A.; Offner, Susan

    2008-01-01

    Phylogenetic trees, which are depictions of the inferred evolutionary relationships among a set of species, now permeate almost all branches of biology and are appearing in increasing numbers in biology textbooks. While few state standards explicitly require knowledge of phylogenetics, most require some knowledge of evolutionary biology, and many…

  16. The Sacred Tree.

    ERIC Educational Resources Information Center

    Lethbridge Univ. (Alberta).

    Designed as a text for high school students and adults, this illustrated book presents ethical concepts and teachings of Native societies throughout North America concerning the nature and possibilities of human existence. The final component of a course in self-discovery and development, the book begins with the legend of the "Sacred Tree"…

  17. Using GA-Ridge regression to select hydro-geological parameters influencing groundwater pollution vulnerability.

    PubMed

    Ahn, Jae Joon; Kim, Young Min; Yoo, Keunje; Park, Joonhong; Oh, Kyong Joo

    2012-11-01

    For groundwater conservation and management, it is important to accurately assess groundwater pollution vulnerability. This study proposed an integrated model using ridge regression and a genetic algorithm (GA) to effectively select the major hydro-geological parameters influencing groundwater pollution vulnerability in an aquifer. The GA-Ridge regression method determined that depth to water, net recharge, topography, and the impact of vadose zone media were the hydro-geological parameters that influenced trichloroethene pollution vulnerability in a Korean aquifer. When using these selected hydro-geological parameters, the accuracy was improved for various statistical nonlinear and artificial intelligence (AI) techniques, such as multinomial logistic regression, decision trees, artificial neural networks, and case-based reasoning. These results provide a proof of concept that the GA-Ridge regression is effective at determining influential hydro-geological parameters for the pollution vulnerability of an aquifer, and in turn, improves the AI performance in assessing groundwater pollution vulnerability.

  18. Efficient Gene Tree Correction Guided by Genome Evolution

    PubMed Central

    Lafond, Manuel; Seguin, Jonathan; Boussau, Bastien; Guéguen, Laurent; El-Mabrouk, Nadia; Tannier, Eric

    2016-01-01

    Motivations: Gene trees inferred solely from multiple alignments of homologous sequences often contain weakly supported and uncertain branches. Information for their full resolution may lie in the dependency between gene families and their genomic context. Integrative methods, using species tree information in addition to sequence information, often rely on a computationally intensive tree space search which forecloses an application to large genomic databases. Results: We propose a new method, called ProfileNJ, that takes a gene tree with statistical supports on its branches, and corrects its weakly supported parts by using a combination of information from a species tree and a distance matrix. Its low running time enabled us to use it on the whole Ensembl Compara database, for which we propose an alternative, arguably more plausible set of gene trees. This allowed us to perform a genome-wide analysis of duplication and loss patterns on the history of 63 eukaryote species, and predict ancestral gene content and order for all ancestors along the phylogeny. Availability: A web interface called RefineTree, including ProfileNJ as well as other gene tree correction methods, which we also test on the Ensembl gene families, is available at: http://www-ens.iro.umontreal.ca/~adbit/polytomysolver.html. The code of ProfileNJ as well as the set of gene trees corrected by ProfileNJ from Ensembl Compara version 73 families are also made available. PMID:27513924

  19. A Beta-splitting model for evolutionary trees

    PubMed Central

    Sainudiin, Raazesh

    2016-01-01

    In this article, we construct a generalization of the Blum–François Beta-splitting model for evolutionary trees, which was itself inspired by Aldous' Beta-splitting model on cladograms. The novelty of our approach allows for asymmetric shares of diversification rates (or diversification ‘potential’) between two sister species in an evolutionarily interpretable manner, as well as the addition of extinction to the model in a natural way. We describe the incremental evolutionary construction of a tree with n leaves by splitting or freezing extant lineages through the generating, organizing and deleting processes. We then give the probability of any (binary rooted) tree under this model with no extinction, at several resolutions: ranked planar trees giving asymmetric roles to the first and second offspring species of a given species and keeping track of the order of the speciation events occurring during the creation of the tree, unranked planar trees, ranked non-planar trees and finally (unranked non-planar) trees. We also describe a continuous-time equivalent of the generating, organizing and deleting processes where tree topology and branch lengths are jointly modelled and provide code in SageMath/Python for these algorithms. PMID:27293780

  20. Optimizing Urban Tree Soil Substrate for the City of Vienna

    NASA Astrophysics Data System (ADS)

    Murer, Erwin; Strauss, Peter; Schmidt, Stefan

    2015-04-01

    Many city garden management departments in Central Europe encounter problems with growing trees sustainably in cities. Tree root space is increasingly limited by pavements and roads and is polluted by salt applied during winter. As a result, the life expectancy of city trees is decreasing because the trees become more susceptible to diseases. Diseased trees are a safety risk. These challenges are compounded by lower budgets for re-establishing new trees. To respond to these challenges, a new soil substrate for city trees has been developed and tested, combining cost effectiveness with improved water retention and nutrient delivery on the one hand and drainage capability on the other. The new substrate should be inexpensive, simple to produce and easy to mix. Therefore, easily available materials have been tested: river sediments delivered by annual floods, compost produced by a city-owned composting plant, and low-cost dolomite chippings from quarries near Vienna. The final composition of the new Vienna tree substrate consists of 3 mineral components and one organic component. These are mixed in a ratio of 4 parts dolomite chippings, 3 parts sand, 3 parts fluvial fine sediment and 2 parts compost. After a laboratory phase to develop the new substrate, field testing of the newly developed substrate is presently carried out in three different types of field experiments: 20 implementation sites distributed over the city of Vienna, with annual checks of tree growth; 2 implementation sites with sensors to measure the water and salt balance; and 6 city lysimeters with enhanced facilities to monitor substrate and water behaviour. These facilities will be used to relate the growing factors to the site properties, to develop a fertilizer recommendation for urban trees and to make tests for the compatibility of the trees

  1. New Life From Dead Trees

    ERIC Educational Resources Information Center

    DeGraaf, Richard M.

    1978-01-01

    There are numerous bird species that will nest only in dead or dying trees. Current forestry practices include clearing forests of these snags, or dead trees. This practice is driving many species out of the forests. An illustrated example of bird succession in and on a tree is given. (MA)

  2. The Hopi Fruit Tree Book.

    ERIC Educational Resources Information Center

    Nyhuis, Jane

    Referring as often as possible to traditional Hopi practices and to materials readily available on the reservation, the illustrated booklet provides information on the care and maintenance of young fruit trees. An introduction to fruit trees explains the special characteristics of new trees, e.g., grafting, planting pits, and watering. The…

  3. Building up rhetorical structure trees

    SciTech Connect

    Marcu, D.

    1996-12-31

    I use the distinction between the nuclei and the satellites that pertain to discourse relations to introduce a compositionality criterion for discourse trees. I provide a first-order formalization of rhetorical structure trees and, on its basis, I derive an algorithm that constructs all the valid rhetorical trees that can be associated with a given discourse.

  4. The Tree Worker's Manual. [Revised.]

    ERIC Educational Resources Information Center

    Lilly, S. J.

    This manual acquaints readers with the general operations of the tree care industry. The manual covers subjects important to a tree worker and serves as a training aid for workers at the entry level as tree care professionals. Each chapter begins with a set of objectives and may include figures, tables, and photographs. Ten chapters are included:…

  5. Tests with VHR images for the identification of olive trees and other fruit trees in the European Union

    NASA Astrophysics Data System (ADS)

    Masson, Josiane; Soille, Pierre; Mueller, Rick

    2004-10-01

    In the context of the Common Agricultural Policy (CAP), the European Commission has a strong interest in counting and individually locating fruit trees. An automatic counting algorithm developed by the JRC (OLICOUNT) was used in the past for olive trees only, on 1 m black and white orthophotos, but with limitations for young trees or irregular groves. This study investigates the improvement of fruit tree identification using VHR images on a large set of data in three test sites: one in Crete (Greece), one in the south-east of France with a majority of olive trees and associated fruit trees, and one in Florida on citrus trees. OLICOUNT was compared with two other automatic tree-counting applications, one using the CRISP software on citrus trees and the other a completely automatic method based on regional minima (morphological image analysis). Additional investigation was undertaken to refine the methods. This paper describes the automatic methods and presents the results derived from the tests.

  6. A Method to Quantify Plant Availability and Initiating Event Frequency Using a Large Event Tree, Small Fault Tree Model

    SciTech Connect

    Kee, Ernest J.; Sun, Alice; Rodgers, Shawn; Popova, Elmira V; Nelson, Paul; Moiseytseva, Vera; Wang, Eric

    2006-07-01

    South Texas Project uses a large fault tree to produce scenarios (minimal cut sets) used in quantification of plant availability and event frequency predictions. On the other hand, the South Texas Project probabilistic risk assessment model uses a large event tree, small fault tree approach for quantifying core damage and radioactive release frequency predictions. The South Texas Project is converting its availability and event frequency model to a large event tree, small fault tree model in an effort to streamline application support and to provide additional detail in results. The availability and event frequency model as well as the applications it supports (maintenance and operational risk management, system engineering health assessment, preventive maintenance optimization, and RIAM) are briefly described. A methodology to perform availability modeling in a large event tree, small fault tree framework is described in detail. How the methodology can be used to support South Texas Project maintenance and operations risk management is described in detail. Differences with other fault tree methods and other recently proposed methods are discussed in detail. While the methods described are novel to the South Texas Project Risk Management program and to large event tree, small fault tree models, concepts in the area of application support and availability modeling have wider applicability to the industry. (authors)

  7. Process modeling with the regression network.

    PubMed

    van der Walt, T; Barnard, E; van Deventer, J

    1995-01-01

    A new connectionist network topology called the regression network is proposed. The structural and underlying mathematical features of the regression network are investigated. Emphasis is placed on the intricacies of the optimization process for the regression network and some measures to alleviate these difficulties of optimization are proposed and investigated. The ability of the regression network algorithm to perform either nonparametric or parametric optimization, as well as a combination of both, is also highlighted. It is further shown how the regression network can be used to model systems which are poorly understood on the basis of sparse data. A semi-empirical regression network model is developed for a metallurgical processing operation (a hydrocyclone classifier) by building mechanistic knowledge into the connectionist structure of the regression network model. Poorly understood aspects of the process are provided for by use of nonparametric regions within the structure of the semi-empirical connectionist model. The performance of the regression network model is compared to the corresponding generalization performance results obtained by some other nonparametric regression techniques.

  8. Quantile regression applied to spectral distance decay

    USGS Publications Warehouse

    Rocchini, D.; Cade, B.S.

    2008-01-01

    Remotely sensed imagery has long been recognized as a powerful support for characterizing and estimating biodiversity. Spectral distance among sites has proven to be a powerful approach for detecting species composition variability. Regression analysis of species similarity versus spectral distance allows us to quantitatively estimate the amount of turnover in species composition with respect to spectral and ecological variability. In classical regression analysis, the residual sum of squares is minimized for the mean of the dependent variable distribution. However, many ecological data sets are characterized by a high number of zeroes that add noise to the regression model. Quantile regressions can be used to evaluate trend in the upper quantiles rather than a mean trend across the whole distribution of the dependent variable. In this letter, we used ordinary least squares (OLS) and quantile regressions to estimate the decay of species similarity versus spectral distance. The achieved decay rates were statistically nonzero (p < 0.01), considering both OLS and quantile regressions. Nonetheless, the OLS regression estimate of the mean decay rate was only half the decay rate indicated by the upper quantiles. Moreover, the intercept value, representing the similarity reached when the spectral distance approaches zero, was very low compared with the intercepts of the upper quantiles, which detected high species similarity when habitats are more similar. In this letter, we demonstrated the power of using quantile regressions applied to spectral distance decay to reveal species diversity patterns otherwise lost or underestimated by OLS regression. © 2008 IEEE.
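
    The contrast between a mean trend and an upper-quantile trend can be sketched on a toy data set. The example below is an illustrative sketch, not the authors' analysis: the data are hypothetical and constructed so that half the site pairs have zero similarity, the function names are assumptions, and the quantile line is found by brute force, relying on the fact that for a straight-line fit some line minimizing the quantile (pinball) loss passes through two data points.

```python
from itertools import combinations
from statistics import mean


def pinball(residual, tau):
    """Quantile (pinball) loss of one residual at quantile level tau."""
    return tau * residual if residual >= 0 else (tau - 1) * residual


def quantile_line(x, y, tau):
    """Brute-force quantile regression for y = a + b*x.

    Tries every line through two data points with distinct x values
    and keeps the one with the smallest total pinball loss.
    """
    best = None
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        if x1 == x2:
            continue
        b = (y2 - y1) / (x2 - x1)
        a = y1 - b * x1
        loss = sum(pinball(yi - (a + b * xi), tau) for xi, yi in zip(x, y))
        if best is None or loss < best[0]:
            best = (loss, a, b)
    return best[1], best[2]


def ols_line(x, y):
    """Ordinary least squares fit of y = a + b*x."""
    x_bar, y_bar = mean(x), mean(y)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
        (xi - x_bar) ** 2 for xi in x
    )
    return y_bar - b * x_bar, b


# Hypothetical data: at each spectral distance, one site pair follows
# the decay 1 - 0.2 * distance and one pair has zero similarity.
distance = [0, 0, 1, 1, 2, 2, 3, 3, 4, 4]
similarity = [1.0, 0.0, 0.8, 0.0, 0.6, 0.0, 0.4, 0.0, 0.2, 0.0]

a_ols, b_ols = ols_line(distance, similarity)
a_q90, b_q90 = quantile_line(distance, similarity, tau=0.9)
print("OLS:", a_ols, b_ols)
print("0.9 quantile:", a_q90, b_q90)
```

    On these constructed data the OLS slope comes out at half the 0.9-quantile slope, and the OLS intercept is lower than the quantile intercept, which mirrors the qualitative pattern the abstract reports. Real analyses would normally use a dedicated quantile regression routine rather than this brute-force search.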

  9. [From clinical judgment to linear regression model].

    PubMed

    Palacios-Cruz, Lino; Pérez, Marcela; Rivas-Ruiz, Rodolfo; Talavera, Juan O

    2013-01-01

    When we think about mathematical models, such as the linear regression model, we tend to think that these terms are used only by those engaged in research, a notion that is far from the truth. Legendre described the first mathematical model in 1805, and Galton introduced the formal term in 1886. Linear regression is one of the most commonly used regression models in clinical practice. It is useful to predict or show the relationship between two or more variables as long as the dependent variable is quantitative and has a normal distribution. Stated another way, regression is used to predict a measure based on knowledge of at least one other variable. The first objective of linear regression is to determine the slope or inclination of the regression line: Y = a + bX, where "a" is the intercept or regression constant, equivalent to the value of "Y" when "X" equals 0, and "b" (also called the slope) indicates the increase or decrease in "Y" that occurs when "X" increases or decreases by one unit. In the regression line, "b" is called the regression coefficient. The coefficient of determination (R²) indicates the proportion of the variance in the outcome that is explained by the independent variables.
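
    The quantities named in this abstract (intercept a, slope b and the coefficient of determination) can be computed directly for a single predictor. The following is a minimal sketch using ordinary least squares; the function name and the data values are hypothetical and chosen only for illustration.

```python
from statistics import mean


def fit_line(x, y):
    """Least-squares fit of Y = a + b*X; returns (a, b, r_squared)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx                # change in Y per one-unit change in X
    a = y_bar - b * x_bar        # value of Y when X equals 0
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    r_squared = 1 - ss_res / ss_tot   # share of variance in Y explained
    return a, b, r_squared


# Hypothetical measurements, roughly following Y = 1 + 2*X.
x = [1, 2, 3, 4, 5]
y = [2.9, 5.1, 7.0, 9.2, 10.8]

a, b, r_squared = fit_line(x, y)
print(f"a = {a:.2f}, b = {b:.2f}, R^2 = {r_squared:.4f}")
```

    With more than one independent variable the same idea applies, but the coefficients are usually obtained with a statistics package rather than by hand.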

  10. Geodesic least squares regression on information manifolds

    SciTech Connect

    Verdoolaege, Geert

    2014-12-05

    We present a novel regression method targeted at situations with significant uncertainty on both the dependent and independent variables or with non-Gaussian distribution models. Unlike the classic regression model, the conditional distribution of the response variable suggested by the data need not be the same as the modeled distribution. Instead they are matched by minimizing the Rao geodesic distance between them. This yields a more flexible regression method that is less constrained by the assumptions imposed through the regression model. As an example, we demonstrate the improved resistance of our method against some flawed model assumptions and we apply this to scaling laws in magnetic confinement fusion.

  11. Tree Colors: Color Schemes for Tree-Structured Data.

    PubMed

    Tennekes, Martijn; de Jonge, Edwin

    2014-12-01

    We present a method to map tree structures to colors from the Hue-Chroma-Luminance color model, which is known for its well balanced perceptual properties. The Tree Colors method can be tuned with several parameters, whose effect on the resulting color schemes is discussed in detail. We provide a free and open source implementation with sensible parameter defaults. Categorical data are very common in statistical graphics, and often these categories form a classification tree. We evaluate applying Tree Colors to tree structured data with a survey on a large group of users from a national statistical institute. Our user study suggests that Tree Colors are useful, not only for improving node-link diagrams, but also for unveiling tree structure in non-hierarchical visualizations.

  12. Hide and vanish: data sets where the most parsimonious tree is known but hard to find, and their implications for tree search methods.

    PubMed

    Goloboff, Pablo A

    2014-10-01

    Three different types of data sets, for which the uniquely most parsimonious tree can be known exactly but is hard to find with heuristic tree search methods, are studied. Tree searches are complicated more by the shape of the tree landscape (i.e. the distribution of homoplasy on different trees) than by the sheer abundance of homoplasy or character conflict. Data sets of Type 1 are those constructed by Radel et al. (2013). Data sets of Type 2 present a very rugged landscape, with narrow peaks and valleys, but relatively low amounts of homoplasy. For such a tree landscape, subjecting the trees to TBR and saving suboptimal trees produces much better results when the sequence of clipping for the tree branches is randomized instead of fixed. An unexpected finding for data sets of Types 1 and 2 is that starting a search from a random tree instead of a random addition sequence Wagner tree may increase the probability that the search finds the most parsimonious tree; a small artificial example where these probabilities can be calculated exactly is presented. Data sets of Type 3, the most difficult data sets studied here, comprise only congruent characters, and a single island with only one most parsimonious tree. Even if there is a single island, missing entries create a very flat landscape which is difficult to traverse with tree search algorithms because the number of equally parsimonious trees that need to be saved and swapped to effectively move around the plateaus is too large. Minor modifications of the parameters of tree drifting, ratchet, and sectorial searches allow travelling around these plateaus much more efficiently than saving and swapping large numbers of equally parsimonious trees with TBR. For these data sets, two new related criteria for selecting taxon addition sequences in Wagner trees (the "selected" and "informative" addition sequences) produce much better results than the standard random or closest addition sequences. 
    These new methods for Wagner…

  13. Narrowing historical uncertainty: probabilistic classification of ambiguously identified tree species in historical forest survey data

    USGS Publications Warehouse

    Mladenoff, D.J.; Dahir, S.E.; Nordheim, E.V.; Schulte, L.A.; Guntenspergen, G.R.

    2002-01-01

    Historical data have increasingly become appreciated for insight into the past conditions of ecosystems. Uses of such data include assessing the extent of ecosystem change; deriving ecological baselines for management, restoration, and modeling; and assessing the importance of past conditions on the composition and function of current systems. One historical data set of this type is the Public Land Survey (PLS) of the United States General Land Office, which contains data on multiple tree species, sizes, and distances recorded at each survey point, located at half-mile (0.8 km) intervals on a 1-mi (1.6 km) grid. This survey method was begun in the 1790s on US federal lands extending westward from Ohio. Thus, the data have the potential of providing a view of much of the US landscape from the mid-1800s, and they have been used extensively for this purpose. However, historical data sources, such as those describing the species composition of forests, can often be limited in the detail recorded and the reliability of the data, since the information was often not originally recorded for ecological purposes. Forest trees are sometimes recorded ambiguously, using generic or obscure common names. For the PLS data of northern Wisconsin, USA, we developed a method to classify ambiguously identified tree species using logistic regression analysis, using data on trees that were clearly identified to species and a set of independent predictor variables to build the models. The models were first created on partial data sets for each species and then tested for fit against the remaining data. Validations were conducted using repeated, random subsets of the data. Model prediction accuracy ranged from 81% to 96% in differentiating congeneric species among oak, pine, ash, maple, birch, and elm. Major predictor variables were tree size, associated species, landscape classes indicative of soil type, and spatial location within the study region. Results help to clarify ambiguities

  14. An approach for reconstructing past streamflows using a water balance model and tree-ring records in the upper West Walker River basin, California

    NASA Astrophysics Data System (ADS)

    Vittori, J. C.; Saito, L.; Biondi, F.

    2010-12-01

    Historical streamflows in a given river basin can be useful for determining regional patterns of drought and climate, yet such measured data are typically available for the last 100 years at most. To extend the measured record, observed streamflows can be regressed against tree-ring data that serve as proxies for streamflow. This empirical approach, however, cannot account for or test factors that do not directly affect tree-ring growth but may influence streamflow. To reconstruct past streamflows in a more mechanistic way, a seasonal water balance model has been developed for the upper West Walker River basin that uses proxy precipitation and air temperature data derived from tree-ring records as input. The model incorporates simplistic relationships between precipitation and other components of the hydrologic cycle, as well as a component for modeling snow, and operates at a seasonal time scale. The model allows for flexibility in manipulating various hydrologic and land use characteristics, and can be applied to other watersheds. The intent is for the model to investigate sources of uncertainty in streamflow reconstructions, and how factors such as wildfire or changes in vegetation cover could impact estimates of past flows, something regression-based models are not able to do. In addition, the use of a mechanistic water balance model calibrated against proxy climate records can provide information on changes in various components of the water cycle, including the interaction between evapotranspiration, snowmelt, and runoff under warmer climatic regimes.

  15. Marginally compact hyperbranched polymer trees.

    PubMed

    Dolgushev, M; Wittmer, J P; Johner, A; Benzerara, O; Meyer, H; Baschnagel, J

    2017-03-29

    Assuming Gaussian chain statistics along the chain contour, we generate by means of a proper fractal generator hyperbranched polymer trees which are marginally compact. Static and dynamical properties, such as the radial intrachain pair density distribution ρ_pair(r) or the shear-stress relaxation modulus G(t), are investigated theoretically and by means of computer simulations. We emphasize that although the self-contact density diverges logarithmically with the total mass N, this effect rapidly becomes irrelevant with increasing spacer length S. In addition, the standard Rouse analysis necessarily becomes inappropriate for compact objects, for which the relaxation time τ_p of mode p must scale as τ_p ∼ (N/p)^(5/3) rather than the usual square power law for linear chains.

  16. Use of probabilistic weights to enhance linear regression myoelectric control

    NASA Astrophysics Data System (ADS)

    Smith, Lauren H.; Kuiken, Todd A.; Hargrove, Levi J.

    2015-12-01

    Objective. Clinically available prostheses for transradial amputees do not allow simultaneous myoelectric control of degrees of freedom (DOFs). Linear regression methods can provide simultaneous myoelectric control, but frequently also result in difficulty with isolating individual DOFs when desired. This study evaluated the potential of using probabilistic estimates of categories of gross prosthesis movement, which are commonly used in classification-based myoelectric control, to enhance linear regression myoelectric control. Approach. Gaussian models were fit to electromyogram (EMG) feature distributions for three movement classes at each DOF (no movement, or movement in either direction) and used to weight the output of linear regression models by the probability that the user intended the movement. Eight able-bodied and two transradial amputee subjects worked in a virtual Fitts’ law task to evaluate differences in controllability between linear regression and probability-weighted regression for an intramuscular EMG-based three-DOF wrist and hand system. Main results. Real-time and offline analyses in able-bodied subjects demonstrated that probability weighting improved performance during single-DOF tasks (p < 0.05) by preventing extraneous movement at additional DOFs. Similar results were seen in experiments with two transradial amputees. Though goodness-of-fit evaluations suggested that the EMG feature distributions showed some deviations from the Gaussian, equal-covariance assumptions used in this experiment, the assumptions were sufficiently met to provide improved performance compared to linear regression control. Significance. Use of probability weights can improve the ability to isolate individual DOFs during linear regression myoelectric control, while maintaining the ability to simultaneously control multiple DOFs.
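
    The weighting idea in the Approach can be illustrated for a single DOF with a single scalar feature. This is a toy sketch under strong assumptions, not the authors' implementation: the feature values, class labels, equal class priors, pooled (equal) variance, regression coefficients and all function names are hypothetical, and the real system uses multidimensional EMG features across three DOFs.

```python
from math import exp
from statistics import mean


def fit_gaussians(samples):
    """Fit one mean per class and a single pooled variance (equal-variance assumption)."""
    means = {label: mean(values) for label, values in samples.items()}
    n = sum(len(values) for values in samples.values())
    pooled_var = sum(
        (v - means[label]) ** 2 for label, values in samples.items() for v in values
    ) / (n - len(samples))
    return means, pooled_var


def class_probabilities(feature, means, var):
    """Posterior class probabilities, assuming equal priors for all classes."""
    likelihood = {c: exp(-((feature - m) ** 2) / (2 * var)) for c, m in means.items()}
    total = sum(likelihood.values())
    return {c: value / total for c, value in likelihood.items()}


def weighted_output(feature, intercept, slope, means, var):
    """Scale the regression output by the probability of the matching movement class."""
    raw = intercept + slope * feature
    intended = "positive" if raw >= 0 else "negative"
    return raw * class_probabilities(feature, means, var)[intended]


# Hypothetical training features for the three movement classes of one DOF.
training = {
    "negative": [-1.1, -0.9, -1.0],
    "none": [-0.1, 0.0, 0.1],
    "positive": [0.9, 1.0, 1.1],
}
means, var = fit_gaussians(training)

at_rest = weighted_output(0.1, intercept=0.0, slope=1.0, means=means, var=var)
moving = weighted_output(1.0, intercept=0.0, slope=1.0, means=means, var=var)
print(at_rest, moving)
```

    In this toy setting a small feature value near rest is suppressed almost to zero, while a clear movement passes through nearly unchanged, which is the kind of behaviour that would prevent extraneous movement at a DOF the user is not trying to move.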

  17. Spatial vulnerability assessments by regression kriging

    NASA Astrophysics Data System (ADS)

    Pásztor, László; Laborczi, Annamária; Takács, Katalin; Szatmári, Gábor

    2016-04-01

    information representing IEW or GRP forming environmental factors were taken into account to support the spatial inference of the locally experienced IEW frequency and measured GRP values, respectively. An efficient spatial prediction methodology was applied to construct reliable maps, namely regression kriging (RK) using spatially exhaustive auxiliary data on soil, geology, topography, land use and climate. RK divides the spatial inference into two parts. First, the deterministic component of the target variable is determined by a regression model. The residuals of the multiple linear regression analysis represent the spatially varying but dependent stochastic component, and these are interpolated by kriging. The final map is the sum of the two component predictions. Application of RK also provides the possibility of inherent accuracy assessment. The resulting maps are characterized by global and local measures of their accuracy. Additionally, the method enables interval estimation for the spatial extent of the areas of predefined risk categories. All of these outputs provide a useful contribution to spatial planning, action planning and decision making. Acknowledgement: Our work was partly supported by the Hungarian National Scientific Research Foundation (OTKA, Grant No. K105167).

  18. Fault-Tree Compiler Program

    NASA Technical Reports Server (NTRS)

    Butler, Ricky W.; Martensen, Anna L.

    1992-01-01

    FTC, Fault-Tree Compiler program, is reliability-analysis software tool used to calculate probability of top event of fault tree. Five different types of gates allowed in fault tree: AND, OR, EXCLUSIVE OR, INVERT, and M OF N. High-level input language of FTC easy to understand and use. Program supports hierarchical fault-tree-definition feature simplifying process of description of tree and reduces execution time. Solution technique implemented in FORTRAN, and user interface in Pascal. Written to run on DEC VAX computer operating under VMS operating system.
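
    The five gate types listed above can be illustrated with a small top-event calculation. The sketch below is not FTC's input language or its solution technique; it assumes that all gate inputs are statistically independent (so no basic event is shared between branches), that EXCLUSIVE OR takes two inputs, and that M OF N means "at least M of the N inputs occur". All function names and probabilities are hypothetical.

```python
from math import prod


def p_and(ps):
    """All independent inputs occur."""
    return prod(ps)


def p_or(ps):
    """At least one independent input occurs."""
    return 1 - prod(1 - p for p in ps)


def p_xor(p1, p2):
    """Exactly one of two independent inputs occurs."""
    return p1 * (1 - p2) + p2 * (1 - p1)


def p_invert(p):
    """The input does not occur."""
    return 1 - p


def p_m_of_n(m, ps):
    """At least m of the independent inputs occur."""
    # dist[k] holds the probability that exactly k inputs seen so far occurred.
    dist = [1.0]
    for p in ps:
        new = [0.0] * (len(dist) + 1)
        for k, q in enumerate(dist):
            new[k] += q * (1 - p)
            new[k + 1] += q * p
        dist = new
    return sum(dist[m:])


# Hypothetical tree: top = OR( AND(a, b), 2-of-3(c, d, e) )
power_loss = p_and([0.1, 0.2])
sensor_loss = p_m_of_n(2, [0.1, 0.1, 0.1])
top_event = p_or([power_loss, sensor_loss])
print(top_event)
```

    When the same basic event appears in more than one branch, the independence assumption fails and these simple formulas no longer give the exact top-event probability.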

  19. Global Value Trees

    PubMed Central

    Zhu, Zhen; Puliga, Michelangelo; Cerina, Federica; Chessa, Alessandro; Riccaboni, Massimo

    2015-01-01

    The fragmentation of production across countries has become an important feature of the globalization in recent decades and is often conceptualized by the term “global value chains” (GVCs). When empirically investigating the GVCs, previous studies are mainly interested in knowing how global the GVCs are rather than what the GVCs look like. From a complex networks perspective, we use the World Input-Output Database (WIOD) to study the evolution of the global production system. We find that the industry-level GVCs are indeed not chain-like but are better characterized by the tree topology. Hence, we compute the global value trees (GVTs) for all the industries available in the WIOD. Moreover, we compute an industry importance measure based on the GVTs and compare it with other network centrality measures. Finally, we discuss some future applications of the GVTs. PMID:25978067

  20. Analysis of Sting Balance Calibration Data Using Optimized Regression Models

    NASA Technical Reports Server (NTRS)

    Ulbrich, N.; Bader, Jon B.

    2010-01-01

    Calibration data of a wind tunnel sting balance was processed using a candidate math model search algorithm that recommends an optimized regression model for the data analysis. During the calibration the normal force and the moment at the balance moment center were selected as independent calibration variables. The sting balance itself had two moment gages. Therefore, after analyzing the connection between calibration loads and gage outputs, it was decided to choose the difference and the sum of the gage outputs as the two responses that best describe the behavior of the balance. The math model search algorithm was applied to these two responses. An optimized regression model was obtained for each response. Classical strain gage balance load transformations and the equations of the deflection of a cantilever beam under load are used to show that the search algorithm's two optimized regression models are supported by a theoretical analysis of the relationship between the applied calibration loads and the measured gage outputs. The analysis of the sting balance calibration data set is a rare example of a situation when terms of a regression model of a balance can directly be derived from first principles of physics. In addition, it is interesting to note that the search algorithm recommended the correct regression model term combinations using only a set of statistical quality metrics that were applied to the experimental data during the algorithm's term selection process.

  1. Penalized spline estimation for functional coefficient regression models.

    PubMed

    Cao, Yanrong; Lin, Haiqun; Wu, Tracy Z; Yu, Yan

    2010-04-01

    The functional coefficient regression models assume that the regression coefficients vary with some "threshold" variable, providing appreciable flexibility in capturing the underlying dynamics in data and avoiding the so-called "curse of dimensionality" in multivariate nonparametric estimation. We first investigate the estimation, inference, and forecasting for the functional coefficient regression models with dependent observations via penalized splines. The P-spline approach, as a direct ridge regression shrinkage type global smoothing method, is computationally efficient and stable. With established fixed-knot asymptotics, inference is readily available. Exact inference can be obtained for fixed smoothing parameter λ, which is most appealing for finite samples. Our penalized spline approach gives an explicit model expression, which also enables multi-step-ahead forecasting via simulations. Furthermore, we examine different methods of choosing the important smoothing parameter λ: modified multi-fold cross-validation (MCV), generalized cross-validation (GCV), and an extension of empirical bias bandwidth selection (EBBS) to P-splines. In addition, we implement smoothing parameter selection using mixed model framework through restricted maximum likelihood (REML) for P-spline functional coefficient regression models with independent observations. The P-spline approach also easily allows different smoothness for different functional coefficients, which is enabled by assigning different penalty λ accordingly. We demonstrate the proposed approach by both simulation examples and a real data application.

  2. An adequate design for regression analysis of yield trials.

    PubMed

    Gusmão, L

    1985-12-01

    Based on theoretical demonstrations and illustrated with a numerical example from triticale yield trials in Portugal, the Completely Randomized Design is proposed as the one suited for Regression Analysis. When trials are designed in Complete Randomized Blocks the regression of plot production on block mean instead of the regression of cultivar mean on the overall mean of the trial is proposed as the correct procedure for regression analysis. These proposed procedures, in addition to providing a better agreement with the assumptions for regression and the philosophy of the method, induce narrower confidence intervals and attenuation of the hyperbolic effect. The increase in precision is brought about by both a decrease in the t Student values by an increased number of degrees of freedom, and by a decrease in standard error by a non proportional increase of residual variance and non proportional increase of the sum of squares of the assumed independent variable. The new procedures seem to be promising for a better understanding of the mechanism of specific instability.

  3. PhyBin: binning trees by topology.

    PubMed

    Newton, Ryan R; Newton, Irene L G

    2013-01-01

    A major goal of many evolutionary analyses is to determine the true evolutionary history of an organism. Molecular methods that rely on the phylogenetic signal generated by a few to a handful of loci can be used to approximate the evolution of the entire organism but fall short of providing a global, genome-wide, perspective on evolutionary processes. Indeed, individual genes in a genome may have different evolutionary histories. Therefore, it is informative to analyze the number and kind of phylogenetic topologies found within an orthologous set of genes across a genome. Here we present PhyBin: a flexible program for clustering gene trees based on topological structure. PhyBin can generate bins of topologies corresponding to exactly identical trees or can utilize Robinson-Foulds distance matrices to generate clusters of similar trees, using a user-defined threshold. Additionally, PhyBin allows the user to adjust for potential noise in the dataset (as may be produced when comparing very closely related organisms) by pre-processing trees to collapse very short branches or those nodes not meeting a defined bootstrap threshold. As a test case, we generated individual trees based on an orthologous gene set from 10 Wolbachia species across four different supergroups (A-D) and utilized PhyBin to categorize the complete set of topologies produced from this dataset. Using this approach, we were able to show that although a single topology generally dominated the analysis, confirming the separation of the supergroups, many genes supported alternative evolutionary histories. Because PhyBin's output provides the user with lists of gene trees in each topological cluster, it can be used to explore potential reasons for discrepancies between phylogenies including homoplasies, long-branch attraction, or horizontal gene transfer events.
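
    The idea of binning trees by topology can be illustrated with a small sketch. It is not PhyBin's code: it assumes trees are given as nested tuples of leaf names, that all trees share the same leaf set, and that topologies are compared as unrooted trees through their non-trivial bipartitions, which is the usual basis of the Robinson-Foulds distance. The function names are hypothetical.

```python
def leaf_set(tree):
    """All leaf names under a node (a leaf is a plain string)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))

def splits(tree):
    """Non-trivial bipartitions of the leaf set induced by internal edges."""
    all_leaves = leaf_set(tree)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(visit(child) for child in node))
        other = all_leaves - below
        # Keep only splits with at least two leaves on each side.
        if len(below) > 1 and len(other) > 1:
            found.add(frozenset([below, other]))
        return below

    visit(tree)
    return frozenset(found)

def rf_distance(t1, t2):
    """Number of bipartitions present in exactly one of the two trees."""
    return len(splits(t1) ^ splits(t2))

def bin_trees(trees):
    """Group tree names whose topologies are identical (distance zero)."""
    bins = {}
    for name, tree in trees.items():
        bins.setdefault(splits(tree), []).append(name)
    return sorted(bins.values(), key=len, reverse=True)

gene_trees = {
    "gene1": (("A", "B"), ("C", "D")),
    "gene2": ("A", ("B", ("C", "D"))),   # same unrooted topology as gene1
    "gene3": (("A", "C"), ("B", "D")),
}
print(bin_trees(gene_trees))
```

    Clustering of similar rather than identical trees would start from the pairwise `rf_distance` matrix and apply a distance threshold, which is omitted here.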

  4. Tree Rings: Timekeepers of the Past.

    ERIC Educational Resources Information Center

    Phipps, R. L.; McGowan, J.

    One of a series of general interest publications on science issues, this booklet describes the uses of tree rings in historical and biological recordkeeping. Separate sections cover the following topics: dating of tree rings, dating with tree rings, tree ring formation, tree ring identification, sample collections, tree ring cross dating, tree…

  5. How To Write a Municipal Tree Ordinance.

    ERIC Educational Resources Information Center

    Fazio, James R., Ed.

    1990-01-01

    At the heart of the Tree City USA program are four basic requirements: The community must have the following: (1) a tree board or department; (2) an annual community forestry program with financial provisions for trees and tree care; (3) an annual Arbor Day proclamation and observance; and (4) a tree ordinance. Sections of a model tree ordinance…

  6. Active flows on trees

    NASA Astrophysics Data System (ADS)

    Forrow, Aden; Woodhouse, Francis G.; Dunkel, Jörn

    2016-11-01

    Coherent, large scale dynamics in many nonequilibrium physical, biological, or information transport networks are driven by small-scale local energy input. We introduce and explore a generic model for compressible active flows on tree networks. In contrast to thermally-driven systems, active friction selects discrete states with only a small number of oscillation modes activated at distinct fixed amplitudes. This state selection can interact with graph topology to produce different localized dynamical time scales in separate regions of large networks. Using perturbation theory, we systematically predict the stationary states of noisy networks. Our analytical predictions agree well with a Bayesian state estimation based on a hidden Markov model applied to simulated time series data on binary trees. While the number of stable states per tree scales exponentially with the number of edges, the mean number of activated modes in each state averages 1/4 of the number of edges. More broadly, these results suggest that the macroscopic response of active networks, from actin-myosin networks in cells to flow networks in Physarum polycephalum, can be dominated by a few select modes.

  7. Suppression Situations in Multiple Linear Regression

    ERIC Educational Resources Information Center

    Shieh, Gwowen

    2006-01-01

    This article proposes alternative expressions for the two most prevailing definitions of suppression without resorting to the standardized regression modeling. The formulation provides a simple basis for the examination of their relationship. For the two-predictor regression, the author demonstrates that the previous results in the literature are…

  8. Regression Analysis: Legal Applications in Institutional Research

    ERIC Educational Resources Information Center

    Frizell, Julie A.; Shippen, Benjamin S., Jr.; Luna, Andrew L.

    2008-01-01

    This article reviews multiple regression analysis, describes how its results should be interpreted, and instructs institutional researchers on how to conduct such analyses using an example focused on faculty pay equity between men and women. The use of multiple regression analysis will be presented as a method with which to compare salaries of…

  9. Principles of Quantile Regression and an Application

    ERIC Educational Resources Information Center

    Chen, Fang; Chalhoub-Deville, Micheline

    2014-01-01

    Newer statistical procedures are typically introduced to help address the limitations of those already in practice or to deal with emerging research needs. Quantile regression (QR) is introduced in this paper as a relatively new methodology, which is intended to overcome some of the limitations of least squares mean regression (LMR). QR is more…

  10. Three-Dimensional Modeling in Linear Regression.

    ERIC Educational Resources Information Center

    Herman, James D.

    Linear regression examines the relationship between one or more independent (predictor) variables and a dependent variable. By using a particular formula, regression determines the weights needed to minimize the error term for a given set of predictors. With one predictor variable, the relationship between the predictor and the dependent variable…
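
    For the one-predictor case mentioned above, the weights that minimize the squared error have a closed form. The sketch below uses the standard least-squares formulas; the function name and the data are hypothetical and are not taken from the paper.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for one predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope = covariance of x and y divided by the variance of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

intercept, slope = fit_line([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
print(intercept, slope)
```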

  11. A Practical Guide to Regression Discontinuity

    ERIC Educational Resources Information Center

    Jacob, Robin; Zhu, Pei; Somers, Marie-Andrée; Bloom, Howard

    2012-01-01

    Regression discontinuity (RD) analysis is a rigorous nonexperimental approach that can be used to estimate program impacts in situations in which candidates are selected for treatment based on whether their value for a numeric rating exceeds a designated threshold or cut-point. Over the last two decades, the regression discontinuity approach has…

  12. Regression Analysis and the Sociological Imagination

    ERIC Educational Resources Information Center

    De Maio, Fernando

    2014-01-01

    Regression analysis is an important aspect of most introductory statistics courses in sociology but is often presented in contexts divorced from the central concerns that bring students into the discipline. Consequently, we present five lesson ideas that emerge from a regression analysis of income inequality and mortality in the USA and Canada.

  13. First analysis of risk factors associated with bee colony collapse disorder by classification and regression trees

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Sudden losses of managed honey bee (Apis mellifera L.) colonies are considered an important problem worldwide but the underlying cause or causes of these losses are currently unknown. In the United States, this syndrome was termed Colony Collapse Disorder (CCD), since the defining trait was a rapid ...

  14. Risk profiles for weight gain among postmenopausal women: A classification and regression tree analysis approach

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Risk factors for obesity and weight gain are typically evaluated individually while "adjusting for" the influence of other confounding factors, and few studies, if any, have created risk profiles by clustering risk factors. We identified subgroups of postmenopausal women homogeneous in their cluster...

  15. Pesticides in Urban Multiunit Dwellings: Hazard Identification Using Classification and Regression Tree (CART) Analysis

    EPA Science Inventory

    Many units in public housing or other low-income urban dwellings may have elevated pesticide residues, given recurring infestation, but it would be logistically and economically infeasible to sample a large number of units to identify highly exposed households to design interven...

  16. Exact solutions for species tree inference from discordant gene trees.

    PubMed

    Chang, Wen-Chieh; Górecki, Paweł; Eulenstein, Oliver

    2013-10-01

    Phylogenetic analysis has to overcome the grand challenge of inferring accurate species trees from evolutionary histories of gene families (gene trees) that are discordant with the species tree along whose branches they have evolved. Two well studied approaches to cope with this challenge are to solve either biologically informed gene tree parsimony (GTP) problems under gene duplication, gene loss, and deep coalescence, or the classic RF supertree problem that does not rely on any biological model. Despite the potential of these problems to infer credible species trees, they are NP-hard. Therefore, these problems are addressed by heuristics that typically lack any provable accuracy and precision. We describe fast dynamic programming algorithms that solve the GTP problems and the RF supertree problem exactly, and demonstrate that our algorithms can solve instances with data sets consisting of as many as 22 taxa. Extensions of our algorithms can also report the number of all optimal species trees, as well as the trees themselves. To better assess the quality of the resulting species trees that best fit the given gene trees, we also compute the worst case species trees, their numbers, and optimization score for each of the computational problems. Finally, we demonstrate the performance of our exact algorithms using empirical and simulated data sets, and analyze the quality of heuristic solutions for the studied problems by contrasting them with our exact solutions.

  17. Landscape-scale consequences of differential tree mortality from catastrophic wind disturbance in the Amazon.

    PubMed

    Rifai, Sami W; Urquiza Muñoz, José D; Negrón-Juárez, Robinson I; Ramírez Arévalo, Fredy R; Tello-Espinoza, Rodil; Vanderwel, Mark C; Lichstein, Jeremy W; Chambers, Jeffrey Q; Bohlman, Stephanie A

    2016-10-01

    Wind disturbance can create large forest blowdowns, which greatly reduces live biomass and adds uncertainty to the strength of the Amazon carbon sink. Observational studies from within the central Amazon have quantified blowdown size and estimated total mortality but have not determined which trees are most likely to die from a catastrophic wind disturbance. Also, the impact of spatial dependence upon tree mortality from wind disturbance has seldom been quantified, which is important because wind disturbance often kills clusters of trees due to large treefalls killing surrounding neighbors. We examine (1) the causes of differential mortality between adult trees from a 300-ha blowdown event in the Peruvian region of the northwestern Amazon, (2) how accounting for spatial dependence affects mortality predictions, and (3) how incorporating both differential mortality and spatial dependence affect the landscape level estimation of necromass produced from the blowdown. Standard regression and spatial regression models were used to estimate how stem diameter, wood density, elevation, and a satellite-derived disturbance metric influenced the probability of tree death from the blowdown event. The model parameters regarding tree characteristics, topography, and spatial autocorrelation of the field data were then used to determine the consequences of non-random mortality for landscape production of necromass through a simulation model. Tree mortality was highly non-random within the blowdown, where tree mortality rates were highest for trees that were large, had low wood density, and were located at high elevation. Of the differential mortality models, the non-spatial models overpredicted necromass, whereas the spatial model slightly underpredicted necromass. When parameterized from the same field data, the spatial regression model with differential mortality estimated only 7.5% more dead trees across the entire blowdown than the random mortality model, yet it estimated 51

  18. Atherosclerotic plaque regression: fact or fiction?

    PubMed

    Shanmugam, Nesan; Román-Rego, Ana; Ong, Peter; Kaski, Juan Carlos

    2010-08-01

    Coronary artery disease is the major cause of death in the western world. The formation and rapid progression of atheromatous plaques can lead to serious cardiovascular events in patients with atherosclerosis. The better understanding, in recent years, of the mechanisms leading to atheromatous plaque growth and disruption and the availability of powerful HMG CoA-reductase inhibitors (statins) has permitted the consideration of plaque regression as a realistic therapeutic goal. This article reviews the existing evidence underpinning current therapeutic strategies aimed at achieving atherosclerotic plaque regression. In this review we also discuss imaging modalities for the assessment of plaque regression, predictors of regression and whether plaque regression is associated with a survival benefit.

  19. Should metacognition be measured by logistic regression?

    PubMed

    Rausch, Manuel; Zehetleitner, Michael

    2017-03-01

    Are logistic regression slopes suitable to quantify metacognitive sensitivity, i.e. the efficiency with which subjective reports differentiate between correct and incorrect task responses? We analytically show that logistic regression slopes are independent from rating criteria in one specific model of metacognition, which assumes (i) that rating decisions are based on sensory evidence generated independently of the sensory evidence used for primary task responses and (ii) that the distributions of evidence are logistic. Given a hierarchical model of metacognition, logistic regression slopes depend on rating criteria. According to all considered models, regression slopes depend on the primary task criterion. A reanalysis of previous data revealed that massive numbers of trials are required to distinguish between hierarchical and independent models with tolerable accuracy. It is argued that researchers who wish to use logistic regression as a measure of metacognitive sensitivity need to control the primary task criterion and rating criteria.

  20. Almost efficient estimation of relative risk regression

    PubMed Central

    Fitzmaurice, Garrett M.; Lipsitz, Stuart R.; Arriaga, Alex; Sinha, Debajyoti; Greenberg, Caprice; Gawande, Atul A.

    2014-01-01

    Relative risks (RRs) are often considered the preferred measures of association in prospective studies, especially when the binary outcome of interest is common. In particular, many researchers regard RRs to be more intuitively interpretable than odds ratios. Although RR regression is a special case of generalized linear models, specifically with a log link function for the binomial (or Bernoulli) outcome, the resulting log-binomial regression does not respect the natural parameter constraints. Because log-binomial regression does not ensure that predicted probabilities are mapped to the [0,1] range, maximum likelihood (ML) estimation is often subject to numerical instability that leads to convergence problems. To circumvent these problems, a number of alternative approaches for estimating RR regression parameters have been proposed. One approach that has been widely studied is the use of Poisson regression estimating equations. The estimating equations for Poisson regression yield consistent, albeit inefficient, estimators of the RR regression parameters. We consider the relative efficiency of the Poisson regression estimator and develop an alternative, almost efficient estimator for the RR regression parameters. The proposed method uses near-optimal weights based on a Maclaurin series (Taylor series expanded around zero) approximation to the true Bernoulli or binomial weight function. This yields an almost efficient estimator while avoiding convergence problems. We examine the asymptotic relative efficiency of the proposed estimator for an increase in the number of terms in the series. Using simulations, we demonstrate the potential for convergence problems with standard ML estimation of the log-binomial regression model and illustrate how this is overcome using the proposed estimator. We apply the proposed estimator to a study of predictors of pre-operative use of beta blockers among patients undergoing colorectal surgery after diagnosis of colon cancer. PMID
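
    The Poisson estimating-equation approach mentioned above can be sketched for the simplest case of one binary exposure. This is only an illustration of that widely studied alternative, not the almost efficient estimator proposed in the paper; it solves the equations by Newton iteration for log(mu) = b0 + b1 * exposed, and the function name and data are hypothetical. Both exposure groups must be present, otherwise the 2x2 system is singular.

```python
import math

def poisson_rr(exposed, outcome, iterations=25):
    """Relative risk from the Poisson estimating equations with a log link."""
    b0 = math.log(sum(outcome) / len(outcome))  # start at the overall risk
    b1 = 0.0
    for _ in range(iterations):
        # Score vector (s0, s1) and information matrix entries (h00, h01, h11).
        s0 = s1 = h00 = h01 = h11 = 0.0
        for x, y in zip(exposed, outcome):
            mu = math.exp(b0 + b1 * x)
            s0 += y - mu
            s1 += x * (y - mu)
            h00 += mu
            h01 += x * mu
            h11 += x * x * mu
        # Newton step: solve the 2x2 system by the explicit inverse.
        det = h00 * h11 - h01 * h01
        b0 += (h11 * s0 - h01 * s1) / det
        b1 += (h00 * s1 - h01 * s0) / det
    return math.exp(b1)

# 10 exposed subjects with 6 events, 10 unexposed subjects with 3 events.
exposed = [1] * 10 + [0] * 10
outcome = [1] * 6 + [0] * 4 + [1] * 3 + [0] * 7
print(poisson_rr(exposed, outcome))
```

    With a single binary exposure the solution coincides with the ratio of the two observed risks, which makes the example easy to check by hand; the fitted means are not constrained to stay below 1 in general.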

  1. Generalized linear and generalized additive models in studies of species distributions: Setting the scene

    USGS Publications Warehouse

    Guisan, A.; Edwards, T.C.; Hastie, T.

    2002-01-01

    An important statistical development of the last 30 years has been the advance in regression analysis provided by generalized linear models (GLMs) and generalized additive models (GAMs). Here we introduce a series of papers prepared within the framework of an international workshop entitled: Advances in GLMs/GAMs modeling: from species distribution to environmental management, held in Riederalp, Switzerland, 6-11 August 2001. We first discuss some general uses of statistical models in ecology, as well as provide a short review of several key examples of the use of GLMs and GAMs in ecological modeling efforts. We next present an overview of GLMs and GAMs, and discuss some of their related statistics used for predictor selection, model diagnostics, and evaluation. Included is a discussion of several new approaches applicable to GLMs and GAMs, such as ridge regression, an alternative to stepwise selection of predictors, and methods for the identification of interactions by a combined use of regression trees and several other approaches. We close with an overview of the papers and how we feel they advance our understanding of their application to ecological modeling. © 2002 Elsevier Science B.V. All rights reserved.

  2. Investigating how students communicate tree-thinking

    NASA Astrophysics Data System (ADS)

    Boyce, Carrie Jo

    Learning is often an active endeavor that requires that students work at building conceptual understandings of complex topics. Personal experiences, ideas, and communication all play large roles in developing knowledge of and understanding complex topics. Sometimes these experiences can promote formation of scientifically inaccurate or incomplete ideas. Representations are tools used to help individuals understand complex topics. In biology, one way that educators help people understand evolutionary histories of organisms is by using representations called phylogenetic trees. In order to understand phylogenetic trees, individuals need to understand the conventions associated with phylogenies. My dissertation, supported by the Tree-Thinking Representational Competence and Word Association frameworks, is a mixed-methods study investigating the changes in students' tree-reading, representational competence and mental association of phylogenetic terminology after participation in varied instruction. Participants included 128 introductory biology majors from a mid-sized southern research university. Participants were enrolled in either Introductory Biology I, where they were not taught phylogenetics, or Introductory Biology II, where they were explicitly taught phylogenetics. I collected data using a pre- and post-assessment consisting of a word association task and tree-thinking diagnostic (n=128). Additionally, I recruited a subset of students from both courses (n=37) to complete a computer simulation designed to teach students about phylogenetic trees. I then conducted semi-structured interviews consisting of a word association exercise with card sort task, a retrospective pre-assessment discussion, a post-assessment discussion, and interview questions. I found that students who received explicit lecture instruction had a significantly higher increase in scores on a tree-thinking diagnostic than students who did not receive lecture instruction. Students who received both

  3. Informing tree-ring reconstructions with automated dendrometer data: the case of single-leaf pinyon (Pinus monophylla) from Great Basin National Park, Nevada, USA

    NASA Astrophysics Data System (ADS)

    Biondi, F.

    2012-12-01

    One of the most pressing issues in modern tree-ring science is to reduce uncertainty of reconstructions while emphasizing that the composition and dynamics of modern ecosystems cannot be understood from the present alone. I present here the latest results from research on the environmental factors that control radial growth of single-leaf pinyon (Pinus monophylla) in the Great Basin of North America using dendrometer data collected at half-hour intervals during two full growing seasons, 2010 and 2011. Automated (solar-powered) sensors at the site consisted of 8 point dendrometers installed on 7 trees to measure stem size, together with environmental probes that recorded air temperature, soil temperature and soil moisture. Additional meteorological variables at hourly timesteps were available from the EPA-CASTNET station located within 100 m of the dendrometer site. Daily cycles of stem expansion and contraction were quantified using the approach of Deslauriers et al. 2011, and the amount of daily radial stem increment was regressed against environmental variables. Graphical and numerical results showed that tree growth is relatively insensitive to surface soil moisture during the growing season. This finding corroborates empirical dendroclimatic results that showed how tree-ring chronologies of single-leaf pinyon are mostly a proxy for the balance between winter-spring precipitation supply and growing season evapotranspiration demand, thereby making it an ideal species for drought reconstructions.

  4. Lidar-based measurement of surface roughness features of single tree crowns

    NASA Astrophysics Data System (ADS)

    Kolditz, Melanie; Krahwinkler, Petra M.; Roßmann, Jürgen

    2011-11-01

    In remote sensing data, trees have a low interspecies variability and show a high variability within the tree species. Therefore, specific features that distinguish between unique properties of two tree species are required for a single tree based genera classification. To improve classification results, the suitability of seven surface roughness features, calculated on single tree crown regions, is studied. The algorithms developed to provide roughness parameters can be validated and prototyped in a Virtual Forest testbed. The features are extracted from a normalized digital surface model with a resolution of 0.4 m per pixel. Within the test area of 340 km² more than 4000 single trees of eleven different species and additionally 200 buildings are available as reference data. Technical standards define several parameters to describe surface properties. These roughness features are evaluated in the context of single tree crowns. All of these features are based on the deviation of the height values of the tree crown to its mean height. As an additional feature the relationship between the crown's surface area and its occupied ground area is used. The evaluation results of these features regarding the discrimination of tree species on different levels - eleven single tree species, seven tree classes, deciduous and coniferous - and also towards discrimination of trees from buildings will be presented.

  5. Chilling and heat requirements for flowering in temperate fruit trees

    NASA Astrophysics Data System (ADS)

    Guo, Liang; Dai, Junhu; Ranjitkar, Sailesh; Yu, Haiying; Xu, Jianchu; Luedeling, Eike

    2014-08-01

    Climate change has affected the rates of chilling and heat accumulation, which are vital for flowering and production, in temperate fruit trees, but few studies have been conducted in the cold-winter climates of East Asia. To evaluate tree responses to variation in chill and heat accumulation rates, partial least squares regression was used to correlate first flowering dates of chestnut (Castanea mollissima Blume) and jujube (Zizyphus jujube Mill.) in Beijing, China, with daily chill and heat accumulation between 1963 and 2008. The Dynamic Model and the Growing Degree Hour Model were used to convert daily records of minimum and maximum temperature into horticulturally meaningful metrics. Regression analyses identified the chilling and forcing periods for chestnut and jujube. The forcing periods started when half the chilling requirements were fulfilled. Over the past 50 years, heat accumulation during tree dormancy increased significantly, while chill accumulation remained relatively stable for both species. Heat accumulation was the main driver of bloom timing, with effects of variation in chill accumulation negligible in Beijing's cold-winter climate. It does not seem likely that reductions in chill will have a major effect on the studied species in Beijing in the near future. Such problems are much more likely for trees grown in locations that are substantially warmer than their native habitats, such as temperate species in the subtropics and tropics.

  6. Chilling and heat requirements for flowering in temperate fruit trees.

    PubMed

    Guo, Liang; Dai, Junhu; Ranjitkar, Sailesh; Yu, Haiying; Xu, Jianchu; Luedeling, Eike

    2014-08-01

    Climate change has affected the rates of chilling and heat accumulation, which are vital for flowering and production, in temperate fruit trees, but few studies have been conducted in the cold-winter climates of East Asia. To evaluate tree responses to variation in chill and heat accumulation rates, partial least squares regression was used to correlate first flowering dates of chestnut (Castanea mollissima Blume) and jujube (Zizyphus jujube Mill.) in Beijing, China, with daily chill and heat accumulation between 1963 and 2008. The Dynamic Model and the Growing Degree Hour Model were used to convert daily records of minimum and maximum temperature into horticulturally meaningful metrics. Regression analyses identified the chilling and forcing periods for chestnut and jujube. The forcing periods started when half the chilling requirements were fulfilled. Over the past 50 years, heat accumulation during tree dormancy increased significantly, while chill accumulation remained relatively stable for both species. Heat accumulation was the main driver of bloom timing, with effects of variation in chill accumulation negligible in Beijing’s cold-winter climate. It does not seem likely that reductions in chill will have a major effect on the studied species in Beijing in the near future. Such problems are much more likely for trees grown in locations that are substantially warmer than their native habitats, such as temperate species in the subtropics and tropics.

  7. Investigating bias in squared regression structure coefficients

    PubMed Central

    Nimon, Kim F.; Zientek, Linda R.; Thompson, Bruce

    2015-01-01

    The importance of structure coefficients and analogs of regression weights for analysis within the general linear model (GLM) has been well-documented. The purpose of this study was to investigate bias in squared structure coefficients in the context of multiple regression and to determine if a formula that had been shown to correct for bias in squared Pearson correlation coefficients and coefficients of determination could be used to correct for bias in squared regression structure coefficients. Using data from a Monte Carlo simulation, this study found that squared regression structure coefficients corrected with Pratt's formula produced less biased estimates and might be more accurate and stable estimates of population squared regression structure coefficients than estimates with no such corrections. While our findings are in line with prior literature that identified multicollinearity as a predictor of bias in squared regression structure coefficients but not coefficients of determination, the findings from this study are unique in that the level of predictive power, number of predictors, and sample size were also observed to contribute bias in squared regression structure coefficients. PMID:26217273

  8. Investigating bias in squared regression structure coefficients.

    PubMed

    Nimon, Kim F; Zientek, Linda R; Thompson, Bruce

    2015-01-01

    The importance of structure coefficients and analogs of regression weights for analysis within the general linear model (GLM) has been well-documented. The purpose of this study was to investigate bias in squared structure coefficients in the context of multiple regression and to determine if a formula that had been shown to correct for bias in squared Pearson correlation coefficients and coefficients of determination could be used to correct for bias in squared regression structure coefficients. Using data from a Monte Carlo simulation, this study found that squared regression structure coefficients corrected with Pratt's formula produced less biased estimates and might be more accurate and stable estimates of population squared regression structure coefficients than estimates with no such corrections. While our findings are in line with prior literature that identified multicollinearity as a predictor of bias in squared regression structure coefficients but not coefficients of determination, the findings from this study are unique in that the level of predictive power, number of predictors, and sample size were also observed to contribute bias in squared regression structure coefficients.

  9. Regression modeling of ground-water flow

    USGS Publications Warehouse

    Cooley, R.L.; Naff, R.L.

    1985-01-01

    Nonlinear multiple regression methods are developed to model and analyze groundwater flow systems. Complete descriptions of regression methodology as applied to groundwater flow models allow scientists and engineers engaged in flow modeling to apply the methods to a wide range of problems. Organization of the text proceeds from an introduction that discusses the general topic of groundwater flow modeling, to a review of basic statistics necessary to properly apply regression techniques, and then to the main topic: exposition and use of linear and nonlinear regression to model groundwater flow. Statistical procedures are given to analyze and use the regression models. A number of exercises and answers are included to exercise the student on nearly all the methods that are presented for modeling and statistical analysis. Three computer programs implement the more complex methods. These three are a general two-dimensional, steady-state regression model for flow in an anisotropic, heterogeneous porous medium, a program to calculate a measure of model nonlinearity with respect to the regression parameters, and a program to analyze model errors in computed dependent variables such as hydraulic head. (USGS)

  10. Isoprene emission from tropical tree species.

    PubMed

    Padhy, P K; Varshney, C K

    2005-05-01

    Foliar emission of isoprene was measured in nine commonly growing tree species of Delhi, India. Dynamic flow enclosure technique was used and gas samples were collected onto Tenax-GC/Carboseive cartridges, which were then attached to the sample injection system in the gas chromatograph (GC). Eluting compounds were analysed using a flame ionisation detector (FID). Out of the nine tree species, isoprene emission was found in six species (Eucalyptus sp., Ficus benghalensis, Ficus religiosa, Mangifera indica, Melia azedarach, and Syzygium jambolanum), whereas, in the remaining three tree species (Alstonia scholaris, Azadirachta indica, and Cassia fistula) no isoprene emission was detected or the levels of emission were negligible or below the detection limit (BDL). Among six tree species, the highest hourly emission (10.2 ± 6.8 µg g⁻¹ leaf dry weight, average of five seasons) was observed in Ficus religiosa, while minimum emission was from Melia azedarach (2.2 ± 4.9 µg g⁻¹ leaf dry weight, average of five seasons). Isoprene emission (average of six species), over five seasons, was found to vary between 3.9 and 8.5 µg g⁻¹ leaf dry weight during the rainy season. In addition, significant diurnal variation in isoprene emission was observed in each species. The preliminary estimate made in this study on the annual biogenic VOC emission from India may probably be the first of its kind from this part of the world.

  11. Relative risk regression analysis of epidemiologic data.

    PubMed

    Prentice, R L

    1985-11-01

    Relative risk regression methods are described. These methods provide a unified approach to a range of data analysis problems in environmental risk assessment and in the study of disease risk factors more generally. Relative risk regression methods are most readily viewed as an outgrowth of Cox's regression and life model. They can also be viewed as a regression generalization of more classical epidemiologic procedures, such as that due to Mantel and Haenszel. In the context of an epidemiologic cohort study, relative risk regression methods extend conventional survival data methods and binary response (e.g., logistic) regression models by taking explicit account of the time to disease occurrence while allowing arbitrary baseline disease rates, general censorship, and time-varying risk factors. This latter feature is particularly relevant to many environmental risk assessment problems wherein one wishes to relate disease rates at a particular point in time to aspects of a preceding risk factor history. Relative risk regression methods also adapt readily to time-matched case-control studies and to certain less standard designs. The uses of relative risk regression methods are illustrated and the state of development of these procedures is discussed. It is argued that asymptotic partial likelihood estimation techniques are now well developed in the important special case in which the disease rates of interest have interpretations as counting process intensity functions. Estimation of relative risks processes corresponding to disease rates falling outside this class has, however, received limited attention. The general area of relative risk regression model criticism has, as yet, not been thoroughly studied, though a number of statistical groups are studying such features as tests of fit, residuals, diagnostics and graphical procedures. Most such studies have been restricted to exponential form relative risks as have simulation studies of relative risk estimation

  12. A comparison of regression and regression-kriging for soil characterization using remote sensing imagery

    Technology Transfer Automated Retrieval System (TEKTRAN)

    In precision agriculture, regression has been used widely to quantify the relationship between soil attributes and other environmental variables. However, spatial correlation existing in soil samples usually makes the regression model suboptimal. In this study, a regression-kriging method was attemp...

  13. Regressive language in severe head injury.

    PubMed

    Thomsen, I V; Skinhoj, E

    1976-09-01

    In a follow-up study of 50 patients with severe head injuries three patients had echolalia. One patient with initially global aphasia had echolalia for some weeks when he started talking. Another patient with severe diffuse brain damage, dementia, and emotional regression had echolalia. The dysfunction was considered a detour performance. In the third patient echolalia and palilalia were details in a total pattern of regression lasting for months. The patient, who had extensive frontal atrophy secondary to a very severe head trauma, presented an extreme state of regression returning to a foetal-body pattern and behaving like a baby.

  14. Regression of altitude-produced cardiac hypertrophy.

    NASA Technical Reports Server (NTRS)

    Sizemore, D. A.; Mcintyre, T. W.; Van Liere, E. J.; Wilson, M. F.

    1973-01-01

    The rate of regression of cardiac hypertrophy with time has been determined in adult male albino rats. The hypertrophy was induced by intermittent exposure to simulated high altitude. The percentage hypertrophy was much greater (46%) in the right ventricle than in the left (16%). The regression could be adequately fitted to a single exponential function with a half-time of 6.73 ± 0.71 days (90% CI). There was no significant difference in the rates of regression for the two ventricles.
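
    As a rough illustration of what a single-exponential fit with a half-time of 6.73 days implies, the sketch below computes the fraction of the initial hypertrophy remaining after a given number of days. The function names are hypothetical, and the model form (remaining fraction = 0.5^(t / half-time)) is the standard single-exponential assumption, not code or parameters taken from the report beyond the half-time itself.

```python
import math

# Half-time reported in the abstract (days); everything else here is illustrative.
HALF_TIME_DAYS = 6.73


def remaining_fraction(t_days, half_time=HALF_TIME_DAYS):
    """Fraction of the initial hypertrophy left after t_days,
    assuming single-exponential decay with the given half-time."""
    return 0.5 ** (t_days / half_time)


def rate_constant(half_time=HALF_TIME_DAYS):
    """Equivalent first-order rate constant k = ln(2) / half-time (per day)."""
    return math.log(2) / half_time


if __name__ == "__main__":
    for day in (0, 7, 14, 28):
        print(day, round(remaining_fraction(day), 3))
```

    Under this assumption, one half-time leaves half of the hypertrophy and two half-times leave a quarter.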

  15. Elevation-dependent responses of tree mast seeding to climate change over 45 years

    PubMed Central

    Allen, Robert B; Hurst, Jennifer M; Portier, Jeanne; Richardson, Sarah J

    2014-01-01

    We use seed count data from a New Zealand mono-specific mountain beech forest to test for decadal trends in seed production along an elevation gradient in relation to changes in climate. Seedfall was collected (1965 to 2009) from seed trays located on transect lines at fixed elevations along an elevation gradient (1020 to 1370 m). We counted the number of seeds in the catch of each tray, for each year, and determined the number of viable seeds. Climate variables were obtained from a nearby (<2 km) climate station (914-m elevation). Variables were the sum or mean of daily measurements, using periods within each year known to correlate with subsequent interannual variation in seed production. To determine trends in mean seed production, at each elevation, and climate variables, we used generalized least squares (GLS) regression. We demonstrate a trend of increasing total and viable seed production, particularly at higher elevations, which emerged from marked interannual variation. Significant changes in four seasonal climate variables had GLS regression coefficients consistent with predictions of increased seed production. These variables subsumed the effect of year in GLS regressions with a greater influence on seed production with increasing elevation. Regression models enforce a view that the sequence of climate variables was additive in their influence on seed production throughout a reproductive cycle spanning more than 2 years and including three summers. Models with the most support always included summer precipitation as the earliest variable in the sequence followed by summer maximum daily temperatures. We interpret this as reflecting precipitation driven increases in soil nutrient availability enhancing seed production at higher elevations rather than the direct effects of climate, stand development or rising atmospheric CO2 partial pressures. Greater sensitivity of tree seeding at higher elevations to changes in climate reveals how ecosystem responses to

  16. Using Evidence-Based Decision Trees Instead of Formulas to Identify At-Risk Readers. REL 2014-036

    ERIC Educational Resources Information Center

    Koon, Sharon; Petscher, Yaacov; Foorman, Barbara R.

    2014-01-01

    This study examines whether the classification and regression tree (CART) model improves the early identification of students at risk for reading comprehension difficulties compared with the more difficult to interpret logistic regression model. CART is a type of predictive modeling that relies on nonparametric techniques. It presents results in…

  17. Clustering with shallow trees

    NASA Astrophysics Data System (ADS)

    Bailly-Bechet, M.; Bradde, S.; Braunstein, A.; Flaxman, A.; Foini, L.; Zecchina, R.

    2009-12-01

    We propose a new method for obtaining hierarchical clustering based on the optimization of a cost function over trees of limited depth, and we derive a message-passing method that allows one to use it efficiently. The method and the associated algorithm can be interpreted as a natural interpolation between two well-known approaches, namely that of single linkage and the recently presented affinity propagation. We analyse using this general scheme three biological/medical structured data sets (human population based on genetic information, proteins based on sequences and verbal autopsies) and show that the interpolation technique provides new insight.

  18. Save a Tree

    NASA Astrophysics Data System (ADS)

    Williams, Kathryn R.

    1999-10-01

    Starting in September 1925, JCE reproduced pictures of famous chemists or chemistry-related works of art as frontispieces. Often, the Journal included a biography or other article about the picture. The August 1945 frontispiece featured the largest cork oak in the United States. An accompanying article described the goals of the Cork Project to plant cork trees in suitable locations in the U.S., to compensate for uncertain European and African sources during World War II. The final frontispiece appeared in December 1956. To view supplementary material, please refer to JCE Online's supplementary links.

  19. Palm tree peroxidases.

    PubMed

    Sakharov, I Yu

    2004-08-01

    Over the years, novel plant peroxidases have been isolated from palm tree leaves. Some molecular and catalytic properties of palm peroxidases have been studied. The substrate specificity of palm peroxidases is distinct from the specificity of other plant peroxidases. Palm peroxidases show extremely high stability under acidic and alkaline conditions and high thermal stability. Moreover, these enzymes are more stable with respect to hydrogen peroxide treatment than other peroxidases. Due to their extremely high stability, palm peroxidases have been used successfully in the development of new bioanalytical tests, the construction of improved biosensors, and in polymer synthesis.

  20. Automatic localization of bifurcations and vessel crossings in digital fundus photographs using location regression

    NASA Astrophysics Data System (ADS)

    Niemeijer, Meindert; Dumitrescu, Alina V.; van Ginneken, Bram; Abrámoff, Michael D.

    2011-03-01

    Parameters extracted from the vasculature on the retina are correlated with various conditions such as diabetic retinopathy and cardiovascular diseases such as stroke. Segmentation of the vasculature on the retina has been a topic that has received much attention in the literature over the past decade. Analysis of the segmentation result, however, has only received limited attention with most works describing methods to accurately measure the width of the vessels. Analyzing the connectedness of the vascular network is an important step towards the characterization of the complete vascular tree. The retinal vascular tree, from an image interpretation point of view, originates at the optic disc and spreads out over the retina. The tree bifurcates and the vessels also cross each other. The points where this happens form the key to determining the connectedness of the complete tree. We present a supervised method to detect the bifurcations and crossing points of the vasculature of the retina. The method uses features extracted from the vasculature as well as the image in a location regression approach to find those locations of the segmented vascular tree where the bifurcation or crossing occurs (from here, POI, points of interest). We evaluate the method on the publicly available DRIVE database in which an ophthalmologist has marked the POI.

  1. Error Tree: A Tree Structure for Hamming and Edit Distances and Wildcards Matching.

    PubMed

    Al-Okaily, Anas

    2015-12-01

    Approximate pattern matching is a fundamental problem in bioinformatics and information retrieval applications. The problem involves different matching relations such as Hamming distance, edit distance, and the wildcards matching problem. The input is usually a text of length n over a fixed alphabet of size Σ, a pattern P of length m, and an integer k. The output is to find all positions that have ≤ k Hamming distance, edit distance, or wildcards matching with P. Many algorithms and indexes have been proposed to solve the problems more efficiently, but due to the space and time complexities of the problems, most tools adopted heuristic approaches based on, for instance, suffix tree, suffix array, or Burrows Wheeler Transform to reach practical implementations. Error Tree is a novel tree structure that is mainly oriented to solve the approximate pattern matching problems, using less space and faster computation time. The algorithm proposes for Hamming distance and wildcards matching a tree structure that needs [Formula: see text] words and takes [Formula: see text] in the average case) of query time for any online/offline pattern, where occ is the number of outputs. In addition, a tree structure of [Formula: see text] words and [Formula: see text] in the average case) query time for edit distance for any online/offline pattern.
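
    To make the problem statement concrete, here is a minimal brute-force sketch of the Hamming-distance variant: report every text position whose length-m window differs from the pattern in at most k characters. This is a naive O(nm) baseline written for illustration only; it is not the Error Tree structure described in the record, and the function name is hypothetical.

```python
def hamming_matches(text, pattern, k):
    """Return all start positions i such that text[i:i+m] differs from
    pattern in at most k positions (naive scan, not an index structure)."""
    m = len(pattern)
    hits = []
    for i in range(len(text) - m + 1):
        mismatches = 0
        for a, b in zip(text[i:i + m], pattern):
            if a != b:
                mismatches += 1
                if mismatches > k:
                    break  # this window already exceeds the budget
        if mismatches <= k:
            hits.append(i)
    return hits


if __name__ == "__main__":
    print(hamming_matches("ACGTACGA", "ACGT", 1))
```

    Index-based approaches such as the one in the record aim to answer the same query without rescanning the whole text for each pattern.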

  2. Tree shrew database (TreeshrewDB): a genomic knowledge base for the Chinese tree shrew

    PubMed Central

    Fan, Yu; Yu, Dandan; Yao, Yong-Gang

    2014-01-01

    The tree shrew (Tupaia belangeri) is a small mammal with a close relationship to primates and it has been proposed as an alternative experimental animal to primates in biomedical research. The recent release of a high-quality Chinese tree shrew genome enables more researchers to use this species as the model animal in their studies. With the aim of making access to an extensively annotated genome database straightforward and easy, we have created the Tree shrew Database (TreeshrewDB). This is a web-based platform that integrates the currently available data from the tree shrew genome, including an updated gene set, with a systematic functional annotation and an mRNA expression pattern. In addition, to assist with automatic gene sequence analysis, we have integrated the common programs Blast, Muscle, GBrowse, GeneWise and codeml, into TreeshrewDB. We have also developed a pipeline for the analysis of positive selection. The user-friendly interface of TreeshrewDB, which is available at http://www.treeshrewdb.org, will undoubtedly help in many areas of biological research into the tree shrew. PMID:25413576

  3. Inference of reversible tree languages.

    PubMed

    López, Damián; Sempere, José M; García, Pedro

    2004-08-01

    In this paper, we study the notion of k-reversibility and k-testability when regular tree languages are involved. We present an inference algorithm for learning a k-testable tree language that runs in polynomial time with respect to the size of the sample used. We also study the tree language classes in relation to other well known ones, and some properties of these languages are proven.

  4. Large Deviations for Random Trees

    PubMed Central

    Heitsch, Christine

    2010-01-01

    We consider large random trees under Gibbs distributions and prove a Large Deviation Principle (LDP) for the distribution of degrees of vertices of the tree. The LDP rate function is given explicitly. An immediate consequence is a Law of Large Numbers for the distribution of vertex degrees in a large random tree. Our motivation for this study comes from the analysis of RNA secondary structures. PMID:20216937

  5. Kepler AutoRegressive Planet Search

    NASA Astrophysics Data System (ADS)

    Feigelson, Eric

    NASA's Kepler mission is the source of more exoplanets than any other instrument, but the discovery depends on complex statistical analysis procedures embedded in the Kepler pipeline. A particular challenge is mitigating irregular stellar variability without loss of sensitivity to faint periodic planetary transits. This proposal presents a two-stage alternative analysis procedure. First, parametric autoregressive ARFIMA models, commonly used in econometrics, remove most of the stellar variations. Second, a novel matched filter is used to create a periodogram from which transit-like periodicities are identified. This analysis procedure, the Kepler AutoRegressive Planet Search (KARPS), is confirming most of the Kepler Objects of Interest and is expected to identify additional planetary candidates. The proposed research will complete application of the KARPS methodology to the prime Kepler mission light curves of 200,000 stars, and compare the results with Kepler Objects of Interest obtained with the Kepler pipeline. We will then conduct a variety of astronomical studies based on the KARPS results. Important subsamples will be extracted including Habitable Zone planets, hot super-Earths, grazing-transit hot Jupiters, and multi-planet systems. Ground-based spectroscopy of poorly studied candidates will be performed to better characterize the host stars. Studies of stellar variability will then be pursued based on KARPS analysis. The autocorrelation function and nonstationarity measures will be used to identify spotted stars at different stages of autoregressive modeling. Periodic variables with folded light curves inconsistent with planetary transits will be identified; they may be eclipsing or mutually-illuminating binary star systems. Classification of stellar variables with KARPS-derived statistical properties will be attempted. KARPS procedures will then be applied to archived K2 data to identify planetary transits and characterize stellar variability.

  6. A new bivariate negative binomial regression model

    NASA Astrophysics Data System (ADS)

    Faroughi, Pouya; Ismail, Noriszura

    2014-12-01

    This paper introduces a new form of bivariate negative binomial (BNB-1) regression which can be fitted to bivariate and correlated count data with covariates. The BNB regression discussed in this study can be fitted to bivariate and overdispersed count data with positive, zero or negative correlations. The joint p.m.f. of the BNB-1 distribution is derived from the product of two negative binomial marginals with a multiplicative factor parameter. Several testing methods were used to check overdispersion and goodness-of-fit of the model. Application of BNB-1 regression is illustrated on a Malaysian motor insurance dataset. The results indicated that BNB-1 regression has a better fit than bivariate Poisson and BNB-2 models with regard to the Akaike information criterion.

  7. Some Simple Computational Formulas for Multiple Regression

    ERIC Educational Resources Information Center

    Aiken, Lewis R., Jr.

    1974-01-01

    Short-cut formulas are presented for direct computation of the beta weights, the standard errors of the beta weights, and the multiple correlation coefficient for multiple regression problems involving three independent variables and one dependent variable. (Author)

  8. An introduction to multilevel regression models.

    PubMed

    Austin, P C; Goel, V; van Walraven, C

    2001-01-01

    Data in health research are frequently structured hierarchically. For example, data may consist of patients nested within physicians, who in turn may be nested in hospitals or geographic regions. Fitting regression models that ignore the hierarchical structure of the data can lead to false inferences being drawn from the data. Implementing a statistical analysis that takes into account the hierarchical structure of the data requires special methodologies. In this paper, we introduce the concept of hierarchically structured data, and present an introduction to hierarchical regression models. We then compare the performance of a traditional regression model with that of a hierarchical regression model on a dataset relating test utilization at the annual health exam with patient and physician characteristics. In comparing the resultant models, we see that false inferences can be drawn by ignoring the structure of the data.

  9. Multiple Instance Regression with Structured Data

    NASA Technical Reports Server (NTRS)

    Wagstaff, Kiri L.; Lane, Terran; Roper, Alex

    2008-01-01

    This slide presentation reviews the use of multiple instance regression with structured data from multiple and related data sets. It applies the concept to a practical problem, that of estimating crop yield using remote sensed country wide weekly observations.

  10. Bayesian Comparison of Two Regression Lines.

    ERIC Educational Resources Information Center

    Tsutakawa, Robert K.

    1978-01-01

    A Bayesian solution is presented for the Johnson-Neyman problem (whether or not the distance between two regression lines is statistically significant over a finite interval of the independent variable). (Author/CTM)

  11. TWSVR: Regression via Twin Support Vector Machine.

    PubMed

    Khemchandani, Reshma; Goyal, Keshav; Chandra, Suresh

    2016-02-01

    Taking motivation from Twin Support Vector Machine (TWSVM) formulation, Peng (2010) attempted to propose Twin Support Vector Regression (TSVR) where the regressor is obtained via solving a pair of quadratic programming problems (QPPs). In this paper we argue that TSVR formulation is not in the true spirit of TWSVM. Further, taking motivation from Bi and Bennett (2003), we propose an alternative approach to find a formulation for Twin Support Vector Regression (TWSVR) which is in the true spirit of TWSVM. We show that our proposed TWSVR can be derived from TWSVM for an appropriately constructed classification problem. To check the efficacy of our proposed TWSVR we compare its performance with TSVR and classical Support Vector Regression (SVR) on various regression datasets.

  12. Dieback and episodic mortality of Cercidium microphyllum (foothill paloverde), a dominant Sonoran Desert tree

    USGS Publications Warehouse

    Bowers, Janice E.; Turner, R.M.

    2001-01-01

    Past and current dieback of Cercidium microphyllum, a dominant, drought-deciduous tree in the Sonoran Desert, was investigated at Tumamoc Hill, Tucson, Arizona, USA. Logistic regression predicted that the odds of a Cercidium plant being alive should decrease with increasing circumference, association with the columnar cactus Carnegiea gigantea, and occurrence on steep slopes. Slope azimuth, parasitization by Phoradendron californicum, and distance to nearest Cercidium within 5 m did not significantly affect the odds of survival. Carnegiea was a source of background mortality rather than a primary cause of dieback. Of the >1,000 living and dead plants sampled, 7.7% had died within the past 5 to 7 years. An additional 12.8% died in the more distant past. Diebacks tended to occur during severe deficits in annual, especially summer, rain. More than half of the dead plants in the sample were ≥50 cm in girth. In current and past diebacks on Tumamoc Hill, it seems likely that severe drought interacted with natural senescence of an aging population, weakening large, old trees and hastening their deaths.

  13. Elevation, Not Deforestation, Promotes Genetic Differentiation in a Pioneer Tropical Tree

    PubMed Central

    Castilla, Antonio R.; Pope, Nathaniel; Jaffé, Rodolfo; Jha, Shalene

    2016-01-01

    The regeneration of disturbed forest is an essential part of tropical forest ecology, both with respect to natural disturbance regimes and large-scale human-mediated logging, grazing, and agriculture. Pioneer tree species are critical for facilitating the transition from deforested land to secondary forest because they stabilize terrain and enhance connectivity between forest fragments by increasing matrix permeability and initiating disperser community assembly. Despite the ecological importance of early successional species, little is known about their ability to maintain gene flow across deforested landscapes. Utilizing highly polymorphic microsatellite markers, we examined patterns of genetic diversity and differentiation for the pioneer understory tree Miconia affinis across the Isthmus of Panama. Furthermore, we investigated the impact of geographic distance, forest cover, and elevation on genetic differentiation among populations using circuit theory and regression modeling within a landscape genetics framework. We report marked differences in historical and contemporary migration rates and moderately high levels of genetic differentiation in M. affinis populations across the Isthmus of Panama. Genetic differentiation increased significantly with elevation and geographic distance among populations; however, we did not find that forest cover enhanced or reduced genetic differentiation in the study region. Overall, our results reveal strong dispersal for M. affinis across human-altered landscapes, highlighting the potential use of this species for reforestation in tropical regions. Additionally, this study demonstrates the importance of considering topography when designing programs aimed at conserving genetic diversity within degraded tropical landscapes. PMID:27280872

  14. Elevation, Not Deforestation, Promotes Genetic Differentiation in a Pioneer Tropical Tree.

    PubMed

    Castilla, Antonio R; Pope, Nathaniel; Jaffé, Rodolfo; Jha, Shalene

    2016-01-01

    The regeneration of disturbed forest is an essential part of tropical forest ecology, both with respect to natural disturbance regimes and large-scale human-mediated logging, grazing, and agriculture. Pioneer tree species are critical for facilitating the transition from deforested land to secondary forest because they stabilize terrain and enhance connectivity between forest fragments by increasing matrix permeability and initiating disperser community assembly. Despite the ecological importance of early successional species, little is known about their ability to maintain gene flow across deforested landscapes. Utilizing highly polymorphic microsatellite markers, we examined patterns of genetic diversity and differentiation for the pioneer understory tree Miconia affinis across the Isthmus of Panama. Furthermore, we investigated the impact of geographic distance, forest cover, and elevation on genetic differentiation among populations using circuit theory and regression modeling within a landscape genetics framework. We report marked differences in historical and contemporary migration rates and moderately high levels of genetic differentiation in M. affinis populations across the Isthmus of Panama. Genetic differentiation increased significantly with elevation and geographic distance among populations; however, we did not find that forest cover enhanced or reduced genetic differentiation in the study region. Overall, our results reveal strong dispersal for M. affinis across human-altered landscapes, highlighting the potential use of this species for reforestation in tropical regions. Additionally, this study demonstrates the importance of considering topography when designing programs aimed at conserving genetic diversity within degraded tropical landscapes.

  15. Discriminative Elastic-Net Regularized Linear Regression.

    PubMed

    Zhang, Zheng; Lai, Zhihui; Xu, Yong; Shao, Ling; Wu, Jian; Xie, Guo-Sen

    2017-03-01

    In this paper, we aim at learning compact and discriminative linear regression models. Linear regression has been widely used in different problems. However, most of the existing linear regression methods exploit the conventional zero-one matrix as the regression targets, which greatly narrows the flexibility of the regression model. Another major limitation of these methods is that the learned projection matrix fails to precisely project the image features to the target space due to their weak discriminative capability. To this end, we present an elastic-net regularized linear regression (ENLR) framework, and develop two robust linear regression models which possess the following special characteristics. First, our methods exploit two particular strategies to enlarge the margins of different classes by relaxing the strict binary targets into a more feasible variable matrix. Second, a robust elastic-net regularization of singular values is introduced to enhance the compactness and effectiveness of the learned projection matrix. Third, the resulting optimization problem of ENLR has a closed-form solution in each iteration, which can be solved efficiently. Finally, rather than directly exploiting the projection matrix for recognition, our methods employ the transformed features as the new discriminative representations to make the final image classification. Compared with the traditional linear regression model and some of its variants, our method is much more accurate in image classification. Extensive experiments conducted on publicly available data sets demonstrate that the proposed framework can outperform the state-of-the-art methods. The MATLAB codes of our methods are available at http://www.yongxu.org/lunwen.html.

  16. The Geometry of Enhancement in Multiple Regression

    ERIC Educational Resources Information Center

    Waller, Niels G.

    2011-01-01

    In linear multiple regression, "enhancement" is said to occur when $R^2 = b'r > r'r$, where $b$ is a $p \times 1$ vector of standardized regression coefficients and $r$ is a $p \times 1$ vector of correlations between a criterion $y$ and a set of standardized regressors, $x$. When $p = 1$ then $b \cong r$ and…
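
    The enhancement condition in this abstract can be checked numerically for the two-predictor case. The sketch below is only an illustration: the function name, the predictor intercorrelation r12, and the three correlation values are assumptions chosen for the example, not figures from the paper. It uses the standard result that standardized coefficients solve b = Rxx^{-1} r.

```python
def two_predictor_r2(r1, r2, r12):
    """Standardized two-predictor regression from correlations alone.

    r1, r2: correlations of each predictor with the criterion y.
    r12:    correlation between the two predictors (assumed value).
    Returns ((b1, b2), R^2 = b'r, r'r).
    """
    det = 1.0 - r12 ** 2              # determinant of the 2x2 predictor correlation matrix
    b1 = (r1 - r12 * r2) / det        # standardized coefficients, b = Rxx^{-1} r
    b2 = (r2 - r12 * r1) / det
    r_squared = b1 * r1 + b2 * r2     # R^2 = b'r
    sum_sq_corr = r1 ** 2 + r2 ** 2   # r'r
    return (b1, b2), r_squared, sum_sq_corr


# Hypothetical correlations: x2 is unrelated to y but correlated with x1.
b, r_squared, sum_sq_corr = two_predictor_r2(0.5, 0.0, 0.5)
enhancement = r_squared > sum_sq_corr
print(b, r_squared, sum_sq_corr, enhancement)
```

    With these made-up values R^2 is about 0.333 while r'r is 0.25, so the condition holds even though the second predictor has zero correlation with the criterion.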

  17. Genealogy and gene trees.

    PubMed

    Rasmuson, Marianne

    2008-02-01

    Heredity can be followed in persons or in genes. Persons can be identified only a few generations back, but simplified models indicate that universal ancestors to all now living persons have occurred in the past. Genetic variability can be characterized as variants of DNA sequences. Data are available only from living persons, but from the pattern of variation gene trees can be inferred by means of coalescence models. The merging of lines backwards in time leads to a MRCA (most recent common ancestor). The time and place in which this inferred person lived can give insights into human evolutionary history. Demographic processes are incorporated in the model, but since culture and customs are known to influence demography the models used ought to be tested against available genealogy. The Icelandic database offers a possibility to do so and points to some discrepancies. Mitochondrial DNA and Y chromosome patterns give a rather consistent view of human evolutionary history during the last 100 000 years but the earlier epochs of human evolution demand gene trees with longer branches. The results of such studies reveal as yet unsolved problems about the sources of our genome.

  18. Distributed Merge Trees

    SciTech Connect

    Morozov, Dmitriy; Weber, Gunther

    2013-01-08

    Improved simulations and sensors are producing datasets whose increasing complexity exhausts our ability to visualize and comprehend them directly. To cope with this problem, we can detect and extract significant features in the data and use them as the basis for subsequent analysis. Topological methods are valuable in this context because they provide robust and general feature definitions. As the growth of serial computational power has stalled, data analysis is becoming increasingly dependent on massively parallel machines. To satisfy the computational demand created by complex datasets, algorithms need to effectively utilize these computer architectures. The main strength of topological methods, their emphasis on global information, turns into an obstacle during parallelization. We present two approaches to alleviate this problem. We develop a distributed representation of the merge tree that avoids computing the global tree on a single processor and lets us parallelize subsequent queries. To account for the increasing number of cores per processor, we develop a new data structure that lets us take advantage of multiple shared-memory cores to parallelize the work on a single node. Finally, we present experiments that illustrate the strengths of our approach as well as help identify future challenges.

  19. [Iris movement mediates pupillary membrane regression].

    PubMed

    Morizane, Yuki

    2007-11-01

    In the course of mammalian lens development, a transient capillary meshwork called the pupillary membrane (PM) forms. It is located in the pupil area to nourish the anterior surface of the lens, and then regresses to clear the optical path. Although the involvement of the apoptotic process has been reported in PM regression, the initiating factor remains unknown. We initially found that regression of the PM coincided with the development of iris motility, and that iris movement caused cessation and resumption of blood flow within the PM. Therefore, we investigated whether the development of the capacity of the iris to constrict and dilate can function as an essential signal that induces apoptosis in the PM. Continuous inhibition of iris movement with mydriatic agents suppressed apoptosis of the PM and resulted in the persistence of PM in rats. The distribution of apoptotic cells in the regressing PM was diffuse and showed no apparent localization. These results indicated that iris movement induced regression of the PM by changing the blood flow within it. This study suggests the importance of the physiological interactions between tissues (in this case, the iris and the PM) as a signal to advance vascular regression during organ development.

  20. Multiple-Instance Regression with Structured Data

    NASA Technical Reports Server (NTRS)

    Wagstaff, Kiri L.; Lane, Terran; Roper, Alex

    2008-01-01

    We present a multiple-instance regression algorithm that models internal bag structure to identify the items most relevant to the bag labels. Multiple-instance regression (MIR) operates on a set of bags with real-valued labels, each containing a set of unlabeled items, in which the relevance of each item to its bag label is unknown. The goal is to predict the labels of new bags from their contents. Unlike previous MIR methods, MI-ClusterRegress can operate on bags that are structured in that they contain items drawn from a number of distinct (but unknown) distributions. MI-ClusterRegress simultaneously learns a model of the bag's internal structure, the relevance of each item, and a regression model that accurately predicts labels for new bags. We evaluated this approach on the challenging MIR problem of crop yield prediction from remote sensing data. MI-ClusterRegress provided predictions that were more accurate than those obtained with non-multiple-instance approaches or MIR methods that do not model the bag structure.

  1. Improving sub-pixel imperviousness change prediction by ensembling heterogeneous non-linear regression models

    NASA Astrophysics Data System (ADS)

    Drzewiecki, Wojciech

    2016-12-01

    In this work nine non-linear regression models were compared for sub-pixel impervious surface area mapping from Landsat images. The comparison was done in three study areas both for accuracy of imperviousness coverage evaluation in individual points in time and accuracy of imperviousness change assessment. The performance of individual machine learning algorithms (Cubist, Random Forest, stochastic gradient boosting of regression trees, k-nearest neighbors regression, random k-nearest neighbors regression, Multivariate Adaptive Regression Splines, averaged neural networks, and support vector machines with polynomial and radial kernels) was also compared with the performance of heterogeneous model ensembles constructed from the best models trained using particular techniques. The results proved that in the case of sub-pixel evaluation the most accurate prediction of change may not necessarily be based on the most accurate individual assessments. When single methods are considered, the Cubist algorithm may be advised, based on the obtained results, for Landsat-based mapping of imperviousness for single dates. However, Random Forest may be endorsed when the most reliable evaluation of imperviousness change is the primary goal. It gave lower accuracies for individual assessments, but better prediction of change due to more correlated errors of individual predictions. Heterogeneous model ensembles performed at least as well as the best individual models for assessments at individual time points. In the case of imperviousness change assessment the ensembles always outperformed single model approaches. This means that it is possible to improve the accuracy of sub-pixel imperviousness change assessment using ensembles of heterogeneous non-linear regression models.
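
    The remark that more correlated errors can give a better change estimate follows from the variance of a difference: if change is predicted as the second-date estimate minus the first-date estimate, its error variance is s1^2 + s2^2 - 2*rho*s1*s2. The sketch below illustrates that identity only; the function name and all error magnitudes and correlations are hypothetical, not results from the study.

```python
import math


def change_error_sd(sd_t1, sd_t2, rho):
    """Standard deviation of the error of (prediction at t2 - prediction at t1).

    sd_t1, sd_t2: error standard deviations at the two dates.
    rho:          correlation between the two dates' errors.
    """
    return math.sqrt(sd_t1 ** 2 + sd_t2 ** 2 - 2.0 * rho * sd_t1 * sd_t2)


# Hypothetical model A: more accurate per date, weakly correlated errors.
model_a = change_error_sd(0.10, 0.10, 0.2)
# Hypothetical model B: less accurate per date, strongly correlated errors.
model_b = change_error_sd(0.12, 0.12, 0.8)

print(round(model_a, 4), round(model_b, 4))
```

    In this made-up case model B has the larger single-date error yet the smaller change error, because most of its error cancels in the subtraction.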

  2. Relating phylogenetic trees to transmission trees of infectious disease outbreaks.

    PubMed

    Ypma, Rolf J F; van Ballegooijen, W Marijn; Wallinga, Jacco

    2013-11-01

    Transmission events are the fundamental building blocks of the dynamics of any infectious disease. Much about the epidemiology of a disease can be learned when these individual transmission events are known or can be estimated. Such estimations are difficult and generally feasible only when detailed epidemiological data are available. The genealogy estimated from genetic sequences of sampled pathogens is another rich source of information on transmission history. Optimal inference of transmission events calls for the combination of genetic data and epidemiological data into one joint analysis. A key difficulty is that the transmission tree, which describes the transmission events between infected hosts, differs from the phylogenetic tree, which describes the ancestral relationships between pathogens sampled from these hosts. The trees differ both in timing of the internal nodes and in topology. These differences become more pronounced when a higher fraction of infected hosts is sampled. We show how the phylogenetic tree of sampled pathogens is related to the transmission tree of an outbreak of an infectious disease, by the within-host dynamics of pathogens. We provide a statistical framework to infer key epidemiological and mutational parameters by simultaneously estimating the phylogenetic tree and the transmission tree. We test the approach using simulations and illustrate its use on an outbreak of foot-and-mouth disease. The approach unifies existing methods in the emerging field of phylodynamics with transmission tree reconstruction methods that are used in infectious disease epidemiology.

  3. Analysis of Sting Balance Calibration Data Using Optimized Regression Models

    NASA Technical Reports Server (NTRS)

    Ulbrich, Norbert; Bader, Jon B.

    2009-01-01

    Calibration data of a wind tunnel sting balance was processed using a search algorithm that identifies an optimized regression model for the data analysis. The selected sting balance had two moment gages that were mounted forward and aft of the balance moment center. The difference and the sum of the two gage outputs were fitted in the least squares sense using the normal force and the pitching moment at the balance moment center as independent variables. The regression model search algorithm predicted that the difference of the gage outputs should be modeled using the intercept and the normal force. The sum of the two gage outputs, on the other hand, should be modeled using the intercept, the pitching moment, and the square of the pitching moment. Equations of the deflection of a cantilever beam are used to show that the search algorithm's two recommended math models can also be obtained after performing a rigorous theoretical analysis of the deflection of the sting balance under load. The analysis of the sting balance calibration data set is a rare example of a situation when regression models of balance calibration data can directly be derived from first principles of physics and engineering. In addition, it is interesting to see that the search algorithm recommended the same regression models for the data analysis using only a set of statistical quality metrics.
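
    The first recommended model, gage-output difference fitted with an intercept and the normal force, is an ordinary least-squares line. The sketch below shows such a fit in plain Python; it is not the authors' search algorithm, and the function name, units, and calibration points are invented for illustration.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical calibration points (arbitrary units), generated from an
# exactly linear relation so the fit should recover 0.5 and 0.02.
normal_force = [0.0, 10.0, 20.0, 30.0, 40.0]
gage_difference = [0.5 + 0.02 * f for f in normal_force]

intercept, slope = fit_line(normal_force, gage_difference)
print(intercept, slope)
```

    The second model in the abstract adds a squared pitching-moment term, which would need a three-column design matrix rather than this closed-form line fit.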

  4. Supercooling Capacity Increases from Sea Level to Tree Line in the Hawaiian Tree Species Metrosideros polymorpha.

    PubMed

    Melcher; Cordell; Jones; Scowcroft; Niemczura; Giambelluca; Goldstein

    2000-05-01

    Population-specific differences in the freezing resistance of Metrosideros polymorpha leaves were studied along an elevational gradient from sea level to tree line (located at ca. 2500 m above sea level) on the east flank of the Mauna Loa volcano in Hawaii. In addition, we also studied 8-yr-old saplings grown in a common garden from seeds collected from the same field populations. Leaves of low-elevation field plants exhibited damage at -2 degrees C, before the onset of ice formation, which occurred at -5.7 degrees C. Leaves of high-elevation plants exhibited damage at ca. -8.5 degrees C, concurrent with ice formation in the leaf tissue, which is typical of plants that avoid freezing in their natural environment by supercooling. Nuclear magnetic resonance studies revealed that water molecules of both extra- and intracellular leaf water fractions from high-elevation plants had restricted mobility, which is consistent with their low water content and their high levels of osmotically active solutes. Decreased mobility of water molecules may delay ice nucleation and/or ice growth and may therefore enhance the ability of plant tissues to supercool. Leaf traits that correlated with specific differences in supercooling capacity were in part genetically determined and in part environmentally induced. Evidence indicated that lower apoplastic water content and smaller intercellular spaces were associated with the larger supercooling capacity of the plant's foliage at tree line. The irreversible tissue-damage temperature decreased by ca. 7 degrees C from sea level to tree line in leaves of field populations. However, this decrease appears to be only large enough to allow M. polymorpha trees to avoid leaf tissue damage from freezing up to a level of ca. 2500 m elevation, which is also the current tree line location on the east flank of Mauna Loa. The limited freezing resistance of M. polymorpha leaves may be partially responsible for the occurrence of tree line at a relatively

  5. Foliar ozone injury on different-sized Prunus serotina Ehrh. trees

    SciTech Connect

    Fredericksen, T.S.; Skelly, J.M.; Steiner, K.C.

    1995-06-01

    Black cherry (Prunus serotina Ehrh.) is a common tree species in the eastern U.S. that is highly sensitive to ozone relative to other associated deciduous tree species. Because of difficulties in conducting exposure-response experiments on large trees, air pollution studies have often utilized seedlings and extrapolated the results to predict the potential response of larger forest trees. However, physiological differences between seedlings and mature forest trees may alter responses to air pollutants. A comparative study of seedling, sapling, and canopy black cherry trees was conducted to determine the response of different-sized trees to known ozone exposures and amounts of ozone uptake. Apparent foliar sensitivity to ozone, observed as a dark adaxial leaf stipple, decreased with increasing tree size. An average of 46% of seedling leaf area was symptomatic by early September, compared to 15% - 20% for saplings and canopy trees. In addition to visible symptoms, seedlings also appeared to have greater rates of early leaf abscission than larger trees. Greater sensitivity (i.e., foliar symptoms) per unit exposure with decreasing tree size was closely correlated with rates of stomatal conductance. However, after accounting for differences in stomatal conductance, sensitivity appeared to increase with tree size.

  6. Intermediate tree cover can maximize groundwater recharge in the seasonally dry tropics

    NASA Astrophysics Data System (ADS)

    Ilstedt, U.; Bargués Tobella, A.; Bazié, H. R.; Bayala, J.; Verbeeten, E.; Nyberg, G.; Sanou, J.; Benegas, L.; Murdiyarso, D.; Laudon, H.; Sheil, D.; Malmer, A.

    2016-02-01

    Water scarcity contributes to the poverty of around one-third of the world’s people. Despite many benefits, tree planting in dry regions is often discouraged by concerns that trees reduce water availability. Yet relevant studies from the tropics are scarce, and the impacts of intermediate tree cover remain unexplored. We developed and tested an optimum tree cover theory in which groundwater recharge is maximized at an intermediate tree density. Below this optimal tree density the benefits from any additional trees on water percolation exceed their extra water use, leading to increased groundwater recharge, while above the optimum the opposite occurs. Our results, based on groundwater budgets calibrated with measurements of drainage and transpiration in a cultivated woodland in West Africa, demonstrate that groundwater recharge was maximised at intermediate tree densities. In contrast to the prevailing view, we therefore find that moderate tree cover can increase groundwater recharge, and that tree planting and various tree management options can improve groundwater resources. We evaluate the necessary conditions for these results to hold and suggest that they are likely to be common in the seasonally dry tropics, offering potential for widespread tree establishment and increased benefits for hundreds of millions of people.

  7. Intermediate tree cover can maximize groundwater recharge in the seasonally dry tropics.

    PubMed

    Ilstedt, U; Bargués Tobella, A; Bazié, H R; Bayala, J; Verbeeten, E; Nyberg, G; Sanou, J; Benegas, L; Murdiyarso, D; Laudon, H; Sheil, D; Malmer, A

    2016-02-24

    Water scarcity contributes to the poverty of around one-third of the world's people. Despite many benefits, tree planting in dry regions is often discouraged by concerns that trees reduce water availability. Yet relevant studies from the tropics are scarce, and the impacts of intermediate tree cover remain unexplored. We developed and tested an optimum tree cover theory in which groundwater recharge is maximized at an intermediate tree density. Below this optimal tree density the benefits from any additional trees on water percolation exceed their extra water use, leading to increased groundwater recharge, while above the optimum the opposite occurs. Our results, based on groundwater budgets calibrated with measurements of drainage and transpiration in a cultivated woodland in West Africa, demonstrate that groundwater recharge was maximised at intermediate tree densities. In contrast to the prevailing view, we therefore find that moderate tree cover can increase groundwater recharge, and that tree planting and various tree management options can improve groundwater resources. We evaluate the necessary conditions for these results to hold and suggest that they are likely to be common in the seasonally dry tropics, offering potential for widespread tree establishment and increased benefits for hundreds of millions of people.

  8. Intermediate tree cover can maximize groundwater recharge in the seasonally dry tropics

    PubMed Central

    Ilstedt, U.; Bargués Tobella, A.; Bazié, H. R.; Bayala, J.; Verbeeten, E.; Nyberg, G.; Sanou, J.; Benegas, L.; Murdiyarso, D.; Laudon, H.; Sheil, D.; Malmer, A.

    2016-01-01

    Water scarcity contributes to the poverty of around one-third of the world’s people. Despite many benefits, tree planting in dry regions is often discouraged by concerns that trees reduce water availability. Yet relevant studies from the tropics are scarce, and the impacts of intermediate tree cover remain unexplored. We developed and tested an optimum tree cover theory in which groundwater recharge is maximized at an intermediate tree density. Below this optimal tree density the benefits from any additional trees on water percolation exceed their extra water use, leading to increased groundwater recharge, while above the optimum the opposite occurs. Our results, based on groundwater budgets calibrated with measurements of drainage and transpiration in a cultivated woodland in West Africa, demonstrate that groundwater recharge was maximised at intermediate tree densities. In contrast to the prevailing view, we therefore find that moderate tree cover can increase groundwater recharge, and that tree planting and various tree management options can improve groundwater resources. We evaluate the necessary conditions for these results to hold and suggest that they are likely to be common in the seasonally dry tropics, offering potential for widespread tree establishment and increased benefits for hundreds of millions of people. PMID:26908158

  9. Study of traffic-related pollutant removal from street canyon with trees: dispersion and deposition perspective.

    PubMed

    Morakinyo, Tobi Eniolu; Lam, Yun Fat

    2016-11-01

    Numerical experiments involving street canyons of varying aspect ratio with traffic-induced pollutants (PM2.5) and implanted trees of varying aspect ratio, leaf area index, leaf area density distribution, trunk height, tree-covered area, and tree planting pattern under different wind conditions were conducted using a computational fluid dynamics (CFD) model, ENVI-met. Various aspects of dispersion and deposition were investigated, which include the influence of various tree configurations and wind condition on dispersion within the street canyon, pollutant mass at the free stream layer and street canyon, and comparison between mass removal by surface (leaf) deposition and mass enhancement due to the presence of trees. Results revealed that concentration level was enhanced especially within pedestrian level in street canyons with trees relative to their tree-free counterparts. Additionally, we found that the magnitude of concentration increase (within pedestrian level) and decrease (above pedestrian level) depended on tree configuration and wind condition. Furthermore, we found that only ∼0.1-3 % of PM2.5 was dispersed to the free stream layer while a larger percentage (∼97 %) remained in the canyon, regardless of its aspect ratio, the prevailing wind condition, and whether the canyon was tree-free or planted with trees (of any configuration). Lastly, results indicate that pollutant removal due to deposition on leaf surfaces is potentially sufficient to counterbalance the enhancement of PM2.5 by such trees under some tree planting scenarios and wind conditions.

  10. Displayed Trees Do Not Determine Distinguishability Under the Network Multispecies Coalescent.

    PubMed

    Zhu, Sha; Degnan, James H

    2017-03-01

    Recent work in estimating species relationships from gene trees has included inferring networks assuming that past hybridization has occurred between species. Probabilistic models using the multispecies coalescent can be used in this framework for likelihood-based inference of both network topologies and parameters, including branch lengths and hybridization parameters. A difficulty for such methods is that it is not always clear whether, or to what extent, networks are identifiable, that is, whether there could be two distinct networks that lead to the same distribution of gene trees. For cases in which incomplete lineage sorting occurs in addition to hybridization, we demonstrate a new representation of the species network likelihood that expresses the probability distribution of the gene tree topologies as a linear combination of gene tree distributions given a set of species trees. This representation makes it clear that in some cases in which two distinct networks give the same distribution of gene trees when sampling one allele per species, the two networks can be distinguished theoretically when multiple individuals are sampled per species. This result means that network identifiability is not only a function of the trees displayed by the networks but also depends on allele sampling within species. We additionally give an example in which two networks that display exactly the same trees can be distinguished from their gene trees even when there is only one lineage sampled per species. [gene tree, hybridization, identifiability, maximum likelihood, species tree, phylogeny.].

  11. Cork Oak Vulnerability to Fire: The Role of Bark Harvesting, Tree Characteristics and Abiotic Factors

    PubMed Central

    Catry, Filipe X.; Moreira, Francisco; Pausas, Juli G.; Fernandes, Paulo M.; Rego, Francisco; Cardillo, Enrique; Curt, Thomas

    2012-01-01

    Forest ecosystems where periodical tree bark harvesting is a major economic activity may be particularly vulnerable to disturbances such as fire, since debarking usually reduces tree vigour and protection against external agents. In this paper we asked how cork oak Quercus suber trees respond after wildfires and, in particular, how bark harvesting affects post-fire tree survival and resprouting. We gathered data from 22 wildfires (4585 trees) that occurred in three southern European countries (Portugal, Spain and France), covering a wide range of conditions characteristic of Q. suber ecosystems. Post-fire tree responses (tree mortality, stem mortality and crown resprouting) were examined in relation to management and ecological factors using generalized linear mixed-effects models. Results showed that bark thickness and bark harvesting are major factors affecting resistance of Q. suber to fire. Fire vulnerability was higher for trees with thin bark (young or recently debarked individuals) and decreased with increasing bark thickness until cork was 3–4 cm thick. This bark thickness corresponds to the moment when exploited trees are debarked again, meaning that exploited trees are vulnerable to fire during a longer period. Exploited trees were also more likely to be top-killed than unexploited trees, even for the same bark thickness. Additionally, vulnerability to fire increased with burn severity and with tree diameter, and was higher in trees burned in early summer or located in drier south-facing aspects. We provided tree response models useful to help estimating the impact of fire and to support management decisions. The results suggested that an appropriate management of surface fuels and changes in the bark harvesting regime (e.g. debarking coexisting trees in different years or increasing the harvesting cycle) would decrease vulnerability to fire and contribute to the conservation of cork oak ecosystems. PMID:22787521

  12. Cork oak vulnerability to fire: the role of bark harvesting, tree characteristics and abiotic factors.

    PubMed

    Catry, Filipe X; Moreira, Francisco; Pausas, Juli G; Fernandes, Paulo M; Rego, Francisco; Cardillo, Enrique; Curt, Thomas

    2012-01-01

    Forest ecosystems where periodical tree bark harvesting is a major economic activity may be particularly vulnerable to disturbances such as fire, since debarking usually reduces tree vigour and protection against external agents. In this paper we asked how cork oak Quercus suber trees respond after wildfires and, in particular, how bark harvesting affects post-fire tree survival and resprouting. We gathered data from 22 wildfires (4585 trees) that occurred in three southern European countries (Portugal, Spain and France), covering a wide range of conditions characteristic of Q. suber ecosystems. Post-fire tree responses (tree mortality, stem mortality and crown resprouting) were examined in relation to management and ecological factors using generalized linear mixed-effects models. Results showed that bark thickness and bark harvesting are major factors affecting resistance of Q. suber to fire. Fire vulnerability was higher for trees with thin bark (young or recently debarked individuals) and decreased with increasing bark thickness until cork was 3-4 cm thick. This bark thickness corresponds to the moment when exploited trees are debarked again, meaning that exploited trees are vulnerable to fire during a longer period. Exploited trees were also more likely to be top-killed than unexploited trees, even for the same bark thickness. Additionally, vulnerability to fire increased with burn severity and with tree diameter, and was higher in trees burned in early summer or located in drier south-facing aspects. We provided tree response models useful to help estimating the impact of fire and to support management decisions. The results suggested that an appropriate management of surface fuels and changes in the bark harvesting regime (e.g. debarking coexisting trees in different years or increasing the harvesting cycle) would decrease vulnerability to fire and contribute to the conservation of cork oak ecosystems.

  13. Modelling the ecological consequences of whole tree harvest for bioenergy production

    NASA Astrophysics Data System (ADS)

    Skår, Silje; Lange, Holger; Sogn, Trine

    2013-04-01

    There is an increasing demand for energy from biomass as a substitute to fossil fuels worldwide, and the Norwegian government plans to double the production of bioenergy to 9% of the national energy production or to 28 TWh per year by 2020. A large part of this increase may come from forests, which have a great potential with respect to biomass supply as forest growth increasingly has exceeded harvest in the last decades. One feasible option is the utilization of forest residues (needles, twigs and branches) in addition to stems, known as Whole Tree Harvest (WTH). As opposed to WTH, the residues are traditionally left in the forest with Conventional Timber Harvesting (CH). However, the residues contain a large share of the tree's nutrients, indicating that WTH may possibly alter the supply of nutrients and organic matter to the soil and the forest ecosystem. This may potentially lead to reduced tree growth. Other implications can be nutrient imbalance, loss of carbon from the soil and changes in species composition and diversity. This study aims to identify key factors and appropriate strategies for ecologically sustainable WTH in Norway spruce (Picea abies) and Scots pine (Pinus sylvestris) forest stands in Norway. We focus on identifying key factors driving soil organic matter, nutrients, biomass, biodiversity etc. Simulations of the effect on the carbon and nitrogen budget with the two harvesting methods will also be conducted. Data from field trials and long-term manipulation experiments are used to obtain a first overview of key variables. The relationships between the variables are hitherto unknown, but it is by no means obvious that they could be assumed as linear; thus, an ordinary multiple linear regression approach is expected to be insufficient. Here we apply two advanced and highly flexible modelling frameworks which hardly have been used in the context of tree growth, nutrient balances and biomass removal so far: Generalized Additive Models (GAMs) and

  14. Predicting species’ range limits from functional traits for the tree flora of North America

    PubMed Central

    Stahl, Ulrike; Reu, Björn; Wirth, Christian

    2014-01-01

    Using functional traits to explain species’ range limits is a promising approach in functional biogeography. It replaces the idiosyncrasy of species-specific climate ranges with a generic trait-based predictive framework. In addition, it has the potential to shed light on specific filter mechanisms creating large-scale vegetation patterns. However, its application to a continental flora, spanning large climate gradients, has been hampered by a lack of trait data. Here, we explore whether five key plant functional traits (seed mass, wood density, specific leaf area (SLA), maximum height, and longevity of a tree)—indicative of life history, mechanical, and physiological adaptations—explain the climate ranges of 250 North American tree species distributed from the boreal to the subtropics. Although the relationship between traits and the median climate across a species range is weak, quantile regressions revealed strong effects on range limits. Wood density and seed mass were strongly related to the lower but not upper temperature range limits of species. Maximum height affects the species range limits in both dry and humid climates, whereas SLA and longevity do not show clear relationships. These results allow the definition and delineation of climatic “no-go areas” for North American tree species based on key traits. As some of these key traits serve as important parameters in recent vegetation models, the implementation of trait-based climatic constraints has the potential to predict both range shifts and ecosystem consequences on a more functional basis. Moreover, for future trait-based vegetation models our results provide a benchmark for model evaluation. PMID:25225398
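
    Quantile regression targets a chosen quantile rather than the mean by minimizing the asymmetric "pinball" loss, which is why it can pick up effects at range limits that a fit to the median misses. The sketch below shows the loss and its simplest case, a constant fit, in plain Python. The function names and the sample values are hypothetical and unrelated to the study's trait or climate data.

```python
def pinball_loss(residual, tau):
    """Quantile (pinball) loss for one residual = observed - predicted."""
    return tau * residual if residual >= 0 else (tau - 1.0) * residual


def constant_quantile_fit(values, tau):
    """Return the data value that minimizes total pinball loss.

    For a constant-only model a minimizer always lies at a data point,
    so searching the observed values is sufficient.
    """
    return min(values, key=lambda c: sum(pinball_loss(v - c, tau) for v in values))


# Made-up sample: nine evenly spaced observations.
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]

low = constant_quantile_fit(values, 0.1)    # lower limit
mid = constant_quantile_fit(values, 0.5)    # median
high = constant_quantile_fit(values, 0.9)   # upper limit
print(low, mid, high)
```

    Changing tau moves the fitted value from the lower edge of the sample to the upper edge; a full quantile regression applies the same loss to a model with covariates.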

  15. Predicting species' range limits from functional traits for the tree flora of North America.

    PubMed

    Stahl, Ulrike; Reu, Björn; Wirth, Christian

    2014-09-23

    Using functional traits to explain species' range limits is a promising approach in functional biogeography. It replaces the idiosyncrasy of species-specific climate ranges with a generic trait-based predictive framework. In addition, it has the potential to shed light on specific filter mechanisms creating large-scale vegetation patterns. However, its application to a continental flora, spanning large climate gradients, has been hampered by a lack of trait data. Here, we explore whether five key plant functional traits (seed mass, wood density, specific leaf area (SLA), maximum height, and longevity of a tree)--indicative of life history, mechanical, and physiological adaptations--explain the climate ranges of 250 North American tree species distributed from the boreal to the subtropics. Although the relationship between traits and the median climate across a species range is weak, quantile regressions revealed strong effects on range limits. Wood density and seed mass were strongly related to the lower but not upper temperature range limits of species. Maximum height affects the species range limits in both dry and humid climates, whereas SLA and longevity do not show clear relationships. These results allow the definition and delineation of climatic "no-go areas" for North American tree species based on key traits. As some of these key traits serve as important parameters in recent vegetation models, the implementation of trait-based climatic constraints has the potential to predict both range shifts and ecosystem consequences on a more functional basis. Moreover, for future trait-based vegetation models our results provide a benchmark for model evaluation.

  16. Two Trees: Migrating Fault Trees to Decision Trees for Real Time Fault Detection on International Space Station

    NASA Technical Reports Server (NTRS)

    Lee, Charles; Alena, Richard L.; Robinson, Peter

    2004-01-01

    Starting from an example set of ISS fault trees, we present a method for converting fault trees to decision trees. The method makes the root cause of a fault easier to visualize and makes tree manipulation more programmatic, since available decision tree programs can be used. The resulting decision trees present diagnostics in a straightforward, easily understood format. For ISS real-time fault diagnosis, the status of the systems can be shown by passing the signals through the trees and seeing where they stop. Another advantage of decision trees is that they can learn fault patterns and predict future faults from historical data. The learning is not limited to static data sets but can also be online: by accumulating real-time data sets, the decision trees can acquire and store fault patterns and recognize them when they recur.

  17. Optimal estimation for regression models on τ-year survival probability.

    PubMed

    Kwak, Minjung; Kim, Jinseog; Jung, Sin-Ho

    2015-01-01

    A logistic regression method can be applied to regressing the τ-year survival probability to covariates, if there are no censored observations before time τ. But if some observations are incomplete due to censoring before time τ, then the logistic regression cannot be applied. Jung (1996) proposed to modify the score function for logistic regression to accommodate the right-censored observations. His modified score function, motivated for a consistent estimation of regression parameters, becomes a regular logistic score function if no observations are censored before time τ. In this article, we propose a modification of Jung's estimating function for an optimal estimation for the regression parameters in addition to consistency. We prove that the optimal estimator is more efficient than Jung's estimator. This theoretical comparison is illustrated with a real example data analysis and simulations.

  18. Tree Diversity Limits the Impact of an Invasive Forest Pest

    PubMed Central

    Guyot, Virginie; Castagneyrol, Bastien; Vialatte, Aude; Deconchat, Marc; Selvi, Federico; Bussotti, Filippo; Jactel, Hervé

    2015-01-01

    The impact of invasive herbivore species may be lower in more diverse plant communities due to mechanisms of associational resistance. According to the “resource concentration hypothesis” the amount and accessibility of host plants is reduced in diverse plant communities, thus limiting the exploitation of resources by consumers. In addition, the “natural enemy hypothesis” suggests that richer plant assemblages provide natural enemies with more complementary resources and habitats, thus promoting top down regulation of herbivores. We tested these two hypotheses by comparing crown damage by the invasive Asian chestnut gall wasp (Dryocosmus kuriphilus) on chestnut trees (Castanea sativa) in pure and mixed stands in Italy. We estimated the defoliation on 70 chestnut trees in 15 mature stands sampled in the same region along a gradient of tree species richness ranging from one species (chestnut monocultures) to four species (mixtures of chestnut and three broadleaved species). Chestnut defoliation was significantly lower in stands with higher tree diversity. Damage on individual chestnut trees decreased with increasing height of neighboring, heterospecific trees. These results suggest that conservation biological control methods based on tree species mixtures might help to reduce the impact of the Asian chestnut gall wasp. PMID:26360881

  19. Pattern Matcher for Trees Constructed from Lists

    NASA Technical Reports Server (NTRS)

    James, Mark

    2007-01-01

    A software library has been developed that takes a high-level description of a pattern to be satisfied and applies it to a target. If the two match, it returns success; otherwise, it indicates a failure. The target is semantically a tree that is constructed from elements of terminal and non-terminal nodes represented through lists and symbols. Additionally, functionality is provided for finding the element in a set that satisfies a given pattern and doing a tree search, finding all occurrences of leaf nodes that match a given pattern. This process is valuable because it is a new algorithmic approach that significantly improves the productivity of the programmers and has the potential of making their resulting code more efficient by the introduction of a novel semantic representation language. This software has been used in many applications delivered to NASA and private industry, and the cost savings that have resulted from it are significant.
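
As a rough illustration of the idea described above, the following minimal sketch matches a pattern against a tree built from nested lists and collects matching leaf nodes. It is not the NASA library's interface, which the abstract does not describe: the function names `match` and `find_leaves`, the wildcard symbol `"?"`, and the example tree are all assumptions made for illustration.

```python
# Hypothetical sketch: trees are nested Python lists, leaves are strings.
# The wildcard "?" is an assumption (not from the source); it matches any subtree.
WILDCARD = "?"

def match(pattern, target):
    """Return True if the pattern matches the target tree, False otherwise."""
    if pattern == WILDCARD:
        return True
    if isinstance(pattern, list) and isinstance(target, list):
        # Non-terminal nodes match when they have the same shape
        # and every child matches pairwise.
        return len(pattern) == len(target) and all(
            match(p, t) for p, t in zip(pattern, target)
        )
    # Terminal nodes (symbols) must be equal.
    return pattern == target

def find_leaves(predicate, tree):
    """Depth-first search returning every leaf for which predicate(leaf) is true."""
    if isinstance(tree, list):
        found = []
        for child in tree:
            found.extend(find_leaves(predicate, child))
        return found
    return [tree] if predicate(tree) else []

tree = ["add", ["mul", "x", "2"], "x"]
print(match(["add", "?", "x"], tree))
print(find_leaves(lambda leaf: leaf == "x", tree))
```

A real pattern language would likely add named variables and constraints; the sketch only shows the success/failure contract and the leaf search mentioned in the abstract.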

  20. Rubbery Polya Tree

    PubMed Central

    NIETO-BARAJAS, LUIS E.; MÜLLER, PETER

    2013-01-01

    Polya trees (PT) are random probability measures which can assign probability 1 to the set of continuous distributions for certain specifications of the hyperparameters. This feature distinguishes the PT from the popular Dirichlet process (DP) model which assigns probability 1 to the set of discrete distributions. However, the PT is not nearly as widely used as the DP prior. Probably the main reason is an awkward dependence of posterior inference on the choice of the partitioning subsets in the definition of the PT. We propose a generalization of the PT prior that mitigates this undesirable dependence on the partition structure, by allowing the branching probabilities to be dependent within the same level. The proposed new process is not a PT anymore. However, it is still a tail-free process and many of the prior properties remain the same as those for the PT. PMID:24368872
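
For orientation, a minimal sketch of drawing one random probability measure from a standard (not rubbery) Polya tree on dyadic subintervals of [0, 1) might look as follows. The abstract does not give a hyperparameter specification; the choice Beta(c·m², c·m²) at level m is a commonly used one that yields continuous distributions and is assumed here, as are the function name and the default value of `c`. The branching probabilities are drawn independently, which is exactly the feature the proposed generalization relaxes within each level.

```python
import random

def sample_polya_tree(depth, c=1.0, rng=None):
    """Draw one random probability vector over the 2**depth dyadic
    subintervals of [0, 1) from a standard Polya tree prior.

    At level m every set splits in two; the left branch receives a share
    Y ~ Beta(c*m**2, c*m**2), drawn independently for each split
    (assumed hyperparameter choice, not taken from the abstract).
    """
    rng = rng or random.Random()
    probs = [1.0]
    for m in range(1, depth + 1):
        alpha = c * m ** 2
        next_probs = []
        for p in probs:
            y = rng.betavariate(alpha, alpha)
            next_probs.extend([p * y, p * (1.0 - y)])
        probs = next_probs
    return probs

probs = sample_polya_tree(depth=4, rng=random.Random(0))
print(len(probs), sum(probs))
```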

  1. The Group Tree of Experience.

    ERIC Educational Resources Information Center

    Ping, Ki

    1994-01-01

    Describes a group activity that uses a tree as a metaphor to reflect both group and personal growth during adventure activities. The tree's roots represent the group's formation, the branches and leaves represent the group's diversity and capabilities, and the seeds represent the personal learning and growth that took place within the group.…

  2. Fractions, trees and unfinished business

    NASA Astrophysics Data System (ADS)

    Shraiman, Boris

    In this talk, mourning the loss of a teacher and a dear friend, I would like to share some unfinished thoughts loosely connecting - via Farey fraction trees - Kadanoff's study of universality of quasi-periodic route to chaos with the effort to understand universal features of genealogical trees.

  3. Studying Evergreen Trees in December.

    ERIC Educational Resources Information Center

    Platt, Dorothy K.

    1991-01-01

    This lesson plan uses evergreen trees on sale in cities and villages during the Christmas season to teach identification techniques. Background information, activities, and recommended references guides deal with historical, symbolic and current uses of evergreen trees, physical characteristics, selection, care, and suggestions for post-Christmas…

  4. Tree Hydraulics: How Sap Rises

    ERIC Educational Resources Information Center

    Denny, Mark

    2012-01-01

    Trees transport water from roots to crown--a height that can exceed 100 m. The physics of tree hydraulics can be conveyed with simple fluid dynamics based upon the Hagen-Poiseuille equation and Murray's law. Here the conduit structure is modelled as conical pipes and as branching pipes. The force required to lift sap is generated mostly by…
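
The two relations named above can be written down in a few lines. The sketch below assumes the textbook forms of the Hagen-Poiseuille equation (laminar flow in a cylindrical pipe) and Murray's law (parent radius cubed equals the sum of the daughter radii cubed), plus the hydrostatic pressure of a water column; the function names and the default density and gravity values are illustrative assumptions, not taken from the article.

```python
import math

def poiseuille_flow(radius, length, delta_p, viscosity):
    """Volumetric flow rate Q = pi * r**4 * dP / (8 * mu * L), SI units."""
    return math.pi * radius ** 4 * delta_p / (8.0 * viscosity * length)

def murray_parent_radius(child_radii):
    """Murray's law: the parent radius cubed equals the sum of child radii cubed."""
    return sum(r ** 3 for r in child_radii) ** (1.0 / 3.0)

def hydrostatic_pressure(height, density=1000.0, g=9.81):
    """Pressure (Pa) needed just to support a water column of the given height (m)."""
    return density * g * height

# Supporting a 100 m water column takes roughly 1 MPa before any viscous losses.
print(hydrostatic_pressure(100.0))
```

Because flow scales with the fourth power of the radius, doubling a conduit's radius raises the flow sixteenfold at the same pressure drop.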

  5. MULTILINEAR TENSOR REGRESSION FOR LONGITUDINAL RELATIONAL DATA.

    PubMed

    Hoff, Peter D

    2015-09-01

    A fundamental aspect of relational data, such as from a social network, is the possibility of dependence among the relations. In particular, the relations between members of one pair of nodes may have an effect on the relations between members of another pair. This article develops a type of regression model to estimate such effects in the context of longitudinal and multivariate relational data, or other data that can be represented in the form of a tensor. The model is based on a general multilinear tensor regression model, a special case of which is a tensor autoregression model in which the tensor of relations at one time point are parsimoniously regressed on relations from previous time points. This is done via a separable, or Kronecker-structured, regression parameter along with a separable covariance model. In the context of an analysis of longitudinal multivariate relational data, it is shown how the multilinear tensor regression model can represent patterns that often appear in relational and network data, such as reciprocity and transitivity.

  6. Hyperglycemia impairs atherosclerosis regression in mice.

    PubMed

    Gaudreault, Nathalie; Kumar, Nikit; Olivas, Victor R; Eberlé, Delphine; Stephens, Kyle; Raffai, Robert L

    2013-12-01

    Diabetic patients are known to be more susceptible to atherosclerosis and its associated cardiovascular complications. However, the effects of hyperglycemia on atherosclerosis regression remain unclear. We hypothesized that hyperglycemia impairs atherosclerosis regression by modulating the biological function of lesional macrophages. HypoE (Apoe(h/h)Mx1-Cre) mice express low levels of apolipoprotein E (apoE) and develop atherosclerosis when fed a high-fat diet. Atherosclerosis regression occurs in these mice upon plasma lipid lowering induced by a change in diet and the restoration of apoE expression. We examined the morphological characteristics of regressed lesions and assessed the biological function of lesional macrophages isolated with laser-capture microdissection in euglycemic and hyperglycemic HypoE mice. Hyperglycemia induced by streptozotocin treatment impaired lesion size reduction (36% versus 14%) and lipid loss (38% versus 26%) after the reversal of hyperlipidemia. However, decreases in lesional macrophage content and remodeling in both groups of mice were similar. Gene expression analysis revealed that hyperglycemia impaired cholesterol transport by modulating ATP-binding cassette A1, ATP-binding cassette G1, scavenger receptor class B family member (CD36), scavenger receptor class B1, and wound healing pathways in lesional macrophages during atherosclerosis regression. Hyperglycemia impairs both reduction in size and loss of lipids from atherosclerotic lesions upon plasma lipid lowering without significantly affecting the remodeling of the vascular wall.

  7. Regression models for estimating coseismic landslide displacement

    USGS Publications Warehouse

    Jibson, R.W.

    2007-01-01

    Newmark's sliding-block model is widely used to estimate coseismic slope performance. Early efforts to develop simple regression models to estimate Newmark displacement were based on analysis of the small number of strong-motion records then available. The current availability of a much larger set of strong-motion records dictates that these regression equations be updated. Regression equations were generated using data derived from a collection of 2270 strong-motion records from 30 worldwide earthquakes. The regression equations predict Newmark displacement in terms of (1) critical acceleration ratio, (2) critical acceleration ratio and earthquake magnitude, (3) Arias intensity and critical acceleration, and (4) Arias intensity and critical acceleration ratio. These equations are well constrained and fit the data well (71% < R² < 88%), but they have standard deviations of about 0.5 log units, such that the range defined by the mean ± one standard deviation spans about an order of magnitude. These regression models, therefore, are not recommended for use in site-specific design, but rather for regional-scale seismic landslide hazard mapping or for rapid preliminary screening of sites. © 2007 Elsevier B.V. All rights reserved.
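
The order-of-magnitude statement can be checked with a line of arithmetic. The sketch below assumes that "log units" means base-10 logarithms; the function name and the 10 cm median displacement are hypothetical values chosen only to show the calculation.

```python
def one_sigma_band(median, sigma_log10=0.5):
    """Return the (low, high) range one standard deviation either side of a
    median prediction when the scatter is expressed in log10 units."""
    return median * 10 ** (-sigma_log10), median * 10 ** sigma_log10

low, high = one_sigma_band(10.0)   # hypothetical 10 cm median displacement
print(low, high, high / low)
```

The ratio of the upper to the lower bound is 10^(0.5) / 10^(-0.5) = 10, whatever the median, which is why the abstract advises against site-specific use.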

  8. Tree Morphologic Plasticity Explains Deviation from Metabolic Scaling Theory in Semi-Arid Conifer Forests, Southwestern USA

    PubMed Central

    O’Connor, Christopher D.; Lynch, Ann M.

    2016-01-01

    A significant concern about Metabolic Scaling Theory (MST) in real forests relates to consistent differences between the values of power law scaling exponents of tree primary size measures used to estimate mass and those predicted by MST. Here we consider why observed scaling exponents for diameter and height relationships deviate from MST predictions across three semi-arid conifer forests in relation to: (1) tree condition and physical form, (2) the level of inter-tree competition (e.g. open vs closed stand structure), (3) increasing tree age, and (4) differences in site productivity. Scaling exponent values derived from non-linear least-squares regression for trees in excellent condition (n = 381) were above the MST prediction at the 95% confidence level, while the exponent for trees in good condition was no different from the MST prediction (n = 926). Trees that were in fair or poor condition, characterized as diseased, leaning, or sparsely crowned, had exponent values below MST predictions (n = 2,058), as did recently dead standing trees (n = 375). The exponent value of the mean-tree model that disregarded tree condition (n = 3,740) was consistent with other studies that reject MST scaling. Ostensibly, as stand density and competition increased, trees exhibited greater morphological plasticity whereby the majority had characteristically fair or poor growth forms. Fitting by least-squares regression biases the mean-tree model scaling exponent toward values that are below MST idealized predictions. For 368 trees from Arizona with known establishment dates, increasing age had no significant impact on expected scaling. We further suggest height to diameter ratios below MST relate to vertical truncation caused by limitation in plant water availability. Even with environmentally imposed height limitation, proportionality between height and diameter scaling exponents was consistent with the predictions of MST. PMID:27391084

  9. Tree Morphologic Plasticity Explains Deviation from Metabolic Scaling Theory in Semi-Arid Conifer Forests, Southwestern USA.

    PubMed

    Swetnam, Tyson L; O'Connor, Christopher D; Lynch, Ann M

    2016-01-01

    A significant concern about Metabolic Scaling Theory (MST) in real forests relates to consistent differences between the values of power law scaling exponents of tree primary size measures used to estimate mass and those predicted by MST. Here we consider why observed scaling exponents for diameter and height relationships deviate from MST predictions across three semi-arid conifer forests in relation to: (1) tree condition and physical form, (2) the level of inter-tree competition (e.g. open vs closed stand structure), (3) increasing tree age, and (4) differences in site productivity. Scaling exponent values derived from non-linear least-squares regression for trees in excellent condition (n = 381) were above the MST prediction at the 95% confidence level, while the exponent for trees in good condition was no different from the MST prediction (n = 926). Trees that were in fair or poor condition, characterized as diseased, leaning, or sparsely crowned, had exponent values below MST predictions (n = 2,058), as did recently dead standing trees (n = 375). The exponent value of the mean-tree model that disregarded tree condition (n = 3,740) was consistent with other studies that reject MST scaling. Ostensibly, as stand density and competition increased, trees exhibited greater morphological plasticity whereby the majority had characteristically fair or poor growth forms. Fitting by least-squares regression biases the mean-tree model scaling exponent toward values that are below MST idealized predictions. For 368 trees from Arizona with known establishment dates, increasing age had no significant impact on expected scaling. We further suggest height to diameter ratios below MST relate to vertical truncation caused by limitation in plant water availability. Even with environmentally imposed height limitation, proportionality between height and diameter scaling exponents was consistent with the predictions of MST.
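
To show what estimating a scaling exponent involves, here is a minimal sketch that fits height = a · diameter^b. It is not the authors' procedure: they report non-linear least-squares regression, whereas this sketch swaps in the simpler ordinary least-squares fit on log-log axes. The data are synthetic and noise-free, generated with an exponent of 2/3 (the height-diameter exponent usually attributed to MST, assumed here); the function name and the prefactor 3.0 are likewise illustrative.

```python
import math

def fit_power_law(diameters, heights):
    """Fit height = a * diameter**b by ordinary least squares on log-log axes.
    Returns the pair (a, b)."""
    xs = [math.log(d) for d in diameters]
    ys = [math.log(h) for h in heights]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                         # slope on log-log axes = scaling exponent
    a = math.exp(mean_y - b * mean_x)     # intercept back-transformed to the prefactor
    return a, b

# Synthetic, noise-free data generated with exponent 2/3 (illustrative only).
diameters = [5.0, 10.0, 20.0, 40.0, 80.0]
heights = [3.0 * d ** (2.0 / 3.0) for d in diameters]
a, b = fit_power_law(diameters, heights)
print(round(a, 3), round(b, 3))
```

With real, noisy measurements the log-log and non-linear fits generally give different exponents, which is one reason the fitting method matters when comparing against a theoretical value.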

  10. Nitrogen isotopes in Tree-Rings - An approach combining soil biogeochemistry and isotopic long series with statistical modeling

    NASA Astrophysics Data System (ADS)

    Savard, Martine M.; Bégin, Christian; Paré, David; Marion, Joëlle; Laganière, Jérôme; Séguin, Armand; Stefani, Franck; Smirnoff, Anna

    2016-04-01

    Monitoring atmospheric emissions from industrial centers in North America generally started less than 25 years ago. To compensate for the lack of monitoring, previous investigations have interpreted tree-ring N changes using the known chronology of human activities, without facing the challenge of separating climatic effects from potential anthropogenic impacts. Here we document such an attempt conducted in the oil sands (OS) mining region of Northeastern Alberta, Canada. The reactive nitrogen (Nr)-emitting oil extraction operations began in 1967, but air quality measurements were only initiated in 1997. To investigate if the beginning and intensification of OS operations induced changes in the forest N-cycle, we sampled white spruce (Picea glauca (Moench) Voss) stands located at various distances from the main mining area, and receiving low, but different N deposition. Our approach combines soil biogeochemical and metagenomic characterization with long, well dated, tree-ring isotopic series. To objectively delineate the natural N isotopic behaviour in trees, we have characterized tree-ring N isotope (15N/14N) ratios between 1880 and 2009, used statistical analyses of the isotopic values and local climatic parameters of the pre-mining period to calibrate response functions and project the isotopic responses to climate during the extraction period. During that period, the measured series depart negatively from the projected natural trends. In addition, these long-term negative isotopic trends are better reproduced by multiple-regression models combining climatic parameters with the proxy for regional mining Nr emissions. These negative isotopic trends point towards changes in the forest soil biogeochemical N cycle. The biogeochemical data and ultimate soil mechanisms responsible for such changes will be discussed during the presentation.

  11. 36 CFR 223.4 - Exchange of trees or portions of trees.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... 36 Parks, Forests, and Public Property 2 2012-07-01 2012-07-01 false Exchange of trees or portions of trees. 223.4 Section 223.4 Parks, Forests, and Public Property FOREST SERVICE, DEPARTMENT OF... PRODUCTS General Provisions § 223.4 Exchange of trees or portions of trees. Trees or portions of trees...

  12. 36 CFR 223.4 - Exchange of trees or portions of trees.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... 36 Parks, Forests, and Public Property 2 2013-07-01 2013-07-01 false Exchange of trees or portions of trees. 223.4 Section 223.4 Parks, Forests, and Public Property FOREST SERVICE, DEPARTMENT OF... PRODUCTS General Provisions § 223.4 Exchange of trees or portions of trees. Trees or portions of trees...

  13. 36 CFR 223.4 - Exchange of trees or portions of trees.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 36 Parks, Forests, and Public Property 2 2011-07-01 2011-07-01 false Exchange of trees or portions of trees. 223.4 Section 223.4 Parks, Forests, and Public Property FOREST SERVICE, DEPARTMENT OF... PRODUCTS General Provisions § 223.4 Exchange of trees or portions of trees. Trees or portions of trees...

  14. 36 CFR 223.4 - Exchange of trees or portions of trees.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... 36 Parks, Forests, and Public Property 2 2014-07-01 2014-07-01 false Exchange of trees or portions of trees. 223.4 Section 223.4 Parks, Forests, and Public Property FOREST SERVICE, DEPARTMENT OF... PRODUCTS General Provisions § 223.4 Exchange of trees or portions of trees. Trees or portions of trees...

  15. Radiofrequency radiation injures trees around mobile phone base stations.

    PubMed

    Waldmann-Selsam, Cornelia; Balmori-de la Puente, Alfonso; Breunig, Helmut; Balmori, Alfonso

    2016-12-01

    In the last two decades, the deployment of phone masts around the world has taken place and, for many years, there has been a discussion in the scientific community about the possible environmental impact from mobile phone base stations. Trees have several advantages over animals as experimental subjects and the aim of this study was to verify whether there is a connection between unusual (generally unilateral) tree damage and radiofrequency exposure. To achieve this, a detailed long-term (2006-2015) field monitoring study was performed in the cities of Bamberg and Hallstadt (Germany). During monitoring, observations and photographic recordings of unusual or unexplainable tree damage were taken, alongside the measurement of electromagnetic radiation. In 2015 measurements of RF-EMF (Radiofrequency Electromagnetic Fields) were carried out. A polygon spanning both cities was chosen as the study site, where 144 measurements of the radiofrequency of electromagnetic fields were taken at a height of 1.5 m in streets and parks at different locations. By interpolation of the 144 measurement points, we were able to compile an electromagnetic map of the power flux density in Bamberg and Hallstadt. We selected 60 damaged trees, in addition to 30 randomly selected trees and 30 trees in low radiation areas (n=120) in this polygon. The measurements of all trees revealed significant differences between the damaged side facing a phone mast and the opposite side, as well as differences between the exposed side of damaged trees and all other groups of trees in both sides. Thus, we found that side differences in measured values of power flux density corresponded to side differences in damage. The 30 selected trees in low radiation areas (no visual contact to any phone mast and power flux density under 50 μW/m²) showed no damage. Statistical analysis demonstrated that electromagnetic radiation from mobile phone masts is harmful for trees. These results are consistent with the fact

  16. Tree reconstruction from partial orders

    SciTech Connect

    Kannan, S.K.; Warnow, T.J.

    1993-01-01

    The problem of constructing trees given a matrix of interleaf distances is motivated by applications in computational evolutionary biology and linguistics. The general problem is to find an edge-weighted tree which most closely approximates the distance matrix. Although the construction problem is easy when the tree exactly fits the distance matrix, optimization problems under all popular criteria are either known or conjectured to be NP-complete. In this paper we consider the related problem where we are given a partial order on the pairwise distances, and wish to construct (if possible) an edge-weighted tree realizing the partial order. In particular we are interested in partial orders which arise from experiments on triples of species, which determine either a linear ordering of the three pairwise distances (called Total Order Model or TOM experiments) or only the pair(s) of minimum distance apart (called Partial Order Model or POM experiments). The POM and TOM experimental model is inspired by the model proposed by Kannan, Lawler, and Warnow for constructing trees from experiments which determine the rooted topology for any triple of species. We examine issues of construction of trees and consistency of TOM and POM experiments, where the trees may either be weighted or unweighted. Using these experiments to construct unweighted trees without nodes of degree two is motivated by a similar problem studied by Winkler, called the Discrete Metric Realization problem, which he showed to be strongly NP-hard. We have the following results: Determining consistency of a set of TOM or POM experiments is NP-Complete whether the tree is weighted or constrained to be unweighted and without degree two nodes. We can construct unweighted trees without degree two nodes from TOM experiments in optimal O(n³) time and from POM experiments in O(n⁴) time.
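
The two experiment types can be illustrated with a small sketch. It only shows what a TOM or POM experiment would report for one triple, given a known distance matrix; it does not implement the paper's construction algorithms. The function names and the four-leaf distance matrix are hypothetical.

```python
from itertools import combinations

def tom_experiment(dist, triple):
    """Total Order Model: the three pairs of the triple, ordered by distance."""
    pairs = list(combinations(sorted(triple), 2))
    return sorted(pairs, key=lambda pair: dist[pair[0]][pair[1]])

def pom_experiment(dist, triple):
    """Partial Order Model: only the pair(s) at minimum distance."""
    pairs = list(combinations(sorted(triple), 2))
    smallest = min(dist[a][b] for a, b in pairs)
    return [(a, b) for a, b in pairs if dist[a][b] == smallest]

# Leaf-to-leaf distances in a small hypothetical tree with leaves 0..3.
dist = [
    [0, 2, 5, 6],
    [2, 0, 5, 6],
    [5, 5, 0, 3],
    [6, 6, 3, 0],
]

print(tom_experiment(dist, (0, 2, 3)))
print(pom_experiment(dist, (0, 2, 3)))
```

A POM experiment carries strictly less information than a TOM experiment on the same triple.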

  17. Tree reconstruction from partial orders

    SciTech Connect

    Kannan, S.K.; Warnow, T.J.

    1993-03-01

    The problem of constructing trees given a matrix of interleaf distances is motivated by applications in computational evolutionary biology and linguistics. The general problem is to find an edge-weighted tree which most closely approximates the distance matrix. Although the construction problem is easy when the tree exactly fits the distance matrix, optimization problems under all popular criteria are either known or conjectured to be NP-complete. In this paper we consider the related problem where we are given a partial order on the pairwise distances, and wish to construct (if possible) an edge-weighted tree realizing the partial order. In particular we are interested in partial orders which arise from experiments on triples of species, which determine either a linear ordering of the three pairwise distances (called Total Order Model or TOM experiments) or only the pair(s) of minimum distance apart (called Partial Order Model or POM experiments). The POM and TOM experimental model is inspired by the model proposed by Kannan, Lawler, and Warnow for constructing trees from experiments which determine the rooted topology for any triple of species. We examine issues of construction of trees and consistency of TOM and POM experiments, where the trees may either be weighted or unweighted. Using these experiments to construct unweighted trees without nodes of degree two is motivated by a similar problem studied by Winkler, called the Discrete Metric Realization problem, which he showed to be strongly NP-hard. We have the following results: Determining consistency of a set of TOM or POM experiments is NP-Complete whether the tree is weighted or constrained to be unweighted and without degree two nodes. We can construct unweighted trees without degree two nodes from TOM experiments in optimal O(n³) time and from POM experiments in O(n⁴) time.

  18. Spontaneous skin regression and predictors of skin regression in Thai scleroderma patients.

    PubMed

    Foocharoen, Chingching; Mahakkanukrauh, Ajanee; Suwannaroj, Siraphop; Nanagara, Ratanavadee

    2011-09-01

    Skin tightness is a major clinical manifestation of systemic sclerosis (SSc). Importantly for both clinicians and patients, spontaneous regression of the fibrosis process has been documented. The purpose of this study is to identify the incidence and related clinical characteristics of spontaneous regression among Thai SSc patients. A historical cohort study with 4 years of follow-up was performed among SSc patients over 15 years of age diagnosed with SSc between January 1, 2005 and December 31, 2006 in Khon Kaen, Thailand. The start date was the date of the first symptom and the end date was the date of the skin score ≤2. To estimate the respective probability of regression and to assess the associated factors, the Kaplan-Meier method and Cox regression analysis were used. One hundred seventeen cases of SSc were included with a female to male ratio of 1.5:1. Thirteen patients (11.1%) experienced regression. The incidence rate of spontaneous skin regression was 0.31 per 100 person-months and the average duration of SSc at the time of regression was 35.9±15.6 months (range, 15.7-60 months). The factors that negatively correlated with regression were (a) diffuse cutaneous type, (b) Raynaud's phenomenon, (c) esophageal dysmotility, and (d) colchicine treatment at onset with a respective hazard ratio (HR) of 0.19, 0.19, 0.26, and 0.20. By contrast, the factor that positively correlated with regression was active alveolitis with cyclophosphamide therapy at onset with an HR of 4.23 (95% CI, 1.23-14.10). After regression analysis, only Raynaud's phenomenon at onset and diffuse cutaneous type had a significantly negative correlation to regression. A spontaneous regression of the skin fibrosis process was not uncommon among Thai SSc patients. The factors predicting a poor cutaneous outcome were Raynaud's phenomenon and diffuse cutaneous type, while early cyclophosphamide therapy might be related to a better skin outcome.

  19. Parametric modeling of quantile regression coefficient functions.

    PubMed

    Frumento, Paolo; Bottai, Matteo

    2016-03-01

    Estimating the conditional quantiles of outcome variables of interest is frequent in many research areas, and quantile regression is foremost among the utilized methods. The coefficients of a quantile regression model depend on the order of the quantile being estimated. For example, the coefficients for the median are generally different from those of the 10th centile. In this article, we describe an approach to modeling the regression coefficients as parametric functions of the order of the quantile. This approach may have advantages in terms of parsimony, efficiency, and may expand the potential of statistical modeling. Goodness-of-fit measures and testing procedures are discussed, and the results of a simulation study are presented. We apply the method to analyze the data that motivated this work. The described method is implemented in the qrcm R package.
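
A minimal sketch of the idea of letting each coefficient be a parametric function of the quantile order p is given below. It is not the qrcm package's interface and does not perform estimation; it only evaluates a model of the form β_j(p) = θ_j0 + θ_j1 · z(p), where z(p) is the standard normal quantile function. That basis, the parameter values, and the function names are all assumptions chosen for illustration.

```python
from statistics import NormalDist

def beta(p, theta):
    """Coefficients at quantile order p: beta_j(p) = theta[j][0] + theta[j][1] * z(p),
    where z(p) is the standard normal quantile function (assumed basis)."""
    z = NormalDist().inv_cdf(p)
    return [t0 + t1 * z for t0, t1 in theta]

def conditional_quantile(p, x, theta):
    """Q(p | x) = beta_0(p) + beta_1(p) * x for a single covariate x."""
    b0, b1 = beta(p, theta)
    return b0 + b1 * x

# Hypothetical parameters: the intercept varies with p, the slope is constant in p.
theta = [(1.0, 2.0), (0.5, 0.0)]
for p in (0.1, 0.5, 0.9):
    print(p, conditional_quantile(p, x=2.0, theta=theta))
```

With these particular parameters the model reduces to a normal linear regression with slope 0.5 and standard deviation 2, described by four numbers rather than a separate coefficient vector for every quantile; this is the kind of parsimony the abstract refers to.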

  20. Computing aspects of power for multiple regression.

    PubMed

    Dunlap, William P; Xin, Xue; Myers, Leann

    2004-11-01

    Rules of thumb for power in multiple regression research abound. Most such rules dictate the necessary sample size, but they are based only upon the number of predictor variables, usually ignoring other critical factors necessary to compute power accurately. Other guides to power in multiple regression typically use approximate rather than precise equations for the underlying distribution; entail complex preparatory computations; require interpolation with tabular presentation formats; run only under software such as Mathematica or SAS that may not be immediately available to the user; or are sold to the user as parts of power computation packages. In contrast, the program we offer herein is immediately downloadable at no charge, runs under Windows, is interactive, self-explanatory, flexible to fit the user's own regression problems, and is as accurate as single precision computation ordinarily permits.