Henrard, S; Speybroeck, N; Hermans, C
2015-11-01
Haemophilia is a rare genetic haemorrhagic disease characterized by partial or complete deficiency of coagulation factor VIII, for haemophilia A, or IX, for haemophilia B. As in any other medical research domain, the field of haemophilia research is increasingly concerned with finding factors associated with binary or continuous outcomes through multivariable models. Traditional models include multiple logistic regressions, for binary outcomes, and multiple linear regressions for continuous outcomes. Yet these regression models are at times difficult to implement, especially for non-statisticians, and can be difficult to interpret. The present paper sought to didactically explain how, why, and when to use classification and regression tree (CART) analysis for haemophilia research. The CART method is non-parametric and non-linear, based on the repeated partitioning of a sample into subgroups based on a certain criterion. Breiman developed this method in 1984. Classification trees (CTs) are used to analyse categorical outcomes and regression trees (RTs) to analyse continuous ones. The CART methodology has become increasingly popular in the medical field, yet only a few examples of studies using this methodology specifically in haemophilia have to date been published. Two examples using CART analysis and previously published in this field are didactically explained in details. There is increasing interest in using CART analysis in the health domain, primarily due to its ease of implementation, use, and interpretation, thus facilitating medical decision-making. This method should be promoted for analysing continuous or categorical outcomes in haemophilia, when applicable. © 2015 John Wiley & Sons Ltd.
ERIC Educational Resources Information Center
Brabant, Marie-Eve; Hebert, Martine; Chagnon, Francois
2013-01-01
This study explored the clinical profiles of 77 female teenager survivors of sexual abuse and examined the association of abuse-related and personal variables with suicidal ideations. Analyses revealed that 64% of participants experienced suicidal ideations. Findings from classification and regression tree analysis indicated that depression,…
Developing Models to Forcast Sales of Natural Christmas Trees
Lawrence D. Garrett; Thomas H. Pendleton
1977-01-01
A study of practices for marketing Christmas trees in Winston-Salem, North Carolina, and Denver, Colorado, revealed that such factors as retail lot competition, tree price, consumer traffic, and consumer income were very important in determining a particular retailer's sales. Analyses of 4 years of market data were used in developing regression models for...
CADDIS Volume 4. Data Analysis: Basic Analyses
Use of statistical tests to determine if an observation is outside the normal range of expected values. Details of CART, regression analysis, use of quantile regression analysis, CART in causal analysis, simplifying or pruning resulting trees.
Acid rain, air pollution, and tree growth in southeastern New York
Puckett, L.J.
1982-01-01
Whether dendroecological analyses could be used to detect changes in the relationship of tree growth to climate that might have resulted from chronic exposure to components of the acid rain-air pollution complex was determined. Tree-ring indices of white pine (Pinus strobus L.), eastern hemlock (Tsuga canadensis (L.) Cart.), pitch pine (Pinus rigida Mill.), and chestnut oak (Quercus prinus L.) were regressed against orthogonally transformed values of temperature and precipitation in order to derive a response-function relationship. Results of the regression analyses for three time periods, 1901–1920, 1926–1945, and 1954–1973 suggest that the relationship of tree growth to climate has been altered. Statistical tests of the temperature and precipitation data suggest that this change was nonclimatic. Temporally, the shift in growth response appears to correspond with the suspected increase in acid rain and air pollution in the Shawangunk Mountain area of southeastern New York in the early 1950's. This change could be the result of physiological stress induced by components of the acid rain-air pollution complex, causing climatic conditions to be more limiting to tree growth.
Teng, Ju-Hsi; Lin, Kuan-Chia; Ho, Bin-Shenq
2007-10-01
A community-based aboriginal study was conducted and analysed to explore the application of classification tree and logistic regression. A total of 1066 aboriginal residents in Yilan County were screened during 2003-2004. The independent variables include demographic characteristics, physical examinations, geographic location, health behaviours, dietary habits and family hereditary diseases history. Risk factors of cardiovascular diseases were selected as the dependent variables in further analysis. The completion rate for heath interview is 88.9%. The classification tree results find that if body mass index is higher than 25.72 kg m(-2) and the age is above 51 years, the predicted probability for number of cardiovascular risk factors > or =3 is 73.6% and the population is 322. If body mass index is higher than 26.35 kg m(-2) and geographical latitude of the village is lower than 24 degrees 22.8', the predicted probability for number of cardiovascular risk factors > or =4 is 60.8% and the population is 74. As the logistic regression results indicate that body mass index, drinking habit and menopause are the top three significant independent variables. The classification tree model specifically shows the discrimination paths and interactions between the risk groups. The logistic regression model presents and analyses the statistical independent factors of cardiovascular risks. Applying both models to specific situations will provide a different angle for the design and management of future health intervention plans after community-based study.
Dyer, Betsey D.; Kahn, Michael J.; LeBlanc, Mark D.
2008-01-01
Classification and regression tree (CART) analysis was applied to genome-wide tetranucleotide frequencies (genomic signatures) of 195 archaea and bacteria. Although genomic signatures have typically been used to classify evolutionary divergence, in this study, convergent evolution was the focus. Temperature optima for most of the organisms examined could be distinguished by CART analyses of tetranucleotide frequencies. This suggests that pervasive (nonlinear) qualities of genomes may reflect certain environmental conditions (such as temperature) in which those genomes evolved. The predominant use of GAGA and AGGA as the discriminating tetramers in CART models suggests that purine-loading and codon biases of thermophiles may explain some of the results. PMID:19054742
Huang, Li-Shan; Myers, Gary J.; Davidson, Philip W.; Cox, Christopher; Xiao, Fenyuan; Thurston, Sally W.; Cernichiari, Elsa; Shamlaye, Conrad F.; Sloane-Reeves, Jean; Georger, Lesley; Clarkson, Thomas W.
2007-01-01
Studies of the association between prenatal methylmercury exposure from maternal fish consumption during pregnancy and neurodevelopmental test scores in the Seychelles Child Development Study have found no consistent pattern of associations through age nine years. The analyses for the most recent nine-year data examined the population effects of prenatal exposure, but did not address the possibility of non-homogeneous susceptibility. This paper presents a regression tree approach: covariate effects are treated nonlinearly and non-additively and non-homogeneous effects of prenatal methylmercury exposure are permitted among the covariate clusters identified by the regression tree. The approach allows us to address whether children in the lower or higher ends of the developmental spectrum differ in susceptibility to subtle exposure effects. Of twenty-one endpoints available at age nine years, we chose the Weschler Full Scale IQ and its associated covariates to construct the regression tree. The prenatal mercury effect in each of the nine resulting clusters was assessed linearly and non-homogeneously. In addition we reanalyzed five other nine-year endpoints that in the linear analysis has a two-tailed p-value <0.2 for the effect of prenatal exposure. In this analysis, motor proficiency and activity level improved significantly with increasing MeHg for 53% of the children who had an average home environment. Motor proficiency significantly decreased with increasing prenatal MeHg exposure in 7% of the children whose home environment was below average. The regression tree results support previous analyses of outcomes in this cohort. However, this analysis raises the intriguing possibility that an effect may be non-homogeneous among children with different backgrounds and IQ levels. PMID:17942158
Huang, Li-Shan; Myers, Gary J; Davidson, Philip W; Cox, Christopher; Xiao, Fenyuan; Thurston, Sally W; Cernichiari, Elsa; Shamlaye, Conrad F; Sloane-Reeves, Jean; Georger, Lesley; Clarkson, Thomas W
2007-11-01
Studies of the association between prenatal methylmercury exposure from maternal fish consumption during pregnancy and neurodevelopmental test scores in the Seychelles Child Development Study have found no consistent pattern of associations through age 9 years. The analyses for the most recent 9-year data examined the population effects of prenatal exposure, but did not address the possibility of non-homogeneous susceptibility. This paper presents a regression tree approach: covariate effects are treated non-linearly and non-additively and non-homogeneous effects of prenatal methylmercury exposure are permitted among the covariate clusters identified by the regression tree. The approach allows us to address whether children in the lower or higher ends of the developmental spectrum differ in susceptibility to subtle exposure effects. Of 21 endpoints available at age 9 years, we chose the Weschler Full Scale IQ and its associated covariates to construct the regression tree. The prenatal mercury effect in each of the nine resulting clusters was assessed linearly and non-homogeneously. In addition we reanalyzed five other 9-year endpoints that in the linear analysis had a two-tailed p-value <0.2 for the effect of prenatal exposure. In this analysis, motor proficiency and activity level improved significantly with increasing MeHg for 53% of the children who had an average home environment. Motor proficiency significantly decreased with increasing prenatal MeHg exposure in 7% of the children whose home environment was below average. The regression tree results support previous analyses of outcomes in this cohort. However, this analysis raises the intriguing possibility that an effect may be non-homogeneous among children with different backgrounds and IQ levels.
Buchner, Florian; Wasem, Jürgen; Schillo, Sonja
2017-01-01
Risk equalization formulas have been refined since their introduction about two decades ago. Because of the complexity and the abundance of possible interactions between the variables used, hardly any interactions are considered. A regression tree is used to systematically search for interactions, a methodologically new approach in risk equalization. Analyses are based on a data set of nearly 2.9 million individuals from a major German social health insurer. A two-step approach is applied: In the first step a regression tree is built on the basis of the learning data set. Terminal nodes characterized by more than one morbidity-group-split represent interaction effects of different morbidity groups. In the second step the 'traditional' weighted least squares regression equation is expanded by adding interaction terms for all interactions detected by the tree, and regression coefficients are recalculated. The resulting risk adjustment formula shows an improvement in the adjusted R 2 from 25.43% to 25.81% on the evaluation data set. Predictive ratios are calculated for subgroups affected by the interactions. The R 2 improvement detected is only marginal. According to the sample level performance measures used, not involving a considerable number of morbidity interactions forms no relevant loss in accuracy. Copyright © 2015 John Wiley & Sons, Ltd. Copyright © 2015 John Wiley & Sons, Ltd.
Bayesian models for comparative analysis integrating phylogenetic uncertainty.
de Villemereuil, Pierre; Wells, Jessie A; Edwards, Robert D; Blomberg, Simon P
2012-06-28
Uncertainty in comparative analyses can come from at least two sources: a) phylogenetic uncertainty in the tree topology or branch lengths, and b) uncertainty due to intraspecific variation in trait values, either due to measurement error or natural individual variation. Most phylogenetic comparative methods do not account for such uncertainties. Not accounting for these sources of uncertainty leads to false perceptions of precision (confidence intervals will be too narrow) and inflated significance in hypothesis testing (e.g. p-values will be too small). Although there is some application-specific software for fitting Bayesian models accounting for phylogenetic error, more general and flexible software is desirable. We developed models to directly incorporate phylogenetic uncertainty into a range of analyses that biologists commonly perform, using a Bayesian framework and Markov Chain Monte Carlo analyses. We demonstrate applications in linear regression, quantification of phylogenetic signal, and measurement error models. Phylogenetic uncertainty was incorporated by applying a prior distribution for the phylogeny, where this distribution consisted of the posterior tree sets from Bayesian phylogenetic tree estimation programs. The models were analysed using simulated data sets, and applied to a real data set on plant traits, from rainforest plant species in Northern Australia. Analyses were performed using the free and open source software OpenBUGS and JAGS. Incorporating phylogenetic uncertainty through an empirical prior distribution of trees leads to more precise estimation of regression model parameters than using a single consensus tree and enables a more realistic estimation of confidence intervals. In addition, models incorporating measurement errors and/or individual variation, in one or both variables, are easily formulated in the Bayesian framework. We show that BUGS is a useful, flexible general purpose tool for phylogenetic comparative analyses, particularly for modelling in the face of phylogenetic uncertainty and accounting for measurement error or individual variation in explanatory variables. Code for all models is provided in the BUGS model description language.
Bayesian models for comparative analysis integrating phylogenetic uncertainty
2012-01-01
Background Uncertainty in comparative analyses can come from at least two sources: a) phylogenetic uncertainty in the tree topology or branch lengths, and b) uncertainty due to intraspecific variation in trait values, either due to measurement error or natural individual variation. Most phylogenetic comparative methods do not account for such uncertainties. Not accounting for these sources of uncertainty leads to false perceptions of precision (confidence intervals will be too narrow) and inflated significance in hypothesis testing (e.g. p-values will be too small). Although there is some application-specific software for fitting Bayesian models accounting for phylogenetic error, more general and flexible software is desirable. Methods We developed models to directly incorporate phylogenetic uncertainty into a range of analyses that biologists commonly perform, using a Bayesian framework and Markov Chain Monte Carlo analyses. Results We demonstrate applications in linear regression, quantification of phylogenetic signal, and measurement error models. Phylogenetic uncertainty was incorporated by applying a prior distribution for the phylogeny, where this distribution consisted of the posterior tree sets from Bayesian phylogenetic tree estimation programs. The models were analysed using simulated data sets, and applied to a real data set on plant traits, from rainforest plant species in Northern Australia. Analyses were performed using the free and open source software OpenBUGS and JAGS. Conclusions Incorporating phylogenetic uncertainty through an empirical prior distribution of trees leads to more precise estimation of regression model parameters than using a single consensus tree and enables a more realistic estimation of confidence intervals. In addition, models incorporating measurement errors and/or individual variation, in one or both variables, are easily formulated in the Bayesian framework. We show that BUGS is a useful, flexible general purpose tool for phylogenetic comparative analyses, particularly for modelling in the face of phylogenetic uncertainty and accounting for measurement error or individual variation in explanatory variables. Code for all models is provided in the BUGS model description language. PMID:22741602
Regression methods for spatially correlated data: an example using beetle attacks in a seed orchard
Preisler Haiganoush; Nancy G. Rappaport; David L. Wood
1997-01-01
We present a statistical procedure for studying the simultaneous effects of observed covariates and unmeasured spatial variables on responses of interest. The procedure uses regression type analyses that can be used with existing statistical software packages. An example using the rate of twig beetle attacks on Douglas-fir trees in a seed orchard illustrates the...
J. Stephen Brewer
2010-01-01
Quantifying per capita impacts of invasive species on resident communities requires integrating regression analyses with experiments under natural conditions. Using multivariate and univariate approaches, I regressed the abundance of 105 resident species of groundcover plants and tree seedlings against the abundance and height of an invasive grass, Microstegium...
Modeling time-to-event (survival) data using classification tree analysis.
Linden, Ariel; Yarnold, Paul R
2017-12-01
Time to the occurrence of an event is often studied in health research. Survival analysis differs from other designs in that follow-up times for individuals who do not experience the event by the end of the study (called censored) are accounted for in the analysis. Cox regression is the standard method for analysing censored data, but the assumptions required of these models are easily violated. In this paper, we introduce classification tree analysis (CTA) as a flexible alternative for modelling censored data. Classification tree analysis is a "decision-tree"-like classification model that provides parsimonious, transparent (ie, easy to visually display and interpret) decision rules that maximize predictive accuracy, derives exact P values via permutation tests, and evaluates model cross-generalizability. Using empirical data, we identify all statistically valid, reproducible, longitudinally consistent, and cross-generalizable CTA survival models and then compare their predictive accuracy to estimates derived via Cox regression and an unadjusted naïve model. Model performance is assessed using integrated Brier scores and a comparison between estimated survival curves. The Cox regression model best predicts average incidence of the outcome over time, whereas CTA survival models best predict either relatively high, or low, incidence of the outcome over time. Classification tree analysis survival models offer many advantages over Cox regression, such as explicit maximization of predictive accuracy, parsimony, statistical robustness, and transparency. Therefore, researchers interested in accurate prognoses and clear decision rules should consider developing models using the CTA-survival framework. © 2017 John Wiley & Sons, Ltd.
Seligman, D A; Pullinger, A G
2000-01-01
Confusion about the relationship of occlusion to temporomandibular disorders (TMD) persists. This study attempted to identify occlusal and attrition factors plus age that would characterize asymptomatic normal female subjects. A total of 124 female patients with intracapsular TMD were compared with 47 asymptomatic female controls for associations to 9 occlusal factors, 3 attrition severity measures, and age using classification tree, multiple stepwise logistic regression, and univariate analyses. Models were tested for accuracy (sensitivity and specificity) and total contribution to the variance. The classification tree model had 4 terminal nodes that used only anterior attrition and age. "Normals" were mainly characterized by low attrition levels, whereas patients had higher attrition and tended to be younger. The tree model was only moderately useful (sensitivity 63%, specificity 94%) in predicting normals. The logistic regression model incorporated unilateral posterior crossbite and mediotrusive attrition severity in addition to the 2 factors in the tree, but was slightly less accurate than the tree (sensitivity 51%, specificity 90%). When only occlusal factors were considered in the analysis, normals were additionally characterized by a lack of anterior open bite, smaller overjet, and smaller RCP-ICP slides. The log likelihood accounted for was similar for both the tree (pseudo R(2) = 29.38%; mean deviance = 0.95) and the multiple logistic regression (Cox Snell R(2) = 30.3%, mean deviance = 0.84) models. The occlusal and attrition factors studied were only moderately useful in differentiating normals from TMD patients.
Empirical analyses of plant-climate relationships for the western United States
Gerald E. Rehfeldt; Nicholas L. Crookston; Marcus V. Warwell; Jeffrey S. Evans
2006-01-01
The Random Forests multiple-regression tree was used to model climate profiles of 25 biotic communities of the western United States and nine of their constituent species. Analyses of the communities were based on a gridded sample of ca. 140,000 points, while those for the species used presence-absence data from ca. 120,000 locations. Independent variables included 35...
Partitioning sources of variation in vertebrate species richness
Boone, R.B.; Krohn, W.B.
2000-01-01
Aim: To explore biogeographic patterns of terrestrial vertebrates in Maine, USA using techniques that would describe local and spatial correlations with the environment. Location: Maine, USA. Methods: We delineated the ranges within Maine (86,156 km2) of 275 species using literature and expert review. Ranges were combined into species richness maps, and compared to geomorphology, climate, and woody plant distributions. Methods were adapted that compared richness of all vertebrate classes to each environmental correlate, rather than assessing a single explanatory theory. We partitioned variation in species richness into components using tree and multiple linear regression. Methods were used that allowed for useful comparisons between tree and linear regression results. For both methods we partitioned variation into broad-scale (spatially autocorrelated) and fine-scale (spatially uncorrelated) explained and unexplained components. By partitioning variance, and using both tree and linear regression in analyses, we explored the degree of variation in species richness for each vertebrate group that Could be explained by the relative contribution of each environmental variable. Results: In tree regression, climate variation explained richness better (92% of mean deviance explained for all species) than woody plant variation (87%) and geomorphology (86%). Reptiles were highly correlated with environmental variation (93%), followed by mammals, amphibians, and birds (each with 84-82% deviance explained). In multiple linear regression, climate was most closely associated with total vertebrate richness (78%), followed by woody plants (67%) and geomorphology (56%). Again, reptiles were closely correlated with the environment (95%), followed by mammals (73%), amphibians (63%) and birds (57%). Main conclusions: Comparing variation explained using tree and multiple linear regression quantified the importance of nonlinear relationships and local interactions between species richness and environmental variation, identifying the importance of linear relationships between reptiles and the environment, and nonlinear relationships between birds and woody plants, for example. Conservation planners should capture climatic variation in broad-scale designs; temperatures may shift during climate change, but the underlying correlations between the environment and species richness will presumably remain.
Koch, George W; Sillett, Stephen C; Jennings, Gregory M; Davis, Stephen D
2004-04-22
Trees grow tall where resources are abundant, stresses are minor, and competition for light places a premium on height growth. The height to which trees can grow and the biophysical determinants of maximum height are poorly understood. Some models predict heights of up to 120 m in the absence of mechanical damage, but there are historical accounts of taller trees. Current hypotheses of height limitation focus on increasing water transport constraints in taller trees and the resulting reductions in leaf photosynthesis. We studied redwoods (Sequoia sempervirens), including the tallest known tree on Earth (112.7 m), in wet temperate forests of northern California. Our regression analyses of height gradients in leaf functional characteristics estimate a maximum tree height of 122-130 m barring mechanical damage, similar to the tallest recorded trees of the past. As trees grow taller, increasing leaf water stress due to gravity and path length resistance may ultimately limit leaf expansion and photosynthesis for further height growth, even with ample soil moisture.
NASA Astrophysics Data System (ADS)
Zabret, Katarina; Rakovec, Jože; Šraj, Mojca
2018-03-01
Rainfall partitioning is an important part of the ecohydrological cycle, influenced by numerous variables. Rainfall partitioning for pine (Pinus nigra Arnold) and birch (Betula pendula Roth.) trees was measured from January 2014 to June 2017 in an urban area of Ljubljana, Slovenia. 180 events from more than three years of observations were analyzed, focusing on 13 meteorological variables, including the number of raindrops, their diameter, and velocity. Regression tree and boosted regression tree analyses were performed to evaluate the influence of the variables on rainfall interception loss, throughfall, and stemflow in different phenoseasons. The amount of rainfall was recognized as the most influential variable, followed by rainfall intensity and the number of raindrops. Higher rainfall amount, intensity, and the number of drops decreased percentage of rainfall interception loss. Rainfall amount and intensity were the most influential on interception loss by birch and pine trees during the leafed and leafless periods, respectively. Lower wind speed was found to increase throughfall, whereas wind direction had no significant influence. Consideration of drop size spectrum properties proved to be important, since the number of drops, drop diameter, and median volume diameter were often recognized as important influential variables.
Brabant, Marie-Eve; Hébert, Martine; Chagnon, François
2013-01-01
This study explored the clinical profiles of 77 female teenager survivors of sexual abuse and examined the association of abuse-related and personal variables with suicidal ideations. Analyses revealed that 64% of participants experienced suicidal ideations. Findings from classification and regression tree analysis indicated that depression, posttraumatic stress symptoms, and hopelessness discriminated profiles of suicidal and nonsuicidal survivors. The elevated prevalence of suicidal ideations among adolescent survivors of sexual abuse underscores the importance of investigating the presence of suicidal ideations in sexual abuse survivors. However, suicidal ideation is not the sole variable that needs to be investigated; depression, hopelessness and posttraumatic stress symptoms are also related to suicidal ideations in survivors and could therefore guide interventions.
Fraver, Shawn; D'Amato, Anthony W.; Bradford, John B.; Jonsson, Bengt Gunnar; Jönsson, Mari; Esseen, Per-Anders
2013-01-01
Question: What factors best characterize tree competitive environments in this structurally diverse old-growth forest, and do these factors vary spatially within and among stands? Location: Old-growth Picea abies forest of boreal Sweden. Methods: Using long-term, mapped permanent plot data augmented with dendrochronological analyses, we evaluated the effect of neighbourhood competition on focal tree growth by means of standard competition indices, each modified to include various metrics of trees size, neighbour mortality weighting (for neighbours that died during the inventory period), and within-neighbourhood tree clustering. Candidate models were evaluated using mixed-model linear regression analyses, with mean basal area increment as the response variable. We then analysed stand-level spatial patterns of competition indices and growth rates (via kriging) to determine if the relationship between these patterns could further elucidate factors influencing tree growth. Results: Inter-tree competition clearly affected growth rates, with crown volume being the size metric most strongly influencing the neighbourhood competitive environment. Including neighbour tree mortality weightings in models only slightly improved descriptions of competitive interactions. Although the within-neighbourhood clustering index did not improve model predictions, competition intensity was influenced by the underlying stand-level tree spatial arrangement: stand-level clustering locally intensified competition and reduced tree growth, whereas in the absence of such clustering, inter-tree competition played a lesser role in constraining tree growth. Conclusions: Our findings demonstrate that competition continues to influence forest processes and structures in an old-growth system that has not experienced major disturbances for at least two centuries. The finding that the underlying tree spatial pattern influenced the competitive environment suggests caution in interpreting traditional tree competition studies, in which tree spatial patterning is typically not taken into account. Our findings highlight the importance of forest structure – particularly the spatial arrangement of trees – in regulating inter-tree competition and growth in structurally diverse forests, and they provide insight into the causes and consequences of heterogeneity in this old-growth system.
Tree-ring variation in western larch (Larix occidentalis Nutt. ) exposed to sulfur dioxide emissions
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fox, C.A.; Kincaid, W.B.; Nash, T.H. III
1984-12-01
Tree-ring analysis of western larch (Larix occidentialis Nutt) demonstrated both direct and indirect affects of sulfur dioxide emissions from the lead/zinc smelter at Trail, B.C. Tree cores were collected from 5 stands known to have been polluted and from 3 control stands. Age effects were removed by fitting theoretical growth curves, and macrocliate was modeled using the average of the controls and two laged values thereof. Separate analyses were performed for years before and after installation of two tall stacks, for drought and nondrought years, and for years prior to initiation of smelting. Regression analyses revealed a negative effect onmore » annual growth that diminished with increasing distance from the smelter and during drought years. Furthermore, chronology statistics suggested an increase in sensitivity to climate that persisted decades beyond implementation of pollution controls, which reduced emissions 10-fold. 38 references, 6 figures, 3 tables.« less
Hayes, Timothy; Usami, Satoshi; Jacobucci, Ross; McArdle, John J
2015-12-01
In this article, we describe a recent development in the analysis of attrition: using classification and regression trees (CART) and random forest methods to generate inverse sampling weights. These flexible machine learning techniques have the potential to capture complex nonlinear, interactive selection models, yet to our knowledge, their performance in the missing data analysis context has never been evaluated. To assess the potential benefits of these methods, we compare their performance with commonly employed multiple imputation and complete case techniques in 2 simulations. These initial results suggest that weights computed from pruned CART analyses performed well in terms of both bias and efficiency when compared with other methods. We discuss the implications of these findings for applied researchers. (c) 2015 APA, all rights reserved).
Hayes, Timothy; Usami, Satoshi; Jacobucci, Ross; McArdle, John J.
2016-01-01
In this article, we describe a recent development in the analysis of attrition: using classification and regression trees (CART) and random forest methods to generate inverse sampling weights. These flexible machine learning techniques have the potential to capture complex nonlinear, interactive selection models, yet to our knowledge, their performance in the missing data analysis context has never been evaluated. To assess the potential benefits of these methods, we compare their performance with commonly employed multiple imputation and complete case techniques in 2 simulations. These initial results suggest that weights computed from pruned CART analyses performed well in terms of both bias and efficiency when compared with other methods. We discuss the implications of these findings for applied researchers. PMID:26389526
The influence of tree morphology on stemflow generation in a tropical lowland rainforest
NASA Astrophysics Data System (ADS)
Uber, Magdalena; Levia, Delphis F.; Zimmermann, Beate; Zimmermann, Alexander
2014-05-01
Even though stemflow usually accounts for only a small proportion of rainfall, it is an important point source of water and ion input to forest floors and may, for instance, influence soil moisture patterns and groundwater recharge. Previous studies showed that the generation of stemflow depends on a multitude of meteorological and biological factors. Interestingly, despite the tremendous progress in stemflow research during the last decades it is still largely unknown which combination of tree characteristics determines stemflow volumes in species-rich tropical forests. This knowledge gap motivated us to analyse the influence of tree characteristics on stemflow volumes in a 1 hectare plot located in a Panamanian lowland rainforest. Our study comprised stemflow measurements in six randomly selected 10 m by 10 m subplots. In each subplot we measured stemflow of all trees with a diameter at breast height (DBH) > 5 cm on an event-basis for a period of six weeks. Additionally, we identified all tree species and determined a set of tree characteristics including DBH, crown diameter, bark roughness, bark furrowing, epiphyte coverage, tree architecture, stem inclination, and crown position. During the sampling period, we collected 985 L of stemflow (0.98 % of total rainfall). Based on regression analyses and comparisons among plant functional groups we show that palms were most efficient in yielding stemflow due to their large inclined fronds. Trees with large emergent crowns also produced relatively large amounts of stemflow. Due to their abundance, understory trees contribute much to stemflow yield not on individual but on the plot scale. Even though parameters such as crown diameter, branch inclination and position of the crown influence stemflow generation to some extent, these parameters explain less than 30 % of the variation in stemflow volumes. In contrast to published results from temperate forests, we did not detect a negative correlation between bark roughness and stemflow volume. This is because other parameters such as crown diameter obscured this relationship. Due to multicollinearity and poor correlations between single tree characteristics with stemflow volume, an assessment of stemflow volumes based on forest characteristics remains cumbersome in highly diverse ecosystems. Instead of relying on regression relationships, we therefore advocate a total sampling of trees in several plots to determine stand-scale stemflow yield in tropical forests.
Predicting postfire Douglas-fir beetle attacks and tree mortality in the northern Rocky Mountains
Sharon Hood; Barbara Bentz
2007-01-01
Douglas-fir (Pseudotsuga menziesii (Mirb.) Franco) were monitored for 4 years following three wildfires. Logistic regression analyses were used to develop models predicting the probability of attack by Douglas-fir beetle (Dendroctonus pseudotsugae Hopkins, 1905) and the probability of Douglas-fir mortality within 4 years following...
Andreas, Sylke; Harries-Hedder, Karin; Schwenk, Wolfgang; Hausberg, Maria; Koch, Uwe; Schulz, Holger
2010-07-01
The Health of the Nation Outcome Scales (HoNOS) is an internationally established clinician-rated instrument. The aim of the study was to assess the psychometric properties in inpatients with substance-related disorders. The HoNOS was applied in a multicenter, consecutive sample of 417 inpatients. Interrater reliability coefficients, confirmatory factor analysis, and regression tree analyses were calculated to assess the reliability and validity of the HoNOS. The factor validity of the HoNOS and its total score could not be confirmed. After training, all items of the HoNOS revealed sufficient values of interrater reliabilities. As the results of the regression tree analyses showed, the single items of the HoNOS were one of the most important predictor of service utilization. The HoNOS can be recommended for obtaining detailed ratings of the problems of inpatients with substance-related disorders as a clinical application in routine mental health care at present. Further studies should include comparisons of HoNOS and Addiction Severity Index. Copyright 2010 Elsevier Inc. All rights reserved.
Custer, Christine M.; Custer, Thomas W.; Dummer, Paul; Etterson, Matthew A.; Thogmartin, Wayne E.; Wu, Qian; Kannan, Kurunthachalam; Trowbridge, Annette; McKann, Patrick C.
2013-01-01
The exposure and effects of perfluoroalkyl substances (PFASs) were studied at eight locations in Minnesota and Wisconsin between 2007 and 2011 using tree swallows (Tachycineta bicolor). Concentrations of PFASs were quantified as were reproductive success end points. The sample egg method was used wherein an egg sample is collected, and the hatching success of the remaining eggs in the nest is assessed. The association between PFAS exposure and reproductive success was assessed by site comparisons, logistic regression analysis, and multistate modeling, a technique not previously used in this context. There was a negative association between concentrations of perfluorooctane sulfonate (PFOS) in eggs and hatching success. The concentration at which effects became evident (150–200 ng/g wet weight) was far lower than effect levels found in laboratory feeding trials or egg-injection studies of other avian species. This discrepancy was likely because behavioral effects and other extrinsic factors are not accounted for in these laboratory studies and the possibility that tree swallows are unusually sensitive to PFASs. The results from multistate modeling and simple logistic regression analyses were nearly identical. Multistate modeling provides a better method to examine possible effects of additional covariates and assessment of models using Akaike information criteria analyses. There was a credible association between PFOS concentrations in plasma and eggs, so extrapolation between these two commonly sampled tissues can be performed.
Anantha M. Prasad; Louis R. Iverson; Andy Liaw; Andy Liaw
2006-01-01
We evaluated four statistical models - Regression Tree Analysis (RTA), Bagging Trees (BT), Random Forests (RF), and Multivariate Adaptive Regression Splines (MARS) - for predictive vegetation mapping under current and future climate scenarios according to the Canadian Climate Centre global circulation model.
Ultrasonographic Diagnosis of Biliary Atresia Based on a Decision-Making Tree Model.
Lee, So Mi; Cheon, Jung-Eun; Choi, Young Hun; Kim, Woo Sun; Cho, Hyun-Hae; Cho, Hyun-Hye; Kim, In-One; You, Sun Kyoung
2015-01-01
To assess the diagnostic value of various ultrasound (US) findings and to make a decision-tree model for US diagnosis of biliary atresia (BA). From March 2008 to January 2014, the following US findings were retrospectively evaluated in 100 infants with cholestatic jaundice (BA, n = 46; non-BA, n = 54): length and morphology of the gallbladder, triangular cord thickness, hepatic artery and portal vein diameters, and visualization of the common bile duct. Logistic regression analyses were performed to determine the features that would be useful in predicting BA. Conditional inference tree analysis was used to generate a decision-making tree for classifying patients into the BA or non-BA groups. Multivariate logistic regression analysis showed that abnormal gallbladder morphology and greater triangular cord thickness were significant predictors of BA (p = 0.003 and 0.001; adjusted odds ratio: 345.6 and 65.6, respectively). In the decision-making tree using conditional inference tree analysis, gallbladder morphology and triangular cord thickness (optimal cutoff value of triangular cord thickness, 3.4 mm) were also selected as significant discriminators for differential diagnosis of BA, and gallbladder morphology was the first discriminator. The diagnostic performance of the decision-making tree was excellent, with sensitivity of 100% (46/46), specificity of 94.4% (51/54), and overall accuracy of 97% (97/100). Abnormal gallbladder morphology and greater triangular cord thickness (> 3.4 mm) were the most useful predictors of BA on US. We suggest that the gallbladder morphology should be evaluated first and that triangular cord thickness should be evaluated subsequently in cases with normal gallbladder morphology.
The process and utility of classification and regression tree methodology in nursing research
Kuhn, Lisa; Page, Karen; Ward, John; Worrall-Carter, Linda
2014-01-01
Aim This paper presents a discussion of classification and regression tree analysis and its utility in nursing research. Background Classification and regression tree analysis is an exploratory research method used to illustrate associations between variables not suited to traditional regression analysis. Complex interactions are demonstrated between covariates and variables of interest in inverted tree diagrams. Design Discussion paper. Data sources English language literature was sourced from eBooks, Medline Complete and CINAHL Plus databases, Google and Google Scholar, hard copy research texts and retrieved reference lists for terms including classification and regression tree* and derivatives and recursive partitioning from 1984–2013. Discussion Classification and regression tree analysis is an important method used to identify previously unknown patterns amongst data. Whilst there are several reasons to embrace this method as a means of exploratory quantitative research, issues regarding quality of data as well as the usefulness and validity of the findings should be considered. Implications for Nursing Research Classification and regression tree analysis is a valuable tool to guide nurses to reduce gaps in the application of evidence to practice. With the ever-expanding availability of data, it is important that nurses understand the utility and limitations of the research method. Conclusion Classification and regression tree analysis is an easily interpreted method for modelling interactions between health-related variables that would otherwise remain obscured. Knowledge is presented graphically, providing insightful understanding of complex and hierarchical relationships in an accessible and useful way to nursing and other health professions. PMID:24237048
The process and utility of classification and regression tree methodology in nursing research.
Kuhn, Lisa; Page, Karen; Ward, John; Worrall-Carter, Linda
2014-06-01
This paper presents a discussion of classification and regression tree analysis and its utility in nursing research. Classification and regression tree analysis is an exploratory research method used to illustrate associations between variables not suited to traditional regression analysis. Complex interactions are demonstrated between covariates and variables of interest in inverted tree diagrams. Discussion paper. English language literature was sourced from eBooks, Medline Complete and CINAHL Plus databases, Google and Google Scholar, hard copy research texts and retrieved reference lists for terms including classification and regression tree* and derivatives and recursive partitioning from 1984-2013. Classification and regression tree analysis is an important method used to identify previously unknown patterns amongst data. Whilst there are several reasons to embrace this method as a means of exploratory quantitative research, issues regarding quality of data as well as the usefulness and validity of the findings should be considered. Classification and regression tree analysis is a valuable tool to guide nurses to reduce gaps in the application of evidence to practice. With the ever-expanding availability of data, it is important that nurses understand the utility and limitations of the research method. Classification and regression tree analysis is an easily interpreted method for modelling interactions between health-related variables that would otherwise remain obscured. Knowledge is presented graphically, providing insightful understanding of complex and hierarchical relationships in an accessible and useful way to nursing and other health professions. © 2013 The Authors. Journal of Advanced Nursing Published by John Wiley & Sons Ltd.
Chilling and heat requirements for flowering in temperate fruit trees
NASA Astrophysics Data System (ADS)
Guo, Liang; Dai, Junhu; Ranjitkar, Sailesh; Yu, Haiying; Xu, Jianchu; Luedeling, Eike
2014-08-01
Climate change has affected the rates of chilling and heat accumulation, which are vital for flowering and production, in temperate fruit trees, but few studies have been conducted in the cold-winter climates of East Asia. To evaluate tree responses to variation in chill and heat accumulation rates, partial least squares regression was used to correlate first flowering dates of chestnut ( Castanea mollissima Blume) and jujube ( Zizyphus jujube Mill.) in Beijing, China, with daily chill and heat accumulation between 1963 and 2008. The Dynamic Model and the Growing Degree Hour Model were used to convert daily records of minimum and maximum temperature into horticulturally meaningful metrics. Regression analyses identified the chilling and forcing periods for chestnut and jujube. The forcing periods started when half the chilling requirements were fulfilled. Over the past 50 years, heat accumulation during tree dormancy increased significantly, while chill accumulation remained relatively stable for both species. Heat accumulation was the main driver of bloom timing, with effects of variation in chill accumulation negligible in Beijing's cold-winter climate. It does not seem likely that reductions in chill will have a major effect on the studied species in Beijing in the near future. Such problems are much more likely for trees grown in locations that are substantially warmer than their native habitats, such as temperate species in the subtropics and tropics.
Chilling and heat requirements for flowering in temperate fruit trees.
Guo, Liang; Dai, Junhu; Ranjitkar, Sailesh; Yu, Haiying; Xu, Jianchu; Luedeling, Eike
2014-08-01
Climate change has affected the rates of chilling and heat accumulation, which are vital for flowering and production, in temperate fruit trees, but few studies have been conducted in the cold-winter climates of East Asia. To evaluate tree responses to variation in chill and heat accumulation rates, partial least squares regression was used to correlate first flowering dates of chestnut (Castanea mollissima Blume) and jujube (Zizyphus jujube Mill.) in Beijing, China, with daily chill and heat accumulation between 1963 and 2008. The Dynamic Model and the Growing Degree Hour Model were used to convert daily records of minimum and maximum temperature into horticulturally meaningful metrics. Regression analyses identified the chilling and forcing periods for chestnut and jujube. The forcing periods started when half the chilling requirements were fulfilled. Over the past 50 years, heat accumulation during tree dormancy increased significantly, while chill accumulation remained relatively stable for both species. Heat accumulation was the main driver of bloom timing, with effects of variation in chill accumulation negligible in Beijing’s cold-winter climate. It does not seem likely that reductions in chill will have a major effect on the studied species in Beijing in the near future. Such problems are much more likely for trees grown in locations that are substantially warmer than their native habitats, such as temperate species in the subtropics and tropics.
KAWAGUCHI, TAKUMI; SUETSUGU, TAKURO; OGATA, SHYOU; IMANAGA, MINAMI; ISHII, KUMIKO; ESAKI, NAO; SUGIMOTO, MASAKO; OTSUYAMA, JYURI; NAGAMATSU, AYU; TANIGUCHI, EITARO; ITOU, MINORU; ORIISHI, TETSUHARU; IWASAKI, SHOKO; MIURA, HIROKO; TORIMURA, TAKUJI
2016-01-01
The incidence of traffic accidents in patients with chronic liver disease (CLD) is high in the USA. However, the characteristics of patients, including dietary habits, differ between Japan and the USA. The present study investigated the incidence of traffic accidents in CLD patients and the clinical profiles associated with traffic accidents in Japan using a data-mining analysis. A cross-sectional study was performed and 256 subjects [148 CLD patients (CLD group) and 106 patients with other digestive diseases (disease control group)] were enrolled; 2 patients were excluded. The incidence of traffic accidents was compared between the two groups. Independent factors for traffic accidents were analyzed using logistic regression and decision-tree analyses. The incidence of traffic accidents did not differ between the CLD and disease control groups (8.8 vs. 11.3%). The results of the logistic regression analysis showed that yoghurt consumption was the only independent risk factor for traffic accidents (odds ratio, 0.37; 95% confidence interval, 0.16–0.85; P=0.0197). Similarly, the results of the decision-tree analysis showed that yoghurt consumption was the initial divergence variable. In patients who consumed yoghurt habitually, the incidence of traffic accidents was 6.6%, while that in patients who did not consume yoghurt was 16.0%. CLD was not identified as an independent factor in the logistic regression and decision-tree analyses. In conclusion, the difference in the incidence of traffic accidents in Japan between the CLD and disease control groups was insignificant. Furthermore, yoghurt consumption was an independent negative risk factor for traffic accidents in patients with digestive diseases, including CLD. PMID:27123257
Kawaguchi, Takumi; Suetsugu, Takuro; Ogata, Shyou; Imanaga, Minami; Ishii, Kumiko; Esaki, Nao; Sugimoto, Masako; Otsuyama, Jyuri; Nagamatsu, Ayu; Taniguchi, Eitaro; Itou, Minoru; Oriishi, Tetsuharu; Iwasaki, Shoko; Miura, Hiroko; Torimura, Takuji
2016-05-01
The incidence of traffic accidents in patients with chronic liver disease (CLD) is high in the USA. However, the characteristics of patients, including dietary habits, differ between Japan and the USA. The present study investigated the incidence of traffic accidents in CLD patients and the clinical profiles associated with traffic accidents in Japan using a data-mining analysis. A cross-sectional study was performed and 256 subjects [148 CLD patients (CLD group) and 106 patients with other digestive diseases (disease control group)] were enrolled; 2 patients were excluded. The incidence of traffic accidents was compared between the two groups. Independent factors for traffic accidents were analyzed using logistic regression and decision-tree analyses. The incidence of traffic accidents did not differ between the CLD and disease control groups (8.8 vs. 11.3%). The results of the logistic regression analysis showed that yoghurt consumption was the only independent risk factor for traffic accidents (odds ratio, 0.37; 95% confidence interval, 0.16-0.85; P=0.0197). Similarly, the results of the decision-tree analysis showed that yoghurt consumption was the initial divergence variable. In patients who consumed yoghurt habitually, the incidence of traffic accidents was 6.6%, while that in patients who did not consume yoghurt was 16.0%. CLD was not identified as an independent factor in the logistic regression and decision-tree analyses. In conclusion, the difference in the incidence of traffic accidents in Japan between the CLD and disease control groups was insignificant. Furthermore, yoghurt consumption was an independent negative risk factor for traffic accidents in patients with digestive diseases, including CLD.
Diameter and height growth of suppressed grand fir saplings after overstory removal.
K.W. Seidel
1980-01-01
The 2- and 5-year diameter and height growth of suppressed grand fir (Abies grandis (Dougl. ex D. Don) Lindl.) advance reproduction was measured in central Oregon after the overstory was removed. Multiple regression analyses were used to predict growth response as a function of individual tree variables. The resulting equations, although highly...
Evaluation of open source data mining software packages
Bonnie Ruefenacht; Greg Liknes; Andrew J. Lister; Haans Fisk; Dan Wendt
2009-01-01
Since 2001, the USDA Forest Service (USFS) has used classification and regression-tree technology to map USFS Forest Inventory and Analysis (FIA) biomass, forest type, forest type groups, and National Forest vegetation. This prior work used Cubist/See5 software for the analyses. The objective of this project, sponsored by the Remote Sensing Steering Committee (RSSC),...
Austin, Peter C; Lee, Douglas S; Steyerberg, Ewout W; Tu, Jack V
2012-01-01
In biomedical research, the logistic regression model is the most commonly used method for predicting the probability of a binary outcome. While many clinical researchers have expressed an enthusiasm for regression trees, this method may have limited accuracy for predicting health outcomes. We aimed to evaluate the improvement that is achieved by using ensemble-based methods, including bootstrap aggregation (bagging) of regression trees, random forests, and boosted regression trees. We analyzed 30-day mortality in two large cohorts of patients hospitalized with either acute myocardial infarction (N = 16,230) or congestive heart failure (N = 15,848) in two distinct eras (1999–2001 and 2004–2005). We found that both the in-sample and out-of-sample prediction of ensemble methods offered substantial improvement in predicting cardiovascular mortality compared to conventional regression trees. However, conventional logistic regression models that incorporated restricted cubic smoothing splines had even better performance. We conclude that ensemble methods from the data mining and machine learning literature increase the predictive performance of regression trees, but may not lead to clear advantages over conventional logistic regression models for predicting short-term mortality in population-based samples of subjects with cardiovascular disease. PMID:22777999
Integration of vessel traits, wood density, and height in angiosperm shrubs and trees.
Martínez-Cabrera, Hugo I; Schenk, H Jochen; Cevallos-Ferriz, Sergio R S; Jones, Cynthia S
2011-05-01
Trees and shrubs tend to occupy different niches within and across ecosystems; therefore, traits related to their resource use and life history are expected to differ. Here we analyzed how growth form is related to variation in integration among vessel traits, wood density, and height. We also considered the ecological and evolutionary consequences of such differences. In a sample of 200 woody plant species (65 shrubs and 135 trees) from Argentina, Mexico, and the United States, standardized major axis (SMA) regression, correlation analyses, and ANOVA were used to determine whether relationships among traits differed between growth forms. The influence of phylogenetic relationships was examined with a phylogenetic ANOVA and phylogenetically independent contrasts (PICs). A principal component analysis was conducted to determine whether trees and shrubs occupy different portions of multivariate trait space. Wood density did not differ between shrubs and trees, but there were significant differences in vessel diameter, vessel density, theoretical conductivity, and as expected, height. In addition, relationships between vessel traits and wood density differed between growth forms. Trees showed coordination among vessel traits, wood density, and height, but in shrubs, wood density and vessel traits were independent. These results hold when phylogenetic relationships were considered. In the multivariate analyses, these differences translated as significantly different positions in multivariate trait space occupied by shrubs and trees. Differences in trait integration between growth forms suggest that evolution of growth form in some lineages might be associated with the degree of trait interrelation.
Genetic variation and seed transfer guidelines for lodgepole pine in central Oregon.
Frank C. Sorensen
1992-01-01
Cones were collected from 272 trees at 189 locations uniformly distributed over the east slopes of the Oregon Cascade Range and Warner Mountains. Variation in seed and seedling traits was related to (1) seed source latitude, distance from the Cascade crest, elevation, slope, and aspect in multiple regression analyses; and (2) seed zone and elevation band in...
Paul G. Schaberg; Brynne E. Lazarus; Gary J. Hawley; Joshua M. Halman; Catherine H. Borer; Christopher F. Hansen
2011-01-01
Despite considerable study, it remains uncertain what environmental factors contribute to red spruce (Picea rubens Sarg.) foliar winter injury and how much this injury influences tree C stores. We used a long-term record of winter injury in a plantation in New Hampshire and conducted stepwise linear regression analyses with local weather and regional...
Avelino, Jacques; Cabut, Sandrine; Barboza, Bernardo; Barquero, Miguel; Alfaro, Ronny; Esquivel, César; Durand, Jean-François; Cilas, Christian
2007-12-01
ABSTRACT We monitored the development of American leaf spot of coffee, a disease caused by the gemmiferous fungus Mycena citricolor, in 57 plots in Costa Rica for 1 or 2 years in order to gain a clearer understanding of conditions conducive to the disease and improve its control. During the investigation, characteristics of the coffee trees, crop management, and the environment were recorded. For the analyses, we used partial least-squares regression via the spline functions (PLSS), which is a nonlinear extension to partial least-squares regression (PLS). The fungus developed well in areas located between approximately 1,100 and 1,550 m above sea level. Slopes were conducive to its development, but eastern-facing slopes were less affected than the others, probably because they were more exposed to sunlight, especially in the rainy season. The distance between planting rows, the shade percentage, coffee tree height, the type of shade, and the pruning system explained disease intensity due to their effects on coffee tree shading and, possibly, on the humidity conditions in the plot. Forest trees and fruit trees intercropped with coffee provided particularly propitious conditions. Apparently, fertilization was unfavorable for the disease, probably due to dilution phenomena associated with faster coffee tree growth. Finally, series of wet spells interspersed with dry spells, which were frequent in the middle of the rainy season, were critical for the disease, probably because they affected the production and release of gemmae and their viability. These results could be used to draw up a map of epidemic risks taking topographical factors into account. To reduce those risks and improve chemical control, our results suggested that farmers should space planting rows further apart, maintain light shading in the plantation, and prune their coffee trees.
Assessing the predictive capability of randomized tree-based ensembles in streamflow modelling
NASA Astrophysics Data System (ADS)
Galelli, S.; Castelletti, A.
2013-07-01
Combining randomization methods with ensemble prediction is emerging as an effective option to balance accuracy and computational efficiency in data-driven modelling. In this paper, we investigate the prediction capability of extremely randomized trees (Extra-Trees), in terms of accuracy, explanation ability and computational efficiency, in a streamflow modelling exercise. Extra-Trees are a totally randomized tree-based ensemble method that (i) alleviates the poor generalisation property and tendency to overfitting of traditional standalone decision trees (e.g. CART); (ii) is computationally efficient; and, (iii) allows to infer the relative importance of the input variables, which might help in the ex-post physical interpretation of the model. The Extra-Trees potential is analysed on two real-world case studies - Marina catchment (Singapore) and Canning River (Western Australia) - representing two different morphoclimatic contexts. The evaluation is performed against other tree-based methods (CART and M5) and parametric data-driven approaches (ANNs and multiple linear regression). Results show that Extra-Trees perform comparatively well to the best of the benchmarks (i.e. M5) in both the watersheds, while outperforming the other approaches in terms of computational requirement when adopted on large datasets. In addition, the ranking of the input variable provided can be given a physically meaningful interpretation.
Using decision tree analysis to identify risk factors for relapse to smoking
Piper, Megan E.; Loh, Wei-Yin; Smith, Stevens S.; Japuntich, Sandra J.; Baker, Timothy B.
2010-01-01
This research used classification tree analysis and logistic regression models to identify risk factors related to short- and long-term abstinence. Baseline and cessation outcome data from two smoking cessation trials, conducted from 2001 to 2002, in two Midwestern urban areas, were analyzed. There were 928 participants (53.1% women, 81.8% white) with complete data. Both analyses suggest that relapse risk is produced by interactions of risk factors and that early and late cessation outcomes reflect different vulnerability factors. The results illustrate the dynamic nature of relapse risk and suggest the importance of efficient modeling of interactions in relapse prediction. PMID:20397871
Boosted regression tree, table, and figure data
Spreadsheets are included here to support the manuscript Boosted Regression Tree Models to Explain Watershed Nutrient Concentrations and Biological Condition. This dataset is associated with the following publication:Golden , H., C. Lane , A. Prues, and E. D'Amico. Boosted Regression Tree Models to Explain Watershed Nutrient Concentrations and Biological Condition. JAWRA. American Water Resources Association, Middleburg, VA, USA, 52(5): 1251-1274, (2016).
Travis Woolley; David C. Shaw; Lisa M. Ganio; Stephen Fitzgerald
2012-01-01
Logistic regression models used to predict tree mortality are critical to post-fire management, planning prescribed bums and understanding disturbance ecology. We review literature concerning post-fire mortality prediction using logistic regression models for coniferous tree species in the western USA. We include synthesis and review of: methods to develop, evaluate...
Mapping tree and impervious cover using Ikonos imagery: links with water quality and stream health
NASA Astrophysics Data System (ADS)
Wright, R.; Goetz, S. J.; Smith, A.; Zinecker, E.
2002-12-01
Precision georeferened Ikonos satellite imagery was used to map tree cover and impervious surface area in Montgomery county Maryland. The derived maps were used to assess riparian zone stream buffer tree cover and to predict, with multivariate logistic regression, stream health ratings across 246 small watersheds averaging 472 km2 in size. Stream health was assessed by state and county experts using a combination of physical measurements (e.g., dissolved oxygen) and biological indicators (e.g., benthic macroinvertebrates). We found it possible to create highly accurate (90+ per cent) maps of tree and impervious cover using decision tree classifiers, provided extensive field data were available for algorithm training. Impervious surface area was found to be the primary predictor of stream health, followed by tree cover in riparian buffers, and total tree cover within entire watersheds. A number of issues associated with mapping using Ikonos imagery were encountered, including differences in phenological and atmospheric conditions, shadowing within canopies and between scene elements, and limited spectral discrimination of cover types. We report on both the capabilities and limitations of Ikonos imagery for these applications, and considerations for extending these analyses to other areas.
Tedersoo, Leho; Bahram, Mohammad; Cajthaml, Tomáš; Põlme, Sergei; Hiiesalu, Indrek; Anslan, Sten; Harend, Helery; Buegger, Franz; Pritsch, Karin; Koricheva, Julia; Abarenkov, Kessy
2016-01-01
Plant species richness and the presence of certain influential species (sampling effect) drive the stability and functionality of ecosystems as well as primary production and biomass of consumers. However, little is known about these floristic effects on richness and community composition of soil biota in forest habitats owing to methodological constraints. We developed a DNA metabarcoding approach to identify the major eukaryote groups directly from soil with roughly species-level resolution. Using this method, we examined the effects of tree diversity and individual tree species on soil microbial biomass and taxonomic richness of soil biota in two experimental study systems in Finland and Estonia and accounted for edaphic variables and spatial autocorrelation. Our analyses revealed that the effects of tree diversity and individual species on soil biota are largely context dependent. Multiple regression and structural equation modelling suggested that biomass, soil pH, nutrients and tree species directly affect richness of different taxonomic groups. The community composition of most soil organisms was strongly correlated due to similar response to environmental predictors rather than causal relationships. On a local scale, soil resources and tree species have stronger effect on diversity of soil biota than tree species richness per se. PMID:26172210
Tedersoo, Leho; Bahram, Mohammad; Cajthaml, Tomáš; Põlme, Sergei; Hiiesalu, Indrek; Anslan, Sten; Harend, Helery; Buegger, Franz; Pritsch, Karin; Koricheva, Julia; Abarenkov, Kessy
2016-02-01
Plant species richness and the presence of certain influential species (sampling effect) drive the stability and functionality of ecosystems as well as primary production and biomass of consumers. However, little is known about these floristic effects on richness and community composition of soil biota in forest habitats owing to methodological constraints. We developed a DNA metabarcoding approach to identify the major eukaryote groups directly from soil with roughly species-level resolution. Using this method, we examined the effects of tree diversity and individual tree species on soil microbial biomass and taxonomic richness of soil biota in two experimental study systems in Finland and Estonia and accounted for edaphic variables and spatial autocorrelation. Our analyses revealed that the effects of tree diversity and individual species on soil biota are largely context dependent. Multiple regression and structural equation modelling suggested that biomass, soil pH, nutrients and tree species directly affect richness of different taxonomic groups. The community composition of most soil organisms was strongly correlated due to similar response to environmental predictors rather than causal relationships. On a local scale, soil resources and tree species have stronger effect on diversity of soil biota than tree species richness per se.
Hutton, Eileen K; Simioni, Julia C; Thabane, Lehana
2017-08-01
Among women with a fetus with a non-cephalic presentation, external cephalic version (ECV) has been shown to reduce the rate of breech presentation at birth and cesarean birth. Compared with ECV at term, beginning ECV prior to 37 weeks' gestation decreases the number of infants in a non-cephalic presentation at birth. The purpose of this secondary analysis was to investigate factors associated with a successful ECV procedure and to present this in a clinically useful format. Data were collected as part of the Early ECV Pilot and Early ECV2 Trials, which randomized 1776 women with a fetus in breech presentation to either early ECV (34-36 weeks' gestation) or delayed ECV (at or after 37 weeks). The outcome of interest was successful ECV, defined as the fetus being in a cephalic presentation immediately following the procedure, as well as at the time of birth. The importance of several factors in predicting successful ECV was investigated using two statistical methods: logistic regression and classification and regression tree (CART) analyses. Among nulliparas, non-engagement of the presenting part and an easily palpable fetal head were independently associated with success. Among multiparas, non-engagement of the presenting part, gestation less than 37 weeks and an easily palpable fetal head were found to be independent predictors of success. These findings were consistent with results of the CART analyses. Regardless of parity, descent of the presenting part was the most discriminating factor in predicting successful ECV and cephalic presentation at birth. © 2017 Nordic Federation of Societies of Obstetrics and Gynecology.
Habitat features and predictive habitat modeling for the Colorado chipmunk in southern New Mexico
Rivieccio, M.; Thompson, B.C.; Gould, W.R.; Boykin, K.G.
2003-01-01
Two subspecies of Colorado chipmunk (state threatened and federal species of concern) occur in southern New Mexico: Tamias quadrivittatus australis in the Organ Mountains and T. q. oscuraensis in the Oscura Mountains. We developed a GIS model of potentially suitable habitat based on vegetation and elevation features, evaluated site classifications of the GIS model, and determined vegetation and terrain features associated with chipmunk occurrence. We compared GIS model classifications with actual vegetation and elevation features measured at 37 sites. At 60 sites we measured 18 habitat variables regarding slope, aspect, tree species, shrub species, and ground cover. We used logistic regression to analyze habitat variables associated with chipmunk presence/absence. All (100%) 37 sample sites (28 predicted suitable, 9 predicted unsuitable) were classified correctly by the GIS model regarding elevation and vegetation. For 28 sites predicted suitable by the GIS model, 18 sites (64%) appeared visually suitable based on habitat variables selected from logistic regression analyses, of which 10 sites (36%) were specifically predicted as suitable habitat via logistic regression. We detected chipmunks at 70% of sites deemed suitable via the logistic regression models. Shrub cover, tree density, plant proximity, presence of logs, and presence of rock outcrop were retained in the logistic model for the Oscura Mountains; litter, shrub cover, and grass cover were retained in the logistic model for the Organ Mountains. Evaluation of predictive models illustrates the need for multi-stage analyses to best judge performance. Microhabitat analyses indicate prospective needs for different management strategies between the subspecies. Sensitivities of each population of the Colorado chipmunk to natural and prescribed fire suggest that partial burnings of areas inhabited by Colorado chipmunks in southern New Mexico may be beneficial. These partial burnings may later help avoid a fire that could substantially reduce habitat of chipmunks over a mountain range.
What contributes to perceived stress in later life? A recursive partitioning approach.
Scott, Stacey B; Jackson, Brenda R; Bergeman, C S
2011-12-01
One possible explanation for the individual differences in outcomes of stress is the diversity of inputs that produce perceptions of being stressed. The current study examines how combinations of contextual features (e.g., social isolation, neighborhood quality, health problems, age discrimination, financial concerns, and recent life events) of later life contribute to overall feelings of stress. Recursive partitioning techniques (regression trees and random forests) were used to examine unique interrelations between predictors of perceived stress in a sample of 282 community-dwelling adults. Trees provided possible examples of equifinality (i.e., subsets of people with similar levels of perceived stress but different predictors) as well as identification both of contextual combinations that separated participants with very high and very low perceived stress. Random forest analyses aggregated across many trees based on permuted versions of the data and predictors; loneliness, financial strain, neighborhood strain, ageism, and to some extent life events emerged as important predictors. Interviews with a subsample of participants provided both thick description of the complex relationships identified in the trees, as well as additional risks not appearing in the survey results. Together, the analyses highlight what may be missed when stress is used as a simple unidimensional construct and can guide differential intervention efforts.
What contributes to perceived stress in later life? A recursive partitioning approach
Scott, Stacey B.; Jackson, Brenda R.; Bergeman, C. S.
2011-01-01
One possible explanation for the individual differences in outcomes of stress is the diversity of inputs that produce perceptions of being stressed. The current study examines how combinations of contextual features (e.g., social isolation, neighborhood quality, health problems, age discrimination, financial concerns, and recent life events) of later life contribute to overall feelings of stress. Recursive partitioning techniques (regression trees and random forests) were used to examine unique interrelations between predictors of perceived stress in a sample of 282 community-dwelling adults. Trees provided possible examples of equifinality (i.e., subsets of people with similar levels of perceived stress but different predictors) as well as for the identification both of contextual combinations that separated participants with very high and very low perceived stress. Random forest analyses aggregated across many trees based on permuted versions of the data and predictors; loneliness, financial strain, neighborhood strain, ageism, and to some extent life events emerged as important predictors. Interviews with a subsample of participants provided both thick description of the complex relationships identified in the trees, as well as additional risks not appearing in the survey results. Together, the analyses highlight what may be missed when stress is used as a simple unidimensional construct and can guide differential intervention efforts. PMID:21604885
Duncan, Dustin T; Kawachi, Ichiro; Kum, Susan; Aldstadt, Jared; Piras, Gianfranco; Matthews, Stephen A; Arbia, Giuseppe; Castro, Marcia C; White, Kellee; Williams, David R
2014-04-01
The racial/ethnic and income composition of neighborhoods often influences local amenities, including the potential spatial distribution of trees, which are important for population health and community wellbeing, particularly in urban areas. This ecological study used spatial analytical methods to assess the relationship between neighborhood socio-demographic characteristics (i.e. minority racial/ethnic composition and poverty) and tree density at the census tact level in Boston, Massachusetts (US). We examined spatial autocorrelation with the Global Moran's I for all study variables and in the ordinary least squares (OLS) regression residuals as well as computed Spearman correlations non-adjusted and adjusted for spatial autocorrelation between socio-demographic characteristics and tree density. Next, we fit traditional regressions (i.e. OLS regression models) and spatial regressions (i.e. spatial simultaneous autoregressive models), as appropriate. We found significant positive spatial autocorrelation for all neighborhood socio-demographic characteristics (Global Moran's I range from 0.24 to 0.86, all P =0.001), for tree density (Global Moran's I =0.452, P =0.001), and in the OLS regression residuals (Global Moran's I range from 0.32 to 0.38, all P <0.001). Therefore, we fit the spatial simultaneous autoregressive models. There was a negative correlation between neighborhood percent non-Hispanic Black and tree density (r S =-0.19; conventional P -value=0.016; spatially adjusted P -value=0.299) as well as a negative correlation between predominantly non-Hispanic Black (over 60% Black) neighborhoods and tree density (r S =-0.18; conventional P -value=0.019; spatially adjusted P -value=0.180). While the conventional OLS regression model found a marginally significant inverse relationship between Black neighborhoods and tree density, we found no statistically significant relationship between neighborhood socio-demographic composition and tree density in the spatial regression models. Methodologically, our study suggests the need to take into account spatial autocorrelation as findings/conclusions can change when the spatial autocorrelation is ignored. Substantively, our findings suggest no need for policy intervention vis-à-vis trees in Boston, though we hasten to add that replication studies, and more nuanced data on tree quality, age and diversity are needed.
Sankari, E Siva; Manimegalai, D
2017-12-21
Predicting membrane protein types is an important and challenging research area in bioinformatics and proteomics. Traditional biophysical methods are used to classify membrane protein types. Due to large exploration of uncharacterized protein sequences in databases, traditional methods are very time consuming, expensive and susceptible to errors. Hence, it is highly desirable to develop a robust, reliable, and efficient method to predict membrane protein types. Imbalanced datasets and large datasets are often handled well by decision tree classifiers. Since imbalanced datasets are taken, the performance of various decision tree classifiers such as Decision Tree (DT), Classification And Regression Tree (CART), C4.5, Random tree, REP (Reduced Error Pruning) tree, ensemble methods such as Adaboost, RUS (Random Under Sampling) boost, Rotation forest and Random forest are analysed. Among the various decision tree classifiers Random forest performs well in less time with good accuracy of 96.35%. Another inference is RUS boost decision tree classifier is able to classify one or two samples in the class with very less samples while the other classifiers such as DT, Adaboost, Rotation forest and Random forest are not sensitive for the classes with fewer samples. Also the performance of decision tree classifiers is compared with SVM (Support Vector Machine) and Naive Bayes classifier. Copyright © 2017 Elsevier Ltd. All rights reserved.
Assessing visual green effects of individual urban trees using airborne Lidar data.
Chen, Ziyue; Xu, Bing; Gao, Bingbo
2015-12-01
Urban trees benefit people's daily life in terms of air quality, local climate, recreation and aesthetics. Among these functions, a growing number of studies have been conducted to understand the relationship between residents' preference towards local environments and visual green effects of urban greenery. However, except for on-site photography, there are few quantitative methods to calculate green visibility, especially tree green visibility, from viewers' perspectives. To fill this research gap, a case study was conducted in the city of Cambridge, which has a diversity of tree species, sizes and shapes. Firstly, a photograph-based survey was conducted to approximate the actual value of visual green effects of individual urban trees. In addition, small footprint airborne Lidar (Light detection and ranging) data was employed to measure the size and shape of individual trees. Next, correlations between visual tree green effects and tree structural parameters were examined. Through experiments and gradual refinement, a regression model with satisfactory R2 and limited large errors is proposed. Considering the diversity of sample trees and the result of cross-validation, this model has the potential to be applied to other study sites. This research provides urban planners and decision makers with an innovative method to analyse and evaluate landscape patterns in terms of tree greenness. Copyright © 2015 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Mfumu Kihumba, Antoine; Ndembo Longo, Jean; Vanclooster, Marnik
2016-03-01
A multivariate statistical modelling approach was applied to explain the anthropogenic pressure of nitrate pollution on the Kinshasa groundwater body (Democratic Republic of Congo). Multiple regression and regression tree models were compared and used to identify major environmental factors that control the groundwater nitrate concentration in this region. The analyses were made in terms of physical attributes related to the topography, land use, geology and hydrogeology in the capture zone of different groundwater sampling stations. For the nitrate data, groundwater datasets from two different surveys were used. The statistical models identified the topography, the residential area, the service land (cemetery), and the surface-water land-use classes as major factors explaining nitrate occurrence in the groundwater. Also, groundwater nitrate pollution depends not on one single factor but on the combined influence of factors representing nitrogen loading sources and aquifer susceptibility characteristics. The groundwater nitrate pressure was better predicted with the regression tree model than with the multiple regression model. Furthermore, the results elucidated the sensitivity of the model performance towards the method of delineation of the capture zones. For pollution modelling at the monitoring points, therefore, it is better to identify capture-zone shapes based on a conceptual hydrogeological model rather than to adopt arbitrary circular capture zones.
Louis R Iverson; Anantha M. Prasad; Mark W. Schwartz; Mark W. Schwartz
2005-01-01
We predict current distribution and abundance for tree species present in eastern North America, and subsequently estimate potential suitable habitat for those species under a changed climate with 2 x CO2. We used a series of statistical models (i.e., Regression Tree Analysis (RTA), Multivariate Adaptive Regression Splines (MARS), Bagging Trees (...
Climate Response of Tree Radial Growth at Different Timescales in the Qinling Mountains.
Sun, Changfeng; Liu, Yu
2016-01-01
The analysis of the tree radial growth response to climate is crucial for dendroclimatological research. However, the response relationships between tree-ring indices and climatic factors at different timescales are not yet clear. In this study, the tree-ring width of Huashan pine (Pinus armandii) from Huashan in the Qinling Mountains, north-central China, was used to explore the response differences of tree growth to climatic factors at daily, pentad (5 days), dekad (10 days) and monthly timescales. Correlation function and linear regression analysis were applied in this paper. The tree-ring width showed a more sensitive response to daily and pentad climatic factors. With the timescale decreasing, the absolute value of the maximum correlation coefficient between the tree-ring data and precipitation increases as well as temperature (mean, minimum and maximum temperature). Compared to the other three timescales, pentad was more suitable for analysing the response of tree growth to climate. Relative to the monthly climate data, the association between the tree-ring data and the pentad climate data was more remarkable and accurate, and the reconstruction function based on the pentad climate was also more reliable and stable. We found that the major climatic factor limiting Huashan pine growth was the precipitation of pentads 20-35 (from April 6 to June 24) rather than the well-known April-June precipitation. The pentad was also proved to be a better timescale for analysing the climate and tree growth in the western and eastern Qinling Mountains. The formation of the earlywood density of Chinese pine (Pinus tabulaeformis) from Shimenshan in western Qinling was mainly affected by the maximum temperature of pentads 28-32 (from May 16 to June 9). The maximum temperature of pentads 28-33 (from May 16 to June 14) was the major factor affecting the ring width of Chinese pine from Shirenshan in eastern Qinling.
NASA Astrophysics Data System (ADS)
Park, J.; Yoo, K.
2013-12-01
For groundwater resource conservation, it is important to accurately assess groundwater pollution sensitivity or vulnerability. In this work, we attempted to use data mining approach to assess groundwater pollution vulnerability in a TCE (trichloroethylene) contaminated Korean industrial site. The conventional DRASTIC method failed to describe TCE sensitivity data with a poor correlation with hydrogeological properties. Among the different data mining methods such as Artificial Neural Network (ANN), Multiple Logistic Regression (MLR), Case Base Reasoning (CBR), and Decision Tree (DT), the accuracy and consistency of Decision Tree (DT) was the best. According to the following tree analyses with the optimal DT model, the failure of the conventional DRASTIC method in fitting with TCE sensitivity data may be due to the use of inaccurate weight values of hydrogeological parameters for the study site. These findings provide a proof of concept that DT based data mining approach can be used in predicting and rule induction of groundwater TCE sensitivity without pre-existing information on weights of hydrogeological properties.
Charles E. Rose; Thomas B. Lynch
2001-01-01
A method was developed for estimating parameters in an individual tree basal area growth model using a system of equations based on dbh rank classes. The estimation method developed is a compromise between an individual tree and a stand level basal area growth model that accounts for the correlation between trees within a plot by using seemingly unrelated regression (...
Susan L. King
2003-01-01
The performance of two classifiers, logistic regression and neural networks, are compared for modeling noncatastrophic individual tree mortality for 21 species of trees in West Virginia. The output of the classifier is usually a continuous number between 0 and 1. A threshold is selected between 0 and 1 and all of the trees below the threshold are classified as...
Thomas A. Hanley; Cathy L. Rose
1987-01-01
Snow depth and density were measured in 33 stands of western hemlock-Sitka spruce (Tsuga heterophylla [Rat] Sarg.-Picea sitchensis [Bong.] Carr.) over a 3-year period. The stands, near Juneau, Alaska, provided broad ranges of species composition, age, over-story canopy coverage, tree density, and wood volume. Stepwise multiple regression analyses indicated that both...
Duncan, Dustin T.; Kawachi, Ichiro; Kum, Susan; Aldstadt, Jared; Piras, Gianfranco; Matthews, Stephen A.; Arbia, Giuseppe; Castro, Marcia C.; White, Kellee; Williams, David R.
2017-01-01
The racial/ethnic and income composition of neighborhoods often influences local amenities, including the potential spatial distribution of trees, which are important for population health and community wellbeing, particularly in urban areas. This ecological study used spatial analytical methods to assess the relationship between neighborhood socio-demographic characteristics (i.e. minority racial/ethnic composition and poverty) and tree density at the census tact level in Boston, Massachusetts (US). We examined spatial autocorrelation with the Global Moran’s I for all study variables and in the ordinary least squares (OLS) regression residuals as well as computed Spearman correlations non-adjusted and adjusted for spatial autocorrelation between socio-demographic characteristics and tree density. Next, we fit traditional regressions (i.e. OLS regression models) and spatial regressions (i.e. spatial simultaneous autoregressive models), as appropriate. We found significant positive spatial autocorrelation for all neighborhood socio-demographic characteristics (Global Moran’s I range from 0.24 to 0.86, all P=0.001), for tree density (Global Moran’s I=0.452, P=0.001), and in the OLS regression residuals (Global Moran’s I range from 0.32 to 0.38, all P<0.001). Therefore, we fit the spatial simultaneous autoregressive models. There was a negative correlation between neighborhood percent non-Hispanic Black and tree density (rS=−0.19; conventional P-value=0.016; spatially adjusted P-value=0.299) as well as a negative correlation between predominantly non-Hispanic Black (over 60% Black) neighborhoods and tree density (rS=−0.18; conventional P-value=0.019; spatially adjusted P-value=0.180). While the conventional OLS regression model found a marginally significant inverse relationship between Black neighborhoods and tree density, we found no statistically significant relationship between neighborhood socio-demographic composition and tree density in the spatial regression models. Methodologically, our study suggests the need to take into account spatial autocorrelation as findings/conclusions can change when the spatial autocorrelation is ignored. Substantively, our findings suggest no need for policy intervention vis-à-vis trees in Boston, though we hasten to add that replication studies, and more nuanced data on tree quality, age and diversity are needed. PMID:29354668
Huang, C.; Townshend, J.R.G.
2003-01-01
A stepwise regression tree (SRT) algorithm was developed for approximating complex nonlinear relationships. Based on the regression tree of Breiman et al . (BRT) and a stepwise linear regression (SLR) method, this algorithm represents an improvement over SLR in that it can approximate nonlinear relationships and over BRT in that it gives more realistic predictions. The applicability of this method to estimating subpixel forest was demonstrated using three test data sets, on all of which it gave more accurate predictions than SLR and BRT. SRT also generated more compact trees and performed better than or at least as well as BRT at all 10 equal forest proportion interval ranging from 0 to 100%. This method is appealing to estimating subpixel land cover over large areas.
ERIC Educational Resources Information Center
Koon, Sharon; Petscher, Yaacov
2015-01-01
The purpose of this report was to explicate the use of logistic regression and classification and regression tree (CART) analysis in the development of early warning systems. It was motivated by state education leaders' interest in maintaining high classification accuracy while simultaneously improving practitioner understanding of the rules by…
Westreich, Daniel; Lessler, Justin; Funk, Michele Jonsson
2010-08-01
Propensity scores for the analysis of observational data are typically estimated using logistic regression. Our objective in this review was to assess machine learning alternatives to logistic regression, which may accomplish the same goals but with fewer assumptions or greater accuracy. We identified alternative methods for propensity score estimation and/or classification from the public health, biostatistics, discrete mathematics, and computer science literature, and evaluated these algorithms for applicability to the problem of propensity score estimation, potential advantages over logistic regression, and ease of use. We identified four techniques as alternatives to logistic regression: neural networks, support vector machines, decision trees (classification and regression trees [CART]), and meta-classifiers (in particular, boosting). Although the assumptions of logistic regression are well understood, those assumptions are frequently ignored. All four alternatives have advantages and disadvantages compared with logistic regression. Boosting (meta-classifiers) and, to a lesser extent, decision trees (particularly CART), appear to be most promising for use in the context of propensity score analysis, but extensive simulation studies are needed to establish their utility in practice. Copyright (c) 2010 Elsevier Inc. All rights reserved.
Lofaro, Danilo; Jager, Kitty J; Abu-Hanna, Ameen; Groothoff, Jaap W; Arikoski, Pekka; Hoecker, Britta; Roussey-Kesler, Gwenaelle; Spasojević, Brankica; Verrina, Enrico; Schaefer, Franz; van Stralen, Karlijn J
2016-02-01
Identification of patient groups by risk of renal graft loss might be helpful for accurate patient counselling and clinical decision-making. Survival tree models are an alternative statistical approach to identify subgroups, offering cut-off points for covariates and an easy-to-interpret representation. Within the European Society of Pediatric Nephrology/European Renal Association-European Dialysis and Transplant Association (ESPN/ERA-EDTA) Registry data we identified paediatric patient groups with specific profiles for 5-year renal graft survival. Two analyses were performed, including (i) parameters known at time of transplantation and (ii) additional clinical measurements obtained early after transplantation. The identified subgroups were added as covariates in two survival models. The prognostic performance of the models was tested and compared with conventional Cox regression analyses. The first analysis included 5275 paediatric renal transplants. The best 5-year graft survival (90.4%) was found among patients who received a renal graft as a pre-emptive transplantation or after short-term dialysis (<45 days), whereas graft survival was poorest (51.7%) in adolescents transplanted after long-term dialysis (>2.2 years). The Cox model including both pre-transplant factors and tree subgroups had a significantly better predictive performance than conventional Cox regression (P < 0.001). In the analysis including clinical factors, graft survival ranged from 97.3% [younger patients with estimated glomerular filtration rate (eGFR) >30 mL/min/1.73 m(2) and dialysis <20 months] to 34.7% (adolescents with eGFR <60 mL/min/1.73 m(2) and dialysis >20 months). Also in this case combining tree findings and clinical factors improved the predictive performance as compared with conventional Cox model models (P < 0.0001). In conclusion, we demonstrated the tree model to be an accurate and attractive tool to predict graft failure for patients with specific characteristics. This may aid the evaluation of individual graft prognosis and thereby the design of measures to improve graft survival in the poor prognosis groups. © The Author 2015. Published by Oxford University Press on behalf of ERA-EDTA. All rights reserved.
Eric H. Wharton; Tiberius Cunia
1987-01-01
Proceedings of a workshop co-sponsored by the USDA Forest Service, the State University of New York, and the Society of American Foresters. Presented were papers on the methodology of sample tree selection, tree biomass measurement, construction of biomass tables and estimation of their error, and combining the error of biomass tables with that of the sample plots or...
Chen, Guangchao; Li, Xuehua; Chen, Jingwen; Zhang, Ya-Nan; Peijnenburg, Willie J G M
2014-12-01
Biodegradation is the principal environmental dissipation process of chemicals. As such, it is a dominant factor determining the persistence and fate of organic chemicals in the environment, and is therefore of critical importance to chemical management and regulation. In the present study, the authors developed in silico methods assessing biodegradability based on a large heterogeneous set of 825 organic compounds, using the techniques of the C4.5 decision tree, the functional inner regression tree, and logistic regression. External validation was subsequently carried out by 2 independent test sets of 777 and 27 chemicals. As a result, the functional inner regression tree exhibited the best predictability with predictive accuracies of 81.5% and 81.0%, respectively, on the training set (825 chemicals) and test set I (777 chemicals). Performance of the developed models on the 2 test sets was subsequently compared with that of the Estimation Program Interface (EPI) Suite Biowin 5 and Biowin 6 models, which also showed a better predictability of the functional inner regression tree model. The model built in the present study exhibits a reasonable predictability compared with existing models while possessing a transparent algorithm. Interpretation of the mechanisms of biodegradation was also carried out based on the models developed. © 2014 SETAC.
Using decision trees to understand structure in missing data
Tierney, Nicholas J; Harden, Fiona A; Harden, Maurice J; Mengersen, Kerrie L
2015-01-01
Objectives Demonstrate the application of decision trees—classification and regression trees (CARTs), and their cousins, boosted regression trees (BRTs)—to understand structure in missing data. Setting Data taken from employees at 3 different industrial sites in Australia. Participants 7915 observations were included. Materials and methods The approach was evaluated using an occupational health data set comprising results of questionnaires, medical tests and environmental monitoring. Statistical methods included standard statistical tests and the ‘rpart’ and ‘gbm’ packages for CART and BRT analyses, respectively, from the statistical software ‘R’. A simulation study was conducted to explore the capability of decision tree models in describing data with missingness artificially introduced. Results CART and BRT models were effective in highlighting a missingness structure in the data, related to the type of data (medical or environmental), the site in which it was collected, the number of visits, and the presence of extreme values. The simulation study revealed that CART models were able to identify variables and values responsible for inducing missingness. There was greater variation in variable importance for unstructured as compared to structured missingness. Discussion Both CART and BRT models were effective in describing structural missingness in data. CART models may be preferred over BRT models for exploratory analysis of missing data, and selecting variables important for predicting missingness. BRT models can show how values of other variables influence missingness, which may prove useful for researchers. Conclusions Researchers are encouraged to use CART and BRT models to explore and understand missing data. PMID:26124509
NASA Astrophysics Data System (ADS)
Muller, Sybrand Jacobus; van Niekerk, Adriaan
2016-07-01
Soil salinity often leads to reduced crop yield and quality and can render soils barren. Irrigated areas are particularly at risk due to intensive cultivation and secondary salinization caused by waterlogging. Regular monitoring of salt accumulation in irrigation schemes is needed to keep its negative effects under control. The dynamic spatial and temporal characteristics of remote sensing can provide a cost-effective solution for monitoring salt accumulation at irrigation scheme level. This study evaluated a range of pan-fused SPOT-5 derived features (spectral bands, vegetation indices, image textures and image transformations) for classifying salt-affected areas in two distinctly different irrigation schemes in South Africa, namely Vaalharts and Breede River. The relationship between the input features and electro conductivity measurements were investigated using regression modelling (stepwise linear regression, partial least squares regression, curve fit regression modelling) and supervised classification (maximum likelihood, nearest neighbour, decision tree analysis, support vector machine and random forests). Classification and regression trees and random forest were used to select the most important features for differentiating salt-affected and unaffected areas. The results showed that the regression analyses produced weak models (<0.4 R squared). Better results were achieved using the supervised classifiers, but the algorithms tend to over-estimate salt-affected areas. A key finding was that none of the feature sets or classification algorithms stood out as being superior for monitoring salt accumulation at irrigation scheme level. This was attributed to the large variations in the spectral responses of different crops types at different growing stages, coupled with their individual tolerances to saline conditions.
Stemflow in low-density and hedgerow olive orchards in Portugal
NASA Astrophysics Data System (ADS)
Dias, Pedro D.; Valente, Fernanda; Pereira, Fernando L.; Abreu, Francisco G.
2015-04-01
Stemflow (Sf) is responsible for a localized water and solute input to soil around tree's trunks, playing an important eco-hydrological role in forest and agricultural ecosystems. Sf was monitored for seven months in 25 Olea europaea L. trees distributed in three orchards managed in two different ways, traditional low-density and super high density hedgerow. The orchards were located in central Portugal in the regions of Santarém (Várzea and Azóia) and Lisboa (Tapada). Seven olive varieties were analysed: Arbequina, Galega, Picual, Maçanilha, Cordovil, Azeiteira, Negrinha and Blanqueta. Measured Sf ranged from 7.5 to 87.2 mm (relative to crown-projected area), corresponding to 1.2 and 16.7% of gross rainfall (Pg). To understand better the variables that affect Sf and to be able to predict its value, linear regression models were fitted to these data. Whenever possible, the linear models were simplified using the backward stepwise algorithm based on the Akaike information criterion. For each tree, multiple linear regressions were adjusted between Sf and the duration, volume and intensity of rainfall episodes and maximum evaporation rate. In the low-density Várzea grove the more relevant explanatory variables were the three rainfall characteristics. In the super high density Azóia orchard only rainfall volume and intensity were considered relevant. In the low-density Tapada's grove all trees had a different sub-model with Pg being the only common variable. To try to explain differences between trees and to improve the quality of the modeling in each orchard, another set of explanatory variables was added: canopy volume, tree and trunk heights and trunk perimeter at the height of the first branches. The variables present in all sub-models were rainfall volume and intensity and the tree and trunk heights. Canopy volume and rainfall duration were also present in the sub-models of the two low-density groves (Tapada and Várzea). The determination coefficient (R2) of all models ranged from 0.5 to 0.76. The size of leaves was also analysed. Although there were significant differences between varieties and between trees of the same variety, they did not seem to affect the amount of Sf generated. Through analysis of bark storage capacity, it was found that older trees, with rough and thick bark, had higher trunk storage capacity and, therefore, originated less Sf. The results confirm the need for considering the contribution of stemflow when trying to correctly assess interception loss in olive orchards. Although the use of simple and general statistical models may be an attractive option, their precision may be small, making direct measurements or conceptual modelling preferable methods.
Kumar, Rajesh; Dogra, Vishal; Rani, Khushbu; Sahu, Kanti
2017-01-01
District level determinants of total fertility rate in Empowered Action Group states of India can help in ongoing population stabilization programs in India. Present study intends to assess the role of district level determinants in predicting total fertility rate among districts of the Empowered Action Group states of India. Data from Annual Health Survey (2011-12) was analysed using STATA and R software packages. Multiple linear regression models were built and evaluated using Akaike Information Criterion. For further understanding, recursive partitioning was used to prepare a regression tree. Female married illiteracy positively associated with total fertility rate and explained more than half (53%) of variance. Under multiple linear regression model, married illiteracy, infant mortality rate, Ante natal care registration, household size, median age of live birth and sex ratio explained 70% of total variance in total fertility rate. In regression tree, female married illiteracy was the root node and splits at 42% determined TFR <= 2.7. The next left side branch was again married illiteracy with splits at 23% to determine TFR <= 2.1. We conclude that female married illiteracy is one of the most important determinants explaining total fertility rate among the districts of an Empowered Action Group states. Focus on female literacy is required to stabilize the population growth in long run.
Association of tRNA methyltransferase NSUN2/IGF-II molecular signature with ovarian cancer survival.
Yang, Jia-Cheng; Risch, Eric; Zhang, Meiqin; Huang, Chan; Huang, Huatian; Lu, Lingeng
2017-09-01
To investigate the association between NSUN2/IGF-II signature and ovarian cancer survival. Using a publicly accessible dataset of RNA sequencing and clinical follow-up data, we performed Classification and Regression Tree and survival analyses. Patients with NSUN2 high IGF-II low had significantly superior overall and disease progression-free survival, followed by NSUN2 low IGF-II low , NSUN2 high IGF-II high and NSUN2 low IGF-II high (p < 0.0001 for overall, p = 0.0024 for progression-free survival, respectively). The associations of NSUN2/IGF-II signature with the risks of death and relapse remained significant in multivariate Cox regression models. Random-effects meta-analyses show the upregulated NSUN2 and IGF-II expression in ovarian cancer versus normal tissues. The NSUN2/IGF-II signature associates with heterogeneous outcome and may have clinical implications in managing ovarian cancer.
Mining Health App Data to Find More and Less Successful Weight Loss Subgroups
2016-01-01
Background More than half of all smartphone app downloads involve weight, diet, and exercise. If successful, these lifestyle apps may have far-reaching effects for disease prevention and health cost-savings, but few researchers have analyzed data from these apps. Objective The purposes of this study were to analyze data from a commercial health app (Lose It!) in order to identify successful weight loss subgroups via exploratory analyses and to verify the stability of the results. Methods Cross-sectional, de-identified data from Lose It! were analyzed. This dataset (n=12,427,196) was randomly split into 24 subsamples, and this study used 3 subsamples (combined n=972,687). Classification and regression tree methods were used to explore groupings of weight loss with one subsample, with descriptive analyses to examine other group characteristics. Data mining validation methods were conducted with 2 additional subsamples. Results In subsample 1, 14.96% of users lost 5% or more of their starting body weight. Classification and regression tree analysis identified 3 distinct subgroups: “the occasional users” had the lowest proportion (4.87%) of individuals who successfully lost weight; “the basic users” had 37.61% weight loss success; and “the power users” achieved the highest percentage of weight loss success at 72.70%. Behavioral factors delineated the subgroups, though app-related behavioral characteristics further distinguished them. Results were replicated in further analyses with separate subsamples. Conclusions This study demonstrates that distinct subgroups can be identified in “messy” commercial app data and the identified subgroups can be replicated in independent samples. Behavioral factors and use of custom app features characterized the subgroups. Targeting and tailoring information to particular subgroups could enhance weight loss success. Future studies should replicate data mining analyses to increase methodology rigor. PMID:27301853
Beauregard, Eric; Deslauriers-Varin, Nadine; St-Yves, Michel
2010-09-01
Most studies of confessions have looked at the influence of individual factors, neglecting the potential interactions between these factors and their impact on the decision to confess or not during an interrogation. Classification and regression tree analyses conducted on a sample of 624 convicted sex offenders showed that certain factors related to the offenders (e.g., personality, criminal career), victims (e.g., sex, relationship to offender), and case (e.g., time of day of the crime) were related to the decision to confess or not during the police interrogation. Several interactions were also observed between these factors. Results will be discussed in light of previous findings and interrogation strategies for sex offenders.
Suchetana, Bihu; Rajagopalan, Balaji; Silverstein, JoAnn
2017-11-15
A regression tree-based diagnostic approach is developed to evaluate factors affecting US wastewater treatment plant compliance with ammonia discharge permit limits using Discharge Monthly Report (DMR) data from a sample of 106 municipal treatment plants for the period of 2004-2008. Predictor variables used to fit the regression tree are selected using random forests, and consist of the previous month's effluent ammonia, influent flow rates and plant capacity utilization. The tree models are first used to evaluate compliance with existing ammonia discharge standards at each facility and then applied assuming more stringent discharge limits, under consideration in many states. The model predicts that the ability to meet both current and future limits depends primarily on the previous month's treatment performance. With more stringent discharge limits predicted ammonia concentration relative to the discharge limit, increases. In-sample validation shows that the regression trees can provide a median classification accuracy of >70%. The regression tree model is validated using ammonia discharge data from an operating wastewater treatment plant and is able to accurately predict the observed ammonia discharge category approximately 80% of the time, indicating that the regression tree model can be applied to predict compliance for individual treatment plants providing practical guidance for utilities and regulators with an interest in controlling ammonia discharges. The proposed methodology is also used to demonstrate how to delineate reliable sources of demand and supply in a point source-to-point source nutrient credit trading scheme, as well as how planners and decision makers can set reasonable discharge limits in future. Copyright © 2017 Elsevier B.V. All rights reserved.
Factor complexity of crash occurrence: An empirical demonstration using boosted regression trees.
Chung, Yi-Shih
2013-12-01
Factor complexity is a characteristic of traffic crashes. This paper proposes a novel method, namely boosted regression trees (BRT), to investigate the complex and nonlinear relationships in high-variance traffic crash data. The Taiwanese 2004-2005 single-vehicle motorcycle crash data are used to demonstrate the utility of BRT. Traditional logistic regression and classification and regression tree (CART) models are also used to compare their estimation results and external validities. Both the in-sample cross-validation and out-of-sample validation results show that an increase in tree complexity provides improved, although declining, classification performance, indicating a limited factor complexity of single-vehicle motorcycle crashes. The effects of crucial variables including geographical, time, and sociodemographic factors explain some fatal crashes. Relatively unique fatal crashes are better approximated by interactive terms, especially combinations of behavioral factors. BRT models generally provide improved transferability than conventional logistic regression and CART models. This study also discusses the implications of the results for devising safety policies. Copyright © 2012 Elsevier Ltd. All rights reserved.
New star on the stage: amount of ray parenchyma in tree rings shows a link to climate.
Olano, José Miguel; Arzac, Alberto; García-Cervigón, Ana I; von Arx, Georg; Rozas, Vicente
2013-04-01
Tree-ring anatomy reflects the year-by-year impact of environmental factors on tree growth. Up to now, research in this field has mainly focused on the hydraulic architecture, with ray parenchyma neglected despite the growing recognition of its relevance for xylem function. Our aim was to address this gap by exploring the potential of the annual patterns of xylem parenchyma as a climate proxy. We constructed ring-width and ray-parenchyma chronologies from 1965 to 2004 for 20 Juniperus thurifera trees growing in a Mediterranean continental climate. Chronologies were related to climate records by means of correlation, multiple regression and partial correlation analyses. Ray parenchyma responded to climatic conditions at critical stages during the xylogenetic process; namely, at the end of the previous year's xylogenesis (October) and at the onset of earlywood (May) and latewood formation (August). Ray parenchyma-based chronologies have potential to complement ring-width chronologies as a tool for climate reconstructions. Furthermore, medium- and low-frequency signals in the variation of ray parenchyma may improve our understanding of how trees respond to environmental fluctuations and to global change. © 2013 The Authors. New Phytologist © 2013 New Phytologist Trust.
Dynamic travel time estimation using regression trees.
DOT National Transportation Integrated Search
2008-10-01
This report presents a methodology for travel time estimation by using regression trees. The dissemination of travel time information has become crucial for effective traffic management, especially under congested road conditions. In the absence of c...
Jose F. Negron
1998-01-01
Infested and uninfested areas within Douglas fir, Pseudotsuga menziesii Mirb.. Franco, stands affected by the Douglas-fir beetle, Dendroctonus pseudotsugae Hopk. were sampled in the Colorado Front Range, CO. Classification tree models were built to predict probabilities of infestation. Regression trees and linear regression analysis were used to model amount of tree...
Using nonlinear quantile regression to estimate the self-thinning boundary curve
Quang V. Cao; Thomas J. Dean
2015-01-01
The relationship between tree size (quadratic mean diameter) and tree density (number of trees per unit area) has been a topic of research and discussion for many decades. Starting with Reineke in 1933, the maximum size-density relationship, on a log-log scale, has been assumed to be linear. Several techniques, including linear quantile regression, have been employed...
NASA Astrophysics Data System (ADS)
Freeman, Mary Pyott
ABSTRACT An Analysis of Tree Mortality Using High Resolution Remotely-Sensed Data for Mixed-Conifer Forests in San Diego County by Mary Pyott Freeman The montane mixed-conifer forests of San Diego County are currently experiencing extensive tree mortality, which is defined as dieback where whole stands are affected. This mortality is likely the result of the complex interaction of many variables, such as altered fire regimes, climatic conditions such as drought, as well as forest pathogens and past management strategies. Conifer tree mortality and its spatial pattern and change over time were examined in three components. In component 1, two remote sensing approaches were compared for their effectiveness in delineating dead trees, a spatial contextual approach and an OBIA (object based image analysis) approach, utilizing various dates and spatial resolutions of airborne image data. For each approach transforms and masking techniques were explored, which were found to improve classifications, and an object-based assessment approach was tested. In component 2, dead tree maps produced by the most effective techniques derived from component 1 were utilized for point pattern and vector analyses to further understand spatio-temporal changes in tree mortality for the years 1997, 2000, 2002, and 2005 for three study areas: Palomar, Volcan and Laguna mountains. Plot-based fieldwork was conducted to further assess mortality patterns. Results indicate that conifer mortality was significantly clustered, increased substantially between 2002 and 2005, and was non-random with respect to tree species and diameter class sizes. In component 3, multiple environmental variables were used in Generalized Linear Model (GLM-logistic regression) and decision tree classifier model development, revealing the importance of climate and topographic factors such as precipitation and elevation, in being able to predict areas of high risk for tree mortality. The results from this study highlight the importance of multi-scale spatial as well as temporal analyses, in order to understand mixed-conifer forest structure, dynamics, and processes of decline, which can lead to more sustainable management of forests with continued natural and anthropogenic disturbance.
Bittencourt, Natalia F N; Ocarino, Juliana M; Mendonça, Luciana D M; Hewett, Timothy E; Fonseca, Sergio T
2012-12-01
Cross-sectional. To investigate predictors of increased frontal plane knee projection angle (FPKPA) in athletes. The underlying mechanisms that lead to increased FPKPA are likely multifactorial and depend on how the musculoskeletal system adapts to the possible interactions between its distal and proximal segments. Bivariate and linear analyses traditionally employed to analyze the occurrence of increased FPKPA are not sufficiently robust to capture complex relationships among predictors. The investigation of nonlinear interactions among biomechanical factors is necessary to further our understanding of the interdependence of lower-limb segments and resultant dynamic knee alignment. The FPKPA was assessed in 101 athletes during a single-leg squat and in 72 athletes at the moment of landing from a jump. The investigated predictors were sex, hip abductor isometric torque, passive range of motion (ROM) of hip internal rotation (IR), and shank-forefoot alignment. Classification and regression trees were used to investigate nonlinear interactions among predictors and their influence on the occurrence of increased FPKPA. During single-leg squatting, the occurrence of high FPKPA was predicted by the interaction between hip abductor isometric torque and passive hip IR ROM. At the moment of landing, the shank-forefoot alignment, abductor isometric torque, and passive hip IR ROM were predictors of high FPKPA. In addition, the classification and regression trees established cutoff points that could be used in clinical practice to identify athletes who are at potential risk for excessive FPKPA. The models captured nonlinear interactions between hip abductor isometric torque, passive hip IR ROM, and shank-forefoot alignment.
NASA Astrophysics Data System (ADS)
Deng, Chengbin; Wu, Changshan
2013-12-01
Urban impervious surface information is essential for urban and environmental applications at the regional/national scales. As a popular image processing technique, spectral mixture analysis (SMA) has rarely been applied to coarse-resolution imagery due to the difficulty of deriving endmember spectra using traditional endmember selection methods, particularly within heterogeneous urban environments. To address this problem, we derived endmember signatures through a least squares solution (LSS) technique with known abundances of sample pixels, and integrated these endmember signatures into SMA for mapping large-scale impervious surface fraction. In addition, with the same sample set, we carried out objective comparative analyses among SMA (i.e. fully constrained and unconstrained SMA) and machine learning (i.e. Cubist regression tree and Random Forests) techniques. Analysis of results suggests three major conclusions. First, with the extrapolated endmember spectra from stratified random training samples, the SMA approaches performed relatively well, as indicated by small MAE values. Second, Random Forests yields more reliable results than Cubist regression tree, and its accuracy is improved with increased sample sizes. Finally, comparative analyses suggest a tentative guide for selecting an optimal approach for large-scale fractional imperviousness estimation: unconstrained SMA might be a favorable option with a small number of samples, while Random Forests might be preferred if a large number of samples are available.
Predicting the limits to tree height using statistical regressions of leaf traits.
Burgess, Stephen S O; Dawson, Todd E
2007-01-01
Leaf morphology and physiological functioning demonstrate considerable plasticity within tree crowns, with various leaf traits often exhibiting pronounced vertical gradients in very tall trees. It has been proposed that the trajectory of these gradients, as determined by regression methods, could be used in conjunction with theoretical biophysical limits to estimate the maximum height to which trees can grow. Here, we examined this approach using published and new experimental data from tall conifer and angiosperm species. We showed that height predictions were sensitive to tree-to-tree variation in the shape of the regression and to the biophysical endpoints selected. We examined the suitability of proposed end-points and their theoretical validity. We also noted that site and environment influenced height predictions considerably. Use of leaf mass per unit area or leaf water potential coupled with vulnerability of twigs to cavitation poses a number of difficulties for predicting tree height. Photosynthetic rate and carbon isotope discrimination show more promise, but in the second case, the complex relationship between light, water availability, photosynthetic capacity and internal conductance to CO(2) must first be characterized.
Association between split selection instability and predictive error in survival trees.
Radespiel-Tröger, M; Gefeller, O; Rabenstein, T; Hothorn, T
2006-01-01
To evaluate split selection instability in six survival tree algorithms and its relationship with predictive error by means of a bootstrap study. We study the following algorithms: logrank statistic with multivariate p-value adjustment without pruning (LR), Kaplan-Meier distance of survival curves (KM), martingale residuals (MR), Poisson regression for censored data (PR), within-node impurity (WI), and exponential log-likelihood loss (XL). With the exception of LR, initial trees are pruned by using split-complexity, and final trees are selected by means of cross-validation. We employ a real dataset from a clinical study of patients with gallbladder stones. The predictive error is evaluated using the integrated Brier score for censored data. The relationship between split selection instability and predictive error is evaluated by means of box-percentile plots, covariate and cutpoint selection entropy, and cutpoint selection coefficients of variation, respectively, in the root node. We found a positive association between covariate selection instability and predictive error in the root node. LR yields the lowest predictive error, while KM and MR yield the highest predictive error. The predictive error of survival trees is related to split selection instability. Based on the low predictive error of LR, we recommend the use of this algorithm for the construction of survival trees. Unpruned survival trees with multivariate p-value adjustment can perform equally well compared to pruned trees. The analysis of split selection instability can be used to communicate the results of tree-based analyses to clinicians and to support the application of survival trees.
L.R. Iverson; A.M. Prasad; A. Liaw
2004-01-01
More and better machine learning tools are becoming available for landscape ecologists to aid in understanding species-environment relationships and to map probable species occurrence now and potentially into the future. To thal end, we evaluated three statistical models: Regression Tree Analybib (RTA), Bagging Trees (BT) and Random Forest (RF) for their utility in...
Equations for predicting biomass in 2- to 6-year-old Eucalyptus saligna in Hawaii
Craig D. Whitesell; Susan C. Miyasaka; Robert F. Strand; Thomas H. Schubert; Katharine E. McDuffie
1988-01-01
Eucalyptus saligna trees grown in short-rotation plantations on the island of Hawaii were measured, harvested, and weighed to provide data for developing regression equations using non-destructive stand measurements. Regression analysis of the data from 190 trees in the 2.0- to 3.5-year range and 96 trees in the 4- to 6-year range related stem-only...
Bianca N.I. Eskelson; Hailemariam Temesgen; Tara M. Barrett
2009-01-01
Cavity tree and snag abundance data are highly variable and contain many zero observations. We predict cavity tree and snag abundance from variables that are readily available from forest cover maps or remotely sensed data using negative binomial (NB), zero-inflated NB, and zero-altered NB (ZANB) regression models as well as nearest neighbor (NN) imputation methods....
Lo, Benjamin W Y; Fukuda, Hitoshi; Angle, Mark; Teitelbaum, Jeanne; Macdonald, R Loch; Farrokhyar, Forough; Thabane, Lehana; Levine, Mitchell A H
2016-01-01
Classification and regression tree analysis involves the creation of a decision tree by recursive partitioning of a dataset into more homogeneous subgroups. Thus far, there is scarce literature on using this technique to create clinical prediction tools for aneurysmal subarachnoid hemorrhage (SAH). The classification and regression tree analysis technique was applied to the multicenter Tirilazad database (3551 patients) in order to create the decision-making algorithm. In order to elucidate prognostic subgroups in aneurysmal SAH, neurologic, systemic, and demographic factors were taken into account. The dependent variable used for analysis was the dichotomized Glasgow Outcome Score at 3 months. Classification and regression tree analysis revealed seven prognostic subgroups. Neurological grade, occurrence of post-admission stroke, occurrence of post-admission fever, and age represented the explanatory nodes of this decision tree. Split sample validation revealed classification accuracy of 79% for the training dataset and 77% for the testing dataset. In addition, the occurrence of fever at 1-week post-aneurysmal SAH is associated with increased odds of post-admission stroke (odds ratio: 1.83, 95% confidence interval: 1.56-2.45, P < 0.01). A clinically useful classification tree was generated, which serves as a prediction tool to guide bedside prognostication and clinical treatment decision making. This prognostic decision-making algorithm also shed light on the complex interactions between a number of risk factors in determining outcome after aneurysmal SAH.
Amini, Payam; Maroufizadeh, Saman; Samani, Reza Omani; Hamidi, Omid; Sepidarkish, Mahdi
2017-06-01
Preterm birth (PTB) is a leading cause of neonatal death and the second biggest cause of death in children under five years of age. The objective of this study was to determine the prevalence of PTB and its associated factors using logistic regression and decision tree classification methods. This cross-sectional study was conducted on 4,415 pregnant women in Tehran, Iran, from July 6-21, 2015. Data were collected by a researcher-developed questionnaire through interviews with mothers and review of their medical records. To evaluate the accuracy of the logistic regression and decision tree methods, several indices such as sensitivity, specificity, and the area under the curve were used. The PTB rate was 5.5% in this study. The logistic regression outperformed the decision tree for the classification of PTB based on risk factors. Logistic regression showed that multiple pregnancies, mothers with preeclampsia, and those who conceived with assisted reproductive technology had an increased risk for PTB ( p < 0.05). Identifying and training mothers at risk as well as improving prenatal care may reduce the PTB rate. We also recommend that statisticians utilize the logistic regression model for the classification of risk groups for PTB.
Boosted Regression Tree Models to Explain Watershed Nutrient Concentrations and Biological Condition
Boosted regression tree (BRT) models were developed to quantify the nonlinear relationships between landscape variables and nutrient concentrations in a mesoscale mixed land cover watershed during base-flow conditions. Factors that affect instream biological components, based on ...
Differences in Risk Factors for Rotator Cuff Tears between Elderly Patients and Young Patients.
Watanabe, Akihisa; Ono, Qana; Nishigami, Tomohiko; Hirooka, Takahiko; Machida, Hirohisa
2018-02-01
It has been unclear whether the risk factors for rotator cuff tears are the same at all ages or differ between young and older populations. In this study, we examined the risk factors for rotator cuff tears using classification and regression tree analysis as methods of nonlinear regression analysis. There were 65 patients in the rotator cuff tears group and 45 patients in the intact rotator cuff group. Classification and regression tree analysis was performed to predict rotator cuff tears. The target factor was rotator cuff tears; explanatory variables were age, sex, trauma, and critical shoulder angle≥35°. In the results of classification and regression tree analysis, the tree was divided at age 64. For patients aged≥64, the tree was divided at trauma. For patients aged<64, the tree was divided at critical shoulder angle≥35°. The odds ratio for critical shoulder angle≥35° was significant for all ages (5.89), and for patients aged<64 (10.3) while trauma was only a significant factor for patients aged≥64 (5.13). Age, trauma, and critical shoulder angle≥35° were related to rotator cuff tears in this study. However, these risk factors showed different trends according to age group, not a linear relationship.
Preliminary Survey on TRY Forest Traits and Growth Index Relations - New Challenges
NASA Astrophysics Data System (ADS)
Lyubenova, Mariyana; Kattge, Jens; van Bodegom, Peter; Chikalanov, Alexandre; Popova, Silvia; Zlateva, Plamena; Peteva, Simona
2016-04-01
Forest ecosystems provide critical ecosystem goods and services, including food, fodder, water, shelter, nutrient cycling, and cultural and recreational value. Forests also store carbon, provide habitat for a wide range of species and help alleviate land degradation and desertification. Thus they have a potentially significant role to play in climate change adaptation planning through maintaining ecosystem services and providing livelihood options. Therefore the study of forest traits is such an important issue not just for individual countries but for the planet as a whole. We need to know what functional relations between forest traits exactly can express TRY data base and haw it will be significant for the global modeling and IPBES. The study of the biodiversity characteristics at all levels and functional links between them is extremely important for the selection of key indicators for assessing biodiversity and ecosystem services for sustainable natural capital control. By comparing the available information in tree data bases: TRY, ITR (International Tree Ring) and SP-PAM the 42 tree species are selected for the traits analyses. The dependence between location characteristics (latitude, longitude, altitude, annual precipitation, annual temperature and soil type) and forest traits (specific leaf area, leaf weight ratio, wood density and growth index) is studied by by multiply regression analyses (RDA) using the statistical software package Canoco 4.5. The Pearson correlation coefficient (measure of linear correlation), Kendal rank correlation coefficient (non parametric measure of statistical dependence) and Spearman correlation coefficient (monotonic function relationship between two variables) are calculated for each pair of variables (indexes) and species. After analysis of above mentioned correlation coefficients the dimensional linear regression models, multidimensional linear and nonlinear regression models and multidimensional neural networks models are built. The strongest dependence between It and WD was obtained. The research will support the work on: Strategic Plan for Biodiversity 2011-2020, modelling and implementation of ecosystem-based approaches to climate change adaptation and disaster risk reduction. Key words: Specific leaf area (SLA), Leaf weight ratio (LWR), Wood density (WD), Growth index (It)
Bayesian Ensemble Trees (BET) for Clustering and Prediction in Heterogeneous Data
Duan, Leo L.; Clancy, John P.; Szczesniak, Rhonda D.
2016-01-01
We propose a novel “tree-averaging” model that utilizes the ensemble of classification and regression trees (CART). Each constituent tree is estimated with a subset of similar data. We treat this grouping of subsets as Bayesian Ensemble Trees (BET) and model them as a Dirichlet process. We show that BET determines the optimal number of trees by adapting to the data heterogeneity. Compared with the other ensemble methods, BET requires much fewer trees and shows equivalent prediction accuracy using weighted averaging. Moreover, each tree in BET provides variable selection criterion and interpretation for each subset. We developed an efficient estimating procedure with improved estimation strategies in both CART and mixture models. We demonstrate these advantages of BET with simulations and illustrate the approach with a real-world data example involving regression of lung function measurements obtained from patients with cystic fibrosis. Supplemental materials are available online. PMID:27524872
Kumar, Rajesh; Dogra, Vishal; Rani, Khushbu; Sahu, Kanti
2017-01-01
Background: District level determinants of total fertility rate in Empowered Action Group states of India can help in ongoing population stabilization programs in India. Objective: Present study intends to assess the role of district level determinants in predicting total fertility rate among districts of the Empowered Action Group states of India. Material and Methods: Data from Annual Health Survey (2011-12) was analysed using STATA and R software packages. Multiple linear regression models were built and evaluated using Akaike Information Criterion. For further understanding, recursive partitioning was used to prepare a regression tree. Results: Female married illiteracy positively associated with total fertility rate and explained more than half (53%) of variance. Under multiple linear regression model, married illiteracy, infant mortality rate, Ante natal care registration, household size, median age of live birth and sex ratio explained 70% of total variance in total fertility rate. In regression tree, female married illiteracy was the root node and splits at 42% determined TFR <= 2.7. The next left side branch was again married illiteracy with splits at 23% to determine TFR <= 2.1. Conclusion: We conclude that female married illiteracy is one of the most important determinants explaining total fertility rate among the districts of an Empowered Action Group states. Focus on female literacy is required to stabilize the population growth in long run. PMID:29416999
Mani, Ashutosh; Rao, Marepalli; James, Kelley; Bhattacharya, Amit
2015-01-01
The purpose of this study was to explore data-driven models, based on decision trees, to develop practical and easy to use predictive models for early identification of firefighters who are likely to cross the threshold of hyperthermia during live-fire training. Predictive models were created for three consecutive live-fire training scenarios. The final predicted outcome was a categorical variable: will a firefighter cross the upper threshold of hyperthermia - Yes/No. Two tiers of models were built, one with and one without taking into account the outcome (whether a firefighter crossed hyperthermia or not) from the previous training scenario. First tier of models included age, baseline heart rate and core body temperature, body mass index, and duration of training scenario as predictors. The second tier of models included the outcome of the previous scenario in the prediction space, in addition to all the predictors from the first tier of models. Classification and regression trees were used independently for prediction. The response variable for the regression tree was the quantitative variable: core body temperature at the end of each scenario. The predicted quantitative variable from regression trees was compared to the upper threshold of hyperthermia (38°C) to predict whether a firefighter would enter hyperthermia. The performance of classification and regression tree models was satisfactory for the second (success rate = 79%) and third (success rate = 89%) training scenarios but not for the first (success rate = 43%). Data-driven models based on decision trees can be a useful tool for predicting physiological response without modeling the underlying physiological systems. Early prediction of heat stress coupled with proactive interventions, such as pre-cooling, can help reduce heat stress in firefighters.
Austin, Peter C.; Tu, Jack V.; Ho, Jennifer E.; Levy, Daniel; Lee, Douglas S.
2014-01-01
Objective Physicians classify patients into those with or without a specific disease. Furthermore, there is often interest in classifying patients according to disease etiology or subtype. Classification trees are frequently used to classify patients according to the presence or absence of a disease. However, classification trees can suffer from limited accuracy. In the data-mining and machine learning literature, alternate classification schemes have been developed. These include bootstrap aggregation (bagging), boosting, random forests, and support vector machines. Study design and Setting We compared the performance of these classification methods with those of conventional classification trees to classify patients with heart failure according to the following sub-types: heart failure with preserved ejection fraction (HFPEF) vs. heart failure with reduced ejection fraction (HFREF). We also compared the ability of these methods to predict the probability of the presence of HFPEF with that of conventional logistic regression. Results We found that modern, flexible tree-based methods from the data mining literature offer substantial improvement in prediction and classification of heart failure sub-type compared to conventional classification and regression trees. However, conventional logistic regression had superior performance for predicting the probability of the presence of HFPEF compared to the methods proposed in the data mining literature. Conclusion The use of tree-based methods offers superior performance over conventional classification and regression trees for predicting and classifying heart failure subtypes in a population-based sample of patients from Ontario. However, these methods do not offer substantial improvements over logistic regression for predicting the presence of HFPEF. PMID:23384592
A self-trained classification technique for producing 30 m percent-water maps from Landsat data
Rover, Jennifer R.; Wylie, Bruce K.; Ji, Lei
2010-01-01
Small bodies of water can be mapped with moderate-resolution satellite data using methods where water is mapped as subpixel fractions using field measurements or high-resolution images as training datasets. A new method, developed from a regression-tree technique, uses a 30 m Landsat image for training the regression tree that, in turn, is applied to the same image to map subpixel water. The self-trained method was evaluated by comparing the percent-water map with three other maps generated from established percent-water mapping methods: (1) a regression-tree model trained with a 5 m SPOT 5 image, (2) a regression-tree model based on endmembers and (3) a linear unmixing classification technique. The results suggest that subpixel water fractions can be accurately estimated when high-resolution satellite data or intensively interpreted training datasets are not available, which increases our ability to map small water bodies or small changes in lake size at a regional scale.
Scalable Regression Tree Learning on Hadoop using OpenPlanet
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yin, Wei; Simmhan, Yogesh; Prasanna, Viktor
As scientific and engineering domains attempt to effectively analyze the deluge of data arriving from sensors and instruments, machine learning is becoming a key data mining tool to build prediction models. Regression tree is a popular learning model that combines decision trees and linear regression to forecast numerical target variables based on a set of input features. Map Reduce is well suited for addressing such data intensive learning applications, and a proprietary regression tree algorithm, PLANET, using MapReduce has been proposed earlier. In this paper, we describe an open source implement of this algorithm, OpenPlanet, on the Hadoop framework usingmore » a hybrid approach. Further, we evaluate the performance of OpenPlanet using realworld datasets from the Smart Power Grid domain to perform energy use forecasting, and propose tuning strategies of Hadoop parameters to improve the performance of the default configuration by 75% for a training dataset of 17 million tuples on a 64-core Hadoop cluster on FutureGrid.« less
[Soil and forest structure in the Colombian Amazon].
Calle-Rendón, Bayron R; Moreno, Flavio; Cárdenas López, Dairon
2011-09-01
Forests structural differences could result of environmental variations at different scales. Because soils are an important component of plant's environment, it is possible that edaphic and structural variables are associated and that, in consequence, spatial autocorrelation occurs. This paper aims to answer two questions: (1) are structural and edaphic variables associated at local scale in a terra firme forest of Colombian Amazonia? and (2) are these variables regionalized at the scale of work? To answer these questions we analyzed the data of a 6ha plot established in a terra firme forest of the Amacayacu National Park. Structural variables included basal area and density of large trees (diameter > or = 10cm) (Gdos and Ndos), basal area and density of understory individuals (diameter < 10cm) (Gsot and Nsot) and number of species of large trees (sp). Edaphic variables included were pH, organic matter, P, Mg, Ca, K, Al, sand, silt and clay. Structural and edaphic variables were reduced through a principal component analysis (PCA); then, the association between edaphic and structural components from PCA was evaluated by multiple regressions. The existence of regionalization of these variables was studied through isotropic variograms, and autocorrelated variables were spatially mapped. PCA found two significant components for structure, corresponding to the structure of large trees (G, Gdos, Ndos and sp) and of small trees (N, Nsot and Gsot), which explained 43.9% and 36.2% of total variance, respectively. Four components were identified for edaphic variables, which globally explained 81.9% of total variance and basically represent drainage and soil fertility. Regression analyses were significant (p < 0.05) and showed that the structure of both large and small trees is associated with greater sand contents and low soil fertility, though they explained a low proportion of total variability (R2 was 4.9% and 16.5% for the structure of large trees and small tress, respectively). Variables with spatial autocorrelation were the structure of small trees, Al, silt, and sand. Among them, Nsot and sand content showed similar patterns of spatial distribution inside the plot.
Method for estimating potential tree-grade distributions for northeastern forest species
Daniel A. Yaussy; Daniel A. Yaussy
1993-01-01
Generalized logistic regression was used to distribute trees into four potential tree grades for 20 northeastern species groups. The potential tree grade is defined as the tree grade based on the length and amount of clear cuttings and defects only, disregarding minimum grading diameter. The algorithms described use site index and tree diameter as the predictive...
Additivity of nonlinear biomass equations
Bernard R. Parresol
2001-01-01
Two procedures that guarantee the property of additivity among the components of tree biomass and total tree biomass utilizing nonlinear functions are developed. Procedure 1 is a simple combination approach, and procedure 2 is based on nonlinear joint-generalized regression (nonlinear seemingly unrelated regressions) with parameter restrictions. Statistical theory is...
Park, Ji Hyun; Kim, Hyeon-Young; Lee, Hanna; Yun, Eun Kyoung
2015-12-01
This study compares the performance of the logistic regression and decision tree analysis methods for assessing the risk factors for infection in cancer patients undergoing chemotherapy. The subjects were 732 cancer patients who were receiving chemotherapy at K university hospital in Seoul, Korea. The data were collected between March 2011 and February 2013 and were processed for descriptive analysis, logistic regression and decision tree analysis using the IBM SPSS Statistics 19 and Modeler 15.1 programs. The most common risk factors for infection in cancer patients receiving chemotherapy were identified as alkylating agents, vinca alkaloid and underlying diabetes mellitus. The logistic regression explained 66.7% of the variation in the data in terms of sensitivity and 88.9% in terms of specificity. The decision tree analysis accounted for 55.0% of the variation in the data in terms of sensitivity and 89.0% in terms of specificity. As for the overall classification accuracy, the logistic regression explained 88.0% and the decision tree analysis explained 87.2%. The logistic regression analysis showed a higher degree of sensitivity and classification accuracy. Therefore, logistic regression analysis is concluded to be the more effective and useful method for establishing an infection prediction model for patients undergoing chemotherapy. Copyright © 2015 Elsevier Ltd. All rights reserved.
Gu, Yingxin; Wylie, Bruce K.; Boyte, Stephen; Picotte, Joshua J.; Howard, Danny; Smith, Kelcy; Nelson, Kurtis
2016-01-01
Regression tree models have been widely used for remote sensing-based ecosystem mapping. Improper use of the sample data (model training and testing data) may cause overfitting and underfitting effects in the model. The goal of this study is to develop an optimal sampling data usage strategy for any dataset and identify an appropriate number of rules in the regression tree model that will improve its accuracy and robustness. Landsat 8 data and Moderate-Resolution Imaging Spectroradiometer-scaled Normalized Difference Vegetation Index (NDVI) were used to develop regression tree models. A Python procedure was designed to generate random replications of model parameter options across a range of model development data sizes and rule number constraints. The mean absolute difference (MAD) between the predicted and actual NDVI (scaled NDVI, value from 0–200) and its variability across the different randomized replications were calculated to assess the accuracy and stability of the models. In our case study, a six-rule regression tree model developed from 80% of the sample data had the lowest MAD (MADtraining = 2.5 and MADtesting = 2.4), which was suggested as the optimal model. This study demonstrates how the training data and rule number selections impact model accuracy and provides important guidance for future remote-sensing-based ecosystem modeling.
USDA-ARS?s Scientific Manuscript database
Incomplete meteorological data has been a problem in environmental modeling studies. The objective of this work was to develop a technique to reconstruct missing daily precipitation data in the central part of Chesapeake Bay Watershed using regression trees (RT) and artificial neural networks (ANN)....
USDA-ARS?s Scientific Manuscript database
Missing meteorological data have to be estimated for agricultural and environmental modeling. The objective of this work was to develop a technique to reconstruct the missing daily precipitation data in the central part of the Chesapeake Bay Watershed using regression trees (RT) and artificial neura...
Yaghoubian, Arezou; de Virgilio, Christian; Dauphine, Christine; Lewis, Roger J; Lin, Matthew
2007-09-01
Simple admission laboratory values can be used to classify patients with necrotizing soft-tissue infection (NSTI) into high and low mortality risk groups. Chart review. Public teaching hospital. All patients with NSTI from 1997 through 2006. Variables analyzed included medical history, admission vital signs, laboratory values, and microbiologic findings. Data analyses included univariate and classification and regression tree analyses. Mortality. One hundred twenty-four patients were identified with NSTI. The overall mortality rate was 21 of 124 (17%). On univariate analysis, factors associated with mortality included a history of cancer (P = .03), intravenous drug abuse (P < .001), low systolic blood pressure on admission (P = .03), base deficit (P = .009), and elevated white blood cell count (P = .06). On exploratory classification and regression tree analysis, admission serum lactate and sodium levels were predictors of mortality, with a sensitivity of 100%, specificity of 28%, positive predictive value of 23%, and negative predictive value of 100%. A serum lactate level greater than or equal to 54.1 mg/dL (6 mmol/L) alone was associated with a 32% mortality, whereas a serum sodium level greater than or equal to 135 mEq/L combined with a lactate level less than 54.1 mg/dL was associated with a mortality of 0%. Mortality for NSTIs remains high. A simple model, using admission serum lactate and serum sodium levels, may help identify patients at greatest risk for death.
NASA Astrophysics Data System (ADS)
Tang, Jie; Liu, Rong; Zhang, Yue-Li; Liu, Mou-Ze; Hu, Yong-Fang; Shao, Ming-Jie; Zhu, Li-Jun; Xin, Hua-Wen; Feng, Gui-Wen; Shang, Wen-Jun; Meng, Xiang-Guang; Zhang, Li-Rong; Ming, Ying-Zi; Zhang, Wei
2017-02-01
Tacrolimus has a narrow therapeutic window and considerable variability in clinical use. Our goal was to compare the performance of multiple linear regression (MLR) and eight machine learning techniques in pharmacogenetic algorithm-based prediction of tacrolimus stable dose (TSD) in a large Chinese cohort. A total of 1,045 renal transplant patients were recruited, 80% of which were randomly selected as the “derivation cohort” to develop dose-prediction algorithm, while the remaining 20% constituted the “validation cohort” to test the final selected algorithm. MLR, artificial neural network (ANN), regression tree (RT), multivariate adaptive regression splines (MARS), boosted regression tree (BRT), support vector regression (SVR), random forest regression (RFR), lasso regression (LAR) and Bayesian additive regression trees (BART) were applied and their performances were compared in this work. Among all the machine learning models, RT performed best in both derivation [0.71 (0.67-0.76)] and validation cohorts [0.73 (0.63-0.82)]. In addition, the ideal rate of RT was 4% higher than that of MLR. To our knowledge, this is the first study to use machine learning models to predict TSD, which will further facilitate personalized medicine in tacrolimus administration in the future.
Tree STEM and Canopy Biomass Estimates from Terrestrial Laser Scanning Data
NASA Astrophysics Data System (ADS)
Olofsson, K.; Holmgren, J.
2017-10-01
In this study an automatic method for estimating both the tree stem and the tree canopy biomass is presented. The point cloud tree extraction techniques operate on TLS data and models the biomass using the estimated stem and canopy volume as independent variables. The regression model fit error is of the order of less than 5 kg, which gives a relative model error of about 5 % for the stem estimate and 10-15 % for the spruce and pine canopy biomass estimates. The canopy biomass estimate was improved by separating the models by tree species which indicates that the method is allometry dependent and that the regression models need to be recomputed for different areas with different climate and different vegetation.
Andersen, Douglas
2016-01-01
Knowledge of the factors affecting the vigor of desert riparian trees is important for their conservation and management. I used multiple regression to assess effects of streamflow and climate (12–14 years of data) or climate alone (up to 60 years of data) on radial growth of clonal narrowleaf cottonwood (Populus angustifolia), a foundation species in the arid, Closed Basin portion of the San Luis Valley, Colorado. I collected increment cores from trees (14–90 cm DBH) at four sites along each of Sand and Deadman creeks (total N = 85), including both perennial and ephemeral reaches. Analyses on trees <110 m from the stream channel explained 33–64% of the variation in standardized growth index (SGI) over the period having discharge measurements. Only 3 of 7 models included a streamflow variable; inclusion of prior-year conditions was common. Models for trees farther from the channel or over a deep water table explained 23–71% of SGI variability, and 4 of 5 contained a streamflow variable. Analyses using solely climate variables over longer time periods explained 17–85% of SGI variability, and 10 of 12 included a variable indexing summer precipitation. Three large, abrupt shifts in recent decades from wet to dry conditions (indexed by a seasonal Palmer Drought Severity Index) coincided with dramatically reduced radial growth. Each shift was presumably associated with branch dieback that produced a legacy effect apparent in many SGI series: uncharacteristically low SGI in the year following the shift. My results suggest trees in locations distant from the active channel rely on the regional shallow unconfined aquifer, summer rainfall, or both to meet water demands. The landscape-level differences in the water supplies sustaining these trees imply variable effects from shifts in winter-versus monsoon-related precipitation, and from climate change versus streamflow or groundwater management.
Chin, Weng-Yee; Wan, Eric Yuk Fai; Dowrick, Christopher; Arroll, Bruce; Lam, Cindy Lo Kuen
2018-04-26
The aim of this study was to explore the relationship between patient self-reported Patient Health Questionnaire-9 (PHQ-9) symptoms and doctor diagnosis of depression using a tree analysis approach. This was a secondary analysis on a dataset obtained from 10 179 adult primary care patients and 59 primary care physicians (PCPs) across Hong Kong. Patients completed a waiting room survey collecting data on socio-demographics and the PHQ-9. Blinded doctors documented whether they thought the patient had depression. Data were analyzed using multiple logistic regression and conditional inference decision tree modeling. PCPs diagnosed 594 patients with depression. Logistic regression identified gender, age, employment status, past history of depression, family history of mental illness and recent doctor visit as factors associated with a depression diagnosis. Tree analyses revealed different pathways of association between PHQ-9 symptoms and depression diagnosis for patients with and without past depression. The PHQ-9 symptom model revealed low mood, sense of worthlessness, fatigue, sleep disturbance and functional impairment as early classifiers. The PHQ-9 total score model revealed cut-off scores of >12 and >15 were most frequently associated with depression diagnoses in patients with and without past depression. A past history of depression is the most significant factor associated with the diagnosis of depression. PCPs appear to utilize a hypothetical-deductive problem-solving approach incorporating pre-test probability, with different associated factors for patients with and without past depression. Diagnostic thresholds may be too low for patients with past depression and too high for those without, potentially leading to over and under diagnosis of depression.
Anderson, Weston; Guikema, Seth; Zaitchik, Ben; Pan, William
2014-01-01
Obtaining accurate small area estimates of population is essential for policy and health planning but is often difficult in countries with limited data. In lieu of available population data, small area estimate models draw information from previous time periods or from similar areas. This study focuses on model-based methods for estimating population when no direct samples are available in the area of interest. To explore the efficacy of tree-based models for estimating population density, we compare six different model structures including Random Forest and Bayesian Additive Regression Trees. Results demonstrate that without information from prior time periods, non-parametric tree-based models produced more accurate predictions than did conventional regression methods. Improving estimates of population density in non-sampled areas is important for regions with incomplete census data and has implications for economic, health and development policies.
Anderson, Weston; Guikema, Seth; Zaitchik, Ben; Pan, William
2014-01-01
Obtaining accurate small area estimates of population is essential for policy and health planning but is often difficult in countries with limited data. In lieu of available population data, small area estimate models draw information from previous time periods or from similar areas. This study focuses on model-based methods for estimating population when no direct samples are available in the area of interest. To explore the efficacy of tree-based models for estimating population density, we compare six different model structures including Random Forest and Bayesian Additive Regression Trees. Results demonstrate that without information from prior time periods, non-parametric tree-based models produced more accurate predictions than did conventional regression methods. Improving estimates of population density in non-sampled areas is important for regions with incomplete census data and has implications for economic, health and development policies. PMID:24992657
Keller, Thomas E.; Blakeslee, Jennifer E.; Lemon, Stephenie C.; Courtney, Mark E.
2010-01-01
Objective: Distinctive combinations of factors are likely to be associated with serious alcohol problems among adolescents about to emancipate from the foster care system and face the difficult transition to independent adulthood. This study identifies particular subpopulations of older foster youths that differ markedly in the probability of a lifetime diagnosis for alcohol abuse or dependence. Method: Classification and regression tree (CART) analysis was applied to a large, representative sample (N = 732) of individuals, 17 years of age or older, placed in the child welfare system for more than 1 year. CART evaluated two exploratory sets of variables for optimal splits into groups distinguished from each other on the criterion of lifetime alcohol-use disorder diagnosis. Results: Each classification tree yielded four terminal groups with different rates of lifetime alcohol-use disorder diagnosis. Notable groups in the first tree included one characterized by high levels of both delinquency and violence exposure (53% diagnosed) and another that featured lower delinquency but an independent-living placement (21% diagnosed). Notable groups in the second tree included African American adolescents (only 8% diagnosed), White adolescents not close to caregivers (40% diagnosed), and White adolescents closer to caregivers but with a history of psychological abuse (36% diagnosed). Conclusions: Analyses incorporating variables that could be comorbid with or symptomatic of alcohol problems, such as delinquency, yielded classifications potentially useful for assessment and service planning. Analyses without such variables identified other factors, such as quality of caregiving relationships and maltreatment, associated with serious alcohol problems, suggesting opportunities for prevention or intervention. PMID:20946738
Generalized and synthetic regression estimators for randomized branch sampling
David L. R. Affleck; Timothy G. Gregoire
2015-01-01
In felled-tree studies, ratio and regression estimators are commonly used to convert more readily measured branch characteristics to dry crown mass estimates. In some cases, data from multiple trees are pooled to form these estimates. This research evaluates the utility of both tactics in the estimation of crown biomass following randomized branch sampling (...
Cloud-Free Satellite Image Mosaics with Regression Trees and Histogram Matching.
E.H. Helmer; B. Ruefenacht
2005-01-01
Cloud-free optical satellite imagery simplifies remote sensing, but land-cover phenology limits existing solutions to persistent cloudiness to compositing temporally resolute, spatially coarser imagery. Here, a new strategy for developing cloud-free imagery at finer resolution permits simple automatic change detection. The strategy uses regression trees to predict...
Regression estimators for late-instar gypsy moth larvae at low pupulation densities
W.E. Wallnr; A.S. Devito; Stanley J. Zarnoch
1989-01-01
Two regression estimators were developed for determining densities of late-instar gypsy moth, Lymantria dispar (Lepidoptera: Lymantriidae), larvae from burlap band and pyrethrin spray counts on oak trees in Vermont, Massachusetts, Connecticut, and New York. Studies were conducted by marking larvae on individual burlap banded trees within 15...
What Satisfies Students?: Mining Student-Opinion Data with Regression and Decision Tree Analysis
ERIC Educational Resources Information Center
Thomas, Emily H.; Galambos, Nora
2004-01-01
To investigate how students' characteristics and experiences affect satisfaction, this study uses regression and decision tree analysis with the CHAID algorithm to analyze student-opinion data. A data mining approach identifies the specific aspects of students' university experience that most influence three measures of general satisfaction. The…
Increased spruce tree growth in Central Europe since 1960s.
Cienciala, Emil; Altman, Jan; Doležal, Jiří; Kopáček, Jiří; Štěpánek, Petr; Ståhl, Göran; Tumajer, Jan
2018-04-01
Tree growth response to recent environmental changes is of key interest for forest ecology. This study addressed the following questions with respect to Norway spruce (Picea abies, L. Karst.) in Central Europe: Has tree growth accelerated during the last five decades? What are the main environmental drivers of the observed tree radial stem growth and how much variability can be explained by them? Using a nationwide dendrochronological sampling of Norway spruce in the Czech Republic (1246 trees, 266 plots), novel regional tree-ring width chronologies for 40(±10)- and 60(±10)-year old trees were assembled, averaged across three elevation zones (break points at 500 and 700m). Correspondingly averaged drivers, including temperature, precipitation, nitrogen (N) deposition and ambient CO 2 concentration, were used in a general linear model (GLM) to analyze the contribution of these in explaining tree ring width variability for the period from 1961 to 2013. Spruce tree radial stem growth responded strongly to the changing environment in Central Europe during the period, with a mean tree ring width increase of 24 and 32% for the 40- and 60-year old trees, respectively. The indicative General Linear Model analysis identified CO 2 , precipitation during the vegetation season, spring air temperature (March-May) and N-deposition as the significant covariates of growth, with the latter including interactions with elevation zones. The regression models explained 57% and 55% of the variability in the two tree ring width chronologies, respectively. Growth response to N-deposition showed the highest variability along the elevation gradient with growth stimulation/limitation at sites below/above 700m. A strong sensitivity of stem growth to CO 2 was also indicated, suggesting that the effect of rising ambient CO 2 concentration (direct or indirect by increased water use efficiency) should be considered in analyses of long-term growth together with climatic factors and N-deposition. Copyright © 2017 Elsevier B.V. All rights reserved.
Freitas, Alex A; Limbu, Kriti; Ghafourian, Taravat
2015-01-01
Volume of distribution is an important pharmacokinetic property that indicates the extent of a drug's distribution in the body tissues. This paper addresses the problem of how to estimate the apparent volume of distribution at steady state (Vss) of chemical compounds in the human body using decision tree-based regression methods from the area of data mining (or machine learning). Hence, the pros and cons of several different types of decision tree-based regression methods have been discussed. The regression methods predict Vss using, as predictive features, both the compounds' molecular descriptors and the compounds' tissue:plasma partition coefficients (Kt:p) - often used in physiologically-based pharmacokinetics. Therefore, this work has assessed whether the data mining-based prediction of Vss can be made more accurate by using as input not only the compounds' molecular descriptors but also (a subset of) their predicted Kt:p values. Comparison of the models that used only molecular descriptors, in particular, the Bagging decision tree (mean fold error of 2.33), with those employing predicted Kt:p values in addition to the molecular descriptors, such as the Bagging decision tree using adipose Kt:p (mean fold error of 2.29), indicated that the use of predicted Kt:p values as descriptors may be beneficial for accurate prediction of Vss using decision trees if prior feature selection is applied. Decision tree based models presented in this work have an accuracy that is reasonable and similar to the accuracy of reported Vss inter-species extrapolations in the literature. The estimation of Vss for new compounds in drug discovery will benefit from methods that are able to integrate large and varied sources of data and flexible non-linear data mining methods such as decision trees, which can produce interpretable models. Graphical AbstractDecision trees for the prediction of tissue partition coefficient and volume of distribution of drugs.
Regression analysis using dependent Polya trees.
Schörgendorfer, Angela; Branscum, Adam J
2013-11-30
Many commonly used models for linear regression analysis force overly simplistic shape and scale constraints on the residual structure of data. We propose a semiparametric Bayesian model for regression analysis that produces data-driven inference by using a new type of dependent Polya tree prior to model arbitrary residual distributions that are allowed to evolve across increasing levels of an ordinal covariate (e.g., time, in repeated measurement studies). By modeling residual distributions at consecutive covariate levels or time points using separate, but dependent Polya tree priors, distributional information is pooled while allowing for broad pliability to accommodate many types of changing residual distributions. We can use the proposed dependent residual structure in a wide range of regression settings, including fixed-effects and mixed-effects linear and nonlinear models for cross-sectional, prospective, and repeated measurement data. A simulation study illustrates the flexibility of our novel semiparametric regression model to accurately capture evolving residual distributions. In an application to immune development data on immunoglobulin G antibodies in children, our new model outperforms several contemporary semiparametric regression models based on a predictive model selection criterion. Copyright © 2013 John Wiley & Sons, Ltd.
NASA Astrophysics Data System (ADS)
Rooper, Christopher N.; Zimmermann, Mark; Prescott, Megan M.
2017-08-01
Deep-sea coral and sponge ecosystems are widespread throughout most of Alaska's marine waters, and are associated with many different species of fishes and invertebrates. These ecosystems are vulnerable to the effects of commercial fishing activities and climate change. We compared four commonly used species distribution models (general linear models, generalized additive models, boosted regression trees and random forest models) and an ensemble model to predict the presence or absence and abundance of six groups of benthic invertebrate taxa in the Gulf of Alaska. All four model types performed adequately on training data for predicting presence and absence, with regression forest models having the best overall performance measured by the area under the receiver-operating-curve (AUC). The models also performed well on the test data for presence and absence with average AUCs ranging from 0.66 to 0.82. For the test data, ensemble models performed the best. For abundance data, there was an obvious demarcation in performance between the two regression-based methods (general linear models and generalized additive models), and the tree-based models. The boosted regression tree and random forest models out-performed the other models by a wide margin on both the training and testing data. However, there was a significant drop-off in performance for all models of invertebrate abundance ( 50%) when moving from the training data to the testing data. Ensemble model performance was between the tree-based and regression-based methods. The maps of predictions from the models for both presence and abundance agreed very well across model types, with an increase in variability in predictions for the abundance data. We conclude that where data conforms well to the modeled distribution (such as the presence-absence data and binomial distribution in this study), the four types of models will provide similar results, although the regression-type models may be more consistent with biological theory. For data with highly zero-inflated distributions and non-normal distributions such as the abundance data from this study, the tree-based methods performed better. Ensemble models that averaged predictions across the four model types, performed better than the GLM or GAM models but slightly poorer than the tree-based methods, suggesting ensemble models might be more robust to overfitting than tree methods, while mitigating some of the disadvantages in predictive performance of regression methods.
DIF Trees: Using Classification Trees to Detect Differential Item Functioning
ERIC Educational Resources Information Center
Vaughn, Brandon K.; Wang, Qiu
2010-01-01
A nonparametric tree classification procedure is used to detect differential item functioning for items that are dichotomously scored. Classification trees are shown to be an alternative procedure to detect differential item functioning other than the use of traditional Mantel-Haenszel and logistic regression analysis. A nonparametric…
NASA Astrophysics Data System (ADS)
Doucet, A.; Savard, M. M.; Bégin, C.; Ouarda, T. B.; Marion, J.
2010-12-01
The combined analyses of tree-ring δ13C, δ18O, δ15N, 206Pb/207Pb, 206Pb/204Pb and 206Pb/208Pb isotope ratios of three red spruce specimens from the Tantaré ecological reserve located 40 km northwest of Québec City (Canada) were studied with the aim of reconstructing environmental conditions and unravel past air-quality changes of the 1880-2007 period. To separate the tree-ring δ18O and δ13C patterns induced by natural conditions from those generated by anthropogenic perturbations, a linear regression was applied between the most explicative meteorological parameters and the isotopic series for the period of low pollution (1880 to 1909). The model equations were then applied to the most recent part of the series (1910-2007) to verify if climatic conditions have remained the main driver of the tree-ring isotopic variations. The good fit between the modeled and measured δ18O series for the entire studied period suggests that the assimilation of oxygen by red spruce trees is not significantly affected by pollution stress near Québec City. However, the deviation between the measured and modeled δ13C values for the 1944-2007 period indicates that diffuse pollution affected carbon assimilation by the investigated trees. To independently validate if atmospheric pollution could have generated the deviation between the measured and the estimated δ13C values, a linear regression was applied between the portion of the residual δ13C values and atmospheric pollution (Canadian fossil fuel proxy from 1958 to 2000). The nice fit between the modeled δ13C values from the combination of the two regression analyses based on climate and emission proxy strongly supports the hypothesis that there is a natural and an anthropogenic portion in the δ13C variations of the studied specimens. The short-term variations of the red spruce δ15N series are correlated with the instrumentally measured amounts of provincial N emissions for the 1990 to 2006 period (longest measurements available). Additionally, the long-term decrease of the δ15N series after 1956 is linked to the low isotopic values of NOx emitted by car exhausts, as expressed by the provincial number of cars which reflect the amount of transport-related N deposition at the provincial scale. The 208Pb/206Pb and 204Pb/206Pb ratios as a function of 206Pb/207Pb of the 1880-1919 period reflect a mixture of natural lead from the mineral soil horizon and mainly anthropogenic lead from north-eastern American coal combustion. The lower Pb ratios of the 1920-1989 period correlate well with the introduction of leaded additives to gasoline characterized by lower ratios relative to coal combustion. Inferring the lead sources of the 1990-2008 period is not as straightforward because lead can potentially derive from three main sources: coal combustion, burnt recycled material and natural lead present in soils. Our results show the great potential of tree-ring stable isotopes to record pollution events in the context of peri-urban diffuse pollution, and to prolong the pollution history in regions where direct measurements of pollutants only covers a relatively short period.
Design of Probabilistic Random Forests with Applications to Anticancer Drug Sensitivity Prediction
Rahman, Raziur; Haider, Saad; Ghosh, Souparno; Pal, Ranadip
2015-01-01
Random forests consisting of an ensemble of regression trees with equal weights are frequently used for design of predictive models. In this article, we consider an extension of the methodology by representing the regression trees in the form of probabilistic trees and analyzing the nature of heteroscedasticity. The probabilistic tree representation allows for analytical computation of confidence intervals (CIs), and the tree weight optimization is expected to provide stricter CIs with comparable performance in mean error. We approached the ensemble of probabilistic trees’ prediction from the perspectives of a mixture distribution and as a weighted sum of correlated random variables. We applied our methodology to the drug sensitivity prediction problem on synthetic and cancer cell line encyclopedia dataset and illustrated that tree weights can be selected to reduce the average length of the CI without increase in mean error. PMID:27081304
NASA Astrophysics Data System (ADS)
Gessesse, B.; Bewket, W.; Bräuning, A.
2015-11-01
Land degradation due to lack of sustainable land management practices are one of the critical challenges in many developing countries including Ethiopia. This study explores the major determinants of farm level tree planting decision as a land management strategy in a typical framing and degraded landscape of the Modjo watershed, Ethiopia. The main data were generated from household surveys and analysed using descriptive statistics and binary logistic regression model. The model significantly predicted farmers' tree planting decision (Chi-square = 37.29, df = 15, P<0.001). Besides, the computed significant value of the model suggests that all the considered predictor variables jointly influenced the farmers' decision to plant trees as a land management strategy. In this regard, the finding of the study show that local land-users' willingness to adopt tree growing decision is a function of a wide range of biophysical, institutional, socioeconomic and household level factors, however, the likelihood of household size, productive labour force availability, the disparity of schooling age, level of perception of the process of deforestation and the current land tenure system have positively and significantly influence on tree growing investment decisions in the study watershed. Eventually, the processes of land use conversion and land degradation are serious which in turn have had adverse effects on agricultural productivity, local food security and poverty trap nexus. Hence, devising sustainable and integrated land management policy options and implementing them would enhance ecological restoration and livelihood sustainability in the study watershed.
NASA Astrophysics Data System (ADS)
Gessesse, Berhan; Bewket, Woldeamlak; Bräuning, Achim
2016-04-01
Land degradation due to lack of sustainable land management practices is one of the critical challenges in many developing countries including Ethiopia. This study explored the major determinants of farm-level tree-planting decisions as a land management strategy in a typical farming and degraded landscape of the Modjo watershed, Ethiopia. The main data were generated from household surveys and analysed using descriptive statistics and a binary logistic regression model. The model significantly predicted farmers' tree-planting decisions (χ2 = 37.29, df = 15, P < 0.001). Besides, the computed significant value of the model revealed that all the considered predictor variables jointly influenced the farmers' decisions to plant trees as a land management strategy. The findings of the study demonstrated that the adoption of tree-growing decisions by local land users was a function of a wide range of biophysical, institutional, socioeconomic and household-level factors. In this regard, the likelihood of household size, productive labour force availability, the disparity of schooling age, level of perception of the process of deforestation and the current land tenure system had a critical influence on tree-growing investment decisions in the study watershed. Eventually, the processes of land-use conversion and land degradation were serious, which in turn have had adverse effects on agricultural productivity, local food security and poverty trap nexus. Hence, the study recommended that devising and implementing sustainable land management policy options would enhance ecological restoration and livelihood sustainability in the study watershed.
Goo, Yeong-Jia James; Shen, Zone-De
2014-01-01
As the fraudulent financial statement of an enterprise is increasingly serious with each passing day, establishing a valid forecasting fraudulent financial statement model of an enterprise has become an important question for academic research and financial practice. After screening the important variables using the stepwise regression, the study also matches the logistic regression, support vector machine, and decision tree to construct the classification models to make a comparison. The study adopts financial and nonfinancial variables to assist in establishment of the forecasting fraudulent financial statement model. Research objects are the companies to which the fraudulent and nonfraudulent financial statement happened between years 1998 to 2012. The findings are that financial and nonfinancial information are effectively used to distinguish the fraudulent financial statement, and decision tree C5.0 has the best classification effect 85.71%. PMID:25302338
Chen, Suduan; Goo, Yeong-Jia James; Shen, Zone-De
2014-01-01
As the fraudulent financial statement of an enterprise is increasingly serious with each passing day, establishing a valid forecasting fraudulent financial statement model of an enterprise has become an important question for academic research and financial practice. After screening the important variables using the stepwise regression, the study also matches the logistic regression, support vector machine, and decision tree to construct the classification models to make a comparison. The study adopts financial and nonfinancial variables to assist in establishment of the forecasting fraudulent financial statement model. Research objects are the companies to which the fraudulent and nonfraudulent financial statement happened between years 1998 to 2012. The findings are that financial and nonfinancial information are effectively used to distinguish the fraudulent financial statement, and decision tree C5.0 has the best classification effect 85.71%.
A Clinical Decision Support System for Breast Cancer Patients
NASA Astrophysics Data System (ADS)
Fernandes, Ana S.; Alves, Pedro; Jarman, Ian H.; Etchells, Terence A.; Fonseca, José M.; Lisboa, Paulo J. G.
This paper proposes a Web clinical decision support system for clinical oncologists and for breast cancer patients making prognostic assessments, using the particular characteristics of the individual patient. This system comprises three different prognostic modelling methodologies: the clinically widely used Nottingham prognostic index (NPI); the Cox regression modelling and a partial logistic artificial neural network with automatic relevance determination (PLANN-ARD). All three models yield a different prognostic index that can be analysed together in order to obtain a more accurate prognostic assessment of the patient. Missing data is incorporated in the mentioned models, a common issue in medical data that was overcome using multiple imputation techniques. Risk group assignments are also provided through a methodology based on regression trees, where Boolean rules can be obtained expressed with patient characteristics.
Lisa M. Ganio; Robert A. Progar
2017-01-01
Wild and prescribed fire-induced injury to forest trees can produce immediate or delayed tree mortality but fire-injured trees can also survive. Land managers use logistic regression models that incorporate tree-injury variables to discriminate between fatally injured trees and those that will survive. We used data from 4024 ponderosa pine (Pinus ponderosa...
Using Evidence-Based Decision Trees Instead of Formulas to Identify At-Risk Readers. REL 2014-036
ERIC Educational Resources Information Center
Koon, Sharon; Petscher, Yaacov; Foorman, Barbara R.
2014-01-01
This study examines whether the classification and regression tree (CART) model improves the early identification of students at risk for reading comprehension difficulties compared with the more difficult to interpret logistic regression model. CART is a type of predictive modeling that relies on nonparametric techniques. It presents results in…
Forest type mapping of the Interior West
Bonnie Ruefenacht; Gretchen G. Moisen; Jock A. Blackard
2004-01-01
This paper develops techniques for the mapping of forest types in Arizona, New Mexico, and Wyoming. The methods involve regression-tree modeling using a variety of remote sensing and GIS layers along with Forest Inventory Analysis (FIA) point data. Regression-tree modeling is a fast and efficient technique of estimating variables for large data sets with high accuracy...
ERIC Educational Resources Information Center
Thomas, Emily H.; Galambos, Nora
To investigate how students' characteristics and experiences affect satisfaction, this study used regression and decision-tree analysis with the CHAID algorithm to analyze student opinion data from a sample of 1,783 college students. A data-mining approach identifies the specific aspects of students' university experience that most influence three…
ERIC Educational Resources Information Center
Cohen, Ira L.; Liu, Xudong; Hudson, Melissa; Gillis, Jennifer; Cavalari, Rachel N. S.; Romanczyk, Raymond G.; Karmel, Bernard Z.; Gardner, Judith M.
2016-01-01
In order to improve discrimination accuracy between Autism Spectrum Disorder (ASD) and similar neurodevelopmental disorders, a data mining procedure, Classification and Regression Trees (CART), was used on a large multi-site sample of PDD Behavior Inventory (PDDBI) forms on children with and without ASD. Discrimination accuracy exceeded 80%,…
Teale, Stephen A; Letkowski, Steven; Matusick, George; Stehman, Stephen V; Castello, John D
2009-08-01
Beech scale, Cryptococcus fagisuga Lindinger, is a non-native invasive insect associated with beech bark disease. A quantitative method of measuring viable scale density at the levels of the individual tree and localized bark patches was developed. Bark patches (10 cm(2)) were removed at 0, 1, and 2 m above the ground and at the four cardinal directions from 13 trees in northern New York and 12 trees in northern Michigan. Digital photographs of each patch were made, and the wax mass area was measured from two random 1-cm(2) subsamples on each bark patch using image analysis software. Viable scale insects were counted after removing the wax under a dissecting microscope. Separate regression analyses at the whole tree level for the New York and Michigan sites each showed a strong positive relationship of wax mass area with the number of underlying viable scale insects. The relationships for the New York and Michigan data were not significantly different from each other, and when pooling data from the two sites, there was still a significant positive relationship between wax mass area and the number of scale insects. The relationships between viable scale insects and wax mass area were different at the 0-, 1-, and 2-m sampling heights but do not seem to affect the relationship. This method does not disrupt the insect or its interactions with the host tree.
Impacts of human-related practices on Ommatissus lybicus infestations of date palm in Oman.
Al-Kindi, Khalifa M; Kwan, Paul; Andrew, Nigel R; Welch, Mitchell
2017-01-01
Date palm cultivation is economically important in the Sultanate of Oman, with significant financial investments coming from both the government and private individuals. However, a widespread Dubas bug (DB) (Ommatissus lybicus Bergevin) infestation has impacted regions including the Middle East, North Africa, Southeast Russia, and Spain, resulting in widespread damages to date palms. In this study, techniques in spatial statistics including ordinary least squares (OLS), geographically weighted regression (GRW), and exploratory regression (ER) were applied to (a) model the correlation between DB infestations and human-related practices that include irrigation methods, row spacing, palm tree density, and management of undercover and intercropped vegetation, and (b) predict the locations of future DB infestations in northern Oman. Firstly, we extracted row spacing and palm tree density information from remote sensed satellite images. Secondly, we collected data on irrigation practices and management by using a simple questionnaire, augmented with spatial data. Thirdly, we conducted our statistical analyses using all possible combinations of values over a given set of candidate variables using the chosen predictive modelling and regression techniques. Lastly, we identified the combination of human-related practices that are most conducive to the survival and spread of DB. Our results show that there was a strong correlation between DB infestations and several human-related practices parameters (R2 = 0.70). Variables including palm tree density, spacing between trees (less than 5 x 5 m), insecticide application, date palm and farm service (pruning, dethroning, remove weeds, and thinning), irrigation systems, offshoots removal, fertilisation and labour (non-educated) issues, were all found to significantly influence the degree of DB infestations. This study is expected to help reduce the extent and cost of aerial and ground sprayings, while facilitating the allocation of date palm plantations. An integrated pest management (IPM) system monitoring DB infestations, driven by GIS and remote sensed data collections and spatial statistical models, will allow for an effective DB management program in Oman. This will in turn ensure the competitiveness of Oman in the global date fruits market and help preserve national yields.
Zhao, Yang; Zheng, Wei; Zhuo, Daisy Y; Lu, Yuefeng; Ma, Xiwen; Liu, Hengchang; Zeng, Zhen; Laird, Glen
2017-10-11
Personalized medicine, or tailored therapy, has been an active and important topic in recent medical research. Many methods have been proposed in the literature for predictive biomarker detection and subgroup identification. In this article, we propose a novel decision tree-based approach applicable in randomized clinical trials. We model the prognostic effects of the biomarkers using additive regression trees and the biomarker-by-treatment effect using a single regression tree. Bayesian approach is utilized to periodically revise the split variables and the split rules of the decision trees, which provides a better overall fitting. Gibbs sampler is implemented in the MCMC procedure, which updates the prognostic trees and the interaction tree separately. We use the posterior distribution of the interaction tree to construct the predictive scores of the biomarkers and to identify the subgroup where the treatment is superior to the control. Numerical simulations show that our proposed method performs well under various settings comparing to existing methods. We also demonstrate an application of our method in a real clinical trial.
DupTree: a program for large-scale phylogenetic analyses using gene tree parsimony.
Wehe, André; Bansal, Mukul S; Burleigh, J Gordon; Eulenstein, Oliver
2008-07-01
DupTree is a new software program for inferring rooted species trees from collections of gene trees using the gene tree parsimony approach. The program implements a novel algorithm that significantly improves upon the run time of standard search heuristics for gene tree parsimony, and enables the first truly genome-scale phylogenetic analyses. In addition, DupTree allows users to examine alternate rootings and to weight the reconciliation costs for gene trees. DupTree is an open source project written in C++. DupTree for Mac OS X, Windows, and Linux along with a sample dataset and an on-line manual are available at http://genome.cs.iastate.edu/CBL/DupTree
Gmur, Stephan; Vogt, Daniel; Zabowski, Darlene; Moskal, L. Monika
2012-01-01
The characterization of soil attributes using hyperspectral sensors has revealed patterns in soil spectra that are known to respond to mineral composition, organic matter, soil moisture and particle size distribution. Soil samples from different soil horizons of replicated soil series from sites located within Washington and Oregon were analyzed with the FieldSpec Spectroradiometer to measure their spectral signatures across the electromagnetic range of 400 to 1,000 nm. Similarity rankings of individual soil samples reveal differences between replicate series as well as samples within the same replicate series. Using classification and regression tree statistical methods, regression trees were fitted to each spectral response using concentrations of nitrogen, carbon, carbonate and organic matter as the response variables. Statistics resulting from fitted trees were: nitrogen R2 0.91 (p < 0.01) at 403, 470, 687, and 846 nm spectral band widths, carbonate R2 0.95 (p < 0.01) at 531 and 898 nm band widths, total carbon R2 0.93 (p < 0.01) at 400, 409, 441 and 907 nm band widths, and organic matter R2 0.98 (p < 0.01) at 300, 400, 441, 832 and 907 nm band widths. Use of the 400 to 1,000 nm electromagnetic range utilizing regression trees provided a powerful, rapid and inexpensive method for assessing nitrogen, carbon, carbonate and organic matter for upper soil horizons in a nondestructive method. PMID:23112620
Risk Factors of Falls in Community-Dwelling Older Adults: Logistic Regression Tree Analysis
ERIC Educational Resources Information Center
Yamashita, Takashi; Noe, Douglas A.; Bailer, A. John
2012-01-01
Purpose of the Study: A novel logistic regression tree-based method was applied to identify fall risk factors and possible interaction effects of those risk factors. Design and Methods: A nationally representative sample of American older adults aged 65 years and older (N = 9,592) in the Health and Retirement Study 2004 and 2006 modules was used.…
Louys, Julien; Meloro, Carlo; Elton, Sarah; Ditchfield, Peter; Bishop, Laura C
2015-01-01
We test the performance of two models that use mammalian communities to reconstruct multivariate palaeoenvironments. While both models exploit the correlation between mammal communities (defined in terms of functional groups) and arboreal heterogeneity, the first uses a multiple multivariate regression of community structure and arboreal heterogeneity, while the second uses a linear regression of the principal components of each ecospace. The success of these methods means the palaeoenvironment of a particular locality can be reconstructed in terms of the proportions of heavy, moderate, light, and absent tree canopy cover. The linear regression is less biased, and more precisely and accurately reconstructs heavy tree canopy cover than the multiple multivariate model. However, the multiple multivariate model performs better than the linear regression for all other canopy cover categories. Both models consistently perform better than randomly generated reconstructions. We apply both models to the palaeocommunity of the Upper Laetolil Beds, Tanzania. Our reconstructions indicate that there was very little heavy tree cover at this site (likely less than 10%), with the palaeo-landscape instead comprising a mixture of light and absent tree cover. These reconstructions help resolve the previous conflicting palaeoecological reconstructions made for this site. Copyright © 2014 Elsevier Ltd. All rights reserved.
Modeling vertebrate diversity in Oregon using satellite imagery
NASA Astrophysics Data System (ADS)
Cablk, Mary Elizabeth
Vertebrate diversity was modeled for the state of Oregon using a parametric approach to regression tree analysis. This exploratory data analysis effectively modeled the non-linear relationships between vertebrate richness and phenology, terrain, and climate. Phenology was derived from time-series NOAA-AVHRR satellite imagery for the year 1992 using two methods: principal component analysis and derivation of EROS data center greenness metrics. These two measures of spatial and temporal vegetation condition incorporated the critical temporal element in this analysis. The first three principal components were shown to contain spatial and temporal information about the landscape and discriminated phenologically distinct regions in Oregon. Principal components 2 and 3, 6 greenness metrics, elevation, slope, aspect, annual precipitation, and annual seasonal temperature difference were investigated as correlates to amphibians, birds, all vertebrates, reptiles, and mammals. Variation explained for each regression tree by taxa were: amphibians (91%), birds (67%), all vertebrates (66%), reptiles (57%), and mammals (55%). Spatial statistics were used to quantify the pattern of each taxa and assess validity of resulting predictions from regression tree models. Regression tree analysis was relatively robust against spatial autocorrelation in the response data and graphical results indicated models were well fit to the data.
Fernández, Leónides; Mediano, Pilar; García, Ricardo; Rodríguez, Juan M; Marín, María
2016-09-01
Objectives Lactational mastitis frequently leads to a premature abandonment of breastfeeding; its development has been associated with several risk factors. This study aims to use a decision tree (DT) approach to establish the main risk factors involved in mastitis and to compare its performance for predicting this condition with a stepwise logistic regression (LR) model. Methods Data from 368 cases (breastfeeding women with mastitis) and 148 controls were collected by a questionnaire about risk factors related to medical history of mother and infant, pregnancy, delivery, postpartum, and breastfeeding practices. The performance of the DT and LR analyses was compared using the area under the receiver operating characteristic (ROC) curve. Sensitivity, specificity and accuracy of both models were calculated. Results Cracked nipples, antibiotics and antifungal drugs during breastfeeding, infant age, breast pumps, familial history of mastitis and throat infection were significant risk factors associated with mastitis in both analyses. Bottle-feeding and milk supply were related to mastitis for certain subgroups in the DT model. The areas under the ROC curves were similar for LR and DT models (0.870 and 0.835, respectively). The LR model had better classification accuracy and sensitivity than the DT model, but the last one presented better specificity at the optimal threshold of each curve. Conclusions The DT and LR models constitute useful and complementary analytical tools to assess the risk of lactational infectious mastitis. The DT approach identifies high-risk subpopulations that need specific mastitis prevention programs and, therefore, it could be used to make the most of public health resources.
Neighborhood greenspace and health in a large urban center
NASA Astrophysics Data System (ADS)
Kardan, Omid; Gozdyra, Peter; Misic, Bratislav; Moola, Faisal; Palmer, Lyle J.; Paus, Tomáš; Berman, Marc G.
2015-07-01
Studies have shown that natural environments can enhance health and here we build upon that work by examining the associations between comprehensive greenspace metrics and health. We focused on a large urban population center (Toronto, Canada) and related the two domains by combining high-resolution satellite imagery and individual tree data from Toronto with questionnaire-based self-reports of general health perception, cardio-metabolic conditions and mental illnesses from the Ontario Health Study. Results from multiple regressions and multivariate canonical correlation analyses suggest that people who live in neighborhoods with a higher density of trees on their streets report significantly higher health perception and significantly less cardio-metabolic conditions (controlling for socio-economic and demographic factors). We find that having 10 more trees in a city block, on average, improves health perception in ways comparable to an increase in annual personal income of $10,000 and moving to a neighborhood with $10,000 higher median income or being 7 years younger. We also find that having 11 more trees in a city block, on average, decreases cardio-metabolic conditions in ways comparable to an increase in annual personal income of $20,000 and moving to a neighborhood with $20,000 higher median income or being 1.4 years younger.
Distribution of cavity trees in midwestern old-growth and second-growth forests
Zhaofei Fan; Stephen R. Shifley; Martin A. Spetich; Frank R. Thompson; David R. Larsen
2003-01-01
We used classification and regression tree analysis to determine the primary variables associated with the occurrence of cavity trees and the hierarchical structure among those variables. We applied that information to develop logistic models predicting cavity tree probability as a function of diameter, species group, and decay class. Inventories of cavity abundance in...
Distribution of cavity trees in midwesternold-growth and second-growth forests
Zhaofei Fan; Stephen R. Shifley; Martin A. Spetich; Frank R., III Thompson; David R. Larsen
2003-01-01
We used classification and regression tree analysis to determine the primary variables associated with the occurrence of cavity trees and the hierarchical structure among those variables. We applied that information to develop logistic models predicting cavity tree probability as a function of diameter, species group, and decay class. Inventories of cavity abundance in...
A hierarchical linear model for tree height prediction.
Vicente J. Monleon
2003-01-01
Measuring tree height is a time-consuming process. Often, tree diameter is measured and height is estimated from a published regression model. Trees used to develop these models are clustered into stands, but this structure is ignored and independence is assumed. In this study, hierarchical linear models that account explicitly for the clustered structure of the data...
Modeling individual tree survial
Quang V. Cao
2016-01-01
Information provided by growth and yield models is the basis for forest managers to make decisions on how to manage their forests. Among different types of growth models, whole-stand models offer predictions at stand level, whereas individual-tree models give detailed information at tree level. The well-known logistic regression is commonly used to predict tree...
[Hyperspectral Estimation of Apple Tree Canopy LAI Based on SVM and RF Regression].
Han, Zhao-ying; Zhu, Xi-cun; Fang, Xian-yi; Wang, Zhuo-yuan; Wang, Ling; Zhao, Geng-Xing; Jiang, Yuan-mao
2016-03-01
Leaf area index (LAI) is the dynamic index of crop population size. Hyperspectral technology can be used to estimate apple canopy LAI rapidly and nondestructively. It can be provide a reference for monitoring the tree growing and yield estimation. The Red Fuji apple trees of full bearing fruit are the researching objects. Ninety apple trees canopies spectral reflectance and LAI values were measured by the ASD Fieldspec3 spectrometer and LAI-2200 in thirty orchards in constant two years in Qixia research area of Shandong Province. The optimal vegetation indices were selected by the method of correlation analysis of the original spectral reflectance and vegetation indices. The models of predicting the LAI were built with the multivariate regression analysis method of support vector machine (SVM) and random forest (RF). The new vegetation indices, GNDVI527, ND-VI676, RVI682, FD-NVI656 and GRVI517 and the previous two main vegetation indices, NDVI670 and NDVI705, are in accordance with LAI. In the RF regression model, the calibration set decision coefficient C-R2 of 0.920 and validation set decision coefficient V-R2 of 0.889 are higher than the SVM regression model by 0.045 and 0.033 respectively. The root mean square error of calibration set C-RMSE of 0.249, the root mean square error validation set V-RMSE of 0.236 are lower than that of the SVM regression model by 0.054 and 0.058 respectively. Relative analysis of calibrating error C-RPD and relative analysis of validation set V-RPD reached 3.363 and 2.520, 0.598 and 0.262, respectively, which were higher than the SVM regression model. The measured and predicted the scatterplot trend line slope of the calibration set and validation set C-S and V-S are close to 1. The estimation result of RF regression model is better than that of the SVM. RF regression model can be used to estimate the LAI of red Fuji apple trees in full fruit period.
Log and tree sawing times for hardwood mills
Everette D. Rast
1974-01-01
Data on 6,850 logs and 1,181 trees were analyzed to predict sawing times. For both logs and trees, regression equations were derived that express (in minutes) sawing time per log or tree and per Mbf. For trees, merchantable height is expressed in number of logs as well as in feet. One of the major uses for the tables of average sawing times is as a bench mark against...
Logistic regression trees for initial selection of interesting loci in case-control studies
Nickolov, Radoslav Z; Milanov, Valentin B
2007-01-01
Modern genetic epidemiology faces the challenge of dealing with hundreds of thousands of genetic markers. The selection of a small initial subset of interesting markers for further investigation can greatly facilitate genetic studies. In this contribution we suggest the use of a logistic regression tree algorithm known as logistic tree with unbiased selection. Using the simulated data provided for Genetic Analysis Workshop 15, we show how this algorithm, with incorporation of multifactor dimensionality reduction method, can reduce an initial large pool of markers to a small set that includes the interesting markers with high probability. PMID:18466557
Decision tree analysis of factors influencing rainfall-related building damage
NASA Astrophysics Data System (ADS)
Spekkers, M. H.; Kok, M.; Clemens, F. H. L. R.; ten Veldhuis, J. A. E.
2014-04-01
Flood damage prediction models are essential building blocks in flood risk assessments. Little research has been dedicated so far to damage of small-scale urban floods caused by heavy rainfall, while there is a need for reliable damage models for this flood type among insurers and water authorities. The aim of this paper is to investigate a wide range of damage-influencing factors and their relationships with rainfall-related damage, using decision tree analysis. For this, district-aggregated claim data from private property insurance companies in the Netherlands were analysed, for the period of 1998-2011. The databases include claims of water-related damage, for example, damages related to rainwater intrusion through roofs and pluvial flood water entering buildings at ground floor. Response variables being modelled are average claim size and claim frequency, per district per day. The set of predictors include rainfall-related variables derived from weather radar images, topographic variables from a digital terrain model, building-related variables and socioeconomic indicators of households. Analyses were made separately for property and content damage claim data. Results of decision tree analysis show that claim frequency is most strongly associated with maximum hourly rainfall intensity, followed by real estate value, ground floor area, household income, season (property data only), buildings age (property data only), ownership structure (content data only) and fraction of low-rise buildings (content data only). It was not possible to develop statistically acceptable trees for average claim size, which suggest that variability in average claim size is related to explanatory variables that cannot be defined at the district scale. Cross-validation results show that decision trees were able to predict 22-26% of variance in claim frequency, which is considerably better compared to results from global multiple regression models (11-18% of variance explained). Still, a large part of the variance in claim frequency is left unexplained, which is likely to be caused by variations in data at subdistrict scale and missing explanatory variables.
Andrew T. Hudak; Nicholas L. Crookston; Jeffrey S. Evans; Michael K. Falkowski; Alistair M. S. Smith; Paul E. Gessler; Penelope Morgan
2006-01-01
We compared the utility of discrete-return light detection and ranging (lidar) data and multispectral satellite imagery, and their integration, for modeling and mapping basal area and tree density across two diverse coniferous forest landscapes in north-central Idaho. We applied multiple linear regression models subset from a suite of 26 predictor variables derived...
ERIC Educational Resources Information Center
Kitsantas, Anastasia; Kitsantas, Panagiota; Kitsantas, Thomas
2012-01-01
The purpose of this exploratory study was to assess the relative importance of a number of variables in predicting students' interest in math and/or computer science. Classification and regression trees (CART) were employed in the analysis of survey data collected from 276 college students enrolled in two U.S. and Greek universities. The results…
Duchêne, David; Duchêne, Sebastian; Ho, Simon Y W
2015-07-01
Phylogenetic estimation of evolutionary timescales has become routine in biology, forming the basis of a wide range of evolutionary and ecological studies. However, there are various sources of bias that can affect these estimates. We investigated whether tree imbalance, a property that is commonly observed in phylogenetic trees, can lead to reduced accuracy or precision of phylogenetic timescale estimates. We analysed simulated data sets with calibrations at internal nodes and at the tips, taking into consideration different calibration schemes and levels of tree imbalance. We also investigated the effect of tree imbalance on two empirical data sets: mitogenomes from primates and serial samples of the African swine fever virus. In analyses calibrated using dated, heterochronous tips, we found that tree imbalance had a detrimental impact on precision and produced a bias in which the overall timescale was underestimated. A pronounced effect was observed in analyses with shallow calibrations. The greatest decreases in accuracy usually occurred in the age estimates for medium and deep nodes of the tree. In contrast, analyses calibrated at internal nodes did not display a reduction in estimation accuracy or precision due to tree imbalance. Our results suggest that molecular-clock analyses can be improved by increasing taxon sampling, with the specific aims of including deeper calibrations, breaking up long branches and reducing tree imbalance. © 2014 John Wiley & Sons Ltd.
Identification of extremely premature infants at high risk of rehospitalization.
Ambalavanan, Namasivayam; Carlo, Waldemar A; McDonald, Scott A; Yao, Qing; Das, Abhik; Higgins, Rosemary D
2011-11-01
Extremely low birth weight infants often require rehospitalization during infancy. Our objective was to identify at the time of discharge which extremely low birth weight infants are at higher risk for rehospitalization. Data from extremely low birth weight infants in Eunice Kennedy Shriver National Institute of Child Health and Human Development Neonatal Research Network centers from 2002-2005 were analyzed. The primary outcome was rehospitalization by the 18- to 22-month follow-up, and secondary outcome was rehospitalization for respiratory causes in the first year. Using variables and odds ratios identified by stepwise logistic regression, scoring systems were developed with scores proportional to odds ratios. Classification and regression-tree analysis was performed by recursive partitioning and automatic selection of optimal cutoff points of variables. A total of 3787 infants were evaluated (mean ± SD birth weight: 787 ± 136 g; gestational age: 26 ± 2 weeks; 48% male, 42% black). Forty-five percent of the infants were rehospitalized by 18 to 22 months; 14.7% were rehospitalized for respiratory causes in the first year. Both regression models (area under the curve: 0.63) and classification and regression-tree models (mean misclassification rate: 40%-42%) were moderately accurate. Predictors for the primary outcome by regression were shunt surgery for hydrocephalus, hospital stay of >120 days for pulmonary reasons, necrotizing enterocolitis stage II or higher or spontaneous gastrointestinal perforation, higher fraction of inspired oxygen at 36 weeks, and male gender. By classification and regression-tree analysis, infants with hospital stays of >120 days for pulmonary reasons had a 66% rehospitalization rate compared with 42% without such a stay. The scoring systems and classification and regression-tree analysis models identified infants at higher risk of rehospitalization and might assist planning for care after discharge.
Identification of Extremely Premature Infants at High Risk of Rehospitalization
Carlo, Waldemar A.; McDonald, Scott A.; Yao, Qing; Das, Abhik; Higgins, Rosemary D.
2011-01-01
OBJECTIVE: Extremely low birth weight infants often require rehospitalization during infancy. Our objective was to identify at the time of discharge which extremely low birth weight infants are at higher risk for rehospitalization. METHODS: Data from extremely low birth weight infants in Eunice Kennedy Shriver National Institute of Child Health and Human Development Neonatal Research Network centers from 2002–2005 were analyzed. The primary outcome was rehospitalization by the 18- to 22-month follow-up, and secondary outcome was rehospitalization for respiratory causes in the first year. Using variables and odds ratios identified by stepwise logistic regression, scoring systems were developed with scores proportional to odds ratios. Classification and regression-tree analysis was performed by recursive partitioning and automatic selection of optimal cutoff points of variables. RESULTS: A total of 3787 infants were evaluated (mean ± SD birth weight: 787 ± 136 g; gestational age: 26 ± 2 weeks; 48% male, 42% black). Forty-five percent of the infants were rehospitalized by 18 to 22 months; 14.7% were rehospitalized for respiratory causes in the first year. Both regression models (area under the curve: 0.63) and classification and regression-tree models (mean misclassification rate: 40%–42%) were moderately accurate. Predictors for the primary outcome by regression were shunt surgery for hydrocephalus, hospital stay of >120 days for pulmonary reasons, necrotizing enterocolitis stage II or higher or spontaneous gastrointestinal perforation, higher fraction of inspired oxygen at 36 weeks, and male gender. By classification and regression-tree analysis, infants with hospital stays of >120 days for pulmonary reasons had a 66% rehospitalization rate compared with 42% without such a stay. CONCLUSIONS: The scoring systems and classification and regression-tree analysis models identified infants at higher risk of rehospitalization and might assist planning for care after discharge. PMID:22007016
Explaining Match Outcome During The Men’s Basketball Tournament at The Olympic Games
Leicht, Anthony S.; Gómez, Miguel A.; Woods, Carl T.
2017-01-01
In preparation for the Olympics, there is a limited opportunity for coaches and athletes to interact regularly with team performance indicators providing important guidance to coaches for enhanced match success at the elite level. This study examined the relationship between match outcome and team performance indicators during men’s basketball tournaments at the Olympic Games. Twelve team performance indicators were collated from all men’s teams and matches during the basketball tournament of the 2004-2016 Olympic Games (n = 156). Linear and non-linear analyses examined the relationship between match outcome and team performance indicator characteristics; namely, binary logistic regression and a conditional interference (CI) classification tree. The most parsimonious logistic regression model retained ‘assists’, ‘defensive rebounds’, ‘field-goal percentage’, ‘fouls’, ‘fouls against’, ‘steals’ and ‘turnovers’ (delta AIC <0.01; Akaike weight = 0.28) with a classification accuracy of 85.5%. Conversely, four performance indicators were retained with the CI classification tree with an average classification accuracy of 81.4%. However, it was the combination of ‘field-goal percentage’ and ‘defensive rebounds’ that provided the greatest probability of winning (93.2%). Match outcome during the men’s basketball tournaments at the Olympic Games was identified by a unique combination of performance indicators. Despite the average model accuracy being marginally higher for the logistic regression analysis, the CI classification tree offered a greater practical utility for coaches through its resolution of non-linear phenomena to guide team success. Key points A unique combination of team performance indicators explained 93.2% of winning observations in men’s basketball at the Olympics. Monitoring of these team performance indicators may provide coaches with the capability to devise multiple game plans or strategies to enhance their likelihood of winning. Incorporation of machine learning techniques with team performance indicators may provide a valuable and strategic approach to explain patterns within multivariate datasets in sport science. PMID:29238245
Jon C. Regelbrugge
1993-01-01
Abstract. We modeled tree mortality occurring two years following wildfire in Pinus ponderosa forests using data from 1275 trees in 25 stands burned during the 1987 Stanislaus Complex fires. We used logistic regression analysis to develop models relating the probability of wildfire-induced mortality with tree size and fire severity for Pinus ponderosa, Calocedrus...
Finding structure in data using multivariate tree boosting
Miller, Patrick J.; Lubke, Gitta H.; McArtor, Daniel B.; Bergeman, C. S.
2016-01-01
Technology and collaboration enable dramatic increases in the size of psychological and psychiatric data collections, but finding structure in these large data sets with many collected variables is challenging. Decision tree ensembles such as random forests (Strobl, Malley, & Tutz, 2009) are a useful tool for finding structure, but are difficult to interpret with multiple outcome variables which are often of interest in psychology. To find and interpret structure in data sets with multiple outcomes and many predictors (possibly exceeding the sample size), we introduce a multivariate extension to a decision tree ensemble method called gradient boosted regression trees (Friedman, 2001). Our extension, multivariate tree boosting, is a method for nonparametric regression that is useful for identifying important predictors, detecting predictors with nonlinear effects and interactions without specification of such effects, and for identifying predictors that cause two or more outcome variables to covary. We provide the R package ‘mvtboost’ to estimate, tune, and interpret the resulting model, which extends the implementation of univariate boosting in the R package ‘gbm’ (Ridgeway et al., 2015) to continuous, multivariate outcomes. To illustrate the approach, we analyze predictors of psychological well-being (Ryff & Keyes, 1995). Simulations verify that our approach identifies predictors with nonlinear effects and achieves high prediction accuracy, exceeding or matching the performance of (penalized) multivariate multiple regression and multivariate decision trees over a wide range of conditions. PMID:27918183
Estimating carbon and showing impacts of drought using satellite data in regression-tree models
Boyte, Stephen; Wylie, Bruce K.; Howard, Danny; Dahal, Devendra; Gilmanov, Tagir G.
2018-01-01
Integrating spatially explicit biogeophysical and remotely sensed data into regression-tree models enables the spatial extrapolation of training data over large geographic spaces, allowing a better understanding of broad-scale ecosystem processes. The current study presents annual gross primary production (GPP) and annual ecosystem respiration (RE) for 2000–2013 in several short-statured vegetation types using carbon flux data from towers that are located strategically across the conterminous United States (CONUS). We calculate carbon fluxes (annual net ecosystem production [NEP]) for each year in our study period, which includes 2012 when drought and higher-than-normal temperatures influence vegetation productivity in large parts of the study area. We present and analyse carbon flux dynamics in the CONUS to better understand how drought affects GPP, RE, and NEP. Model accuracy metrics show strong correlation coefficients (r) (r ≥ 94%) between training and estimated data for both GPP and RE. Overall, average annual GPP, RE, and NEP are relatively constant throughout the study period except during 2012 when almost 60% less carbon is sequestered than normal. These results allow us to conclude that this modelling method effectively estimates carbon dynamics through time and allows the exploration of impacts of meteorological anomalies and vegetation types on carbon dynamics.
Liu, Yang; Lü, Yi-he; Zheng, Hai-feng; Chen, Li-ding
2010-05-01
Based on the 10-day SPOT VEGETATION NDVI data and the daily meteorological data from 1998 to 2007 in Yan' an City, the main meteorological variables affecting the annual and interannual variations of NDVI were determined by using regression tree. It was found that the effects of test meteorological variables on the variability of NDVI differed with seasons and time lags. Temperature and precipitation were the most important meteorological variables affecting the annual variation of NDVI, and the average highest temperature was the most important meteorological variable affecting the inter-annual variation of NDVI. Regression tree was very powerful in determining the key meteorological variables affecting NDVI variation, but could not build quantitative relations between NDVI and meteorological variables, which limited its further and wider application.
Devarajan, Karthik; Parsons, Theodore; Wang, Qiong; O'Neill, Raymond; Solomides, Charalambos; Peiper, Stephen C.; Testa, Joseph R.; Uzzo, Robert; Yang, Haifeng
2017-01-01
Intratumoral heterogeneity (ITH) is a prominent feature of kidney cancer. It is not known whether it has utility in finding associations between protein expression and clinical parameters. We used ITH that is detected by immunohistochemistry (IHC) to aid the association analysis between the loss of SWI/SNF components and clinical parameters.160 ccRCC tumors (40 per tumor stage) were used to generate tissue microarray (TMA). Four foci from different regions of each tumor were selected. IHC was performed against PBRM1, ARID1A, SETD2, SMARCA4, and SMARCA2. Statistical analyses were performed to correlate biomarker losses with patho-clinical parameters. Categorical variables were compared between groups using Fisher's exact tests. Univariate and multivariable analyses were used to correlate biomarker changes and patient survivals. Multivariable analyses were performed by constructing decision trees using the classification and regression trees (CART) methodology. IHC detected widespread ITH in ccRCC tumors. The statistical analysis of the “Truncal loss” (root loss) found additional correlations between biomarker losses and tumor stages than the traditional “Loss in tumor (total)”. Losses of SMARCA4 or SMARCA2 significantly improved prognosis for overall survival (OS). Losses of PBRM1, ARID1A or SETD2 had the opposite effect. Thus “Truncal Loss” analysis revealed hidden links between protein losses and patient survival in ccRCC. PMID:28445125
Amirabadizadeh, Alireza; Nezami, Hossein; Vaughn, Michael G; Nakhaee, Samaneh; Mehrpour, Omid
2018-05-12
Substance abuse exacts considerable social and health care burdens throughout the world. The aim of this study was to create a prediction model to better identify risk factors for drug use. A prospective cross-sectional study was conducted in South Khorasan Province, Iran. Of the total of 678 eligible subjects, 70% (n: 474) were randomly selected to provide a training set for constructing decision tree and multiple logistic regression (MLR) models. The remaining 30% (n: 204) were employed in a holdout sample to test the performance of the decision tree and MLR models. Predictive performance of different models was analyzed by the receiver operating characteristic (ROC) curve using the testing set. Independent variables were selected from demographic characteristics and history of drug use. For the decision tree model, the sensitivity and specificity for identifying people at risk for drug abuse were 66% and 75%, respectively, while the MLR model was somewhat less effective at 60% and 73%. Key independent variables in the analyses included first substance experience, age at first drug use, age, place of residence, history of cigarette use, and occupational and marital status. While study findings are exploratory and lack generalizability they do suggest that the decision tree model holds promise as an effective classification approach for identifying risk factors for drug use. Convergent with prior research in Western contexts is that age of drug use initiation was a critical factor predicting a substance use disorder.
Ensemble of trees approaches to risk adjustment for evaluating a hospital's performance.
Liu, Yang; Traskin, Mikhail; Lorch, Scott A; George, Edward I; Small, Dylan
2015-03-01
A commonly used method for evaluating a hospital's performance on an outcome is to compare the hospital's observed outcome rate to the hospital's expected outcome rate given its patient (case) mix and service. The process of calculating the hospital's expected outcome rate given its patient mix and service is called risk adjustment (Iezzoni 1997). Risk adjustment is critical for accurately evaluating and comparing hospitals' performances since we would not want to unfairly penalize a hospital just because it treats sicker patients. The key to risk adjustment is accurately estimating the probability of an Outcome given patient characteristics. For cases with binary outcomes, the method that is commonly used in risk adjustment is logistic regression. In this paper, we consider ensemble of trees methods as alternatives for risk adjustment, including random forests and Bayesian additive regression trees (BART). Both random forests and BART are modern machine learning methods that have been shown recently to have excellent performance for prediction of outcomes in many settings. We apply these methods to carry out risk adjustment for the performance of neonatal intensive care units (NICU). We show that these ensemble of trees methods outperform logistic regression in predicting mortality among babies treated in NICU, and provide a superior method of risk adjustment compared to logistic regression.
Modeling Caribbean tree stem diameters from tree height and crown width measurements
Thomas Brandeis; KaDonna Randolph; Mike Strub
2009-01-01
Regression models to predict diameter at breast height (DBH) as a function of tree height and maximum crown radius were developed for Caribbean forests based on data collected by the U.S. Forest Service in the Commonwealth of Puerto Rico and Territory of the U.S. Virgin Islands. The model predicting DBH from tree height fit reasonably well (R2 = 0.7110), with...
Perceived Organizational Support for Enhancing Welfare at Work: A Regression Tree Model
Giorgi, Gabriele; Dubin, David; Perez, Javier Fiz
2016-01-01
When trying to examine outcomes such as welfare and well-being, research tends to focus on main effects and take into account limited numbers of variables at a time. There are a number of techniques that may help address this problem. For example, many statistical packages available in R provide easy-to-use methods of modeling complicated analysis such as classification and tree regression (i.e., recursive partitioning). The present research illustrates the value of recursive partitioning in the prediction of perceived organizational support in a sample of more than 6000 Italian bankers. Utilizing the tree function party package in R, we estimated a regression tree model predicting perceived organizational support from a multitude of job characteristics including job demand, lack of job control, lack of supervisor support, training, etc. The resulting model appears particularly helpful in pointing out several interactions in the prediction of perceived organizational support. In particular, training is the dominant factor. Another dimension that seems to influence organizational support is reporting (perceived communication about safety and stress concerns). Results are discussed from a theoretical and methodological point of view. PMID:28082924
Prediction of strontium bromide laser efficiency using cluster and decision tree analysis
NASA Astrophysics Data System (ADS)
Iliev, Iliycho; Gocheva-Ilieva, Snezhana; Kulin, Chavdar
2018-01-01
Subject of investigation is a new high-powered strontium bromide (SrBr2) vapor laser emitting in multiline region of wavelengths. The laser is an alternative to the atom strontium lasers and electron free lasers, especially at the line 6.45 μm which line is used in surgery for medical processing of biological tissues and bones with minimal damage. In this paper the experimental data from measurements of operational and output characteristics of the laser are statistically processed by means of cluster analysis and tree-based regression techniques. The aim is to extract the more important relationships and dependences from the available data which influence the increase of the overall laser efficiency. There are constructed and analyzed a set of cluster models. It is shown by using different cluster methods that the seven investigated operational characteristics (laser tube diameter, length, supplied electrical power, and others) and laser efficiency are combined in 2 clusters. By the built regression tree models using Classification and Regression Trees (CART) technique there are obtained dependences to predict the values of efficiency, and especially the maximum efficiency with over 95% accuracy.
Digression and Value Concatenation to Enable Privacy-Preserving Regression.
Li, Xiao-Bai; Sarkar, Sumit
2014-09-01
Regression techniques can be used not only for legitimate data analysis, but also to infer private information about individuals. In this paper, we demonstrate that regression trees, a popular data-analysis and data-mining technique, can be used to effectively reveal individuals' sensitive data. This problem, which we call a "regression attack," has not been addressed in the data privacy literature, and existing privacy-preserving techniques are not appropriate in coping with this problem. We propose a new approach to counter regression attacks. To protect against privacy disclosure, our approach introduces a novel measure, called digression , which assesses the sensitive value disclosure risk in the process of building a regression tree model. Specifically, we develop an algorithm that uses the measure for pruning the tree to limit disclosure of sensitive data. We also propose a dynamic value-concatenation method for anonymizing data, which better preserves data utility than a user-defined generalization scheme commonly used in existing approaches. Our approach can be used for anonymizing both numeric and categorical data. An experimental study is conducted using real-world financial, economic and healthcare data. The results of the experiments demonstrate that the proposed approach is very effective in protecting data privacy while preserving data quality for research and analysis.
Feedback-driven response to multidecadal climatic variability at an alpine treeline
Alftine, K.J.; Malanson, G.P.; Fagre, D.B.
2003-01-01
The Pacific Decadal Oscillation (PDO) has significant climatological and ecological effects in northwestern North America. Its possible effects and their modification by feedbacks are examined in the forest-tundra ecotone in Glacier National Park, Montana, USA. Tree ring samples were collected to estimate establishment dates in 10 quadrats. Age-diameter regressions were used to estimate the ages of uncored trees. The temporal pattern of establishment and survival was compared to the pattern of the PDO. A wave of establishment began in the mid-1940s, rose to a peak rate in the mid-1970s, and dropped precipitously beginning ca. 1980 to near zero for the 1990s. The period of establishment primarily coincided with the negative phase of the PDO, but the establishment and survival pattern is not correlated with the PDO index. The pattern indicates a period during which establishment was possible and was augmented by positive feedback from surviving trees. Snow may be the most important factor in the feedback, but studies indicate that its effects vary locally. Spatially differentiated analyses of decadal or longer periodicity may elucidate responses to climatic variation. ?? 2003 by V. H. Winston and Son, Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Lilly, P.; Yanai, R. D.; Buckley, H. L.; Case, B. S.; Woollons, R. C.; Holdaway, R. J.; Johnson, J.
2016-12-01
Calculations of forest biomass and elemental content require many measurements and models, each contributing uncertainty to the final estimates. While sampling error is commonly reported, based on replicate plots, error due to uncertainty in the regression used to estimate biomass from tree diameter is usually not quantified. Some published estimates of uncertainty due to the regression models have used the uncertainty in the prediction of individuals, ignoring uncertainty in the mean, while others have propagated uncertainty in the mean while ignoring individual variation. Using the simple case of the calcium concentration of sugar maple leaves, we compare the variation among individuals (the standard deviation) to the uncertainty in the mean (the standard error) and illustrate the declining importance in the prediction of individual concentrations as the number of individuals increases. For allometric models, the analogous statistics are the prediction interval (or the residual variation in the model fit) and the confidence interval (describing the uncertainty in the best fit model). The effect of propagating these two sources of error is illustrated using the mass of sugar maple foliage. The uncertainty in individual tree predictions was large for plots with few trees; for plots with 30 trees or more, the uncertainty in individuals was less important than the uncertainty in the mean. Authors of previously published analyses have reanalyzed their data to show the magnitude of these two sources of uncertainty in scales ranging from experimental plots to entire countries. The most correct analysis will take both sources of uncertainty into account, but for practical purposes, country-level reports of uncertainty in carbon stocks, as required by the IPCC, can ignore the uncertainty in individuals. Ignoring the uncertainty in the mean will lead to exaggerated estimates of confidence in estimates of forest biomass and carbon and nutrient contents.
Collevatti, Rosane G; Rodrigues, Eduardo E; Vitorino, Luciana C; Lima-Ribeiro, Matheus S; Chaves, Lázaro J; Telles, Mariana P C
2018-04-20
Spatial distribution of species genetic diversity is often driven by geographical distance (isolation by distance) or environmental conditions (isolation by environment), especially under climate change scenarios such as Quaternary glaciations. Here, we used coalescent analyses coupled with ecological niche modelling (ENM), spatially explicit quantile regression analyses and the multiple matrix regression with randomization (MMRR) approach to unravel the patterns of genetic differentiation in the widely distributed Neotropical savanna tree, Hancornia speciosa (Apocynaceae). Due to its high morphological differentiation, the species was originally classified into six botanical varieties by Monachino, and has recently been recognized as only two varieties by Flora do Brasil 2020. Thus, H. speciosa is a good biological model for learning about evolution of phenotypic plasticity under genetic and ecological effects, and predicting their responses to changing environmental conditions. We sampled 28 populations (777 individuals) of Monachino's four varieties of H. speciosa and used seven microsatellite loci to genotype them. Bayesian clustering showed five distinct genetic groups (K = 5) with high admixture among Monachino's varieties, mainly among populations in the central area of the species geographical range. Genetic differentiation among Monachino's varieties was lower than the genetic differentiation among populations within varieties, with higher within-population inbreeding. A high historical connectivity among populations of the central Cerrado shown by coalescent analyses may explain the high admixture among varieties. In addition, areas of higher climatic suitability also presented higher genetic diversity in such a way that the wide historical refugium across central Brazil might have promoted the long-term connectivity among populations. Yet, FST was significantly related to geographic distances, but not to environmental distances, and coalescent analyses and ENM predicted a demographical scenario of quasi-stability through time. Our findings show that demographical history and isolation by distance, but not isolation by environment, drove genetic differentiation of populations. Finally, the genetic clusters do not support the two recently recognized botanical varieties of H. speciosa, but partially support Monachino's classification at least for the four sampled varieties, similar to morphological variation.
Decision trees in epidemiological research.
Venkatasubramaniam, Ashwini; Wolfson, Julian; Mitchell, Nathan; Barnes, Timothy; JaKa, Meghan; French, Simone
2017-01-01
In many studies, it is of interest to identify population subgroups that are relatively homogeneous with respect to an outcome. The nature of these subgroups can provide insight into effect mechanisms and suggest targets for tailored interventions. However, identifying relevant subgroups can be challenging with standard statistical methods. We review the literature on decision trees, a family of techniques for partitioning the population, on the basis of covariates, into distinct subgroups who share similar values of an outcome variable. We compare two decision tree methods, the popular Classification and Regression tree (CART) technique and the newer Conditional Inference tree (CTree) technique, assessing their performance in a simulation study and using data from the Box Lunch Study, a randomized controlled trial of a portion size intervention. Both CART and CTree identify homogeneous population subgroups and offer improved prediction accuracy relative to regression-based approaches when subgroups are truly present in the data. An important distinction between CART and CTree is that the latter uses a formal statistical hypothesis testing framework in building decision trees, which simplifies the process of identifying and interpreting the final tree model. We also introduce a novel way to visualize the subgroups defined by decision trees. Our novel graphical visualization provides a more scientifically meaningful characterization of the subgroups identified by decision trees. Decision trees are a useful tool for identifying homogeneous subgroups defined by combinations of individual characteristics. While all decision tree techniques generate subgroups, we advocate the use of the newer CTree technique due to its simplicity and ease of interpretation.
Indicators of Terrorism Vulnerability in Africa
2015-03-26
the terror threat and vulnerabilities across Africa. Key words: Terrorism, Africa, Negative Binomial Regression, Classification Tree iv I would like...31 Metrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32 Log -likelihood...70 viii Page 5.3 Classification Tree Description
NASA Astrophysics Data System (ADS)
Di, Nur Faraidah Muhammad; Satari, Siti Zanariah
2017-05-01
Outlier detection in linear data sets has been done vigorously but only a small amount of work has been done for outlier detection in circular data. In this study, we proposed multiple outliers detection in circular regression models based on the clustering algorithm. Clustering technique basically utilizes distance measure to define distance between various data points. Here, we introduce the similarity distance based on Euclidean distance for circular model and obtain a cluster tree using the single linkage clustering algorithm. Then, a stopping rule for the cluster tree based on the mean direction and circular standard deviation of the tree height is proposed. We classify the cluster group that exceeds the stopping rule as potential outlier. Our aim is to demonstrate the effectiveness of proposed algorithms with the similarity distances in detecting the outliers. It is found that the proposed methods are performed well and applicable for circular regression model.
Kadiyala, Akhil; Kaur, Devinder; Kumar, Ashok
2013-02-01
The present study developed a novel approach to modeling indoor air quality (IAQ) of a public transportation bus by the development of hybrid genetic-algorithm-based neural networks (also known as evolutionary neural networks) with input variables optimized from using the regression trees, referred as the GART approach. This study validated the applicability of the GART modeling approach in solving complex nonlinear systems by accurately predicting the monitored contaminants of carbon dioxide (CO2), carbon monoxide (CO), nitric oxide (NO), sulfur dioxide (SO2), 0.3-0.4 microm sized particle numbers, 0.4-0.5 microm sized particle numbers, particulate matter (PM) concentrations less than 1.0 microm (PM10), and PM concentrations less than 2.5 microm (PM2.5) inside a public transportation bus operating on 20% grade biodiesel in Toledo, OH. First, the important variables affecting each monitored in-bus contaminant were determined using regression trees. Second, the analysis of variance was used as a complimentary sensitivity analysis to the regression tree results to determine a subset of statistically significant variables affecting each monitored in-bus contaminant. Finally, the identified subsets of statistically significant variables were used as inputs to develop three artificial neural network (ANN) models. The models developed were regression tree-based back-propagation network (BPN-RT), regression tree-based radial basis function network (RBFN-RT), and GART models. Performance measures were used to validate the predictive capacity of the developed IAQ models. The results from this approach were compared with the results obtained from using a theoretical approach and a generalized practicable approach to modeling IAQ that included the consideration of additional independent variables when developing the aforementioned ANN models. The hybrid GART models were able to capture majority of the variance in the monitored in-bus contaminants. The genetic-algorithm-based neural network IAQ models outperformed the traditional ANN methods of the back-propagation and the radial basis function networks. The novelty of this research is the development of a novel approach to modeling vehicular indoor air quality by integration of the advanced methods of genetic algorithms, regression trees, and the analysis of variance for the monitored in-vehicle gaseous and particulate matter contaminants, and comparing the results obtained from using the developed approach with conventional artificial intelligence techniques of back propagation networks and radial basis function networks. This study validated the newly developed approach using holdout and threefold cross-validation methods. These results are of great interest to scientists, researchers, and the public in understanding the various aspects of modeling an indoor microenvironment. This methodology can easily be extended to other fields of study also.
Pollen-limited reproduction in blue oak: Implications for wind pollination in fragmented populations
Knapp, E.E.; Goedde, M.A.; Rice, K.J.
2001-01-01
Human activities are fragmenting forests and woodlands worldwide, but the impact of reduced tree population densities on pollen transfer in wind-pollinated trees is poorly understood. In a 4-year study, we evaluated relationships among stand density, pollen availability, and seed production in a thinned and fragmented population of blue oak (Quercus douglasii). Geographic coordinates were established and flowering interval determined for 100 contiguous trees. The number of neighboring trees within 60 m that released pollen during each tree's flowering period was calculated and relationships with acorn production explored using multiple regression. We evaluated the effects of female flower production, average temperature, and relative humidity during the pollination period, and number of pollen-producing neighbors on individual trees' acorn production. All factors except temperature were significant in at least one of the years of our study, but the combination of factors influencing acorn production varied among years. In 1996, a year of large acorn crop size, acorn production was significantly positively associated with number of neighboring pollen producers and density of female flowers. In 1997, 1998, and 1999, many trees produced few or no acorns, and significant associations between number of pollen-producing neighbors and acorn production were only apparent among moderately to highly reproductive trees. Acorn production by these reproductive trees in 1997 was significantly positively associated with number of neighboring pollen producers and significantly negatively associated with average relative humidity during the pollination period. In 1998, no analysis was possible, because too few trees produced a moderate to large acorn crop. Only density of female flowers was significantly associated with acorn production of moderately to highly reproductive trees in 1999. The effect of spatial scale was also investigated by conducting analyses with pollen producers counted in radii ranging from 30 m to 80 m. The association between number of pollen-producing neighbors and acorn production was strongest when neighborhood sizes of 60 m or larger were considered. Our results suggest that fragmentation and thinning of blue oak woodlands may reduce pollen availability and limit reproduction in this wind-pollinated species.
Comprehensive database of diameter-based biomass regressions for North American tree species
Jennifer C. Jenkins; David C. Chojnacky; Linda S. Heath; Richard A. Birdsey
2004-01-01
A database consisting of 2,640 equations compiled from the literature for predicting the biomass of trees and tree components from diameter measurements of species found in North America. Bibliographic information, geographic locations, diameter limits, diameter and biomass units, equation forms, statistical errors, and coefficients are provided for each equation,...
Xi, Zhenxiang; Liu, Liang; Davis, Charles C
2015-11-01
The development and application of coalescent methods are undergoing rapid changes. One little explored area that bears on the application of gene-tree-based coalescent methods to species tree estimation is gene informativeness. Here, we investigate the accuracy of these coalescent methods when genes have minimal phylogenetic information, including the implementation of the multilocus bootstrap approach. Using simulated DNA sequences, we demonstrate that genes with minimal phylogenetic information can produce unreliable gene trees (i.e., high error in gene tree estimation), which may in turn reduce the accuracy of species tree estimation using gene-tree-based coalescent methods. We demonstrate that this problem can be alleviated by sampling more genes, as is commonly done in large-scale phylogenomic analyses. This applies even when these genes are minimally informative. If gene tree estimation is biased, however, gene-tree-based coalescent analyses will produce inconsistent results, which cannot be remedied by increasing the number of genes. In this case, it is not the gene-tree-based coalescent methods that are flawed, but rather the input data (i.e., estimated gene trees). Along these lines, the commonly used program PhyML has a tendency to infer one particular bifurcating topology even though it is best represented as a polytomy. We additionally corroborate these findings by analyzing the 183-locus mammal data set assembled by McCormack et al. (2012) using ultra-conserved elements (UCEs) and flanking DNA. Lastly, we demonstrate that when employing the multilocus bootstrap approach on this 183-locus data set, there is no strong conflict between species trees estimated from concatenation and gene-tree-based coalescent analyses, as has been previously suggested by Gatesy and Springer (2014). Copyright © 2015 Elsevier Inc. All rights reserved.
Cheng, Feon W; Gao, Xiang; Bao, Le; Mitchell, Diane C; Wood, Craig; Sliwinski, Martin J; Smiciklas-Wright, Helen; Still, Christopher D; Rolston, David D K; Jensen, Gordon L
2017-07-01
To examine the risk factors of developing functional decline and make probabilistic predictions by using a tree-based method that allows higher order polynomials and interactions of the risk factors. The conditional inference tree analysis, a data mining approach, was used to construct a risk stratification algorithm for developing functional limitation based on BMI and other potential risk factors for disability in 1,951 older adults without functional limitations at baseline (baseline age 73.1 ± 4.2 y). We also analyzed the data with multivariate stepwise logistic regression and compared the two approaches (e.g., cross-validation). Over a mean of 9.2 ± 1.7 years of follow-up, 221 individuals developed functional limitation. Higher BMI, age, and comorbidity were consistently identified as significant risk factors for functional decline using both methods. Based on these factors, individuals were stratified into four risk groups via the conditional inference tree analysis. Compared to the low-risk group, all other groups had a significantly higher risk of developing functional limitation. The odds ratio comparing two extreme categories was 9.09 (95% confidence interval: 4.68, 17.6). Higher BMI, age, and comorbid disease were consistently identified as significant risk factors for functional decline among older individuals across all approaches and analyses. © 2017 The Obesity Society.
Towards lidar-based mapping of tree age at the Arctic forest tundra ecotone.
NASA Astrophysics Data System (ADS)
Jensen, J.; Maguire, A.; Oelkers, R.; Andreu-Hayles, L.; Boelman, N.; D'Arrigo, R.; Griffin, K. L.; Jennewein, J. S.; Hiers, E.; Meddens, A. J.; Russell, M.; Vierling, L. A.; Eitel, J.
2017-12-01
Climate change may cause spatial shifts in the forest-tundra ecotone (FTE). To improve our ability to study these spatial shifts, information on tree demography along the FTE is needed. The objective of this study was to assess the suitability of lidar derived tree heights as a surrogate for tree age. We calculated individual tree age from 48 tree cores collected at basal height from white spruce (Picea glauca) within the FTE in northern Alaska. Tree height was obtained from terrestrial lidar scans (<1cm spatial resolution). The relationship between age and height was examined using a linear regression model forced through the origin. We found a very strong predictive relationship between tree height and age (R2 = 0.90, RMSE = 19.34 years) for trees that ranged between 14 to 230 years. Separate regression models were also developed for small (height < 3 m) and large trees (height >= 3 m), yielding strong predictive relationships between height and age (R2 = 0.86, RMSE 12.21 years, and R2 = 0.93, RMSE = 25.16 years, respectively). The slope coefficient for small and large tree models (16.83 and 12.98 years/m, respectively) indicate that small trees grow 1.3 times faster than large trees at these FTE study sites. Although a strong, predictive relationship between age and height is uncommon in light-limited forest environments, our findings suggest that the sparseness of trees within the FTE may explain the strong tree height-age relationships found herein. Further analysis of 36 additional tree cores recently collected within the FTE near Inuvik, Canada will be performed. Our preliminary analysis suggests that lidar derived tree height could be a reliable proxy for tree age at the FTE, thereby establishing a new technique for scaling tree structure and demographics across larger portions of this sensitive ecotone.
Rifai, Sami W; Urquiza Muñoz, José D; Negrón-Juárez, Robinson I; Ramírez Arévalo, Fredy R; Tello-Espinoza, Rodil; Vanderwel, Mark C; Lichstein, Jeremy W; Chambers, Jeffrey Q; Bohlman, Stephanie A
2016-10-01
Wind disturbance can create large forest blowdowns, which greatly reduces live biomass and adds uncertainty to the strength of the Amazon carbon sink. Observational studies from within the central Amazon have quantified blowdown size and estimated total mortality but have not determined which trees are most likely to die from a catastrophic wind disturbance. Also, the impact of spatial dependence upon tree mortality from wind disturbance has seldom been quantified, which is important because wind disturbance often kills clusters of trees due to large treefalls killing surrounding neighbors. We examine (1) the causes of differential mortality between adult trees from a 300-ha blowdown event in the Peruvian region of the northwestern Amazon, (2) how accounting for spatial dependence affects mortality predictions, and (3) how incorporating both differential mortality and spatial dependence affect the landscape level estimation of necromass produced from the blowdown. Standard regression and spatial regression models were used to estimate how stem diameter, wood density, elevation, and a satellite-derived disturbance metric influenced the probability of tree death from the blowdown event. The model parameters regarding tree characteristics, topography, and spatial autocorrelation of the field data were then used to determine the consequences of non-random mortality for landscape production of necromass through a simulation model. Tree mortality was highly non-random within the blowdown, where tree mortality rates were highest for trees that were large, had low wood density, and were located at high elevation. Of the differential mortality models, the non-spatial models overpredicted necromass, whereas the spatial model slightly underpredicted necromass. When parameterized from the same field data, the spatial regression model with differential mortality estimated only 7.5% more dead trees across the entire blowdown than the random mortality model, yet it estimated 51% greater necromass. We suggest that predictions of forest carbon loss from wind disturbance are sensitive to not only the underlying spatial dependence of observations, but also the biological differences between individuals that promote differential levels of mortality. © 2016 by the Ecological Society of America.
O’Connor, Christopher D.; Lynch, Ann M.
2016-01-01
A significant concern about Metabolic Scaling Theory (MST) in real forests relates to consistent differences between the values of power law scaling exponents of tree primary size measures used to estimate mass and those predicted by MST. Here we consider why observed scaling exponents for diameter and height relationships deviate from MST predictions across three semi-arid conifer forests in relation to: (1) tree condition and physical form, (2) the level of inter-tree competition (e.g. open vs closed stand structure), (3) increasing tree age, and (4) differences in site productivity. Scaling exponent values derived from non-linear least-squares regression for trees in excellent condition (n = 381) were above the MST prediction at the 95% confidence level, while the exponent for trees in good condition were no different than MST (n = 926). Trees that were in fair or poor condition, characterized as diseased, leaning, or sparsely crowned had exponent values below MST predictions (n = 2,058), as did recently dead standing trees (n = 375). Exponent value of the mean-tree model that disregarded tree condition (n = 3,740) was consistent with other studies that reject MST scaling. Ostensibly, as stand density and competition increase trees exhibited greater morphological plasticity whereby the majority had characteristically fair or poor growth forms. Fitting by least-squares regression biases the mean-tree model scaling exponent toward values that are below MST idealized predictions. For 368 trees from Arizona with known establishment dates, increasing age had no significant impact on expected scaling. We further suggest height to diameter ratios below MST relate to vertical truncation caused by limitation in plant water availability. Even with environmentally imposed height limitation, proportionality between height and diameter scaling exponents were consistent with the predictions of MST. PMID:27391084
Swetnam, Tyson L; O'Connor, Christopher D; Lynch, Ann M
2016-01-01
A significant concern about Metabolic Scaling Theory (MST) in real forests relates to consistent differences between the values of power law scaling exponents of tree primary size measures used to estimate mass and those predicted by MST. Here we consider why observed scaling exponents for diameter and height relationships deviate from MST predictions across three semi-arid conifer forests in relation to: (1) tree condition and physical form, (2) the level of inter-tree competition (e.g. open vs closed stand structure), (3) increasing tree age, and (4) differences in site productivity. Scaling exponent values derived from non-linear least-squares regression for trees in excellent condition (n = 381) were above the MST prediction at the 95% confidence level, while the exponent for trees in good condition were no different than MST (n = 926). Trees that were in fair or poor condition, characterized as diseased, leaning, or sparsely crowned had exponent values below MST predictions (n = 2,058), as did recently dead standing trees (n = 375). Exponent value of the mean-tree model that disregarded tree condition (n = 3,740) was consistent with other studies that reject MST scaling. Ostensibly, as stand density and competition increase trees exhibited greater morphological plasticity whereby the majority had characteristically fair or poor growth forms. Fitting by least-squares regression biases the mean-tree model scaling exponent toward values that are below MST idealized predictions. For 368 trees from Arizona with known establishment dates, increasing age had no significant impact on expected scaling. We further suggest height to diameter ratios below MST relate to vertical truncation caused by limitation in plant water availability. Even with environmentally imposed height limitation, proportionality between height and diameter scaling exponents were consistent with the predictions of MST.
Effects of mining-associated lead and zinc soil contamination on native floristic quality.
Struckhoff, Matthew A; Stroh, Esther D; Grabner, Keith W
2013-04-15
We assessed the quality of plant communities across a range of lead (Pb) and zinc (Zn) soil concentrations at a variety of sites associated with Pb mining in southeast Missouri, USA. In a novel application, two standard floristic quality measures, Mean Coefficient of Conservatism (Mean C) and Floristic Quality Index (FQI), were examined in relation to concentrations of Pb and Zn, soil nutrients, and other soil characteristics. Nonmetric Multidimensional Scaling and Regression Tree Analyses identified soil Pb and Zn concentrations as primary explanatory variables for plant community composition and indicated negative relationships between soil metals concentrations and both Mean C and FQI. Univariate regression also demonstrated significant negative relationships between metals concentrations and floristic quality. The negative effects of metals in native soils with otherwise relatively undisturbed conditions indicate that elevated soil metals concentrations adversely affect native floristic quality where no other human disturbance is evident. Published by Elsevier Ltd.
Effects of mining-associated lead and zinc soil contamination on native floristic quality
Struckhoff, Matthew A.; Stroh, Esther D.; Grabner, Keith W.
2013-01-01
We assessed the quality of plant communities across a range of lead (Pb) and zinc (Zn) soil concentrations at a variety of sites associated with Pb mining in southeast Missouri, USA. In a novel application, two standard floristic quality measures, Mean Coefficient of Conservatism (Mean C) and Floristic Quality Index (FQI), were examined in relation to concentrations of Pb and Zn, soil nutrients, and other soil characteristics. Nonmetric Multidimensional Scaling and Regression Tree Analyses identified soil Pb and Zn concentrations as primary explanatory variables for plant community composition and indicated negative relationships between soil metals concentrations and both Mean C and FQI. Univariate regression also demonstrated significant negative relationships between metals concentrations and floristic quality. The negative effects of metals in native soils with otherwise relatively undisturbed conditions indicate that elevated soil metals concentrations adversely affect native floristic quality where no other human disturbance is evident.
McLaughlin, Samuel B; Wullschleger, Stan D; Nosal, Miloslav
2003-11-01
To evaluate indicators of whole-tree physiological responses to climate stress, we determined seasonal, daily and diurnal patterns of growth and water use in 10 yellow poplar (Liriodendron tulipifera L.) trees in a stand recently released from competition. Precise measurements of stem increment and sap flow made with automated electronic dendrometers and thermal dissipation probes, respectively, indicated close temporal linkages between water use and patterns of stem shrinkage and swelling during daily cycles of water depletion and recharge of extensible outer-stem tissues. These cycles also determined net daily basal area increment. Multivariate regression models based on a 123-day data series showed that daily diameter increments were related negatively to vapor pressure deficit (VPD), but positively to precipitation and temperature. The same model form with slight changes in coefficients yielded coefficients of determination of about 0.62 (0.57-0.66) across data subsets that included widely variable growth rates and VPDs. Model R2 was improved to 0.75 by using 3-day running mean daily growth data. Rapid recovery of stem diameter growth following short-term, diurnal reductions in VPD indicated that water stored in extensible stem tissues was part of a fast recharge system that limited hydration changes in the cambial zone during periods of water stress. There were substantial differences in the seasonal dynamics of growth among individual trees, and analyses indicated that faster-growing trees were more positively affected by precipitation, solar irradiance and temperature and more negatively affected by high VPD than slower-growing trees. There were no negative effects of ozone on daily growth rates in a year of low ozone concentrations.
Unravelling the limits to tree height: a major role for water and nutrient trade-offs.
Cramer, Michael D
2012-05-01
Competition for light has driven forest trees to grow exceedingly tall, but the lack of a single universal limit to tree height indicates multiple interacting environmental limitations. Because soil nutrient availability is determined by both nutrient concentrations and soil water, water and nutrient availabilities may interact in determining realised nutrient availability and consequently tree height. In SW Australia, which is characterised by nutrient impoverished soils that support some of the world's tallest forests, total [P] and water availability were independently correlated with tree height (r = 0.42 and 0.39, respectively). However, interactions between water availability and each of total [P], pH and [Mg] contributed to a multiple linear regression model of tree height (r = 0.72). A boosted regression tree model showed that maximum tree height was correlated with water availability (24%), followed by soil properties including total P (11%), Mg (10%) and total N (9%), amongst others, and that there was an interaction between water availability and total [P] in determining maximum tree height. These interactions indicated a trade-off between water and P availability in determining maximum tree height in SW Australia. This is enabled by a species assemblage capable of growing tall and surviving (some) disturbances. The mechanism for this trade-off is suggested to be through water enabling mass-flow and diffusive mobility of P, particularly of relatively mobile organic P, although water interactions with microbial activity could also play a role.
Zhong, Buqing; Liang, Tao; Wang, Lingqing; Li, Kexin
2014-08-15
An extensive soil survey was conducted to study pollution sources and delineate contamination of heavy metals in one of the metalliferous industrial bases, in the karst areas of southwest China. A total of 597 topsoil samples were collected and the concentrations of five heavy metals, namely Cd, As (metalloid), Pb, Hg and Cr were analyzed. Stochastic models including a conditional inference tree (CIT) and a finite mixture distribution model (FMDM) were applied to identify the sources and partition the contribution from natural and anthropogenic sources for heavy metal in topsoils of the study area. Regression trees for Cd, As, Pb and Hg were proved to depend mostly on indicators of anthropogenic activities such as industrial type and distance from urban area, while the regression tree for Cr was found to be mainly influenced by the geogenic characteristics. The FMDM analysis showed that the geometric means of modeled background values for Cd, As, Pb, Hg and Cr were close to their background values previously reported in the study area, while the contamination of Cd and Hg were widespread in the study area, imposing potentially detrimental effects on organisms through the food chain. Finally, the probabilities of single and multiple heavy metals exceeding the threshold values derived from the FMDM were estimated using indicator kriging (IK) and multivariate indicator kriging (MVIK). The high probabilities exceeding the thresholds of heavy metals were associated with metalliferous production and atmospheric deposition of heavy metals transported from the urban and industrial areas. Geostatistics coupled with stochastic models provide an effective way to delineate multiple heavy metal pollution to facilitate improved environmental management. Copyright © 2014 Elsevier B.V. All rights reserved.
Patel, Rita B; Mathur, Maya B; Gould, Michael; Uyeki, Timothy M; Bhattacharya, Jay; Xiao, Yang; Khazeni, Nayer
2014-01-01
Human infections with highly pathogenic avian influenza (HPAI) A (H5N1) viruses have occurred in 15 countries, with high mortality to date. Determining risk factors for morbidity and mortality from HPAI H5N1 can inform preventive and therapeutic interventions. We included all cases of human HPAI H5N1 reported in World Health Organization Global Alert and Response updates and those identified through a systematic search of multiple databases (PubMed, Scopus, and Google Scholar), including articles in all languages. We abstracted predefined clinical and demographic predictors and mortality and used bivariate logistic regression analyses to examine the relationship of each candidate predictor with mortality. We developed and pruned a decision tree using nonparametric Classification and Regression Tree methods to create risk strata for mortality. We identified 617 human cases of HPAI H5N1 occurring between December 1997 and April 2013. The median age of subjects was 18 years (interquartile range 6-29 years) and 54% were female. HPAI H5N1 case-fatality proportion was 59%. The final decision tree for mortality included age, country, per capita government health expenditure, and delay from symptom onset to hospitalization, with an area under the receiver operator characteristic (ROC) curve of 0.81 (95% CI: 0.76-0.86). A model defined by four clinical and demographic predictors successfully estimated the probability of mortality from HPAI H5N1 illness. These parameters highlight the importance of early diagnosis and treatment and may enable early, targeted pharmaceutical therapy and supportive care for symptomatic patients with HPAI H5N1 virus infection.
Multivariate regression model for partitioning tree volume of white oak into round-product classes
Daniel A. Yaussy; David L. Sonderman
1984-01-01
Describes the development of multivariate equations that predict the expected cubic volume of four round-product classes from independent variables composed of individual tree-quality characteristics. Although the model has limited application at this time, it does demonstrate the feasibility of partitioning total tree cubic volume into round-product classes based on...
Kathleen L. Kavanaugh; Matthew B. Dickinson; Anthony S. Bova
2010-01-01
Current operational methods for predicting tree mortality from fire injury are regression-based models that only indirectly consider underlying causes and, thus, have limited generality. A better understanding of the physiological consequences of tree heating and injury are needed to develop biophysical process models that can make predictions under changing or novel...
Height-age relationships for regeneration-size trees in the northern Rocky Mountains, USA
Dennis E. Ferguson; Clinton E. Carlson
2010-01-01
Regression equations were developed to predict heights of 10 conifer species inregenerating stands in central and northern Idaho, western Montana, and eastern Washington. Most sample trees were natural regeneration that became established after conventional harvest and site preparation methods. Heights are predicted as a function of tree age, residual overstory density...
Louis R. Iverson; Anantha M. Prasad; Anantha M. Prasad
2002-01-01
Global climate change could have profound effects on the Earth's biota, including large redistributions of tree species and forest types. We used DISTRIB, a deterministic regression tree analysis model, to examine environmental drivers related to current forest-species distributions and then model potential suitable habitat under five climate change scenarios...
Kabeshova, A; Annweiler, C; Fantino, B; Philip, T; Gromov, V A; Launay, C P; Beauchet, O
2014-06-01
Regression tree (RT) analyses are particularly adapted to explore the risk of recurrent falling according to various combinations of fall risk factors compared to logistic regression models. The aims of this study were (1) to determine which combinations of fall risk factors were associated with the occurrence of recurrent falls in older community-dwellers, and (2) to compare the efficacy of RT and multiple logistic regression model for the identification of recurrent falls. A total of 1,760 community-dwelling volunteers (mean age ± standard deviation, 71.0 ± 5.1 years; 49.4 % female) were recruited prospectively in this cross-sectional study. Age, gender, polypharmacy, use of psychoactive drugs, fear of falling (FOF), cognitive disorders and sad mood were recorded. In addition, the history of falls within the past year was recorded using a standardized questionnaire. Among 1,760 participants, 19.7 % (n = 346) were recurrent fallers. The RT identified 14 nodes groups and 8 end nodes with FOF as the first major split. Among participants with FOF, those who had sad mood and polypharmacy formed the end node with the greatest OR for recurrent falls (OR = 6.06 with p < 0.001). Among participants without FOF, those who were male and not sad had the lowest OR for recurrent falls (OR = 0.25 with p < 0.001). The RT correctly classified 1,356 from 1,414 non-recurrent fallers (specificity = 95.6 %), and 65 from 346 recurrent fallers (sensitivity = 18.8 %). The overall classification accuracy was 81.0 %. The multiple logistic regression correctly classified 1,372 from 1,414 non-recurrent fallers (specificity = 97.0 %), and 61 from 346 recurrent fallers (sensitivity = 17.6 %). The overall classification accuracy was 81.4 %. Our results show that RT may identify specific combinations of risk factors for recurrent falls, the combination most associated with recurrent falls involving FOF, sad mood and polypharmacy. The FOF emerged as the risk factor strongly associated with recurrent falls. In addition, RT and multiple logistic regression were not sensitive enough to identify the majority of recurrent fallers but appeared efficient in detecting individuals not at risk of recurrent falls.
Holtschlag, David J.; Shively, Dawn; Whitman, Richard L.; Haack, Sheridan K.; Fogarty, Lisa R.
2008-01-01
Regression analyses and hydrodynamic modeling were used to identify environmental factors and flow paths associated with Escherichia coli (E. coli) concentrations at Memorial and Metropolitan Beaches on Lake St. Clair in Macomb County, Mich. Lake St. Clair is part of the binational waterway between the United States and Canada that connects Lake Huron with Lake Erie in the Great Lakes Basin. Linear regression, regression-tree, and logistic regression models were developed from E. coli concentration and ancillary environmental data. Linear regression models on log10 E. coli concentrations indicated that rainfall prior to sampling, water temperature, and turbidity were positively associated with bacteria concentrations at both beaches. Flow from Clinton River, changes in water levels, wind conditions, and log10 E. coli concentrations 2 days before or after the target bacteria concentrations were statistically significant at one or both beaches. In addition, various interaction terms were significant at Memorial Beach. Linear regression models for both beaches explained only about 30 percent of the variability in log10 E. coli concentrations. Regression-tree models were developed from data from both Memorial and Metropolitan Beaches but were found to have limited predictive capability in this study. The results indicate that too few observations were available to develop reliable regression-tree models. Linear logistic models were developed to estimate the probability of E. coli concentrations exceeding 300 most probable number (MPN) per 100 milliliters (mL). Rainfall amounts before bacteria sampling were positively associated with exceedance probabilities at both beaches. Flow of Clinton River, turbidity, and log10 E. coli concentrations measured before or after the target E. coli measurements were related to exceedances at one or both beaches. The linear logistic models were effective in estimating bacteria exceedances at both beaches. A receiver operating characteristic (ROC) analysis was used to determine cut points for maximizing the true positive rate prediction while minimizing the false positive rate. A two-dimensional hydrodynamic model was developed to simulate horizontal current patterns on Lake St. Clair in response to wind, flow, and water-level conditions at model boundaries. Simulated velocity fields were used to track hypothetical massless particles backward in time from the beaches along flow paths toward source areas. Reverse particle tracking for idealized steady-state conditions shows changes in expected flow paths and traveltimes with wind speeds and directions from 24 sectors. The results indicate that three to four sets of contiguous wind sectors have similar effects on flow paths in the vicinity of the beaches. In addition, reverse particle tracking was used for transient conditions to identify expected flow paths for 10 E. coli sampling events in 2004. These results demonstrate the ability to track hypothetical particles from the beaches, backward in time, to likely source areas. This ability, coupled with a greater frequency of bacteria sampling, may provide insight into changes in bacteria concentrations between source and sink areas.
NASA Astrophysics Data System (ADS)
Künne, A.; Fink, M.; Kipka, H.; Krause, P.; Flügel, W.-A.
2012-06-01
In this paper, a method is presented to estimate excess nitrogen on large scales considering single field processes. The approach was implemented by using the physically based model J2000-S to simulate the nitrogen balance as well as the hydrological dynamics within meso-scale test catchments. The model input data, the parameterization, the results and a detailed system understanding were used to generate the regression tree models with GUIDE (Loh, 2002). For each landscape type in the federal state of Thuringia a regression tree was calibrated and validated using the model data and results of excess nitrogen from the test catchments. Hydrological parameters such as precipitation and evapotranspiration were also used to predict excess nitrogen by the regression tree model. Hence they had to be calculated and regionalized as well for the state of Thuringia. Here the model J2000g was used to simulate the water balance on the macro scale. With the regression trees the excess nitrogen was regionalized for each landscape type of Thuringia. The approach allows calculating the potential nitrogen input into the streams of the drainage area. The results show that the applied methodology was able to transfer the detailed model results of the meso-scale catchments to the entire state of Thuringia by low computing time without losing the detailed knowledge from the nitrogen transport modeling. This was validated with modeling results from Fink (2004) in a catchment lying in the regionalization area. The regionalized and modeled excess nitrogen correspond with 94%. The study was conducted within the framework of a project in collaboration with the Thuringian Environmental Ministry, whose overall aim was to assess the effect of agro-environmental measures regarding load reduction in the water bodies of Thuringia to fulfill the requirements of the European Water Framework Directive (Bäse et al., 2007; Fink, 2006; Fink et al., 2007).
Environmental correlates of plant diversity in Korean temperate forests
NASA Astrophysics Data System (ADS)
Černý, Tomáš; Doležal, Jiří; Janeček, Štěpán; Šrůtek, Miroslav; Valachovič, Milan; Petřík, Petr; Altman, Jan; Bartoš, Michael; Song, Jong-Suk
2013-02-01
Mountainous areas of the Korean Peninsula are among the biodiversity hotspots of the world's temperate forests. Understanding patterns in spatial distribution of their species richness requires explicit consideration of different environmental drivers and their effects on functionally differing components. In this study, we assess the impact of both geographical and soil variables on the fine-scale (400 m2) pattern of plant diversity using field data from six national parks, spanning a 1300 m altitudinal gradient. Species richness and the slopes of species-area curves were calculated separately for the tree, shrub and herb layer and used as response variables in regression tree analyses. A cluster analysis distinguished three dominant forest communities with specific patterns in the diversity-environment relationship. The most widespread middle-altitude oak forests had the highest tree richness but the lowest richness of herbaceous plants due to a dense bamboo understory. Total richness was positively associated with soil reaction and negatively associated with soluble phosphorus and solar radiation (site dryness). Tree richness was associated mainly with soil factors, although trees are frequently assumed to be controlled mainly by factors with large-scale impact. A U-shaped relationship was found between herbaceous plant richness and altitude, caused by a distribution pattern of dwarf bamboo in understory. No correlation between the degree of canopy openness and herb layer richness was detected. Slopes of the species-area curves indicated the various origins of forest communities. Variable diversity-environment responses in different layers and communities reinforce the necessity of context-dependent differentiation for the assessment of impacts of climate and land-use changes in these diverse but intensively exploited regions.
Do ray cells provide a pathway for radial water movement in the stems of conifer trees?
Barnard, David M; Lachenbruch, Barbara; McCulloh, Katherine A; Kitin, Peter; Meinzer, Frederick C
2013-02-01
The pathway of radial water movement in tree stems presents an unknown with respect to whole-tree hydraulics. Radial profiles have shown substantial axial sap flow in deeper layers of sapwood (that may lack direct connection to transpiring leaves), which suggests the existence of a radial pathway for water movement. Rays in tree stems include ray tracheids and/or ray parenchyma cells and may offer such a pathway for radial water transport. This study investigated relationships between radial hydraulic conductivity (k(s-rad)) and ray anatomical and stem morphological characteristics in the stems of three conifer species whose distributions span a natural aridity gradient across the Cascade Mountain range in Oregon, United States. The k(s-rad) was measured with a high-pressure flow meter. Ray tracheid and ray parenchyma characteristics and water transport properties were visualized using autofluorescence or confocal microscopy. The k(s-rad) did not vary predictably with sapwood depth among species and populations. Dye tracer did not infiltrate ray tracheids, and infiltration into ray parenchyma was limited. Regression analyses revealed inconsistent relationships between k(s-rad) and selected anatomical or growth characteristics when ecotypes were analyzed individually and weak relationships between k(s-rad) and these characteristics when data were pooled by tree species. The lack of significant relationships between k(s-rad) and the ray and stem morphologies we studied, combined with the absence of dye tracer in ray tracheid and limited movement of dye into ray parenchyma suggests that rays may not facilitate radial water transport in the three conifer species studied.
Predicting Diameter at Breast Height from Stump Diameters for Northeastern Tree Species
Eric H. Wharton; Eric H. Wharton
1984-01-01
Presents equations to predict diameter at breast height from stump diameter measurements for 17 northeastern tree species. Simple linear regression was used to develop the equations. Application of the equations is discussed.
Analyses of Great Smoky Mountain Red Spruce Tree Ring Data
Paul C. van Deusen; [Editor
1988-01-01
Four different analyses of red spruce tree ring data from the Great Smoky Mountains are presented along with a description of the spruce/fir ecosystem.The analyses use several techniques including spatial analysis, fractals, spline detrending, and the Kalman filter.
Guo, Huey-Ming; Shyu, Yea-Ing Lotus; Chang, Her-Kun
2006-01-01
In this article, the authors provide an overview of a research method to predict quality of care in home health nursing data set. The results of this study can be visualized through classification an regression tree (CART) graphs. The analysis was more effective, and the results were more informative since the home health nursing dataset was analyzed with a combination of the logistic regression and CART, these two techniques complete each other. And the results more informative that more patients' characters were related to quality of care in home care. The results contributed to home health nurse predict patient outcome in case management. Improved prediction is needed for interventions to be appropriately targeted for improved patient outcome and quality of care.
Toward a Periodic Table of Niches, or Exploring the Lizard Niche Hypervolume.
Pianka, Eric R; Vitt, Laurie J; Pelegrin, Nicolás; Fitzgerald, Daniel B; Winemiller, Kirk O
2017-11-01
Widespread niche convergence suggests that species can be organized according to functional trait combinations to create a framework analogous to a periodic table. We compiled ecological data for lizards to examine patterns of global and regional niche diversification, and we used multivariate statistical approaches to develop the beginnings for a periodic table of niches. Data (50+ variables) for five major niche dimensions (habitat, diet, life history, metabolism, defense) were compiled for 134 species of lizards representing 24 of the 38 extant families. Principal coordinates analyses were performed on niche dimensional data sets, and species scores for the first three axes were used as input for a principal components analysis to ordinate species in continuous niche space and for a regression tree analysis to separate species into discrete niche categories. Three-dimensional models facilitate exploration of species positions in relation to major gradients within the niche hypervolume. The first gradient loads on body size, foraging mode, and clutch size. The second was influenced by metabolism and terrestrial versus arboreal microhabitat. The third was influenced by activity time, life history, and diet. Natural dichotomies are activity time, foraging mode, parity mode, and habitat. Regression tree analysis identified 103 cases of extreme niche conservatism within clades and 100 convergences between clades. Extending this approach to other taxa should lead to a wider understanding of niche evolution.
Kim, So-Ra; Kwak, Doo-Ahn; Lee, Woo-Kyun; oLee, Woo-Kyun; Son, Yowhan; Bae, Sang-Won; Kim, Choonsig; Yoo, Seongjin
2010-07-01
The objective of this study was to estimate the carbon storage capacity of Pinus densiflora stands using remotely sensed data by combining digital aerial photography with light detection and ranging (LiDAR) data. A digital canopy model (DCM), generated from the LiDAR data, was combined with aerial photography for segmenting crowns of individual trees. To eliminate errors in over and under-segmentation, the combined image was smoothed using a Gaussian filtering method. The processed image was then segmented into individual trees using a marker-controlled watershed segmentation method. After measuring the crown area from the segmented individual trees, the individual tree diameter at breast height (DBH) was estimated using a regression function developed from the relationship observed between the field-measured DBH and crown area. The above ground biomass of individual trees could be calculated by an image-derived DBH using a regression function developed by the Korea Forest Research Institute. The carbon storage, based on individual trees, was estimated by simple multiplication using the carbon conversion index (0.5), as suggested in guidelines from the Intergovernmental Panel on Climate Change. The mean carbon storage per individual tree was estimated and then compared with the field-measured value. This study suggested that the biomass and carbon storage in a large forest area can be effectively estimated using aerial photographs and LiDAR data.
Louis R. Iverson; Anantha Prasad; Mark W. Schwartz; Mark W. Schwartz
1999-01-01
We are using a deterministic regression tree analysis model (DISTRIB) and a stochastic migration model (SHIFT) to examine potential distributions of ~66 individual species of eastern US trees under a 2 x CO2 climate change scenario. This process is demonstrated for Virginia pine (Pinus virginiana).
Potential Changes in Tree Species Richness and Forest Community Types following Climate Change
Louis R. Iverson; Anantha M. Prasad
2001-01-01
Potential changes in tree species richness and forest community types were evaluated for the eastern United States according to five scenarios of future climate change resulting from a doubling of atmospheric carbon dioxide (CO2). DISTRIB, an empirical model that uses a regression tree analysis approach, was used to generate suitable habitat, or potential future...
Estimating tree crown widths for the primary Acadian species in Maine
Matthew B. Russell; Aaron R. Weiskittel
2012-01-01
In this analysis, data for seven conifer and eight hardwood species were gathered from across the state of Maine for estimating tree crown widths. Maximum and largest crown width equations were developed using tree diameter at breast height as the primary predicting variable. Quantile regression techniques were used to estimate the maximum crown width and a constrained...
Du, Hua Qiang; Sun, Xiao Yan; Han, Ning; Mao, Fang Jie
2017-10-01
By synergistically using the object-based image analysis (OBIA) and the classification and regression tree (CART) methods, the distribution information, the indexes (including diameter at breast, tree height, and crown closure), and the aboveground carbon storage (AGC) of moso bamboo forest in Shanchuan Town, Anji County, Zhejiang Province were investigated. The results showed that the moso bamboo forest could be accurately delineated by integrating the multi-scale ima ge segmentation in OBIA technique and CART, which connected the image objects at various scales, with a pretty good producer's accuracy of 89.1%. The investigation of indexes estimated by regression tree model that was constructed based on the features extracted from the image objects reached normal or better accuracy, in which the crown closure model archived the best estimating accuracy of 67.9%. The estimating accuracy of diameter at breast and tree height was relatively low, which was consistent with conclusion that estimating diameter at breast and tree height using optical remote sensing could not achieve satisfactory results. Estimation of AGC reached relatively high accuracy, and accuracy of the region of high value achieved above 80%.
David J. Nowak; Jeffrey T. Walton; James Baldwin; Jerry Bond
2015-01-01
Information on street trees is critical for management of this important resource. Sampling of street tree populations provides an efficient means to obtain street tree population information. Long-term repeat measures of street tree samples supply additional information on street tree changes and can be used to report damages from catastrophic events. Analyses of...
ETE: a python Environment for Tree Exploration.
Huerta-Cepas, Jaime; Dopazo, Joaquín; Gabaldón, Toni
2010-01-13
Many bioinformatics analyses, ranging from gene clustering to phylogenetics, produce hierarchical trees as their main result. These are used to represent the relationships among different biological entities, thus facilitating their analysis and interpretation. A number of standalone programs are available that focus on tree visualization or that perform specific analyses on them. However, such applications are rarely suitable for large-scale surveys, in which a higher level of automation is required. Currently, many genome-wide analyses rely on tree-like data representation and hence there is a growing need for scalable tools to handle tree structures at large scale. Here we present the Environment for Tree Exploration (ETE), a python programming toolkit that assists in the automated manipulation, analysis and visualization of hierarchical trees. ETE libraries provide a broad set of tree handling options as well as specific methods to analyze phylogenetic and clustering trees. Among other features, ETE allows for the independent analysis of tree partitions, has support for the extended newick format, provides an integrated node annotation system and permits to link trees to external data such as multiple sequence alignments or numerical arrays. In addition, ETE implements a number of built-in analytical tools, including phylogeny-based orthology prediction and cluster validation techniques. Finally, ETE's programmable tree drawing engine can be used to automate the graphical rendering of trees with customized node-specific visualizations. ETE provides a complete set of methods to manipulate tree data structures that extends current functionality in other bioinformatic toolkits of a more general purpose. ETE is free software and can be downloaded from http://ete.cgenomics.org.
ETE: a python Environment for Tree Exploration
2010-01-01
Background Many bioinformatics analyses, ranging from gene clustering to phylogenetics, produce hierarchical trees as their main result. These are used to represent the relationships among different biological entities, thus facilitating their analysis and interpretation. A number of standalone programs are available that focus on tree visualization or that perform specific analyses on them. However, such applications are rarely suitable for large-scale surveys, in which a higher level of automation is required. Currently, many genome-wide analyses rely on tree-like data representation and hence there is a growing need for scalable tools to handle tree structures at large scale. Results Here we present the Environment for Tree Exploration (ETE), a python programming toolkit that assists in the automated manipulation, analysis and visualization of hierarchical trees. ETE libraries provide a broad set of tree handling options as well as specific methods to analyze phylogenetic and clustering trees. Among other features, ETE allows for the independent analysis of tree partitions, has support for the extended newick format, provides an integrated node annotation system and permits to link trees to external data such as multiple sequence alignments or numerical arrays. In addition, ETE implements a number of built-in analytical tools, including phylogeny-based orthology prediction and cluster validation techniques. Finally, ETE's programmable tree drawing engine can be used to automate the graphical rendering of trees with customized node-specific visualizations. Conclusions ETE provides a complete set of methods to manipulate tree data structures that extends current functionality in other bioinformatic toolkits of a more general purpose. ETE is free software and can be downloaded from http://ete.cgenomics.org. PMID:20070885
Nutrient limitation in soils and trees of a treeline ecotone in Rolwaling Himal, Nepal
NASA Astrophysics Data System (ADS)
Drollinger, Simon; Müller, Michael; Schickhoff, Udo; Böhner, Jürgen; Scholten, Thomas
2015-04-01
At a global scale, tree growth and thus the position of natural alpine treelines is limited by low temperatures. At landscape and local scales, however, the treeline position depends on multiple interactions of influencing factors and mechanisms. The aim of our research is to understand local scale effects of soil properties and nutrient cycling on tree growth limitation, and their interactions with other abiotic and biotic factors, in a near-natural alpine treeline ecotone of Rolwaling Himal, Nepal. In total 48 plots (20 m x 20 m) were investigated. Three north-facing slopes were separated in four different altitudinal zones with the characteristic vegetation of tree species Rhododendron campanulatum, Abies spectabilis, Betula utilis, Sorbus microphylla and Acer spec. We collected 151 soil horizon samples (Ah, Ae, Bh, Bs), 146 litter layer samples (L), and 146 decomposition layer samples (Of) in 2013, as well as 251 leaves from standing biomass (SB) in 2013 and 2014. All samples were analysed for exchangeable cations or nutrient concentrations of C, N, P, K, Mg, Ca, Mn, Fe and Al. Soil moisture, soil and surface air temperatures were measured by 34 installed sensors. Precipitation and air temperatures were measured by three climate stations. The main pedogenic process is leaching of dissolved organic carbon, aluminium and iron from topsoil to subsoil. Soil types are classified as podzols with generally low nutrient concentrations. Soil acidity is extremely high and humus quality of mineral soils is poor. Our results indicate multilateral interactions and a great spatial variability of essential nutrients within the treeline ecotone. Both, soil nutrients and leave macronutrient concentrations of nitrogen (N), magnesium (Mg), potassium (K) decrease significantly with elevation in the treeline ecotone. Besides, phosphorus (P) foliar concentrations decrease significantly with elevation. Based on regression analyses, low soil temperatures and malnutrition most likely affect tree growth in high altitudes. Thus, we assume a high influence of soil properties and nutrient supply on the position of alpine treeline at a local scale. In addition, a manganese (Mn) excess in foliage of woody species was determined above treeline. With the help of multivariate statistical approaches, potential determining factors of treeline position could be quantified.
NASA Astrophysics Data System (ADS)
Kriegs, Stefanie; Buddenbaum, Henning; Rogge, Derek; Steffens, Markus
2015-04-01
Laboratory imaging Vis-NIR spectroscopy of soil profiles is a novel technique in soil science that can determine quantity and quality of various chemical soil properties with a hitherto unreached spatial resolution in undisturbed soil profiles. We have applied this technique to soil cores in order to get quantitative proof of redoximorphic processes under two different tree species and to proof tree-soil interactions at microscale. Due to the imaging capabilities of Vis-NIR spectroscopy a spatially explicit understanding of soil processes and properties can be achieved. Spatial heterogeneity of the soil profile can be taken into account. We took six 30 cm long rectangular soil columns of adjacent Luvisols derived from quaternary aeolian sediments (Loess) in a forest soil near Freising/Bavaria using stainless steel boxes (100×100×300 mm). Three profiles were sampled under Norway spruce and three under European beech. A hyperspectral camera (VNIR, 400-1000 nm in 160 spectral bands) with spatial resolution of 63×63 µm² per pixel was used for data acquisition. Reference samples were taken at representative spots and analysed for organic carbon (OC) quantity and quality with a CN elemental analyser and for iron oxides (Fe) content using dithionite extraction followed by ICP-OES measurement. We compared two supervised classification algorithms, Spectral Angle Mapper and Maximum Likelihood, using different sets of training areas and spectral libraries. As established in chemometrics we used multivariate analysis such as partial least-squares regression (PLSR) in addition to multivariate adaptive regression splines (MARS) to correlate chemical data with Vis-NIR spectra. As a result elemental mapping of Fe and OC within the soil core at high spatial resolution has been achieved. The regression model was validated by a new set of reference samples for chemical analysis. Digital soil classification easily visualizes soil properties within the soil profiles. By combining both techniques, detailed soil maps, elemental balances and a deeper understanding of soil forming processes at the microscale become feasible for complete soil profiles.
Zhao, Cai-Yun; Xu, Jing; Liu, Xiao-Yan
2017-01-01
Abstract Globalization increases the opportunities for unintentionally introduced invasive alien species, especially for insects, and most of these species could damage ecosystems and cause economic loss in China. In this study, we analyzed drivers of the distribution of unintentionally introduced invasive alien insects. Based on the number of unintentionally introduced invasive alien insects and their presence/absence records in each province in mainland China, regression trees were built to elucidate the roles of environmental and anthropogenic factors on the number distribution and similarity of species composition of these insects. Classification and regression trees indicated climatic suitability (the mean temperature in January) and human economic activity (sum of total freight) are primary drivers for the number distribution pattern of unintentionally introduced invasive alien insects at provincial scale, while only environmental factors (the mean January temperature, the annual precipitation and the areas of provinces) significantly affect the similarity of them based on the multivariate regression trees. PMID:28973576
Teets, Aaron; Fraver, Shawn; Weiskittel, Aaron R; Hollinger, David Y
2018-03-11
A range of environmental factors regulate tree growth; however, climate is generally thought to most strongly influence year-to-year variability in growth. Numerous dendrochronological (tree-ring) studies have identified climate factors that influence year-to-year variability in growth for given tree species and location. However, traditional dendrochronology methods have limitations that prevent them from adequately assessing stand-level (as opposed to species-level) growth. We argue that stand-level growth analyses provide a more meaningful assessment of forest response to climate fluctuations, as well as the management options that may be employed to sustain forest productivity. Working in a mature, mixed-species stand at the Howland Research Forest of central Maine, USA, we used two alternatives to traditional dendrochronological analyses by (1) selecting trees for coring using a stratified (by size and species), random sampling method that ensures a representative sample of the stand, and (2) converting ring widths to biomass increments, which once summed, produced a representation of stand-level growth, while maintaining species identities or canopy position if needed. We then tested the relative influence of seasonal climate variables on year-to-year variability in the biomass increment using generalized least squares regression, while accounting for temporal autocorrelation. Our results indicate that stand-level growth responded most strongly to previous summer and current spring climate variables, resulting from a combination of individualistic climate responses occurring at the species- and canopy-position level. Our climate models were better fit to stand-level biomass increment than to species-level or canopy-position summaries. The relative growth responses (i.e., percent change) predicted from the most influential climate variables indicate stand-level growth varies less from to year-to-year than species-level or canopy-position growth responses. By assessing stand-level growth response to climate, we provide an alternative perspective on climate-growth relationships of forests, improving our understanding of forest growth dynamics under a fluctuating climate. © 2018 John Wiley & Sons Ltd.
Bevilacqua, M; Ciarapica, F E; Giacchetta, G
2008-07-01
This work is an attempt to apply classification tree methods to data regarding accidents in a medium-sized refinery, so as to identify the important relationships between the variables, which can be considered as decision-making rules when adopting any measures for improvement. The results obtained using the CART (Classification And Regression Trees) method proved to be the most precise and, in general, they are encouraging concerning the use of tree diagrams as preliminary explorative techniques for the assessment of the ergonomic, management and operational parameters which influence high accident risk situations. The Occupational Injury analysis carried out in this paper was planned as a dynamic process and can be repeated systematically. The CART technique, which considers a very wide set of objective and predictive variables, shows new cause-effect correlations in occupational safety which had never been previously described, highlighting possible injury risk groups and supporting decision-making in these areas. The use of classification trees must not, however, be seen as an attempt to supplant other techniques, but as a complementary method which can be integrated into traditional types of analysis.
Naghibi, Seyed Amir; Pourghasemi, Hamid Reza; Dixon, Barnali
2016-01-01
Groundwater is considered one of the most valuable fresh water resources. The main objective of this study was to produce groundwater spring potential maps in the Koohrang Watershed, Chaharmahal-e-Bakhtiari Province, Iran, using three machine learning models: boosted regression tree (BRT), classification and regression tree (CART), and random forest (RF). Thirteen hydrological-geological-physiographical (HGP) factors that influence locations of springs were considered in this research. These factors include slope degree, slope aspect, altitude, topographic wetness index (TWI), slope length (LS), plan curvature, profile curvature, distance to rivers, distance to faults, lithology, land use, drainage density, and fault density. Subsequently, groundwater spring potential was modeled and mapped using CART, RF, and BRT algorithms. The predicted results from the three models were validated using the receiver operating characteristics curve (ROC). From 864 springs identified, 605 (≈70 %) locations were used for the spring potential mapping, while the remaining 259 (≈30 %) springs were used for the model validation. The area under the curve (AUC) for the BRT model was calculated as 0.8103 and for CART and RF the AUC were 0.7870 and 0.7119, respectively. Therefore, it was concluded that the BRT model produced the best prediction results while predicting locations of springs followed by CART and RF models, respectively. Geospatially integrated BRT, CART, and RF methods proved to be useful in generating the spring potential map (SPM) with reasonable accuracy.
Westreich, Daniel; Lessler, Justin; Funk, Michele Jonsson
2010-01-01
Summary Objective Propensity scores for the analysis of observational data are typically estimated using logistic regression. Our objective in this Review was to assess machine learning alternatives to logistic regression which may accomplish the same goals but with fewer assumptions or greater accuracy. Study Design and Setting We identified alternative methods for propensity score estimation and/or classification from the public health, biostatistics, discrete mathematics, and computer science literature, and evaluated these algorithms for applicability to the problem of propensity score estimation, potential advantages over logistic regression, and ease of use. Results We identified four techniques as alternatives to logistic regression: neural networks, support vector machines, decision trees (CART), and meta-classifiers (in particular, boosting). Conclusion While the assumptions of logistic regression are well understood, those assumptions are frequently ignored. All four alternatives have advantages and disadvantages compared with logistic regression. Boosting (meta-classifiers) and to a lesser extent decision trees (particularly CART) appear to be most promising for use in the context of propensity score analysis, but extensive simulation studies are needed to establish their utility in practice. PMID:20630332
Arenja, Nisha; Riffel, Johannes H; Fritz, Thomas; André, Florian; Aus dem Siepen, Fabian; Mueller-Hennessen, Matthias; Giannitsis, Evangelos; Katus, Hugo A; Friedrich, Matthias G; Buss, Sebastian J
2017-06-01
Purpose To assess the utility of established functional markers versus two additional functional markers derived from standard cardiovascular magnetic resonance (MR) images for their incremental diagnostic and prognostic information in patients with nonischemic dilated cardiomyopathy (NIDCM). Materials and Methods Approval was obtained from the local ethics committee. MR images from 453 patients with NIDCM and 150 healthy control subjects were included between 2005 and 2013 and were analyzed retrospectively. Myocardial contraction fraction (MCF) was calculated by dividing left ventricular (LV) stroke volume by LV myocardial volume, and long-axis strain (LAS) was calculated from the distances between the epicardial border of the LV apex and the midpoint of a line connecting the origins of the mitral valve leaflets at end systole and end diastole. Receiver operating characteristic curve, Kaplan-Meier method, Cox regression, and classification and regression tree (CART) analyses were performed for diagnostic and prognostic performances. Results LAS (area under the receiver operating characteristic curve [AUC] = 0.93, P < .001) and MCF (AUC = 0.92, P < .001) can be used to discriminate patients with NIDCM from age- and sex-matched control subjects. A total of 97 patients reached the combined end point during a median follow-up of 4.8 years. In multivariate Cox regression analysis, only LV ejection fraction (EF) and LAS independently indicated the combined end point (hazard ratio = 2.8 and 1.9, respectively; P < .001 for both). In a risk stratification approach with classification and regression tree analysis, combined LV EF and LAS cutoff values were used to stratify patients into three risk groups (log-rank test, P < .001). Conclusion Cardiovascular MR-derived MCF and LAS serve as reliable diagnostic and prognostic markers in patients with NIDCM. LAS, as a marker for longitudinal contractile function, is an independent parameter for outcome and offers incremental information beyond LV EF and the presence of myocardial fibrosis. © RSNA, 2017 Online supplemental material is available for this article.
NASA Astrophysics Data System (ADS)
Pathak, Prasad A.
The Arctic region of Alaska is experiencing severe impacts of climate change. The Arctic lakes ecosystems are bound to undergo alterations in its trophic structure and other chemical properties. However, landscape factors controlling the lake influxes were not studied till date. This research has examined the currently existing lake landscape interactions using Remote Sensing and GIS technology. The statistical modeling was carried out using Regression and CART methods. Remote sensing data was applied to derive the required landscape indices. Remote sensing in the Arctic Alaska faces many challenges including persistent cloud cover, low sun angle and limited snow free period. Tundra vegetation types are interspersed and intricate to classify unlike managed forest stands. Therefore, historical studies have remained underachieved with respect thematic accuracies. However, looking at vegetation communities at watershed level and the implementation of expert classification system achieved the accuracies up to 90%. The research has highlighted the probable role of interactions between vegetation root zones, nutrient availability within active zone, as well as importance of permafrost thawing. Multiple regression analyses and Classification Trees were developed to understand relationships between landscape factors with various chemical parameters as well as chlorophyll readings. Spatial properties of Shrubs and Riparian complexes such as complexity of individual patches at watershed level and within proximity of water channels were influential on Chlorophyll production of lakes. Till-age had significant impact on Total Nitrogen contents. Moreover, relatively young tills exhibited significantly positive correlation with concentration of various ions and conductivity of lakes. Similarly, density of patches of Heath complexes was found to be important with respect to Total Phosphorus contents in lakes. All the regression models developed in this study were significant at 95% confidence level. However, the classification trees could not achieve high predictabilities due to limited number of lakes sampled. Keywords: Landscape factors, Lake primary productivity, Arctic, Climate change, Regression, CART
Gretchen G. Moisen; Elizabeth A. Freeman; Jock A. Blackard; Tracey S. Frescino; Niklaus E. Zimmermann; Thomas C. Edwards
2006-01-01
Many efforts are underway to produce broad-scale forest attribute maps by modelling forest class and structure variables collected in forest inventories as functions of satellite-based and biophysical information. Typically, variants of classification and regression trees implemented in Rulequest's© See5 and Cubist (for binary and continuous responses,...
E. Freeman; G. Moisen; J. Coulston; B. Wilson
2014-01-01
Random forests (RF) and stochastic gradient boosting (SGB), both involving an ensemble of classification and regression trees, are compared for modeling tree canopy cover for the 2011 National Land Cover Database (NLCD). The objectives of this study were twofold. First, sensitivity of RF and SGB to choices in tuning parameters was explored. Second, performance of the...
Jose Negron
1997-01-01
Classification trees and linear regression analysis were used to build models to predict probabilities of infestation and amount of tree mortality in terms of basal area resulting from roundheaded pine beetle, Dendroctonus adjunctus Blandford, activity in ponderosa pine, Pinus ponderosa Laws., in the Sacramento Mountains, New Mexico. Classification trees were built for...
An Extension of CART's Pruning Algorithm. Program Statistics Research Technical Report No. 91-11.
ERIC Educational Resources Information Center
Kim, Sung-Ho
Among the computer-based methods used for the construction of trees such as AID, THAID, CART, and FACT, the only one that uses an algorithm that first grows a tree and then prunes the tree is CART. The pruning component of CART is analogous in spirit to the backward elimination approach in regression analysis. This idea provides a tool in…
STX--Fortran-4 program for estimates of tree populations from 3P sample-tree-measurements
L. R. Grosenbaugh
1967-01-01
Describes how to use an improved and greatly expanded version of an earlier computer program (1964) that converts dendrometer measurements of 3P-sample trees to population values in terms of whatever units user desires. Many new options are available, including that of obtaining a product-yield and appraisal report based on regression coefficients supplied by user....
Portable Language-Independent Adaptive Translation from OCR. Phase 1
2009-04-01
including brute-force k-Nearest Neighbors ( kNN ), fast approximate kNN using hashed k-d trees, classification and regression trees, and locality...achieved by refinements in ground-truthing protocols. Recent algorithmic improvements to our approximate kNN classifier using hashed k-D trees allows...recent years discriminative training has been shown to outperform phonetic HMMs estimated using ML for speech recognition. Standard ML estimation
Phylogenomic analyses data of the avian phylogenomics project.
Jarvis, Erich D; Mirarab, Siavash; Aberer, Andre J; Li, Bo; Houde, Peter; Li, Cai; Ho, Simon Y W; Faircloth, Brant C; Nabholz, Benoit; Howard, Jason T; Suh, Alexander; Weber, Claudia C; da Fonseca, Rute R; Alfaro-Núñez, Alonzo; Narula, Nitish; Liu, Liang; Burt, Dave; Ellegren, Hans; Edwards, Scott V; Stamatakis, Alexandros; Mindell, David P; Cracraft, Joel; Braun, Edward L; Warnow, Tandy; Jun, Wang; Gilbert, M Thomas Pius; Zhang, Guojie
2015-01-01
Determining the evolutionary relationships among the major lineages of extant birds has been one of the biggest challenges in systematic biology. To address this challenge, we assembled or collected the genomes of 48 avian species spanning most orders of birds, including all Neognathae and two of the five Palaeognathae orders. We used these genomes to construct a genome-scale avian phylogenetic tree and perform comparative genomic analyses. Here we present the datasets associated with the phylogenomic analyses, which include sequence alignment files consisting of nucleotides, amino acids, indels, and transposable elements, as well as tree files containing gene trees and species trees. Inferring an accurate phylogeny required generating: 1) A well annotated data set across species based on genome synteny; 2) Alignments with unaligned or incorrectly overaligned sequences filtered out; and 3) Diverse data sets, including genes and their inferred trees, indels, and transposable elements. Our total evidence nucleotide tree (TENT) data set (consisting of exons, introns, and UCEs) gave what we consider our most reliable species tree when using the concatenation-based ExaML algorithm or when using statistical binning with the coalescence-based MP-EST algorithm (which we refer to as MP-EST*). Other data sets, such as the coding sequence of some exons, revealed other properties of genome evolution, namely convergence. The Avian Phylogenomics Project is the largest vertebrate phylogenomics project to date that we are aware of. The sequence, alignment, and tree data are expected to accelerate analyses in phylogenomics and other related areas.
Zhou, Xiaofan; Shen, Xing-Xing; Hittinger, Chris Todd
2018-01-01
Abstract The sizes of the data matrices assembled to resolve branches of the tree of life have increased dramatically, motivating the development of programs for fast, yet accurate, inference. For example, several different fast programs have been developed in the very popular maximum likelihood framework, including RAxML/ExaML, PhyML, IQ-TREE, and FastTree. Although these programs are widely used, a systematic evaluation and comparison of their performance using empirical genome-scale data matrices has so far been lacking. To address this question, we evaluated these four programs on 19 empirical phylogenomic data sets with hundreds to thousands of genes and up to 200 taxa with respect to likelihood maximization, tree topology, and computational speed. For single-gene tree inference, we found that the more exhaustive and slower strategies (ten searches per alignment) outperformed faster strategies (one tree search per alignment) using RAxML, PhyML, or IQ-TREE. Interestingly, single-gene trees inferred by the three programs yielded comparable coalescent-based species tree estimations. For concatenation-based species tree inference, IQ-TREE consistently achieved the best-observed likelihoods for all data sets, and RAxML/ExaML was a close second. In contrast, PhyML often failed to complete concatenation-based analyses, whereas FastTree was the fastest but generated lower likelihood values and more dissimilar tree topologies in both types of analyses. Finally, data matrix properties, such as the number of taxa and the strength of phylogenetic signal, sometimes substantially influenced the programs’ relative performance. Our results provide real-world gene and species tree phylogenetic inference benchmarks to inform the design and execution of large-scale phylogenomic data analyses. PMID:29177474
Improving Cluster Analysis with Automatic Variable Selection Based on Trees
2014-12-01
regression trees Daisy DISsimilAritY PAM partitioning around medoids PMA penalized multivariate analysis SPC sparse principal components UPGMA unweighted...unweighted pair-group average method ( UPGMA ). This method measures dissimilarities between all objects in two clusters and takes the average value
Predicting U.S. Army Reserve Unit Manning Using Market Demographics
2015-06-01
develops linear regression , classification tree, and logistic regression models to determine the ability of the location to support manning requirements... logistic regression model delivers predictive results that allow decision-makers to identify locations with a high probability of meeting unit...manning requirements. The recommendation of this thesis is that the USAR implement the logistic regression model. 14. SUBJECT TERMS U.S
Simulation of land use change in the three gorges reservoir area based on CART-CA
NASA Astrophysics Data System (ADS)
Yuan, Min
2018-05-01
This study proposes a new method to simulate spatiotemporal complex multiple land uses by using classification and regression tree algorithm (CART) based CA model. In this model, we use classification and regression tree algorithm to calculate land class conversion probability, and combine neighborhood factor, random factor to extract cellular transformation rules. The overall Kappa coefficient is 0.8014 and the overall accuracy is 0.8821 in the land dynamic simulation results of the three gorges reservoir area from 2000 to 2010, and the simulation results are satisfactory.
Gamal El-Dien, Omnia; Ratcliffe, Blaise; Klápště, Jaroslav; Chen, Charles; Porth, Ilga; El-Kassaby, Yousry A
2015-05-09
Genomic selection (GS) in forestry can substantially reduce the length of breeding cycle and increase gain per unit time through early selection and greater selection intensity, particularly for traits of low heritability and late expression. Affordable next-generation sequencing technologies made it possible to genotype large numbers of trees at a reasonable cost. Genotyping-by-sequencing was used to genotype 1,126 Interior spruce trees representing 25 open-pollinated families planted over three sites in British Columbia, Canada. Four imputation algorithms were compared (mean value (MI), singular value decomposition (SVD), expectation maximization (EM), and a newly derived, family-based k-nearest neighbor (kNN-Fam)). Trees were phenotyped for several yield and wood attributes. Single- and multi-site GS prediction models were developed using the Ridge Regression Best Linear Unbiased Predictor (RR-BLUP) and the Generalized Ridge Regression (GRR) to test different assumption about trait architecture. Finally, using PCA, multi-trait GS prediction models were developed. The EM and kNN-Fam imputation methods were superior for 30 and 60% missing data, respectively. The RR-BLUP GS prediction model produced better accuracies than the GRR indicating that the genetic architecture for these traits is complex. GS prediction accuracies for multi-site were high and better than those of single-sites while multi-site predictability produced the lowest accuracies reflecting type-b genetic correlations and deemed unreliable. The incorporation of genomic information in quantitative genetics analyses produced more realistic heritability estimates as half-sib pedigree tended to inflate the additive genetic variance and subsequently both heritability and gain estimates. Principle component scores as representatives of multi-trait GS prediction models produced surprising results where negatively correlated traits could be concurrently selected for using PCA2 and PCA3. The application of GS to open-pollinated family testing, the simplest form of tree improvement evaluation methods, was proven to be effective. Prediction accuracies obtained for all traits greatly support the integration of GS in tree breeding. While the within-site GS prediction accuracies were high, the results clearly indicate that single-site GS models ability to predict other sites are unreliable supporting the utilization of multi-site approach. Principle component scores provided an opportunity for the concurrent selection of traits with different phenotypic optima.
Xiaoping Zhou; Miles A. Hemstrom
2010-01-01
Timber availability, aboveground tree biomass, and changes in aboveground carbon pools are important consequences of landscape management. There are several models available for calculating tree volume and aboveground tree biomass pools. This paper documents species-specific regional equations for tree volume and aboveground live tree biomass estimation that might be...
Batterham, Philip J; Christensen, Helen; Mackinnon, Andrew J
2009-11-22
Relative to physical health conditions such as cardiovascular disease, little is known about risk factors that predict the prevalence of depression. The present study investigates the expected effects of a reduction of these risks over time, using the decision tree method favoured in assessing cardiovascular disease risk. The PATH through Life cohort was used for the study, comprising 2,105 20-24 year olds, 2,323 40-44 year olds and 2,177 60-64 year olds sampled from the community in the Canberra region, Australia. A decision tree methodology was used to predict the presence of major depressive disorder after four years of follow-up. The decision tree was compared with a logistic regression analysis using ROC curves. The decision tree was found to distinguish and delineate a wide range of risk profiles. Previous depressive symptoms were most highly predictive of depression after four years, however, modifiable risk factors such as substance use and employment status played significant roles in assessing the risk of depression. The decision tree was found to have better sensitivity and specificity than a logistic regression using identical predictors. The decision tree method was useful in assessing the risk of major depressive disorder over four years. Application of the model to the development of a predictive tool for tailored interventions is discussed.
A study of Solar-Enso correlation with southern Brazil tree ring index (1955- 1991)
NASA Astrophysics Data System (ADS)
Rigozo, N.; Nordemann, D.; Vieira, L.; Echer, E.
The effects of solar activity and El Niño-Southern Oscillation on tree growth in Southern Brazil were studied by correlation analysis. Trees for this study were native Araucaria (Araucaria Angustifolia)from four locations in Rio Grande do Sul State, in Southern Brazil: Canela (29o18`S, 50o51`W, 790 m asl), Nova Petropolis (29o2`S, 51o10`W, 579 m asl), Sao Francisco de Paula (29o25`S, 50o24`W, 930 m asl) and Sao Martinho da Serra (29o30`S, 53o53`W, 484 m asl). From these four sites, an average tree ring Index for this region was derived, for the period 1955-1991. Linear correlations were made on annual and 10 year running averages of this tree ring Index, of sunspot number Rz and SOI. For annual averages, the correlation coefficients were low, and the multiple regression between tree ring and SOI and Rz indicates that 20% of the variance in tree rings was explained by solar activity and ENSO variability. However, when the 10 year running averages correlations were made, the coefficient correlations were much higher. A clear anticorrelation is observed between SOI and Index (r=-0.81) whereas Rz and Index show a positive correlation (r=0.67). The multiple regression of 10 year running averages indicates that 76% of the variance in tree ring INdex was explained by solar activity and ENSO. These results indicate that the effects of solar activity and ENSO on tree rings are better seen on long timescales.
Predicting water suppy and actual evapotranspiration of street trees
NASA Astrophysics Data System (ADS)
Wessolek, Gerd; Heiner, Moreen; Trinks, Steffen
2017-04-01
It's well known that street trees cool air temperature in summer-time by transpiration and shading and also reduce runoff. However, it's difficult to analyse if trees have water shortage or not. This contribution focus on predicting water supply, actual evapotranspiration, and runoff by using easily available climate data (precipiation, potential evapotranspiration) and site characteristics (water retention, space, sealing degree, groundwater depth). These parameter were used as input data for Hydro-Pedotransfer-Functions (HPTFs) allowing the estimation of the annual water budget. Results give statements on water supply of trees, drought stress, and additional water demand by irrigation. Procedure also analyse, to which extent the surrounding partly sealed surfaces deliver water to the trees. Four representative street canyons of Berlin City were analysed and evaluated within in training program for M.A. students of „Urban Eco-system Science" at the Technische Universität Berlin.
Estimating the Effective Sample Size of Tree Topologies from Bayesian Phylogenetic Analyses
Lanfear, Robert; Hua, Xia; Warren, Dan L.
2016-01-01
Bayesian phylogenetic analyses estimate posterior distributions of phylogenetic tree topologies and other parameters using Markov chain Monte Carlo (MCMC) methods. Before making inferences from these distributions, it is important to assess their adequacy. To this end, the effective sample size (ESS) estimates how many truly independent samples of a given parameter the output of the MCMC represents. The ESS of a parameter is frequently much lower than the number of samples taken from the MCMC because sequential samples from the chain can be non-independent due to autocorrelation. Typically, phylogeneticists use a rule of thumb that the ESS of all parameters should be greater than 200. However, we have no method to calculate an ESS of tree topology samples, despite the fact that the tree topology is often the parameter of primary interest and is almost always central to the estimation of other parameters. That is, we lack a method to determine whether we have adequately sampled one of the most important parameters in our analyses. In this study, we address this problem by developing methods to estimate the ESS for tree topologies. We combine these methods with two new diagnostic plots for assessing posterior samples of tree topologies, and compare their performance on simulated and empirical data sets. Combined, the methods we present provide new ways to assess the mixing and convergence of phylogenetic tree topologies in Bayesian MCMC analyses. PMID:27435794
Drivers of metacommunity structure diverge for common and rare Amazonian tree species.
Bispo, Polyanna da Conceição; Balzter, Heiko; Malhi, Yadvinder; Slik, J W Ferry; Dos Santos, João Roberto; Rennó, Camilo Daleles; Espírito-Santo, Fernando D; Aragão, Luiz E O C; Ximenes, Arimatéa C; Bispo, Pitágoras da Conceição
2017-01-01
We analysed the flora of 46 forest inventory plots (25 m x 100 m) in old growth forests from the Amazonian region to identify the role of environmental (topographic) and spatial variables (obtained using PCNM, Principal Coordinates of Neighbourhood Matrix analysis) for common and rare species. For the analyses, we used multiple partial regression to partition the specific effects of the topographic and spatial variables on the univariate data (standardised richness, total abundance and total biomass) and partial RDA (Redundancy Analysis) to partition these effects on composition (multivariate data) based on incidence, abundance and biomass. The different attributes (richness, abundance, biomass and composition based on incidence, abundance and biomass) used to study this metacommunity responded differently to environmental and spatial processes. Considering standardised richness, total abundance (univariate) and composition based on biomass, the results for common species differed from those obtained for all species. On the other hand, for total biomass (univariate) and for compositions based on incidence and abundance, there was a correspondence between the data obtained for the total community and for common species. Our data also show that in general, environmental and/or spatial components are important to explain the variability in tree communities for total and common species. However, with the exception of the total abundance, the environmental and spatial variables measured were insufficient to explain the attributes of the communities of rare species. These results indicate that predicting the attributes of rare tree species communities based on environmental and spatial variables is a substantial challenge. As the spatial component was relevant for several community attributes, our results demonstrate the importance of using a metacommunities approach when attempting to understand the main ecological processes underlying the diversity of tropical forest communities.
Suzuki, Satoshi N; Kachi, Naoki; Suzuki, Jun-Ichirou
2008-09-01
During the development of an even-aged plant population, the spatial distribution of individuals often changes from a clumped pattern to a random or regular one. The development of local size hierarchies in an Abies forest was analysed for a period of 47 years following a large disturbance in 1959. In 1980 all trees in an 8 x 8 m plot were mapped and their height growth after the disturbance was estimated. Their mortality and growth were then recorded at 1- to 4-year intervals between 1980 and 2006. Spatial distribution patterns of trees were analysed by the pair correlation function. Spatial correlations between tree heights were analysed with a spatial autocorrelation function and the mark correlation function. The mark correlation function was able to detect a local size hierarchy that could not be detected by the spatial autocorrelation function alone. The small-scale spatial distribution pattern of trees changed from clumped to slightly regular during the 47 years. Mortality occurred in a density-dependent manner, which resulted in regular spacing between trees after 1980. The spatial autocorrelation and mark correlation functions revealed the existence of tree patches consisting of large trees at the initial stage. Development of a local size hierarchy was detected within the first decade after the disturbance, although the spatial autocorrelation was not negative. Local size hierarchies that developed persisted until 2006, and the spatial autocorrelation became negative at later stages (after about 40 years). This is the first study to detect local size hierarchies as a prelude to regular spacing using the mark correlation function. The results confirm that use of the mark correlation function together with the spatial autocorrelation function is an effective tool to analyse the development of a local size hierarchy of trees in a forest.
Tamulonis, Kathryn L.; Kappel, William M.
2009-01-01
Dendrogeomorphic techniques were used to assess soil movement within the Rattlesnake Gulf landslide in the Tully Valley of central New York during the last century. This landslide is a postglacial, slow-moving earth slide that covers 23 acres and consists primarily of rotated, laminated, glaciolacustrine silt and clay. Sixty-two increment cores were obtained from 30 hemlock (Tsuga canadensis) trees across the active part of the landslide and from 3 control sites to interpret the soil-displacement history. Annual growth rings were measured and reaction wood was identified to indicate years in which ring growth changed from concentric to eccentric, on the premise that soil movement triggered compensatory growth in displaced trees. These data provided a basis for an 'event index' to identify years of landslide activity over the 108 years of record represented by the oldest trees. Event-index values and total annual precipitation increased during this time, but years with sudden event-index increases did not necessarily correspond to years with above-average precipitation. Multiple-regression and residual-values analyses indicated a possible correlation between precipitation and movement within the landslide and a possible cyclic (decades-long) tree-ring response to displacement within the landslide area from the toe upward to, and possibly beyond, previously formed landslide features. The soil movement is triggered by a sequence of factors that include (1) periods of several months with below-average precipitation followed by persistent above-average precipitation, (2) the attendant increase in streamflow, which erodes the landslide toe and results in an upslope propagation of slumping, and (3) the harvesting of mature trees within this landslide during the last century and continuing to the present.
Machine learning reveals orbital interaction in materials
NASA Astrophysics Data System (ADS)
Lam Pham, Tien; Kino, Hiori; Terakura, Kiyoyuki; Miyake, Takashi; Tsuda, Koji; Takigawa, Ichigaku; Chi Dam, Hieu
2017-12-01
We propose a novel representation of materials named an 'orbital-field matrix (OFM)', which is based on the distribution of valence shell electrons. We demonstrate that this new representation can be highly useful in mining material data. Experimental investigation shows that the formation energies of crystalline materials, atomization energies of molecular materials, and local magnetic moments of the constituent atoms in bimetal alloys of lanthanide metal and transition-metal can be predicted with high accuracy using the OFM. Knowledge regarding the role of the coordination numbers of the transition-metal and lanthanide elements in determining the local magnetic moments of the transition-metal sites can be acquired directly from decision tree regression analyses using the OFM.
James W. Flewelling
2009-01-01
Remotely sensed data can be used to make digital maps showing individual tree crowns (ITC) for entire forests. Attributes of the ITCs may include area, shape, height, and color. The crown map is sampled in a way that provides an unbiased linkage between ITCs and identifiable trees measured on the ground. Methods of avoiding edge bias are given. In an example from a...
Austin Troy; J. Morgan Grove; Jarlath O' Neill-Dunne
2012-01-01
The extent to which urban tree cover influences crime is in debate in the literature. This research took advantage of geocoded crime point data and high resolution tree canopy data to address this question in Baltimore City and County, MD, an area that includes a significant urban-rural gradient. Using ordinary least squares and spatially adjusted regression and...
Extensions and applications of ensemble-of-trees methods in machine learning
NASA Astrophysics Data System (ADS)
Bleich, Justin
Ensemble-of-trees algorithms have emerged to the forefront of machine learning due to their ability to generate high forecasting accuracy for a wide array of regression and classification problems. Classic ensemble methodologies such as random forests (RF) and stochastic gradient boosting (SGB) rely on algorithmic procedures to generate fits to data. In contrast, more recent ensemble techniques such as Bayesian Additive Regression Trees (BART) and Dynamic Trees (DT) focus on an underlying Bayesian probability model to generate the fits. These new probability model-based approaches show much promise versus their algorithmic counterparts, but also offer substantial room for improvement. The first part of this thesis focuses on methodological advances for ensemble-of-trees techniques with an emphasis on the more recent Bayesian approaches. In particular, we focus on extensions of BART in four distinct ways. First, we develop a more robust implementation of BART for both research and application. We then develop a principled approach to variable selection for BART as well as the ability to naturally incorporate prior information on important covariates into the algorithm. Next, we propose a method for handling missing data that relies on the recursive structure of decision trees and does not require imputation. Last, we relax the assumption of homoskedasticity in the BART model to allow for parametric modeling of heteroskedasticity. The second part of this thesis returns to the classic algorithmic approaches in the context of classification problems with asymmetric costs of forecasting errors. First we consider the performance of RF and SGB more broadly and demonstrate its superiority to logistic regression for applications in criminology with asymmetric costs. Next, we use RF to forecast unplanned hospital readmissions upon patient discharge with asymmetric costs taken into account. Finally, we explore the construction of stable decision trees for forecasts of violence during probation hearings in court systems.
Spatial Assessment of Model Errors from Four Regression Techniques
Lianjun Zhang; Jeffrey H. Gove; Jeffrey H. Gove
2005-01-01
Fomst modelers have attempted to account for the spatial autocorrelations among trees in growth and yield models by applying alternative regression techniques such as linear mixed models (LMM), generalized additive models (GAM), and geographicalIy weighted regression (GWR). However, the model errors are commonly assessed using average errors across the entire study...
Steen, Paul J.; Passino-Reader, Dora R.; Wiley, Michael J.
2006-01-01
As a part of the Great Lakes Regional Aquatic Gap Analysis Project, we evaluated methodologies for modeling associations between fish species and habitat characteristics at a landscape scale. To do this, we created brook trout Salvelinus fontinalis presence and absence models based on four different techniques: multiple linear regression, logistic regression, neural networks, and classification trees. The models were tested in two ways: by application to an independent validation database and cross-validation using the training data, and by visual comparison of statewide distribution maps with historically recorded occurrences from the Michigan Fish Atlas. Although differences in the accuracy of our models were slight, the logistic regression model predicted with the least error, followed by multiple regression, then classification trees, then the neural networks. These models will provide natural resource managers a way to identify habitats requiring protection for the conservation of fish species.
Chen, Carla Chia-Ming; Schwender, Holger; Keith, Jonathan; Nunkesser, Robin; Mengersen, Kerrie; Macrossan, Paula
2011-01-01
Due to advancements in computational ability, enhanced technology and a reduction in the price of genotyping, more data are being generated for understanding genetic associations with diseases and disorders. However, with the availability of large data sets comes the inherent challenges of new methods of statistical analysis and modeling. Considering a complex phenotype may be the effect of a combination of multiple loci, various statistical methods have been developed for identifying genetic epistasis effects. Among these methods, logic regression (LR) is an intriguing approach incorporating tree-like structures. Various methods have built on the original LR to improve different aspects of the model. In this study, we review four variations of LR, namely Logic Feature Selection, Monte Carlo Logic Regression, Genetic Programming for Association Studies, and Modified Logic Regression-Gene Expression Programming, and investigate the performance of each method using simulated and real genotype data. We contrast these with another tree-like approach, namely Random Forests, and a Bayesian logistic regression with stochastic search variable selection.
Analysis of biweight site chronologies: relative weights of individual trees over time
Kurt H. Riitters
1990-01-01
The relative weights on individual trees in a biweight site chronology can indicate the consistency of tree growth responses to macroclimate and can be the basis for stratifying trees in climate-growth analyses. This was explored with 45 years of ring-width indices for 200 trees from five even-aged jack pine (Pinus hanksiana Lamb.) stands. Average individual-tree...
USDA-ARS?s Scientific Manuscript database
Western juniper trees can cause late term abortions in cattle, similar to ponderosa pine trees. Analyses of western juniper trees from 35 locations across the state of Oregon suggest that western juniper trees in all areas present an abortion risk in pregnant cattle. Results from this study demonstr...
NASA Astrophysics Data System (ADS)
Ogle, K.; Fell, M.; Barber, J. J.
2016-12-01
Empirical, field studies of plant functional traits have revealed important trade-offs among pairs or triplets of traits, such as the leaf (LES) and wood (WES) economics spectra. Trade-offs include correlations between leaf longevity (LL) vs specific leaf area (SLA), LL vs mass-specific leaf respiration rate (RmL), SLA vs RmL, and resistance to breakage vs wood density. Ordination analyses (e.g., PCA) show groupings of traits that tend to align with different life-history strategies or taxonomic groups. It is unclear, however, what underlies such trade-offs and emergent spectra. Do they arise from inherent physiological constraints on growth, or are they more reflective of environmental filtering? The relative importance of these mechanisms has implications for predicting biogeochemical cycling, which is influenced by trait distributions of the plant community. We address this question using an individual-based model of tree growth (ACGCA) to quantify the theoretical trait space of trees that emerges from physiological constraints. ACGCA's inputs include 32 physiological, anatomical, and allometric traits, many of which are related to the LES and WES. We fit ACGCA to 1.6 million USFS FIA observations of tree diameters and heights to obtain vectors of trait values that produce realistic growth, and we explored the structure of this trait space. No notable correlations emerged among the 496 trait pairs, but stepwise regressions revealed complicated multi-variate structure: e.g., relationships between pairs of traits (e.g., RmL and SLA) are governed by other traits (e.g., LL, radiation-use efficiency [RUE]). We also simulated growth under various canopy gap scenarios that impose varying degrees of environmental filtering to explore the multi-dimensional trait space (hypervolume) of trees that died vs survived. The centroid and volume of the hypervolumes differed among dead and live trees, especially under gap conditions leading to low mortality. Traits most predictive of tree-level mortality were maximum tree height, RUE, xylem conducting area, and branch turn-over rate. We are using these hypervolumes as priors to an emulator that approximates the ACGCA, which we are fitting to the FIA data to quantify species-specific trait spectra and to explore factors giving rise to species differences.
Zhang, Ling Yu; Liu, Zhao Gang
2017-12-01
Based on the data collected from 108 permanent plots of the forest resources survey in Maoershan Experimental Forest Farm during 2004-2016, this study investigated the spatial distribution of recruitment trees in natural secondary forest by global Poisson regression and geographically weighted Poisson regression (GWPR) with four bandwidths of 2.5, 5, 10 and 15 km. The simulation effects of the 5 regressions and the factors influencing the recruitment trees in stands were analyzed, a description was given to the spatial autocorrelation of the regression residuals on global and local levels using Moran's I. The results showed that the spatial distribution of the number of natural secondary forest recruitment was significantly influenced by stands and topographic factors, especially average DBH. The GWPR model with small scale (2.5 km) had high accuracy of model fitting, a large range of model parameter estimates was generated, and the localized spatial distribution effect of the model parameters was obtained. The GWPR model at small scale (2.5 and 5 km) had produced a small range of model residuals, and the stability of the model was improved. The global spatial auto-correlation of the GWPR model residual at the small scale (2.5 km) was the lowe-st, and the local spatial auto-correlation was significantly reduced, in which an ideal spatial distribution pattern of small clusters with different observations was formed. The local model at small scale (2.5 km) was much better than the global model in the simulation effect on the spatial distribution of recruitment tree number.
Carroll B. Williams; David L. Azuma; George T. Ferrell
1992-01-01
Approximately 3.200 trees in young mixed-conifer stands were examined for pest activity and human-caused or mechanical injuries, and approximately 25 percent of these trees were randomly selected for stem analyses. The examination of trees felled for stem analyses showed that 409 (47 percent) were free of pests and 466 (53 percent) had one or more pest categories....
Shi, Huilan; Jia, Junya; Li, Dong; Wei, Li; Shang, Wenya; Zheng, Zhenfeng
2018-02-09
Precise renal histopathological diagnosis will guide therapy strategy in patients with lupus nephritis. Blood oxygen level dependent (BOLD) magnetic resonance imaging (MRI) has been applicable noninvasive technique in renal disease. This current study was performed to explore whether BOLD MRI could contribute to diagnose renal pathological pattern. Adult patients with lupus nephritis renal pathological diagnosis were recruited for this study. Renal biopsy tissues were assessed based on the lupus nephritis ISN/RPS 2003 classification. The Blood oxygen level dependent magnetic resonance imaging (BOLD-MRI) was used to obtain functional magnetic resonance parameter, R2* values. Several functions of R2* values were calculated and used to construct algorithmic models for renal pathological patterns. In addition, the algorithmic models were compared as to their diagnostic capability. Both Histopathology and BOLD MRI were used to examine a total of twelve patients. Renal pathological patterns included five classes III (including 3 as class III + V) and seven classes IV (including 4 as class IV + V). Three algorithmic models, including decision tree, line discriminant, and logistic regression, were constructed to distinguish the renal pathological pattern of class III and class IV. The sensitivity of the decision tree model was better than that of the line discriminant model (71.87% vs 59.48%, P < 0.001) and inferior to that of the Logistic regression model (71.87% vs 78.71%, P < 0.001). The specificity of decision tree model was equivalent to that of the line discriminant model (63.87% vs 63.73%, P = 0.939) and higher than that of the logistic regression model (63.87% vs 38.0%, P < 0.001). The Area under the ROC curve (AUROCC) of the decision tree model was greater than that of the line discriminant model (0.765 vs 0.629, P < 0.001) and logistic regression model (0.765 vs 0.662, P < 0.001). BOLD MRI is a useful non-invasive imaging technique for the evaluation of lupus nephritis. Decision tree models constructed using functions of R2* values may facilitate the prediction of renal pathological patterns.
Lin, Lei; Wang, Qian; Sadek, Adel W
2016-06-01
The duration of freeway traffic accidents duration is an important factor, which affects traffic congestion, environmental pollution, and secondary accidents. Among previous studies, the M5P algorithm has been shown to be an effective tool for predicting incident duration. M5P builds a tree-based model, like the traditional classification and regression tree (CART) method, but with multiple linear regression models as its leaves. The problem with M5P for accident duration prediction, however, is that whereas linear regression assumes that the conditional distribution of accident durations is normally distributed, the distribution for a "time-to-an-event" is almost certainly nonsymmetrical. A hazard-based duration model (HBDM) is a better choice for this kind of a "time-to-event" modeling scenario, and given this, HBDMs have been previously applied to analyze and predict traffic accidents duration. Previous research, however, has not yet applied HBDMs for accident duration prediction, in association with clustering or classification of the dataset to minimize data heterogeneity. The current paper proposes a novel approach for accident duration prediction, which improves on the original M5P tree algorithm through the construction of a M5P-HBDM model, in which the leaves of the M5P tree model are HBDMs instead of linear regression models. Such a model offers the advantage of minimizing data heterogeneity through dataset classification, and avoids the need for the incorrect assumption of normality for traffic accident durations. The proposed model was then tested on two freeway accident datasets. For each dataset, the first 500 records were used to train the following three models: (1) an M5P tree; (2) a HBDM; and (3) the proposed M5P-HBDM, and the remainder of data were used for testing. The results show that the proposed M5P-HBDM managed to identify more significant and meaningful variables than either M5P or HBDMs. Moreover, the M5P-HBDM had the lowest overall mean absolute percentage error (MAPE). Copyright © 2016 Elsevier Ltd. All rights reserved.
Phytoremediation of trichloroethene (TCE) using cottonwood trees
Jones, S.A.; Lee, R.W.; Kuniansky, E.L.; Leeson, Andrea; Alleman, Bruce C.
1999-01-01
Phytoremediation uses the natural ability of plants to degrade contaminants in ground water. A field demonstration designed to remediate aerobic shallow ground water that contains trichloroethene began in April 1996 with the planting of cottonwood trees over an approximately 0.2-hectare area at the Naval Air Station, Fort Worth, Tx. Ground water was sampled in July 1997, November 1997, February 1998, and June 1998. Analyses from samples indicate that tree roots have the potential to create anaerobic conditions in the ground water that will facilitate degradation of trichloroethene by microbially mediated reductive dichlorination. Dissolved oxygen concentrations, which varied across the site, were smallest near a mature cottonwood tree (about-20 years old) 60 meters southwest of the cottonwood plantings. Reduction of dissolved oxygen is the primary microbially mediated reaction occurring in the ground water beneath the planted trees, whereas near the mature cottonwood tree, data indicate that methanogenesis is the most probable reaction occurring. Reductive dichlorination either is not occurring or is not a primary process away from the mature tree. On the basis of isotopic analyses of carbon-13 at locations away from the mature tree, trichloroethene concentration is controlled by volatilization.Phytoremediation uses the natural ability of plants to degrade contaminants in ground water. A field demonstration designed to remediate aerobic shallow ground water that contains trichloroethene began in April 1996 with the planting of cottonwood trees over an approximately 0.2-hectare area at the Naval Air Station, Fort Worth, Tx. Ground water was sampled in July 1997, November 1997, February 1998, and June 1998. Analyses from samples indicate that tree roots have the potential to create anaerobic conditions in the ground water that will facilitate degradation of trichloroethene by microbially mediated reductive dichlorination. Dissolved oxygen concentrations, which varied across the site, were smallest near a mature cottonwood tree (about-20 years old) 60 meters southwest of the cottonwood plantings. Reduction of dissolved oxygen is the primary microbially mediated reaction occurring in the ground water beneath the planted trees, whereas near the mature cottonwood tree, data indicate that methanogenesis is the most probable reaction occurring. Reductive dichlorination either is not occurring or is not a primary process away from the mature tree. On the basis of isotopic analyses of carbon-13 at locations away from the mature tree, trichloroethene concentration is controlled by volatilization.
Urban tree mortality: a primer on demographic approaches
Lara A. Roman; John J. Battles; Joe R. McBride
2016-01-01
Realizing the benefits of tree planting programs depends on tree survival. Projections of urban forest ecosystem services and cost-benefit analyses are sensitive to assumptions about tree mortality rates. Long-term mortality data are needed to improve the accuracy of these models and optimize the public investment in tree planting. With more accurate population...
Observed Methods for Felling Hardwood Trees with Chain Saws
Jerry L. Koger
1983-01-01
The angles and lengths of the cutting surfaces made by chain saw operators on hardwood tree stumps are described by means, standard deviations, ranges, and regression equations. Recommended felling guidelines are compared with observed felling methods used by experienced timber cutters in the southern Appalachian Mountains.
One tree to link them all: a phylogenetic dataset for the European tetrapoda.
Roquet, Cristina; Lavergne, Sébastien; Thuiller, Wilfried
2014-08-08
Since the ever-increasing availability of phylogenetic informative data, the last decade has seen an upsurge of ecological studies incorporating information on evolutionary relationships among species. However, detailed species-level phylogenies are still lacking for many large groups and regions, which are necessary for comprehensive large-scale eco-phylogenetic analyses. Here, we provide a dataset of 100 dated phylogenetic trees for all European tetrapods based on a mixture of supermatrix and supertree approaches. Phylogenetic inference was performed separately for each of the main Tetrapoda groups of Europe except mammals (i.e. amphibians, birds, squamates and turtles) by means of maximum likelihood (ML) analyses of supermatrix applying a tree constraint at the family (amphibians and squamates) or order (birds and turtles) levels based on consensus knowledge. For each group, we inferred 100 ML trees to be able to provide a phylogenetic dataset that accounts for phylogenetic uncertainty, and assessed node support with bootstrap analyses. Each tree was dated using penalized-likelihood and fossil calibration. The trees obtained were well-supported by existing knowledge and previous phylogenetic studies. For mammals, we modified the most complete supertree dataset available on the literature to include a recent update of the Carnivora clade. As a final step, we merged the phylogenetic trees of all groups to obtain a set of 100 phylogenetic trees for all European Tetrapoda species for which data was available (91%). We provide this phylogenetic dataset (100 chronograms) for the purpose of comparative analyses, macro-ecological or community ecology studies aiming to incorporate phylogenetic information while accounting for phylogenetic uncertainty.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rupšys, P.
A system of stochastic differential equations (SDE) with mixed-effects parameters and multivariate normal copula density function were used to develop tree height model for Scots pine trees in Lithuania. A two-step maximum likelihood parameter estimation method is used and computational guidelines are given. After fitting the conditional probability density functions to outside bark diameter at breast height, and total tree height, a bivariate normal copula distribution model was constructed. Predictions from the mixed-effects parameters SDE tree height model calculated during this research were compared to the regression tree height equations. The results are implemented in the symbolic computational language MAPLE.
Wylie, Bruce K.; Howard, Daniel; Dahal, Devendra; Gilmanov, Tagir; Ji, Lei; Zhang, Li; Smith, Kelcy
2016-01-01
This paper presents the methodology and results of two ecological-based net ecosystem production (NEP) regression tree models capable of up scaling measurements made at various flux tower sites throughout the U.S. Great Plains. Separate grassland and cropland NEP regression tree models were trained using various remote sensing data and other biogeophysical data, along with 15 flux towers contributing to the grassland model and 15 flux towers for the cropland model. The models yielded weekly mean daily grassland and cropland NEP maps of the U.S. Great Plains at 250 m resolution for 2000–2008. The grassland and cropland NEP maps were spatially summarized and statistically compared. The results of this study indicate that grassland and cropland ecosystems generally performed as weak net carbon (C) sinks, absorbing more C from the atmosphere than they released from 2000 to 2008. Grasslands demonstrated higher carbon sink potential (139 g C·m−2·year−1) than non-irrigated croplands. A closer look into the weekly time series reveals the C fluctuation through time and space for each land cover type.
Zhao, Cai-Yun; Li, Jun-Sheng; Xu, Jing; Liu, Xiao-Yan
2017-05-01
Globalization increases the opportunities for unintentionally introduced invasive alien species, especially for insects, and most of these species could damage ecosystems and cause economic loss in China. In this study, we analyzed drivers of the distribution of unintentionally introduced invasive alien insects. Based on the number of unintentionally introduced invasive alien insects and their presence/absence records in each province in mainland China, regression trees were built to elucidate the roles of environmental and anthropogenic factors on the number distribution and similarity of species composition of these insects. Classification and regression trees indicated climatic suitability (the mean temperature in January) and human economic activity (sum of total freight) are primary drivers for the number distribution pattern of unintentionally introduced invasive alien insects at provincial scale, while only environmental factors (the mean January temperature, the annual precipitation and the areas of provinces) significantly affect the similarity of them based on the multivariate regression trees. © The Authors 2017. Published by Oxford University Press on behalf of Entomological Society of America.
Brito-Rocha, E; Schilling, A C; Dos Anjos, L; Piotto, D; Dalmolin, A C; Mielke, M S
2016-01-01
Individual leaf area (LA) is a key variable in studies of tree ecophysiology because it directly influences light interception, photosynthesis and evapotranspiration of adult trees and seedlings. We analyzed the leaf dimensions (length - L and width - W) of seedlings and adults of seven Neotropical rainforest tree species (Brosimum rubescens, Manilkara maxima, Pouteria caimito, Pouteria torta, Psidium cattleyanum, Symphonia globulifera and Tabebuia stenocalyx) with the objective to test the feasibility of single regression models to estimate LA of both adults and seedlings. In southern Bahia, Brazil, a first set of data was collected between March and October 2012. From the seven species analyzed, only two (P. cattleyanum and T. stenocalyx) had very similar relationships between LW and LA in both ontogenetic stages. For these two species, a second set of data was collected in August 2014, in order to validate the single models encompassing adult and seedlings. Our results show the possibility of development of models for predicting individual leaf area encompassing different ontogenetic stages for tropical tree species. The development of these models was more dependent on the species than the differences in leaf size between seedlings and adults.
NASA Astrophysics Data System (ADS)
Fadaei, H.; Suzuki, R.; Sakai, T.; Torii, K.
2012-07-01
Vegetation indices that provide important key to predict amount vegetation in forest such as percentage vegetation cover, aboveground biomass, and leaf-area index. Arid and semi-arid areas are not exempt of this rule. Arid and semi-arid areas of northeast Iran cover about 3.4 million ha and are populated by two main tree species, the broadleaf Pistacia vera (pistachio) and the conifer Juniperus excelsa ssp. polycarpos (Persian juniper). Natural stands of pistachio in Iran are not only environmentally important but also genetically essential as seed sources for pistachio production in orchards. We investigated the relationships between tree density and vegetation indices in the arid and semi-arid regions in the northeast of Iran by analysing Advanced Land Observing Satellite (ALOS) data PRISM is a panchromatic radiometer with a 2.5 m spatial resolution at nadir, and has one band with a wavelength of 0.52-0.77 μm (JAXA EORC). AVNIR-2 is a visible and near infrared radiometer for observing land and coastal zones with a 10 m spatial resolution at nadir, and has four multispectral bands: blue (0.42-0.50 μm), green (0.52-0.60 μm), red (0.61-0.69 μm), and near infrared (0.76-0.89 μm) (JAXA EORC). In this study, we estimated various vegetation indices using maximum filtering algorithm (5×5) and examined. This study carried out of juniper forests and natural pistachio stand using Advanced Land Observing Satellite (ALOS) and field inventories. Have been compared linear regression model of vegetation indices and proposed new vegetation index for arid and semi-arid regions. Also, we estimated the densities of juniper forests and natural pistachio stands using remote sensing to help in the sustainable management and production of pistachio in Iran. We present a new vegetation index for arid and semi-arid regions with sparse forest cover, the Total Ratio Vegetation Index (TRVI), and we investigate the relationship of the new index to tree density by analysing data from the Advanced Land Observing Satellite (ALOS) using 5×5 maximum filtering algorithms. The results for pistachio forest showed the coefficient regression of NDVI, SAVI, MSAVI, OSAVI, and TRVI were (R2= 0.68, 0.67, 0.68, 0.68, and 0.71) respectively. The results for juniper forest showed the coefficient regression of NDVI, SAVI, MSAVI, OSAVI, and TRVI were (R2= 0.51, 0.52, 0.51, 0.52, and 0.56) respectively. I hope this research can provide decision of managers to helping sustainable management for arid and semi-arid regions in Iran.
Ng, S S W; Lak, D C C; Lee, S C K; Ng, P P K
2015-03-01
Occupational therapists play a major role in the assessment and referral of clients with severe mental illness for supported employment. Nonetheless, there is scarce literature about the content and predictive validity of the process. In addition, the criteria of successful job matching have not been analysed and job supervisors have relied on experience rather than objective standards in recruitment. This study aimed to explore the profile of successful clients working in 'shop sales' in a supportive environment using a neurocognitive assessment protocol, and to validate the protocol against 'internal standards' of the job supervisors. This was a concurrent validation study of criterion-related scales for a single job type. The subjective ratings from the supervisors were concurrently validated against the results of neurocognitive assessment of intellectual function and work-related cognitive behaviour. A regression model was established for clients who succeeded and failed in employment using supervisor's ratings and a cutoff value of 10.5 for the Performance Fitness Rating Scale (R(2) = 0.918, F[41] = 3.794, p = 0.003). Classification And Regression Tree was also plotted to identify the profile of cases, with an overall accuracy of 0.861 (relative error, 0.26). Use of both inference statistics and data mining techniques enables the decision tree of neurocognitive assessments to be more readily applied by therapists in vocational rehabilitation, and thus directly improve the efficiency and efficacy of the process.
Deciphering factors controlling groundwater arsenic spatial variability in Bangladesh
NASA Astrophysics Data System (ADS)
Tan, Z.; Yang, Q.; Zheng, C.; Zheng, Y.
2017-12-01
Elevated concentrations of geogenic arsenic in groundwater have been found in many countries to exceed 10 μg/L, the WHO's guideline value for drinking water. A common yet unexplained characteristic of groundwater arsenic spatial distribution is the extensive variability at various spatial scales. This study investigates factors influencing the spatial variability of groundwater arsenic in Bangladesh to improve the accuracy of models predicting arsenic exceedance rate spatially. A novel boosted regression tree method is used to establish a weak-learning ensemble model, which is compared to a linear model using a conventional stepwise logistic regression method. The boosted regression tree models offer the advantage of parametric interaction when big datasets are analyzed in comparison to the logistic regression. The point data set (n=3,538) of groundwater hydrochemistry with 19 parameters was obtained by the British Geological Survey in 2001. The spatial data sets of geological parameters (n=13) were from the Consortium for Spatial Information, Technical University of Denmark, University of East Anglia and the FAO, while the soil parameters (n=42) were from the Harmonized World Soil Database. The aforementioned parameters were regressed to categorical groundwater arsenic concentrations below or above three thresholds: 5 μg/L, 10 μg/L and 50 μg/L to identify respective controlling factors. Boosted regression tree method outperformed logistic regression methods in all three threshold levels in terms of accuracy, specificity and sensitivity, resulting in an improvement of spatial distribution map of probability of groundwater arsenic exceeding all three thresholds when compared to disjunctive-kriging interpolated spatial arsenic map using the same groundwater arsenic dataset. Boosted regression tree models also show that the most important controlling factors of groundwater arsenic distribution include groundwater iron content and well depth for all three thresholds. The probability of a well with iron content higher than 5mg/L to contain greater than 5 μg/L, 10 μg/L and 50 μg/L As is estimated to be more than 91%, 85% and 51%, respectively, while the probability of a well from depth more than 160m to contain more than 5 μg/L, 10 μg/L and 50 μg/L As is estimated to be less than 38%, 25% and 14%, respectively.
Bayesian averaging over Decision Tree models for trauma severity scoring.
Schetinin, V; Jakaite, L; Krzanowski, W
2018-01-01
Health care practitioners analyse possible risks of misleading decisions and need to estimate and quantify uncertainty in predictions. We have examined the "gold" standard of screening a patient's conditions for predicting survival probability, based on logistic regression modelling, which is used in trauma care for clinical purposes and quality audit. This methodology is based on theoretical assumptions about data and uncertainties. Models induced within such an approach have exposed a number of problems, providing unexplained fluctuation of predicted survival and low accuracy of estimating uncertainty intervals within which predictions are made. Bayesian method, which in theory is capable of providing accurate predictions and uncertainty estimates, has been adopted in our study using Decision Tree models. Our approach has been tested on a large set of patients registered in the US National Trauma Data Bank and has outperformed the standard method in terms of prediction accuracy, thereby providing practitioners with accurate estimates of the predictive posterior densities of interest that are required for making risk-aware decisions. Copyright © 2017 Elsevier B.V. All rights reserved.
Healey, Ryan; Naugler, Christopher; de Koning, Lawrence; Patel, Jay L
2015-01-01
We sought to improve the diagnostic efficiency of flow cytometry investigation on blood by developing data-driven ordering guidelines. Our goal was to improve flow cytometry utilization by decreasing negative testing, therefore reducing healthcare costs. We investigated several laboratory tests performed alongside flow cytometry to identify biomarkers useful in excluding non-leukemic bloods. Test results and patient demographic features were subjected to receiver-operator characteristic (ROC) curve, logistic regression and classification tree analyses to find significant predictors and develop decision rules. Our data show that, in the absence of a compelling clinical indication, flow cytometry testing is largely non-informative on bloods from patients less than 50 years of age having an absolute lymphocyte count (ALC) below 5.0 × 10(9)/L. For patients over age 50 having an ALC below this value, a ferritin value above 450 μg/L is counter-indicative of B-cell clonality. Using these guidelines, 26% of cases were correctly predicted as negative with greater than 97% accuracy.
Field responses of Prunus serotina and Asclepias syriaca to ozone around southern Lake Michigan.
Bennett, J P; Jepsen, E A; Roth, J A
2006-07-01
Higher ozone concentrations east of southern Lake Michigan compared to west of the lake were used to test hypotheses about injury and growth effects on two plant species. We measured approximately 1000 black cherry trees and over 3000 milkweed stems from 1999 to 2001 for this purpose. Black cherry branch elongation and milkweed growth and pod formation were significantly higher west of Lake Michigan while ozone injury was greater east of Lake Michigan. Using classification and regression tree (CART) analyses we determined that departures from normal precipitation, soil nitrogen and ozone exposure/peak hourly concentrations were the most important variables affecting cherry branch elongation, and milkweed stem height and pod formation. The effects of ozone were not consistently comparable with the effects of soil nutrients, weather, insect or disease injury, and depended on species. Ozone SUM06 exposures greater than 13 ppm-h decreased cherry branch elongation 18%; peak 1-h exposures greater than 93 ppb reduced milkweed stem height 13%; and peak 1-h concentrations greater than 98 ppb reduced pod formation 11% in milkweed.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Andrew G. Peterson; J. Timothy Ball; Yiqi Luo
1998-09-25
Estimation of leaf photosynthetic rate (A) from leaf nitrogen content (N) is both conceptually and numerically important in models of plant, ecosystem and biosphere responses to global change. The relationship between A and N has been studied extensively at ambient CO{sub 2} but much less at elevated CO{sub 2}. This study was designed to (1) assess whether the A-N relationship was more similar for species within than between community and vegetation types, and (2) examine how growth at elevated CO{sub 2} affects the A-N relationship. Data were obtained for 39 C{sub 3} species grown at ambient CO{sub 2} and 10more » C{sub 3} species grown at ambient and elevated CO{sub 2}. A regression model was applied to each species as well as to species pooled within different community and vegetation types. Cluster analysis of the regression coefficients indicated that species measured at ambient CO{sub 2} did not separate into distinct groups matching community or vegetation type. Instead, most community and vegetation types shared the same general parameter space for regression coefficients. Growth at elevated CO{sub 2} increased photosynthetic nitrogen use efficiency for pines and deciduous trees. When species were pooled by vegetation type, the A-N relationship for deciduous trees expressed on a leaf-mass bask was not altered by elevated CO{sub 2}, while the intercept increased for pines. When regression coefficients were averaged to give mean responses for different vegetation types, elevated CO{sub 2} increased the intercept and the slope for deciduous trees but increased only the intercept for pines. There were no statistical differences between the pines and deciduous trees for the effect of CO{sub 2}. Generalizations about the effect of elevated CO{sub 2} on the A-N relationship, and differences between pines and deciduous trees will be enhanced as more data become available.« less
NASA Astrophysics Data System (ADS)
Rao, M.; Vuong, H.
2013-12-01
The overall objective of this study is to develop a method for estimating total aboveground biomass of redwood stands in Jackson Demonstration State Forest, Mendocino, California using airborne LiDAR data. LiDAR data owing to its vertical and horizontal accuracy are increasingly being used to characterize landscape features including ground surface elevation and canopy height. These LiDAR-derived metrics involving structural signatures at higher precision and accuracy can help better understand ecological processes at various spatial scales. Our study is focused on two major species of the forest: redwood (Sequoia semperirens [D.Don] Engl.) and Douglas-fir (Pseudotsuga mensiezii [Mirb.] Franco). Specifically, the objectives included linear regression models fitting tree diameter at breast height (dbh) to LiDAR derived height for each species. From 23 random points on the study area, field measurement (dbh and tree coordinate) were collected for more than 500 trees of Redwood and Douglas-fir over 0.2 ha- plots. The USFS-FUSION application software along with its LiDAR Data Viewer (LDV) were used to to extract Canopy Height Model (CHM) from which tree heights would be derived. Based on the LiDAR derived height and ground based dbh, a linear regression model was developed to predict dbh. The predicted dbh was used to estimate the biomass at the single tree level using Jenkin's formula (Jenkin et al 2003). The linear regression models were able to explain 65% of the variability associated with Redwood's dbh and 80% of that associated with Douglas-fir's dbh.
Dispersion patterns and sampling plans for Diaphorina citri (Hemiptera: Psyllidae) in citrus.
Sétamou, Mamoudou; Flores, Daniel; French, J Victor; Hall, David G
2008-08-01
The abundance and spatial dispersion of Diaphorina citri Kuwayama (Hemiptera: Psyllidae) were studied in 34 grapefruit (Citrus paradisi Macfad.) and six sweet orange [Citrus sinensis (L.) Osbeck] orchards from March to August 2006 when the pest is more abundant in southern Texas. Although flush shoot infestation levels did not vary with host plant species, densities of D. citri eggs, nymphs, and adults were significantly higher on sweet orange than on grapefruit. D. citri immatures also were found in significantly higher numbers in the southeastern quadrant of trees than other parts of the canopy. The spatial distribution of D. citri nymphs and adults was analyzed using Iowa's patchiness regression and Taylor's power law. Taylor's power law fitted the data better than Iowa's model. Based on both regression models, the field dispersion patterns of D. citri nymphs and adults were aggregated among flush shoots in individual trees as indicated by the regression slopes that were significantly >1. For the average density of each life stage obtained during our surveys, the minimum number of flush shoots per tree needed to estimate D. citri densities varied from eight for eggs to four flush shoots for adults. Projections indicated that a sampling plan consisting of 10 trees and eight flush shoots per tree would provide density estimates of the three developmental stages of D. citri acceptable enough for population studies and management decisions. A presence-absence sampling plan with a fixed precision level was developed and can be used to provide a quick estimation of D. citri populations in citrus orchards.
1985-12-01
consists of the node t and all descendants of t in T. (3) Definition 3. Pruning a branch Tt from a tree T con- sists of deleting from T all...The default is 1.0 so that actually, this keyword did not need to appear in the above file. (5) DELETE . This keyword does not appear in our example, but...when it is used associated with some variable names, it indicates that we want to delete these vari- ables from the regression. If this keyword is
Jovanovic, Milos; Radovanovic, Sandro; Vukicevic, Milan; Van Poucke, Sven; Delibasic, Boris
2016-09-01
Quantification and early identification of unplanned readmission risk have the potential to improve the quality of care during hospitalization and after discharge. However, high dimensionality, sparsity, and class imbalance of electronic health data and the complexity of risk quantification, challenge the development of accurate predictive models. Predictive models require a certain level of interpretability in order to be applicable in real settings and create actionable insights. This paper aims to develop accurate and interpretable predictive models for readmission in a general pediatric patient population, by integrating a data-driven model (sparse logistic regression) and domain knowledge based on the international classification of diseases 9th-revision clinical modification (ICD-9-CM) hierarchy of diseases. Additionally, we propose a way to quantify the interpretability of a model and inspect the stability of alternative solutions. The analysis was conducted on >66,000 pediatric hospital discharge records from California, State Inpatient Databases, Healthcare Cost and Utilization Project between 2009 and 2011. We incorporated domain knowledge based on the ICD-9-CM hierarchy in a data driven, Tree-Lasso regularized logistic regression model, providing the framework for model interpretation. This approach was compared with traditional Lasso logistic regression resulting in models that are easier to interpret by fewer high-level diagnoses, with comparable prediction accuracy. The results revealed that the use of a Tree-Lasso model was as competitive in terms of accuracy (measured by area under the receiver operating characteristic curve-AUC) as the traditional Lasso logistic regression, but integration with the ICD-9-CM hierarchy of diseases provided more interpretable models in terms of high-level diagnoses. Additionally, interpretations of models are in accordance with existing medical understanding of pediatric readmission. Best performing models have similar performances reaching AUC values 0.783 and 0.779 for traditional Lasso and Tree-Lasso, respectfully. However, information loss of Lasso models is 0.35 bits higher compared to Tree-Lasso model. We propose a method for building predictive models applicable for the detection of readmission risk based on Electronic Health records. Integration of domain knowledge (in the form of ICD-9-CM taxonomy) and a data-driven, sparse predictive algorithm (Tree-Lasso Logistic Regression) resulted in an increase of interpretability of the resulting model. The models are interpreted for the readmission prediction problem in general pediatric population in California, as well as several important subpopulations, and the interpretations of models comply with existing medical understanding of pediatric readmission. Finally, quantitative assessment of the interpretability of the models is given, that is beyond simple counts of selected low-level features. Copyright © 2016 Elsevier B.V. All rights reserved.
Stadler, Tanja; Degnan, James H.; Rosenberg, Noah A.
2016-01-01
Classic null models for speciation and extinction give rise to phylogenies that differ in distribution from empirical phylogenies. In particular, empirical phylogenies are less balanced and have branching times closer to the root compared to phylogenies predicted by common null models. This difference might be due to null models of the speciation and extinction process being too simplistic, or due to the empirical datasets not being representative of random phylogenies. A third possibility arises because phylogenetic reconstruction methods often infer gene trees rather than species trees, producing an incongruity between models that predict species tree patterns and empirical analyses that consider gene trees. We investigate the extent to which the difference between gene trees and species trees under a combined birth–death and multispecies coalescent model can explain the difference in empirical trees and birth–death species trees. We simulate gene trees embedded in simulated species trees and investigate their difference with respect to tree balance and branching times. We observe that the gene trees are less balanced and typically have branching times closer to the root than the species trees. Empirical trees from TreeBase are also less balanced than our simulated species trees, and model gene trees can explain an imbalance increase of up to 8% compared to species trees. However, we see a much larger imbalance increase in empirical trees, about 100%, meaning that additional features must also be causing imbalance in empirical trees. This simulation study highlights the necessity of revisiting the assumptions made in phylogenetic analyses, as these assumptions, such as equating the gene tree with the species tree, might lead to a biased conclusion. PMID:26968785
Error analysis of leaf area estimates made from allometric regression models
NASA Technical Reports Server (NTRS)
Feiveson, A. H.; Chhikara, R. S.
1986-01-01
Biological net productivity, measured in terms of the change in biomass with time, affects global productivity and the quality of life through biochemical and hydrological cycles and by its effect on the overall energy balance. Estimating leaf area for large ecosystems is one of the more important means of monitoring this productivity. For a particular forest plot, the leaf area is often estimated by a two-stage process. In the first stage, known as dimension analysis, a small number of trees are felled so that their areas can be measured as accurately as possible. These leaf areas are then related to non-destructive, easily-measured features such as bole diameter and tree height, by using a regression model. In the second stage, the non-destructive features are measured for all or for a sample of trees in the plots and then used as input into the regression model to estimate the total leaf area. Because both stages of the estimation process are subject to error, it is difficult to evaluate the accuracy of the final plot leaf area estimates. This paper illustrates how a complete error analysis can be made, using an example from a study made on aspen trees in northern Minnesota. The study was a joint effort by NASA and the University of California at Santa Barbara known as COVER (Characterization of Vegetation with Remote Sensing).
STRIDE: Species Tree Root Inference from Gene Duplication Events.
Emms, David M; Kelly, Steven
2017-12-01
The correct interpretation of any phylogenetic tree is dependent on that tree being correctly rooted. We present STRIDE, a fast, effective, and outgroup-free method for identification of gene duplication events and species tree root inference in large-scale molecular phylogenetic analyses. STRIDE identifies sets of well-supported in-group gene duplication events from a set of unrooted gene trees, and analyses these events to infer a probability distribution over an unrooted species tree for the location of its root. We show that STRIDE correctly identifies the root of the species tree in multiple large-scale molecular phylogenetic data sets spanning a wide range of timescales and taxonomic groups. We demonstrate that the novel probability model implemented in STRIDE can accurately represent the ambiguity in species tree root assignment for data sets where information is limited. Furthermore, application of STRIDE to outgroup-free inference of the origin of the eukaryotic tree resulted in a root probability distribution that provides additional support for leading hypotheses for the origin of the eukaryotes. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Bryson, Robert W; Jaeger, Jef R; Lemos-Espinal, Julio A; Lazcano, David
2012-09-01
Interpretations of phylogeographic patterns can change when analyses shift from single gene-tree to multilocus coalescent analyses. Using multilocus coalescent approaches, a species tree and divergence times can be estimated from a set of gene trees while accounting for gene-tree stochasticity. We utilized the conceptual strengths of a multilocus coalescent approach coupled with complete range-wide sampling to examine the speciation history of a broadly distributed, North American warm-desert toad, Anaxyrus punctatus. Phylogenetic analyses provided strong support for three major lineages within A. punctatus. Each lineage broadly corresponded to one of three desert regions. Early speciation in A. punctatus appeared linked to late Miocene-Pliocene development of the Baja California peninsula. This event was likely followed by a Pleistocene divergence associated with the separation of the Chihuahuan and Sonoran Deserts. Our multilocus coalescent-based reconstruction provides an informative contrast to previous single gene-tree estimates of the evolutionary history of A. punctatus. Copyright © 2012 Elsevier Inc. All rights reserved.
Rieger, Isaak; Kowarik, Ingo; Cherubini, Paolo; Cierjacks, Arne
2017-01-01
Aboveground carbon (C) sequestration in trees is important in global C dynamics, but reliable techniques for its modeling in highly productive and heterogeneous ecosystems are limited. We applied an extended dendrochronological approach to disentangle the functioning of drivers from the atmosphere (temperature, precipitation), the lithosphere (sedimentation rate), the hydrosphere (groundwater table, river water level fluctuation), the biosphere (tree characteristics), and the anthroposphere (dike construction). Carbon sequestration in aboveground biomass of riparian Quercus robur L. and Fraxinus excelsior L. was modeled (1) over time using boosted regression tree analysis (BRT) on cross-datable trees characterized by equal annual growth ring patterns and (2) across space using a subsequent classification and regression tree analysis (CART) on cross-datable and not cross-datable trees. While C sequestration of cross-datable Q. robur responded to precipitation and temperature, cross-datable F. excelsior also responded to a low Danube river water level. However, CART revealed that C sequestration over time is governed by tree height and parameters that vary over space (magnitude of fluctuation in the groundwater table, vertical distance to mean river water level, and longitudinal distance to upstream end of the study area). Thus, a uniform response to climatic drivers of aboveground C sequestration in Q. robur was only detectable in trees of an intermediate height class and in taller trees (>21.8m) on sites where the groundwater table fluctuated little (≤0.9m). The detection of climatic drivers and the river water level in F. excelsior depended on sites at lower altitudes above the mean river water level (≤2.7m) and along a less dynamic downstream section of the study area. Our approach indicates unexploited opportunities of understanding the interplay of different environmental drivers in aboveground C sequestration. Results may support species-specific and locally adapted forest management plans to increase carbon dioxide sequestration from the atmosphere in trees. Copyright © 2016 Elsevier B.V. All rights reserved.
Molecular proxies for climate maladaptation in a long-lived tree (Pinus pinaster Aiton, Pinaceae).
Jaramillo-Correa, Juan-Pablo; Rodríguez-Quilón, Isabel; Grivet, Delphine; Lepoittevin, Camille; Sebastiani, Federico; Heuertz, Myriam; Garnier-Géré, Pauline H; Alía, Ricardo; Plomion, Christophe; Vendramin, Giovanni G; González-Martínez, Santiago C
2015-03-01
Understanding adaptive genetic responses to climate change is a main challenge for preserving biological diversity. Successful predictive models for climate-driven range shifts of species depend on the integration of information on adaptation, including that derived from genomic studies. Long-lived forest trees can experience substantial environmental change across generations, which results in a much more prominent adaptation lag than in annual species. Here, we show that candidate-gene SNPs (single nucleotide polymorphisms) can be used as predictors of maladaptation to climate in maritime pine (Pinus pinaster Aiton), an outcrossing long-lived keystone tree. A set of 18 SNPs potentially associated with climate, 5 of them involving amino acid-changing variants, were retained after performing logistic regression, latent factor mixed models, and Bayesian analyses of SNP-climate correlations. These relationships identified temperature as an important adaptive driver in maritime pine and highlighted that selective forces are operating differentially in geographically discrete gene pools. The frequency of the locally advantageous alleles at these selected loci was strongly correlated with survival in a common garden under extreme (hot and dry) climate conditions, which suggests that candidate-gene SNPs can be used to forecast the likely destiny of natural forest ecosystems under climate change scenarios. Differential levels of forest decline are anticipated for distinct maritime pine gene pools. Geographically defined molecular proxies for climate adaptation will thus critically enhance the predictive power of range-shift models and help establish mitigation measures for long-lived keystone forest trees in the face of impending climate change. Copyright © 2015 by the Genetics Society of America.
Pitcher, Brandon; Alaqla, Ali; Noujeim, Marcel; Wealleans, James A; Kotsakis, Georgios; Chrepa, Vanessa
2017-03-01
Cone-beam computed tomographic (CBCT) analysis allows for 3-dimensional assessment of periradicular lesions and may facilitate preoperative periapical cyst screening. The purpose of this study was to develop and assess the predictive validity of a cyst screening method based on CBCT volumetric analysis alone or combined with designated radiologic criteria. Three independent examiners evaluated 118 presurgical CBCT scans from cases that underwent apicoectomies and had an accompanying gold standard histopathological diagnosis of either a cyst or granuloma. Lesion volume, density, and specific radiologic characteristics were assessed using specialized software. Logistic regression models with histopathological diagnosis as the dependent variable were constructed for cyst prediction, and receiver operating characteristic curves were used to assess the predictive validity of the models. A conditional inference binary decision tree based on a recursive partitioning algorithm was constructed to facilitate preoperative screening. Interobserver agreement was excellent for volume and density, but it varied from poor to good for the radiologic criteria. Volume and root displacement were strong predictors for cyst screening in all analyses. The binary decision tree classifier determined that if the volume of the lesion was >247 mm 3 , there was 80% probability of a cyst. If volume was <247 mm 3 and root displacement was present, cyst probability was 60% (78% accuracy). The good accuracy and high specificity of the decision tree classifier renders it a useful preoperative cyst screening tool that can aid in clinical decision making but not a substitute for definitive histopathological diagnosis after biopsy. Confirmatory studies are required to validate the present findings. Published by Elsevier Inc.
Molecular Proxies for Climate Maladaptation in a Long-Lived Tree (Pinus pinaster Aiton, Pinaceae)
Jaramillo-Correa, Juan-Pablo; Rodríguez-Quilón, Isabel; Grivet, Delphine; Lepoittevin, Camille; Sebastiani, Federico; Heuertz, Myriam; Garnier-Géré, Pauline H.; Alía, Ricardo; Plomion, Christophe; Vendramin, Giovanni G.; González-Martínez, Santiago C.
2015-01-01
Understanding adaptive genetic responses to climate change is a main challenge for preserving biological diversity. Successful predictive models for climate-driven range shifts of species depend on the integration of information on adaptation, including that derived from genomic studies. Long-lived forest trees can experience substantial environmental change across generations, which results in a much more prominent adaptation lag than in annual species. Here, we show that candidate-gene SNPs (single nucleotide polymorphisms) can be used as predictors of maladaptation to climate in maritime pine (Pinus pinaster Aiton), an outcrossing long-lived keystone tree. A set of 18 SNPs potentially associated with climate, 5 of them involving amino acid-changing variants, were retained after performing logistic regression, latent factor mixed models, and Bayesian analyses of SNP–climate correlations. These relationships identified temperature as an important adaptive driver in maritime pine and highlighted that selective forces are operating differentially in geographically discrete gene pools. The frequency of the locally advantageous alleles at these selected loci was strongly correlated with survival in a common garden under extreme (hot and dry) climate conditions, which suggests that candidate-gene SNPs can be used to forecast the likely destiny of natural forest ecosystems under climate change scenarios. Differential levels of forest decline are anticipated for distinct maritime pine gene pools. Geographically defined molecular proxies for climate adaptation will thus critically enhance the predictive power of range-shift models and help establish mitigation measures for long-lived keystone forest trees in the face of impending climate change. PMID:25549630
Lambert, Shea M; Reeder, Tod W; Wiens, John J
2015-01-01
Simulation studies suggest that coalescent-based species-tree methods are generally more accurate than concatenated analyses. However, these species-tree methods remain impractical for many large datasets. Thus, a critical but unresolved issue is when and why concatenated and coalescent species-tree estimates will differ. We predict such differences for branches in concatenated trees that are short, weakly supported, and have conflicting gene trees. We test these predictions in Scincidae, the largest lizard family, with data from 10 nuclear genes for 17 ingroup taxa and 44 genes for 12 taxa. We support our initial predictions, andsuggest that simply considering uncertainty in concatenated trees may sometimes encompass the differences between these methods. We also found that relaxed-clock concatenated trees can be surprisingly similar to the species-tree estimate. Remarkably, the coalescent species-tree estimates had slightly lower support values when based on many more genes (44 vs. 10) and a small (∼30%) reduction in taxon sampling. Thus, taxon sampling may be more important than gene sampling when applying species-tree methods to deep phylogenetic questions. Finally, our coalescent species-tree estimates tentatively support division of Scincidae into three monophyletic subfamilies, a result otherwise found only in concatenated analyses with extensive species sampling. Copyright © 2014 Elsevier Inc. All rights reserved.
Inferring gene regression networks with model trees
2010-01-01
Background Novel strategies are required in order to handle the huge amount of data produced by microarray technologies. To infer gene regulatory networks, the first step is to find direct regulatory relationships between genes building the so-called gene co-expression networks. They are typically generated using correlation statistics as pairwise similarity measures. Correlation-based methods are very useful in order to determine whether two genes have a strong global similarity but do not detect local similarities. Results We propose model trees as a method to identify gene interaction networks. While correlation-based methods analyze each pair of genes, in our approach we generate a single regression tree for each gene from the remaining genes. Finally, a graph from all the relationships among output and input genes is built taking into account whether the pair of genes is statistically significant. For this reason we apply a statistical procedure to control the false discovery rate. The performance of our approach, named REGNET, is experimentally tested on two well-known data sets: Saccharomyces Cerevisiae and E.coli data set. First, the biological coherence of the results are tested. Second the E.coli transcriptional network (in the Regulon database) is used as control to compare the results to that of a correlation-based method. This experiment shows that REGNET performs more accurately at detecting true gene associations than the Pearson and Spearman zeroth and first-order correlation-based methods. Conclusions REGNET generates gene association networks from gene expression data, and differs from correlation-based methods in that the relationship between one gene and others is calculated simultaneously. Model trees are very useful techniques to estimate the numerical values for the target genes by linear regression functions. They are very often more precise than linear regression models because they can add just different linear regressions to separate areas of the search space favoring to infer localized similarities over a more global similarity. Furthermore, experimental results show the good performance of REGNET. PMID:20950452
More Trees, More Poverty? The Socioeconomic Effects of Tree Plantations in Chile, 2001-2011
NASA Astrophysics Data System (ADS)
Andersson, Krister; Lawrence, Duncan; Zavaleta, Jennifer; Guariguata, Manuel R.
2016-01-01
Tree plantations play a controversial role in many nations' efforts to balance goals for economic development, ecological conservation, and social justice. This paper seeks to contribute to this debate by analyzing the socioeconomic impact of such plantations. We focus our study on Chile, a country that has experienced extraordinary growth of industrial tree plantations. Our analysis draws on a unique dataset with longitudinal observations collected in 180 municipal territories during 2001-2011. Employing panel data regression techniques, we find that growth in plantation area is associated with higher than average rates of poverty during this period.
More Trees, More Poverty? The Socioeconomic Effects of Tree Plantations in Chile, 2001-2011.
Andersson, Krister; Lawrence, Duncan; Zavaleta, Jennifer; Guariguata, Manuel R
2016-01-01
Tree plantations play a controversial role in many nations' efforts to balance goals for economic development, ecological conservation, and social justice. This paper seeks to contribute to this debate by analyzing the socioeconomic impact of such plantations. We focus our study on Chile, a country that has experienced extraordinary growth of industrial tree plantations. Our analysis draws on a unique dataset with longitudinal observations collected in 180 municipal territories during 2001-2011. Employing panel data regression techniques, we find that growth in plantation area is associated with higher than average rates of poverty during this period.
A Decision Tree to Identify Children Affected by Prenatal Alcohol Exposure
Goh, Patrick K.; Doyle, Lauren R.; Glass, Leila; Jones, Kenneth L.; Riley, Edward P.; Coles, Claire D.; Hoyme, H. Eugene; Kable, Julie A.; May, Philip A.; Kalberg, Wendy O.; Elizabeth, R. Sowell; Wozniak, Jeffrey R.; Mattson, Sarah N.
2017-01-01
Objective To develop and validate a hierarchical decision tree model, combining neurobehavioral and physical measures, for identification of children affected by prenatal alcohol exposure even when facial dysmorphology is not present. Study design Data were collected as part of a multisite study across the United States. The model was developed after evaluating over 1000 neurobehavioral and dysmorphology variables collected from 434 children (8–16y) with prenatal alcohol exposure, with and without fetal alcohol syndrome (FAS), and non-exposed controls, with and without other clinically-relevant behavioral or cognitive concerns. The model was subsequently validated in an independent sample of 454 children in two age ranges (5–7y or 10–16y). In all analyses, the discriminatory ability of each model step was tested with logistic regression. Classification accuracies and positive and negative predictive values were calculated. Results The model consisted of variables from 4 measures (2 parent questionnaires, an IQ score, and a physical examination). Overall accuracy rates for both the development and validation samples met or exceeded our goal of 80% overall accuracy. Conclusions The decision tree model distinguished children affected by prenatal alcohol exposure from non-exposed controls, including those with other behavioral concerns or conditions. Improving identification of this population will streamline access to clinical services, including multidisciplinary evaluation and treatment. PMID:27476634
The wisdom of the commons: ensemble tree classifiers for prostate cancer prognosis.
Koziol, James A; Feng, Anne C; Jia, Zhenyu; Wang, Yipeng; Goodison, Seven; McClelland, Michael; Mercola, Dan
2009-01-01
Classification and regression trees have long been used for cancer diagnosis and prognosis. Nevertheless, instability and variable selection bias, as well as overfitting, are well-known problems of tree-based methods. In this article, we investigate whether ensemble tree classifiers can ameliorate these difficulties, using data from two recent studies of radical prostatectomy in prostate cancer. Using time to progression following prostatectomy as the relevant clinical endpoint, we found that ensemble tree classifiers robustly and reproducibly identified three subgroups of patients in the two clinical datasets: non-progressors, early progressors and late progressors. Moreover, the consensus classifications were independent predictors of time to progression compared to known clinical prognostic factors.
NASA Astrophysics Data System (ADS)
Parajuli, A.; Nadeau, D.; Anctil, F.; Parent, A. C.; Bouchard, B.; Jutras, S.
2017-12-01
In snow-fed catchments, it is crucial to monitor and to model snow water equivalent (SWE), particularly to simulate the melt water runoff. However, the distribution of SWE can be highly heterogeneous, particularly within forested environments, mainly because of the large variability in snow depths. Although the boreal forest is the dominant land cover in Canada and in a few other northern countries, very few studies have quantified the spatiotemporal variability of snow depths and snowpack dynamics within this biome. The objective of this paper is to fill this research gap, through a detailed monitoring of snowpack dynamics at nine locations within a 3.57 km2 experimental forested catchment in southern Quebec, Canada (47°N, 71°W). The catchment receives 6 m of snow annually on average and is predominantly covered with balsam fir stand with some traces of spruce and white birch. In this study, we used a network of nine so-called `snow profiling stations', providing automated snow depth and snowpack temperature profile measurements, as well as three contrasting sites (juvenile, sapling and open areas) where sublimation rates were directly measured with flux towers. In addition, a total of 1401 manual snow samples supported by 20 snow pits measurements were collected throughout the winter of 2017. This paper presents some preliminary analyses of this unique dataset. Simple empirical relations relying SWE with easy-to-determine proxies, such as snow depths and snow temperature, are tested. Then, binary regression trees and multiple regression analysis are used to model SWE using topographic characteristics (slope, aspect, elevation), forest features (tree height, tree diameter, forest density and gap fraction) and meteorological forcing (solar radiation, wind speed, snow-pack temperature profile, air temperature, humidity). An analysis of sublimation rates comparing open area, saplings and juvenile forest is also presented in this paper.
Beating the Odds: Trees to Success in Different Countries
ERIC Educational Resources Information Center
Finch, W. Holmes; Marchant, Gregory J.
2017-01-01
A recursive partitioning model approach in the form of classification and regression trees (CART) was used with 2012 PISA data for five countries (Canada, Finland, Germany, Singapore-China, and the Unites States). The objective of the study was to determine demographic and educational variables that differentiated between low SES student that were…
Geospatial relationships of tree species damage caused by Hurricane Katrina in south Mississippi
Mark W. Garrigues; Zhaofei Fan; David L. Evans; Scott D. Roberts; William H. Cooke III
2012-01-01
Hurricane Katrina generated substantial impacts on the forests and biological resources of the affected area in Mississippi. This study seeks to use classification tree analysis (CTA) to determine which variables are significant in predicting hurricane damage (shear or windthrow) in the Southeast Mississippi Institute for Forest Inventory District. Logistic regressions...
Using Classification Trees to Predict Alumni Giving for Higher Education
ERIC Educational Resources Information Center
Weerts, David J.; Ronca, Justin M.
2009-01-01
As the relative level of public support for higher education declines, colleges and universities aim to maximize alumni-giving to keep their programs competitive. Anchored in a utility maximization framework, this study employs the classification and regression tree methodology to examine characteristics of alumni donors and non-donors at a…
Updated generalized biomass equations for North American tree species
David C. Chojnacky; Linda S. Heath; Jennifer C. Jenkins
2014-01-01
Historically, tree biomass at large scales has been estimated by applying dimensional analysis techniques and field measurements such as diameter at breast height (dbh) in allometric regression equations. Equations often have been developed using differing methods and applied only to certain species or isolated areas. We previously had compiled and combined (in meta-...
The microcomputer scientific software series 5: the BIOMASS user's guide.
George E. Host; Stephen C. Westin; William G. Cole; Kurt S. Pregitzer
1989-01-01
BIOMASS is an interactive microcomputer program that uses allometric regression equations to calculate aboveground biomass of common tree species of the Lake States. The equations are species-specific and most use both diameter and height as independent variables. The program accommodates fixed area and variable radius sample designs and produces both individual tree...
Northern Arkansas Spring Precipitation Reconstructed from Tree Rings, 1023-1992 A.D.
Malcolm K. Cleaveland
2001-01-01
Three baldcypress (Taxodium distichum (L.) Rich.) tree-ring chronologies in northeastern Arkansas and southeastern Missouri respond strongly to April-June (spring) rainfall in northern Arkansas. I used regression to reconstruct an average of spring rainfall in the three climatic divisions of northern Arkansas since 1023 A.D. The reconstruction was...
Weather Impact on Airport Arrival Meter Fix Throughput
NASA Technical Reports Server (NTRS)
Wang, Yao
2017-01-01
Time-based flow management provides arrival aircraft schedules based on arrival airport conditions, airport capacity, required spacing, and weather conditions. In order to meet a scheduled time at which arrival aircraft can cross an airport arrival meter fix prior to entering the airport terminal airspace, air traffic controllers make regulations on air traffic. Severe weather may create an airport arrival bottleneck if one or more of airport arrival meter fixes are partially or completely blocked by the weather and the arrival demand has not been reduced accordingly. Under these conditions, aircraft are frequently being put in holding patterns until they can be rerouted. A model that predicts the weather impacted meter fix throughput may help air traffic controllers direct arrival flows into the airport more efficiently, minimizing arrival meter fix congestion. This paper presents an analysis of air traffic flows across arrival meter fixes at the Newark Liberty International Airport (EWR). Several scenarios of weather impacted EWR arrival fix flows are described. Furthermore, multiple linear regression and regression tree ensemble learning approaches for translating multiple sector Weather Impacted Traffic Indexes (WITI) to EWR arrival meter fix throughputs are examined. These weather translation models are developed and validated using the EWR arrival flight and weather data for the period of April-September in 2014. This study also compares the performance of the regression tree ensemble with traditional multiple linear regression models for estimating the weather impacted throughputs at each of the EWR arrival meter fixes. For all meter fixes investigated, the results from the regression tree ensemble weather translation models show a stronger correlation between model outputs and observed meter fix throughputs than that produced from multiple linear regression method.
Shi, K-Q; Zhou, Y-Y; Yan, H-D; Li, H; Wu, F-L; Xie, Y-Y; Braddock, M; Lin, X-Y; Zheng, M-H
2017-02-01
At present, there is no ideal model for predicting the short-term outcome of patients with acute-on-chronic hepatitis B liver failure (ACHBLF). This study aimed to establish and validate a prognostic model by using the classification and regression tree (CART) analysis. A total of 1047 patients from two separate medical centres with suspected ACHBLF were screened in the study, which were recognized as derivation cohort and validation cohort, respectively. CART analysis was applied to predict the 3-month mortality of patients with ACHBLF. The accuracy of the CART model was tested using the area under the receiver operating characteristic curve, which was compared with the model for end-stage liver disease (MELD) score and a new logistic regression model. CART analysis identified four variables as prognostic factors of ACHBLF: total bilirubin, age, serum sodium and INR, and three distinct risk groups: low risk (4.2%), intermediate risk (30.2%-53.2%) and high risk (81.4%-96.9%). The new logistic regression model was constructed with four independent factors, including age, total bilirubin, serum sodium and prothrombin activity by multivariate logistic regression analysis. The performances of the CART model (0.896), similar to the logistic regression model (0.914, P=.382), exceeded that of MELD score (0.667, P<.001). The results were confirmed in the validation cohort. We have developed and validated a novel CART model superior to MELD for predicting three-month mortality of patients with ACHBLF. Thus, the CART model could facilitate medical decision-making and provide clinicians with a validated practical bedside tool for ACHBLF risk stratification. © 2016 John Wiley & Sons Ltd.
NASA Astrophysics Data System (ADS)
Wilson, Barry T.; Knight, Joseph F.; McRoberts, Ronald E.
2018-03-01
Imagery from the Landsat Program has been used frequently as a source of auxiliary data for modeling land cover, as well as a variety of attributes associated with tree cover. With ready access to all scenes in the archive since 2008 due to the USGS Landsat Data Policy, new approaches to deriving such auxiliary data from dense Landsat time series are required. Several methods have previously been developed for use with finer temporal resolution imagery (e.g. AVHRR and MODIS), including image compositing and harmonic regression using Fourier series. The manuscript presents a study, using Minnesota, USA during the years 2009-2013 as the study area and timeframe. The study examined the relative predictive power of land cover models, in particular those related to tree cover, using predictor variables based solely on composite imagery versus those using estimated harmonic regression coefficients. The study used two common non-parametric modeling approaches (i.e. k-nearest neighbors and random forests) for fitting classification and regression models of multiple attributes measured on USFS Forest Inventory and Analysis plots using all available Landsat imagery for the study area and timeframe. The estimated Fourier coefficients developed by harmonic regression of tasseled cap transformation time series data were shown to be correlated with land cover, including tree cover. Regression models using estimated Fourier coefficients as predictor variables showed a two- to threefold increase in explained variance for a small set of continuous response variables, relative to comparable models using monthly image composites. Similarly, the overall accuracies of classification models using the estimated Fourier coefficients were approximately 10-20 percentage points higher than the models using the image composites, with corresponding individual class accuracies between six and 45 percentage points higher.
A duplicate gene rooting of seed plants and the phylogenetic position of flowering plants
Mathews, Sarah; Clements, Mark D.; Beilstein, Mark A.
2010-01-01
Flowering plants represent the most significant branch in the tree of land plants, with respect to the number of extant species, their impact on the shaping of modern ecosystems and their economic importance. However, unlike so many persistent phylogenetic problems that have yielded to insights from DNA sequence data, the mystery surrounding the origin of angiosperms has deepened with the advent and advance of molecular systematics. Strong statistical support for competing hypotheses and recent novel trees from molecular data suggest that the accuracy of current molecular trees requires further testing. Analyses of phytochrome amino acids using a duplicate gene-rooting approach yield trees that unite cycads and angiosperms in a clade that is sister to a clade in which Gingko and Cupressophyta are successive sister taxa to gnetophytes plus Pinaceae. Application of a cycads + angiosperms backbone constraint in analyses of a morphological dataset yields better resolved trees than do analyses in which extant gymnosperms are forced to be monophyletic. The results have implications both for our assessment of uncertainty in trees from sequence data and for our use of molecular constraints as a way to integrate insights from morphological and molecular evidence. PMID:20047866
SUNPLIN: Simulation with Uncertainty for Phylogenetic Investigations
2013-01-01
Background Phylogenetic comparative analyses usually rely on a single consensus phylogenetic tree in order to study evolutionary processes. However, most phylogenetic trees are incomplete with regard to species sampling, which may critically compromise analyses. Some approaches have been proposed to integrate non-molecular phylogenetic information into incomplete molecular phylogenies. An expanded tree approach consists of adding missing species to random locations within their clade. The information contained in the topology of the resulting expanded trees can be captured by the pairwise phylogenetic distance between species and stored in a matrix for further statistical analysis. Thus, the random expansion and processing of multiple phylogenetic trees can be used to estimate the phylogenetic uncertainty through a simulation procedure. Because of the computational burden required, unless this procedure is efficiently implemented, the analyses are of limited applicability. Results In this paper, we present efficient algorithms and implementations for randomly expanding and processing phylogenetic trees so that simulations involved in comparative phylogenetic analysis with uncertainty can be conducted in a reasonable time. We propose algorithms for both randomly expanding trees and calculating distance matrices. We made available the source code, which was written in the C++ language. The code may be used as a standalone program or as a shared object in the R system. The software can also be used as a web service through the link: http://purl.oclc.org/NET/sunplin/. Conclusion We compare our implementations to similar solutions and show that significant performance gains can be obtained. Our results open up the possibility of accounting for phylogenetic uncertainty in evolutionary and ecological analyses of large datasets. PMID:24229408
SUNPLIN: simulation with uncertainty for phylogenetic investigations.
Martins, Wellington S; Carmo, Welton C; Longo, Humberto J; Rosa, Thierson C; Rangel, Thiago F
2013-11-15
Phylogenetic comparative analyses usually rely on a single consensus phylogenetic tree in order to study evolutionary processes. However, most phylogenetic trees are incomplete with regard to species sampling, which may critically compromise analyses. Some approaches have been proposed to integrate non-molecular phylogenetic information into incomplete molecular phylogenies. An expanded tree approach consists of adding missing species to random locations within their clade. The information contained in the topology of the resulting expanded trees can be captured by the pairwise phylogenetic distance between species and stored in a matrix for further statistical analysis. Thus, the random expansion and processing of multiple phylogenetic trees can be used to estimate the phylogenetic uncertainty through a simulation procedure. Because of the computational burden required, unless this procedure is efficiently implemented, the analyses are of limited applicability. In this paper, we present efficient algorithms and implementations for randomly expanding and processing phylogenetic trees so that simulations involved in comparative phylogenetic analysis with uncertainty can be conducted in a reasonable time. We propose algorithms for both randomly expanding trees and calculating distance matrices. We made available the source code, which was written in the C++ language. The code may be used as a standalone program or as a shared object in the R system. The software can also be used as a web service through the link: http://purl.oclc.org/NET/sunplin/. We compare our implementations to similar solutions and show that significant performance gains can be obtained. Our results open up the possibility of accounting for phylogenetic uncertainty in evolutionary and ecological analyses of large datasets.
Observation of eight ancient olive trees (Olea europaea L.) growing in the Garden of Gethsemane.
Petruccelli, Raffaella; Giordano, Cristiana; Salvatici, Maria Cristina; Capozzoli, Laura; Ciaccheri, Leonardo; Pazzini, Massimo; Lain, Orietta; Testolin, Raffaele; Cimato, Antonio
2014-05-01
For thousands of years, olive trees (Olea europaea L.) have been a significant presence and a symbol in the Garden of Gethsemane, a place located at the foot of the Mount of Olives, Jerusalem, remembered for the agony of Jesus Christ before his arrest. This investigation comprises the first morphological and genetic characterization of eight olive trees in the Garden of Gethsemane. Pomological traits, morphometric, and ultrastructural observations as well as SSR (Simple Sequence Repeat) analysis were performed to identify the olive trees. Statistical analyses were conducted to evaluate their morphological variability. The study revealed a low morphological variability and minimal dissimilarity among the olive trees. According to molecular analysis, these trees showed the same allelic profile at all microsatellite loci analyzed. Combining the results of the different analyses carried out in the frame of the present work, we could conclude that the eight olive trees of the Gethsemane Garden have been propagated from a single genotype. Copyright © 2014. Published by Elsevier SAS.
Treetrimmer: a method for phylogenetic dataset size reduction.
Maruyama, Shinichiro; Eveleigh, Robert J M; Archibald, John M
2013-04-12
With rapid advances in genome sequencing and bioinformatics, it is now possible to generate phylogenetic trees containing thousands of operational taxonomic units (OTUs) from a wide range of organisms. However, use of rigorous tree-building methods on such large datasets is prohibitive and manual 'pruning' of sequence alignments is time consuming and raises concerns over reproducibility. There is a need for bioinformatic tools with which to objectively carry out such pruning procedures. Here we present 'TreeTrimmer', a bioinformatics procedure that removes unnecessary redundancy in large phylogenetic datasets, alleviating the size effect on more rigorous downstream analyses. The method identifies and removes user-defined 'redundant' sequences, e.g., orthologous sequences from closely related organisms and 'recently' evolved lineage-specific paralogs. Representative OTUs are retained for more rigorous re-analysis. TreeTrimmer reduces the OTU density of phylogenetic trees without sacrificing taxonomic diversity while retaining the original tree topology, thereby speeding up downstream computer-intensive analyses, e.g., Bayesian and maximum likelihood tree reconstructions, in a reproducible fashion.
Baldwin, Elizabeth; Plotto, Anne; Bai, Jinhe; Manthey, John; Zhao, Wei; Raithore, Smita; Irey, Mike
2018-03-21
Orange trees affected by huanglongbing (HLB) exhibit excessive fruit drop, and fruit loosely attached to the tree may have inferior flavor. Fruit were collected from healthy and HLB-infected ( Candidatus liberibacter asiaticus) 'Hamlin' and 'Valencia' trees. Prior to harvest, the trees were shaken, fruit that dropped collected, tree-retained fruit harvested, and all fruit juiced. For chemical analyses, sugars and acids were generally lowest in HLB dropped (HLB-D) fruit juice compared to nonshaken healthy (H), healthy retained (H-R), and healthy dropped fruit (H-D) in early season (December) but not for the late season (January) 'Hamlin' or 'Valencia' except for sugar/acid ratio. The bitter limonoids, many flavonoids, and terpenoid volatiles were generally higher in HLB juice, especially HLB-D juice, compared to the other samples. The lower sugars, higher bitter limonoids, flavonoids, and terpenoid volatiles in HLB-D fruit, loosely attached to the tree, contributed to off-flavor, as was confirmed by sensory analyses.
Nattee, Cholwich; Khamsemanan, Nirattaya; Lawtrakul, Luckhana; Toochinda, Pisanu; Hannongbua, Supa
2017-01-01
Malaria is still one of the most serious diseases in tropical regions. This is due in part to the high resistance against available drugs for the inhibition of parasites, Plasmodium, the cause of the disease. New potent compounds with high clinical utility are urgently needed. In this work, we created a novel model using a regression tree to study structure-activity relationships and predict the inhibition constant, K i of three different antimalarial analogues (Trimethoprim, Pyrimethamine, and Cycloguanil) based on their molecular descriptors. To the best of our knowledge, this work is the first attempt to study the structure-activity relationships of all three analogues combined. The most relevant descriptors and appropriate parameters of the regression tree are harvested using extremely randomized trees. These descriptors are water accessible surface area, Log of the aqueous solubility, total hydrophobic van der Waals surface area, and molecular refractivity. Out of all possible combinations of these selected parameters and descriptors, the tree with the strongest coefficient of determination is selected to be our prediction model. Predicted K i values from the proposed model show a strong coefficient of determination, R 2 =0.996, to experimental K i values. From the structure of the regression tree, compounds with high accessible surface area of all hydrophobic atoms (ASA_H) and low aqueous solubility of inhibitors (Log S) generally possess low K i values. Our prediction model can also be utilized as a screening test for new antimalarial drug compounds which may reduce the time and expenses for new drug development. New compounds with high predicted K i should be excluded from further drug development. It is also our inference that a threshold of ASA_H greater than 575.80 and Log S less than or equal to -4.36 is a sufficient condition for a new compound to possess a low K i . Copyright © 2016 Elsevier Inc. All rights reserved.
Applying isozyme analyses in tree-breeding programs
W. T. Adams
1981-01-01
Four examples illustrate the potential for practical use of isozyme analyses in applied breeding programs. These include identifying parent trees and clones, seed sources, and parentage of controlled crosses, and evaluating the effectiveness of different procedures involving open-pollination to produce seed of specific crosses. The improved ability to assess the true...
Xiaoping Zhou; Miles A. Hemstrom
2009-01-01
Live tree biomass estimates are essential for carbon accounting, bioenergy feasibility studies, and other analyses. Several models are currently used for estimating tree biomass. Each of these incorporates different calculation methods that may significantly impact the estimates of total aboveground tree biomass, merchantable biomass, and carbon pools. Consequently,...
Comparison of Sub-Pixel Classification Approaches for Crop-Specific Mapping
This paper examined two non-linear models, Multilayer Perceptron (MLP) regression and Regression Tree (RT), for estimating sub-pixel crop proportions using time-series MODIS-NDVI data. The sub-pixel proportions were estimated for three major crop types including corn, soybean, a...
"Mad or bad?": burden on caregivers of patients with personality disorders.
Bauer, Rita; Döring, Antje; Schmidt, Tanja; Spießl, Hermann
2012-12-01
The burden on caregivers of patients with personality disorders is often greatly underestimated or completely disregarded. Possibilities for caregiver support have rarely been assessed. Thirty interviews were conducted with caregivers of such patients to assess illness-related burden. Responses were analyzed with a mixed method of qualitative and quantitative analysis in a sequential design. Patient and caregiver data, including sociodemographic and disease-related variables, were evaluated with regression analysis and regression trees. Caregiver statements (n = 404) were summarized into 44 global statements. The most frequent global statements were worries about the burden on other family members (70.0%), poor cooperation with clinical centers and other institutions (60.0%), financial burden (56.7%), worry about the patient's future (53.3%), and dissatisfaction with the patient's treatment and rehabilitation (53.3%). Linear regression and regression tree analysis identified predictors for more burdened caregivers. Caregivers of patients with personality disorders experience a variety of burdens, some disorder specific. Yet these caregivers often receive little attention or support.
NASA Astrophysics Data System (ADS)
Niemeijer, Meindert; Dumitrescu, Alina V.; van Ginneken, Bram; Abrámoff, Michael D.
2011-03-01
Parameters extracted from the vasculature on the retina are correlated with various conditions such as diabetic retinopathy and cardiovascular diseases such as stroke. Segmentation of the vasculature on the retina has been a topic that has received much attention in the literature over the past decade. Analysis of the segmentation result, however, has only received limited attention with most works describing methods to accurately measure the width of the vessels. Analyzing the connectedness of the vascular network is an important step towards the characterization of the complete vascular tree. The retinal vascular tree, from an image interpretation point of view, originates at the optic disc and spreads out over the retina. The tree bifurcates and the vessels also cross each other. The points where this happens form the key to determining the connectedness of the complete tree. We present a supervised method to detect the bifurcations and crossing points of the vasculature of the retina. The method uses features extracted from the vasculature as well as the image in a location regression approach to find those locations of the segmented vascular tree where the bifurcation or crossing occurs (from here, POI, points of interest). We evaluate the method on the publicly available DRIVE database in which an ophthalmologist has marked the POI.
Triplet supertree heuristics for the tree of life
Lin, Harris T; Burleigh, J Gordon; Eulenstein, Oliver
2009-01-01
Background There is much interest in developing fast and accurate supertree methods to infer the tree of life. Supertree methods combine smaller input trees with overlapping sets of taxa to make a comprehensive phylogenetic tree that contains all of the taxa in the input trees. The intrinsically hard triplet supertree problem takes a collection of input species trees and seeks a species tree (supertree) that maximizes the number of triplet subtrees that it shares with the input trees. However, the utility of this supertree problem has been limited by a lack of efficient and effective heuristics. Results We introduce fast hill-climbing heuristics for the triplet supertree problem that perform a step-wise search of the tree space, where each step is guided by an exact solution to an instance of a local search problem. To realize time efficient heuristics we designed the first nontrivial algorithms for two standard search problems, which greatly improve on the time complexity to the best known (naïve) solutions by a factor of n and n2 (the number of taxa in the supertree). These algorithms enable large-scale supertree analyses based on the triplet supertree problem that were previously not possible. We implemented hill-climbing heuristics that are based on our new algorithms, and in analyses of two published supertree data sets, we demonstrate that our new heuristics outperform other standard supertree methods in maximizing the number of triplets shared with the input trees. Conclusion With our new heuristics, the triplet supertree problem is now computationally more tractable for large-scale supertree analyses, and it provides a potentially more accurate alternative to existing supertree methods. PMID:19208181
Universal artifacts affect the branching of phylogenetic trees, not universal scaling laws.
Altaba, Cristian R
2009-01-01
The superficial resemblance of phylogenetic trees to other branching structures allows searching for macroevolutionary patterns. However, such trees are just statistical inferences of particular historical events. Recent meta-analyses report finding regularities in the branching pattern of phylogenetic trees. But is this supported by evidence, or are such regularities just methodological artifacts? If so, is there any signal in a phylogeny? In order to evaluate the impact of polytomies and imbalance on tree shape, the distribution of all binary and polytomic trees of up to 7 taxa was assessed in tree-shape space. The relationship between the proportion of outgroups and the amount of imbalance introduced with them was assessed applying four different tree-building methods to 100 combinations from a set of 10 ingroup and 9 outgroup species, and performing covariance analyses. The relevance of this analysis was explored taking 61 published phylogenies, based on nucleic acid sequences and involving various taxa, taxonomic levels, and tree-building methods. All methods of phylogenetic inference are quite sensitive to the artifacts introduced by outgroups. However, published phylogenies appear to be subject to a rather effective, albeit rather intuitive control against such artifacts. The data and methods used to build phylogenetic trees are varied, so any meta-analysis is subject to pitfalls due to their uneven intrinsic merits, which translate into artifacts in tree shape. The binary branching pattern is an imposition of methods, and seldom reflects true relationships in intraspecific analyses, yielding artifactual polytomies in short trees. Above the species level, the departure of real trees from simplistic random models is caused at least by two natural factors--uneven speciation and extinction rates; and artifacts such as choice of taxa included in the analysis, and imbalance introduced by outgroups and basal paraphyletic taxa. This artifactual imbalance accounts for tree shape convergence of large trees. There is no evidence for any universal scaling in the tree of life. Instead, there is a need for improved methods of tree analysis that can be used to discriminate the noise due to outgroups from the phylogenetic signal within the taxon of interest, and to evaluate realistic models of evolution, correcting the retrospective perspective and explicitly recognizing extinction as a driving force. Artifacts are pervasive, and can only be overcome through understanding the structure and biological meaning of phylogenetic trees. Catalan Abstract in Translation S1.
Tamibmaniam, Jayashamani; Hussin, Narwani; Cheah, Wee Kooi; Ng, Kee Sing; Muninathan, Prema
2016-01-01
WHO's new classification in 2009: dengue with or without warning signs and severe dengue, has necessitated large numbers of admissions to hospitals of dengue patients which in turn has been imposing a huge economical and physical burden on many hospitals around the globe, particularly South East Asia and Malaysia where the disease has seen a rapid surge in numbers in recent years. Lack of a simple tool to differentiate mild from life threatening infection has led to unnecessary hospitalization of dengue patients. We conducted a single-centre, retrospective study involving serologically confirmed dengue fever patients, admitted in a single ward, in Hospital Kuala Lumpur, Malaysia. Data was collected for 4 months from February to May 2014. Socio demography, co-morbidity, days of illness before admission, symptoms, warning signs, vital signs and laboratory result were all recorded. Descriptive statistics was tabulated and simple and multiple logistic regression analysis was done to determine significant risk factors associated with severe dengue. 657 patients with confirmed dengue were analysed, of which 59 (9.0%) had severe dengue. Overall, the commonest warning sign were vomiting (36.1%) and abdominal pain (32.1%). Previous co-morbid, vomiting, diarrhoea, pleural effusion, low systolic blood pressure, high haematocrit, low albumin and high urea were found as significant risk factors for severe dengue using simple logistic regression. However the significant risk factors for severe dengue with multiple logistic regressions were only vomiting, pleural effusion, and low systolic blood pressure. Using those 3 risk factors, we plotted an algorithm for predicting severe dengue. When compared to the classification of severe dengue based on the WHO criteria, the decision tree algorithm had a sensitivity of 0.81, specificity of 0.54, positive predictive value of 0.16 and negative predictive of 0.96. The decision tree algorithm proposed in this study showed high sensitivity and NPV in predicting patients with severe dengue that may warrant admission. This tool upon further validation study can be used to help clinicians decide on further managing a patient upon first encounter. It also will have a substantial impact on health resources as low risk patients can be managed as outpatients hence reserving the scarce hospital beds and medical resources for other patients in need.
Sensitivity of bud burst in key tree species in the UK to recent climate variability and change
NASA Astrophysics Data System (ADS)
Abernethy, Rachel; Cook, Sally; Hemming, Deborah; McCarthy, Mark
2017-04-01
Analysing the relationship between the changing climate of the UK and the spatial and temporal distribution of spring bud burst plays an important role in understanding ecosystem functionality and predicting future phenological trends. The location and timing of bud burst of eleven species of trees alongside climatic factors such as, temperature, precipitation and hours of sunshine (photoperiod) were used to investigate: i. which species' bud burst timing experiences the greatest impact from a changing climate, ii. which climatic factor has the greatest influence on the timing of bud burst, and iii. whether the location of bud burst is influenced by climate variability. Winter heatwave duration was also analysed as part of an investigation into the relationship between temperature trends of a specific winter period and the following spring events. Geographic Information Systems (GIS) and statistical analysis tools were used to visualise spatial patterns and to analyse the phenological and climate data through regression and analysis of variance (ANOVA) tests. Where there were areas that showed a strong positive or negative relationship between phenology and climate, satellite imagery was used to calculate a Normalised Difference Vegetation Index (NDVI) and a Leaf Area Index (LAI) to further investigate the relationships found. It was expected that in the north of the UK, where bud burst tends to occur later in the year than in the south, that the bud bursts would begin to occur earlier due to increasing temperatures and increased hours of sunshine. However, initial results show that for some species, the bud burst timing tends to remain or become later in the year. Interesting results will be found when investigating the statistical significance between the changing location of the bud bursts and each climatic factor.
Anderson, S.C.; Kupfer, J.A.; Wilson, R.R.; Cooper, R.J.
2000-01-01
The purpose of this research was to develop a model that could be used to provide a spatial representation of uneven-aged silvicultural treatments on forest crown area. We began by developing species-specific linear regression equations relating tree DBH to crown area for eight bottomland tree species at White River National Wildlife Refuge, Arkansas, USA. The relationships were highly significant for all species, with coefficients of determination (r(2)) ranging from 0.37 for Ulmus crassifolia to nearly 0.80 for Quercus nuttalliii and Taxodium distichum. We next located and measured the diameters of more than 4000 stumps from a single tree-group selection timber harvest. Stump locations were recorded with respect to an established gl id point system and entered into a Geographic Information System (ARC/INFO). The area occupied by the crown of each logged individual was then estimated by using the stump dimensions (adjusted to DBHs) and the regression equations relating tree DBH to crown area. Our model projected that the selection cuts removed roughly 300 m(2) of basal area from the logged sites resulting in the loss of approximate to 55 000 m(2) of crown area. The model developed in this research represents a tool that can be used in conjunction with remote sensing applications to assist in forest inventory and management, as well as to estimate the impacts of selective timber harvest on wildlife.
Scollo, Annalisa; Gottardo, Flaviana; Contiero, Barbara; Edwards, Sandra A
2017-10-01
Tail biting in pigs has been an identified behavioural, welfare and economic problem for decades, and requires appropriate but sometimes difficult on-farm interventions. The aim of the paper is to introduce the Classification and Regression Tree (CRT) methodologies to develop a tool for prevention of acute tail biting lesions in pigs on-farm. A sample of 60 commercial farms rearing heavy pigs were involved; an on-farm visit and an interview with the farmer collected data on general management, herd health, disease prevention, climate control, feeding and production traits. Results suggest a value for the CRT analysis in managing the risk factors behind tail biting on a farm-specific level, showing 86.7% sensitivity for the Classification Tree and a correlation of 0.7 between observed and predicted prevalence of tail biting obtained with the Regression Tree. CRT analysis showed five main variables (stocking density, ammonia levels, number of pigs per stockman, type of floor and timeliness in feed supply) as critical predictors of acute tail biting lesions, which demonstrate different importance in different farms subgroups. The model might have reliable and practical applications for the support and implementation of tail biting prevention interventions, especially in case of subgroups of pigs with higher risk, helping farmers and veterinarians to assess the risk in their own farm and to manage their predisposing variables in order to reduce acute tail biting lesions. Copyright © 2017 Elsevier B.V. All rights reserved.
Annual Tree Growth Predictions From Periodic Measurements
Quang V. Cao
2004-01-01
Data from annual measurements of a loblolly pine (Pinus taeda L.) plantation were available for this study. Regression techniques were employed to model annual changes of individual trees in terms of diameters, heights, and survival probabilities. Subsets of the data that include measurements every 2, 3, 4, 5, and 6 years were used to fit the same...
Understory response following varying levels of overstory removal in mixed conifer stands
Fabian C.C. Uzoh; Leroy K. Dolph; John R. Anstead
1997-01-01
Diameter growth rates of understory trees were measured for periods both before and after overstory removal on six study areas in northern California. All the species responded with increased diameter growth after adjusting to their new environments. Linear regression equations that predict post treatment diameter growth increment of the residual trees are presented...
Delayed conifer tree mortality following fire in California
Sharon M. Hood; Sheri L. Smith; Daniel R. Cluck
2007-01-01
Fire injury was characterized and survival monitored for 5,246 trees from five wildfires in California that occurred between 1999 and 2002. Logistic regression models for predicting the probability of mortality were developed for incense-cedar, Jeffrey pine, ponderosa pine, red fir and white fir. Two-year post-fire preliminary models were developed for incense-cedar,...
Estimating leaf area and leaf biomass of open-grown deciduous urban trees
David J. Nowak
1996-01-01
Logarithmic regression equations were developed to predict leaf area and leaf biomass for open-grown deciduous urban trees based on stem diameter and crown parameters. Equations based on crown parameters produced more reliable estimates. The equations can be used to help quantify forest structure and functions, particularly in urbanizing and urban/suburban areas.
Kirk M. Stueve; Dawna L. Cerney; Regina M. Rochefort; Laurie L. Kurth
2009-01-01
We performed classification analysis of 1970 satellite imagery and 2003 aerial photography to delineate establishment. Local site conditions were calculated from a LIDAR-based DEM, ancillary climate data, and 1970 tree locations in a GIS. We used logistic regression on a spatially weighted landscape matrix to rank variables.
Biomass of Yellow-Poplar in Natural Stands in Western North Carolina
Alexander Clark; James G. Schroeder
1977-01-01
Aboveground biomass was determined for yellow-poplar(Liriodendron tulipifera L.) trees 6 to 28 inches d. b. h. growingin natural, uneven-aged mountaincovestandsin western North Carolina.Specific gravity, moisture content, and green weight per cubic foot are presented for the total tree and its components. Tables developed from regression equations show weight and...
The wisdom of the commons: ensemble tree classifiers for prostate cancer prognosis
Koziol, James A.; Feng, Anne C.; Jia, Zhenyu; Wang, Yipeng; Goodison, Seven; McClelland, Michael; Mercola, Dan
2009-01-01
Motivation: Classification and regression trees have long been used for cancer diagnosis and prognosis. Nevertheless, instability and variable selection bias, as well as overfitting, are well-known problems of tree-based methods. In this article, we investigate whether ensemble tree classifiers can ameliorate these difficulties, using data from two recent studies of radical prostatectomy in prostate cancer. Results: Using time to progression following prostatectomy as the relevant clinical endpoint, we found that ensemble tree classifiers robustly and reproducibly identified three subgroups of patients in the two clinical datasets: non-progressors, early progressors and late progressors. Moreover, the consensus classifications were independent predictors of time to progression compared to known clinical prognostic factors. Contact: dmercola@uci.edu PMID:18628288
Decision tree modeling using R.
Zhang, Zhongheng
2016-08-01
In machine learning field, decision tree learner is powerful and easy to interpret. It employs recursive binary partitioning algorithm that splits the sample in partitioning variable with the strongest association with the response variable. The process continues until some stopping criteria are met. In the example I focus on conditional inference tree, which incorporates tree-structured regression models into conditional inference procedures. While growing a single tree is subject to small changes in the training data, random forests procedure is introduced to address this problem. The sources of diversity for random forests come from the random sampling and restricted set of input variables to be selected. Finally, I introduce R functions to perform model based recursive partitioning. This method incorporates recursive partitioning into conventional parametric model building.
Jose F. Negron; Willis C. Schaupp; Kenneth E. Gibson; John Anhold; Dawn Hansen; Ralph Thier; Phil Mocettini
1999-01-01
Data collected from Douglas-fir stands infected by the Douglas-fir beetle in Wyoming, Montana, Idaho, and Utah, were used to develop models to estimate amount of mortality in terms of basal area killed. Models were built using stepwise linear regression and regression tree approaches. Linear regression models using initial Douglas-fir basal area were built for all...
Federal Register 2010, 2011, 2012, 2013, 2014
2011-10-13
... species at that time. On October 28, 2008, we published a 90-day finding for the dusky tree vole in the... that live in conifer forests and spend almost all of their time in the tree canopy. Tree voles rarely... tree vole as a subspecies, the more recent research on tree vole genetics and analyses attempting to...
Thomasl L. Eberhardt; Xiaobo Li; Todd F. Shupe; Chung Y. Hse
2007-01-01
Wood, bark, and the wax-coated seeds from Chinese tallow tree (Sapium sebiferum (L.) Roxb. syn. Triadica sebifera (L.) Small), an invasive tree species in the southeastern United States, were subjected to extractions and degradative chemical analyses in an effort to better understand the mechanism(s) by which this tree species...
We analysed the oxygen isotopic values of wood (δ18Ow) of 12 ponderosa pine (Pinus ponderosa) trees from control, moderately, and heavily thinned stands and compared them with existing wood-based estimates of carbon isotope discrimination (∆13C), basal area increment (BAI), and g...
Fault tree models for fault tolerant hypercube multiprocessors
NASA Technical Reports Server (NTRS)
Boyd, Mark A.; Tuazon, Jezus O.
1991-01-01
Three candidate fault tolerant hypercube architectures are modeled, their reliability analyses are compared, and the resulting implications of these methods of incorporating fault tolerance into hypercube multiprocessors are discussed. In the course of performing the reliability analyses, the use of HARP and fault trees in modeling sequence dependent system behaviors is demonstrated.
Biomass expansion factor and root-to-shoot ratio for Pinus in Brazil.
Sanquetta, Carlos R; Corte, Ana Pd; da Silva, Fernando
2011-09-24
The Biomass Expansion Factor (BEF) and the Root-to-Shoot Ratio (R) are variables used to quantify carbon stock in forests. They are often considered as constant or species/area specific values in most studies. This study aimed at showing tree size and age dependence upon BEF and R and proposed equations to improve forest biomass and carbon stock. Data from 70 sample Pinus spp. grown in southern Brazil trees in different diameter classes and ages were used to demonstrate the correlation between BEF and R, and forest inventory data, such as DBH, tree height and age. Total dry biomass, carbon stock and CO2 equivalent were simulated using the IPCC default values of BEF and R, corresponding average calculated from data used in this study, as well as the values estimated by regression equations. The mean values of BEF and R calculated in this study were 1.47 and 0.17, respectively. The relationship between BEF and R and the tree measurement variables were inversely related with negative exponential behavior. Simulations indicated that use of fixed values of BEF and R, either IPCC default or current average data, may lead to unreliable estimates of carbon stock inventories and CDM projects. It was concluded that accounting for the variations in BEF and R and using regression equations to relate them to DBH, tree height and age, is fundamental in obtaining reliable estimates of forest tree biomass, carbon sink and CO2 equivalent.
Can dendrochronology procedures estimate historical Tree Water Footprint?
NASA Astrophysics Data System (ADS)
Fernandes, Tarcísio J. G.; Del Campo, Antonio D.; Molina, Antonio J.
2013-04-01
Whole estimates of tree water use are becoming increasingly important in forest science and forest scientists have long sought to develop reliable techniques to estimate tree water use. In this sense accurately determining or estimate the quantity of water transpired by trees and forests is important and can be used to determine "green" water footprint. The use of dendrochronology is relative common in the study of effects and interactions between growth and climatic variables, but few studies deal with the relationship with water footprint. The main objective of this study is determining the historical tree water-use in a planted stand by dendrochronological approaches. This study was performed in South-eastern Spain, in an area covered by 50-60 years old Pinus halepensis Mil. plantations with high tree density (ca.1288/ha) due to low forest management. The experimental set-up consisted of two plots (30x30m), one corresponding to a thinning treatment performed in 2008 (t10) and the other thinned in 1998 (t1) to assess the mid-term effects of thinning. After one year of thinning four representative trees were select in each plot to measure transpiration by heat pulse sensor (sapflow velocity, vs). The accumulated daily values of transpiration (L day-1) were estimated multiplying the values of vs by sapwood area of each selected tree. After transpiration measurements two cores per tree were taken for establishing the tree-rings chronologies. The cores were prepared, their ring-width were measured and standardised in basal area increment index (BAI-i) following usual dendrochronological methods. The dendrochronology analyses showed a general variability in ring width during the initial growth (15 years), while in the following years the width rings were very small, conditioned by climate. The year after thinning (1999 or 2009) all trees in the treatments showed significant increases in ring width. The average vs for t1 and t10 were 3.59 cm h-1 and 1.95 cm h-1, and transformed into tree transpiration using sapwood area, obtaining 6,768 and 5,844 litres per tree, respectively. BAI-i and vs were significantly related. The Pearson correlation was higher and positive when the growth from the rings formed during the span of sap flow measurement was considered, i.e., the 2009 and 2010 rings. An empirical model was fitted for the BAI-i and vs allowing a preliminary reconstruction of the stand's transpiration history. Linear regressions between vs and BAI-i were significant (R2 ≈ 0.65). Applying the linear equation in each BAI-i along the time (1960-2010) it was possible to reconstruct water use per tree, sometimes defined as the "green" water footprint. In conclusion dendrochronology methods can be used to estimate the Tree-Water-Footprint, and more experimental data should be used for better accuracy.
Fire frequency in the Interior Columbia River Basin: Building regional models from fire history data
McKenzie, D.; Peterson, D.L.; Agee, James K.
2000-01-01
Fire frequency affects vegetation composition and successional pathways; thus it is essential to understand fire regimes in order to manage natural resources at broad spatial scales. Fire history data are lacking for many regions for which fire management decisions are being made, so models are needed to estimate past fire frequency where local data are not yet available. We developed multiple regression models and tree-based (classification and regression tree, or CART) models to predict fire return intervals across the interior Columbia River basin at 1-km resolution, using georeferenced fire history, potential vegetation, cover type, and precipitation databases. The models combined semiqualitative methods and rigorous statistics. The fire history data are of uneven quality; some estimates are based on only one tree, and many are not cross-dated. Therefore, we weighted the models based on data quality and performed a sensitivity analysis of the effects on the models of estimation errors that are due to lack of cross-dating. The regression models predict fire return intervals from 1 to 375 yr for forested areas, whereas the tree-based models predict a range of 8 to 150 yr. Both types of models predict latitudinal and elevational gradients of increasing fire return intervals. Examination of regional-scale output suggests that, although the tree-based models explain more of the variation in the original data, the regression models are less likely to produce extrapolation errors. Thus, the models serve complementary purposes in elucidating the relationships among fire frequency, the predictor variables, and spatial scale. The models can provide local managers with quantitative information and provide data to initialize coarse-scale fire-effects models, although predictions for individual sites should be treated with caution because of the varying quality and uneven spatial coverage of the fire history database. The models also demonstrate the integration of qualitative and quantitative methods when requisite data for fully quantitative models are unavailable. They can be tested by comparing new, independent fire history reconstructions against their predictions and can be continually updated, as better fire history data become available.
National scale biomass estimators for United States tree species
Jennifer C. Jenkins; David C. Chojnacky; Linda S. Heath; Richard A. Birdsey
2003-01-01
Estimates of national-scale forest carbon (C) stocks and fluxes are typically based on allometric regression equations developed using dimensional analysis techniques. However, the literature is inconsistent and incomplete with respect to large-scale forest C estimation. We compiled all available diameter-based allometric regression equations for estimating total...
Mack, Jeremy S.; Berry, Kristin H.; Miller, David; Carlson, Andrea S.
2015-01-01
Agassiz's Desert Tortoises (Gopherus agassizii) spend >95% of their lives underground in cover sites that serve as thermal buffers from temperatures, which can fluctuate >40°C on a daily and seasonal basis. We monitored temperatures at 30 active tortoise cover sites within the Soda Mountains, San Bernardino County, California, from February 2004 to September 2006. Cover sites varied in type and structural characteristics, including opening height and width, soil cover depth over the opening, aspect, tunnel length, and surficial geology. We focused our analyses on periods of extreme temperature: in summer, between July 1 and September 1, and winter, between November 1 and February 15. With the use of multivariate regression tree analyses, we found cover-site temperatures were influenced largely by tunnel length and subsequently opening width and soil cover. Linear regression models further showed that increasing tunnel length increased temperature stability and dampened seasonal temperature extremes. Climate change models predict increased warming for southwestern North America. Cover sites that buffer temperature extremes and fluctuations will become increasingly important for survival of tortoises. In planning future translocation projects and conservation efforts, decision makers should consider habitats with terrain and underlying substrate that sustain cover sites with long tunnels and expanded openings for tortoises living under temperature extremes similar to those described here or as projected in the future.
Lubelchek, Ronald J.; Hoehnen, Sarah C.; Hotton, Anna L.; Kincaid, Stacey L.; Barker, David E.; French, Audrey L.
2014-01-01
Introduction HIV transmission cluster analyses can inform HIV prevention efforts. We describe the first such assessment for transmission clustering among HIV patients in Chicago. Methods We performed transmission cluster analyses using HIV pol sequences from newly diagnosed patients presenting to Chicago’s largest HIV clinic between 2008 and 2011. We compared sequences via progressive pairwise alignment, using neighbor joining to construct an un-rooted phylogenetic tree. We defined clusters as >2 sequences among which each sequence had at least one partner within a genetic distance of ≤ 1.5%. We used multivariable regression to examine factors associated with clustering and used geospatial analysis to assess geographic proximity of phylogenetically clustered patients. Results We compared sequences from 920 patients; median age 35 years; 75% male; 67% Black, 23% Hispanic; 8% had a Rapid Plasma Reagin (RPR) titer ≥ 1:16 concurrent with their HIV diagnosis. We had HIV transmission risk data for 54%; 43% identified as men who have sex with men (MSM). Phylogenetic analysis demonstrated 123 patients (13%) grouped into 26 clusters, the largest having 20 members. In multivariable regression, age < 25, Black race, MSM status, male gender, higher HIV viral load, and RPR ≥ 1:16 associated with clustering. We did not observe geographic grouping of genetically clustered patients. Discussion Our results demonstrate high rates of HIV transmission clustering, without local geographic foci, among young Black MSM in Chicago. Applied prospectively, phylogenetic analyses could guide prevention efforts and help break the cycle of transmission. PMID:25321182
Pyron, R Alexander; Hendry, Catriona R; Chou, Vincent M; Lemmon, Emily M; Lemmon, Alan R; Burbrink, Frank T
2014-12-01
Next-generation genomic sequencing promises to quickly and cheaply resolve remaining contentious nodes in the Tree of Life, and facilitates species-tree estimation while taking into account stochastic genealogical discordance among loci. Recent methods for estimating species trees bypass full likelihood-based estimates of the multi-species coalescent, and approximate the true species-tree using simpler summary metrics. These methods converge on the true species-tree with sufficient genomic sampling, even in the anomaly zone. However, no studies have yet evaluated their efficacy on a large-scale phylogenomic dataset, and compared them to previous concatenation strategies. Here, we generate such a dataset for Caenophidian snakes, a group with >2500 species that contains several rapid radiations that were poorly resolved with fewer loci. We generate sequence data for 333 single-copy nuclear loci with ∼100% coverage (∼0% missing data) for 31 major lineages. We estimate phylogenies using neighbor joining, maximum parsimony, maximum likelihood, and three summary species-tree approaches (NJst, STAR, and MP-EST). All methods yield similar resolution and support for most nodes. However, not all methods support monophyly of Caenophidia, with Acrochordidae placed as the sister taxon to Pythonidae in some analyses. Thus, phylogenomic species-tree estimation may occasionally disagree with well-supported relationships from concatenated analyses of small numbers of nuclear or mitochondrial genes, a consideration for future studies. In contrast for at least two diverse, rapid radiations (Lamprophiidae and Colubridae), phylogenomic data and species-tree inference do little to improve resolution and support. Thus, certain nodes may lack strong signal, and larger datasets and more sophisticated analyses may still fail to resolve them. Copyright © 2014 Elsevier Inc. All rights reserved.
Du, Ning; Fan, Jintu; Chen, Shuo; Liu, Yang
2008-07-21
Although recent investigations [Ryan, M.G., Yoder, B.J., 1997. Hydraulic limits to tree height and tree growth. Bioscience 47, 235-242; Koch, G.W., Sillett, S.C.,Jennings, G.M.,Davis, S.D., 2004. The limits to tree height. Nature 428, 851-854; Niklas, K.J., Spatz, H., 2004. Growth and hydraulic (not mechanical) constraints govern the scaling of tree height and mass. Proc. Natl Acad. Sci. 101, 15661-15663; Ryan, M.G., Phillips, N., Bond, B.J., 2006. Hydraulic limitation hypothesis revisited. Plant Cell Environ. 29, 367-381; Niklas, K.J., 2007. Maximum plant height and the biophysical factors that limit it. Tree Physiol. 27, 433-440; Burgess, S.S.O., Dawson, T.E., 2007. Predicting the limits to tree height using statistical regressions of leaf traits. New Phytol. 174, 626-636] suggested that the hydraulic limitation hypothesis (HLH) is the most plausible theory to explain the biophysical limits to maximum tree height and the decline in tree growth rate with age, the analysis is largely qualitative or based on statistical regression. Here we present an integrated biophysical model based on the principle that trees develop physiological compensations (e.g. the declined leaf water potential and the tapering of conduits with heights [West, G.B., Brown, J.H., Enquist, B.J., 1999. A general model for the structure and allometry of plant vascular systems. Nature 400, 664-667]) to resist the increasing water stress with height, the classical HLH and the biochemical limitations on photosynthesis [von Caemmerer, S., 2000. Biochemical Models of Leaf Photosynthesis. CSIRO Publishing, Australia]. The model has been applied to the tallest trees in the world (viz. Coast redwood (Sequoia sempervirens)). Xylem water potential, leaf carbon isotope composition, leaf mass to area ratio at different heights derived from the model show good agreements with the experimental measurements of Koch et al. [2004. The limits to tree height. Nature 428, 851-854]. The model also well explains the universal trend of declining growth rate with age.
Estimating tree species diversity in the savannah using NDVI and woody canopy cover
NASA Astrophysics Data System (ADS)
Madonsela, Sabelo; Cho, Moses Azong; Ramoelo, Abel; Mutanga, Onisimo; Naidoo, Laven
2018-04-01
Remote sensing applications in biodiversity research often rely on the establishment of relationships between spectral information from the image and tree species diversity measured in the field. Most studies have used normalized difference vegetation index (NDVI) to estimate tree species diversity on the basis that it is sensitive to primary productivity which defines spatial variation in plant diversity. The NDVI signal is influenced by photosynthetically active vegetation which, in the savannah, includes woody canopy foliage and grasses. The question is whether the relationship between NDVI and tree species diversity in the savanna depends on the woody cover percentage. This study explored the relationship between woody canopy cover (WCC) and tree species diversity in the savannah woodland of southern Africa and also investigated whether there is a significant interaction between seasonal NDVI and WCC in the factorial model when estimating tree species diversity. To fulfil our aim, we followed stratified random sampling approach and surveyed tree species in 68 plots of 90 m × 90 m across the study area. Within each plot, all trees with diameter at breast height of >10 cm were sampled and Shannon index - a common measure of species diversity which considers both species richness and abundance - was used to quantify tree species diversity. We then extracted WCC in each plot from existing fractional woody cover product produced from Synthetic Aperture Radar (SAR) data. Factorial regression model was used to determine the interaction effect between NDVI and WCC when estimating tree species diversity. Results from regression analysis showed that (i) WCC has a highly significant relationship with tree species diversity (r2 = 0.21; p < 0.01), (ii) the interaction between the NDVI and WCC is not significant, however, the factorial model significantly reduced the error of prediction (RMSE = 0.47, p < 0.05) compared to NDVI (RMSE = 0.49) or WCC (RMSE = 0.49) model during the senescence period. The result justifies our assertion that combining NDVI with WCC will be optimal for biodiversity estimation during the senescence period.
Analyzing and synthesizing phylogenies using tree alignment graphs.
Smith, Stephen A; Brown, Joseph W; Hinchliff, Cody E
2013-01-01
Phylogenetic trees are used to analyze and visualize evolution. However, trees can be imperfect datatypes when summarizing multiple trees. This is especially problematic when accommodating for biological phenomena such as horizontal gene transfer, incomplete lineage sorting, and hybridization, as well as topological conflict between datasets. Additionally, researchers may want to combine information from sets of trees that have partially overlapping taxon sets. To address the problem of analyzing sets of trees with conflicting relationships and partially overlapping taxon sets, we introduce methods for aligning, synthesizing and analyzing rooted phylogenetic trees within a graph, called a tree alignment graph (TAG). The TAG can be queried and analyzed to explore uncertainty and conflict. It can also be synthesized to construct trees, presenting an alternative to supertrees approaches. We demonstrate these methods with two empirical datasets. In order to explore uncertainty, we constructed a TAG of the bootstrap trees from the Angiosperm Tree of Life project. Analysis of the resulting graph demonstrates that areas of the dataset that are unresolved in majority-rule consensus tree analyses can be understood in more detail within the context of a graph structure, using measures incorporating node degree and adjacency support. As an exercise in synthesis (i.e., summarization of a TAG constructed from the alignment trees), we also construct a TAG consisting of the taxonomy and source trees from a recent comprehensive bird study. We synthesized this graph into a tree that can be reconstructed in a repeatable fashion and where the underlying source information can be updated. The methods presented here are tractable for large scale analyses and serve as a basis for an alternative to consensus tree and supertree methods. Furthermore, the exploration of these graphs can expose structures and patterns within the dataset that are otherwise difficult to observe.
Analyzing and Synthesizing Phylogenies Using Tree Alignment Graphs
Smith, Stephen A.; Brown, Joseph W.; Hinchliff, Cody E.
2013-01-01
Phylogenetic trees are used to analyze and visualize evolution. However, trees can be imperfect datatypes when summarizing multiple trees. This is especially problematic when accommodating for biological phenomena such as horizontal gene transfer, incomplete lineage sorting, and hybridization, as well as topological conflict between datasets. Additionally, researchers may want to combine information from sets of trees that have partially overlapping taxon sets. To address the problem of analyzing sets of trees with conflicting relationships and partially overlapping taxon sets, we introduce methods for aligning, synthesizing and analyzing rooted phylogenetic trees within a graph, called a tree alignment graph (TAG). The TAG can be queried and analyzed to explore uncertainty and conflict. It can also be synthesized to construct trees, presenting an alternative to supertrees approaches. We demonstrate these methods with two empirical datasets. In order to explore uncertainty, we constructed a TAG of the bootstrap trees from the Angiosperm Tree of Life project. Analysis of the resulting graph demonstrates that areas of the dataset that are unresolved in majority-rule consensus tree analyses can be understood in more detail within the context of a graph structure, using measures incorporating node degree and adjacency support. As an exercise in synthesis (i.e., summarization of a TAG constructed from the alignment trees), we also construct a TAG consisting of the taxonomy and source trees from a recent comprehensive bird study. We synthesized this graph into a tree that can be reconstructed in a repeatable fashion and where the underlying source information can be updated. The methods presented here are tractable for large scale analyses and serve as a basis for an alternative to consensus tree and supertree methods. Furthermore, the exploration of these graphs can expose structures and patterns within the dataset that are otherwise difficult to observe. PMID:24086118
Karin L. Riley; Isaac C. Grenfell; Mark A. Finney
2015-01-01
Mapping the number, size, and species of trees in forests across the western United States has utility for a number of research endeavors, ranging from estimation of terrestrial carbon resources to tree mortality following wildfires. For landscape fire and forest simulations that use the Forest Vegetation Simulator (FVS), a tree-level dataset, or âtree listâ, is a...
Habitat edge, land management, and rates of brood parasitism in tallgrass prairie.
Patten, Michael A; Shochat, Eyal; Reinking, Dan L; Wolfe, Donald H; Sherrod, Steve K
2006-04-01
Bird populations in North America's grasslands have declined sharply in recent decades. These declines are traceable, in large part, to habitat loss, but management of tallgrass prairie also has an impact. An indirect source of decline potentially associated with management is brood parasitism by the Brown-headed Cowbird (Molothrus ater), which has had substantial negative impacts on many passerine hosts. Using a novel application of regression trees, we analyzed an extensive five-year set of nest data to test how management of tallgrass prairie affected rates of brood parasitism. We examined seven landscape features that may have been associated with parasitism: presence of edge, burning, or grazing, and distance of the nest from woody vegetation, water, roads, or fences. All five grassland passerines that we included in the analyses exhibited evidence of an edge effect: the Grasshopper Sparrow (Ammodramus savannarum), Henslow's Sparrow (A. henslowii), Dickcissel (Spiza americana), Red-winged Blackbird (Agelaius phoeniceus), and Eastern Meadowlark (Sturnella magna). The edge was represented by narrow strips of woody vegetation occurring along roadsides cut through tallgrass prairie. The sparrows avoided nesting along these woody edges, whereas the other three species experienced significantly higher (1.9-5.3x) rates of parasitism along edges than in prairie. The edge effect could be related directly to increase in parasitism rate with decreased distance from woody vegetation. After accounting for edge effect in these three species, we found evidence for significantly higher (2.5-10.5x) rates of parasitism in grazed plots, particularly those burned in spring to increase forage, than in undisturbed prairie. Regression tree analysis proved to be an important tool for hierarchically parsing various landscape features that affect parasitism rates. We conclude that, on the Great Plains, rates of brood parasitism are strongly associated with relatively recent road cuts, in that edge effects manifest themselves through the presence of trees, a novel habitat component in much of the tallgrass prairie. Grazing is also a key associate of increased parasitism. Areas managed with prescribed fire, used frequently to increase forage for grazing cattle, may experience higher rates of brood parasitism. Regardless, removing trees and shrubs along roadsides and refraining from planting them along new roads may benefit grassland birds.
NASA Astrophysics Data System (ADS)
Schwan, M. R.; Herrick, C.; Hobbie, E. A.; Chen, J.; Varner, R. K.; Palace, M. W.; Marek, E.; Kashi, N. N.; Smith, S. L.
2015-12-01
Rapid warming in arctic and sub-arctic environments shifts plant community structure which in turn can alter carbon cycling by releasing large stocks of carbon sequestered in arctic soils. Much work has been done in sub-arctic peatlands to understand how shifts in dominant vegetation cover can ultimately affect global carbon balances, but less focus has been given to upland environments where similar changes are occurring. Recent circumpolar expansion of deciduous shrubs and trees in sub-arctic upland environments may alter carbon cycling due to shrubs and trees sequestering less C in soils than the heath plants they typically replace. In this study we explored the relationship between nutrient and carbon cycling and above-ground vegetation on six transects which traverse an ecotone gradient from heath tundra (dominated by ericoid mycorrhizal plants) through deciduous shrubs to deciduous trees (dominated by ectomycorrhizal plants) in upland environments of sub-arctic Sweden near Vassijaure (~850 mm precipitation) and Abisko (~300 mm precipitation). We collected soil and foliage for analysis of natural abundances of stable carbon and nitrogen isotopes (δ13C and δ15N), which can be a sensitive indicator of C and N dynamics. We also took high-resolution remote aerial imagery over the transects to calculate percent cover of vegetation types using GIS software. We concurrently estimated percent cover in smaller plots on the ground of three dominant species, Empetrum nigrum, Betula nana, and Betula pubescens, to serve as ground-truthing for the aerial imagery. Analysis of vegetation cover data shows significant differences in vegetation types along the transects. Preliminary multiple regression analysis of isotopes shows that δ13C in organic soil at the Vassijaure site is mostly controlled by distance along the transect, an interaction term between transect distance and soil depth, and δ15N (adjusted r2 = 0.85, p < 0.0001). Values of δ13C were lower in soils in the shrubs and forest than in the heath. In regression analyses, δ15N was primarily controlled by depth, and secondarily by heath cover (adjusted r2 = 0.68, p < 0.0001). These results suggest that trees and shrubs are sequestering carbon, and interactions between plants and belowground soil communities may be driving nitrogen dynamics.
Fabian C.C. Uzoh; Martin W. Ritchie
1996-01-01
The equations presented predict crown area for 13 species of trees and shrubs which may be found growing in competition with commercial conifers during early stages of stand development. The equations express crown area as a function of basal area and height. Parameters were estimated for each species individually using weighted nonlinear least square regression.
Biomass equations for major tree species of the Northeast
Louise M. Tritton; James W. Hornbeck
1982-01-01
Regression equations are used in both forestry and ecosystem studies to estimate tree biomass from field measurements of dbh (diameter at breast height) or a combination of dbh and height. Literature on biomass is reviewed, and 178 sets of publish equation for 25 species common to the Northeastern Unites States are listed. On the basis of these equations, estimates of...
Stand basal-area and tree-diameter growth in red spruce-fir forests in Maine, 1960-80
S.J. Zarnoch; D.A. Gansner; D.S. Powell; T.A. Birch; T.A. Birch
1990-01-01
Stand basal-area change and individual surviving red spruce d.b.h. growth from 1960 to 1980 were analyzed for red spruce-fir stands in Maine. Regression modeling was used to relate these measures of growth to stand and tree conditions and to compare growth throughout the period. Results indicate a decline in growth.
Examination of the Arborsonic Decay Detector for Detecting Bacterial Wetwood in Red Oaks
Zicai Xu; Theodor D. Leininger; James G. Williams; Frank H. Tainter
2000-01-01
The Arborsonic Decay Detector (ADD; Fujikura Europe Limited, Wiltshire, England) was used to measure the time it took an ultrasound wave to cross 280 diameters in red oak trees with varying degrees of bacterial wetwood or heartwood decay. Linear regressions derived from the ADD readings of trees in Mississippi and South Carolina with wetwood and heartwood decay...
NASA Astrophysics Data System (ADS)
Kisi, Ozgur; Parmar, Kulwinder Singh
2016-03-01
This study investigates the accuracy of least square support vector machine (LSSVM), multivariate adaptive regression splines (MARS) and M5 model tree (M5Tree) in modeling river water pollution. Various combinations of water quality parameters, Free Ammonia (AMM), Total Kjeldahl Nitrogen (TKN), Water Temperature (WT), Total Coliform (TC), Fecal Coliform (FC) and Potential of Hydrogen (pH) monitored at Nizamuddin, Delhi Yamuna River in India were used as inputs to the applied models. Results indicated that the LSSVM and MARS models had almost same accuracy and they performed better than the M5Tree model in modeling monthly chemical oxygen demand (COD). The average root mean square error (RMSE) of the LSSVM and M5Tree models was decreased by 1.47% and 19.1% using MARS model, respectively. Adding TC input to the models did not increase their accuracy in modeling COD while adding FC and pH inputs to the models generally decreased the accuracy. The overall results indicated that the MARS and LSSVM models could be successfully used in estimating monthly river water pollution level by using AMM, TKN and WT parameters as inputs.
Clow, David W.; Rhoades, Charles; Briggs, Jenny S.; Caldwell, Megan K.; Lewis, William M.
2011-01-01
Pine forest in northern Colorado and southern Wyoming, USA, are experiencing the most severe mountain pine beetle epidemic in recorded history, and possible degradation of drinking-water quality is a major concern. The objective of this study was to investigate possible changes in soil and water chemistry in Grand County, Colorado in response to the epidemic, and to identify major controlling influences on stream-water nutrients and C in areas affected by the mountain pine beetle. Soil moisture and soil N increased in soils beneath trees killed by the mountain pine beetle, reflecting reduced evapotranspiration and litter accumulation and decay. No significant changes in stream-water NO3-">NO3- or dissolved organic C were observed; however, total N and total P increased, possibly due to litter breakdown or increased productivity related to warming air temperatures. Multiple-regression analyses indicated that % of basin affected by mountain pine beetles had minimal influence on stream-water NO3-">NO3- and dissolved organic C; instead, other basin characteristics, such as percent of the basin classified as forest, were much more important.
Shafizadeh-Moghadam, Hossein; Valavi, Roozbeh; Shahabi, Himan; Chapi, Kamran; Shirzadi, Ataollah
2018-07-01
In this research, eight individual machine learning and statistical models are implemented and compared, and based on their results, seven ensemble models for flood susceptibility assessment are introduced. The individual models included artificial neural networks, classification and regression trees, flexible discriminant analysis, generalized linear model, generalized additive model, boosted regression trees, multivariate adaptive regression splines, and maximum entropy, and the ensemble models were Ensemble Model committee averaging (EMca), Ensemble Model confidence interval Inferior (EMciInf), Ensemble Model confidence interval Superior (EMciSup), Ensemble Model to estimate the coefficient of variation (EMcv), Ensemble Model to estimate the mean (EMmean), Ensemble Model to estimate the median (EMmedian), and Ensemble Model based on weighted mean (EMwmean). The data set covered 201 flood events in the Haraz watershed (Mazandaran province in Iran) and 10,000 randomly selected non-occurrence points. Among the individual models, the Area Under the Receiver Operating Characteristic (AUROC), which showed the highest value, belonged to boosted regression trees (0.975) and the lowest value was recorded for generalized linear model (0.642). On the other hand, the proposed EMmedian resulted in the highest accuracy (0.976) among all models. In spite of the outstanding performance of some models, nevertheless, variability among the prediction of individual models was considerable. Therefore, to reduce uncertainty, creating more generalizable, more stable, and less sensitive models, ensemble forecasting approaches and in particular the EMmedian is recommended for flood susceptibility assessment. Copyright © 2018 Elsevier Ltd. All rights reserved.
Olive tree-ring problematic dating: a comparative analysis on Santorini (Greece).
Cherubini, Paolo; Humbel, Turi; Beeckman, Hans; Gärtner, Holger; Mannes, David; Pearson, Charlotte; Schoch, Werner; Tognetti, Roberto; Lev-Yadun, Simcha
2013-01-01
Olive trees are a classic component of Mediterranean environments and some of them are known historically to be very old. In order to evaluate the possibility to use olive tree-rings for dendrochronology, we examined by various methods the reliability of olive tree-rings identification. Dendrochronological analyses of olive trees growing on the Aegean island Santorini (Greece) show that the determination of the number of tree-rings is impossible because of intra-annual wood density fluctuations, variability in tree-ring boundary structure, and restriction of its cambial activity to shifting sectors of the circumference, causing the tree-ring sequences along radii of the same cross section to differ.
NASA Astrophysics Data System (ADS)
Kukunda, Collins B.; Duque-Lazo, Joaquín; González-Ferreiro, Eduardo; Thaden, Hauke; Kleinn, Christoph
2018-03-01
Distinguishing tree species is relevant in many contexts of remote sensing assisted forest inventory. Accurate tree species maps support management and conservation planning, pest and disease control and biomass estimation. This study evaluated the performance of applying ensemble techniques with the goal of automatically distinguishing Pinus sylvestris L. and Pinus uncinata Mill. Ex Mirb within a 1.3 km2 mountainous area in Barcelonnette (France). Three modelling schemes were examined, based on: (1) high-density LiDAR data (160 returns m-2), (2) Worldview-2 multispectral imagery, and (3) Worldview-2 and LiDAR in combination. Variables related to the crown structure and height of individual trees were extracted from the normalized LiDAR point cloud at individual-tree level, after performing individual tree crown (ITC) delineation. Vegetation indices and the Haralick texture indices were derived from Worldview-2 images and served as independent spectral variables. Selection of the best predictor subset was done after a comparison of three variable selection procedures: (1) Random Forests with cross validation (AUCRFcv), (2) Akaike Information Criterion (AIC) and (3) Bayesian Information Criterion (BIC). To classify the species, 9 regression techniques were combined using ensemble models. Predictions were evaluated using cross validation and an independent dataset. Integration of datasets and models improved individual tree species classification (True Skills Statistic, TSS; from 0.67 to 0.81) over individual techniques and maintained strong predictive power (Relative Operating Characteristic, ROC = 0.91). Assemblage of regression models and integration of the datasets provided more reliable species distribution maps and associated tree-scale mapping uncertainties. Our study highlights the potential of model and data assemblage at improving species classifications needed in present-day forest planning and management.
Integrative modeling of gene and genome evolution roots the archaeal tree of life
Szöllősi, Gergely J.; Spang, Anja; Foster, Peter G.; Heaps, Sarah E.; Boussau, Bastien; Ettema, Thijs J. G.; Embley, T. Martin
2017-01-01
A root for the archaeal tree is essential for reconstructing the metabolism and ecology of early cells and for testing hypotheses that propose that the eukaryotic nuclear lineage originated from within the Archaea; however, published studies based on outgroup rooting disagree regarding the position of the archaeal root. Here we constructed a consensus unrooted archaeal topology using protein concatenation and a multigene supertree method based on 3,242 single gene trees, and then rooted this tree using a recently developed model of genome evolution. This model uses evidence from gene duplications, horizontal transfers, and gene losses contained in 31,236 archaeal gene families to identify the most likely root for the tree. Our analyses support the monophyly of DPANN (Diapherotrites, Parvarchaeota, Aenigmarchaeota, Nanoarchaeota, Nanohaloarchaea), a recently discovered cosmopolitan and genetically diverse lineage, and, in contrast to previous work, place the tree root between DPANN and all other Archaea. The sister group to DPANN comprises the Euryarchaeota and the TACK Archaea, including Lokiarchaeum, which our analyses suggest are monophyletic sister lineages. Metabolic reconstructions on the rooted tree suggest that early Archaea were anaerobes that may have had the ability to reduce CO2 to acetate via the Wood–Ljungdahl pathway. In contrast to proposals suggesting that genome reduction has been the predominant mode of archaeal evolution, our analyses infer a relatively small-genomed archaeal ancestor that subsequently increased in complexity via gene duplication and horizontal gene transfer. PMID:28533395
Turktas, Mine; Inal, Behcet; Okay, Sezer; Erkilic, Emine Gulden; Dundar, Ekrem; Hernandez, Pilar; Dorado, Gabriel; Unver, Turgay
2013-01-01
The olive tree (Olea europaea L.) is widely known for its strong tendency for alternate bearing, which severely affects the fruit yield from year to year. Microarray based gene expression analysis using RNA from olive samples (on-off years leaves and ripe-unripe fruits) are particularly useful to understand the molecular mechanisms influencing the periodicity in the olive tree. Thus, we carried out genome wide transcriptome analyses involving different organs and temporal stages of the olive tree using the NimbleGen Array containing 136,628 oligonucleotide probe sets. Cluster analyses of the genes showed that cDNAs originated from different organs could be sorted into separate groups. The nutritional control had a particularly remarkable impact on the alternate bearing of olive, as shown by the differential expression of transcripts under different temporal phases and organs. Additionally, hormonal control and flowering processes also played important roles in this phenomenon. Our analyses provide further insights into the transcript changes between "on year" and "off year" leaves along with the changes from unrpipe to ripe fruits, which shed light on the molecular mechanisms underlying the olive tree alternate bearing. These findings have important implications for the breeding and agriculture of the olive tree and other crops showing periodicity. To our knowledge, this is the first study reporting the development and use of an olive array to document the gene expression profiling associated with the alternate bearing in olive tree.
Integrative modeling of gene and genome evolution roots the archaeal tree of life.
Williams, Tom A; Szöllősi, Gergely J; Spang, Anja; Foster, Peter G; Heaps, Sarah E; Boussau, Bastien; Ettema, Thijs J G; Embley, T Martin
2017-06-06
A root for the archaeal tree is essential for reconstructing the metabolism and ecology of early cells and for testing hypotheses that propose that the eukaryotic nuclear lineage originated from within the Archaea; however, published studies based on outgroup rooting disagree regarding the position of the archaeal root. Here we constructed a consensus unrooted archaeal topology using protein concatenation and a multigene supertree method based on 3,242 single gene trees, and then rooted this tree using a recently developed model of genome evolution. This model uses evidence from gene duplications, horizontal transfers, and gene losses contained in 31,236 archaeal gene families to identify the most likely root for the tree. Our analyses support the monophyly of DPANN (Diapherotrites, Parvarchaeota, Aenigmarchaeota, Nanoarchaeota, Nanohaloarchaea), a recently discovered cosmopolitan and genetically diverse lineage, and, in contrast to previous work, place the tree root between DPANN and all other Archaea. The sister group to DPANN comprises the Euryarchaeota and the TACK Archaea, including Lokiarchaeum , which our analyses suggest are monophyletic sister lineages. Metabolic reconstructions on the rooted tree suggest that early Archaea were anaerobes that may have had the ability to reduce CO 2 to acetate via the Wood-Ljungdahl pathway. In contrast to proposals suggesting that genome reduction has been the predominant mode of archaeal evolution, our analyses infer a relatively small-genomed archaeal ancestor that subsequently increased in complexity via gene duplication and horizontal gene transfer.
Turktas, Mine; Inal, Behcet; Okay, Sezer; Erkilic, Emine Gulden; Dundar, Ekrem; Hernandez, Pilar; Dorado, Gabriel; Unver, Turgay
2013-01-01
The olive tree (Olea europaea L.) is widely known for its strong tendency for alternate bearing, which severely affects the fruit yield from year to year. Microarray based gene expression analysis using RNA from olive samples (on-off years leaves and ripe-unripe fruits) are particularly useful to understand the molecular mechanisms influencing the periodicity in the olive tree. Thus, we carried out genome wide transcriptome analyses involving different organs and temporal stages of the olive tree using the NimbleGen Array containing 136,628 oligonucleotide probe sets. Cluster analyses of the genes showed that cDNAs originated from different organs could be sorted into separate groups. The nutritional control had a particularly remarkable impact on the alternate bearing of olive, as shown by the differential expression of transcripts under different temporal phases and organs. Additionally, hormonal control and flowering processes also played important roles in this phenomenon. Our analyses provide further insights into the transcript changes between ”on year” and “off year” leaves along with the changes from unrpipe to ripe fruits, which shed light on the molecular mechanisms underlying the olive tree alternate bearing. These findings have important implications for the breeding and agriculture of the olive tree and other crops showing periodicity. To our knowledge, this is the first study reporting the development and use of an olive array to document the gene expression profiling associated with the alternate bearing in olive tree. PMID:23555820
The impact of fossil data on annelid phylogeny inferred from discrete morphological characters.
Parry, Luke A; Edgecombe, Gregory D; Eibye-Jacobsen, Danny; Vinther, Jakob
2016-08-31
As a result of their plastic body plan, the relationships of the annelid worms and even the taxonomic makeup of the phylum have long been contentious. Morphological cladistic analyses have typically recovered a monophyletic Polychaeta, with the simple-bodied forms assigned to an early-diverging clade or grade. This is in stark contrast to molecular trees, in which polychaetes are paraphyletic and include clitellates, echiurans and sipunculans. Cambrian stem group annelid body fossils are complex-bodied polychaetes that possess well-developed parapodia and paired head appendages (palps), suggesting that the root of annelids is misplaced in morphological trees. We present a reinvestigation of the morphology of key fossil taxa and include them in a comprehensive phylogenetic analysis of annelids. Analyses using probabilistic methods and both equal- and implied-weights parsimony recover paraphyletic polychaetes and support the conclusion that echiurans and clitellates are derived polychaetes. Morphological trees including fossils depict two main clades of crown-group annelids that are similar, but not identical, to Errantia and Sedentaria, the fundamental groupings in transcriptomic analyses. Removing fossils yields trees that are often less resolved and/or root the tree in greater conflict with molecular topologies. While there are many topological similarities between the analyses herein and recent phylogenomic hypotheses, differences include the exclusion of Sipuncula from Annelida and the taxa forming the deepest crown-group divergences. © 2016 The Authors.
Yi, Zhenzhen; Strüder-Kypke, Michaela; Hu, Xiaozhong; Lin, Xiaofeng; Song, Weibo
2014-02-01
In order to assess how dataset-selection for multi-gene analyses affects the accuracy of inferred phylogenetic trees in ciliates, we chose five genes and the genus Paramecium, one of the most widely used model protist genera, and compared tree topologies of the single- and multi-gene analyses. Our empirical study shows that: (1) Using multiple genes improves phylogenetic accuracy, even when their one-gene topologies are in conflict with each other. (2) The impact of missing data on phylogenetic accuracy is ambiguous: resolution power and topological similarity, but not number of represented taxa, are the most important criteria of a dataset for inclusion in concatenated analyses. (3) As an example, we tested the three classification models of the genus Paramecium with a multi-gene based approach, and only the monophyly of the subgenus Paramecium is supported. Copyright © 2013 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Bouffon, T.; Rice, R.; Bales, R.
2006-12-01
The spatial distributions of snow water equivalent (SWE) and snow depth within a 1, 4, and 16 km2 grid element around two automated snow pillows in a forested and open- forested region of the Upper Merced River Basin (2,800 km2) of Yosemite National Park were characterized using field observations and analyzed using binary regression trees. Snow surveys occurred at the forested site during the accumulation and ablation seasons, while at the open-forest site a survey was performed only during the accumulation season. An average of 130 snow depth and 7 snow density measurements were made on each survey, within the 4 km2 grid. Snow depth was distributed using binary regression trees and geostatistical methods using the physiographic parameters (e.g. elevation, slope, vegetation, aspect). Results in the forest region indicate that the snow pillow overestimated average SWE within the 1, 4, and 16 km2 areas by 34 percent during ablation, but during accumulation the snow pillow provides a good estimate of the modeled mean SWE grid value, however it is suspected that the snow pillow was underestimating SWE. However, at the open forest site, during accumulation, the snow pillow was 28 percent greater than the mean modeled grid element. In addition, the binary regression trees indicate that the independent variables of vegetation, slope, and aspect are the most influential parameters of snow depth distribution. The binary regression tree and multivariate linear regression models explain about 60 percent of the initial variance for snow depth and 80 percent for density, respectively. This short-term study provides motivation and direction for the installation of a distributed snow measurement network to fill the information gap in basin-wide SWE and snow depth measurements. Guided by these results, a distributed snow measurement network was installed in the Fall 2006 at Gin Flat in the Upper Merced River Basin with the specific objective of measuring accumulation and ablation across topographic variables with the aim of providing guidance for future larger scale observation network designs.
NASA Astrophysics Data System (ADS)
Nelson, E. A.; Thomas, S. C.
2007-12-01
Global increases in temperature and atmospheric CO2 concentration are predicted to enhance tree growth in the short term, but studies of current impacts of climate change on Canada's forests are limited. This study examined the effects of increasing temperature and atmospheric CO2 concentration on tree ring growth in west-central Manitoba and northern Ontario, sampling white spruce (Picea glauca) and black spruce (Picea mariana), respectively. Over 50 tree cores from each site were sampled, analysed for ring-width, cross-dated and detrended, generating a ~100 y chronology for each population. We found a positive correlation between ring-width increment and spring temperatures (April-May: p<0.005) in Ontario. In Manitoba, however, we found a negative correlation between summer temperatures (Jul-Aug: p<0.005) and ring-width increment coincident with a positive relationship with summer precipitation (July: p<0.03). We examined the residuals following a regression with temperature for a positive trend over time, which has been interpreted in prior studies as evidence for a CO2 fertilization effect. We detected no such putative CO2 fertilization signal in either spruce population. Our results suggest that temperature-limited lowland black spruce communities may respond positively to moderate warming, but that water-limited upland white spruce communities may suffer from drought stress under high temperature conditions. Neither population appears to benefit from increasing CO2 availability.
The relationship between trees and human health: evidence from the spread of the emerald ash borer.
Donovan, Geoffrey H; Butry, David T; Michael, Yvonne L; Prestemon, Jeffrey P; Liebhold, Andrew M; Gatziolis, Demetrios; Mao, Megan Y
2013-02-01
Several recent studies have identified a relationship between the natural environment and improved health outcomes. However, for practical reasons, most have been observational, cross-sectional studies. A natural experiment, which provides stronger evidence of causality, was used to test whether a major change to the natural environment-the loss of 100 million trees to the emerald ash borer, an invasive forest pest-has influenced mortality related to cardiovascular and lower-respiratory diseases. Two fixed-effects regression models were used to estimate the relationship between emerald ash borer presence and county-level mortality from 1990 to 2007 in 15 U.S. states, while controlling for a wide range of demographic covariates. Data were collected from 1990 to 2007, and the analyses were conducted in 2011 and 2012. There was an increase in mortality related to cardiovascular and lower-respiratory-tract illness in counties infested with the emerald ash borer. The magnitude of this effect was greater as infestation progressed and in counties with above-average median household income. Across the 15 states in the study area, the borer was associated with an additional 6113 deaths related to illness of the lower respiratory system, and 15,080 cardiovascular-related deaths. Results suggest that loss of trees to the emerald ash borer increased mortality related to cardiovascular and lower-respiratory-tract illness. This finding adds to the growing evidence that the natural environment provides major public health benefits. Published by Elsevier Inc.
Yamashita, Takashi; Kart, Cary S; Noe, Douglas A
2012-12-01
Type 2 diabetes is known to contribute to health disparities in the U.S. and failure to adhere to recommended self-care behaviors is a contributing factor. Intervention programs face difficulties as a result of patient diversity and limited resources. With data from the 2005 Behavioral Risk Factor Surveillance System, this study employs a logistic regression tree algorithm to identify characteristics of sub-populations with type 2 diabetes according to their reported frequency of adherence to four recommended diabetes self-care behaviors including blood glucose monitoring, foot examination, eye examination and HbA1c testing. Using Andersen's health behavior model, need factors appear to dominate the definition of which sub-groups were at greatest risk for low as well as high adherence. Findings demonstrate the utility of easily interpreted tree diagrams to design specific culturally appropriate intervention programs targeting sub-populations of diabetes patients who need to improve their self-care behaviors. Limitations and contributions of the study are discussed.
Biological, social, and urban design factors affecting young street tree mortality in New York City
Jacqueline W.T. Lu; Erika S. Svendsen; Lindsay K. Campbell; Jennifer Greenfeld; Jessie Braden; Kristen King; Nancy Falxa-Raymond
2010-01-01
In dense metropolitan areas, there are many factors including traffic congestion, building development and social organizations that may impact the health of street trees. The focus of this study is to better understand how social, biological and urban design factors affect the mortality rates of newly planted street trees. Prior analyses of street trees planted by the...
Growth and mortality of bigtooth aspen trees stressed by defoliation
Donald D. Davis; Timothy M. Frontz
2003-01-01
A survey conducted by the authors in 1993 in six stands within an area of declining aspen revealed that aspen mortality ranged from 25 to 67 percent per stand. Tree-ring analyses revealed that trees dead at time of sampling in 1993 had been growing slower in four of the six stands during the previous decade than were the surviving trees. However growth declines had...
Height-diameter equations for young-growth red fir in California and southern Oregon
K. Leroy Dolph
1989-01-01
Total tree height of young-growth red fir can be estimated from the relation of total tree height to diameter outside bark at breast height (DOB). Total tree heights and corresponding diameters were obtained from stem analyses of 562 trees distributed across 56 sampling locations in the true fir forest type of California and Oregon. The resulting equations can predict...
NASA Astrophysics Data System (ADS)
Pham, Binh Thai; Prakash, Indra; Tien Bui, Dieu
2018-02-01
A hybrid machine learning approach of Random Subspace (RSS) and Classification And Regression Trees (CART) is proposed to develop a model named RSSCART for spatial prediction of landslides. This model is a combination of the RSS method which is known as an efficient ensemble technique and the CART which is a state of the art classifier. The Luc Yen district of Yen Bai province, a prominent landslide prone area of Viet Nam, was selected for the model development. Performance of the RSSCART model was evaluated through the Receiver Operating Characteristic (ROC) curve, statistical analysis methods, and the Chi Square test. Results were compared with other benchmark landslide models namely Support Vector Machines (SVM), single CART, Naïve Bayes Trees (NBT), and Logistic Regression (LR). In the development of model, ten important landslide affecting factors related with geomorphology, geology and geo-environment were considered namely slope angles, elevation, slope aspect, curvature, lithology, distance to faults, distance to rivers, distance to roads, and rainfall. Performance of the RSSCART model (AUC = 0.841) is the best compared with other popular landslide models namely SVM (0.835), single CART (0.822), NBT (0.821), and LR (0.723). These results indicate that performance of the RSSCART is a promising method for spatial landslide prediction.
Random Forest as a Predictive Analytics Alternative to Regression in Institutional Research
ERIC Educational Resources Information Center
He, Lingjun; Levine, Richard A.; Fan, Juanjuan; Beemer, Joshua; Stronach, Jeanne
2018-01-01
In institutional research, modern data mining approaches are seldom considered to address predictive analytics problems. The goal of this paper is to highlight the advantages of tree-based machine learning algorithms over classic (logistic) regression methods for data-informed decision making in higher education problems, and stress the success of…
Tree allometry and improved estimation of carbon stocks and balance in tropical forests.
Chave, J; Andalo, C; Brown, S; Cairns, M A; Chambers, J Q; Eamus, D; Fölster, H; Fromard, F; Higuchi, N; Kira, T; Lescure, J-P; Nelson, B W; Ogawa, H; Puig, H; Riéra, B; Yamakura, T
2005-08-01
Tropical forests hold large stores of carbon, yet uncertainty remains regarding their quantitative contribution to the global carbon cycle. One approach to quantifying carbon biomass stores consists in inferring changes from long-term forest inventory plots. Regression models are used to convert inventory data into an estimate of aboveground biomass (AGB). We provide a critical reassessment of the quality and the robustness of these models across tropical forest types, using a large dataset of 2,410 trees >or= 5 cm diameter, directly harvested in 27 study sites across the tropics. Proportional relationships between aboveground biomass and the product of wood density, trunk cross-sectional area, and total height are constructed. We also develop a regression model involving wood density and stem diameter only. Our models were tested for secondary and old-growth forests, for dry, moist and wet forests, for lowland and montane forests, and for mangrove forests. The most important predictors of AGB of a tree were, in decreasing order of importance, its trunk diameter, wood specific gravity, total height, and forest type (dry, moist, or wet). Overestimates prevailed, giving a bias of 0.5-6.5% when errors were averaged across all stands. Our regression models can be used reliably to predict aboveground tree biomass across a broad range of tropical forests. Because they are based on an unprecedented dataset, these models should improve the quality of tropical biomass estimates, and bring consensus about the contribution of the tropical forest biome and tropical deforestation to the global carbon cycle.
Tree growth-climate relationships in a forest-plot network on Mediterranean mountains.
Fyllas, Nikolaos M; Christopoulou, Anastasia; Galanidis, Alexandros; Michelaki, Chrysanthi Z; Dimitrakopoulos, Panayiotis G; Fulé, Peter Z; Arianoutsou, Margarita
2017-11-15
In this study we analysed a novel tree-growth dataset, inferred from annual ring-width measurements, of 7 forest tree species from 12 mountain regions in Greece, in order to identify tree growth - climate relationships. The tree species of interest were: Abies cephalonica, Abies borisii-regis, Picea abies, Pinus nigra, Pinus sylvestris, Fagus sylvatica and Quercus frainetto growing across a gradient of climate conditions with mean annual temperature ranging from 5.7 to 12.6°C and total annual precipitation from 500 to 950mm. In total, 344 tree cores (one per tree) were analysed across a network of 20 study sites. We found that water availability during the summer period (May-August) was a strong predictor of interannual variation in tree growth for all study species. Across species and sites, annual tree growth was positively related to summer season precipitation (P SP ). The responsiveness of annual growth to P SP was tightly related to species and site specific measurements of instantaneous photosynthetic water use efficiency (WUE), suggesting that the growth of species with efficient water use is more responsive to variations in precipitation during the dry months of the year. Our findings support the importance of water availability for the growth of mountainous Mediterranean tree species and highlight that future reductions in precipitation are likely to lead to reduced tree-growth under climate change conditions. Copyright © 2017 Elsevier B.V. All rights reserved.
Treuer, T; Feng, Q; Desaiah, D; Altin, M; Wu, S; El-Shafei, A; Serebryakova, E; Gado, M; Faries, D
2014-09-01
The reduced availability of data from non-Western countries limits our ability to understand attention-deficit/hyperactivity disorder (ADHD) treatment outcomes, specifically, adherence and persistence of ADHD in children and adolescents. This analysis assessed predictors of treatment outcomes in a non-Western cohort of patients with ADHD treated with atomoxetine or methylphenidate. Data from a 12-month, prospective, observational study in outpatients aged 6-17 years treated with atomoxetine (N = 234) or methylphenidate (N = 221) were analysed post hoc to determine potential predictors of treatment outcomes. Participating countries included the Russian Federation, China, Taiwan, Egypt, United Arab Emirates and Lebanon. Factors associated with remission were analysed with stepwise multiple logistic regression and classification and regression trees (CART). Cox proportional hazards models with propensity score adjustment assessed differences in atomoxetine persistence among initial-dose cohorts. In patients treated with atomoxetine who had available dosing information (N = 134), Cox proportional hazards revealed lower (< 0.5 mg/kg) initial dose was significantly associated with shorter medication persistence (p < 0.01). multiple logistic regression analysis revealed greater rates of remission for atomoxetine-treated patients were associated with age (older), country (United Arab Emirates) and gender (female) (all p < 0.05). CART analysis confirmed older age and lack of specific phobias were associated with greater remission rates. For methylphenidate, greater baseline weight (highly correlated with the age factor found for atomoxetine) and prior atomoxetine use were associated with greater remission rates. These findings may help clinicians assess factors upon initiation of ADHD treatment to improve course prediction, proper dosing and treatment adherence and persistence. Observational study, therefore no registration. © 2014 John Wiley & Sons Ltd.
Slow release fertilizers in bareroot nurseries
J. G. Iyer; J. Dobrahner; B. Lowery; J. Vandettey
2002-01-01
Maintaining sufficient soil fertility in tree nurseries for good tree growth can be implemented by annually performing soil analyses and following a fertility maintenance program. Percentage recovery by trees of fertilizer applied indicates efficiency of fertilizer use. There is a wide variation in the recovery among the various fertilizer elements. Our research has...
Object-Oriented Algorithm For Evaluation Of Fault Trees
NASA Technical Reports Server (NTRS)
Patterson-Hine, F. A.; Koen, B. V.
1992-01-01
Algorithm for direct evaluation of fault trees incorporates techniques of object-oriented programming. Reduces number of calls needed to solve trees with repeated events. Provides significantly improved software environment for such computations as quantitative analyses of safety and reliability of complicated systems of equipment (e.g., spacecraft or factories).
Kurt H. Johnsen; Lawrence B. Flanagan; Dudley A. Huber; John E. Major
1999-01-01
The authors performed genetic analyses of growth, carbon isotope discrimination (?13C), and foliar N concentration using a half-diallel subset of a 7 Ã 7 complete diallel planted on three sites ranging in water availability. Trees were 22 years old. Heritabilities; general and...
Vegetation Continuous Fields--Transitioning from MODIS to VIIRS
NASA Astrophysics Data System (ADS)
DiMiceli, C.; Townshend, J. R.; Sohlberg, R. A.; Kim, D. H.; Kelly, M.
2015-12-01
Measurements of fractional vegetation cover are critical for accurate and consistent monitoring of global deforestation rates. They also provide important parameters for land surface, climate and carbon models and vital background data for research into fire, hydrological and ecosystem processes. MODIS Vegetation Continuous Fields (VCF) products provide four complementary layers of fractional cover: tree cover, non-tree vegetation, bare ground, and surface water. MODIS VCF products are currently produced globally and annually at 250m resolution for 2000 to the present. Additionally, annual VCF products at 1/20° resolution derived from AVHRR and MODIS Long-Term Data Records are in development to provide Earth System Data Records of fractional vegetation cover for 1982 to the present. In order to provide continuity of these valuable products, we are extending the VCF algorithms to create Suomi NPP/VIIRS VCF products. This presentation will highlight the first VIIRS fractional cover product: global percent tree cover at 1 km resolution. To create this product, phenological and physiological metrics were derived from each complete year of VIIRS 8-day surface reflectance products. A supervised regression tree method was applied to the metrics, using training derived from Landsat data supplemented by high-resolution data from Ikonos, RapidEye and QuickBird. The regression tree model was then applied globally to produce fractional tree cover. In our presentation we will detail our methods for creating the VIIRS VCF product. We will compare the new VIIRS VCF product to our current MODIS VCF products and demonstrate continuity between instruments. Finally, we will outline future VIIRS VCF development plans.
Tanaka, Tomohiro; Voigt, Michael D
2018-03-01
Non-melanoma skin cancer (NMSC) is the most common de novo malignancy in liver transplant (LT) recipients; it behaves more aggressively and it increases mortality. We used decision tree analysis to develop a tool to stratify and quantify risk of NMSC in LT recipients. We performed Cox regression analysis to identify which predictive variables to enter into the decision tree analysis. Data were from the Organ Procurement Transplant Network (OPTN) STAR files of September 2016 (n = 102984). NMSC developed in 4556 of the 105984 recipients, a mean of 5.6 years after transplant. The 5/10/20-year rates of NMSC were 2.9/6.3/13.5%, respectively. Cox regression identified male gender, Caucasian race, age, body mass index (BMI) at LT, and sirolimus use as key predictive or protective factors for NMSC. These factors were entered into a decision tree analysis. The final tree stratified non-Caucasians as low risk (0.8%), and Caucasian males > 47 years, BMI < 40 who did not receive sirolimus, as high risk (7.3% cumulative incidence of NMSC). The predictions in the derivation set were almost identical to those in the validation set (r 2 = 0.971, p < 0.0001). Cumulative incidence of NMSC in low, moderate and high risk groups at 5/10/20 year was 0.5/1.2/3.3, 2.1/4.8/11.7 and 5.6/11.6/23.1% (p < 0.0001). The decision tree model accurately stratifies the risk of developing NMSC in the long-term after LT.
NASA Astrophysics Data System (ADS)
Park, Seonyoung; Im, Jungho; Park, Sumin; Rhee, Jinyoung
2017-04-01
Soil moisture is one of the most important keys for understanding regional and global climate systems. Soil moisture is directly related to agricultural processes as well as hydrological processes because soil moisture highly influences vegetation growth and determines water supply in the agroecosystem. Accurate monitoring of the spatiotemporal pattern of soil moisture is important. Soil moisture has been generally provided through in situ measurements at stations. Although field survey from in situ measurements provides accurate soil moisture with high temporal resolution, it requires high cost and does not provide the spatial distribution of soil moisture over large areas. Microwave satellite (e.g., advanced Microwave Scanning Radiometer on the Earth Observing System (AMSR2), the Advanced Scatterometer (ASCAT), and Soil Moisture Active Passive (SMAP)) -based approaches and numerical models such as Global Land Data Assimilation System (GLDAS) and Modern- Era Retrospective Analysis for Research and Applications (MERRA) provide spatial-temporalspatiotemporally continuous soil moisture products at global scale. However, since those global soil moisture products have coarse spatial resolution ( 25-40 km), their applications for agriculture and water resources at local and regional scales are very limited. Thus, soil moisture downscaling is needed to overcome the limitation of the spatial resolution of soil moisture products. In this study, GLDAS soil moisture data were downscaled up to 1 km spatial resolution through the integration of AMSR2 and ASCAT soil moisture data, Shuttle Radar Topography Mission (SRTM) Digital Elevation Model (DEM), and Moderate Resolution Imaging Spectroradiometer (MODIS) data—Land Surface Temperature, Normalized Difference Vegetation Index, and Land cover—using modified regression trees over East Asia from 2013 to 2015. Modified regression trees were implemented using Cubist, a commercial software tool based on machine learning. An optimization based on pruning of rules derived from the modified regression trees was conducted. Root Mean Square Error (RMSE) and Correlation coefficients (r) were used to optimize the rules, and finally 59 rules from modified regression trees were produced. The results show high validation r (0.79) and low validation RMSE (0.0556m3/m3). The 1 km downscaled soil moisture was evaluated using ground soil moisture data at 14 stations, and both soil moisture data showed similar temporal patterns (average r=0.51 and average RMSE=0.041). The spatial distribution of the 1 km downscaled soil moisture well corresponded with GLDAS soil moisture that caught both extremely dry and wet regions. Correlation between GLDAS and the 1 km downscaled soil moisture during growing season was positive (mean r=0.35) in most regions.
Diameter-growth model across shortleaf pine range using regression tree analysis
Daniel Yaussy; Louis Iverson; Anantha Prasad
1999-01-01
Diameter growth of a tree in most gap-phase models is limited by light, nutrients, moisture, and temperature. Growing-season temperature is represented by growing degree days (gdd), which is the sum of the average daily temperatures above a baseline temperature. Gap-phase models determine the north-south range of a species by the gdd limits at the north and south...
Eric J. Gustafson
2014-01-01
Regression models developed in the upper Midwest (United States) to predict drought-induced tree mortality from measures of drought (Palmer Drought Severity Index) were tested in the northeastern United States and found inadequate. The most likely cause of this result is that long drought events were rare in the Northeast during the period when inventory data were...
W. Henry McNab; David L. Loftis; Callie J. Schweitzer; Raymond Sheffield
2004-01-01
We used tree indicator species occurring on 438 plots in the Plateau counties of Tennessee to test the uniqueness of four conterminous ecoregions. Multinomial logistic regression indicated that the presence of 14 tree species allowed classification of sample plots according to ecoregion with an average overall accuracy of 75 percent (range 45 to 94 percent). Additional...
Dating tree mortality using log decay in the White Mountains of New Hampshire
Andrew J. Fast; Mark J. Ducey; Jeffrey H. Gove; William B. Leak
2008-01-01
Coarse woody material (CWM) is an important component of forest ecosystems. To meet specific CWM management objectives, it is important to understand rates of decay. We present results from a silvicultural trial at the Bartlett Experimental Forest, in which time of death is known for a large sample of trees. Either a simple table or regression equations that use...
Development of post-fire crown damage mortality thresholds in ponderosa pine
James F. Fowler; Carolyn Hull Sieg; Joel McMillin; Kurt K. Allen; Jose F. Negron; Linda L. Wadleigh; John A. Anhold; Ken E. Gibson
2010-01-01
Previous research has shown that crown scorch volume and crown consumption volume are the major predictors of post-fire mortality in ponderosa pine. In this study, we use piecewise logistic regression models of crown scorch data from 6633 trees in five wildfires from the Intermountain West to locate a mortality threshold at 88% scorch by volume for trees with no crown...
Equations relating compacted and uncompacted live crown ratio for common tree species in the South
KaDonna C. Randolph
2010-01-01
Species-specific equations to predict uncompacted crown ratio (UNCR) from compacted live crown ratio (CCR), tree length, and stem diameter were developed for 24 species and 12 genera in the southern United States. Using data from the US Forest Service Forest Inventory and Analysis program, nonlinear regression was used to model UNCR with a logistic function. Model...
Scott M. Bretthauer; George Z. Gertner; Gary L. Rolfe; Jeffery O. Dawson
2003-01-01
Tree species diversity increases and dominance decreases with proximity to forest border in two 60-year-old successional forest stands developed on abandoned agricultural land in Piatt County, Illinois. A regression equation allowed us to quantify an increase in diversity with closeness to forest border for one of the forest stands. Shingle oak is the most dominant...
Regeneration of Douglas-fir cutblocks on the Six Rivers National Forest in northwestern California
R. O. Strothmann
1979-01-01
A survey of 61 cutblocks planted since 1964 evaluated stocking of conifers (trees 1 foot tall or taller) on 2-milacre quadrats. Overall stocking percentage averaged 42.2 and ranged from 15 to 8 1. Overall number of trees per acre averaged 396. In the regression model, based on 36 cutblocks, better stocking was associated with high site class, northerly aspect,...
Smitley, D R; Rebek, E J; Royalty, R N; Davis, T W; Newhouse, K F
2010-02-01
We conducted field trials at five different locations over a period of 6 yr to investigate the efficacy of imidacloprid applied each spring as a basal soil drench for protection against emerald ash borer, Agrilus planipennis Fairmaire (Coleoptera: Buprestidae). Canopy thinning and emerald ash borer larval density were used to evaluate efficacy for 3-4 yr at each location while treatments continued. Test sites included small urban trees (5-15 cm diameter at breast height [dbh]), medium to large (15-65 cm dbh) trees at golf courses, and medium to large street trees. Annual basal drenches with imidacloprid gave complete protection of small ash trees for three years. At three sites where the size of trees ranged from 23 to 37 cm dbh, we successfully protected all ash trees beginning the test with <60% canopy thinning. Regression analysis of data from two sites reveals that tree size explains 46% of the variation in efficacy of imidacloprid drenches. The smallest trees (<30 cm dbh) remained in excellent condition for 3 yr, whereas most of the largest trees (>38 cm dbh) declined to a weakened state and undesirable appearance. The five-fold increase in trunk and branch surface area of ash trees as the tree dbh doubles may account for reduced efficacy on larger trees, and suggests a need to increase treatment rates for larger trees.
NASA Astrophysics Data System (ADS)
Krąpiec, Marek; Margielewski, Włodzimierz; Korzeń, Katarzyna; Szychowska-Krąpiec, Elżbieta; Nalepka, Dorota; Łajczak, Adam
2016-09-01
The results of dendrochronological and palynological analyses of subfossil pine trees occurring in the peat deposits of the Puścizna Wielka raised bog (Polish Carpathians, Southern Poland) - the only site with numerous subfossil pine trees in the mountainous regions of Central Europe presently known - indicate that the majority of the tree populations grew in the peat bog during the periods ca 5415-3940 cal BP and 3050-2560 cal BP. Several forestless episodes, dated to 5245-5155 cal BP, 4525-4395 cal BP and 3940-3050 cal BP, were preceded by tree dying-off phases caused by an extreme periodical increase in humidity and general climate cooling trends. These events are documented based on analyses of pollen and non-pollen palynomorph assemblages, dendrochronological analyses of the trees, as well as numerous radiocarbon datings of the sediment horizons occurring within the peat bog profile. The phases of germinations, and, in turn, of tree and shrub invasions of the peat bog areas have been closely connected to drying and occasional warming of the regional climate. The last of the forestless periods began about 2600 years ago and continued up to the very recent times. Currently, as a result of desiccation of the peat bog and the lowering of the groundwater level (due to improved water drainage system), pine trees have returned the peat bog again. These results demonstrate that studies of subfossil bog-pine trees are quite effective in documenting and reconstructing periods of humidity fluctuation that occurred within the Carpathian region over the last several millennia.
Detecting trends in tree growth: not so simple.
Bowman, David M J S; Brienen, Roel J W; Gloor, Emanuel; Phillips, Oliver L; Prior, Lynda D
2013-01-01
Tree biomass influences biogeochemical cycles, climate, and biodiversity across local to global scales. Understanding the environmental control of tree biomass demands consideration of the drivers of individual tree growth over their lifespan. This can be achieved by studies of tree growth in permanent sample plots (prospective studies) and tree ring analyses (retrospective studies). However, identification of growth trends and attribution of their drivers demands statistical control of the axiomatic co-variation of tree size and age, and avoiding sampling biases at the stand, forest, and regional scales. Tracking and predicting the effects of environmental change on tree biomass requires well-designed studies that address the issues that we have reviewed. Copyright © 2012 Elsevier Ltd. All rights reserved.
Olive Tree-Ring Problematic Dating: A Comparative Analysis on Santorini (Greece)
Cherubini, Paolo; Humbel, Turi; Beeckman, Hans; Gärtner, Holger; Mannes, David; Pearson, Charlotte; Schoch, Werner; Tognetti, Roberto; Lev-Yadun, Simcha
2013-01-01
Olive trees are a classic component of Mediterranean environments and some of them are known historically to be very old. In order to evaluate the possibility to use olive tree-rings for dendrochronology, we examined by various methods the reliability of olive tree-rings identification. Dendrochronological analyses of olive trees growing on the Aegean island Santorini (Greece) show that the determination of the number of tree-rings is impossible because of intra-annual wood density fluctuations, variability in tree-ring boundary structure, and restriction of its cambial activity to shifting sectors of the circumference, causing the tree-ring sequences along radii of the same cross section to differ. PMID:23382949
A Millennial-length Reconstruction of the Western Pacific Pattern with Associated Paleoclimate
NASA Astrophysics Data System (ADS)
Wright, W. E.; Guan, B. T.; Wei, K.
2010-12-01
The Western Pacific Pattern (WP) is a lesser known 500 hPa pressure pattern similar to the NAO or PNA. As defined, the poles of the WP index are centered on 60°N over the Kamchatka peninsula and the neighboring Pacific and on 32.5°N over the western north Pacific. However, the area of influence for the southern half of the dipole includes a wide swath from East Asia, across Taiwan, through the Philippine Sea, to the western north Pacific. Tree rings of Taiwanese Chamaecyparis obtusa var. formosana in this extended region show significant correlation with the WP, and with local temperature. The WP is also significantly correlated with atmospheric temperatures over Taiwan, especially at 850hPa and 700 hPa, pressure levels that bracket the tree site. Spectral analysis indicates that variations in the WP occur at relatively high frequency, with most power at less than 5 years. Simple linear regression against high frequency variants of the tree-ring chronology yielded the most significant correlation coefficients. Two reconstructions are presented. The first uses a tree-ring time series produced as the first intrinsic mode function (IMF) from an Ensemble Empirical Mode Decomposition (EEMD), based on the Hilbert-Huang Transform. The significance of the regression using the EEMD-derived time series was much more significant than time series produced using traditional high pass filtering. The second also uses the first IMF of a tree-ring time series, but the dataset was first sorted and partitioned at a specified quantile prior to EEMD decomposition, with the mean of the partitioned data forming the input to the EEMD. The partitioning was done to filter out the less climatically sensitive tree rings, a common problem with shade tolerant trees. Time series statistics indicate that the first reconstruction is reliable to 1241 of the Common Era. Reliability of the second reconstruction is dependent on the development of statistics related to the quantile partitioning, and the consequent reduction in sample depth. However, the correlation coefficients from regressions over the instrumental period greatly exceed those from any other method of chronology generation, and so the technique holds promise. Additional atmospheric parameters having significant correlations against the WPO and tree ring time series with similar spatial patterns are also presented. These include vertical wind shear (850hPa-700hPa) over the northern Philippines and the Philippine Sea, surface Omega and 850hPa v-winds over the East China Sea, Japan and Taiwan. Possible links to changes in the subtropical jet stream will also be discussed.
Prediction of Baseflow Index of Catchments using Machine Learning Algorithms
NASA Astrophysics Data System (ADS)
Yadav, B.; Hatfield, K.
2017-12-01
We present the results of eight machine learning techniques for predicting the baseflow index (BFI) of ungauged basins using a surrogate of catchment scale climate and physiographic data. The tested algorithms include ordinary least squares, ridge regression, least absolute shrinkage and selection operator (lasso), elasticnet, support vector machine, gradient boosted regression trees, random forests, and extremely randomized trees. Our work seeks to identify the dominant controls of BFI that can be readily obtained from ancillary geospatial databases and remote sensing measurements, such that the developed techniques can be extended to ungauged catchments. More than 800 gauged catchments spanning the continental United States were selected to develop the general methodology. The BFI calculation was based on the baseflow separated from daily streamflow hydrograph using HYSEP filter. The surrogate catchment attributes were compiled from multiple sources including digital elevation model, soil, landuse, climate data, other publicly available ancillary and geospatial data. 80% catchments were used to train the ML algorithms, and the remaining 20% of the catchments were used as an independent test set to measure the generalization performance of fitted models. A k-fold cross-validation using exhaustive grid search was used to fit the hyperparameters of each model. Initial model development was based on 19 independent variables, but after variable selection and feature ranking, we generated revised sparse models of BFI prediction that are based on only six catchment attributes. These key predictive variables selected after the careful evaluation of bias-variance tradeoff include average catchment elevation, slope, fraction of sand, permeability, temperature, and precipitation. The most promising algorithms exceeding an accuracy score (r-square) of 0.7 on test data include support vector machine, gradient boosted regression trees, random forests, and extremely randomized trees. Considering both the accuracy and the computational complexity of these algorithms, we identify the extremely randomized trees as the best performing algorithm for BFI prediction in ungauged basins.
Zong, Shengwei; Wu, Zhengfang; Xu, Jiawei; Li, Ming; Gao, Xiaofeng; He, Hongshi; Du, Haibo; Wang, Lei
2014-01-01
Tree line ecotone in the Changbai Mountains has undergone large changes in the past decades. Tree locations show variations on the four sides of the mountains, especially on the northern and western sides, which has not been fully explained. Previous studies attributed such variations to the variations in temperature. However, in this study, we hypothesized that topographic controls were responsible for causing the variations in the tree locations in tree line ecotone of the Changbai Mountains. To test the hypothesis, we used IKONOS images and WorldView-1 image to identify the tree locations and developed a logistic regression model using topographical variables to identify the dominant controls of the tree locations. The results showed that aspect, wetness, and slope were dominant controls for tree locations on western side of the mountains, whereas altitude, SPI, and aspect were the dominant factors on northern side. The upmost altitude a tree can currently reach was 2140 m asl on the northern side and 2060 m asl on western side. The model predicted results showed that habitats above the current tree line on the both sides were available for trees. Tree recruitments under the current tree line may take advantage of the available habitats at higher elevations based on the current tree location. Our research confirmed the controlling effects of topography on the tree locations in the tree line ecotone of Changbai Mountains and suggested that it was essential to assess the tree response to topography in the research of tree line ecotone. PMID:25170918
Zong, Shengwei; Wu, Zhengfang; Xu, Jiawei; Li, Ming; Gao, Xiaofeng; He, Hongshi; Du, Haibo; Wang, Lei
2014-01-01
Tree line ecotone in the Changbai Mountains has undergone large changes in the past decades. Tree locations show variations on the four sides of the mountains, especially on the northern and western sides, which has not been fully explained. Previous studies attributed such variations to the variations in temperature. However, in this study, we hypothesized that topographic controls were responsible for causing the variations in the tree locations in tree line ecotone of the Changbai Mountains. To test the hypothesis, we used IKONOS images and WorldView-1 image to identify the tree locations and developed a logistic regression model using topographical variables to identify the dominant controls of the tree locations. The results showed that aspect, wetness, and slope were dominant controls for tree locations on western side of the mountains, whereas altitude, SPI, and aspect were the dominant factors on northern side. The upmost altitude a tree can currently reach was 2140 m asl on the northern side and 2060 m asl on western side. The model predicted results showed that habitats above the current tree line on the both sides were available for trees. Tree recruitments under the current tree line may take advantage of the available habitats at higher elevations based on the current tree location. Our research confirmed the controlling effects of topography on the tree locations in the tree line ecotone of Changbai Mountains and suggested that it was essential to assess the tree response to topography in the research of tree line ecotone.
Liu, Rong; Li, Xi; Zhang, Wei; Zhou, Hong-Hao
2015-01-01
Objective Multiple linear regression (MLR) and machine learning techniques in pharmacogenetic algorithm-based warfarin dosing have been reported. However, performances of these algorithms in racially diverse group have never been objectively evaluated and compared. In this literature-based study, we compared the performances of eight machine learning techniques with those of MLR in a large, racially-diverse cohort. Methods MLR, artificial neural network (ANN), regression tree (RT), multivariate adaptive regression splines (MARS), boosted regression tree (BRT), support vector regression (SVR), random forest regression (RFR), lasso regression (LAR) and Bayesian additive regression trees (BART) were applied in warfarin dose algorithms in a cohort from the International Warfarin Pharmacogenetics Consortium database. Covariates obtained by stepwise regression from 80% of randomly selected patients were used to develop algorithms. To compare the performances of these algorithms, the mean percentage of patients whose predicted dose fell within 20% of the actual dose (mean percentage within 20%) and the mean absolute error (MAE) were calculated in the remaining 20% of patients. The performances of these techniques in different races, as well as the dose ranges of therapeutic warfarin were compared. Robust results were obtained after 100 rounds of resampling. Results BART, MARS and SVR were statistically indistinguishable and significantly out performed all the other approaches in the whole cohort (MAE: 8.84–8.96 mg/week, mean percentage within 20%: 45.88%–46.35%). In the White population, MARS and BART showed higher mean percentage within 20% and lower mean MAE than those of MLR (all p values < 0.05). In the Asian population, SVR, BART, MARS and LAR performed the same as MLR. MLR and LAR optimally performed among the Black population. When patients were grouped in terms of warfarin dose range, all machine learning techniques except ANN and LAR showed significantly higher mean percentage within 20%, and lower MAE (all p values < 0.05) than MLR in the low- and high- dose ranges. Conclusion Overall, machine learning-based techniques, BART, MARS and SVR performed superior than MLR in warfarin pharmacogenetic dosing. Differences of algorithms’ performances exist among the races. Moreover, machine learning-based algorithms tended to perform better in the low- and high- dose ranges than MLR. PMID:26305568
Isozymes and the genetic resources of forest trees
A. H. D. Brown; G. F. Moran
1981-01-01
Genetic data are an essential prerequisite for analysing the genetic structure of tree populations. The isozyme technique is the best currently available method for obtaining such data. Despite several shortcomings, isozyme data directly evaluate the genetic resources of forest trees, and can thus be used to monitor and manipulate these resources. For example,...
Allana K. Welsh; Jeffrey O. Dawson; Gerald J. Gottfried; Dittmar Hahn
2009-01-01
The diversity of uncultured Frankia populations in root nodules of Alnus oblongifolia trees geographically isolated on mountaintops of central Arizona was analyzed by comparative sequence analyses of nifH gene fragments. Sequences were retrieved from Frankia populations in nodules of four trees from each of...
Blomberg, S
2000-11-01
Currently available programs for the comparative analysis of phylogenetic data do not perform optimally when the phylogeny is not completely specified (i.e. the phylogeny contains polytomies). Recent literature suggests that a better way to analyse the data would be to create random trees from the known phylogeny that are fully-resolved but consistent with the known tree. A computer program is presented, Fels-Rand, that performs such analyses. A randomisation procedure is used to generate trees that are fully resolved but whose structure is consistent with the original tree. Statistics are then calculated on a large number of these randomly-generated trees. Fels-Rand uses the object-oriented features of Xlisp-Stat to manipulate internal tree representations. Xlisp-Stat's dynamic graphing features are used to provide heuristic tools to aid in analysis, particularly outlier analysis. The usefulness of Xlisp-Stat as a system for phylogenetic computation is discussed. Available from the author or at http://www.uq.edu.au/~ansblomb/Fels-Rand.sit.hqx. Xlisp-Stat is available from http://stat.umn.edu/~luke/xls/xlsinfo/xlsinfo.html. s.blomberg@abdn.ac.uk
Brignole, Daniele; Drava, Giuliana; Minganti, Vincenzo; Giordani, Paolo; Samson, Roeland; Vieira, Joana; Pinho, Pedro; Branquinho, Cristina
2018-03-01
Tree bark has proven to be a reliable tool for biomonitoring deposition of metals from the atmosphere. The aim of the present study was to test if bark magnetic properties can be used as a proxy of the overall metal loads of a tree bark, meaning that this approach can be used to discriminate different effects of pollution on different types of urban site. In this study, the concentrations of As, Cd, Co, Cu, Fe, Mn, Ni, P, Pb, V and Zn were measured by ICP-OES in bark samples of Jacaranda mimosifolia, collected along roads and in urban green spaces in the city of Lisbon (Portugal). Magnetic analyses were also performed on the same bark samples, measuring Isothermal Remanent Magnetization (IRM), Saturation Isothermal Remanent Magnetization (SIRM) and Magnetic Susceptibility (χ). The results confirmed that magnetic analyses can be used as a proxy of the overall load of trace elements in tree bark, and could be used to distinguish different types of urban sites regarding atmospheric pollution. Together with trace element analyses, magnetic analyses could thus be used as a tool to provide high-resolution data on urban air quality and to follow up the success of mitigation actions aiming at decreasing the pollutant load in urban environments. Copyright © 2017 Elsevier Ltd. All rights reserved.
Evaluation and prediction of shrub cover in coastal Oregon forests (USA)
Becky K. Kerns; Janet L. Ohmann
2004-01-01
We used data from regional forest inventories and research programs, coupled with mapped climatic and topographic information, to explore relationships and develop multiple linear regression (MLR) and regression tree models for total and deciduous shrub cover in the Oregon coastal province. Results from both types of models indicate that forest structure variables were...
ERIC Educational Resources Information Center
Strobl, Carolin; Malley, James; Tutz, Gerhard
2009-01-01
Recursive partitioning methods have become popular and widely used tools for nonparametric regression and classification in many scientific fields. Especially random forests, which can deal with large numbers of predictor variables even in the presence of complex interactions, have been applied successfully in genetics, clinical medicine, and…
Weighted linear regression using D2H and D2 as the independent variables
Hans T. Schreuder; Michael S. Williams
1998-01-01
Several error structures for weighted regression equations used for predicting volume were examined for 2 large data sets of felled and standing loblolly pine trees (Pinus taeda L.). The generally accepted model with variance of error proportional to the value of the covariate squared ( D2H = diameter squared times height or D...
An Intelligent Decision Support System for Workforce Forecast
2011-01-01
ARIMA ) model to forecast the demand for construction skills in Hong Kong. This model was based...Decision Trees ARIMA Rule Based Forecasting Segmentation Forecasting Regression Analysis Simulation Modeling Input-Output Models LP and NLP Markovian...data • When results are needed as a set of easily interpretable rules 4.1.4 ARIMA Auto-regressive, integrated, moving-average ( ARIMA ) models
Nitrogen and phosphorus resorption in a neotropical rain forest of a nutrient-rich soil.
Martínez-Sánchez, José Luis
2005-01-01
In tropical forests with nutrient-rich soil tree's nutrient resorption from senesced leaves has not always been observed to be low. Perhaps this lack of consistence is partly owing to the nutrient resorption methods used. The aim of the study was to analyse N and P resorption proficiency from tropical rain forest trees in a nutrient-rich soil. It was hypothesised that trees would exhibit low nutrient resorption in a nutrient-rich soil. The soil concentrations of total N and extractable P, among other physical and chemical characteristics, were analysed in 30 samples in the soil surface (10 cm) of three undisturbed forest plots at 'Estaci6n de Biologia Los Tuxtlas' on the east coast of Mexico (18 degrees 34' - 18 degrees 36' N, 95 degrees 04' - 95 degrees 09' W). N and P resorption proficiency were determined from senescing leaves in 11 dominant tree species. Nitrogen was analysed by microkjeldahl digestion with sulphuric acid and distilled with boric acid, and phosphorus was analysed by digestion with nitric acid and perchloric acid. Soil was rich in total N (0.50%, n = 30) and extractable P (4.11 microg g(-1) n = 30). As expected, trees showed incomplete N (1.13%, n = 11) and P (0.11%, n = 1) resorption. With a more accurate method of nutrient resorption assessment, it is possible to prove that a forest community with a nutrient-rich soil can have low levels of N and P resorption.
Ozge, C; Toros, F; Bayramkaya, E; Camdeviren, H; Sasmaz, T
2006-08-01
The purpose of this study is to evaluate the most important sociodemographic factors on smoking status of high school students using a broad randomised epidemiological survey. Using in-class, self administered questionnaire about their sociodemographic variables and smoking behaviour, a representative sample of total 3304 students of preparatory, 9th, 10th, and 11th grades, from 22 randomly selected schools of Mersin, were evaluated and discriminative factors have been determined using appropriate statistics. In addition to binary logistic regression analysis, the study evaluated combined effects of these factors using classification and regression tree methodology, as a new statistical method. The data showed that 38% of the students reported lifetime smoking and 16.9% of them reported current smoking with a male predominancy and increasing prevalence by age. Second hand smoking was reported at a 74.3% frequency with father predominance (56.6%). The significantly important factors that affect current smoking in these age groups were increased by household size, late birth rank, certain school types, low academic performance, increased second hand smoking, and stress (especially reported as separation from a close friend or because of violence at home). Classification and regression tree methodology showed the importance of some neglected sociodemographic factors with a good classification capacity. It was concluded that, as closely related with sociocultural factors, smoking was a common problem in this young population, generating important academic and social burden in youth life and with increasing data about this behaviour and using new statistical methods, effective coping strategies could be composed.
Price, B; Gomez, A; Mathys, L; Gardi, O; Schellenberger, A; Ginzler, C; Thürig, E
2017-03-01
Trees outside forest (TOF) can perform a variety of social, economic and ecological functions including carbon sequestration. However, detailed quantification of tree biomass is usually limited to forest areas. Taking advantage of structural information available from stereo aerial imagery and airborne laser scanning (ALS), this research models tree biomass using national forest inventory data and linear least-square regression and applies the model both inside and outside of forest to create a nationwide model for tree biomass (above ground and below ground). Validation of the tree biomass model against TOF data within settlement areas shows relatively low model performance (R 2 of 0.44) but still a considerable improvement on current biomass estimates used for greenhouse gas inventory and carbon accounting. We demonstrate an efficient and easily implementable approach to modelling tree biomass across a large heterogeneous nationwide area. The model offers significant opportunity for improved estimates on land use combination categories (CC) where tree biomass has either not been included or only roughly estimated until now. The ALS biomass model also offers the advantage of providing greater spatial resolution and greater within CC spatial variability compared to the current nationwide estimates.
Sah, Jay P.; Ross, Michael S.; Snyder, James R.; Ogurcak, Danielle E.
2010-01-01
In fire-dependent forests, managers are interested in predicting the consequences of prescribed burning on postfire tree mortality. We examined the effects of prescribed fire on tree mortality in Florida Keys pine forests, using a factorial design with understory type, season, and year of burn as factors. We also used logistic regression to model the effects of burn season, fire severity, and tree dimensions on individual tree mortality. Despite limited statistical power due to problems in carrying out the full suite of planned experimental burns, associations with tree and fire variables were observed. Post-fire pine tree mortality was negatively correlated with tree size and positively correlated with char height and percent crown scorch. Unlike post-fire mortality, tree mortality associated with storm surge from Hurricane Wilma was greater in the large size classes. Due to their influence on population structure and fuel dynamics, the size-selective mortality patterns following fire and storm surge have practical importance for using fire as a management tool in Florida Keys pinelands in the future, particularly when the threats to their continued existence from tropical storms and sea level rise are expected to increase.
Min, Xiaobo; Wang, Yangyang; Chai, Liyuan; Yang, Zhihui; Liao, Qi
2017-09-01
To explore how heavy metal contamination in Chromite Ore Processing Residue (COPR) disposal sites determine the dissimilarities of indigenous microbial communities, 16S rRNA gene MiSeq sequencing and advanced statistical methods were applied. 13 soil samples were collected from three COPR disposal sites in Mouding of southwestern, Shangnan of northwestern and Yima of central China. The results of analyses of variance (ANOVA), similarities (ANOSIM), and non-metric multidimensional scaling (NMDS) showed that the structural diversity of the microbial communities in the samples with high total chromium (Cr) content (more than 300 mg kg -1 ; High group) were significantly lesser than in the Low group (less than 90 mg kg -1 ) regardless of their geographical distribution. But their diversity had virtually rehabilitated under the pressures of long-term metal contamination. Furthermore, the similarity percentage (SIMPER) analysis indicated that the major dissimilarity contributors Micrococcaceae, Delftia, and Streptophyta, possibly having Cr(VI)-resistant and/or Cr(VI)-reducing capability, were dominant in the High group, while Ramlibacter and Gemmatimonas with potential resistances to other heavy metals were prevalent in the Low group. In addition, the multivariate regression tree (MRT), aggregated boosted tree (ABT), and Mantel test revealed that total Cr content affiliated with Cr(VI) was the principal factor shaping the dissimilarities between the soil microbial communities in the COPR sites. Our findings provide a deep insight of the influence of these heavy metals on the microbial communities in the COPR disposal sites and will facilitate bioremediation on such site. Copyright © 2017 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Althuwaynee, Omar F.; Pradhan, Biswajeet; Ahmad, Noordin
2014-06-01
This article uses methodology based on chi-squared automatic interaction detection (CHAID), as a multivariate method that has an automatic classification capacity to analyse large numbers of landslide conditioning factors. This new algorithm was developed to overcome the subjectivity of the manual categorization of scale data of landslide conditioning factors, and to predict rainfall-induced susceptibility map in Kuala Lumpur city and surrounding areas using geographic information system (GIS). The main objective of this article is to use CHi-squared automatic interaction detection (CHAID) method to perform the best classification fit for each conditioning factor, then, combining it with logistic regression (LR). LR model was used to find the corresponding coefficients of best fitting function that assess the optimal terminal nodes. A cluster pattern of landslide locations was extracted in previous study using nearest neighbor index (NNI), which were then used to identify the clustered landslide locations range. Clustered locations were used as model training data with 14 landslide conditioning factors such as; topographic derived parameters, lithology, NDVI, land use and land cover maps. Pearson chi-squared value was used to find the best classification fit between the dependent variable and conditioning factors. Finally the relationship between conditioning factors were assessed and the landslide susceptibility map (LSM) was produced. An area under the curve (AUC) was used to test the model reliability and prediction capability with the training and validation landslide locations respectively. This study proved the efficiency and reliability of decision tree (DT) model in landslide susceptibility mapping. Also it provided a valuable scientific basis for spatial decision making in planning and urban management studies.
Pattern of oral-maxillofacial trauma from violence against women and its associated factors.
da Nóbrega, Lorena Marques; Bernardino, Ítalo de Macedo; Barbosa, Kevan Guilherme Nóbrega; E Silva, Jéssica Antoniana Lira; Massoni, Andreza Cristina de Lima Targino; d'Avila, Sérgio
2017-06-01
Violence against women is a global public health problem. The aim of this study was to characterize the profile of women victims of violence and identify factors associated with maxillofacial injuries. A cross-sectional study was performed based on an evaluation of 884 medico-legal and social records of women victims of physical aggression treated at the Center of Forensic Medicine and Dentistry in Brazil. The variables investigated were related to the sociodemographic characteristics of victims, circumstances of aggressions, and patterns of trauma. Descriptive and multivariate statistics using decision tree analysis by the Chi-squared automatic interaction detector (CHAID) algorithm, as well as univariate and multivariate Poisson regression analyses were performed. The occurrence of maxillofacial trauma was 46.4%. The mean age of victims was 29.38 (SD=12.55 years). Based on decision tree, the profile of violence against women can be explained by the aggressor's gender (P<.001) and sociodemographic characteristics of victims, such as marital status (P=.001), place of residence (P=.019), and educational level (P=.014). Based on the final Poisson regression model, women living in suburban areas were more likely to suffer maxillofacial trauma (PR=1.752; CI 95%=1.153-2.662; P=.009) compared to those living in rural areas. Moreover, aggression using a weapon resulted in a lower occurrence of maxillofacial trauma (PR=0.476; CI 95%=0.284-0.799; P=.005) compared to cases of aggression using physical force. The prevalence of oral-maxillofacial trauma was high, and the main associated factors were place of residence and mechanism of aggression. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Wright, Jeremy J; David, Solomon R; Near, Thomas J
2012-06-01
Extant gars represent the remaining members of a formerly diverse assemblage of ancient ray-finned fishes and have been the subject of multiple phylogenetic analyses using morphological data. Here, we present the first hypothesis of phylogenetic relationships among living gar species based on molecular data, through the examination of gene tree heterogeneity and coalescent species tree analyses of a portion of one mitochondrial (COI) and seven nuclear (ENC1, myh6, plagl2, S7 ribosomal protein intron 1, sreb2, tbr1, and zic1) genes. Individual gene trees displayed varying degrees of resolution with regards to species-level relationships, and the gene trees inferred from COI and the S7 intron were the only two that were completely resolved. Coalescent species tree analyses of nuclear genes resulted in a well-resolved and strongly supported phylogenetic tree of living gar species, for which Bayesian posterior node support was further improved by the inclusion of the mitochondrial gene. Species-level relationships among gars inferred from our molecular data set were highly congruent with previously published morphological phylogenies, with the exception of the placement of two species, Lepisosteus osseus and L. platostomus. Re-examination of the character coding used by previous authors provided partial resolution of this topological discordance, resulting in broad concordance in the phylogenies inferred from individual genes, the coalescent species tree analysis, and morphology. The completely resolved phylogeny inferred from the molecular data set with strong Bayesian posterior support at all nodes provided insights into the potential for introgressive hybridization and patterns of allopatric speciation in the evolutionary history of living gars, as well as a solid foundation for future examinations of functional diversification and evolutionary stasis in a "living fossil" lineage. Copyright © 2012 Elsevier Inc. All rights reserved.
Utilizing random forests imputation of forest plot data for landscape-level wildfire analyses
Karin L. Riley; Isaac C. Grenfell; Mark A. Finney; Nicholas L. Crookston
2014-01-01
Maps of the number, size, and species of trees in forests across the United States are desirable for a number of applications. For landscape-level fire and forest simulations that use the Forest Vegetation Simulator (FVS), a spatial tree-level dataset, or âtree listâ, is a necessity. FVS is widely used at the stand level for simulating fire effects on tree mortality,...
Singh, Gyanendra; Sachdeva, S N; Pal, Mahesh
2016-11-01
This work examines the application of M5 model tree and conventionally used fixed/random effect negative binomial (FENB/RENB) regression models for accident prediction on non-urban sections of highway in Haryana (India). Road accident data for a period of 2-6 years on different sections of 8 National and State Highways in Haryana was collected from police records. Data related to road geometry, traffic and road environment related variables was collected through field studies. Total two hundred and twenty two data points were gathered by dividing highways into sections with certain uniform geometric characteristics. For prediction of accident frequencies using fifteen input parameters, two modeling approaches: FENB/RENB regression and M5 model tree were used. Results suggest that both models perform comparably well in terms of correlation coefficient and root mean square error values. M5 model tree provides simple linear equations that are easy to interpret and provide better insight, indicating that this approach can effectively be used as an alternative to RENB approach if the sole purpose is to predict motor vehicle crashes. Sensitivity analysis using M5 model tree also suggests that its results reflect the physical conditions. Both models clearly indicate that to improve safety on Indian highways minor accesses to the highways need to be properly designed and controlled, the service roads to be made functional and dispersion of speeds is to be brought down. Copyright © 2016 Elsevier Ltd. All rights reserved.
Presence of indicator plant species as a predictor of wetland vegetation integrity
Stapanian, Martin A.; Adams, Jean V.; Gara, Brian
2013-01-01
We fit regression and classification tree models to vegetation data collected from Ohio (USA) wetlands to determine (1) which species best predict Ohio vegetation index of biotic integrity (OVIBI) score and (2) which species best predict high-quality wetlands (OVIBI score >75). The simplest regression tree model predicted OVIBI score based on the occurrence of three plant species: skunk-cabbage (Symplocarpus foetidus), cinnamon fern (Osmunda cinnamomea), and swamp rose (Rosa palustris). The lowest OVIBI scores were best predicted by the absence of the selected plant species rather than by the presence of other species. The simplest classification tree model predicted high-quality wetlands based on the occurrence of two plant species: skunk-cabbage and marsh-fern (Thelypteris palustris). The overall misclassification rate from this tree was 13 %. Again, low-quality wetlands were better predicted than high-quality wetlands by the absence of selected species rather than the presence of other species using the classification tree model. Our results suggest that a species’ wetland status classification and coefficient of conservatism are of little use in predicting wetland quality. A simple, statistically derived species checklist such as the one created in this study could be used by field biologists to quickly and efficiently identify wetland sites likely to be regulated as high-quality, and requiring more intensive field assessments. Alternatively, it can be used for advanced determinations of low-quality wetlands. Agencies can save considerable money by screening wetlands for the presence/absence of such “indicator” species before issuing permits.
Automatic energy expenditure measurement for health science.
Catal, Cagatay; Akbulut, Akhan
2018-04-01
It is crucial to predict the human energy expenditure in any sports activity and health science application accurately to investigate the impact of the activity. However, measurement of the real energy expenditure is not a trivial task and involves complex steps. The objective of this work is to improve the performance of existing estimation models of energy expenditure by using machine learning algorithms and several data from different sensors and provide this estimation service in a cloud-based platform. In this study, we used input data such as breathe rate, and hearth rate from three sensors. Inputs are received from a web form and sent to the web service which applies a regression model on Azure cloud platform. During the experiments, we assessed several machine learning models based on regression methods. Our experimental results showed that our novel model which applies Boosted Decision Tree Regression in conjunction with the median aggregation technique provides the best result among other five regression algorithms. This cloud-based energy expenditure system which uses a web service showed that cloud computing technology is a great opportunity to develop estimation systems and the new model which applies Boosted Decision Tree Regression with the median aggregation provides remarkable results. Copyright © 2018 Elsevier B.V. All rights reserved.
Tenero, David; Green, Justin A; Goyal, Navin
2015-10-01
Tafenoquine (TQ), a new 8-aminoquinoline with activity against all stages of the Plasmodium vivax life cycle, is being developed for the radical cure of acute P. vivax malaria in combination with chloroquine. The efficacy and exposure data from a pivotal phase 2b dose-ranging study were used to conduct exposure-response analyses for TQ after administration to subjects with P. vivax malaria. TQ exposure (i.e., area under the concentration-time curve [AUC]) and region (Thailand compared to Peru and Brazil) were found to be statistically significant predictors of clinical response based on multivariate logistic regression analyses. After accounting for region/country, the odds of being relapse free at 6 months increased by approximately 51% (95% confidence intervals [CI], 25%, 82%) for each 25-U increase in AUC above the median value of 54.5 μg · h/ml. TQ exposure was also a significant predictor of the time to relapse of the infection. The final parametric, time-to-event model for the time to relapse, included a Weibull distribution hazard function, AUC, and country as covariates. Based on the model, the risk of relapse decreased by 30% (95% CI, 17% to 42%) for every 25-U increase in AUC. Monte Carlo simulations indicated that the 300-mg dose of TQ would provide an AUC greater than the clinically relevant breakpoint obtained in a classification and regression tree (CART) analysis (56.4 μg · h/ml) in more than 90% of subjects and consequently result in a high probability of being relapse free at 6 months. This model-based approach was critical in selecting an appropriate phase 3 dose. (This study has been registered at ClinicalTrials.gov under registration no. NCT01376167.). Copyright © 2015, American Society for Microbiology. All Rights Reserved.
Green, Justin A.; Goyal, Navin
2015-01-01
Tafenoquine (TQ), a new 8-aminoquinoline with activity against all stages of the Plasmodium vivax life cycle, is being developed for the radical cure of acute P. vivax malaria in combination with chloroquine. The efficacy and exposure data from a pivotal phase 2b dose-ranging study were used to conduct exposure-response analyses for TQ after administration to subjects with P. vivax malaria. TQ exposure (i.e., area under the concentration-time curve [AUC]) and region (Thailand compared to Peru and Brazil) were found to be statistically significant predictors of clinical response based on multivariate logistic regression analyses. After accounting for region/country, the odds of being relapse free at 6 months increased by approximately 51% (95% confidence intervals [CI], 25%, 82%) for each 25-U increase in AUC above the median value of 54.5 μg · h/ml. TQ exposure was also a significant predictor of the time to relapse of the infection. The final parametric, time-to-event model for the time to relapse, included a Weibull distribution hazard function, AUC, and country as covariates. Based on the model, the risk of relapse decreased by 30% (95% CI, 17% to 42%) for every 25-U increase in AUC. Monte Carlo simulations indicated that the 300-mg dose of TQ would provide an AUC greater than the clinically relevant breakpoint obtained in a classification and regression tree (CART) analysis (56.4 μg · h/ml) in more than 90% of subjects and consequently result in a high probability of being relapse free at 6 months. This model-based approach was critical in selecting an appropriate phase 3 dose. (This study has been registered at ClinicalTrials.gov under registration no. NCT01376167.) PMID:26248362
Does Forest Continuity Enhance the Resilience of Trees to Environmental Change?
von Oheimb, Goddert; Härdtle, Werner; Eckstein, Dieter; Engelke, Hans-Hermann; Hehnke, Timo; Wagner, Bettina; Fichtner, Andreas
2014-01-01
There is ample evidence that continuously existing forests and afforestations on previously agricultural land differ with regard to ecosystem functions and services such as carbon sequestration, nutrient cycling and biodiversity. However, no studies have so far been conducted on possible long-term (>100 years) impacts on tree growth caused by differences in the ecological continuity of forest stands. In the present study we analysed the variation in tree-ring width of sessile oak (Quercus petraea (Matt.) Liebl.) trees (mean age 115-136 years) due to different land-use histories (continuously existing forests, afforestations both on arable land and on heathland). We also analysed the relation of growth patterns to soil nutrient stores and to climatic parameters (temperature, precipitation). Tree rings formed between 1896 and 2005 were widest in trees afforested on arable land. This can be attributed to higher nitrogen and phosphorous availability and indicates that former fertilisation may continue to affect the nutritional status of forest soils for more than one century after those activities have ceased. Moreover, these trees responded more strongly to environmental changes - as shown by a higher mean sensitivity of the tree-ring widths - than trees of continuously existing forests. However, the impact of climatic parameters on the variability in tree-ring width was generally small, but trees on former arable land showed the highest susceptibility to annually changing climatic conditions. We assume that incompletely developed humus horizons as well as differences in the edaphon are responsible for the more sensitive response of oak trees of recent forests (former arable land and former heathland) to variation in environmental conditions. We conclude that forests characterised by a long ecological continuity may be better adapted to global change than recent forest ecosystems.
Does Forest Continuity Enhance the Resilience of Trees to Environmental Change?
von Oheimb, Goddert; Härdtle, Werner; Eckstein, Dieter; Engelke, Hans-Hermann; Hehnke, Timo; Wagner, Bettina; Fichtner, Andreas
2014-01-01
There is ample evidence that continuously existing forests and afforestations on previously agricultural land differ with regard to ecosystem functions and services such as carbon sequestration, nutrient cycling and biodiversity. However, no studies have so far been conducted on possible long-term (>100 years) impacts on tree growth caused by differences in the ecological continuity of forest stands. In the present study we analysed the variation in tree-ring width of sessile oak (Quercus petraea (Matt.) Liebl.) trees (mean age 115–136 years) due to different land-use histories (continuously existing forests, afforestations both on arable land and on heathland). We also analysed the relation of growth patterns to soil nutrient stores and to climatic parameters (temperature, precipitation). Tree rings formed between 1896 and 2005 were widest in trees afforested on arable land. This can be attributed to higher nitrogen and phosphorous availability and indicates that former fertilisation may continue to affect the nutritional status of forest soils for more than one century after those activities have ceased. Moreover, these trees responded more strongly to environmental changes – as shown by a higher mean sensitivity of the tree-ring widths – than trees of continuously existing forests. However, the impact of climatic parameters on the variability in tree-ring width was generally small, but trees on former arable land showed the highest susceptibility to annually changing climatic conditions. We assume that incompletely developed humus horizons as well as differences in the edaphon are responsible for the more sensitive response of oak trees of recent forests (former arable land and former heathland) to variation in environmental conditions. We conclude that forests characterised by a long ecological continuity may be better adapted to global change than recent forest ecosystems. PMID:25494042
TreeScaper: Visualizing and Extracting Phylogenetic Signal from Sets of Trees.
Huang, Wen; Zhou, Guifang; Marchand, Melissa; Ash, Jeremy R; Morris, David; Van Dooren, Paul; Brown, Jeremy M; Gallivan, Kyle A; Wilgenbusch, Jim C
2016-12-01
Modern phylogenomic analyses often result in large collections of phylogenetic trees representing uncertainty in individual gene trees, variation across genes, or both. Extracting phylogenetic signal from these tree sets can be challenging, as they are difficult to visualize, explore, and quantify. To overcome some of these challenges, we have developed TreeScaper, an application for tree set visualization as well as the identification of distinct phylogenetic signals. GUI and command-line versions of TreeScaper and a manual with tutorials can be downloaded from https://github.com/whuang08/TreeScaper/releases TreeScaper is distributed under the GNU General Public License. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Mukherjee, Arideep; Agrawal, Madhoolika
2018-05-15
Responses of urban vegetation to air pollution stress in relation to their tolerance and sensitivity have been extensively studied, however, studies related to air pollution responses based on different leaf functional traits and tree characteristics are limited. In this paper, we have tried to assess combined and individual effects of major air pollutants PM 10 (particulate matter ≤ 10 µm), TSP (total suspended particulate matter), SO 2 (sulphur dioxide), NO 2 (nitrogen dioxide) and O 3 (ozone) on thirteen tropical tree species in relation to fifteen leaf functional traits and different tree characteristics. Stepwise linear regression a general linear modelling approach was used to quantify the pollution response of trees against air pollutants. The study was performed for six successive seasons for two years in three distinct urban areas (traffic, industrial and residential) of Varanasi city in India. At all the study sites, concentrations of air pollutants, specifically PM (particulate matter) and NO 2 were above the specified standards. Distinct variations were recorded in all the fifteen leaf functional traits with pollution load. Caesalpinia sappan was identified as most tolerant species followed by Psidium guajava, Dalbergia sissoo and Albizia lebbeck. Stepwise regression analysis identified maximum response of Eucalyptus citriodora and P. guajava to air pollutants explaining overall 59% and 58% variability's in leaf functional traits, respectively. Among leaf functional traits, maximum effect of air pollutants was observed on non-enzymatic antioxidants followed by photosynthetic pigments and leaf water status. Among the pollutants, PM was identified as the major stress factor followed by O 3 explaining 47% and 33% variability's in leaf functional traits. Tolerance and pollution response were regulated by different tree characteristics such as height, canopy size, leaf from, texture and nature of tree. Outcomes of this study will help in urban forest development by selection of specific pollutant tolerant tree species and leaf traits, which is suitable as air pollution mitigation measure. Copyright © 2018 Elsevier Inc. All rights reserved.
Weighted Statistical Binning: Enabling Statistically Consistent Genome-Scale Phylogenetic Analyses
Bayzid, Md Shamsuzzoha; Mirarab, Siavash; Boussau, Bastien; Warnow, Tandy
2015-01-01
Because biological processes can result in different loci having different evolutionary histories, species tree estimation requires multiple loci from across multiple genomes. While many processes can result in discord between gene trees and species trees, incomplete lineage sorting (ILS), modeled by the multi-species coalescent, is considered to be a dominant cause for gene tree heterogeneity. Coalescent-based methods have been developed to estimate species trees, many of which operate by combining estimated gene trees, and so are called "summary methods". Because summary methods are generally fast (and much faster than more complicated coalescent-based methods that co-estimate gene trees and species trees), they have become very popular techniques for estimating species trees from multiple loci. However, recent studies have established that summary methods can have reduced accuracy in the presence of gene tree estimation error, and also that many biological datasets have substantial gene tree estimation error, so that summary methods may not be highly accurate in biologically realistic conditions. Mirarab et al. (Science 2014) presented the "statistical binning" technique to improve gene tree estimation in multi-locus analyses, and showed that it improved the accuracy of MP-EST, one of the most popular coalescent-based summary methods. Statistical binning, which uses a simple heuristic to evaluate "combinability" and then uses the larger sets of genes to re-calculate gene trees, has good empirical performance, but using statistical binning within a phylogenomic pipeline does not have the desirable property of being statistically consistent. We show that weighting the re-calculated gene trees by the bin sizes makes statistical binning statistically consistent under the multispecies coalescent, and maintains the good empirical performance. Thus, "weighted statistical binning" enables highly accurate genome-scale species tree estimation, and is also statistically consistent under the multi-species coalescent model. New data used in this study are available at DOI: http://dx.doi.org/10.6084/m9.figshare.1411146, and the software is available at https://github.com/smirarab/binning. PMID:26086579
Wilson, Jordan L; Samaranayake, V A; Limmer, Matthew A; Schumacher, John G; Burken, Joel G
2017-12-19
Contaminated sites pose ecological and human-health risks through exposure to contaminated soil and groundwater. Whereas we can readily locate, monitor, and track contaminants in groundwater, it is harder to perform these tasks in the vadose zone. In this study, tree-core samples were collected at a Superfund site to determine if the sample-collection location around a particular tree could reveal the subsurface location, or direction, of soil and soil-gas contaminant plumes. Contaminant-centroid vectors were calculated from tree-core data to reveal contaminant distributions in directional tree samples at a higher resolution, and vectors were correlated with soil-gas characterization collected using conventional methods. Results clearly demonstrated that directional tree coring around tree trunks can indicate gradients in soil and soil-gas contaminant plumes, and the strength of the correlations were directly proportionate to the magnitude of tree-core concentration gradients (spearman's coefficient of -0.61 and -0.55 in soil and tree-core gradients, respectively). Linear regression indicates agreement between the concentration-centroid vectors is significantly affected by in planta and soil concentration gradients and when concentration centroids in soil are closer to trees. Given the existing link between soil-gas and vapor intrusion, this study also indicates that directional tree coring might be applicable in vapor intrusion assessment.
Wilson, Jordan L.; Samaranayake, V.A.; Limmer, Matthew A.; Schumacher, John G.; Burken, Joel G.
2017-01-01
Contaminated sites pose ecological and human-health risks through exposure to contaminated soil and groundwater. Whereas we can readily locate, monitor, and track contaminants in groundwater, it is harder to perform these tasks in the vadose zone. In this study, tree-core samples were collected at a Superfund site to determine if the sample-collection location around a particular tree could reveal the subsurface location, or direction, of soil and soil-gas contaminant plumes. Contaminant-centroid vectors were calculated from tree-core data to reveal contaminant distributions in directional tree samples at a higher resolution, and vectors were correlated with soil-gas characterization collected using conventional methods. Results clearly demonstrated that directional tree coring around tree trunks can indicate gradients in soil and soil-gas contaminant plumes, and the strength of the correlations were directly proportionate to the magnitude of tree-core concentration gradients (spearman’s coefficient of -0.61 and -0.55 in soil and tree-core gradients, respectively). Linear regression indicates agreement between the concentration-centroid vectors is significantly affected by in-planta and soil concentration gradients and when concentration centroids in soil are closer to trees. Given the existing link between soil-gas and vapor intrusion, this study also indicates that directional tree coring might be applicable in vapor intrusion assessment.
NASA Astrophysics Data System (ADS)
Vaglio Laurin, Gaia; Puletti, Nicola; Chen, Qi; Corona, Piermaria; Papale, Dario; Valentini, Riccardo
2016-10-01
Estimates of forest aboveground biomass are fundamental for carbon monitoring and accounting; delivering information at very high spatial resolution is especially valuable for local management, conservation and selective logging purposes. In tropical areas, hosting large biomass and biodiversity resources which are often threatened by unsustainable anthropogenic pressures, frequent forest resources monitoring is needed. Lidar is a powerful tool to estimate aboveground biomass at fine resolution; however its application in tropical forests has been limited, with high variability in the accuracy of results. Lidar pulses scan the forest vertical profile, and can provide structure information which is also linked to biodiversity. In the last decade the remote sensing of biodiversity has received great attention, but few studies focused on the use of lidar for assessing tree species richness in tropical forests. This research aims at estimating aboveground biomass and tree species richness using discrete return airborne lidar in Ghana forests. We tested an advanced statistical technique, Multivariate Adaptive Regression Splines (MARS), which does not require assumptions on data distribution or on the relationships between variables, being suitable for studying ecological variables. We compared the MARS regression results with those obtained by multilinear regression and found that both algorithms were effective, but MARS provided higher accuracy either for biomass (R2 = 0.72) and species richness (R2 = 0.64). We also noted strong correlation between biodiversity and biomass field values. Even if the forest areas under analysis are limited in extent and represent peculiar ecosystems, the preliminary indications produced by our study suggest that instrument such as lidar, specifically useful for pinpointing forest structure, can also be exploited as a support for tree species richness assessment.
Tree failures and accidents in recreation areas: a guide to data management for hazard control
Lee A. Paine; James W. Clarke
1978-01-01
A data management system has been developed for storage and retrieval of tree failure and hazard data, with provision for computer analyses and presentation of results in useful tables. This system emphasizes important relationships between tree characteristics, environmental factors, and the resulting hazard. The analysis programs permit easy selection of subsets of...
Ecological and economic determinants of invasive tree species on Alabama forestland
Anwar Hussain; Changyou Sun; Xiaoping Zhou; Ian A. Munn
2008-01-01
The spread of invasive tree species has caused increasing harm to the environment. This study was motivated by the considerations that earlier studies generally ignored the role of economic factors related to the occurrence and abundance of invasive species, and empirical analyses of invasive trees on forestland have been inadequate. We assessed the impact of...
A comparison of soil carbon dynamics in residential yards with and without trees
USDA-ARS?s Scientific Manuscript database
Residential yards can provide chronosequences to discern the influence of home age and tree biomass on soil carbon (C) levels. To accomplish this goal, two separate analyses were conducted. 1) The relationship of soil C to home age was compared between 23 lawns without trees (‘pure lawns’, PL) and 4...
Growth response of black walnut to interplanted trees
Richard C. Schlesinger; Robert D. Williams
1984-01-01
Analyses of black walnut tree diameters 13 years after planting showed that interplanting autumn-olive, black locust, and European alder increased walnut tree growtb, but only at certain locations. Interplanting autumn-olive resulted in increases of 56 to 351% at four of five locations and all species resulted in doubled walnut growth on an upland site. The interaction...
Red-shouldered hawk nesting habitat preference in south Texas
Strobel, Bradley N.; Boal, Clint W.
2010-01-01
We examined nesting habitat preference by red-shouldered hawks Buteo lineatus using conditional logistic regression on characteristics measured at 27 occupied nest sites and 68 unused sites in 2005–2009 in south Texas. We measured vegetation characteristics of individual trees (nest trees and unused trees) and corresponding 0.04-ha plots. We evaluated the importance of tree and plot characteristics to nesting habitat selection by comparing a priori tree-specific and plot-specific models using Akaike's information criterion. Models with only plot variables carried 14% more weight than models with only center tree variables. The model-averaged odds ratios indicated red-shouldered hawks selected to nest in taller trees and in areas with higher average diameter at breast height than randomly available within the forest stand. Relative to randomly selected areas, each 1-m increase in nest tree height and 1-cm increase in the plot average diameter at breast height increased the probability of selection by 85% and 10%, respectively. Our results indicate that red-shouldered hawks select nesting habitat based on vegetation characteristics of individual trees as well as the 0.04-ha area surrounding the tree. Our results indicate forest management practices resulting in tall forest stands with large average diameter at breast height would benefit red-shouldered hawks in south Texas.
NASA Astrophysics Data System (ADS)
Leighs, J. A.; Halling-Brown, M. D.; Patel, M. N.
2018-03-01
The UK currently has a national breast cancer-screening program and images are routinely collected from a number of screening sites, representing a wealth of invaluable data that is currently under-used. Radiologists evaluate screening images manually and recall suspicious cases for further analysis such as biopsy. Histological testing of biopsy samples confirms the malignancy of the tumour, along with other diagnostic and prognostic characteristics such as disease grade. Machine learning is becoming increasingly popular for clinical image classification problems, as it is capable of discovering patterns in data otherwise invisible. This is particularly true when applied to medical imaging features; however clinical datasets are often relatively small. A texture feature extraction toolkit has been developed to mine a wide range of features from medical images such as mammograms. This study analysed a dataset of 1,366 radiologist-marked, biopsy-proven malignant lesions obtained from the OPTIMAM Medical Image Database (OMI-DB). Exploratory data analysis methods were employed to better understand extracted features. Machine learning techniques including Classification and Regression Trees (CART), ensemble methods (e.g. random forests), and logistic regression were applied to the data to predict the disease grade of the analysed lesions. Prediction scores of up to 83% were achieved; sensitivity and specificity of the models trained have been discussed to put the results into a clinical context. The results show promise in the ability to predict prognostic indicators from the texture features extracted and thus enable prioritisation of care for patients at greatest risk.
Mechanisms behind the estimation of photosynthesis traits from leaf reflectance observations
NASA Astrophysics Data System (ADS)
Dechant, Benjamin; Cuntz, Matthias; Doktor, Daniel; Vohland, Michael
2016-04-01
Many studies have investigated the reflectance-based estimation of leaf chlorophyll, water and dry matter contents of plants. Only few studies focused on photosynthesis traits, however. The maximum potential uptake of carbon dioxide under given environmental conditions is determined mainly by RuBisCO activity, limiting carboxylation, or the speed of photosynthetic electron transport. These two main limitations are represented by the maximum carboxylation capacity, V cmax,25, and the maximum electron transport rate, Jmax,25. These traits were estimated from leaf reflectance before but the mechanisms underlying the estimation remain rather speculative. The aim of this study was therefore to reveal the mechanisms behind reflectance-based estimation of V cmax,25 and Jmax,25. Leaf reflectance, photosynthetic response curves as well as nitrogen content per area, Narea, and leaf mass per area, LMA, were measured on 37 deciduous tree species. V cmax,25 and Jmax,25 were determined from the response curves. Partial Least Squares (PLS) regression models for the two photosynthesis traits V cmax,25 and Jmax,25 as well as Narea and LMA were studied using a cross-validation approach. Analyses of linear regression models based on Narea and other leaf traits estimated via PROSPECT inversion, PLS regression coefficients and model residuals were conducted in order to reveal the mechanisms behind the reflectance-based estimation. We found that V cmax,25 and Jmax,25 can be estimated from leaf reflectance with good to moderate accuracy for a large number of species and different light conditions. The dominant mechanism behind the estimations was the strong relationship between photosynthesis traits and leaf nitrogen content. This was concluded from very strong relationships between PLS regression coefficients, the model residuals as well as the prediction performance of Narea- based linear regression models compared to PLS regression models. While the PLS regression model for V cmax,25 was fully based on the correlation to Narea, the PLS regression model for Jmax,25 was not entirely based on it. Analyses of the contributions of different parts of the reflectance spectrum revealed that the information contributing to the Jmax,25 PLS regression model in addition to the main source of information, Narea, was mainly located in the visible part of the spectrum (500-900 nm). Estimated chlorophyll content could be excluded as potential source of this extra information. The PLS regression coefficients of the Jmax,25 model indicated possible contributions from chlorophyll fluorescence and cytochrome f content. In summary, we found that the main mechanism behind the estimation of V cmax,25 and Jmax,25 from leaf reflectance observations is the correlation to Narea but that there is additional information related to Jmax,25 mainly in the visible part of the spectrum.
Static terrestrial laser scanning of juvenile understory trees for field phenotyping
NASA Astrophysics Data System (ADS)
Wang, Huanhuan; Lin, Yi
2014-11-01
This study was to attempt the cutting-edge 3D remote sensing technique of static terrestrial laser scanning (TLS) for parametric 3D reconstruction of juvenile understory trees. The data for test was collected with a Leica HDS6100 TLS system in a single-scan way. The geometrical structures of juvenile understory trees are extracted by model fitting. Cones are used to model trunks and branches. Principal component analysis (PCA) is adopted to calculate their major axes. Coordinate transformation and orthogonal projection are used to estimate the parameters of the cones. Then, AutoCAD is utilized to simulate the morphological characteristics of the understory trees, and to add secondary branches and leaves in a random way. Comparison of the reference values and the estimated values gives the regression equation and shows that the proposed algorithm of extracting parameters is credible. The results have basically verified the applicability of TLS for field phenotyping of juvenile understory trees.
Spectral analysis of white ash response to emerald ash borer infestations
NASA Astrophysics Data System (ADS)
Calandra, Laura
The emerald ash borer (EAB) (Agrilus planipennis Fairmaire) is an invasive insect that has killed over 50 million ash trees in the US. The goal of this research was to establish a method to identify ash trees infested with EAB using remote sensing techniques at the leaf-level and tree crown level. First, a field-based study at the leaf-level used the range of spectral bands from the WorldView-2 sensor to determine if there was a significant difference between EAB-infested white ash (Fraxinus americana) and healthy leaves. Binary logistic regression models were developed using individual and combinations of wavelengths; the most successful model included 545 and 950 nm bands. The second half of this research employed imagery to identify healthy and EAB-infested trees, comparing pixel- and object-based methods by applying an unsupervised classification approach and a tree crown delineation algorithm, respectively. The pixel-based models attained the highest overall accuracies.
Pashaei, Elnaz; Ozen, Mustafa; Aydin, Nizamettin
2015-08-01
Improving accuracy of supervised classification algorithms in biomedical applications is one of active area of research. In this study, we improve the performance of Particle Swarm Optimization (PSO) combined with C4.5 decision tree (PSO+C4.5) classifier by applying Boosted C5.0 decision tree as the fitness function. To evaluate the effectiveness of our proposed method, it is implemented on 1 microarray dataset and 5 different medical data sets obtained from UCI machine learning databases. Moreover, the results of PSO + Boosted C5.0 implementation are compared to eight well-known benchmark classification methods (PSO+C4.5, support vector machine under the kernel of Radial Basis Function, Classification And Regression Tree (CART), C4.5 decision tree, C5.0 decision tree, Boosted C5.0 decision tree, Naive Bayes and Weighted K-Nearest neighbor). Repeated five-fold cross-validation method was used to justify the performance of classifiers. Experimental results show that our proposed method not only improve the performance of PSO+C4.5 but also obtains higher classification accuracy compared to the other classification methods.
Lundström, T; Jonas, T; Volkwein, A
2008-01-01
Thirteen Norway spruce [Picea abies (L.) Karst.] trees of different size, age, and social status, and grown under varying conditions, were investigated to see how they react to complex natural static loading under summer and winter conditions, and how they have adapted their growth to such combinations of load and tree state. For this purpose a non-linear finite-element model and an extensive experimental data set were used, as well as a new formulation describing the degree to which the exploitation of the bending stress capacity is uniform. The three main findings were: material and geometric non-linearities play important roles when analysing tree deflections and critical loads; the strengths of the stem and the anchorage mutually adapt to the local wind acting on the tree crown in the forest canopy; and the radial stem growth follows a mechanically high-performance path because it adapts to prevailing as well as acute seasonal combinations of the tree state (e.g. frozen or unfrozen stem and anchorage) and load (e.g. wind and vertical and lateral snow pressure). Young trees appeared to adapt to such combinations in a more differentiated way than older trees. In conclusion, the mechanical performance of the Norway spruce studied was mostly very high, indicating that their overall growth had been clearly influenced by the external site- and tree-specific mechanical stress.
ERIC Educational Resources Information Center
Schumacher, Phyllis; Olinsky, Alan; Quinn, John; Smith, Richard
2010-01-01
The authors extended previous research by 2 of the authors who conducted a study designed to predict the successful completion of students enrolled in an actuarial program. They used logistic regression to determine the probability of an actuarial student graduating in the major or dropping out. They compared the results of this study with those…
B. Desta Fekedulegn; J.J. Colbert; R.R., Jr. Hicks; Michael E. Schuckers
2002-01-01
The theory and application of principal components regression, a method for coping with multicollinearity among independent variables in analyzing ecological data, is exhibited in detail. A concrete example of the complex procedures that must be carried out in developing a diagnostic growth-climate model is provided. We use tree radial increment data taken from breast...
Marek K. Jakubowksi; Qinghua Guo; Brandon Collins; Scott Stephens; Maggi Kelly
2013-01-01
We compared the ability of several classification and regression algorithms to predict forest stand structure metrics and standard surface fuel models. Our study area spans a dense, topographically complex Sierra Nevada mixed-conifer forest. We used clustering, regression trees, and support vector machine algorithms to analyze high density (average 9 pulses/m
2015-01-01
Among the recent data mining techniques available, the boosting approach has attracted a great deal of attention because of its effective learning algorithm and strong boundaries in terms of its generalization performance. However, the boosting approach has yet to be used in regression problems within the construction domain, including cost estimations, but has been actively utilized in other domains. Therefore, a boosting regression tree (BRT) is applied to cost estimations at the early stage of a construction project to examine the applicability of the boosting approach to a regression problem within the construction domain. To evaluate the performance of the BRT model, its performance was compared with that of a neural network (NN) model, which has been proven to have a high performance in cost estimation domains. The BRT model has shown results similar to those of NN model using 234 actual cost datasets of a building construction project. In addition, the BRT model can provide additional information such as the importance plot and structure model, which can support estimators in comprehending the decision making process. Consequently, the boosting approach has potential applicability in preliminary cost estimations in a building construction project. PMID:26339227
Shin, Yoonseok
2015-01-01
Among the recent data mining techniques available, the boosting approach has attracted a great deal of attention because of its effective learning algorithm and strong boundaries in terms of its generalization performance. However, the boosting approach has yet to be used in regression problems within the construction domain, including cost estimations, but has been actively utilized in other domains. Therefore, a boosting regression tree (BRT) is applied to cost estimations at the early stage of a construction project to examine the applicability of the boosting approach to a regression problem within the construction domain. To evaluate the performance of the BRT model, its performance was compared with that of a neural network (NN) model, which has been proven to have a high performance in cost estimation domains. The BRT model has shown results similar to those of NN model using 234 actual cost datasets of a building construction project. In addition, the BRT model can provide additional information such as the importance plot and structure model, which can support estimators in comprehending the decision making process. Consequently, the boosting approach has potential applicability in preliminary cost estimations in a building construction project.
Can tree species diversity be assessed with Landsat data in a temperate forest?
Arekhi, Maliheh; Yılmaz, Osman Yalçın; Yılmaz, Hatice; Akyüz, Yaşar Feyza
2017-10-28
The diversity of forest trees as an indicator of ecosystem health can be assessed using the spectral characteristics of plant communities through remote sensing data. The objectives of this study were to investigate alpha and beta tree diversity using Landsat data for six dates in the Gönen dam watershed of Turkey. We used richness and the Shannon and Simpson diversity indices to calculate tree alpha diversity. We also represented the relationship between beta diversity and remotely sensed data using species composition similarity and spectral distance similarity of sampling plots via quantile regression. A total of 99 sampling units, each 20 m × 20 m, were selected using geographically stratified random sampling method. Within each plot, the tree species were identified, and all of the trees with a diameter at breast height (dbh) larger than 7 cm were measured. Presence/absence and abundance data (tree species number and tree species basal area) of tree species were used to determine the relationship between richness and the Shannon and Simpson diversity indices, which were computed with ground field data, and spectral variables derived (2 × 2 pixels and 3 × 3 pixels) from Landsat 8 OLI data. The Shannon-Weiner index had the highest correlation. For all six dates, NDVI (normalized difference vegetation index) was the spectral variable most strongly correlated with the Shannon index and the tree diversity variables. The Ratio of green to red (VI) was the spectral variable least correlated with the tree diversity variables and the Shannon basal area. In both beta diversity curves, the slope of the OLS regression was low, while in the upper quantile, it was approximately twice the lower quantiles. The Jaccard index is closed to one with little difference in both two beta diversity approaches. This result is due to increasing the similarity between the sampling plots when they are located close to each other. The intercept differences between two investigated beta diversity were strongly related to the development stage of a number of sampling plots in the tree species basal area method. To obtain beta diversity, the tree basal area method indicates better result than the tree species number method at representing similarity of regions which are located close together. In conclusion, NDVI is helpful for estimating the alpha diversity of trees over large areas when the vegetation is at the maximum growing season. Beta diversity could be obtained with the spectral heterogeneity of Landsat data. Future tree diversity studies using remote sensing data should select data sets when vegetation is at the maximum growing season. Also, forest tree diversity investigations can be identified by using higher-resolution remote sensing data such as ESA Sentinel 2 data which is freely available since June 2015.
Arano, Ichiro; Sugimoto, Tomoyuki; Hamasaki, Toshimitsu; Ohno, Yuko
2010-04-23
Survival analysis methods such as the Kaplan-Meier method, log-rank test, and Cox proportional hazards regression (Cox regression) are commonly used to analyze data from randomized withdrawal studies in patients with major depressive disorder. However, unfortunately, such common methods may be inappropriate when a long-term censored relapse-free time appears in data as the methods assume that if complete follow-up were possible for all individuals, each would eventually experience the event of interest. In this paper, to analyse data including such a long-term censored relapse-free time, we discuss a semi-parametric cure regression (Cox cure regression), which combines a logistic formulation for the probability of occurrence of an event with a Cox proportional hazards specification for the time of occurrence of the event. In specifying the treatment's effect on disease-free survival, we consider the fraction of long-term survivors and the risks associated with a relapse of the disease. In addition, we develop a tree-based method for the time to event data to identify groups of patients with differing prognoses (cure survival CART). Although analysis methods typically adapt the log-rank statistic for recursive partitioning procedures, the method applied here used a likelihood ratio (LR) test statistic from a fitting of cure survival regression assuming exponential and Weibull distributions for the latency time of relapse. The method is illustrated using data from a sertraline randomized withdrawal study in patients with major depressive disorder. We concluded that Cox cure regression reveals facts on who may be cured, and how the treatment and other factors effect on the cured incidence and on the relapse time of uncured patients, and that cure survival CART output provides easily understandable and interpretable information, useful both in identifying groups of patients with differing prognoses and in utilizing Cox cure regression models leading to meaningful interpretations.
Developing a dengue forecast model using machine learning: A case study in China.
Guo, Pi; Liu, Tao; Zhang, Qin; Wang, Li; Xiao, Jianpeng; Zhang, Qingying; Luo, Ganfeng; Li, Zhihao; He, Jianfeng; Zhang, Yonghui; Ma, Wenjun
2017-10-01
In China, dengue remains an important public health issue with expanded areas and increased incidence recently. Accurate and timely forecasts of dengue incidence in China are still lacking. We aimed to use the state-of-the-art machine learning algorithms to develop an accurate predictive model of dengue. Weekly dengue cases, Baidu search queries and climate factors (mean temperature, relative humidity and rainfall) during 2011-2014 in Guangdong were gathered. A dengue search index was constructed for developing the predictive models in combination with climate factors. The observed year and week were also included in the models to control for the long-term trend and seasonality. Several machine learning algorithms, including the support vector regression (SVR) algorithm, step-down linear regression model, gradient boosted regression tree algorithm (GBM), negative binomial regression model (NBM), least absolute shrinkage and selection operator (LASSO) linear regression model and generalized additive model (GAM), were used as candidate models to predict dengue incidence. Performance and goodness of fit of the models were assessed using the root-mean-square error (RMSE) and R-squared measures. The residuals of the models were examined using the autocorrelation and partial autocorrelation function analyses to check the validity of the models. The models were further validated using dengue surveillance data from five other provinces. The epidemics during the last 12 weeks and the peak of the 2014 large outbreak were accurately forecasted by the SVR model selected by a cross-validation technique. Moreover, the SVR model had the consistently smallest prediction error rates for tracking the dynamics of dengue and forecasting the outbreaks in other areas in China. The proposed SVR model achieved a superior performance in comparison with other forecasting techniques assessed in this study. The findings can help the government and community respond early to dengue epidemics.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yakir, D.; Gat, J.; Issar, A.
1994-08-01
The isotopic ratios [sup 13]C/[sup 12]C and [sup 18]O/[sup 16]O of cellulose from tamarix trees which were used by the Roman army as a groundwork of the siege-rampart of Masada (AD 70-73) were compared with ratios measured in present-day tamarix trees growing in the Masada region and in central Israel. The ancient tamarix cellulose is depleted in both [sup 13]C and [sup 18]O compared to cellulose from trees growing in the Masada region today. Similar trends were observed on comparing modern tamarix trees growing in the Negev Desert with those growing in the temperate climate of central Israel. Considering themore » factors that can contribute to the observed changes in isotopic composition, the authors conclude that the ancient trees enjoyed less arid environmental conditions during their growth compared to contemporary trees in this desert region. This report demonstrates the potential in using combined [sup 18]O and [sup 13]C analyses of archeological plant material as independent indication of regional climate change in desert areas (where conventional isotopic analyses, such as in tree rings, are impractical).« less
Trees grow on money: urban tree canopy cover and environmental justice.
Schwarz, Kirsten; Fragkias, Michail; Boone, Christopher G; Zhou, Weiqi; McHale, Melissa; Grove, J Morgan; O'Neil-Dunne, Jarlath; McFadden, Joseph P; Buckley, Geoffrey L; Childers, Dan; Ogden, Laura; Pincetl, Stephanie; Pataki, Diane; Whitmer, Ali; Cadenasso, Mary L
2015-01-01
This study examines the distributional equity of urban tree canopy (UTC) cover for Baltimore, MD, Los Angeles, CA, New York, NY, Philadelphia, PA, Raleigh, NC, Sacramento, CA, and Washington, D.C. using high spatial resolution land cover data and census data. Data are analyzed at the Census Block Group levels using Spearman's correlation, ordinary least squares regression (OLS), and a spatial autoregressive model (SAR). Across all cities there is a strong positive correlation between UTC cover and median household income. Negative correlations between race and UTC cover exist in bivariate models for some cities, but they are generally not observed using multivariate regressions that include additional variables on income, education, and housing age. SAR models result in higher r-square values compared to the OLS models across all cities, suggesting that spatial autocorrelation is an important feature of our data. Similarities among cities can be found based on shared characteristics of climate, race/ethnicity, and size. Our findings suggest that a suite of variables, including income, contribute to the distribution of UTC cover. These findings can help target simultaneous strategies for UTC goals and environmental justice concerns.
Improving ensemble decision tree performance using Adaboost and Bagging
NASA Astrophysics Data System (ADS)
Hasan, Md. Rajib; Siraj, Fadzilah; Sainin, Mohd Shamrie
2015-12-01
Ensemble classifier systems are considered as one of the most promising in medical data classification and the performance of deceision tree classifier can be increased by the ensemble method as it is proven to be better than single classifiers. However, in a ensemble settings the performance depends on the selection of suitable base classifier. This research employed two prominent esemble s namely Adaboost and Bagging with base classifiers such as Random Forest, Random Tree, j48, j48grafts and Logistic Model Regression (LMT) that have been selected independently. The empirical study shows that the performance varries when different base classifiers are selected and even some places overfitting issue also been noted. The evidence shows that ensemble decision tree classfiers using Adaboost and Bagging improves the performance of selected medical data sets.
Su, Xiaogang; Peña, Annette T; Liu, Lei; Levine, Richard A
2018-04-29
Assessing heterogeneous treatment effects is a growing interest in advancing precision medicine. Individualized treatment effects (ITEs) play a critical role in such an endeavor. Concerning experimental data collected from randomized trials, we put forward a method, termed random forests of interaction trees (RFIT), for estimating ITE on the basis of interaction trees. To this end, we propose a smooth sigmoid surrogate method, as an alternative to greedy search, to speed up tree construction. The RFIT outperforms the "separate regression" approach in estimating ITE. Furthermore, standard errors for the estimated ITE via RFIT are obtained with the infinitesimal jackknife method. We assess and illustrate the use of RFIT via both simulation and the analysis of data from an acupuncture headache trial. Copyright © 2018 John Wiley & Sons, Ltd.
Decision Tree Approach for Soil Liquefaction Assessment
Gandomi, Amir H.; Fridline, Mark M.; Roke, David A.
2013-01-01
In the current study, the performances of some decision tree (DT) techniques are evaluated for postearthquake soil liquefaction assessment. A database containing 620 records of seismic parameters and soil properties is used in this study. Three decision tree techniques are used here in two different ways, considering statistical and engineering points of view, to develop decision rules. The DT results are compared to the logistic regression (LR) model. The results of this study indicate that the DTs not only successfully predict liquefaction but they can also outperform the LR model. The best DT models are interpreted and evaluated based on an engineering point of view. PMID:24489498
Decision tree approach for soil liquefaction assessment.
Gandomi, Amir H; Fridline, Mark M; Roke, David A
2013-01-01
In the current study, the performances of some decision tree (DT) techniques are evaluated for postearthquake soil liquefaction assessment. A database containing 620 records of seismic parameters and soil properties is used in this study. Three decision tree techniques are used here in two different ways, considering statistical and engineering points of view, to develop decision rules. The DT results are compared to the logistic regression (LR) model. The results of this study indicate that the DTs not only successfully predict liquefaction but they can also outperform the LR model. The best DT models are interpreted and evaluated based on an engineering point of view.
Yang, Yizi; Luo, Damin
2013-01-01
Inferring evolutionary history of parasitism genes is important to understand how evolutionary mechanisms affect the occurrences of parasitism genes. In this study, we constructed multiple domain trees for parasitism genes and genes under free-living conditions. Further analyses of horizontal gene transfer (HGT)-like phylogenetic incongruences, duplications, and speciations were performed based on these trees. By comparing these analyses, the contributions of pre-adaptations were found to be more important to the evolution of parasitism genes than those of duplications, and pre-adaptations are as crucial as previously reported HGTs to parasitism. Furthermore, speciation may also affect the evolution of parasitism genes. In addition, Pristionchus pacificus was suggested to be a common model organism for studies of parasitic nematodes, including root-knot species. These analyses provided information regarding mechanisms that may have contributed to the evolution of parasitism genes.
Field responses of Prunus serotina and Asclepias syriaca to ozone around southern Lake Michigan
Bennett, J.P.; Jepsen, E.A.; Roth, J.A.
2006-01-01
Higher ozone concentrations east of southern Lake Michigan compared to west of the lake were used to test hypotheses about injury and growth effects on two plant species. We measured approximately 1000 black cherry trees and over 3000 milkweed stems from 1999 to 2001 for this purpose. Black cherry branch elongation and milkweed growth and pod formation were significantly higher west of Lake Michigan while ozone injury was greater east of Lake Michigan. Using classification and regression tree (CART) analyses we determined that departures from normal precipitation, soil nitrogen and ozone exposure/peak hourly concentrations were the most important variables affecting cherry branch elongation, and milkweed stem height and pod formation. The effects of ozone were not consistently comparable with the effects of soil nutrients, weather, insect or disease injury, and depended on species. Ozone SUM06 exposures greater than 13 ppm-h decreased cherry branch elongation 18%; peak 1-h exposures greater than 93 ppb reduced milkweed stem height 13%; and peak 1-h concentrations greater than 98 ppb reduced pod formation 11% in milkweed. Decreased cherry branch elongation, milkweed stem height and pod production, and foliar injury on both species occurred at sites around southern Lake Michigan at ozone exposures of 13 SUM06 ppm-h and 93a??98 ppb peak hourly.
Effects of street tree shade on asphalt concrete pavement performance
E.G. McPherson; J. Muchnick
2005-01-01
Forty-eight street segments were paired into 24 high-and low-shade pairs in Modesto, California, U.S. Field data were collected to calculate a Pavement Condition Index (PCI) and Tree Shade Index (TSI) for each segment. Statistical analyses found that greater PCI was associated with greater TSI, indicating that tree shade was partially responsible for reduced pavement...
Simmons, Mark P; Goloboff, Pablo A
2013-10-01
Empirical and simulated examples are used to demonstrate an artifact caused by undersampling optimal trees in data matrices that consist mostly or entirely of locally sampled (as opposed to globally, for most or all terminals) characters. The artifact is that unsupported clades consisting entirely of terminals scored for the same locally sampled partition may be resolved and assigned high resampling support-despite their being properly unsupported (i.e., not resolved in the strict consensus of all optimal trees). This artifact occurs despite application of random-addition sequences for stepwise terminal addition. The artifact is not necessarily obviated with thorough conventional branch swapping methods (even tree-bisection-reconnection) when just a single tree is held, as is sometimes implemented in parsimony bootstrap pseudoreplicates, and in every GARLI, PhyML, and RAxML pseudoreplicate and search for the most likely tree for the matrix as a whole. Hence GARLI, RAxML, and PhyML-based likelihood results require extra scrutiny, particularly when they provide high resolution and support for clades that are entirely unsupported by methods that perform more thorough searches, as in most parsimony analyses. Copyright © 2013 Elsevier Inc. All rights reserved.
Using existing case-mix methods to fund trauma cases.
Monakova, Julia; Blais, Irene; Botz, Charles; Chechulin, Yuriy; Picciano, Gino; Basinski, Antoni
2010-01-01
Policymakers frequently face the need to increase funding in isolated and frequently heterogeneous (clinically and in terms of resource consumption) patient subpopulations. This article presents a methodologic solution for testing the appropriateness of using existing grouping and weighting methodologies for funding subsets of patients in the scenario where a case-mix approach is preferable to a flat-rate based payment system. Using as an example the subpopulation of trauma cases of Ontario lead trauma hospitals, the statistical techniques of linear and nonlinear regression models, regression trees, and spline models were applied to examine the fit of the existing case-mix groups and reference weights for the trauma cases. The analyses demonstrated that for funding Ontario trauma cases, the existing case-mix systems can form the basis for rational and equitable hospital funding, decreasing the need to develop a different grouper for this subset of patients. This study confirmed that Injury Severity Score is a poor predictor of costs for trauma patients. Although our analysis used the Canadian case-mix classification system and cost weights, the demonstrated concept of using existing case-mix systems to develop funding rates for specific subsets of patient populations may be applicable internationally.
Obstructive sleep apnea severity estimation: Fusion of speech-based systems.
Ben Or, D; Dafna, E; Tarasiuk, A; Zigel, Y
2016-08-01
Obstructive sleep apnea (OSA) is a common sleep-related breathing disorder. Previous studies associated OSA with anatomical abnormalities of the upper respiratory tract that may be reflected in the acoustic characteristics of speech. We tested the hypothesis that the speech signal carries essential information that can assist in early assessment of OSA severity by estimating apnea-hypopnea index (AHI). 198 men referred to routine polysomnography (PSG) were recorded shortly prior to sleep onset while reading a one-minute speech protocol. The different parts of the speech recordings, i.e., sustained vowels, short-time frames of fluent speech, and the speech recording as a whole, underwent separate analyses, using sustained vowels features, short-term features, and long-term features, respectively. Applying support vector regression and regression trees, these features were used in order to estimate AHI. The fusion of the outputs of the three subsystems resulted in a diagnostic agreement of 67.3% between the speech-estimated AHI and the PSG-determined AHI, and an absolute error rate of 10.8 events/hr. Speech signal analysis may assist in the estimation of AHI, thus allowing the development of a noninvasive tool for OSA screening.
Multivariate prediction of upper limb prosthesis acceptance or rejection.
Biddiss, Elaine A; Chau, Tom T
2008-07-01
To develop a model for prediction of upper limb prosthesis use or rejection. A questionnaire exploring factors in prosthesis acceptance was distributed internationally to individuals with upper limb absence through community-based support groups and rehabilitation hospitals. A total of 191 participants (59 prosthesis rejecters and 132 prosthesis wearers) were included in this study. A logistic regression model, a C5.0 decision tree, and a radial basis function neural network were developed and compared in terms of sensitivity (prediction of prosthesis rejecters), specificity (prediction of prosthesis wearers), and overall cross-validation accuracy. The logistic regression and neural network provided comparable overall accuracies of approximately 84 +/- 3%, specificity of 93%, and sensitivity of 61%. Fitting time-frame emerged as the predominant predictor. Individuals fitted within two years of birth (congenital) or six months of amputation (acquired) were 16 times more likely to continue prosthesis use. To increase rates of prosthesis acceptance, clinical directives should focus on timely, client-centred fitting strategies and the development of improved prostheses and healthcare for individuals with high-level or bilateral limb absence. Multivariate analyses are useful in determining the relative importance of the many factors involved in prosthesis acceptance and rejection.
Groenendijk, Peter; van der Sleen, Peter; Vlam, Mart; Bunyavejchewin, Sarayudh; Bongers, Frans; Zuidema, Pieter A
2015-10-01
The important role of tropical forests in the global carbon cycle makes it imperative to assess changes in their carbon dynamics for accurate projections of future climate-vegetation feedbacks. Forest monitoring studies conducted over the past decades have found evidence for both increasing and decreasing growth rates of tropical forest trees. The limited duration of these studies restrained analyses to decadal scales, and it is still unclear whether growth changes occurred over longer time scales, as would be expected if CO2 -fertilization stimulated tree growth. Furthermore, studies have so far dealt with changes in biomass gain at forest-stand level, but insights into species-specific growth changes - that ultimately determine community-level responses - are lacking. Here, we analyse species-specific growth changes on a centennial scale, using growth data from tree-ring analysis for 13 tree species (~1300 trees), from three sites distributed across the tropics. We used an established (regional curve standardization) and a new (size-class isolation) growth-trend detection method and explicitly assessed the influence of biases on the trend detection. In addition, we assessed whether aggregated trends were present within and across study sites. We found evidence for decreasing growth rates over time for 8-10 species, whereas increases were noted for two species and one showed no trend. Additionally, we found evidence for weak aggregated growth decreases at the site in Thailand and when analysing all sites simultaneously. The observed growth reductions suggest deteriorating growth conditions, perhaps due to warming. However, other causes cannot be excluded, such as recovery from large-scale disturbances or changing forest dynamics. Our findings contrast growth patterns that would be expected if elevated CO2 would stimulate tree growth. These results suggest that commonly assumed growth increases of tropical forests may not occur, which could lead to erroneous predictions of carbon dynamics of tropical forest under climate change. © 2015 John Wiley & Sons Ltd.
Coloured Logic Petri Nets and analysis of their reachable trees
NASA Astrophysics Data System (ADS)
Wang, Jing; Du, YuYue; Yu, ShuXia
2015-11-01
Logic Petri nets (LPNs) can describe and analyse the batch processing function and passing value indeterminacy in cooperative systems, and alleviate the state space explosion problem. However, the indeterminate data of logical output transitions cannot be described explicitly in LPNs. Therefore, Coloured Logic Petri nets (CLPNs) are defined in this paper. It can determine the indeterminate data of logic output transitions in LPNs, i.e., the indeterminate data can be represented definitely in CLPNs. A vector matching method is proposed to judge the enabling transitions and analyse CLPNs. From the marking equation and the proposed reachable tree generation algorithm of CLPNs, a reachable tree can be built, and reachable markings are calculated. The advantage of CLPNs can be shown based on the number of leaf nodes of the reachability tree, and CLPNs can solve the indeterminate data of logical output transitions. Finally, an example shows that CLPNs can further reduce the dimensionality of reachable markings.
Sah, Jay P.; Ross, Michael S.; Snyder, James R.; ...
2010-01-01
In fire-dependent forests, managers are interested in predicting the consequences of prescribed burning on postfire tree mortality. We examined the effects of prescribed fire on tree mortality in Florida Keys pine forests, using a factorial design with understory type, season, and year of burn as factors. We also used logistic regression to model the effects of burn season, fire severity, and tree dimensions on individual tree mortality. Despite limited statistical power due to problems in carrying out the full suite of planned experimental burns, associations with tree and fire variables were observed. Post-fire pine tree mortality was negatively correlated withmore » tree size and positively correlated with char height and percent crown scorch. Unlike post-fire mortality, tree mortality associated with storm surge from Hurricane Wilma was greater in the large size classes. Due to their influence on population structure and fuel dynamics, the size-selective mortality patterns following fire and storm surge have practical importance for using fire as a management tool in Florida Keys pinelands in the future, particularly when the threats to their continued existence from tropical storms and sea level rise are expected to increase.« less
Assessing the predictive capability of randomized tree-based ensembles in streamflow modelling
NASA Astrophysics Data System (ADS)
Galelli, S.; Castelletti, A.
2013-02-01
Combining randomization methods with ensemble prediction is emerging as an effective option to balance accuracy and computational efficiency in data-driven modeling. In this paper we investigate the prediction capability of extremely randomized trees (Extra-Trees), in terms of accuracy, explanation ability and computational efficiency, in a streamflow modeling exercise. Extra-Trees are a totally randomized tree-based ensemble method that (i) alleviates the poor generalization property and tendency to overfitting of traditional standalone decision trees (e.g. CART); (ii) is computationally very efficient; and, (iii) allows to infer the relative importance of the input variables, which might help in the ex-post physical interpretation of the model. The Extra-Trees potential is analyzed on two real-world case studies (Marina catchment (Singapore) and Canning River (Western Australia)) representing two different morphoclimatic contexts comparatively with other tree-based methods (CART and M5) and parametric data-driven approaches (ANNs and multiple linear regression). Results show that Extra-Trees perform comparatively well to the best of the benchmarks (i.e. M5) in both the watersheds, while outperforming the other approaches in terms of computational requirement when adopted on large datasets. In addition, the ranking of the input variable provided can be given a physically meaningful interpretation.
Fokkema, M; Smits, N; Zeileis, A; Hothorn, T; Kelderman, H
2017-10-25
Identification of subgroups of patients for whom treatment A is more effective than treatment B, and vice versa, is of key importance to the development of personalized medicine. Tree-based algorithms are helpful tools for the detection of such interactions, but none of the available algorithms allow for taking into account clustered or nested dataset structures, which are particularly common in psychological research. Therefore, we propose the generalized linear mixed-effects model tree (GLMM tree) algorithm, which allows for the detection of treatment-subgroup interactions, while accounting for the clustered structure of a dataset. The algorithm uses model-based recursive partitioning to detect treatment-subgroup interactions, and a GLMM to estimate the random-effects parameters. In a simulation study, GLMM trees show higher accuracy in recovering treatment-subgroup interactions, higher predictive accuracy, and lower type II error rates than linear-model-based recursive partitioning and mixed-effects regression trees. Also, GLMM trees show somewhat higher predictive accuracy than linear mixed-effects models with pre-specified interaction effects, on average. We illustrate the application of GLMM trees on an individual patient-level data meta-analysis on treatments for depression. We conclude that GLMM trees are a promising exploratory tool for the detection of treatment-subgroup interactions in clustered datasets.
Forward modeling of tree-ring data: a case study with a global network
NASA Astrophysics Data System (ADS)
Breitenmoser, P. D.; Frank, D.; Brönnimann, S.
2012-04-01
Information derived from tree-rings is one of the most powerful tools presently available for studying past climatic variability as well as identifying fundamental relationships between tree-growth and climate. Climate reconstructions are typically performed by extending linear relationships, established during the overlapping period of instrumental and climate proxy archives into the past. Such analyses, however, are limited by methodological assumptions, including stationarity and linearity of the climate-proxy relationship. We investigate climate and tree-ring data using the Vaganov-Shashkin-Lite (VS-Lite) forward model of tree-ring width formation to examine the relations among actual tree growth and climate (as inferred from the simulated chronologies) to reconstruct past climate variability. The VS-lite model has been shown to produce skill comparable to that achieved using classical dendrochronological statistical modeling techniques when applied on simulations of a network of North American tree-ring chronologies. Although the detailed mechanistic processes such as photosynthesis, storage, or cell processes are not modeled directly, the net effect of the dominating nonlinear climatic controls on tree-growth are implemented into the model by the principle of limiting factors and threshold growth response functions. The VS-lite model requires as inputs only latitude, monthly mean temperature and monthly accumulated precipitation. Hence, this simple, process-based model enables ring-width simulation at any location where monthly climate records exist. In this study, we analyse the growth response of simulated tree-rings to monthly climate conditions obtained from the 20th century reanalysis project back to 1871. These simulated tree-ring chronologies are compared to the climate-driven variability in worldwide observed tree-ring chronologies from the International Tree Ring Database. Results point toward the suitability of the relationship among actual tree growth and climate (as inferred from the simulated chronologies) for use in global palaeoclimate reconstructions.
NASA Astrophysics Data System (ADS)
Perone, A.; Lombardi, F.; Marchetti, M.; Tognetti, R.; Lasserre, B.
2016-10-01
Tree rings reveal climatic variations through years, but also the effect of solar activity in influencing the climate on a large scale. In order to investigate the role of solar cycles on climatic variability and to analyse their influences on tree growth, we focused on tree-ring chronologies of Araucaria angustifolia and Araucaria araucana in four study areas: Irati and Curitiba in Brazil, Caviahue in Chile, and Tolhuaca in Argentina. We obtained an average tree-ring chronology of 218, 117, 439, and 849 years for these areas, respectively. Particularly, the older chronologies also included the period of the Maunder and Dalton minima. To identify periodicities and trends observable in tree growth, the time series were analysed using spectral, wavelet and cross-wavelet techniques. Analysis based on the Multitaper method of annual growth rates identified 2 cycles with periodicities of 11 (Schwebe cycle) and 5.5 years (second harmonic of Schwebe cycle). In the Chilean and Argentinian sites, significant agreement between the time series of tree rings and the 11-year solar cycle was found during the periods of maximum solar activity. Results also showed oscillation with periods of 2-7 years, probably induced by local environmental variations, and possibly also related to the El-Niño events. Moreover, the Morlet complex wavelet analysis was applied to study the most relevant variability factors affecting tree-ring time series. Finally, we applied the cross-wavelet spectral analysis to evaluate the time lags between tree-ring and sunspot-number time series, as well as for the interaction between tree rings, the Southern Oscillation Index (SOI) and temperature and precipitation. Trees sampled in Chile and Argentina showed more evident responses of fluctuations in tree-ring time series to the variations of short and long periodicities in comparison with the Brazilian ones. These results provided new evidence on the solar activity-climate pattern-tree ring connections over centuries.
Özge, C; Toros, F; Bayramkaya, E; Çamdeviren, H; Şaşmaz, T
2006-01-01
Background The purpose of this study is to evaluate the most important sociodemographic factors on smoking status of high school students using a broad randomised epidemiological survey. Methods Using in‐class, self administered questionnaire about their sociodemographic variables and smoking behaviour, a representative sample of total 3304 students of preparatory, 9th, 10th, and 11th grades, from 22 randomly selected schools of Mersin, were evaluated and discriminative factors have been determined using appropriate statistics. In addition to binary logistic regression analysis, the study evaluated combined effects of these factors using classification and regression tree methodology, as a new statistical method. Results The data showed that 38% of the students reported lifetime smoking and 16.9% of them reported current smoking with a male predominancy and increasing prevalence by age. Second hand smoking was reported at a 74.3% frequency with father predominance (56.6%). The significantly important factors that affect current smoking in these age groups were increased by household size, late birth rank, certain school types, low academic performance, increased second hand smoking, and stress (especially reported as separation from a close friend or because of violence at home). Classification and regression tree methodology showed the importance of some neglected sociodemographic factors with a good classification capacity. Conclusions It was concluded that, as closely related with sociocultural factors, smoking was a common problem in this young population, generating important academic and social burden in youth life and with increasing data about this behaviour and using new statistical methods, effective coping strategies could be composed. PMID:16891446
2013-01-01
Background Beech bark disease is an insect-fungus complex that damages and often kills American beech trees and has major ecological and economic impacts on forests of the northeastern United States and southeastern Canadian forests. The disease begins when exotic beech scale insects feed on the bark of trees, and is followed by infection of damaged bark tissues by one of the Neonectria species of fungi. Proteomic analysis was conducted of beech bark proteins from diseased trees and healthy trees in areas heavily infested with beech bark disease. All of the diseased trees had signs of Neonectria infection such as cankers or fruiting bodies. In previous tests reported elsewhere, all of the diseased trees were demonstrated to be susceptible to the scale insect and all of the healthy trees were demonstrated to be resistant to the scale insect. Sixteen trees were sampled from eight geographically isolated stands, the sample consisting of 10 healthy (scale-resistant) and 6 diseased/infested (scale-susceptible) trees. Results Proteins were extracted from each tree and analysed in triplicate by isoelectric focusing followed by denaturing gel electrophoresis. Gels were stained and protein spots identified and intensity quantified, then a statistical model was fit to identify significant differences between trees. A subset of BBD differential proteins were analysed by mass spectrometry and matched to known protein sequences for identification. Identified proteins had homology to stress, insect, and pathogen related proteins in other plant systems. Protein spots significantly different in diseased and healthy trees having no stand or disease-by-stand interaction effects were identified. Conclusions Further study of these proteins should help to understand processes critical to resistance to beech bark disease and to develop biomarkers for use in tree breeding programs and for the selection of resistant trees prior to or in early stages of BBD development in stands. Early identification of resistant trees (prior to the full disease development in an area) will allow forest management through the removal of susceptible trees and their root-sprouts prior to the onset of disease, allowing management and mitigation of costs, economic impact, and impacts on ecological systems and services. PMID:23317283
Adams, Temitope F; Wongchai, Chatchawal; Chaidee, Anchalee; Pfeiffer, Wolfgang
2016-01-01
Plant essential oils have been suggested as a promising alternative to the established mosquito repellent DEET (N,N-diethyl-meta-toluamide). Searching for an assay with generally available equipment, we designed a new audiovisual assay of repellent activity against mosquitoes "Singing in the Tube," testing single mosquitoes in Drosophila cultivation tubes. Statistics with regression analysis should compensate for limitations of simple hardware. The assay was established with female Culex pipiens mosquitoes in 60 experiments, 120-h audio recording, and 2580 estimations of the distance between mosquito sitting position and the chemical. Correlations between parameters of sitting position, flight activity pattern, and flight tone spectrum were analyzed. Regression analysis of psycho-acoustic data of audio files (dB[A]) used a squared and modified sinus function determining wing beat frequency WBF ± SD (357 ± 47 Hz). Application of logistic regression defined the repelling velocity constant. The repelling velocity constant showed a decreasing order of efficiency of plant essential oils: rosemary (Rosmarinus officinalis), eucalyptus (Eucalyptus globulus), lavender (Lavandula angustifolia), citronella (Cymbopogon nardus), tea tree (Melaleuca alternifolia), clove (Syzygium aromaticum), lemon (Citrus limon), patchouli (Pogostemon cablin), DEET, cedar wood (Cedrus atlantica). In conclusion, we suggest (1) disease vector control (e.g., impregnation of bed nets) by eight plant essential oils with repelling velocity superior to DEET, (2) simple mosquito repellency testing in Drosophila cultivation tubes, (3) automated approaches and room surveillance by generally available audio equipment (dB[A]: ISO standard 226), and (4) quantification of repellent activity by parameters of the audiovisual assay defined by correlation and regression analyses.
NASA Astrophysics Data System (ADS)
Jurasinski, Gerald; Scharnweber, Tobias; Schröder, Christian; Lennartz, Bernd; Bauwe, Andreas
2017-04-01
Tree growth depends, among other factors, largely on the prevailing climatic conditions. Therefore, tree growth patterns are to be expected under climate change. Here, we analyze the tree-ring growth response of three major European tree species to projected future climate across a climatic (mostly precipitation) gradient in northeastern Germany. We used monthly data for temperature, precipitation, and the standardized precipitation evapotranspiration index (SPEI) over multiple time scales (1, 3, 6, 12, and 24 months) to construct models of tree-ring growth for Scots pine (Pinus syl- vestris L.) at three pure stands, and for Common beech (Fagus sylvatica L.) and Pedunculate oak (Quercus robur L.) at three mature mixed stands. The regression models were derived using a two-step approach based on partial least squares regression (PLSR) to extract potentially well explaining variables followed by ordinary least squares regression (OLSR) to consolidate the models to the least number of variables while retaining high explanatory power. The stability of the models was tested with a comprehensive calibration-verification scheme. All models were successfully verified with R2s ranging from 0.21 for the western pine stand to 0.62 for the beech stand in the east. For growth prediction, climate data forecasted until 2100 by the regional climate model WETTREG2010 based on the A1B Intergovernmental Panel on Climate Change (IPCC) emission scenario was used. For beech and oak, growth rates will likely decrease until the end of the 21st century. For pine, modeled growth trends vary and range from a slight growth increase to a weak decrease in growth rates depending on the position along the climatic gradient. The climatic gradient across the study area will possibly affect the future growth of oak with larger growth reductions towards the drier east. For beech, site-specific adaptations seem to override the influence of the climatic gradient. We conclude that in Northeastern Germany Scots pine has great potential to remain resilient to projected climate change without any greater impairment, whereas Common beech and Pedunculate oak will likely face lesser growth under the expected warmer and dryer climate conditions. The results call for an adaptation of forest management to mitigate the negative effects of climate change for beech and oak in the region.
Martin, Michael A; Meyricke, Ramona; O'Neill, Terry; Roberts, Steven
2006-04-20
A critical choice facing breast cancer patients is which surgical treatment--mastectomy or breast conserving surgery (BCS)--is most appropriate. Several studies have investigated factors that impact the type of surgery chosen, identifying features such as place of residence, age at diagnosis, tumor size, socio-economic and racial/ethnic elements as relevant. Such assessment of "propensity" is important in understanding issues such as a reported under-utilisation of BCS among women for whom such treatment was not contraindicated. Using Western Australian (WA) data, we further examine the factors associated with the type of surgical treatment for breast cancer using a classification tree approach. This approach deals naturally with complicated interactions between factors, and so allows flexible and interpretable models for treatment choice to be built that add to the current understanding of this complex decision process. Data was extracted from the WA Cancer Registry on women diagnosed with breast cancer in WA from 1990 to 2000. Subjects' treatment preferences were predicted from covariates using both classification trees and logistic regression. Tumor size was the primary determinant of patient choice, subjects with tumors smaller than 20 mm in diameter preferring BCS. For subjects with tumors greater than 20 mm in diameter factors such as patient age, nodal status, and tumor histology become relevant as predictors of patient choice. Classification trees perform as well as logistic regression for predicting patient choice, but are much easier to interpret for clinical use. The selected tree can inform clinicians' advice to patients.
Cowles, Richard S
2010-10-01
The armored scales Fiorinia externa Ferris and Aspidiotus cryptomeriae Kuwana (Hemiptera: Diaspididae) are increasingly damaging to Christmas tree plantings in southern New England. The systemic insecticide dinotefuran was investigated for selectively suppressing armored scale populations relative to their natural enemies in cooperating growers' fields in 2008 and 2009. Banded soil application of dinotefuran resulted in poor control. However, a dinotefuran spray applied to the basal 25 cm of trunk resulted in its absorption through the bark, translocation to the foliage, and good efficacy. The basal bark spray did not significantly impact the activity of predators Chilocorus stigma (Say) or Cybocephalus nipponicus Enrody-Younga and in 2009 showed a dosage-dependent improvement in the percentage of scales parasitized by Encarsia citrina Craw. A field dosage-response factorial experiment revealed that a 0.25% (vol:vol) addition of a surfactant with dinotefuran did not enhance insecticidal effect. Probit-transformed scale population reduction relative to the untreated check was subjected to linear regression analysis; reduction of scale populations was proportional to the log of insecticide dosage, whereas basal bark spray efficacy declined in proportion to the cube of tree height. The regression equation can be used to optimize dosage relative to tree height. Excellent efficacy resulted from basal bark spray application dates of 28 April (prebud break) to mid-June, but earlier spray timing within that treatment window had fewer crawlers discoloring new growth with their short-lived feeding. A basal bark spray of dinotefuran is well suited for integration with natural enemies to manage armored scales in Christmas tree plantations.
DeWitt, Jamie C; Millsap, Deborah S; Yeager, Ronnie L; Heise, Steve S; Sparks, Daniel W; Henshel, Diane S
2006-02-01
Necropsy-observable cardiac deformities were evaluated from 283 nestling passerines collected from one reference site and five polychlorinated biphenyl (PCB)-contaminated sites around Bloomington and Bedford, Indiana, USA. Hearts were weighed and assessed on relative scales in three dimensions (height, length, and width) and for externally visible deformities. Heart weights normalized to body weight (heart somatic index) were decreased significantly at the more contaminated sites in both house wren (Troglodytes aedon) and tree swallow (Tachycineta bicolor). Heart somatic indices significantly correlated with log PCB concentrations in Carolina chickadee (Parus carolinesis) and tree swallow and with log 2,3,7,8-tetrachlorodibenzo-p-dioxin toxic equivalent values in tree swallow alone. Ventricular length was increased significantly in eastern bluebirds (Sialia sialis) and decreased significantly in Carolina chickadee and tree swallow from contaminated sites versus the reference site. Heart length regressed significantly against the log PCB concentrations (Carolina chickadee and tree swallow) or the square of the PCB concentrations (red-winged blackbird [Agelaius phoeniceus]) in a sibling bird. The deformities that were observed most at the contaminated sites included abnormal tips (pointed, rounded, or flattened), center rolls, macro- and microsurface roughness, ventricular indentations on the ventral or dorsal surface, lateral ventricular notches, visibly thin ventricular walls, and changes in overall heart shape. A pooled heart deformity index regressed significantly against the logged contaminant concentrations for all species except red-winged blackbird. These results indicate that developmental changes in heart morphometrics and shape abnormalities are quantifiable and may be sensitive and useful indicators of PCB-related developmental impacts across many avian species.
Alghamdi, Manal; Al-Mallah, Mouaz; Keteyian, Steven; Brawner, Clinton; Ehrman, Jonathan; Sakr, Sherif
2017-01-01
Machine learning is becoming a popular and important approach in the field of medical research. In this study, we investigate the relative performance of various machine learning methods such as Decision Tree, Naïve Bayes, Logistic Regression, Logistic Model Tree and Random Forests for predicting incident diabetes using medical records of cardiorespiratory fitness. In addition, we apply different techniques to uncover potential predictors of diabetes. This FIT project study used data of 32,555 patients who are free of any known coronary artery disease or heart failure who underwent clinician-referred exercise treadmill stress testing at Henry Ford Health Systems between 1991 and 2009 and had a complete 5-year follow-up. At the completion of the fifth year, 5,099 of those patients have developed diabetes. The dataset contained 62 attributes classified into four categories: demographic characteristics, disease history, medication use history, and stress test vital signs. We developed an Ensembling-based predictive model using 13 attributes that were selected based on their clinical importance, Multiple Linear Regression, and Information Gain Ranking methods. The negative effect of the imbalance class of the constructed model was handled by Synthetic Minority Oversampling Technique (SMOTE). The overall performance of the predictive model classifier was improved by the Ensemble machine learning approach using the Vote method with three Decision Trees (Naïve Bayes Tree, Random Forest, and Logistic Model Tree) and achieved high accuracy of prediction (AUC = 0.92). The study shows the potential of ensembling and SMOTE approaches for predicting incident diabetes using cardiorespiratory fitness data.
NASA Astrophysics Data System (ADS)
Isa, N. A.; Mohd, W. M. N. Wan; Salleh, S. A.; Ooi, M. C. G.
2018-02-01
Matured trees contain high concentration of chlorophyll that encourages the process of photosynthesis. This process produces oxygen as a by-product and releases it into the atmosphere and helps in lowering the ambient temperature. This study attempts to analyse the effect of green area on air surface temperature of the Kuala Lumpur city. The air surface temperatures of two different dates which are, in March 2006 and March 2016 were simulated using the Weather Research and Forecasting (WRF) model. The green area in the city was extracted using the Normalized Difference Vegetation Index (NDVI) from two Landsat satellite images. The relationship between the air surface temperature and the green area were analysed using linear regression models. From the study, it was found that, the green area was significantly affecting the distribution of air temperature within the city. A strong negative correlation was identified through this study which indicated that higher NDVI values tend to have lower air surface temperature distribution within the focus study area. It was also found that, different urban setting in mixed built-up and vegetated areas resulted in different distributions of air surface temperature. Future studies should focus on analysing the air surface temperature within the area of mixed built-up and vegetated area.
Masaki, T; Hata, S; Ide, Y
2015-03-01
In the present study, we analysed the habitat association of tree species in an old-growth temperate forest across all life stages to test theories on the coexistence of tree species in forest communities. An inventory for trees was implemented at a 6-ha plot in Ogawa Forest Reserve for adults, juveniles, saplings and seedlings. Volumetric soil water content (SMC) and light levels were measured in 10-m grids. Relationships between the actual number of stems and environmental variables were determined for 35 major tree species, and the spatial correlations within and among species were analysed. The light level had no statistically significant effect on distribution of saplings and seedlings of any species. In contrast, most species had specific optimal values along the SMC gradient. The optimal values were almost identical in earlier life stages, but were more variable in later life stages among species. However, no effective niche partitioning among the species was apparent even at the adult stage. Furthermore, results of spatial analyses suggest that dispersal limitation was not sufficient to mitigate competition between species. This might result from well-scattered seed distribution via wind and bird dispersal, as well as conspecific density-dependent mortality of seeds and seedlings. Thus, both niche partitioning and dispersal limitation appeared less important for facilitating coexistence of species within this forest than expected in tropical forests. The tree species assembly in this temperate forest might be controlled through a neutral process at the spatial scale tested in this study. © 2014 German Botanical Society and The Royal Botanical Society of the Netherlands.
Zhu, K; Lou, Z; Zhou, J; Ballester, N; Kong, N; Parikh, P
2015-01-01
This article is part of the Focus Theme of Methods of Information in Medicine on "Big Data and Analytics in Healthcare". Hospital readmissions raise healthcare costs and cause significant distress to providers and patients. It is, therefore, of great interest to healthcare organizations to predict what patients are at risk to be readmitted to their hospitals. However, current logistic regression based risk prediction models have limited prediction power when applied to hospital administrative data. Meanwhile, although decision trees and random forests have been applied, they tend to be too complex to understand among the hospital practitioners. Explore the use of conditional logistic regression to increase the prediction accuracy. We analyzed an HCUP statewide inpatient discharge record dataset, which includes patient demographics, clinical and care utilization data from California. We extracted records of heart failure Medicare beneficiaries who had inpatient experience during an 11-month period. We corrected the data imbalance issue with under-sampling. In our study, we first applied standard logistic regression and decision tree to obtain influential variables and derive practically meaning decision rules. We then stratified the original data set accordingly and applied logistic regression on each data stratum. We further explored the effect of interacting variables in the logistic regression modeling. We conducted cross validation to assess the overall prediction performance of conditional logistic regression (CLR) and compared it with standard classification models. The developed CLR models outperformed several standard classification models (e.g., straightforward logistic regression, stepwise logistic regression, random forest, support vector machine). For example, the best CLR model improved the classification accuracy by nearly 20% over the straightforward logistic regression model. Furthermore, the developed CLR models tend to achieve better sensitivity of more than 10% over the standard classification models, which can be translated to correct labeling of additional 400 - 500 readmissions for heart failure patients in the state of California over a year. Lastly, several key predictor identified from the HCUP data include the disposition location from discharge, the number of chronic conditions, and the number of acute procedures. It would be beneficial to apply simple decision rules obtained from the decision tree in an ad-hoc manner to guide the cohort stratification. It could be potentially beneficial to explore the effect of pairwise interactions between influential predictors when building the logistic regression models for different data strata. Judicious use of the ad-hoc CLR models developed offers insights into future development of prediction models for hospital readmissions, which can lead to better intuition in identifying high-risk patients and developing effective post-discharge care strategies. Lastly, this paper is expected to raise the awareness of collecting data on additional markers and developing necessary database infrastructure for larger-scale exploratory studies on readmission risk prediction.
Data mining of tree-based models to analyze freeway accident frequency.
Chang, Li-Yen; Chen, Wen-Chieh
2005-01-01
Statistical models, such as Poisson or negative binomial regression models, have been employed to analyze vehicle accident frequency for many years. However, these models have their own model assumptions and pre-defined underlying relationship between dependent and independent variables. If these assumptions are violated, the model could lead to erroneous estimation of accident likelihood. Classification and Regression Tree (CART), one of the most widely applied data mining techniques, has been commonly employed in business administration, industry, and engineering. CART does not require any pre-defined underlying relationship between target (dependent) variable and predictors (independent variables) and has been shown to be a powerful tool, particularly for dealing with prediction and classification problems. This study collected the 2001-2002 accident data of National Freeway 1 in Taiwan. A CART model and a negative binomial regression model were developed to establish the empirical relationship between traffic accidents and highway geometric variables, traffic characteristics, and environmental factors. The CART findings indicated that the average daily traffic volume and precipitation variables were the key determinants for freeway accident frequencies. By comparing the prediction performance between the CART and the negative binomial regression models, this study demonstrates that CART is a good alternative method for analyzing freeway accident frequencies. By comparing the prediction performance between the CART and the negative binomial regression models, this study demonstrates that CART is a good alternative method for analyzing freeway accident frequencies.
Ramírez, J; Górriz, J M; Segovia, F; Chaves, R; Salas-Gonzalez, D; López, M; Alvarez, I; Padilla, P
2010-03-19
This letter shows a computer aided diagnosis (CAD) technique for the early detection of the Alzheimer's disease (AD) by means of single photon emission computed tomography (SPECT) image classification. The proposed method is based on partial least squares (PLS) regression model and a random forest (RF) predictor. The challenge of the curse of dimensionality is addressed by reducing the large dimensionality of the input data by downscaling the SPECT images and extracting score features using PLS. A RF predictor then forms an ensemble of classification and regression tree (CART)-like classifiers being its output determined by a majority vote of the trees in the forest. A baseline principal component analysis (PCA) system is also developed for reference. The experimental results show that the combined PLS-RF system yields a generalization error that converges to a limit when increasing the number of trees in the forest. Thus, the generalization error is reduced when using PLS and depends on the strength of the individual trees in the forest and the correlation between them. Moreover, PLS feature extraction is found to be more effective for extracting discriminative information from the data than PCA yielding peak sensitivity, specificity and accuracy values of 100%, 92.7%, and 96.9%, respectively. Moreover, the proposed CAD system outperformed several other recently developed AD CAD systems. Copyright 2010 Elsevier Ireland Ltd. All rights reserved.
Accurate Phylogenetic Tree Reconstruction from Quartets: A Heuristic Approach
Reaz, Rezwana; Bayzid, Md. Shamsuzzoha; Rahman, M. Sohel
2014-01-01
Supertree methods construct trees on a set of taxa (species) combining many smaller trees on the overlapping subsets of the entire set of taxa. A ‘quartet’ is an unrooted tree over taxa, hence the quartet-based supertree methods combine many -taxon unrooted trees into a single and coherent tree over the complete set of taxa. Quartet-based phylogeny reconstruction methods have been receiving considerable attentions in the recent years. An accurate and efficient quartet-based method might be competitive with the current best phylogenetic tree reconstruction methods (such as maximum likelihood or Bayesian MCMC analyses), without being as computationally intensive. In this paper, we present a novel and highly accurate quartet-based phylogenetic tree reconstruction method. We performed an extensive experimental study to evaluate the accuracy and scalability of our approach on both simulated and biological datasets. PMID:25117474
Tree-growth analyses to estimate tree species' drought tolerance.
Eilmann, Britta; Rigling, Andreas
2012-02-01
Climate change is challenging forestry management and practices. Among other things, tree species with the ability to cope with more extreme climate conditions have to be identified. However, while environmental factors may severely limit tree growth or even cause tree death, assessing a tree species' potential for surviving future aggravated environmental conditions is rather demanding. The aim of this study was to find a tree-ring-based method suitable for identifying very drought-tolerant species, particularly potential substitute species for Scots pine (Pinus sylvestris L.) in Valais. In this inner-Alpine valley, Scots pine used to be the dominating species for dry forests, but today it suffers from high drought-induced mortality. We investigate the growth response of two native tree species, Scots pine and European larch (Larix decidua Mill.), and two non-native species, black pine (Pinus nigra Arnold) and Douglas fir (Pseudotsuga menziesii Mirb. var. menziesii), to drought. This involved analysing how the radial increment of these species responded to increasing water shortage (abandonment of irrigation) and to increasingly frequent drought years. Black pine and Douglas fir are able to cope with drought better than Scots pine and larch, as they show relatively high radial growth even after irrigation has been stopped and a plastic growth response to drought years. European larch does not seem to be able to cope with these dry conditions as it lacks the ability to recover from drought years. The analysis of trees' short-term response to extreme climate events seems to be the most promising and suitable method for detecting how tolerant a tree species is towards drought. However, combining all the methods used in this study provides a complete picture of how water shortage could limit species.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sarrack, A.G.
The purpose of this report is to document fault tree analyses which have been completed for the Defense Waste Processing Facility (DWPF) safety analysis. Logic models for equipment failures and human error combinations that could lead to flammable gas explosions in various process tanks, or failure of critical support systems were developed for internal initiating events and for earthquakes. These fault trees provide frequency estimates for support systems failures and accidents that could lead to radioactive and hazardous chemical releases both on-site and off-site. Top event frequency results from these fault trees will be used in further APET analyses tomore » calculate accident risk associated with DWPF facility operations. This report lists and explains important underlying assumptions, provides references for failure data sources, and briefly describes the fault tree method used. Specific commitments from DWPF to provide new procedural/administrative controls or system design changes are listed in the ''Facility Commitments'' section. The purpose of the ''Assumptions'' section is to clarify the basis for fault tree modeling, and is not necessarily a list of items required to be protected by Technical Safety Requirements (TSRs).« less
Schelle, E; Rawlins, B G; Lark, R M; Webster, R; Staton, I; McLeod, C W
2008-09-01
We investigated the use of metals accumulated on tree bark for mapping their deposition across metropolitan Sheffield by sampling 642 trees of three common species. Mean concentrations of metals were generally an order of magnitude greater than in samples from a remote uncontaminated site. We found trivially small differences among tree species with respect to metal concentrations on bark, and in subsequent statistical analyses did not discriminate between them. We mapped the concentrations of As, Cd and Ni by lognormal universal kriging using parameters estimated by residual maximum likelihood (REML). The concentrations of Ni and Cd were greatest close to a large steel works, their probable source, and declined markedly within 500 m of it and from there more gradually over several kilometres. Arsenic was much more evenly distributed, probably as a result of locally mined coal burned in domestic fires for many years. Tree bark seems to integrate airborne pollution over time, and our findings show that sampling and analysing it are cost-effective means of mapping and identifying sources.
Conflict amongst chloroplast DNA sequences obscures the phylogeny of a group of Asplenium ferns.
Shepherd, Lara D; Holland, Barbara R; Perrie, Leon R
2008-07-01
A previous study of the relationships amongst three subgroups of the Austral Asplenium ferns found conflicting signal between the two chloroplast loci investigated. Because organelle genomes like those of chloroplasts and mitochondria are thought to be non-recombining, with a single evolutionary history, we sequenced four additional chloroplast loci with the expectation that this would resolve these relationships. Instead, the conflict was only magnified. Although tree-building analyses favoured one of the three possible trees, one of the alternative trees actually had one more supporting site (six versus five) and received greater support in spectral and neighbor-net analyses. Simulations suggested that chance alone was unlikely to produce strong support for two of the possible trees and none for the third. Likelihood permutation tests indicated that the concatenated chloroplast sequence data appeared to have experienced recombination. However, recombination between the chloroplast genomes of different species would be highly atypical, and corollary supporting observations, like chloroplast heteroplasmy, are lacking. Wider taxon sampling clarified the composition of the Austral group, but the conflicting signal meant analyses (e.g., morphological evolution, biogeographic) conditional on a well-supported phylogeny could not be performed.
Measuring (bio)physical tree properties using accelerometers
NASA Astrophysics Data System (ADS)
van Emmerik, Tim; Steele-Dunne, Susan; Hut, Rolf; Gentine, Pierre; Selker, John; van de Giesen, Nick
2017-04-01
Trees play a crucial role in the water, carbon and nitrogen cycle on local, regional and global scales. Understanding the exchange of heat, water, and CO2 between trees and the atmosphere is important to assess the impact of drought, deforestation and climate change. Unfortunately, ground measurements of tree dynamics are often expensive, or difficult due to challenging environments. We demonstrate the potential of measuring (bio)physical properties of trees using robust and affordable acceleration sensors. Tree sway is dependent on e.g. mass and wind energy absorption of the tree. By measuring tree acceleration we can relate the tree motion to external loads (e.g. precipitation), and tree (bio)physical properties (e.g. mass). Using five months of acceleration data of 19 trees in the Brazilian Amazon, we show that the frequency spectrum of tree sway is related to mass, precipitation, and canopy drag. This presentation aims to show the concept of using accelerometers to measure tree dynamics, and we acknowledge that the presented example applications is not an exhaustive list. Further analyses are the scope of current research, and we hope to inspire others to explore additional applications.
Nimbalkar, S D; Jade, S S; Kauthale, V K; Agale, S; Bahulikar, R A
2018-03-01
Madhuca indica provides livelihood to several tribal people in India, where the flowers are used for extraction of sweet juices having multiple applications. Certain trees have more value as judged by the tribal people mainly based on yield and quality performance of the trees, and these trees were selected for the genetic diversity analyses. Genetic diversity of 48 candidate Mahua trees from Etapalli, Dadagaon, and Jawhar, Maharashtra, India, was assessed using ISSR markers. Fourteen ISSR primers revealed a total of 132 polymorphic bands giving overall 92% polymorphism. Genetic diversity, in terms of expected number of alleles (Ne), the observed number of alleles (Na), Nei's genetic diversity (H), and Shannon's information index ( I ) was 1.921, 1.333, 0.211, and 0.337, respectively, and suggested lower genetic diversity. Region wise analysis revealed higher genetic diversity for site Etapalli ( H = 0.206) and lowest at Dhadgaon ( H = 0.140). Etapalli area possesses higher forest cover than Dhadgaon and Jawhar. Additionally, in Dhadgaon and Jawhar M. indica trees are restricted to field bunds; both reasons might contribute to lower genetic diversity in these regions. The dendrogram and the principal coordinate analyses showed no region-specific clustering. The clustering patterns were supported by AMOVA where higher genetic variance was observed within trees and lower variance among regions. Long-distance dispersal and/or higher human interference might be responsible for low diversity and higher genetic variance within the candidate trees.
Lacombe, Guillaume; Valentin, Christian; Sounyafong, Phabvilay; de Rouw, Anneke; Soulileuth, Bounsamai; Silvera, Norbert; Pierret, Alain; Sengtaheuanghoung, Oloth; Ribolzi, Olivier
2018-03-01
In Montane Southeast Asia, deforestation and unsuitable combinations of crops and agricultural practices degrade soils at an unprecedented rate. Typically, smallholder farmers gain income from "available" land by replacing fallow or secondary forest by perennial crops. We aimed to understand how these practices increase or reduce soil erosion. Ten land uses were monitored in Northern Laos during the 2015 monsoon, using local farmers' fields. Experiments included plots of the conventional system (food crops and fallow), and land uses corresponding to new market opportunities (e.g. commercial tree plantations). Land uses were characterized by measuring plant cover and plant mean height per vegetation layer. Recorded meteorological variables included rainfall intensity, throughfall amount, throughfall kinetic energy (TKE), and raindrop size. Runoff coefficient, soil loss, and the percentage areas of soil surface types (free aggregates and gravel; crusts; macro-faunal, vegetal and pedestal features; plant litter) were derived from observations and measurements in 1-m 2 micro-plots. Relationships between these variables were explored with multiple regression analyses. Our results indicate that TKE induces soil crusting and soil loss. By reducing rainfall infiltration, crusted area enhances runoff, which removes and transports soil particles detached by splash over non-crusted areas. TKE is lower under land uses reducing the velocity of raindrops and/or preventing an increase in their size. Optimal vegetation structures combine minimum height of the lowest layer (to reduce drop velocity at ground level) and maximum coverage (to intercept the largest amount of rainfall), as exemplified by broom grass (Thysanolaena latifolia). In contrast, high canopies with large leaves will increase TKE by enlarging raindrops, as exemplified by teak trees (Tectona grandis), unless a protective understorey exists under the trees. Policies that ban the burning of multi-layered vegetation structure under tree plantations should be enforced. Shade-tolerant shrubs and grasses with potential economic return could be promoted as understorey. Copyright © 2017 Elsevier B.V. All rights reserved.
Olivier, Pieter I.; van Aarde, Rudi J.
2017-01-01
The peninsula effect predicts that the number of species should decline from the base of a peninsula to the tip. However, evidence for the peninsula effect is ambiguous, as different analytical methods, study taxa, and variations in local habitat or regional climatic conditions influence conclusions on its presence. We address this uncertainty by using two analytical methods to investigate the peninsula effect in three taxa that occupy different trophic levels: trees, millipedes, and birds. We surveyed 81 tree quadrants, 102 millipede transects, and 152 bird points within 150 km of coastal dune forest that resemble a habitat peninsula along the northeast coast of South Africa. We then used spatial (trend surface analyses) and non-spatial regressions (generalized linear mixed models) to test for the presence of the peninsula effect in each of the three taxa. We also used linear mixed models to test if climate (temperature and precipitation) and/or local habitat conditions (water availability associated with topography and landscape structural variables) could explain gradients in species richness. Non-spatial models suggest that the peninsula effect was present in all three taxa. However, spatial models indicated that only bird species richness declined from the peninsula base to the peninsula tip. Millipede species richness increased near the centre of the peninsula, while tree species richness increased near the tip. Local habitat conditions explained species richness patterns of birds and trees, but not of millipedes, regardless of model type. Our study highlights the idiosyncrasies associated with the peninsula effect—conclusions on the presence of the peninsula effect depend on the analytical methods used and the taxon studied. The peninsula effect might therefore be better suited to describe a species richness pattern where the number of species decline from a broader habitat base to a narrow tip, rather than a process that drives species richness. PMID:28376096
Meng, Fengqun; Cao, Rui; Yang, Dongmei; Niklas, Karl J; Sun, Shucun
2013-07-01
In theory, plants can alter the distribution of leaves along the lengths of their twigs (i.e., within-twig leaf distribution patterns) to optimize light interception in the context of the architectures of their leaves, branches and canopies. We hypothesized that (i) among canopy tree species sharing similar light environments, deciduous trees will have more evenly spaced within-twig leaf distribution patterns compared with evergreen trees (because deciduous species tend to higher metabolic demands than evergreen species and hence require more light), and that (ii) shade-adapted evergreen species will have more evenly spaced patterns compared with sun-adapted evergreen ones (because shade-adapted species are generally light-limited). We tested these hypotheses by measuring morphological traits (i.e., internode length, leaf area, lamina mass per area, LMA; and leaf and twig inclination angles to the horizontal) and physiological traits (i.e., light-saturated net photosynthetic rates, Amax; light saturation points, LSP; and light compensation points, LCP), and calculated the 'evenness' of within-twig leaf distribution patterns as the coefficient of variation (CV; the higher the CV, the less evenly spaced leaves) of within-twig internode length for 9 deciduous canopy tree species, 15 evergreen canopy tree species, 8 shade-adapted evergreen shrub species and 12 sun-adapted evergreen shrub species in a subtropical broad-leaved rainforest in eastern China. Coefficient of variation was positively correlated with large LMA and large leaf and twig inclination angles, which collectively specify a typical trait combination adaptive to low light interception, as indicated by both ordinary regression and phylogenetic generalized least squares analyses. These relationships were also valid within the evergreen tree species group (which had the largest sample size). Consistent with our hypothesis, in the canopy layer, deciduous species (which were characterized by high LCP, LSP and Amax) had more even leaf distribution patterns than evergreen species (which had low LCP, LSP and Amax); shade-adapted evergreen species had more even leaf distribution patterns than sun-adapted evergreen species. We propose that the leaf distribution pattern (i.e., 'evenness' CV, which is an easily measured functional trait) can be used to distinguish among life-forms in communities similar to the one examined in this study.
Paine, T D; Millar, J G; Hanks, L M; Gould, J; Wang, Q; Daane, K; Dahlsten, D L; Mcpherson, E G
2015-12-01
As well as being planted for wind breaks, landscape trees, and fuel wood, eucalypts are also widely used as urban street trees in California. They now are besieged by exotic insect herbivores of four different feeding guilds. The objective of the current analysis was to determine the return on investment from biological control programs that have targeted these pests. Independent estimates of the total number of eucalypt street trees in California ranged from a high of 476,527 trees (based on tree inventories from 135 California cities) to a low of 190,666 trees (based on 49 tree inventories). Based on a survey of 3,512 trees, the estimated mean value of an individual eucalypt was US$5,978. Thus, the total value of eucalypt street trees in California ranged from more than US$1.0 billion to more than US$2.8 billion. Biological control programs that targeted pests of eucalypts in California have cost US$2,663,097 in extramural grants and University of California salaries. Consequently, the return derived from protecting the value of this resource through the biological control efforts, per dollar expended, ranged from US$1,070 for the high estimated number of trees to US$428 for the lower estimate. The analyses demonstrate both the tremendous value of urban street trees, and the benefits that stem from successful biological control programs aimed at preserving these trees. Economic analyses such as this, which demonstrate the substantial rates of return from successful biological control of invasive pests, may play a key role in developing both grass-roots and governmental support for future urban biological control efforts. © The Authors 2015. Published by Oxford University Press on behalf of Entomological Society of America. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Validating automatic semantic annotation of anatomy in DICOM CT images
NASA Astrophysics Data System (ADS)
Pathak, Sayan D.; Criminisi, Antonio; Shotton, Jamie; White, Steve; Robertson, Duncan; Sparks, Bobbi; Munasinghe, Indeera; Siddiqui, Khan
2011-03-01
In the current health-care environment, the time available for physicians to browse patients' scans is shrinking due to the rapid increase in the sheer number of images. This is further aggravated by mounting pressure to become more productive in the face of decreasing reimbursement. Hence, there is an urgent need to deliver technology which enables faster and effortless navigation through sub-volume image visualizations. Annotating image regions with semantic labels such as those derived from the RADLEX ontology can vastly enhance image navigation and sub-volume visualization. This paper uses random regression forests for efficient, automatic detection and localization of anatomical structures within DICOM 3D CT scans. A regression forest is a collection of decision trees which are trained to achieve direct mapping from voxels to organ location and size in a single pass. This paper focuses on comparing automated labeling with expert-annotated ground-truth results on a database of 50 highly variable CT scans. Initial investigations show that regression forest derived localization errors are smaller and more robust than those achieved by state-of-the-art global registration approaches. The simplicity of the algorithm's context-rich visual features yield typical runtimes of less than 10 seconds for a 5123 voxel DICOM CT series on a single-threaded, single-core machine running multiple trees; each tree taking less than a second. Furthermore, qualitative evaluation demonstrates that using the detected organs' locations as index into the image volume improves the efficiency of the navigational workflow in all the CT studies.
Similar Gender Dimorphism in the Costs of Reproduction across the Geographic Range of Fraxinus ornus
Verdú, Miguel; Spanos, Kostas; čaňová, Ingrid; Slobodník, Branko; Paule, Ladislav
2007-01-01
Background and Aims The reproductive costs for individuals with the female function have been hypothesized to be greater than for those with the male function because the allocation unit per female flower is very high due to the necessity to nurture the embryos until seed dispersal occurs, while the male reproductive allocation per flower is lower because it finishes once pollen is shed. Consequently, males may invest more resources in growth than females. This prediction was tested across a wide geographical range in a tree with a dimorphic breeding system (Fraxinus ornus) consisting of males and hermaphrodites functioning as females. The contrasting ecological conditions found across the geographical range allowed the evaluation of the hypothesis that the reproductive costs of sexual dimorphism varies with environmental stressors. Methods By using random-effects meta-analysis, the differences in the reproductive and vegetative investment of male and hermaphrodite trees of F. ornus were analysed in 10 populations from the northern (Slovakia), south-eastern (Greece) and south-western (Spain) limits of its European distribution. The variation in gender-dimorphism with environmental stress was analysed by running a meta-regression between these effect sizes and the two environmental stress indicators: one related to temperature (the frost-free period) and another related to water availability (moisture deficit). Key Results Most of the effect sizes showed that males produced more flowers and grew more quickly than hermaphrodites. Gender differences in reproduction and growth were not minimized or maximized under adverse climatic conditions such as short frost-free periods or severe aridity. Conclusions The lower costs of reproduction for F. ornus males allow them to grow more quickly than hermaphrodites, although such differences in sex-specific reproductive costs are not magnified under stressful conditions. PMID:17098751
Schuster, Tanja M.; Setaro, Sabrina D.; Tibbits, Josquin F. G.; Batty, Erin L.; Fowler, Rachael M.; McLay, Todd G. B.; Wilcox, Stephen; Ades, Peter K.
2018-01-01
Previous molecular phylogenetic analyses have resolved the Australian bloodwood eucalypt genus Corymbia (~100 species) as either monophyletic or paraphyletic with respect to Angophora (9–10 species). Here we assess relationships of Corymbia and Angophora using a large dataset of chloroplast DNA sequences (121,016 base pairs; from 90 accessions representing 55 Corymbia and 8 Angophora species, plus 33 accessions of related genera), skimmed from high throughput sequencing of genomic DNA, and compare results with new analyses of nuclear ITS sequences (119 accessions) from previous studies. Maximum likelihood and maximum parsimony analyses of cpDNA resolve well supported trees with most nodes having >95% bootstrap support. These trees strongly reject monophyly of Corymbia, its two subgenera (Corymbia and Blakella), most taxonomic sections (Abbreviatae, Maculatae, Naviculares, Septentrionales), and several species. ITS trees weakly indicate paraphyly of Corymbia (bootstrap support <50% for maximum likelihood, and 71% for parsimony), but are highly incongruent with the cpDNA analyses, in that they support monophyly of both subgenera and some taxonomic sections of Corymbia. The striking incongruence between cpDNA trees and both morphological taxonomy and ITS trees is attributed largely to chloroplast introgression between taxa, because of geographic sharing of chloroplast clades across taxonomic groups. Such introgression has been widely inferred in studies of the related genus Eucalyptus. This is the first report of its likely prevalence in Corymbia and Angophora, but this is consistent with previous morphological inferences of hybridisation between species. Our findings (based on continent-wide sampling) highlight a need for more focussed studies to assess the extent of hybridisation and introgression in the evolutionary history of these genera, and that critical testing of the classification of Corymbia and Angophora requires additional sequence data from nuclear genomes. PMID:29668710
Heterogeneous Compression of Large Collections of Evolutionary Trees.
Matthews, Suzanne J
2015-01-01
Compressing heterogeneous collections of trees is an open problem in computational phylogenetics. In a heterogeneous tree collection, each tree can contain a unique set of taxa. An ideal compression method would allow for the efficient archival of large tree collections and enable scientists to identify common evolutionary relationships over disparate analyses. In this paper, we extend TreeZip to compress heterogeneous collections of trees. TreeZip is the most efficient algorithm for compressing homogeneous tree collections. To the best of our knowledge, no other domain-based compression algorithm exists for large heterogeneous tree collections or enable their rapid analysis. Our experimental results indicate that TreeZip averages 89.03 percent (72.69 percent) space savings on unweighted (weighted) collections of trees when the level of heterogeneity in a collection is moderate. The organization of the TRZ file allows for efficient computations over heterogeneous data. For example, consensus trees can be computed in mere seconds. Lastly, combining the TreeZip compressed (TRZ) file with general-purpose compression yields average space savings of 97.34 percent (81.43 percent) on unweighted (weighted) collections of trees. Our results lead us to believe that TreeZip will prove invaluable in the efficient archival of tree collections, and enables scientists to develop novel methods for relating heterogeneous collections of trees.
Modeling non-linear growth responses to temperature and hydrology in wetland trees
NASA Astrophysics Data System (ADS)
Keim, R.; Allen, S. T.
2016-12-01
Growth responses of wetland trees to flooding and climate variations are difficult to model because they depend on multiple, apparently interacting factors, but are a critical link in hydrological control of wetland carbon budgets. To more generally understand tree growth to hydrological forcing, we modeled non-linear responses of tree ring growth to flooding and climate at sub-annual time steps, using Vaganov-Shashkin response functions. We calibrated the model to six baldcypress tree-ring chronologies from two hydrologically distinct sites in southern Louisiana, and tested several hypotheses of plasticity in wetlands tree responses to interacting environmental variables. The model outperformed traditional multiple linear regression. More importantly, optimized response parameters were generally similar among sites with varying hydrological conditions, suggesting generality to the functions. Model forms that included interacting responses to multiple forcing factors were more effective than were single response functions, indicating the principle of a single limiting factor is not correct in wetlands and both climatic and hydrological variables must be considered in predicting responses to hydrological or climate change.
Leaf drop affects herbivory in oaks.
Pearse, Ian S; Karban, Richard
2013-11-01
Leaf phenology is important to herbivores, but the timing and extent of leaf drop has not played an important role in our understanding of herbivore interactions with deciduous plants. Using phylogenetic general least squares regression, we compared the phenology of leaves of 55 oak species in a common garden with the abundance of leaf miners on those trees. Mine abundance was highest on trees with an intermediate leaf retention index, i.e. trees that lost most, but not all, of their leaves for 2-3 months. The leaves of more evergreen species were more heavily sclerotized, and sclerotized leaves accumulated fewer mines in the summer. Leaves of more deciduous species also accumulated fewer mines in the summer, and this was consistent with the idea that trees reduce overwintering herbivores by shedding leaves. Trees with a later leaf set and slower leaf maturation accumulated fewer herbivores. We propose that both leaf drop and early leaf phenology strongly affect herbivore abundance and select for differences in plant defense. Leaf drop may allow trees to dispose of their herbivores so that the herbivores must recolonize in spring, but trees with the longest leaf retention also have the greatest direct defenses against herbivores.
Ruseva, Tatyana B; Evans, Tom P; Fischer, Burnell C
2015-05-15
This study uses a mail survey of private landowners in the Midwest United States to understand the characteristics of owners who have planted trees or intend to plant trees in the future. The analysis examines what policy tools encourage owners to plant trees, and how policy tools operate across different ownership attributes to promote tree-planting on private lands. Logistic regression results suggest that cost-subsidizing policy tools, such as low-cost and free seedlings, significantly increase the odds of actual and planned reforestation when landowners consider them important for increasing forest cover. Individuals most likely to plant trees, when low-cost seedlings are available and important, are fairly recent (<5 years), college-educated owners who own small parcels (<4 ha) and use the land for recreation. Motivations to reforest were also shaped by owners' planning horizons, connection to the land, previous tree-planting experience, and peer influence. The study has relevance for the design of policy approaches that can encourage private forestation through provision of economic incentives and capacity to private landowners. Copyright © 2015 Elsevier Ltd. All rights reserved.
Schmid, Matthias; Küchenhoff, Helmut; Hoerauf, Achim; Tutz, Gerhard
2016-02-28
Survival trees are a popular alternative to parametric survival modeling when there are interactions between the predictor variables or when the aim is to stratify patients into prognostic subgroups. A limitation of classical survival tree methodology is that most algorithms for tree construction are designed for continuous outcome variables. Hence, classical methods might not be appropriate if failure time data are measured on a discrete time scale (as is often the case in longitudinal studies where data are collected, e.g., quarterly or yearly). To address this issue, we develop a method for discrete survival tree construction. The proposed technique is based on the result that the likelihood of a discrete survival model is equivalent to the likelihood of a regression model for binary outcome data. Hence, we modify tree construction methods for binary outcomes such that they result in optimized partitions for the estimation of discrete hazard functions. By applying the proposed method to data from a randomized trial in patients with filarial lymphedema, we demonstrate how discrete survival trees can be used to identify clinically relevant patient groups with similar survival behavior. Copyright © 2015 John Wiley & Sons, Ltd.
An introduction to tree-structured modeling with application to quality of life data.
Su, Xiaogang; Azuero, Andres; Cho, June; Kvale, Elizabeth; Meneses, Karen M; McNees, M Patrick
2011-01-01
Investigators addressing nursing research are faced increasingly with the need to analyze data that involve variables of mixed types and are characterized by complex nonlinearity and interactions. Tree-based methods, also called recursive partitioning, are gaining popularity in various fields. In addition to efficiency and flexibility in handling multifaceted data, tree-based methods offer ease of interpretation. The aims of this study were to introduce tree-based methods, discuss their advantages and pitfalls in application, and describe their potential use in nursing research. In this article, (a) an introduction to tree-structured methods is presented, (b) the technique is illustrated via quality of life (QOL) data collected in the Breast Cancer Education Intervention study, and (c) implications for their potential use in nursing research are discussed. As illustrated by the QOL analysis example, tree methods generate interesting and easily understood findings that cannot be uncovered via traditional linear regression analysis. The expanding breadth and complexity of nursing research may entail the use of new tools to improve efficiency and gain new insights. In certain situations, tree-based methods offer an attractive approach that help address such needs.