Using Random Forest Models to Predict Organizational Violence
NASA Technical Reports Server (NTRS)
Levine, Burton; Bobashev, Georgly
2012-01-01
We present a methodology to assess the proclivity of an organization to commit violence against nongovernment personnel. We fitted a Random Forest model using the Minority at Risk Organizational Behavior (MAROS) dataset. The MAROS data are longitudinal, so individual observations are not independent. We propose a modification to the standard Random Forest methodology to account for the violation of the independence assumption. We present the results of the model fit and an example of predicting violence for an organization; finally, we present a summary of the forest in a "meta-tree."
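The abstract does not say how the Random Forest procedure was modified. One common way to respect dependence within longitudinal data is to bootstrap whole organizations rather than individual rows, so that all observations of an organization enter or leave a tree's training sample together. The sketch below illustrates only that idea; it is an assumption, not the authors' method, and the function name and organization labels are hypothetical.

```python
import random
from collections import defaultdict


def cluster_bootstrap(org_ids, rng):
    """Return row indices for one bootstrap sample drawn at the organization level."""
    rows_by_org = defaultdict(list)
    for row, org in enumerate(org_ids):
        rows_by_org[org].append(row)
    orgs = list(rows_by_org)
    # Draw organizations with replacement; each drawn organization
    # contributes all of its rows, so its observations are never split.
    drawn = [rng.choice(orgs) for _ in orgs]
    return [row for org in drawn for row in rows_by_org[org]]


# Hypothetical labels: one entry per organization-year observation.
org_ids = ["a", "a", "b", "c", "c", "c"]
sample = cluster_bootstrap(org_ids, random.Random(0))
```

Each tree in a forest would then be grown on one such sample, and organizations that were never drawn would serve as that tree's out-of-bag set.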
Discriminant forest classification method and system
Chen, Barry Y.; Hanley, William G.; Lemmond, Tracy D.; Hiller, Lawrence J.; Knapp, David A.; Mugge, Marshall J.
2012-11-06
A hybrid machine learning methodology and system for classification that combines classical random forest (RF) methodology with discriminant analysis (DA) techniques to provide enhanced classification capability. A DA technique that uses feature measurements of an object to predict its class membership, such as linear discriminant analysis (LDA) or the Anderson-Bahadur linear discriminant technique (AB), is used to split the data at each node in each of its classification trees to train and grow the trees and the forest. When training is finished, a set of n DA-based decision trees of a discriminant forest is produced for use in predicting the classification of new samples of unknown class.
Design of Probabilistic Random Forests with Applications to Anticancer Drug Sensitivity Prediction
Rahman, Raziur; Haider, Saad; Ghosh, Souparno; Pal, Ranadip
2015-01-01
Random forests consisting of an ensemble of regression trees with equal weights are frequently used for design of predictive models. In this article, we consider an extension of the methodology by representing the regression trees in the form of probabilistic trees and analyzing the nature of heteroscedasticity. The probabilistic tree representation allows for analytical computation of confidence intervals (CIs), and the tree weight optimization is expected to provide stricter CIs with comparable performance in mean error. We approached the ensemble of probabilistic trees' prediction from the perspectives of a mixture distribution and as a weighted sum of correlated random variables. We applied our methodology to the drug sensitivity prediction problem on synthetic data and the Cancer Cell Line Encyclopedia dataset and illustrated that tree weights can be selected to reduce the average length of the CI without an increase in mean error. PMID:27081304
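The "weighted sum of correlated random variables" view can be made concrete with the standard identity Var(sum_i w_i Y_i) = sum_i sum_j w_i w_j Cov(Y_i, Y_j). The sketch below evaluates that identity for two sets of tree weights. The covariance matrix and weights are made-up illustrative values, not results from the article, and the sketch says nothing about the effect on mean error.

```python
def ensemble_variance(weights, cov):
    """Variance of sum_i weights[i] * Y_i given the covariance matrix of the Y_i."""
    n = len(weights)
    return sum(weights[i] * weights[j] * cov[i][j]
               for i in range(n) for j in range(n))


# Hypothetical covariance of two tree predictions (tree 2 is much noisier).
cov = [[1.0, 0.5],
       [0.5, 4.0]]

equal = ensemble_variance([0.5, 0.5], cov)   # equal tree weights
tilted = ensemble_variance([0.9, 0.1], cov)  # weight shifted to the less noisy tree
```

With these numbers the equal-weight variance is 1.5 and the tilted-weight variance is 0.94, which is the sense in which a weight choice can shorten an interval whose length scales with the standard deviation.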
Marchese Robinson, Richard L; Palczewska, Anna; Palczewski, Jan; Kidley, Nathan
2017-08-28
The ability to interpret the predictions made by quantitative structure-activity relationships (QSARs) offers a number of advantages. While QSARs built using nonlinear modeling approaches, such as the popular Random Forest algorithm, might sometimes be more predictive than those built using linear modeling approaches, their predictions have been perceived as difficult to interpret. However, a growing number of approaches have been proposed for interpreting nonlinear QSAR models in general and Random Forest in particular. In the current work, we compare the performance of Random Forest to those of two widely used linear modeling approaches: linear Support Vector Machines (SVMs) (or Support Vector Regression (SVR)) and partial least-squares (PLS). We compare their performance in terms of their predictivity as well as the chemical interpretability of the predictions using novel scoring schemes for assessing heat map images of substructural contributions. We critically assess different approaches for interpreting Random Forest models as well as for obtaining predictions from the forest. We assess the models on a large number of widely employed public-domain benchmark data sets corresponding to regression and binary classification problems of relevance to hit identification and toxicology. We conclude that Random Forest typically yields comparable or possibly better predictive performance than the linear modeling approaches and that its predictions may also be interpreted in a chemically and biologically meaningful way. In contrast to earlier work looking at interpretation of nonlinear QSAR models, we directly compare two methodologically distinct approaches for interpreting Random Forest models. The approaches for interpreting Random Forest assessed in our article were implemented using open-source programs that we have made available to the community. 
These programs are the rfFC package ( https://r-forge.r-project.org/R/?group_id=1725 ) for the R statistical programming language and the Python program HeatMapWrapper [ https://doi.org/10.5281/zenodo.495163 ] for heat map generation.
Marino, S R; Lin, S; Maiers, M; Haagenson, M; Spellman, S; Klein, J P; Binkowski, T A; Lee, S J; van Besien, K
2012-02-01
The identification of important amino acid substitutions associated with low survival in hematopoietic cell transplantation (HCT) is hampered by the large number of observed substitutions compared with the small number of patients available for analysis. Random forest analysis is designed to address these limitations. We studied 2107 HCT recipients with good or intermediate risk hematological malignancies to identify HLA class I amino acid substitutions associated with reduced survival at day 100 post transplant. Random forest analysis and traditional univariate and multivariate analyses were used. Random forest analysis identified amino acid substitutions in 33 positions that were associated with reduced 100 day survival, including HLA-A 9, 43, 62, 63, 76, 77, 95, 97, 114, 116, 152, 156, 166 and 167; HLA-B 97, 109, 116 and 156; and HLA-C 6, 9, 11, 14, 21, 66, 77, 80, 95, 97, 99, 116, 156, 163 and 173. In all, 13 had been previously reported by other investigators using classical biostatistical approaches. Using the same data set, traditional multivariate logistic regression identified only five amino acid substitutions associated with lower day 100 survival. Random forest analysis is a novel statistical methodology for analysis of HLA mismatching and outcome studies, capable of identifying important amino acid substitutions missed by other methods.
VT0005 In Action: National Forest Biomass Inventory Using Airborne Lidar Sampling
NASA Astrophysics Data System (ADS)
Saatchi, S. S.; Xu, L.; Meyer, V.; Ferraz, A.; Yang, Y.; Shapiro, A.; Bastin, J. F.
2016-12-01
Tropical countries are required to produce robust and verifiable estimates of forest carbon stocks for successful implementation of climate change mitigation. Lack of systematic national inventory data, due to access, cost, and infrastructure, has limited the capacity of most tropical countries to accurately report GHG emissions to the international community. Here, we report on the development of the aboveground forest carbon (AGC) map of the Democratic Republic of Congo (DRC) by using the VCS (Verified Carbon Standard) methodology developed by Sassan Saatchi (VT0005) using high-resolution airborne LiDAR samples. The methodology provides the distribution of the carbon stocks in aboveground live trees of more than 150 million ha of forests at 1-ha spatial resolution in DRC using more than 430,000 ha of systematic random airborne LiDAR inventory samples of forest structure. We developed a LiDAR aboveground biomass allometry using more than 100 1-ha plots across forest types and a power-law model with LiDAR height metrics and average landscape-scale wood density. The methodology provided estimates of forest biomass over the entire country using two approaches: 1) mean, variance, and total carbon estimates for each forest type present in DRC using inventory statistical techniques, and 2) a wall-to-wall map of the forest biomass extrapolated using satellite radar (ALOS PALSAR), surface topography from SRTM, and spectral information from Landsat (TM) and machine learning algorithms. We present the methodology, the estimates of carbon stocks and the spatial uncertainty over the entire country.
Acknowledgements: The theoretical research was carried out partially at the Jet Propulsion Laboratory, California Institute of Technology, under a contract with the National Aeronautics and Space Administration, and the design and implementation in the Democratic Republic of Congo was carried out at the Institute of Environment and Sustainability at University of California Los Angeles through the support of the International Climate Initiative of the German Ministry of Environment, Conservation and Nuclear Security, and the KFW Development Bank.
NASA Astrophysics Data System (ADS)
de Santana, Felipe Bachion; de Souza, André Marcelo; Poppi, Ronei Jesus
2018-02-01
This study evaluates the use of visible and near infrared spectroscopy (Vis-NIRS) combined with multivariate regression based on random forest to quantify some soil quality parameters. The parameters analyzed were soil cation exchange capacity (CEC), sum of exchange bases (SB), organic matter (OM), clay and sand present in the soils of several regions of Brazil. Current methods for evaluating these parameters are laborious and time-consuming, and require various wet analytical methods that are not adequate for use in precision agriculture, where faster and automatic responses are required. The random forest regression models were statistically better than PLS regression models for CEC, OM, clay and sand, demonstrating resistance to overfitting, attenuating the effect of outlier samples and indicating the most important variables for the model. The methodology demonstrates the potential of Vis-NIR as an alternative for determination of CEC, SB, OM, sand and clay, making it possible to develop a fast and automatic analytical procedure.
Devaney, John; Barrett, Brian; Barrett, Frank; Redmond, John; O'Halloran, John
2015-01-01
Quantification of spatial and temporal changes in forest cover is an essential component of forest monitoring programs. Due to its cloud free capability, Synthetic Aperture Radar (SAR) is an ideal source of information on forest dynamics in countries with near-constant cloud-cover. However, few studies have investigated the use of SAR for forest cover estimation in landscapes with highly sparse and fragmented forest cover. In this study, the potential use of L-band SAR for forest cover estimation in two regions (Longford and Sligo) in Ireland is investigated and compared to forest cover estimates derived from three national (Forestry2010, Prime2, National Forest Inventory), one pan-European (Forest Map 2006) and one global forest cover (Global Forest Change) product. Two machine-learning approaches (Random Forests and Extremely Randomised Trees) are evaluated. Both Random Forests and Extremely Randomised Trees classification accuracies were high (98.1–98.5%), with differences between the two classifiers being minimal (<0.5%). Increasing levels of post classification filtering led to a decrease in estimated forest area and an increase in overall accuracy of SAR-derived forest cover maps. All forest cover products were evaluated using an independent validation dataset. For the Longford region, the highest overall accuracy was recorded with the Forestry2010 dataset (97.42%) whereas in Sligo, highest overall accuracy was obtained for the Prime2 dataset (97.43%), although accuracies of SAR-derived forest maps were comparable. Our findings indicate that spaceborne radar could aid inventories in regions with low levels of forest cover in fragmented landscapes. The reduced accuracies observed for the global and pan-continental forest cover maps in comparison to national and SAR-derived forest maps indicate that caution should be exercised when applying these datasets for national reporting. PMID:26262681
Russo, Lucia; Russo, Paola; Siettos, Constantinos I.
2016-01-01
Based on complex network theory, we propose a computational methodology which addresses the spatial distribution of fuel breaks for the inhibition of the spread of wildland fires on heterogeneous landscapes. This is a two-level approach where the dynamics of fire spread are modeled as a random Markov field process on a directed network whose edge weights are determined by a Cellular Automata model that integrates detailed GIS, landscape and meteorological data. Within this framework, the spatial distribution of fuel breaks is reduced to the problem of finding network nodes (small land patches) which favour fire propagation. Here, this is accomplished by exploiting network centrality statistics. We illustrate the proposed approach through (a) an artificial forest of randomly distributed density of vegetation, and (b) a real-world case concerning the island of Rhodes in Greece whose major part of its forest was burned in 2008. Simulation results show that the proposed methodology outperforms the benchmark/conventional policy of fuel reduction as this can be realized by selective harvesting and/or prescribed burning based on the density and flammability of vegetation. Interestingly, our approach reveals that patches with sparse density of vegetation may act as hubs for the spread of the fire. PMID:27780249
Automatic detection of freezing of gait events in patients with Parkinson's disease.
Tripoliti, Evanthia E; Tzallas, Alexandros T; Tsipouras, Markos G; Rigas, George; Bougia, Panagiota; Leontiou, Michael; Konitsiotis, Spiros; Chondrogiorgi, Maria; Tsouli, Sofia; Fotiadis, Dimitrios I
2013-04-01
The aim of this study is to detect freezing of gait (FoG) events in patients suffering from Parkinson's disease (PD) using signals received from wearable sensors (six accelerometers and two gyroscopes) placed on the patients' body. For this purpose, an automated methodology has been developed which consists of four stages. In the first stage, missing values due to signal loss or degradation are replaced and then (second stage) low frequency components of the raw signal are removed. In the third stage, the entropy of the raw signal is calculated. Finally (fourth stage), four classification algorithms have been tested (Naïve Bayes, Random Forests, Decision Trees and Random Tree) in order to detect the FoG events. The methodology has been evaluated using several different configurations of sensors in order to determine the set of sensors which can produce optimal FoG episode detection. Signals were recorded from five healthy subjects, five patients with PD who presented the symptom of FoG and six patients who suffered from PD but did not present FoG events. The signals included 93 FoG events with 405.6 s total duration. The results indicate that the proposed methodology is able to detect FoG events with 81.94% sensitivity, 98.74% specificity, 96.11% accuracy and 98.6% area under curve (AUC) using the signals from all sensors and the Random Forests classification algorithm. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
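The third stage (an entropy feature computed from the signal) can be sketched as follows. The abstract does not state which entropy definition, window length, or binning was used, so the histogram-based Shannon entropy, the `n_bins` parameter, and the function name below are all assumptions made for illustration.

```python
import math
from collections import Counter


def shannon_entropy(window, n_bins=8):
    """Shannon entropy (in bits) of one signal window, using equal-width bins."""
    lo, hi = min(window), max(window)
    if hi == lo:
        return 0.0  # a constant window carries no uncertainty
    width = (hi - lo) / n_bins
    # Map each sample to a bin index; the maximum value goes in the last bin.
    bins = Counter(min(int((x - lo) / width), n_bins - 1) for x in window)
    total = len(window)
    return -sum((c / total) * math.log2(c / total) for c in bins.values())


flat = shannon_entropy([2.0] * 16)              # constant window
spread = shannon_entropy([float(i) for i in range(8)])  # one sample per bin
```

In a pipeline like the one described, one such value per sensor channel and window would form the feature vector passed to the classifier.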
Jarnevich, Catherine S.; Talbert, Marian; Morisette, Jeffrey T.; Aldridge, Cameron L.; Brown, Cynthia; Kumar, Sunil; Manier, Daniel; Talbert, Colin; Holcombe, Tracy R.
2017-01-01
Evaluating the conditions where a species can persist is an important question in ecology both to understand tolerances of organisms and to predict distributions across landscapes. Presence data combined with background or pseudo-absence locations are commonly used with species distribution modeling to develop these relationships. However, there is not a standard method to generate background or pseudo-absence locations, and method choice affects model outcomes. We evaluated combinations of both model algorithms (simple and complex generalized linear models, multivariate adaptive regression splines, Maxent, boosted regression trees, and random forest) and background methods (random, minimum convex polygon, and continuous and binary kernel density estimator (KDE)) to assess the sensitivity of model outcomes to choices made. We evaluated six questions related to model results, including five beyond the common comparison of model accuracy assessment metrics (biological interpretability of response curves, cross-validation robustness, independent data accuracy and robustness, and prediction consistency). For our case study with cheatgrass in the western US, random forest was least sensitive to background choice and the binary KDE method was least sensitive to model algorithm choice. While this outcome may not hold for other locations or species, the methods we used can be implemented to help determine appropriate methodologies for particular research questions.
Stevens, Forrest R; Gaughan, Andrea E; Linard, Catherine; Tatem, Andrew J
2015-01-01
High resolution, contemporary data on human population distributions are vital for measuring impacts of population growth, monitoring human-environment interactions and for planning and policy development. Many methods are used to disaggregate census data and predict population densities for finer scale, gridded population data sets. We present a new semi-automated dasymetric modeling approach that incorporates detailed census and ancillary data in a flexible, "Random Forest" estimation technique. We outline the combination of widely available, remotely-sensed and geospatial data that contribute to the modeled dasymetric weights and then use the Random Forest model to generate a gridded prediction of population density at ~100 m spatial resolution. This prediction layer is then used as the weighting surface to perform dasymetric redistribution of the census counts at a country level. As a case study we compare the new algorithm and its products for three countries (Vietnam, Cambodia, and Kenya) with other common gridded population data production methodologies. We discuss the advantages of the new method and its gains in accuracy and flexibility over those previous approaches. Finally, we outline how this algorithm will be extended to provide freely-available gridded population data sets for Africa, Asia and Latin America.
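The dasymetric redistribution step can be illustrated with a small sketch: the census count of one administrative unit is shared among its grid cells in proportion to each cell's weight (here standing in for the predicted density from the Random Forest layer). The function name, the example numbers, and the equal-split fallback for all-zero weights are assumptions for illustration, not details taken from the article.

```python
def redistribute(census_count, cell_weights):
    """Split one unit's census count across its cells in proportion to the weights."""
    total = sum(cell_weights)
    if total == 0:
        # Assumed fallback: spread the count evenly if every weight is zero.
        return [census_count / len(cell_weights)] * len(cell_weights)
    return [census_count * w / total for w in cell_weights]


# Hypothetical unit with 1000 residents and three grid cells.
cells = redistribute(1000, [1.0, 3.0, 6.0])
```

Because the weights are normalized within each unit, the cell values always sum back to the unit's census count, which is the property that distinguishes dasymetric redistribution from using the predicted densities directly.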
An application of quantile random forests for predictive mapping of forest attributes
E.A. Freeman; G.G. Moisen
2015-01-01
Increasingly, random forest models are used in predictive mapping of forest attributes. Traditional random forests output the mean prediction from the random trees. Quantile regression forests (QRF) is an extension of random forests developed by Nicolai Meinshausen that provides non-parametric estimates of the median predicted value as well as prediction quantiles. It...
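The difference between the mean prediction and a quantile prediction can be sketched without a tree-fitting library. Assume that, for one query point, we already know which training responses share its leaf in each tree. Quantile regression forests give each of those responses a weight of 1 / (number of trees × leaf size) and read quantiles off the resulting weighted distribution. The function names and the toy leaves below are hypothetical.

```python
def forest_mean(leaves):
    """Standard random forest prediction: the average of the per-tree leaf means."""
    return sum(sum(leaf) / len(leaf) for leaf in leaves) / len(leaves)


def forest_quantile(leaves, q):
    """Weighted q-quantile of the training responses that share a leaf with the query."""
    n_trees = len(leaves)
    weighted = sorted((y, 1.0 / (n_trees * len(leaf)))
                      for leaf in leaves for y in leaf)
    cumulative = 0.0
    for y, w in weighted:
        cumulative += w
        if cumulative >= q - 1e-12:  # small tolerance for floating-point sums
            return y
    return weighted[-1][0]


# Hypothetical leaves reached by one query point in a two-tree forest.
leaves = [[1.0, 3.0], [3.0, 5.0, 7.0]]
mean_pred = forest_mean(leaves)
median_pred = forest_quantile(leaves, 0.5)
upper_pred = forest_quantile(leaves, 0.9)
```

For these toy leaves the mean prediction is 3.5, the median is 3.0, and the 0.9 quantile is 7.0, so a lower and upper quantile together give a prediction interval that the mean alone cannot provide.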
Random forest feature selection approach for image segmentation
NASA Astrophysics Data System (ADS)
Lefkovits, László; Lefkovits, Szidónia; Emerich, Simina; Vaida, Mircea Florin
2017-03-01
In the field of image segmentation, discriminative models have shown promising performance. Generally, every such model begins with the extraction of numerous features from annotated images. Most authors create their discriminative model by using many features without using any selection criteria. A more reliable model can be built by using a framework that selects the important variables, from the point of view of the classification, and eliminates the unimportant ones. In this article we present a framework for feature selection and data dimensionality reduction. The methodology is built around the random forest (RF) algorithm and its variable importance evaluation. In order to deal with datasets so large as to be practically unmanageable, we propose an algorithm based on RF that reduces the dimension of the database by eliminating irrelevant features. Furthermore, this framework is applied to optimize our discriminative model for brain tumor segmentation.
GIS based Cadastral level Forest Information System using World View-II data in Bir Hisar (Haryana)
NASA Astrophysics Data System (ADS)
Mothi Kumar, K. E.; Singh, S.; Attri, P.; Kumar, R.; Kumar, A.; Sarika; Hooda, R. S.; Sapra, R. K.; Garg, V.; Kumar, V.; Nivedita
2014-11-01
Identification and demarcation of forest lands on the ground remain a major challenge in forest administration and management. Cadastral forest mapping deals with forestland boundary delineation and the associated characterization (forest/non-forest). The present study is an application of high resolution World View-II data for digitization of the Protected Forest boundary at cadastral level with integration of Records of Right (ROR) data. Cadastral vector data was generated by digitization of spatial data using scanned mussavies in an ArcGIS environment. Ortho-images were created from World View-II digital stereo data in the Universal Transverse Mercator coordinate system with the WGS 84 datum. Cadastral vector data of Bir Hisar (Hisar district, Haryana) and adjacent villages was spatially adjusted over the ortho-image using ArcGIS software. Edge matching of village boundaries was done with respect to the khasra boundaries of each individual village. The notified forest grids were identified on the ortho-image and grid vector data was extracted from the georeferenced cadastral data. Cadastral forest boundary vectors were digitized from ortho-images. Accuracy of the cadastral data was checked by comparing randomly selected geo-coordinate points, tie lines and boundary measurements of randomly selected parcels generated from the image data set with actual field measurements. Area comparison was done between the cadastral map area, the image map area and the ROR area. The area covered under Protected Forest was compared with ROR data, and a deviation of less than 1% from the ROR area was accepted. The methodology presented in this paper is useful for updating cadastral forest maps. The produced GIS databases and large-scale forest maps may serve as a data foundation towards a land register of forests.
The study introduces the use of very high resolution satellite data to develop a method for cadastral surveying through on-screen digitization in less time than the old-fashioned cadastral parcel boundary surveying method.
Pereira, Sérgio; Meier, Raphael; McKinley, Richard; Wiest, Roland; Alves, Victor; Silva, Carlos A; Reyes, Mauricio
2018-02-01
Machine learning systems are achieving better performance at the cost of becoming increasingly complex. However, because of that, they become less interpretable, which may cause some distrust by the end-user of the system. This is especially important as these systems are pervasively being introduced to critical domains, such as the medical field. Representation Learning techniques are general methods for automatic feature computation. Nevertheless, these techniques are regarded as uninterpretable "black boxes". In this paper, we propose a methodology to enhance the interpretability of automatically extracted machine learning features. The proposed system is composed of a Restricted Boltzmann Machine for unsupervised feature learning, and a Random Forest classifier, which are combined to jointly consider existing correlations between imaging data, features, and target variables. We define two levels of interpretation: global and local. The former is devoted to understanding if the system learned the relevant relations in the data correctly, while the latter is focused on predictions performed on a voxel- and patient-level. In addition, we propose a novel feature importance strategy that considers both imaging data and target variables, and we demonstrate the ability of the approach to leverage the interpretability of the obtained representation for the task at hand. We evaluated the proposed methodology in brain tumor segmentation and penumbra estimation in ischemic stroke lesions. We show the ability of the proposed methodology to unveil information regarding relationships between imaging modalities and extracted features and their usefulness for the task at hand. In both clinical scenarios, we demonstrate that the proposed methodology enhances the interpretability of automatically learned features, highlighting specific learning patterns that resemble how an expert extracts relevant data from medical images. Copyright © 2017 Elsevier B.V. All rights reserved.
Prediction of the effect of formulation on the toxicity of chemicals.
Mistry, Pritesh; Neagu, Daniel; Sanchez-Ruiz, Antonio; Trundle, Paul R; Vessey, Jonathan D; Gosling, John Paul
2017-01-01
Two approaches for the prediction of which of two vehicles will result in lower toxicity for anticancer agents are presented. Machine-learning models are developed using decision tree, random forest and partial least squares methodologies and statistical evidence is presented to demonstrate that they represent valid models. Separately, a clustering method is presented that allows the ordering of vehicles by the toxicity they show for chemically-related compounds.
NASA Astrophysics Data System (ADS)
Shiri, Jalal
2018-06-01
Among different reference evapotranspiration (ETo) modeling approaches, mass transfer-based methods have been less studied. These approaches utilize temperature and wind speed records. On the other hand, the empirical equations proposed in this context generally produce weak simulations, except when a local calibration is used for improving their performance. This might be a crucial drawback for those equations when local data for the calibration procedure are scarce. So, application of heuristic methods can be considered as a substitute for improving the performance accuracy of the mass transfer-based approaches. However, given that wind speed records usually have higher variation magnitudes than the other meteorological parameters, application of a wavelet transform for coupling with heuristic models would be necessary. In the present paper, a coupled wavelet-random forest (WRF) methodology was proposed for the first time to improve the performance accuracy of the mass transfer-based ETo estimation approaches using cross-validation data management scenarios at both local and cross-station scales. The obtained results revealed that the new coupled WRF model (with minimum scatter index values of 0.150 and 0.192 for local and external applications, respectively) improved the performance accuracy of the single RF models as well as the empirical equations to a great extent.
NASA Astrophysics Data System (ADS)
Yang, Jing; Zammit, Christian; Dudley, Bruce
2017-04-01
The phenomenon of losing and gaining in rivers normally takes place in lowland areas, where there are often various, sometimes conflicting uses for water resources, e.g., agriculture, industry, recreation, and maintenance of ecosystem function. To better support water allocation decisions, it is crucial to understand the location and seasonal dynamics of these losses and gains. We present a statistical methodology to predict losing and gaining river reaches in New Zealand based on 1) information surveys with surface water and groundwater experts from regional government, 2) a collection of river/watershed characteristics, including climate, soil and hydrogeologic information, and 3) the random forests technique. The surveys on losing and gaining reaches were conducted face-to-face at 16 New Zealand regional government authorities, and climate, soil, river geometry, and hydrogeologic data from various sources were collected and compiled to represent river/watershed characteristics. The random forests technique was used to build the statistical relationship between river reach status (gain and loss) and river/watershed characteristics, and then to predict the status of river reaches at Strahler order one without prior losing and gaining information. Results show that the model has a classification error of around 10% for "gain" and "loss". The results will assist further research and water allocation decisions in lowland New Zealand.
Disruption prediction investigations using Machine Learning tools on DIII-D and Alcator C-Mod
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rea, C.; Granetz, R. S.; Montes, K.
Using data-driven methodology, we exploit the time series of relevant plasma parameters for a large set of disrupted and non-disrupted discharges to develop a classification algorithm for detecting disruptive phases in shots that eventually disrupt. Comparing the same methodology on different devices is crucial in order to have information on the portability of the developed algorithm and the possible extrapolation to ITER. Therefore, we use data from two very different tokamaks, DIII-D and Alcator C-Mod. We then focus on a subset of disruption predictors, most of which are dimensionless and/or machine-independent parameters, coming from both plasma diagnostics and equilibrium reconstructions, such as the normalized plasma internal inductance ℓ and the n = 1 mode amplitude normalized to the toroidal magnetic field. Using such dimensionless indicators facilitates a more direct comparison between DIII-D and C-Mod. We then choose a shallow Machine Learning technique, called Random Forests, to explore the databases available for the two devices. We show results from the classification task, where we introduce a time dependency through the definition of class labels on the basis of the elapsed time before the disruption (i.e. ‘far from a disruption’ and ‘close to a disruption’). The performances of the different Random Forest classifiers are discussed in terms of several metrics, by showing the number of successfully detected samples, as well as the misclassifications. The overall model accuracies are above 97% when identifying a ‘far from disruption’ and a ‘disruptive’ phase for disrupted discharges. Nevertheless, the Forests are intrinsically different in their capability of predicting a disruptive behavior, with C-Mod predictions comparable to random guesses. Indeed, we show that C-Mod recall index, i.e. the sensitivity to a disruptive behavior, is as low as 0.47, while DIII-D recall is ~0.72.
The portability of the developed algorithm is also tested across the two devices, by using DIII-D data for training the forests and C-Mod for testing and vice versa.
Disruption prediction investigations using Machine Learning tools on DIII-D and Alcator C-Mod
Rea, C.; Granetz, R. S.; Montes, K.; ...
2018-06-18
Using data-driven methodology, we exploit the time series of relevant plasma parameters for a large set of disrupted and non-disrupted discharges to develop a classification algorithm for detecting disruptive phases in shots that eventually disrupt. Comparing the same methodology on different devices is crucial in order to have information on the portability of the developed algorithm and the possible extrapolation to ITER. Therefore, we use data from two very different tokamaks, DIII-D and Alcator C-Mod. We then focus on a subset of disruption predictors, most of which are dimensionless and/or machine-independent parameters, coming from both plasma diagnostics and equilibrium reconstructions, such as the normalized plasma internal inductance ℓ and the n = 1 mode amplitude normalized to the toroidal magnetic field. Using such dimensionless indicators facilitates a more direct comparison between DIII-D and C-Mod. We then choose a shallow Machine Learning technique, called Random Forests, to explore the databases available for the two devices. We show results from the classification task, where we introduce a time dependency through the definition of class labels on the basis of the elapsed time before the disruption (i.e. ‘far from a disruption’ and ‘close to a disruption’). The performances of the different Random Forest classifiers are discussed in terms of several metrics, by showing the number of successfully detected samples, as well as the misclassifications. The overall model accuracies are above 97% when identifying a ‘far from disruption’ and a ‘disruptive’ phase for disrupted discharges. Nevertheless, the Forests are intrinsically different in their capability of predicting a disruptive behavior, with C-Mod predictions comparable to random guesses. Indeed, we show that C-Mod recall index, i.e. the sensitivity to a disruptive behavior, is as low as 0.47, while DIII-D recall is ~0.72. The portability of the developed algorithm is also tested across the two devices, by using DIII-D data for training the forests and C-Mod for testing and vice versa.
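The gap between overall accuracy and recall reported above can be illustrated with a small sketch. It labels time slices of a disrupted shot by their elapsed time before the disruption and computes recall for the 'close' class. The function names, the 0.35 s threshold and the toy values are assumptions for illustration only, not values from the study.

```python
def label_samples(times_to_disruption, threshold_s):
    """Label each time slice by its elapsed time before the disruption."""
    return ["close" if t <= threshold_s else "far" for t in times_to_disruption]


def recall(y_true, y_pred, positive="close"):
    """Fraction of truly positive samples that the classifier detected."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Toy shot: seconds remaining before the disruption (hypothetical values).
times = [2.0, 1.0, 0.5, 0.3, 0.1, 0.05]
y_true = label_samples(times, threshold_s=0.35)      # assumed threshold
y_pred = ["far", "far", "far", "far", "close", "close"]  # hypothetical model output

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(y_true, accuracy, recall(y_true, y_pred))
```

In this toy case five of six samples are classified correctly, yet one of the three 'close' samples is missed, so recall is lower than accuracy.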
Kebede, Mihiretu; Zegeye, Desalegn Tigabu; Zeleke, Berihun Megabiaw
2017-12-01
To monitor the progress of therapy and disease progression, periodic CD4 counts are required throughout the course of HIV/AIDS care and support. The demand for CD4 count measurement has increased as ART programs have expanded over the last decade. This study aimed to predict CD4 count changes and to identify the predictors of CD4 count changes among patients on ART. A cross-sectional study was conducted at the University of Gondar Hospital on 3,104 adult patients on ART with CD4 counts measured at least twice (baseline and most recent). Data were retrieved from the HIV care clinic electronic database and patients' charts. Descriptive data were analyzed with SPSS version 20. The Cross-Industry Standard Process for Data Mining (CRISP-DM) methodology was followed. WEKA version 3.8 was used for predictive data mining. Before building the predictive models, information gain values and correlation-based feature selection were used for attribute selection. Variables were ranked according to their relevance based on their information gain values. J48, Neural Network, and Random Forest algorithms were compared to assess model accuracies. The median duration of ART was 191.5 weeks. The mean CD4 count change was 243 (SD 191.14) cells per microliter. Overall, 2,427 (78.2%) patients had their CD4 counts increase by at least 100 cells per microliter, while 4% had a decline from the baseline CD4 value. Baseline variables including age, educational status, CD8 count, ART regimen, and hemoglobin levels predicted CD4 count changes, with predictive accuracies of J48, Neural Network, and Random Forest being 87.1%, 83.5%, and 99.8%, respectively. The Random Forest algorithm had higher accuracy than both J48 and the Artificial Neural Network. The precision, sensitivity and recall values of Random Forest were also more than 99%. Nearly accurate prediction results were obtained using the Random Forest algorithm. 
This algorithm could be used in a low-resource setting to build a web-based prediction model for CD4 count changes. Copyright © 2017 Elsevier B.V. All rights reserved.
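Ranking attributes by information gain, as described above, can be sketched in a few lines. This is a minimal illustration for categorical attributes, not the WEKA implementation. The attribute names attr_a and attr_b and the toy labels are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus the entropy left after splitting on the feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Toy data (hypothetical): outcome per patient and two candidate attributes.
labels = ["up", "up", "up", "down"]
features = {
    "attr_a": ["x", "x", "y", "y"],
    "attr_b": ["f", "m", "f", "f"],
}

ranking = sorted(features, key=lambda k: information_gain(features[k], labels), reverse=True)
print(ranking)
```

Attributes at the top of the ranking reduce label uncertainty the most and would be retained first.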
NASA Astrophysics Data System (ADS)
Dung Nguyen, The; Kappas, Martin
2017-04-01
In the last several years, interest in forest biomass and carbon stock estimation has increased due to its importance for forest management, modelling the carbon cycle, and other ecosystem services. However, no estimates of biomass and carbon stocks of different forest cover types exist for the Xuan Lien Nature Reserve, Thanh Hoa, Viet Nam. This study investigates the relationship between above ground carbon stock and different vegetation indices and identifies the vegetation index that best correlates with forest carbon stock. The terrestrial inventory data come from 380 sample plots that were randomly sampled. Individual tree parameters such as DBH and tree height were collected to calculate the above ground volume, biomass and carbon for different forest types. SPOT6 2013 satellite data were used in the study to obtain five vegetation indices: NDVI, RDVI, MSR, RVI, and EVI. The relationships between the forest carbon stock and vegetation indices were investigated using a multiple linear regression analysis. R-square, RMSE values and cross-validation were used to measure the strength and validate the performance of the models. The methodology presented here demonstrates the possibility of estimating forest volume, biomass and carbon stock. It can also be further improved by incorporating more spectral band data and/or elevation.
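The five indices named above can be computed from band reflectances as sketched below. The abstract does not state which formulations the study used, so the commonly cited definitions are assumed here (RVI is taken as the simple ratio NIR/red, and EVI uses the usual coefficients). The reflectance values are made up.

```python
from math import sqrt


def vegetation_indices(nir, red, blue):
    """Common formulations of five vegetation indices from surface reflectances."""
    ratio = nir / red  # simple ratio, assumed definition of RVI
    return {
        "NDVI": (nir - red) / (nir + red),
        "RDVI": (nir - red) / sqrt(nir + red),
        "MSR": (ratio - 1) / (sqrt(ratio) + 1),
        "RVI": ratio,
        "EVI": 2.5 * (nir - red) / (nir + 6 * red - 7.5 * blue + 1),
    }


# Hypothetical reflectances for one pixel.
indices = vegetation_indices(nir=0.5, red=0.1, blue=0.05)
for name, value in indices.items():
    print(f"{name}: {value:.3f}")
```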
Meher, Prabina Kumar; Sahu, Tanmaya Kumar; Rao, A R
2016-11-05
DNA barcoding is a molecular diagnostic method that allows automated and accurate identification of species based on a short and standardized fragment of DNA. To this end, an attempt has been made in this study to develop a computational approach for identifying a species by comparing its barcode with the barcode sequences of known species present in the reference library. Each barcode sequence was first mapped onto a numeric feature vector based on k-mer frequencies, and then the Random Forest methodology was applied to the transformed dataset for species identification. The proposed approach outperformed similarity-based, tree-based and diagnostic-based approaches and was found comparable with existing supervised learning based approaches in terms of species identification success rate, when compared using real and simulated datasets. Based on the proposed approach, an online web interface, SPIDBAR, has also been developed and made freely available at http://cabgrid.res.in:8080/spidbar/ for species identification by taxonomists. Copyright © 2016 Elsevier B.V. All rights reserved.
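Mapping a barcode sequence onto a k-mer frequency vector, the feature step described above, might look like the following sketch. The function name, the choice k = 2 and the handling of ambiguous bases (windows are skipped) are assumptions, not details from the study.

```python
from itertools import product


def kmer_frequencies(seq, k=2, alphabet="ACGT"):
    """Return relative frequencies of all len(alphabet)**k k-mers, in lexicographic order."""
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = seq.upper()
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows containing ambiguous bases such as N
            counts[window] += 1
            total += 1
    return [counts[w] / total if total else 0.0 for w in kmers]


vec = kmer_frequencies("ACGTAC", k=2)
print(len(vec), vec)
```

Every sequence, whatever its length, yields a vector of the same size (16 entries for k = 2), which is what allows a standard classifier to be trained on the transformed data.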
NASA Astrophysics Data System (ADS)
Morizet, N.; Godin, N.; Tang, J.; Maillet, E.; Fregonese, M.; Normand, B.
2016-03-01
This paper aims to propose a novel approach to classify acoustic emission (AE) signals deriving from corrosion experiments, even if embedded into a noisy environment. To validate this new methodology, synthetic data are first used throughout an in-depth analysis, comparing Random Forests (RF) to the k-Nearest Neighbor (k-NN) algorithm. Moreover, a new evaluation tool called the alter-class matrix (ACM) is introduced to simulate different degrees of uncertainty on labeled data for supervised classification. Then, tests on real cases involving noise and crevice corrosion are conducted, by preprocessing the waveforms including wavelet denoising and extracting a rich set of features as input of the RF algorithm. To this end, a software called RF-CAM has been developed. Results show that this approach is very efficient on ground truth data and is also very promising on real data, especially for its reliability, performance and speed, which are serious criteria for the chemical industry.
Estimating Forest Canopy Heights and Aboveground Biomass with Simulated ICESat-2 Data
NASA Astrophysics Data System (ADS)
Malambo, L.; Narine, L.; Popescu, S. C.; Neuenschwander, A. L.; Sheridan, R.
2016-12-01
The Ice, Cloud and Land Elevation Satellite (ICESat) 2 is scheduled for launch in 2017 and one of its overall science objectives will be to measure vegetation heights, which can be used to estimate and monitor aboveground biomass (AGB) over large spatial scales. This study serves to develop a methodology for utilizing vegetation data collected by ICESat-2 that will be on a five-year mission from 2017, for mapping forest canopy heights and estimating aboveground forest biomass (AGB). The specific objectives are to, (1) simulate ICESat-2 photon-counting lidar (PCL) data, (2) utilize simulated PCL data to estimate forest canopy heights and propose a methodology for upscaling PCL height measurements to obtain spatially contiguous coverage and, (3) estimate and map AGB using simulated PCL data. The laser pulse from ICESat-2 will be divided into three pairs of beams spaced approximately 3 km apart, with footprints measuring approximately 14 m in diameter and with 70 cm along-track intervals. Using existing airborne lidar data (ALS) for Sam Houston National Forest (SHNF) and known ICESat-2 beam locations, footprints are generated along beam locations and PCL data are then simulated from discrete return lidar points within each footprint. By applying data processing algorithms, photons are classified into top of canopy points and ground surface elevation points to yield tree canopy height values within each ICESat-2 footprint. AGB is then estimated using simple linear regression that utilizes AGB from a biomass map generated with ALS data for SHNF and simulated PCL height metrics for 100 m segments along ICESat-2 tracks. Two approaches also investigated for upscaling AGB estimates to provide wall-to-wall coverage of AGB are (1) co-kriging and (2) Random Forest. 
Height and AGB maps, which are the outcomes of this study, will demonstrate how data acquired by ICESat-2 can be used to measure forest parameters and in extension, estimate forest carbon for climate change initiatives.
Data-Driven Lead-Acid Battery Prognostics Using Random Survival Forests
2014-10-02
Kogalur, Blackstone, & Lauer, 2008; Ishwaran & Kogalur, 2010). Random survival forest is a survival analysis extension of Random Forests (Breiman, 2001...Statistics & probability letters, 80(13), 1056–1064. Ishwaran, H., Kogalur, U. B., Blackstone, E. H., & Lauer, M. S. (2008). Random survival forests. The
Development of a Methodology for Predicting Forest Area for Large-Area Resource Monitoring
William H. Cooke
2001-01-01
The U.S. Department of Agriculture, Forest Service, Southern Research Station, appointed a remote-sensing team to develop an image-processing methodology for mapping forest lands over large geographic areas. The team has presented a repeatable methodology, which is based on regression modeling of Advanced Very High Resolution Radiometer (AVHRR) and Landsat Thematic...
A Tale of Two “Forests”: Random Forest Machine Learning Aids Tropical Forest Carbon Mapping
Mascaro, Joseph; Asner, Gregory P.; Knapp, David E.; Kennedy-Bowdoin, Ty; Martin, Roberta E.; Anderson, Christopher; Higgins, Mark; Chadwick, K. Dana
2014-01-01
Accurate and spatially-explicit maps of tropical forest carbon stocks are needed to implement carbon offset mechanisms such as REDD+ (Reduced Deforestation and Degradation Plus). The Random Forest machine learning algorithm may aid carbon mapping applications using remotely-sensed data. However, Random Forest has never been compared to traditional and potentially more reliable techniques such as regionally stratified sampling and upscaling, and it has rarely been employed with spatial data. Here, we evaluated the performance of Random Forest in upscaling airborne LiDAR (Light Detection and Ranging)-based carbon estimates compared to the stratification approach over a 16-million hectare focal area of the Western Amazon. We considered two runs of Random Forest, both with and without spatial contextual modeling by including (in the latter case) x and y position directly in the model. In each case, we set aside 8 million hectares (i.e., half of the focal area) for validation; this rigorous test of Random Forest went above and beyond the internal validation normally compiled by the algorithm (i.e., called “out-of-bag”), which proved insufficient for this spatial application. In this heterogeneous region of Northern Peru, the model with spatial context was the best performing run of Random Forest, and explained 59% of LiDAR-based carbon estimates within the validation area, compared to 37% for stratification or 43% by Random Forest without spatial context. With the 60% improvement in explained variation, RMSE against validation LiDAR samples improved from 33 to 26 Mg C ha−1 when using Random Forest with spatial context. Our results suggest that spatial context should be considered when using Random Forest, and that doing so may result in substantially improved carbon stock modeling for purposes of climate change mitigation. PMID:24489686
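The "60% improvement" quoted above appears to be a relative gain in explained variation. Assuming it compares Random Forest with spatial context (59%) against stratification (37%), the arithmetic can be checked as follows. The variable names are illustrative.

```python
# Explained variation within the validation area, as reported in the abstract.
stratification = 0.37
rf_no_context = 0.43
rf_spatial = 0.59

# Relative gains of the spatial-context run over each alternative.
gain_vs_stratification = (rf_spatial - stratification) / stratification
gain_vs_rf_no_context = (rf_spatial - rf_no_context) / rf_no_context

# Relative reduction in RMSE (Mg C per hectare), 33 -> 26.
rmse_before, rmse_after = 33.0, 26.0
rmse_reduction = (rmse_before - rmse_after) / rmse_before

print(f"{gain_vs_stratification:.1%} {gain_vs_rf_no_context:.1%} {rmse_reduction:.1%}")
```

Under that assumption the gain over stratification is roughly 59.5%, which rounds to the quoted 60%, while the RMSE falls by about 21%.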
Survival prediction of trauma patients: a study on US National Trauma Data Bank.
Sefrioui, I; Amadini, R; Mauro, J; El Fallahi, A; Gabbrielli, M
2017-12-01
Exceptional circumstances such as major incidents or natural disasters may cause a huge number of victims who cannot all be saved immediately and simultaneously. In these cases it is important to define priorities and avoid wasting time and resources on victims who cannot be saved. The Trauma and Injury Severity Score (TRISS) methodology is the well-known, standard system used by practitioners to predict the survival probability of trauma patients. However, practitioners have noted that the accuracy of TRISS predictions is unacceptable, especially for severely injured patients. Thus, alternative methods should be proposed. In this work we evaluate different approaches for predicting whether a patient will survive or not according to simple and easily measurable observations. We conducted a rigorous, comparative study based on the most important prediction techniques using real clinical data from the US National Trauma Data Bank. Empirical results show that well-known Machine Learning classifiers can outperform the TRISS methodology. Based on our findings, the best approach we evaluated is Random Forest: it has the best accuracy, the best area under the curve and k-statistic, as well as the second-best sensitivity and specificity. It also has a good calibration curve. Furthermore, its performance increases monotonically as the dataset size grows, meaning that it can effectively exploit incoming knowledge. Considering the whole dataset, it is always better than TRISS. Finally, we implemented a new tool to compute the survival of victims. This will help medical practitioners to obtain better accuracy than with the TRISS tools. Random Forests may be a good candidate solution for improving on the survival predictions of the standard TRISS methodology.
Methodology for estimating soil carbon for the forest carbon budget model of the United States, 2001
L. S. Heath; R. A. Birdsey; D. W. Williams
2002-01-01
The largest carbon (C) pool in United States forests is the soil C pool. We present methodology and soil C pool estimates used in the FORCARB model, which estimates and projects forest carbon budgets for the United States. The methodology balances knowledge, uncertainties, and ease of use. The estimates are calculated using the USDA Natural Resources Conservation...
Fiannaca, Antonino; La Rosa, Massimo; Rizzo, Riccardo; Urso, Alfonso
2015-07-01
In this paper, an alignment-free method for DNA barcode classification that is based on both a spectral representation and a neural gas network for unsupervised clustering is proposed. In the proposed methodology, distinctive words are identified from a spectral representation of DNA sequences. A taxonomic classification of the DNA sequence is then performed using the sequence signature, i.e., the smallest set of k-mers that can assign a DNA sequence to its proper taxonomic category. Experiments were then performed to compare our method with other supervised machine learning classification algorithms, such as support vector machine, random forest, ripper, naïve Bayes, ridor, and classification tree, which also consider short DNA sequence fragments of 200 and 300 base pairs (bp). The experimental tests were conducted over 10 real barcode datasets belonging to different animal species, which were provided by the on-line resource "Barcode of Life Database". The experimental results showed that our k-mer-based approach is directly comparable, in terms of accuracy, recall and precision metrics, with the other classifiers when considering full-length sequences. In addition, we demonstrate the robustness of our method when a classification task is performed with a set of short DNA sequences that were randomly extracted from the original data. For example, the proposed method can reach an accuracy of 64.8% at the species level with 200-bp fragments. Under the same conditions, the best other classifier (random forest) reaches an accuracy of 20.9%. Our results indicate that we obtained a clear improvement over the other classifiers for the study of short DNA barcode sequence fragments. Copyright © 2015 Elsevier B.V. All rights reserved.
Sustainability assessment in forest management based on individual preferences.
Martín-Fernández, Susana; Martinez-Falero, Eugenio
2018-01-15
This paper presents a methodology to elicit the preferences of any individual in the assessment of sustainable forest management at the stand level. The elicitation procedure was based on the comparison of the sustainability of pairs of forest locations. A sustainability map of the whole territory was obtained according to the individual's preferences. Three forest sustainability indicators were pre-calculated for each point in a study area in a Scots pine forest in the National Park of Sierra de Guadarrama in the Madrid Region in Spain to obtain the best management plan with the sustainability map. We followed a participatory process involving fifty people to assess the sustainability of the forest management and the methodology. The results highlighted the demand for conservative forest management, the usefulness of the methodology for managers, and the importance and necessity of incorporating stakeholders into forestry decision-making processes. Copyright © 2017 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Löw, Fabian; Schorcht, Gunther; Michel, Ulrich; Dech, Stefan; Conrad, Christopher
2012-10-01
Accurate crop identification and crop area estimation are important for studies on irrigated agricultural systems, yield and water demand modeling, and agrarian policy development. In this study a novel combination of Random Forest (RF) and Support Vector Machine (SVM) classifiers is presented that (i) enhances crop classification accuracy and (ii) provides spatial information on map uncertainty. The methodology was implemented over four distinct irrigated sites in Middle Asia using RapidEye time series data. The RF feature importance statistics were used as a feature-selection strategy for the SVM to assess possible negative effects on classification accuracy caused by an oversized feature space. The results of the individual RF and SVM classifications were combined with rules based on posterior classification probability and estimates of classification probability entropy. SVM classification performance was increased by feature selection through RF. Further experimental results indicate that the hybrid classifier improves overall classification accuracy in comparison to the single classifiers, as well as user's and producer's accuracy.
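The abstract does not give the combination rules, so the following is only one plausible sketch of how posterior probabilities and their entropy could be used: for each pixel, keep the classifier whose posterior is less uncertain, and report that entropy as a map-uncertainty value. The rule, the function names and the probabilities are all hypothetical.

```python
from math import log


def prob_entropy(probs):
    """Shannon entropy (nats) of a posterior class-probability vector."""
    return -sum(p * log(p) for p in probs if p > 0)


def combine(rf_probs, svm_probs):
    """Hypothetical rule: trust whichever classifier has the lower-entropy posterior."""
    if prob_entropy(rf_probs) <= prob_entropy(svm_probs):
        chosen = rf_probs
    else:
        chosen = svm_probs
    label = max(range(len(chosen)), key=chosen.__getitem__)
    return label, prob_entropy(chosen)


# Posterior probabilities over three crop classes for one pixel (made-up values).
label, uncertainty = combine([0.5, 0.3, 0.2], [0.1, 0.85, 0.05])
print(label, round(uncertainty, 3))
```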
Stevens, Forrest R.; Gaughan, Andrea E.; Linard, Catherine; Tatem, Andrew J.
2015-01-01
High resolution, contemporary data on human population distributions are vital for measuring impacts of population growth, monitoring human-environment interactions and for planning and policy development. Many methods are used to disaggregate census data and predict population densities for finer scale, gridded population data sets. We present a new semi-automated dasymetric modeling approach that incorporates detailed census and ancillary data in a flexible, “Random Forest” estimation technique. We outline the combination of widely available, remotely-sensed and geospatial data that contribute to the modeled dasymetric weights and then use the Random Forest model to generate a gridded prediction of population density at ~100 m spatial resolution. This prediction layer is then used as the weighting surface to perform dasymetric redistribution of the census counts at a country level. As a case study we compare the new algorithm and its products for three countries (Vietnam, Cambodia, and Kenya) with other common gridded population data production methodologies. We discuss the advantages of the new method and its gains in accuracy and flexibility over those previous approaches. Finally, we outline how this algorithm will be extended to provide freely-available gridded population data sets for Africa, Asia and Latin America. PMID:25689585
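Dasymetric redistribution with a weighting surface, as described above, amounts to splitting each census unit's count across its grid cells in proportion to the predicted density. A minimal sketch for a single census unit follows. The function name and the numbers are illustrative, and the even split for an all-zero weighting surface is an assumed fallback.

```python
def dasymetric_redistribute(census_count, weights):
    """Split one census unit's count across its grid cells in proportion to the weights."""
    total = sum(weights)
    if total == 0:
        # Assumed fallback: spread evenly when the weighting surface is all zero.
        return [census_count / len(weights)] * len(weights)
    return [census_count * w / total for w in weights]


# One census unit with 1000 people and three grid cells whose predicted
# densities (the weighting surface) are 1, 3 and 6 in arbitrary units.
cells = dasymetric_redistribute(1000, [1.0, 3.0, 6.0])
print(cells)
```

Because the weights are normalised within each unit, the redistributed cells always sum back to the census count, whatever the absolute scale of the model's predictions.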
Calibrating random forests for probability estimation.
Dankowski, Theresa; Ziegler, Andreas
2016-09-30
Probabilities can be consistently estimated using random forests. It is, however, unclear how random forests should be updated to make predictions for other centers or at different time points. In this work, we present two approaches for updating random forests for probability estimation. The first method has been proposed by Elkan and may be used for updating any machine learning approach yielding consistent probabilities, so-called probability machines. The second approach is a new strategy specifically developed for random forests. Using the terminal nodes, which represent conditional probabilities, the random forest is first translated to logistic regression models. These are, in turn, used for re-calibration. The two updating strategies were compared in a simulation study and are illustrated with data from the German Stroke Study Collaboration. In most simulation scenarios, both methods led to similar improvements. In the simulation scenario in which the stricter assumptions of Elkan's method were not met, the logistic regression-based re-calibration approach for random forests outperformed Elkan's method. It also performed better on the stroke data than Elkan's method. The strength of Elkan's method is its general applicability to any probability machine. However, if the strict assumptions underlying this approach are not met, the logistic regression-based approach is preferable for updating random forests for probability estimation. © 2016 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.
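The abstract does not spell out Elkan's update. A widely cited form, assumed here, adjusts a predicted probability when only the outcome prevalence (base rate) differs between the development and target settings, by rescaling the odds. The function and parameter names are illustrative.

```python
def update_probability(p, base_rate_old, base_rate_new):
    """Shift a predicted probability from one base rate to another by odds rescaling.

    Assumes the class-conditional feature distributions are unchanged and only
    the prevalence differs between the two settings.
    """
    numerator = p * base_rate_new * (1 - base_rate_old)
    denominator = numerator + (1 - p) * base_rate_old * (1 - base_rate_new)
    return numerator / denominator


# A forest trained where the outcome prevalence was 50% predicts p = 0.5;
# in a center where the prevalence is 10%, the updated estimate drops.
print(update_probability(0.5, base_rate_old=0.5, base_rate_new=0.1))
```

When the prevalence assumption does not hold, this kind of update can be miscalibrated, which is the situation in which the abstract reports the logistic regression-based approach doing better.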
Applying a weighted random forests method to extract karst sinkholes from LiDAR data
NASA Astrophysics Data System (ADS)
Zhu, Junfeng; Pierskalla, William P.
2016-02-01
Detailed mapping of sinkholes provides critical information for mitigating sinkhole hazards and understanding groundwater and surface water interactions in karst terrains. LiDAR (Light Detection and Ranging) measures the earth's surface at high resolution and high density and has shown great potential to drastically improve locating and delineating sinkholes. However, processing LiDAR data to extract sinkholes requires separating sinkholes from other depressions, which can be laborious because of the sheer number of depressions commonly generated from LiDAR data. In this study, we applied random forests, a machine learning method, to automatically separate sinkholes from other depressions in a karst region in central Kentucky. The sinkhole-extraction random forest was grown on a training dataset built from an area where LiDAR-derived depressions were manually classified through a visual inspection and field verification process. Based on the geometry of depressions, as well as natural and human factors related to sinkholes, 11 parameters were selected as predictive variables to form the dataset. Because the training dataset was imbalanced, with the majority of depressions being non-sinkholes, a weighted random forests method was used to improve the accuracy of predicting sinkholes. The weighted random forest achieved an average accuracy of 89.95% for the training dataset, demonstrating that the random forest can be an effective sinkhole classifier. Testing of the random forest in another area, however, resulted in moderate success with an average accuracy rate of 73.96%. This study suggests that an automatic sinkhole extraction procedure like the random forest classifier can significantly reduce time and labor costs and make it more tractable to map sinkholes using LiDAR data for large areas. However, the random forests method cannot totally replace manual procedures, such as visual inspection and field verification.
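The abstract does not say how the class weights were chosen. One common heuristic, assumed here purely for illustration, weights each class inversely to its frequency so that the minority class (sinkholes) counts as much in total as the majority class. The 90/10 split below is made up.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical training set: 90 non-sinkhole depressions and 10 sinkholes.
labels = ["other"] * 90 + ["sinkhole"] * 10
weights = balanced_class_weights(labels)
print(weights)
```

With these weights, misclassifying a sinkhole costs nine times as much as misclassifying another depression, which discourages the forest from simply predicting the majority class.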
Moreno-Opo, Rubén; Fernández-Olalla, Mariana; Margalida, Antoni; Arredondo, Ángel; Guil, Francisco
2012-01-01
The application of science-based conservation measures requires that sampling methodologies in studies modelling similar ecological aspects produce comparable results, making their interpretation easier. We aimed to show how the choice of different methodological and ecological approaches can affect conclusions in nest-site selection studies across different Palearctic meta-populations of an indicator species. First, a multivariate analysis of the variables affecting nest-site selection in a breeding colony of cinereous vulture (Aegypius monachus) in central Spain was performed. Then, a meta-analysis was applied to establish how methodological and habitat-type factors determine differences and similarities in the results obtained by previous studies that have modelled the forest breeding habitat of the species. Our results revealed patterns in nesting-habitat modelling by the cinereous vulture throughout its whole range: steep and south-facing slopes, great cover of large trees and distance to human activities were generally selected. The ratio and situation of the studied plots (nests/random), the use of plots vs. polygons as sampling units and the number of years in the data set determined the variability explained by the model. Moreover, a greater size of the breeding colony implied that ecological and geomorphological variables at landscape level were more influential. Additionally, human activities had a proportionally greater effect on colonies situated in Mediterranean forests. For the first time, a meta-analysis regarding the factors determining nest-site selection heterogeneity for a single species at broad scale was achieved. It is essential to homogenize and coordinate experimental design in modelling the selection of species' ecological requirements so that differences in results among studies are not due to methodological heterogeneity. This would optimize conservation and management practices for habitats and species in a global context. PMID:22413023
USDA-ARS's Scientific Manuscript database
We developed a cost-based methodology to assess the value of forested watersheds to improve water quality in public water supplies. The developed methodology is applicable to other source watersheds to determine ecosystem services for water quality. We assess the value of forest land for source wate...
Ma, Li; Fan, Suohai
2017-03-14
The random forests algorithm is a type of classifier with prominent universality, a wide application range, and robustness for avoiding overfitting. But there are still some drawbacks to random forests. Therefore, to improve the performance of random forests, this paper seeks to improve imbalanced data processing, feature selection and parameter optimization. We propose the CURE-SMOTE algorithm for the imbalanced data classification problem. Experiments on imbalanced UCI data reveal that the combination of Clustering Using Representatives (CURE) enhances the original synthetic minority oversampling technique (SMOTE) algorithms effectively compared with the classification results on the original data using random sampling, Borderline-SMOTE1, safe-level SMOTE, C-SMOTE, and k-means-SMOTE. Additionally, the hybrid RF (random forests) algorithm has been proposed for feature selection and parameter optimization, which uses the minimum out of bag (OOB) data error as its objective function. Simulation results on binary and higher-dimensional data indicate that the proposed hybrid RF algorithms, hybrid genetic-random forests algorithm, hybrid particle swarm-random forests algorithm and hybrid fish swarm-random forests algorithm can achieve the minimum OOB error and show the best generalization ability. The training set produced from the proposed CURE-SMOTE algorithm is closer to the original data distribution because it contains minimal noise. Thus, better classification results are produced from this feasible and effective algorithm. Moreover, the hybrid algorithm's F-value, G-mean, AUC and OOB scores demonstrate that they surpass the performance of the original RF algorithm. Hence, this hybrid algorithm provides a new way to perform feature selection and parameter optimization.
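The oversampling step that the SMOTE family shares can be sketched as follows: a synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its minority-class neighbours. This is plain SMOTE interpolation only. The CURE clustering stage proposed in the paper is not reproduced, and the names and values are illustrative.

```python
import random


def smote_sample(x, neighbor, rng):
    """Create one synthetic point between a minority sample and a minority neighbour."""
    gap = rng.random()  # uniform in [0, 1)
    return [xi + gap * (ni - xi) for xi, ni in zip(x, neighbor)]


rng = random.Random(0)   # fixed seed so the sketch is repeatable
x = [1.0, 2.0]           # a minority-class sample (made-up values)
neighbor = [3.0, 6.0]    # one of its nearest minority-class neighbours
synthetic = smote_sample(x, neighbor, rng)
print(synthetic)
```

Because the same gap is applied to every coordinate, the synthetic point always lies on the segment joining the two samples, so noisy or outlying neighbours pull synthetic points toward them. Limiting that effect is the stated motivation for the clustering stage.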
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lee Spangler; Lee A. Vierling; Eva K. Stand
2012-04-01
Sound policy recommendations relating to the role of forest management in mitigating atmospheric carbon dioxide (CO2) depend upon establishing accurate methodologies for quantifying forest carbon pools for large tracts of land that can be dynamically updated over time. Light Detection and Ranging (LiDAR) remote sensing is a promising technology for achieving accurate estimates of aboveground biomass and thereby carbon pools; however, not much is known about the accuracy of estimating biomass change and carbon flux from repeat LiDAR acquisitions containing different data sampling characteristics. In this study, discrete return airborne LiDAR data were collected in 2003 and 2009 across approximately 20,000 hectares (ha) of an actively managed, mixed conifer forest landscape in northern Idaho, USA. Forest inventory plots, laid out via a stratified random sampling design, were established and sampled in 2003 and 2009. The Random Forest machine learning algorithm was used to establish statistical relationships between inventory data and forest structural metrics derived from the LiDAR acquisitions. Aboveground biomass maps were created for the study area based on statistical relationships developed at the plot level. Over this 6-year period, we found that the mean increase in biomass due to forest growth across the non-harvested portions of the study area was 4.8 metric tons per hectare (Mg/ha). In these non-harvested areas, we found a significant difference in biomass increase among forest successional stages, with a higher biomass increase in mature and old forest compared to stand initiation and young forest. Approximately 20% of the landscape had been disturbed by harvest activities during the six-year time period, representing a biomass loss of >70 Mg/ha in these areas. During the study period, these harvest activities outweighed growth at the landscape scale, resulting in an overall loss in aboveground carbon at this site.
The 30-fold increase in sampling density between the 2003 and 2009 acquisitions did not affect the biomass estimates. Overall, LiDAR data coupled with field reference data offer a powerful method for calculating pools and changes in aboveground carbon in forested systems. The results of our study suggest that multitemporal LiDAR-based approaches are likely to be useful for high quality estimates of aboveground carbon change in conifer forest systems.
Using Random Forest to Improve the Downscaling of Global Livestock Census Data
Nicolas, Gaëlle; Robinson, Timothy P.; Wint, G. R. William; Conchedda, Giulia; Cinardi, Giuseppina; Gilbert, Marius
2016-01-01
Large scale, high-resolution global data on farm animal distributions are essential for spatially explicit assessments of the epidemiological, environmental and socio-economic impacts of the livestock sector. This has been the major motivation behind the development of the Gridded Livestock of the World (GLW) database, which has been extensively used since its first publication in 2007. The database relies on a downscaling methodology whereby census counts of animals in sub-national administrative units are redistributed at the level of grid cells as a function of a series of spatial covariates. The recent upgrade of GLW1 to GLW2 involved automating the processing, improvement of input data, and downscaling at a spatial resolution of 1 km per cell (5 km per cell in the earlier version). The underlying statistical methodology, however, remained unchanged. In this paper, we evaluate new methods to downscale census data with a higher accuracy and increased processing efficiency. Two main factors were evaluated, based on sample census datasets of cattle in Africa and chickens in Asia. First, we implemented and evaluated Random Forest models (RF) instead of stratified regressions. Second, we investigated whether models that predicted the number of animals per rural person (per capita) could provide better downscaled estimates than the previous approach that predicted absolute densities (animals per km2). RF models consistently provided better predictions than the stratified regressions for both continents and species. The benefit of per capita over absolute density models varied according to the species and continent. In addition, different technical options were evaluated to reduce the processing time while maintaining their predictive power. Future GLW runs (GLW 3.0) will apply the new RF methodology with optimized modelling options. 
The potential benefit of per capita models will need to be further investigated with a better distinction between rural and agricultural populations. PMID:26977807
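The redistribution step of a downscaling methodology like the one described above can be sketched in a few lines: the census count for one administrative unit is spread over its grid cells in proportion to model-predicted weights, so that the unit total is preserved. The function name `downscale_count` and the numbers are hypothetical; the predicted densities stand in for Random Forest output and are not taken from GLW.

```python
def downscale_count(census_total, predicted_density, cell_area_km2):
    """Allocate an administrative unit's census total to its grid cells in
    proportion to predicted density x cell area, preserving the total."""
    weights = [d * a for d, a in zip(predicted_density, cell_area_km2)]
    total_weight = sum(weights)
    if total_weight == 0:
        # No signal from the model: fall back to spreading animals by area.
        weights = list(cell_area_km2)
        total_weight = sum(weights)
    return [census_total * w / total_weight for w in weights]

# One unit with 1,000 animals and four 1 km2 cells (illustrative values).
cells = downscale_count(1000.0, [2.0, 6.0, 0.0, 12.0], [1.0, 1.0, 1.0, 1.0])
```

A per capita model would fit the same pattern, with the weights computed as predicted animals per rural person multiplied by the rural population of each cell.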
2013-01-01
Background Meat quality involves many traits, such as marbling, tenderness, juiciness, and backfat thickness, all of which require attention from livestock producers. Backfat thickness improvement by means of traditional selection techniques in Canchim beef cattle has been challenging due to its low heritability, and it is measured late in an animal’s life. Therefore, the implementation of new methodologies for identification of single nucleotide polymorphisms (SNPs) linked to backfat thickness are an important strategy for genetic improvement of carcass and meat quality. Results The set of SNPs identified by the random forest approach explained as much as 50% of the deregressed estimated breeding value (dEBV) variance associated with backfat thickness, and a small set of 5 SNPs were able to explain 34% of the dEBV for backfat thickness. Several quantitative trait loci (QTL) for fat-related traits were found in the surrounding areas of the SNPs, as well as many genes with roles in lipid metabolism. Conclusions These results provided a better understanding of the backfat deposition and regulation pathways, and can be considered a starting point for future implementation of a genomic selection program for backfat thickness in Canchim beef cattle. PMID:23738659
Arribas-Bel, Daniel; Patino, Jorge E; Duque, Juan C
2017-01-01
This paper provides evidence on the usefulness of very high spatial resolution (VHR) imagery in gathering socioeconomic information in urban settlements. We use land cover, spectral, structure and texture features extracted from a Google Earth image of Liverpool (UK) to evaluate their potential to predict Living Environment Deprivation at a small statistical area level. We also contribute to the methodological literature on the estimation of socioeconomic indices with remote-sensing data by introducing elements from modern machine learning. In addition to classical approaches such as Ordinary Least Squares (OLS) regression and a spatial lag model, we explore the potential of the Gradient Boost Regressor and Random Forests to improve predictive performance and accuracy. In addition to novel predicting methods, we also introduce tools for model interpretation and evaluation such as feature importance and partial dependence plots, or cross-validation. Our results show that Random Forest proved to be the best model with an R2 of around 0.54, followed by Gradient Boost Regressor with 0.5. Both the spatial lag model and the OLS fall behind with significantly lower performances of 0.43 and 0.3, respectively. PMID:28464010
Beccaria, Marco; Mellors, Theodore R; Petion, Jacky S; Rees, Christiaan A; Nasir, Mavra; Systrom, Hannah K; Sairistil, Jean W; Jean-Juste, Marc-Antoine; Rivera, Vanessa; Lavoile, Kerline; Severe, Patrice; Pape, Jean W; Wright, Peter F; Hill, Jane E
2018-02-01
Tuberculosis (TB) remains a global public health malady that claims almost 1.8 million lives annually. Diagnosis of TB represents perhaps one of the most challenging aspects of tuberculosis control. Gold standards for diagnosis of active TB (culture and nucleic acid amplification) are sputum-dependent; however, in up to a third of TB cases, an adequate biological sputum sample is not readily available. The analysis of exhaled breath, as an alternative to sputum-dependent tests, has the potential to provide a simple, fast, non-invasive, and readily available diagnostic service that could positively change TB detection. Human breath has been evaluated in the setting of active tuberculosis using thermal desorption-comprehensive two-dimensional gas chromatography-time of flight mass spectrometry methodology. From the entire spectrum of volatile metabolites in breath, three random forest machine learning models were applied, leading to the generation of a panel of 46 breath features. The twenty-two features common to all of the random forest models were selected as a set that could distinguish subjects with confirmed pulmonary M. tuberculosis infection from people with pathologies other than TB. Copyright © 2018 Elsevier B.V. All rights reserved.
Garland, Ellen C; Castellote, Manuel; Berchok, Catherine L
2015-06-01
Beluga whales, Delphinapterus leucas, have a graded call system; call types exist on a continuum, making classification challenging. A description of vocalizations from the eastern Beaufort Sea beluga population during its spring migration is presented here, using both a non-parametric classification tree analysis (CART) and a Random Forest analysis. Twelve frequency and duration measurements were made on 1019 calls recorded over 14 days off Icy Cape, Alaska, resulting in 34 identifiable call types with 83% agreement in classification for both CART and Random Forest analyses. This high level of agreement in classification, with an initial subjective classification of calls into 36 categories, demonstrates that the methods applied here provide a quantitative analysis of a graded call dataset. Further, as calls cannot be attributed to individuals using single sensor passive acoustic monitoring efforts, these methods provide a comprehensive analysis of data where the influence of pseudo-replication of calls from individuals is unknown. This study is the first to describe the vocal repertoire of a beluga population using a robust and repeatable methodology. A baseline eastern Beaufort Sea beluga population repertoire is presented here, against which the call repertoire of other seasonally sympatric Alaskan beluga populations can be compared.
STUDYING FOREST ROOT SYSTEMS - AN OVERVIEW OF METHODOLOGICAL PROBLEMS
The study of tree root systems is central to understanding forest ecosystem carbon and nutrient cycles, nutrient and water uptake, C allocation patterns by trees, soil microbial populations, adaptation of trees to stress, soil organic matter production, etc. Methodological probl...
Non-random species loss in a forest herbaceous layer following nitrogen addition
Christopher A. Walter; Mary Beth Adams; Frank S. Gilliam; William T. Peterjohn
2017-01-01
Nitrogen (N) additions have decreased species richness (S) in hardwood forest herbaceous layers, yet the functional mechanisms for these decreases have not been explicitly evaluated. We tested two hypothesized mechanisms, random species loss (RSL) and non-random species loss (NRSL), in the hardwood forest herbaceous layer of a long-term, plot-scale...
Shah, Anoop D.; Bartlett, Jonathan W.; Carpenter, James; Nicholas, Owen; Hemingway, Harry
2014-01-01
Multivariate imputation by chained equations (MICE) is commonly used for imputing missing data in epidemiologic research. The “true” imputation model may contain nonlinearities which are not included in default imputation models. Random forest imputation is a machine learning technique which can accommodate nonlinearities and interactions and does not require a particular regression model to be specified. We compared parametric MICE with a random forest-based MICE algorithm in 2 simulation studies. The first study used 1,000 random samples of 2,000 persons drawn from the 10,128 stable angina patients in the CALIBER database (Cardiovascular Disease Research using Linked Bespoke Studies and Electronic Records; 2001–2010) with complete data on all covariates. Variables were artificially made “missing at random,” and the bias and efficiency of parameter estimates obtained using different imputation methods were compared. Both MICE methods produced unbiased estimates of (log) hazard ratios, but random forest was more efficient and produced narrower confidence intervals. The second study used simulated data in which the partially observed variable depended on the fully observed variables in a nonlinear way. Parameter estimates were less biased using random forest MICE, and confidence interval coverage was better. This suggests that random forest imputation may be useful for imputing complex epidemiologic data sets in which some patients have missing data. PMID:24589914
Approximating prediction uncertainty for random forest regression models
John W. Coulston; Christine E. Blinn; Valerie A. Thomas; Randolph H. Wynne
2016-01-01
Machine learning approaches such as random forest have been increasingly used for the spatial modeling and mapping of continuous variables. Random forest is a non-parametric ensemble approach, and unlike traditional regression approaches there is no direct quantification of prediction error. Understanding prediction uncertainty is important when using model-based continuous maps as...
Health and well-being benefits of spending time in forests: systematic review.
Oh, Byeongsang; Lee, Kyung Ju; Zaslawski, Chris; Yeung, Albert; Rosenthal, David; Larkey, Linda; Back, Michael
2017-10-18
Numerous studies have reported that spending time in nature is associated with the improvement of various health outcomes and well-being. This review evaluated the physical and psychological benefits of a specific type of exposure to nature, forest therapy. A literature search was carried out using MEDLINE, PubMed, ScienceDirect, EMBASE, and ProQuest databases and manual searches from inception up to December 2016. Key words: "Forest" or "Shinrin-Yoku" or "Forest bath" AND "Health" or "Wellbeing". The methodological quality of each randomized controlled trial (RCT) was assessed according to the Cochrane risk of bias (ROB) tool. Six RCTs met the inclusion criteria. Participants' ages ranged from 20 to 79 years. Sample size ranged from 18 to 99. Populations studied varied from young healthy university students to elderly people with chronic disease. Studies reported the positive impact of forest therapy on hypertension (n = 2), cardiac and pulmonary function (n = 1), immune function (n = 2), inflammation (n = 3), oxidative stress (n = 1), stress (n = 1), stress hormone (n = 1), anxiety (n = 1), depression (n = 2), and emotional response (n = 3). All studies included in this review had a high ROB. Forest therapy may play an important role in health promotion and disease prevention. However, the lack of high-quality studies limits the strength of results, rendering the evidence insufficient to establish clinical practice guidelines for its use. More robust RCTs are warranted.
NASA Astrophysics Data System (ADS)
Fedrigo, Melissa; Newnham, Glenn J.; Coops, Nicholas C.; Culvenor, Darius S.; Bolton, Douglas K.; Nitschke, Craig R.
2018-02-01
Light detection and ranging (lidar) data have been increasingly used for forest classification due to their ability to penetrate the forest canopy and provide detail about the structure of the lower strata. In this study we demonstrate forest classification approaches using airborne lidar data as inputs to random forest and linear unmixing classification algorithms. Our results demonstrated that both random forest and linear unmixing models identified a distribution of rainforest and eucalypt stands that was comparable to existing ecological vegetation class (EVC) maps based primarily on manual interpretation of high resolution aerial imagery. Rainforest stands were also identified in the region that have not previously been identified in the EVC maps. The transition between stand types was better characterised by the random forest modelling approach. In contrast, the linear unmixing model placed greater emphasis on field plots selected as endmembers, which may not have captured the variability in stand structure within a single stand type. The random forest model had the highest overall accuracy (84%) and Cohen's kappa coefficient (0.62). However, the classification accuracy was only marginally better than linear unmixing. The random forest model was applied to a region in the Central Highlands of south-eastern Australia to produce maps of stand type probability, including areas of transition (the 'ecotone') between rainforest and eucalypt forest. The resulting map provided a detailed delineation of forest classes, which specifically recognised the coalescing of stand types at the landscape scale. This represents a key step towards mapping the structural and spatial complexity of these ecosystems, which is important for both their management and conservation.
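For readers unfamiliar with the two accuracy measures quoted above, the following sketch shows how overall accuracy and Cohen's kappa are obtained from a confusion matrix. The function name `accuracy_and_kappa` and the two-class matrix are hypothetical; the counts are not the study's data and do not reproduce its reported 84% and 0.62.

```python
def accuracy_and_kappa(confusion):
    """confusion[i][j] = number of samples of true class i predicted as j."""
    n = sum(sum(row) for row in confusion)
    # Observed agreement: share of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(len(confusion))) / n
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    # Agreement expected by chance, given the class marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

# Hypothetical rainforest vs. eucalypt matrix (illustration only).
acc, kappa = accuracy_and_kappa([[40, 10], [5, 45]])
```

Kappa is lower than raw accuracy because it discounts the agreement that would arise by chance from the class proportions alone.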
Hsieh, Chung-Ho; Lu, Ruey-Hwa; Lee, Nai-Hsin; Chiu, Wen-Ta; Hsu, Min-Huei; Li, Yu-Chuan Jack
2011-01-01
Diagnosing acute appendicitis clinically is still difficult. We developed random forests, support vector machines, and artificial neural network models to diagnose acute appendicitis. Between January 2006 and December 2008, patients who had a consultation session with surgeons for suspected acute appendicitis were enrolled. Seventy-five percent of the data set was used to construct models including random forest, support vector machines, artificial neural networks, and logistic regression. Twenty-five percent of the data set was withheld to evaluate model performance. The area under the receiver operating characteristic curve (AUC) was used to evaluate performance, which was compared with that of the Alvarado score. Data from a total of 180 patients were collected, 135 used for training and 45 for testing. The mean age of patients was 39.4 years (range, 16-85). Final diagnosis revealed 115 patients with and 65 without appendicitis. The AUC of random forest, support vector machines, artificial neural networks, logistic regression, and Alvarado was 0.98, 0.96, 0.91, 0.87, and 0.77, respectively. The sensitivity, specificity, positive, and negative predictive values of random forest were 94%, 100%, 100%, and 87%, respectively. Random forest performed better than artificial neural networks, logistic regression, and Alvarado. We demonstrated that random forest can predict acute appendicitis with good accuracy and, deployed appropriately, can be an effective tool in clinical decision making. Copyright © 2011 Mosby, Inc. All rights reserved.
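The four test-set measures reported for random forest follow directly from a 2x2 confusion table. The sketch below uses assumed counts for the 45 held-out patients; the abstract does not report the table, and these counts were chosen only because they round to the published percentages.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard measures derived from a 2x2 diagnostic confusion table."""
    return {
        "sensitivity": tp / (tp + fn),  # true positives among diseased
        "specificity": tn / (tn + fp),  # true negatives among non-diseased
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }

# Assumed counts (30 + 0 + 2 + 13 = 45); not taken from the paper.
metrics = diagnostic_metrics(tp=30, fp=0, fn=2, tn=13)
rounded = {name: round(100 * value) for name, value in metrics.items()}
```

With a test set this small, each misclassified patient moves sensitivity by about three percentage points, which is worth keeping in mind when comparing the models.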
NASA Astrophysics Data System (ADS)
Jeong, Seojeong; Lee, Jaegeun; Kim, Hwan-Chul; Hwang, Jun Yeon; Ku, Bon-Cheol; Zakharov, Dmitri N.; Maruyama, Benji; Stach, Eric A.; Kim, Seung Min
2016-01-01
In this study, we develop a new methodology for transmission electron microscopy (TEM) analysis that enables us to directly investigate the interface between carbon nanotube (CNT) arrays and the catalyst and support layers for CNT forest growth without any damage induced by a post-growth TEM sample preparation. Using this methodology, we perform in situ and ex situ TEM investigations on the evolution of the morphology of the catalyst particles and observe the catalyst particles to climb up through CNT arrays during CNT forest growth. We speculate that the lifted catalysts significantly affect the growth and growth termination of CNT forests along with Ostwald ripening and sub-surface diffusion. Thus, we propose a modified growth termination model which better explains various phenomena related to the growth and growth termination of CNT forests. Electronic supplementary information (ESI) available. See DOI: 10.1039/c5nr05547d
The experimental design of the Missouri Ozark Forest Ecosystem Project
Steven L. Sheriff; Shuoqiong He
1997-01-01
The Missouri Ozark Forest Ecosystem Project (MOFEP) is an experiment that examines the effects of three forest management practices on the forest community. MOFEP is designed as a randomized complete block design using nine sites divided into three blocks. Treatments of uneven-aged, even-aged, and no-harvest management were randomly assigned to sites within each block...
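A randomized complete block design of this shape (three blocks, three sites per block, each treatment used once per block) can be generated as in the sketch below. The block and site labels and the function name `assign_treatments` are placeholders, not the project's actual identifiers or procedure.

```python
import random

TREATMENTS = ["uneven-aged", "even-aged", "no-harvest"]

def assign_treatments(blocks, seed=None):
    """Randomly assign each treatment to exactly one site within every block."""
    rng = random.Random(seed)
    assignment = {}
    for block, sites in blocks.items():
        shuffled = TREATMENTS[:]
        rng.shuffle(shuffled)  # independent randomization per block
        for site, treatment in zip(sites, shuffled):
            assignment[site] = (block, treatment)
    return assignment

# Placeholder labels for nine sites in three blocks.
blocks = {
    "block 1": ["site 1", "site 2", "site 3"],
    "block 2": ["site 4", "site 5", "site 6"],
    "block 3": ["site 7", "site 8", "site 9"],
}
plan = assign_treatments(blocks, seed=42)
```

Randomizing separately within each block is what lets block-to-block differences be removed from the treatment comparison.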
ERIC Educational Resources Information Center
Golino, Hudson F.; Gomes, Cristiano M. A.
2016-01-01
This paper presents a non-parametric imputation technique, named random forest, from the machine learning field. The random forest procedure has two main tuning parameters: the number of trees grown in the prediction and the number of predictors used. Fifty experimental conditions were created in the imputation procedure, with different…
Random Bits Forest: a Strong Classifier/Regressor for Big Data
NASA Astrophysics Data System (ADS)
Wang, Yi; Li, Yi; Pu, Weilin; Wen, Kathryn; Shugart, Yin Yao; Xiong, Momiao; Jin, Li
2016-07-01
Efficiency, memory consumption, and robustness are common problems with many popular methods for data analysis. As a solution, we present Random Bits Forest (RBF), a classification and regression algorithm that integrates neural networks (for depth), boosting (for width), and random forests (for prediction accuracy). Through a gradient boosting scheme, it first generates and selects ~10,000 small, 3-layer random neural networks. These networks are then fed into a modified random forest algorithm to obtain predictions. Testing with datasets from the UCI (University of California, Irvine) Machine Learning Repository shows that RBF outperforms other popular methods in both accuracy and robustness, especially with large datasets (N > 1000). The algorithm also performed highly in testing with an independent data set, a real psoriasis genome-wide association study (GWAS).
Arevalillo, Jorge M; Sztein, Marcelo B; Kotloff, Karen L; Levine, Myron M; Simon, Jakub K
2017-10-01
Immunologic correlates of protection are important in vaccine development because they give insight into mechanisms of protection, assist in the identification of promising vaccine candidates, and serve as endpoints in bridging clinical vaccine studies. Our goal is the development of a methodology to identify immunologic correlates of protection using the Shigella challenge as a model. The proposed methodology utilizes the Random Forests (RF) machine learning algorithm as well as Classification and Regression Trees (CART) to detect immune markers that predict protection, identify interactions between variables, and define optimal cutoffs. Logistic regression modeling is applied to estimate the probability of protection and the confidence interval (CI) for such a probability is computed by bootstrapping the logistic regression models. The results demonstrate that the combination of Classification and Regression Trees and Random Forests complements the standard logistic regression and uncovers subtle immune interactions. Specific levels of immunoglobulin IgG antibody in blood on the day of challenge predicted protection in 75% (95% CI 67-86). Of those subjects that did not have blood IgG at or above a defined threshold, 100% were protected if they had IgA antibody secreting cells above a defined threshold. Comparison with the results obtained by applying only logistic regression modeling with standard Akaike Information Criterion for model selection shows the usefulness of the proposed method. Given the complexity of the immune system, the use of machine learning methods may enhance traditional statistical approaches. When applied together, they offer a novel way to quantify important immune correlates of protection that may help the development of vaccines. Copyright © 2017 Elsevier Inc. All rights reserved.
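The bootstrap idea behind the confidence interval can be illustrated with a simplified percentile bootstrap for a protection proportion. This is a stand-in only: the study bootstraps fitted logistic regression models, whereas the sketch resamples binary outcomes directly, and the function name `bootstrap_ci` and the 30-of-40 data are invented for the example.

```python
import random

def bootstrap_ci(outcomes, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean of binary outcomes."""
    rng = random.Random(seed)
    n = len(outcomes)
    # Resample subjects with replacement and recompute the proportion.
    estimates = sorted(
        sum(rng.choices(outcomes, k=n)) / n for _ in range(n_boot)
    )
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Hypothetical cohort: 30 protected (1) and 10 unprotected (0) subjects.
outcomes = [1] * 30 + [0] * 10
low, high = bootstrap_ci(outcomes)
```

To mirror the paper more closely, the proportion inside the loop would be replaced by a logistic model refitted on each resample and evaluated at the marker level of interest.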
Forest structure in low diversity tropical forests: a study of Hawaiian wet and dry forests
R. Ostertag; F. Inman-Narahari; S. Cordell; C.P. Giardina; L. Sack
2014-01-01
The potential influence of diversity on ecosystem structure and function remains a topic of significant debate, especially for tropical forests where diversity can range widely. We used Center for Tropical Forest Science (CTFS) methodology to establish forest dynamics plots in montane wet forest and lowland dry forest on Hawaiʻi Island. We compared the species...
Seismic activity prediction using computational intelligence techniques in northern Pakistan
NASA Astrophysics Data System (ADS)
Asim, Khawaja M.; Awais, Muhammad; Martínez-Álvarez, F.; Iqbal, Talat
2017-10-01
An earthquake prediction study is carried out for the region of northern Pakistan. The prediction methodology includes interdisciplinary interaction of seismology and computational intelligence. Eight seismic parameters are computed based upon past earthquakes. The predictive ability of these eight seismic parameters is evaluated in terms of information gain, which leads to the selection of six parameters to be used in prediction. Multiple computationally intelligent models have been developed for earthquake prediction using the selected seismic parameters. These models include feed-forward neural network, recurrent neural network, random forest, multi-layer perceptron, radial basis neural network, and support vector machine. The performance of every prediction model is evaluated, and McNemar's statistical test is applied to assess the statistical significance of the computational methodologies. The feed-forward neural network shows statistically significant predictions along with an accuracy of 75% and a positive predictive value of 78% in the context of northern Pakistan.
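McNemar's test compares two classifiers on the same test cases using only the cases where they disagree. The sketch below uses the common continuity-corrected chi-square form; whether the study used this form or the exact binomial version is not stated in the abstract, and the function name `mcnemar_test` and the counts are hypothetical.

```python
import math

def mcnemar_test(b, c):
    """McNemar's chi-square test with continuity correction.

    b = cases model A classified correctly and model B incorrectly;
    c = cases model B classified correctly and model A incorrectly.
    """
    if b + c == 0:
        return 0.0, 1.0  # the models never disagree
    statistic = max(abs(b - c) - 1, 0) ** 2 / (b + c)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

# Hypothetical disagreement counts between two prediction models.
stat, p = mcnemar_test(b=25, c=10)
```

For small disagreement counts (roughly b + c below 25) the exact binomial form is usually preferred over the chi-square approximation.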
NASA Astrophysics Data System (ADS)
Elmore, K. L.
2016-12-01
The Meteorological Phenomena Identification Near the Ground (mPING) project is an example of a crowd-sourced, citizen science effort to gather data of sufficient quality and quantity needed by new post-processing methods that use machine learning. Transportation and infrastructure are particularly sensitive to precipitation type in winter weather. We extract attributes from operational numerical forecast models and use them in a random forest to generate forecast winter precipitation types. We find that random forests applied to forecast soundings are effective at generating skillful forecasts of surface ptype with considerably more skill than the current algorithms, especially for ice pellets and freezing rain. We also find that three very different forecast models yield similar overall results, showing that random forests are able to extract essentially equivalent information from different forecast models. We also show that the random forest for each model and each profile type is unique to the particular forecast model and that the random forests developed using a particular model suffer significant degradation when given attributes derived from a different model. This implies that no single algorithm can perform well across all forecast models. Clearly, random forests extract information unavailable to "physically based" methods because the physical information in the models does not appear as we expect. One interesting result is that results from the classic "warm nose" sounding profile are, by far, the most sensitive to the particular forecast model, but this profile is also the one for which random forests are most skillful. Finally, a method for calibrating probabilities for each different ptype using multinomial logistic regression is shown.
A Random Forest-based ensemble method for activity recognition.
Feng, Zengtao; Mo, Lingfei; Li, Meng
2015-01-01
This paper presents a multi-sensor ensemble approach to human physical activity (PA) recognition, using random forest. We designed an ensemble learning algorithm, which integrates several independent Random Forest classifiers based on different sensor feature sets to build a more stable, more accurate and faster classifier for human activity recognition. To evaluate the algorithm, PA data collected from the PAMAP (Physical Activity Monitoring for Aging People), which is a standard, publicly available database, was utilized to train and test. The experimental results show that the algorithm is able to correctly recognize 19 PA types with an accuracy of 93.44%, while the training is faster than others. The ensemble classifier system based on the RF (Random Forest) algorithm can achieve high recognition accuracy and fast calculation.
Gorodeski, Eiran Z.; Ishwaran, Hemant; Kogalur, Udaya B.; Blackstone, Eugene H.; Hsich, Eileen; Zhang, Zhu-ming; Vitolins, Mara Z.; Manson, JoAnn E.; Curb, J. David; Martin, Lisa W.; Prineas, Ronald J.; Lauer, Michael S.
2013-01-01
Background: Simultaneous contribution of hundreds of electrocardiographic biomarkers to prediction of long-term mortality in post-menopausal women with clinically normal resting electrocardiograms (ECGs) is unknown. Methods and Results: We analyzed ECGs and all-cause mortality in 33,144 women enrolled in Women’s Health Initiative trials, who were without baseline cardiovascular disease or cancer, and had normal ECGs by Minnesota and Novacode criteria. Four hundred and seventy seven ECG biomarkers, encompassing global and individual ECG findings, were measured using computer algorithms. During a median follow-up of 8.1 years (range for survivors 0.5–11.2 years), 1,229 women died. For analyses, the cohort was randomly split into derivation (n=22,096, deaths=819) and validation (n=11,048, deaths=410) subsets. ECG biomarkers, demographic, and clinical characteristics were simultaneously analyzed using both traditional Cox regression and Random Survival Forest (RSF), a novel algorithmic machine-learning approach. Regression modeling failed to converge. RSF variable selection yielded 20 variables that were independently predictive of long-term mortality, 14 of which were ECG biomarkers related to autonomic tone, atrial conduction, and ventricular depolarization and repolarization. Conclusions: We identified 14 ECG biomarkers from amongst hundreds that were associated with long-term prognosis using a novel random forest variable selection methodology. These were related to autonomic tone, atrial conduction, ventricular depolarization, and ventricular repolarization. Quantitative ECG biomarkers have prognostic importance, and may be markers of subclinical disease in apparently healthy post-menopausal women. PMID:21862719
Wright, Marvin N; Dankowski, Theresa; Ziegler, Andreas
2017-04-15
The most popular approach for analyzing survival data is the Cox regression model. The Cox model may, however, be misspecified, and its proportionality assumption may not always be fulfilled. An alternative approach for survival prediction is random forests for survival outcomes. The standard split criterion for random survival forests is the log-rank test statistic, which favors splitting variables with many possible split points. Conditional inference forests avoid this split variable selection bias. However, linear rank statistics are utilized by default in conditional inference forests to select the optimal splitting variable, which cannot detect non-linear effects in the independent variables. An alternative is to use maximally selected rank statistics for the split point selection. As in conditional inference forests, splitting variables are compared on the p-value scale. However, instead of the conditional Monte-Carlo approach used in conditional inference forests, p-value approximations are employed. We describe several p-value approximations and the implementation of the proposed random forest approach. A simulation study demonstrates that unbiased split variable selection is possible. However, there is a trade-off between unbiased split variable selection and runtime. In benchmark studies of prediction performance on simulated and real datasets, the new method performs better than random survival forests if informative dichotomous variables are combined with uninformative variables with more categories and better than conditional inference forests if non-linear covariate effects are included. In a runtime comparison, the method proves to be computationally faster than both alternatives, if a simple p-value approximation is used. Copyright © 2017 John Wiley & Sons, Ltd.
Nasejje, Justine B; Mwambi, Henry; Dheda, Keertan; Lesosky, Maia
2017-07-28
Random survival forest (RSF) models have been identified as alternative methods to the Cox proportional hazards model in analysing time-to-event data. These methods, however, have been criticised for the bias that results from favouring covariates with many split-points and hence conditional inference forests for time-to-event data have been suggested. Conditional inference forests (CIF) are known to correct the bias in RSF models by separating the procedure for the best covariate to split on from that of the best split point search for the selected covariate. In this study, we compare the random survival forest model to the conditional inference forest (CIF) model using twenty-two simulated time-to-event datasets. We also analysed two real time-to-event datasets. The first dataset is based on the survival of children under five years of age in Uganda and consists of categorical covariates, most of them having more than two levels (many split-points). The second dataset is based on the survival of patients with extremely drug resistant tuberculosis (XDR TB) and consists mainly of categorical covariates with two levels (few split-points). The study findings indicate that the conditional inference forest model is superior to random survival forest models in analysing time-to-event data that consists of covariates with many split-points, based on the values of the bootstrap cross-validated estimates for integrated Brier scores. However, conditional inference forests perform comparably to random survival forest models in analysing time-to-event data consisting of covariates with fewer split-points. Although survival forests are promising methods in analysing time-to-event data, it is important to identify the best forest model for analysis based on the nature of covariates of the dataset in question.
SNP selection and classification of genome-wide SNP data using stratified sampling random forests.
Wu, Qingyao; Ye, Yunming; Liu, Yang; Ng, Michael K
2012-09-01
For high-dimensional genome-wide association (GWA) case-control data of complex disease, a large portion of the single-nucleotide polymorphisms (SNPs) are usually irrelevant to the disease. A simple random sampling method in random forest, using the default mtry parameter to choose the feature subspace, will select too many subspaces without informative SNPs. An exhaustive search for an optimal mtry is often required in order to include useful and relevant SNPs and discard the vast number of non-informative SNPs. However, this is too time-consuming and not favorable in GWA for high-dimensional data. The main aim of this paper is to propose a stratified sampling method for feature subspace selection to generate decision trees in a random forest for GWA high-dimensional data. Our idea is to design an equal-width discretization scheme for informativeness to divide SNPs into multiple groups. In feature subspace selection, we randomly select the same number of SNPs from each group and combine them to form a subspace to generate a decision tree. The advantage of this stratified sampling procedure is that it ensures each subspace contains enough useful SNPs while avoiding the very high computational cost of an exhaustive search for an optimal mtry and maintaining the randomness of a random forest. We employ two genome-wide SNP data sets (Parkinson case-control data comprised of 408 803 SNPs and Alzheimer case-control data comprised of 380 157 SNPs) to demonstrate that the proposed stratified sampling method is effective, and that it can generate better random forests with higher accuracy and a lower error bound than those produced by Breiman's random forest generation method. For Parkinson data, we also show some interesting genes identified by the method, which may be associated with neurological disorders for further biological investigations.
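The stratified subspace selection described above can be sketched as follows. This is an illustrative reading of the abstract, not the authors' implementation: the informativeness scores, the number of groups, and the function names are all assumptions.

```python
import random

def equal_width_groups(scores, n_groups):
    """Split feature indices into n_groups bins of equal score width."""
    lo, hi = min(scores), max(scores)
    width = (hi - lo) / n_groups or 1.0  # fall back to 1.0 if all scores are equal
    groups = [[] for _ in range(n_groups)]
    for idx, s in enumerate(scores):
        b = min(int((s - lo) / width), n_groups - 1)  # the maximum goes in the last bin
        groups[b].append(idx)
    return groups

def stratified_subspace(groups, per_group, rng):
    """Draw up to per_group feature indices from every group, without replacement."""
    subspace = []
    for g in groups:
        subspace.extend(rng.sample(g, min(per_group, len(g))))
    return subspace

# Hypothetical informativeness scores for 10 SNPs (higher = more informative).
scores = [0.01, 0.02, 0.05, 0.03, 0.40, 0.45, 0.90, 0.95, 0.04, 0.50]
groups = equal_width_groups(scores, n_groups=3)

rng = random.Random(0)  # fixed seed so the sketch is reproducible
subspace = stratified_subspace(groups, per_group=2, rng=rng)
print(groups)
print(subspace)
```

Each tree would then be grown on its own freshly drawn subspace, so the two highly informative SNPs in the top group are always represented even though weakly informative SNPs dominate the pool.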
NASA Astrophysics Data System (ADS)
Othman, Arsalan A.; Gloaguen, Richard
2017-09-01
Lithological mapping in mountainous regions is often impeded by limited accessibility due to relief. This study aims to evaluate (1) the performance of different supervised classification approaches using remote sensing data and (2) the use of additional information such as geomorphology. We exemplify the methodology in the Bardi-Zard area in NE Iraq, a part of the Zagros Fold - Thrust Belt, known for its chromite deposits. We highlighted the improvement of remote sensing geological classification by integrating geomorphic features and spatial information in the classification scheme. We performed a Maximum Likelihood (ML) classification method besides two Machine Learning Algorithms (MLA): Support Vector Machine (SVM) and Random Forest (RF) to allow the joint use of geomorphic features, Band Ratio (BR), Principal Component Analysis (PCA), spatial information (spatial coordinates) and multispectral data of the Advanced Space-borne Thermal Emission and Reflection radiometer (ASTER) satellite. The RF algorithm showed reliable results and discriminated serpentinite, talus and terrace deposits, red argillites with conglomerates and limestone, limy conglomerates and limestone conglomerates, tuffites interbedded with basic lavas, limestone and Metamorphosed limestone and reddish green shales. The best overall accuracy (∼80%) was achieved by Random Forest (RF) algorithms in the majority of the sixteen tested combination datasets.
Building rooftop classification using random forests for large-scale PV deployment
NASA Astrophysics Data System (ADS)
Assouline, Dan; Mohajeri, Nahid; Scartezzini, Jean-Louis
2017-10-01
Large scale solar Photovoltaic (PV) deployment on existing building rooftops has proven to be one of the most efficient and viable sources of renewable energy in urban areas. As it usually requires a potential analysis over the area of interest, a crucial step is to estimate the geometric characteristics of the building rooftops. In this paper, we introduce a multi-layer machine learning methodology to classify 6 roof types, 9 aspect (azimuth) classes and 5 slope (tilt) classes for all building rooftops in Switzerland, using GIS processing. We train Random Forests (RF), an ensemble learning algorithm, to build the classifiers. We use (2 × 2) m² LiDAR data (considering buildings and vegetation) to extract several rooftop features, and a generalised footprint polygon data to localize buildings. The roof classifier is trained and tested with 1252 labeled roofs from three different urban areas, namely Baden, Luzern, and Winterthur. The results for roof type classification show an average accuracy of 67%. The aspect and slope classifiers are trained and tested with 11449 labeled roofs in the Zurich periphery area. The results for aspect and slope classification show different accuracies depending on the classes: while some classes are well identified, other under-represented classes remain challenging to detect.
Teh, Seng Khoon; Zheng, Wei; Lau, David P; Huang, Zhiwei
2009-06-01
In this work, we evaluated the diagnostic ability of near-infrared (NIR) Raman spectroscopy associated with the ensemble recursive partitioning algorithm based on random forests for identifying cancer from normal tissue in the larynx. A rapid-acquisition NIR Raman system was utilized for tissue Raman measurements at 785 nm excitation, and 50 human laryngeal tissue specimens (20 normal; 30 malignant tumors) were used for NIR Raman studies. The random forests method was introduced to develop effective diagnostic algorithms for classification of Raman spectra of different laryngeal tissues. High-quality Raman spectra in the range of 800-1800 cm⁻¹ can be acquired from laryngeal tissue within 5 seconds. Raman spectra differed significantly between normal and malignant laryngeal tissues. Classification results obtained from the random forests algorithm on tissue Raman spectra yielded a diagnostic sensitivity of 88.0% and specificity of 91.4% for laryngeal malignancy identification. The random forests technique also provided variable importance measures that facilitate correlation of significant Raman spectral features with cancer transformation. This study shows that NIR Raman spectroscopy in conjunction with the random forests algorithm has great potential for the rapid diagnosis and detection of malignant tumors in the larynx.
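For readers unfamiliar with the two diagnostic figures quoted above, the following sketch shows how sensitivity and specificity are computed from predicted and reference labels. The label values and the ten example spectra are hypothetical and do not reproduce the study's data.

```python
def sensitivity_specificity(y_true, y_pred, positive="malignant"):
    """Return (sensitivity, specificity) for a binary classification."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # missed positives
    tn = sum(t != positive and p != positive for t, p in pairs)  # true negatives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false alarms
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical reference labels and classifier outputs for ten spectra.
y_true = ["malignant"] * 5 + ["normal"] * 5
y_pred = ["malignant"] * 4 + ["normal"] * 5 + ["malignant"]

sens, spec = sensitivity_specificity(y_true, y_pred)
print(sens, spec)
```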
The National Visitor Use Monitoring methodology and final results for round 1
S.J. Zarnoch; E.M. White; D.B.K. English; Susan M. Kocis; Ross Arnold
2011-01-01
A nationwide, systematic monitoring process has been developed to provide improved estimates of recreation visitation on National Forest System lands. Methodology is presented to provide estimates of site visits and national forest visits based on an onsite sampling design of site-days and last-exiting recreationists. Stratification of the site days, based on site type...
Forest Resources of East Oklahoma, 2008
Richard A. Harper; Tony G. Johnson
2012-01-01
The Forest Inventory and Analysis Program conducted the seventh survey of east Oklahoma forests. This was the establishment of the annual plot methodology and closeout of the prism remeasurement plots. Forest land area remained stable at 5.7 million acres and covered almost 57 percent of the land area. About 5.1 million acres of forest land was considered timberland...
Survival analysis for a large scale forest health issue: Missouri oak decline
C.W. Woodall; P.L. Grambsch; W. Thomas; W.K. Moser
2005-01-01
Survival analysis methodologies provide novel approaches for forest mortality analysis that may aid in detecting, monitoring, and mitigating of large-scale forest health issues. This study examined survivor analysis for evaluating a regional forest health issue - Missouri oak decline. With a statewide Missouri forest inventory, log-rank tests of the effects of...
Nikolay Strigul; Jean Lienard
2015-01-01
Forest inventory datasets offer unprecedented opportunities to model forest dynamics under evolving environmental conditions but they are analytically challenging due to irregular sampling time intervals of the same plot, across the years. We propose here a novel method to model dynamic changes in forest biomass and basal area using forest inventory data. Our...
Adapting GNU random forest program for Unix and Windows
NASA Astrophysics Data System (ADS)
Jirina, Marcel; Krayem, M. Said; Jirina, Marcel, Jr.
2013-10-01
The Random Forest is a well-known method and also a program for data clustering and classification. Unfortunately, the original Random Forest program is rather difficult to use. Here we describe a new version of this program, originally written in Fortran 77. The modified program in Fortran 95 needs to be compiled only once, and information for different tasks is passed with the help of arguments. The program was tested with 24 data sets from UCI MLR and results are available on the net.
Testing for change in structural elements of forest inventories
Melinda Vokoun; David Wear; Robert Abt
2009-01-01
In this article we develop a methodology to test for changes in the underlying relationships between measures of forest productivity (structural elements) and site characteristics, herein referred to as structural changes, using standard forest inventories. Changes in measures of forest growing stock volume and number of trees for both...
Bühnemann, Claudia; Li, Simon; Yu, Haiyue; Branford White, Harriet; Schäfer, Karl L; Llombart-Bosch, Antonio; Machado, Isidro; Picci, Piero; Hogendoorn, Pancras C W; Athanasou, Nicholas A; Noble, J Alison; Hassan, A Bassim
2014-01-01
Driven by genomic somatic variation, tumour tissues are typically heterogeneous, yet unbiased quantitative methods are rarely used to analyse heterogeneity at the protein level. Motivated by this problem, we developed automated image segmentation of images of multiple biomarkers in Ewing sarcoma to generate distributions of biomarkers between and within tumour cells. We further integrate high dimensional data with patient clinical outcomes utilising random survival forest (RSF) machine learning. Using material from cohorts of genetically diagnosed Ewing sarcoma with EWSR1 chromosomal translocations, confocal images of tissue microarrays were segmented with level sets and watershed algorithms. Each cell nucleus and cytoplasm were identified in relation to DAPI and CD99, respectively, and protein biomarkers (e.g. Ki67, pS6, Foxo3a, EGR1, MAPK) localised relative to nuclear and cytoplasmic regions of each cell in order to generate image feature distributions. The image distribution features were analysed with RSF in relation to known overall patient survival from three separate cohorts (185 informative cases). Variation in pre-analytical processing resulted in elimination of a high number of non-informative images that had poor DAPI localisation or biomarker preservation (67 cases, 36%). The distribution of image features for biomarkers in the remaining high quality material (118 cases, 104 features per case) were analysed by RSF with feature selection, and performance assessed using internal cross-validation, rather than a separate validation cohort. A prognostic classifier for Ewing sarcoma with low cross-validation error rates (0.36) was comprised of multiple features, including the Ki67 proliferative marker and a sub-population of cells with low cytoplasmic/nuclear ratio of CD99. 
Through elimination of bias, the evaluation of high-dimensionality biomarker distribution within cell populations of a tumour using random forest analysis in quality controlled tumour material could be achieved. Such an automated and integrated methodology has potential application in the identification of prognostic classifiers based on tumour cell heterogeneity.
Mapping Deforestation area in North Korea Using Phenology-based Multi-Index and Random Forest
NASA Astrophysics Data System (ADS)
Jin, Y.; Sung, S.; Lee, D. K.; Jeong, S.
2016-12-01
Forest ecosystems provide ecological benefits to both humans and wildlife. Growing global demand for food and fiber is accelerating the pressure that agriculture and logging place on forest ecosystems worldwide. Between 1990 and 2015, North Korea lost almost 40% of its forests to conversion into crop fields for food production and to felling for fuel wood. This led to increased damage from natural disasters, and the country is known to be one of the most forest-degraded areas in the world. The forest landscape in North Korea is complex and heterogeneous; the major landscape types in the forest are hillside farm, unstocked forest, natural forest and plateau vegetation. Remote sensing can be used for forest degradation mapping of a dynamic landscape at a broad scale of detail and spatial distribution. Confusion mostly occurred between hillside farmland and unstocked forest, but also between unstocked forest and forest. Most previous forest degradation studies focused on the classification of broad types, such as deforested area and sand, from the perspective of land cover classification. The objective of this study is to use random forest to map degraded forest in North Korea with a phenology-based vegetation index derived from MODIS products, which capture various environmental factors such as vegetation, soil and water at a regional scale, in order to improve accuracy. The model created by random forest achieved an overall accuracy of 91.44%. The class user's accuracies of hillside farmland and unstocked forest, the two classes that indicate degraded forest, were 97.2% and 84%. Unstocked forest had relatively low user's accuracy due to misclassified hillside farmland and forest samples. The producer's accuracies of hillside farmland and unstocked forest were 85.2% and 93.3%, respectively. In this case hillside farmland had lower producer's accuracy mainly due to confusion with field, unstocked forest and forest. 
Such a classification of degraded forest could supply essential information to decide the priority of forest management and restoration in degraded forest area.
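The overall, user's and producer's accuracies reported above are all derived from a confusion matrix. A minimal sketch of those definitions follows, assuming rows hold reference (ground-truth) classes and columns hold mapped classes; the three-class matrix is hypothetical and is not the study's data.

```python
def accuracy_report(matrix):
    """Return (overall, producers, users) accuracies for a square confusion matrix.

    matrix[i][j] counts samples of reference class i that were mapped as class j.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    diag = [matrix[i][i] for i in range(n)]
    row_sums = [sum(row) for row in matrix]                        # reference totals
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]  # mapped totals
    overall = sum(diag) / total
    producers = [d / r for d, r in zip(diag, row_sums)]  # 1 - omission error
    users = [d / c for d, c in zip(diag, col_sums)]      # 1 - commission error
    return overall, producers, users

# Hypothetical counts for three classes (e.g. hillside farmland,
# unstocked forest, forest).
confusion = [
    [40, 10, 0],
    [5, 45, 0],
    [5, 5, 40],
]
overall, producers, users = accuracy_report(confusion)
print(overall, producers, users)
```

A class can therefore have high user's accuracy but lower producer's accuracy, as reported for hillside farmland: most pixels mapped as that class are correct, yet some true members of the class were mapped as something else.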
Jeffrey T. Walton
2008-01-01
Three machine learning subpixel estimation methods (Cubist, Random Forests, and support vector regression) were applied to estimate urban cover. Urban forest canopy cover and impervious surface cover were estimated from Landsat-7 ETM+ imagery using a higher resolution cover map resampled to 30 m as training and reference data. Three different band combinations (...
Ließ, Mareike; Schmidt, Johannes; Glaser, Bruno
2016-01-01
Tropical forests are significant carbon sinks and their soils' carbon storage potential is immense. However, little is known about the soil organic carbon (SOC) stocks of tropical mountain areas whose complex soil-landscape and difficult accessibility pose a challenge to spatial analysis. The choice of methodology for spatial prediction is of high importance to improve the expected poor model results in case of low predictor-response correlations. Four aspects were considered to improve model performance in predicting SOC stocks of the organic layer of a tropical mountain forest landscape: different spatial predictor settings, predictor selection strategies, various machine learning algorithms and model tuning. Five machine learning algorithms (random forests, artificial neural networks, multivariate adaptive regression splines, boosted regression trees and support vector machines) were trained and tuned to predict SOC stocks from predictors derived from a digital elevation model and satellite image. Topographical predictors were calculated with a GIS search radius of 45 to 615 m. Finally, three predictor selection strategies were applied to the total set of 236 predictors. All machine learning algorithms, including the model tuning and predictor selection, were compared via five repetitions of a tenfold cross-validation. The boosted regression tree algorithm resulted in the overall best model. SOC stocks ranged from 0.2 to 17.7 kg m⁻², displaying a huge variability with diffuse insolation and curvatures of different scale guiding the spatial pattern. Predictor selection and model tuning improved the models' predictive performance in all five machine learning algorithms. The rather low number of selected predictors favours forward compared to backward selection procedures. Choosing predictors by their individual performance was outperformed by the two procedures that accounted for predictor interaction. PMID:27128736
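The comparison scheme mentioned in the abstract, five repetitions of a tenfold cross-validation, can be sketched as an index generator. The sample count, the seed, and the function name are assumptions for illustration; the study's own fold assignment is not described in the abstract.

```python
import random

def repeated_kfold(n_samples, n_folds=10, n_repeats=5, seed=0):
    """Yield (train_indices, test_indices) for repeated k-fold cross-validation."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        order = list(range(n_samples))
        rng.shuffle(order)  # a fresh random partition for every repetition
        folds = [order[i::n_folds] for i in range(n_folds)]
        for k in range(n_folds):
            test = folds[k]
            train = [i for j, fold in enumerate(folds) if j != k for i in fold]
            yield train, test

# Hypothetical data set of 50 sampling sites: 5 x 10 = 50 train/test splits.
splits = list(repeated_kfold(n_samples=50))
print(len(splits))
```

Every candidate model (algorithm, tuning setting, predictor subset) would be fitted on each training split and scored on the matching test split, so all candidates are compared on the same 50 partitions.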
Su, Ruiliang; Chen, Xiang; Cao, Shuai; Zhang, Xu
2016-01-14
Sign language recognition (SLR) has been widely used for communication amongst the hearing-impaired and non-verbal community. This paper proposes an accurate and robust SLR framework using an improved decision tree as the base classifier of random forests. This framework was used to recognize Chinese sign language subwords using recordings from a pair of portable devices worn on both arms consisting of accelerometers (ACC) and surface electromyography (sEMG) sensors. The experimental results demonstrated the validity of the proposed random forest-based method for recognition of Chinese sign language (CSL) subwords. With the proposed method, 98.25% average accuracy was obtained for the classification of a list of 121 frequently used CSL subwords. Moreover, the random forests method demonstrated a superior performance in resisting the impact of bad training samples. When the proportion of bad samples in the training set reached 50%, the recognition error rate of the random forest-based method was only 10.67%, while that of a single decision tree adopted in our previous work was almost 27.5%. Our study offers a practical way of realizing a robust and wearable EMG-ACC-based SLR system.
Pseudo CT estimation from MRI using patch-based random forest
NASA Astrophysics Data System (ADS)
Yang, Xiaofeng; Lei, Yang; Shu, Hui-Kuo; Rossi, Peter; Mao, Hui; Shim, Hyunsuk; Curran, Walter J.; Liu, Tian
2017-02-01
Recently, MR simulators have gained popularity because they avoid the unnecessary radiation exposure of the CT simulators used in radiation therapy planning. We propose a method for pseudo CT estimation from MR images based on a patch-based random forest. Patient-specific anatomical features are extracted from the aligned training images and adopted as signatures for each voxel. The most robust and informative features are identified using feature selection to train the random forest. The well-trained random forest is used to predict the pseudo CT of a new patient. This prediction technique was tested with human brain images and the prediction accuracy was assessed using the original CT images. Peak signal-to-noise ratio (PSNR) and feature similarity (FSIM) indexes were used to quantify the differences between the pseudo and original CT images. The experimental results showed the proposed method could accurately generate pseudo CT images from MR images. In summary, we have developed a new pseudo CT prediction method based on patch-based random forest, demonstrated its clinical feasibility, and validated its prediction accuracy. This pseudo CT prediction technique could be a useful tool for MRI-based radiation treatment planning and attenuation correction in a PET/MRI scanner.
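Of the two image-quality indexes named above, PSNR has a simple closed form: PSNR = 10 log10(MAX² / MSE), where MSE is the mean squared voxel difference. A minimal sketch follows; the voxel values and the choice of 255 as the maximum intensity are hypothetical, since the abstract does not state the intensity range used.

```python
import math

def psnr(reference, estimate, max_value):
    """Peak signal-to-noise ratio in dB between two equally sized voxel lists."""
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

# Hypothetical flattened voxel intensities of an original and a pseudo CT.
original_ct = [0, 100, 200, 255]
pseudo_ct = [0, 110, 190, 255]

value = psnr(original_ct, pseudo_ct, max_value=255)
print(round(value, 2))
```

Higher values indicate a closer match; FSIM is a structural measure and is not covered by this sketch.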
Economic vulnerability of timber resources to forest fires
Francisco Rodriguez y Silva; Juan Ramon Molina; Armando Gonzalez-Caban; Miguel Angel Herrera Machuca
2012-01-01
The temporal-spatial planning of activities for a territorial fire management program requires knowing the value of forest ecosystems. In this paper we extend to and apply the economic valuation principle to the concept of economic vulnerability and present a methodology for the economic valuation of the forest production ecosystems. The forest vulnerability is...
Rapid forest change in the interior west presents analysis opportunities and challenges
John D. Shaw
2007-01-01
A recent drought has caused compositional and structural changes in Interior West forests. Recent periodic and annual inventory data provide an opportunity to analyze forest changes on a grand scale. This "natural experiment" also provides opportunities to test the effectiveness of Forest Inventory and Analysis (FIA) methodologies. It also presents some...
Detecting targets hidden in random forests
NASA Astrophysics Data System (ADS)
Kouritzin, Michael A.; Luo, Dandan; Newton, Fraser; Wu, Biao
2009-05-01
Military tanks, cargo or troop carriers, missile carriers or rocket launchers often hide themselves from detection in the forests. This plagues the detection problem of locating these hidden targets. An electro-optic camera mounted on a surveillance aircraft or unmanned aerial vehicle is used to capture the images of the forests with possible hidden targets, e.g., rocket launchers. We consider random forests of longitudinal and latitudinal correlations. Specifically, foliage coverage is encoded with a binary representation (i.e., foliage or no foliage), and is correlated in adjacent regions. We address the detection problem of camouflaged targets hidden in random forests by building memory into the observations. In particular, we propose an efficient algorithm to generate random forests, ground, and camouflage of hidden targets with two dimensional correlations. The observations are a sequence of snapshots consisting of foliage-obscured ground or target. Theoretically, detection is possible because there are subtle differences in the correlations of the ground and camouflage of the rocket launcher. However, these differences are well beyond human perception. To detect the presence of hidden targets automatically, we develop a Markov representation for these sequences and modify the classical filtering equations to allow the Markov chain observation. Particle filters are used to estimate the position of the targets in combination with a novel random weighting technique. Furthermore, we give positive proof-of-concept simulations.
Screening large-scale association study data: exploiting interactions using random forests.
Lunetta, Kathryn L; Hayward, L Brooke; Segal, Jonathan; Van Eerdewegh, Paul
2004-12-10
Genome-wide association studies for complex diseases will produce genotypes on hundreds of thousands of single nucleotide polymorphisms (SNPs). A logical first approach to dealing with massive numbers of SNPs is to use some test to screen the SNPs, retaining only those that meet some criterion for further study. For example, SNPs can be ranked by p-value, and those with the lowest p-values retained. When SNPs have large interaction effects but small marginal effects in a population, they are unlikely to be retained when univariate tests are used for screening. However, model-based screens that pre-specify interactions are impractical for data sets with thousands of SNPs. Random forest analysis is an alternative method that produces a single measure of importance for each predictor variable that takes into account interactions among variables without requiring model specification. Interactions increase the importance for the individual interacting variables, making them more likely to be given high importance relative to other variables. We test the performance of random forests as a screening procedure to identify small numbers of risk-associated SNPs from among large numbers of unassociated SNPs using complex disease models with up to 32 loci, incorporating both genetic heterogeneity and multi-locus interaction. Keeping other factors constant, if risk SNPs interact, the random forest importance measure significantly outperforms the Fisher Exact test as a screening tool. As the number of interacting SNPs increases, the improvement in performance of random forest analysis relative to Fisher Exact test for screening also increases. Random forests perform similarly to the univariate Fisher Exact test as a screening tool when SNPs in the analysis do not interact. 
In the context of large-scale genetic association studies where unknown interactions exist among true risk-associated SNPs or SNPs and environmental covariates, screening SNPs using random forest analyses can significantly reduce the number of SNPs that need to be retained for further study compared to standard univariate screening methods.
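The univariate baseline in this comparison, ranking SNPs by Fisher exact p-value and keeping the lowest, can be sketched with the standard library. The 2x2 table layout (rows for carrier status, columns for case and control) and the three example SNPs are assumptions for illustration only.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c

    def weight(k):
        # Unnormalised hypergeometric probability of k in the top-left cell.
        return comb(row1, k) * comb(row2, col1 - k)

    k_min, k_max = max(0, col1 - row2), min(row1, col1)
    observed = weight(a)
    # Sum every table that is no more probable than the observed one.
    extreme = sum(w for w in map(weight, range(k_min, k_max + 1)) if w <= observed)
    return extreme / comb(row1 + row2, col1)

# Hypothetical counts per SNP:
# (carrier cases, carrier controls, non-carrier cases, non-carrier controls).
tables = {
    "snp_a": (3, 1, 1, 3),
    "snp_b": (10, 0, 0, 10),
    "snp_c": (5, 5, 5, 5),
}
p_values = {name: fisher_exact_two_sided(*t) for name, t in tables.items()}
ranked = sorted(p_values, key=p_values.get)  # lowest p-value first
retained = ranked[:1]                        # keep the top-ranked SNP only
print(ranked, retained)
```

Because each SNP is tested on its own, a screen like this cannot credit a SNP whose effect appears only in combination with another, which is the case where the abstract reports random forest importance doing better.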
Monitoring the effects of extreme climate disturbances on forest health in the northeast U.S.
Allan N.D. Auclair; Warren E. Heilman; Peter Busalacchi
2002-01-01
No methodology has been developed to date to predict when a forest population is at risk to specific climate and air pollution stressors. Yet, this information is important to natural resource managers who need frequent, updated assessments of forest health upon which to base management decisions and respond to public concerns on forest health. The USDA Forest Service...
FORCARB2: An updated version of the U.S. Forest Carbon Budget Model
Linda S. Heath; Michael C. Nichols; James E. Smith; John R. Mills
2010-01-01
FORCARB2, an updated version of the U.S. FORest CARBon Budget Model (FORCARB), produces estimates of carbon stocks and stock changes for forest ecosystems and forest products at 5-year intervals. FORCARB2 includes a new methodology for carbon in harvested wood products, updated initial inventory data, a revised algorithm for dead wood, and now includes public forest...
Toby Thaler; Gwen Griffith; Nancy Gilliam
2014-01-01
Forest-based ecosystem services are at risk from human-caused stressors, including climate change. Improving governance and management of forests to reduce impacts and increase community resilience to all stressors is the objective of forest-related climate change adaptation. The Model Forest Policy Program (MFPP) has applied one method designed to meet this objective...
NASA Astrophysics Data System (ADS)
Poulter, B.; Ciais, P.; Joetzjer, E.; Maignan, F.; Luyssaert, S.; Barichivich, J.
2015-12-01
Accurately estimating forest biomass and forest carbon dynamics requires new integrated remote sensing, forest inventory, and carbon cycle modeling approaches. Presently, there is an increasing and urgent need to reduce forest biomass uncertainty in order to meet the requirements of carbon mitigation treaties, such as Reducing Emissions from Deforestation and forest Degradation (REDD+). Here we describe a new parameterization and assimilation methodology used to estimate tropical forest biomass using the ORCHIDEE-CAN dynamic global vegetation model. ORCHIDEE-CAN simulates carbon uptake and allocation to individual trees using a mechanistic representation of photosynthesis, respiration and other first-order processes. The model is first parameterized using forest inventory data to constrain background mortality rates, i.e., self-thinning, and productivity. Satellite remote sensing data for forest structure, i.e., canopy height, is used to constrain simulated forest stand conditions using a look-up table approach to match canopy height distributions. The resulting forest biomass estimates are provided for spatial grids that match REDD+ project boundaries and aim to provide carbon estimates for the criteria described in the IPCC Good Practice Guidelines Tier 3 category. With the increasing availability of forest structure variables derived from high-resolution LIDAR, RADAR, and optical imagery, new methodologies and applications with process-based carbon cycle models are becoming more readily available to inform land management.
NASA Astrophysics Data System (ADS)
Arevalo, P. A.; Olofsson, P.; Woodcock, C. E.
2017-12-01
Unbiased estimation of the areas of conversion between land categories ("activity data") and their uncertainty is crucial for providing more robust calculations of carbon emissions to the atmosphere, as well as their removals. This is particularly important for the REDD+ mechanism of UNFCCC where an economic compensation is tied to the magnitude and direction of such fluxes. Dense time series of Landsat data and statistical protocols are becoming an integral part of forest monitoring efforts, but there are relatively few studies in the tropics focused on using these methods to advance operational MRV systems (Monitoring, Reporting and Verification). We present the results of a prototype methodology for continuous monitoring and unbiased estimation of activity data that is compliant with the IPCC Approach 3 for representation of land. We used a break detection algorithm (Continuous Change Detection and Classification, CCDC) to fit pixel-level temporal segments to time series of Landsat data in the Colombian Amazon. The segments were classified using a Random Forest classifier to obtain annual maps of land categories between 2001 and 2016. Using these maps, a biannual stratified sampling approach was implemented and unbiased stratified estimators constructed to calculate area estimates with confidence intervals for each of the stable and change classes. Our results provide evidence of a decrease in primary forest as a result of conversion to pastures, as well as increase in secondary forest as pastures are abandoned and the forest allowed to regenerate. Estimating areas of other land transitions proved challenging because of their very small mapped areas compared to stable classes like forest, which corresponds to almost 90% of the study area. Implications on remote sensing data processing, sample allocation and uncertainty reduction are also discussed.
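The stratified area estimation with confidence intervals described above can be sketched with the usual textbook stratified estimator of a proportion. This is an illustrative sketch only: the strata, weights and sample counts are hypothetical, and the authors' estimators may differ in detail.

```python
import math

def stratified_area_proportion(strata):
    """Estimate the area proportion of a class from a stratified sample.

    strata: list of (weight, n_sampled, n_in_class) tuples, one per map
    stratum, where weight is the stratum's share of the mapped area
    (weights sum to 1). Returns the estimate and the 95% CI half-width.
    """
    p_hat = sum(w * k / n for w, n, k in strata)
    variance = sum(
        w ** 2 * (k / n) * (1 - k / n) / (n - 1) for w, n, k in strata
    )
    return p_hat, 1.96 * math.sqrt(variance)

# Hypothetical sample: (area weight, sample units, units in target class).
# The first stratum mimics a large stable class covering 90% of the map.
strata = [(0.90, 100, 2), (0.08, 50, 10), (0.02, 50, 40)]
p_hat, half_width = stratified_area_proportion(strata)
print(f"area proportion = {p_hat:.3f} +/- {half_width:.3f}")
```

Multiplying the estimated proportion by the total mapped area gives an area estimate in physical units. The sketch also shows why a few sample units of a rare class found in a very large stratum can dominate the uncertainty.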
Application of lifting wavelet and random forest in compound fault diagnosis of gearbox
NASA Astrophysics Data System (ADS)
Chen, Tang; Cui, Yulian; Feng, Fuzhou; Wu, Chunzhi
2018-03-01
To address the weak compound fault characteristic signals of an armored-vehicle gearbox and the resulting difficulty in identifying fault types, a fault diagnosis method based on the lifting wavelet and random forest is proposed. First, the method uses the lifting wavelet transform to decompose the original vibration signal into multiple layers and reconstructs the low-frequency and high-frequency components obtained from the decomposition to get multiple component signals. Then, time-domain feature parameters are computed for each component signal to form multiple feature vectors, which are input into the random forest pattern recognition classifier to determine the compound fault type. Finally, the method is verified on a variety of compound fault data from a gearbox fault simulation test platform; the results show that the recognition accuracy of the fault diagnosis method combining the lifting wavelet and the random forest is up to 99.99%.
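The abstract does not list which time-domain feature parameters were used. As a sketch under that caveat, the following computes a few features commonly used in vibration analysis (RMS, peak, crest factor, kurtosis) for one component signal; the feature set and the function name are assumptions.

```python
import math

def time_domain_features(signal):
    """Return a small, assumed set of time-domain features for one signal."""
    n = len(signal)
    mean = sum(signal) / n
    rms = math.sqrt(sum(x * x for x in signal) / n)
    variance = sum((x - mean) ** 2 for x in signal) / n
    peak = max(abs(x) for x in signal)
    # Kurtosis as the fourth central moment divided by the squared variance.
    kurtosis = sum((x - mean) ** 4 for x in signal) / (n * variance ** 2)
    return {
        "rms": rms,
        "peak": peak,
        "crest_factor": peak / rms,
        "kurtosis": kurtosis,
    }

# Hypothetical component signal; in practice there would be one feature
# vector per reconstructed wavelet component.
features = time_domain_features([1.0, -1.0, 1.0, -1.0])
print(features)
```

Concatenating such dictionaries across all component signals would give the feature vector passed to a classifier.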
Prediction of Baseflow Index of Catchments using Machine Learning Algorithms
NASA Astrophysics Data System (ADS)
Yadav, B.; Hatfield, K.
2017-12-01
We present the results of eight machine learning techniques for predicting the baseflow index (BFI) of ungauged basins using a surrogate of catchment scale climate and physiographic data. The tested algorithms include ordinary least squares, ridge regression, least absolute shrinkage and selection operator (lasso), elastic net, support vector machine, gradient boosted regression trees, random forests, and extremely randomized trees. Our work seeks to identify the dominant controls of BFI that can be readily obtained from ancillary geospatial databases and remote sensing measurements, such that the developed techniques can be extended to ungauged catchments. More than 800 gauged catchments spanning the continental United States were selected to develop the general methodology. The BFI calculation was based on the baseflow separated from the daily streamflow hydrograph using the HYSEP filter. The surrogate catchment attributes were compiled from multiple sources including digital elevation models, soil, land use, and climate data, and other publicly available ancillary and geospatial data. 80% of the catchments were used to train the ML algorithms, and the remaining 20% of the catchments were used as an independent test set to measure the generalization performance of the fitted models. A k-fold cross-validation using exhaustive grid search was used to fit the hyperparameters of each model. Initial model development was based on 19 independent variables, but after variable selection and feature ranking, we generated revised sparse models of BFI prediction that are based on only six catchment attributes. These key predictive variables, selected after careful evaluation of the bias-variance tradeoff, include average catchment elevation, slope, fraction of sand, permeability, temperature, and precipitation.
The most promising algorithms exceeding an accuracy score (r-square) of 0.7 on test data include support vector machine, gradient boosted regression trees, random forests, and extremely randomized trees. Considering both the accuracy and the computational complexity of these algorithms, we identify the extremely randomized trees as the best performing algorithm for BFI prediction in ungauged basins.
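The evaluation protocol described above (an 80/20 train/test split, k-fold cross-validation over a hyperparameter grid, and an r-square score on the held-out set) can be sketched in a few lines. This is a simplified illustration, not the study's pipeline: the data are synthetic, the model is a one-variable ridge regression without intercept, and the folds are not shuffled.

```python
def fit_ridge_slope(xs, ys, lam):
    """Closed-form slope of a no-intercept ridge regression y ~ b * x."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

def r_square(ys, preds):
    """Coefficient of determination of predictions against observations."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

def cv_score(xs, ys, lam, k):
    """Mean validation r-square over k interleaved folds."""
    scores = []
    for fold in range(k):
        train = [i for i in range(len(xs)) if i % k != fold]
        valid = [i for i in range(len(xs)) if i % k == fold]
        b = fit_ridge_slope([xs[i] for i in train], [ys[i] for i in train], lam)
        scores.append(r_square([ys[i] for i in valid], [b * xs[i] for i in valid]))
    return sum(scores) / k

# Synthetic stand-in for catchment data: 20 samples with y = 2x exactly.
xs = [float(i) for i in range(1, 21)]
ys = [2.0 * x for x in xs]

# 80% of the samples for training, the remaining 20% as the test set.
split = int(0.8 * len(xs))
x_train, y_train = xs[:split], ys[:split]
x_test, y_test = xs[split:], ys[split:]

# Exhaustive grid search: keep the penalty with the best cross-validated score.
grid = [0.0, 1.0, 10.0, 100.0]
best_lam = max(grid, key=lambda lam: cv_score(x_train, y_train, lam, k=4))

# Refit on the full training set and score once on the untouched test set.
slope = fit_ridge_slope(x_train, y_train, best_lam)
test_r2 = r_square(y_test, [slope * x for x in x_test])
print(best_lam, test_r2)
```

The test set is used only once, after the hyperparameter has been chosen, so the reported score is not influenced by the grid search.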
State-of-the-art methodology of forest inventory: a symposium proceedings.
Vernon J. LaBau; Tiberius Cunia
1990-01-01
The state-of-the-art of forest inventory methodology, being closely integrated with the fast-moving, high technology computer world, has been changing at a rapid pace over the past decade. Several successful conferences were held during the 1980s with the goal and purpose of staying abreast of such change. This symposium was conceived, not just with the idea of helping...
NASA Astrophysics Data System (ADS)
Wu, J.; Yao, W.; Zhang, J.; Li, Y.
2018-04-01
Labeling 3D point cloud data with traditional supervised learning methods requires a considerable number of labelled samples, the collection of which is costly and time-consuming. This work focuses on adopting the domain adaptation concept to transfer existing trained random forest classifiers (based on the source domain) to new data scenes (the target domain), with the aim of reducing the dependence of accurate 3D semantic labeling in point clouds on training samples from the new data scene. First, two random forest classifiers were trained with existing samples previously collected for other data. They differed from each other in using two different decision tree construction algorithms: C4.5 with information gain ratio and CART with Gini index. Second, four random forest classifiers adapted to the target domain were derived by transferring each tree in the source random forest models with two types of operations: structure expansion and reduction (SER) and structure transfer (STRUT). Finally, points in the target domain were labelled by fusing the four newly derived random forest classifiers using a weights-of-evidence-based fusion model. To validate our method, experimental analysis was conducted using 3 datasets: one was used as the source domain data (Vaihingen data for 3D Semantic Labelling); the other two were used as the target domain data from two cities in China (Jinmen city and Dunhuang city). Overall accuracies of 85.5 % and 83.3 % for 3D labelling were achieved for Jinmen city and Dunhuang city data respectively, with only 1/3 newly labelled samples compared to the cases without domain adaptation.
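The two split criteria named above, information gain ratio (C4.5) and the Gini index (CART), can be compared on a single candidate split. The sketch below uses the standard definitions; the class labels and the split are hypothetical and unrelated to the datasets in the study.

```python
import math
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gini_decrease(parent, children):
    """CART-style criterion: impurity decrease produced by the split."""
    n = len(parent)
    return gini(parent) - sum(len(ch) / n * gini(ch) for ch in children)

def gain_ratio(parent, children):
    """C4.5-style criterion: information gain divided by split information."""
    n = len(parent)
    gain = entropy(parent) - sum(len(ch) / n * entropy(ch) for ch in children)
    split_info = -sum(len(ch) / n * math.log2(len(ch) / n) for ch in children)
    return gain / split_info

# Hypothetical node with two point classes and a split that separates them.
parent = ["roof"] * 4 + ["tree"] * 4
children = [["roof"] * 4, ["tree"] * 4]
print(gini_decrease(parent, children), gain_ratio(parent, children))
```

Both criteria reward this perfect split, but they can rank imperfect candidate splits differently, which is one reason forests built with the two algorithms are not identical.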
Hayati, Elyas; Majnounian, Baris; Abdi, Ehsan; Sessions, John; Makhdoum, Majid
2013-02-01
Changes in forest landscapes resulting from road construction have increased remarkably in the last few years. On the other hand, the sustainable management of forest resources can only be achieved through a well-organized road network. In order to minimize the environmental impacts of forest roads, forest road managers must design the road network efficiently and in an environmentally sound manner. Efficient planning methodologies can assist forest road managers in considering the technical, economic, and environmental factors that affect forest road planning. This paper describes a three-stage methodology using the Delphi method for selecting the important criteria, the Analytic Hierarchy Process for obtaining the relative importance of the criteria, and finally, a spatial multi-criteria evaluation in a geographic information system (GIS) environment for identifying the lowest-impact road network alternative. Results of the Delphi method revealed that ground slope, lithology, distance from stream network, distance from faults, landslide susceptibility, erosion susceptibility, geology, and soil texture are the most important criteria for forest road planning in the study area. The suitability map for road planning was then obtained by combining the fuzzy map layers of these criteria with respect to their weights. Nine road network alternatives were designed using PEGGER, an ArcView GIS extension, and finally, their values were extracted from the suitability map. Results showed that the methodology was useful for identifying roads that met environmental and cost considerations. Based on this work, we suggest that future work in forest road planning using multi-criteria evaluation and decision making be considered in other regions, where the road planning criteria identified in this study may be useful.
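The last two stages of the methodology, deriving criterion weights with the Analytic Hierarchy Process and combining fuzzy criterion layers by those weights, can be sketched as follows. The pairwise comparison values and cell scores are hypothetical, only three of the criteria are shown, and the row geometric mean is used as a common approximation of the AHP principal eigenvector.

```python
import math

def ahp_weights(matrix):
    """Approximate AHP priority weights by normalized row geometric means."""
    geometric_means = [math.prod(row) ** (1 / len(row)) for row in matrix]
    total = sum(geometric_means)
    return [g / total for g in geometric_means]

def suitability(scores, weights):
    """Weighted linear combination of fuzzy criterion scores for one cell."""
    return sum(s * w for s, w in zip(scores, weights))

# Hypothetical pairwise comparisons among three criteria:
# ground slope, lithology, distance from stream network.
pairwise = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
weights = ahp_weights(pairwise)

# Hypothetical fuzzy scores (0 to 1) of one raster cell for the same criteria.
cell_score = suitability([0.8, 0.5, 0.2], weights)
print(weights, cell_score)
```

Applying `suitability` to every cell would give a suitability map, from which values along each candidate road alignment could then be extracted and compared.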
Carlos Alberto Silva; Carine Klauberg; Andrew Thomas Hudak; Lee Alexander Vierling; Wan Shafrina Wan Mohd Jaafar; Midhun Mohan; Mariano Garcia; Antonio Ferraz; Adrian Cardil; Sassan Saatchi
2017-01-01
Improvements in the management of pine plantations result in multiple industrial and environmental benefits. Remote sensing techniques can dramatically increase the efficiency of plantation management by reducing or replacing time-consuming field sampling. We tested the utility and accuracy of combining field and airborne lidar data with Random Forest, a supervised...
Nonmarket Economic Impacts of Forest Insect Pests: A Literature Review
Randall S. Rosenberger; Eric L. Smith
1997-01-01
This report summarizes the results of research on the nonmarket economic impacts of forest insect pests. The majority of the research reports are journal articles or fulfillment of three USDA Forest Service research contracts. This report also reviews the foundations for methodologies used and classifies the forest insect pests studied, the regions in which research...
Practicalities of methodologies in monitoring forest degradation in the tropics
Yoshiyuki Kiyono
2013-01-01
Conversion of natural forest to agricultural land is one of the most important forms of land-use change affecting both carbon stock and biodiversity. When the agricultural land contains trees, e.g. fallow-land forest of slash-and-burn agriculture, the conversion can be categorized into forest degradation when the forest definition covers such vegetation. One practical...
Predicting bird habitat quality from a geospatial analysis of FIA data
John M. Tirpak; D. Todd Jones-Farrand; Frank R., III Thompson; Daniel J. Twedt; Mark D. Nelson; William B., III Uihlein
2009-01-01
The ability to assess the influence of site-scale forest structure on avian habitat suitability at an ecoregional scale remains a major methodological constraint to effective biological planning for forest land birds in North America. We evaluated the feasibility of using forest inventory and analysis (FIA) data to define vegetation structure within forest patches,...
Uncertainty in Random Forests: What does it mean in a spatial context?
NASA Astrophysics Data System (ADS)
Klump, Jens; Fouedjio, Francky
2017-04-01
Geochemical surveys are an important part of exploration for mineral resources and in environmental studies. The samples and chemical analyses are often laborious and difficult to obtain and therefore come at a high cost. As a consequence, these surveys are characterised by datasets with large numbers of variables but relatively few data points when compared to conventional big data problems. With more remote sensing platforms and sensor networks being deployed, large volumes of auxiliary data of the surveyed areas are becoming available. The use of these auxiliary data has the potential to improve the prediction of chemical element concentrations over the whole study area. Kriging is a well established geostatistical method for the prediction of spatial data but requires significant pre-processing and makes some basic assumptions about the underlying distribution of the data. Some machine learning algorithms, on the other hand, may require less data pre-processing and are non-parametric. In this study we used a dataset provided by Kirkwood et al. [1] to explore the potential use of Random Forest in geochemical mapping. We chose Random Forest because it is a well understood machine learning method and has the advantage that it provides us with a measure of uncertainty. By comparing Random Forest to Kriging we found that both methods produced comparable maps of estimated values for our variables of interest. Kriging outperformed Random Forest for variables of interest with relatively strong spatial correlation. The measure of uncertainty provided by Random Forest seems to be quite different to the measure of uncertainty provided by Kriging. In particular, the lack of spatial context can give misleading results in areas without ground truth data. 
In conclusion, our preliminary results show that the model driven approach in geostatistics gives us more reliable estimates for our target variables than Random Forest for variables with relatively strong spatial correlation. However, in cases of weak spatial correlation Random Forest, as a nonparametric method, may give the better results once we have a better understanding of the meaning of its uncertainty measures in a spatial context. References [1] Kirkwood, C., M. Cave, D. Beamish, S. Grebby, and A. Ferreira (2016), A machine learning approach to geochemical mapping, Journal of Geochemical Exploration, 163, 28-40, doi:10.1016/j.gexplo.2016.05.003.
The Greek National Observatory of Forest Fires (NOFFi)
NASA Astrophysics Data System (ADS)
Tompoulidou, Maria; Stefanidou, Alexandra; Grigoriadis, Dionysios; Dragozi, Eleni; Stavrakoudis, Dimitris; Gitas, Ioannis Z.
2016-08-01
Efficient forest fire management is a key element for alleviating the catastrophic impacts of wildfires. Overall, the effective response to fire events necessitates adequate planning and preparedness before the start of the fire season, as well as quantifying the environmental impacts in case of wildfires. Moreover, the estimation of fire danger provides crucial information required for the optimal allocation and distribution of the available resources. The Greek National Observatory of Forest Fires (NOFFi)—established by the Greek Forestry Service in collaboration with the Laboratory of Forest Management and Remote Sensing of the Aristotle University of Thessaloniki and the International Balkan Center—aims to develop a series of modern products and services for supporting the efficient forest fire prevention management in Greece and the Balkan region, as well as to stimulate the development of transnational fire prevention and impacts mitigation policies. More specifically, NOFFi provides three main fire-related products and services: a) a remote sensing-based fuel type mapping methodology, b) a semi-automatic burned area mapping service, and c) a dynamically updatable fire danger index providing mid- to long-term predictions. The fuel type mapping methodology was developed and applied across the country, following an object-oriented approach and using Landsat 8 OLI satellite imagery. The results showcase the effectiveness of the generated methodology in obtaining highly accurate fuel type maps on a national level. The burned area mapping methodology was developed as a semi-automatic object-based classification process, carefully crafted to minimize user interaction and, hence, be easily applicable on a near real-time operational level as well as for mapping historical events. 
NOFFi's products can be visualized through the interactive Fire Forest portal, which allows the involvement and awareness of the relevant stakeholders via the Public Participation GIS (PPGIS) tool.
Correcting Classifiers for Sample Selection Bias in Two-Phase Case-Control Studies
Theis, Fabian J.
2017-01-01
Epidemiological studies often utilize stratified data in which rare outcomes or exposures are artificially enriched. This design can increase precision in association tests but distorts predictions when applying classifiers on nonstratified data. Several methods correct for this so-called sample selection bias, but their performance remains unclear especially for machine learning classifiers. With an emphasis on two-phase case-control studies, we aim to assess which corrections to perform in which setting and to obtain methods suitable for machine learning techniques, especially the random forest. We propose two new resampling-based methods to resemble the original data and covariance structure: stochastic inverse-probability oversampling and parametric inverse-probability bagging. We compare all techniques for the random forest and other classifiers, both theoretically and on simulated and real data. Empirical results show that the random forest profits from only the parametric inverse-probability bagging proposed by us. For other classifiers, correction is mostly advantageous, and methods perform uniformly. We discuss consequences of inappropriate distribution assumptions and reason for different behaviors between the random forest and other classifiers. In conclusion, we provide guidance for choosing correction methods when training classifiers on biased samples. For random forests, our method outperforms state-of-the-art procedures if distribution assumptions are roughly fulfilled. We provide our implementation in the R package sambia. PMID:29312464
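The general idea behind inverse-probability resampling can be sketched as follows: records from strata that were under-sampled are drawn more often, so that the resampled data resemble the unstratified population. This is a generic illustration with hypothetical selection probabilities; it is not the stochastic inverse-probability oversampling or parametric inverse-probability bagging procedures implemented in sambia.

```python
import random

def inverse_probability_resample(records, selection_prob, size, seed=0):
    """Resample records with weights equal to 1 / P(selected | stratum)."""
    rng = random.Random(seed)
    weights = [1.0 / selection_prob[r["stratum"]] for r in records]
    return rng.choices(records, weights=weights, k=size)

# Hypothetical case-control sample: cases were enriched (selected with
# probability 0.5) while controls were selected with probability 0.05.
selection_prob = {"case": 0.5, "control": 0.05}
sample = [{"stratum": "case"}] * 50 + [{"stratum": "control"}] * 50

resampled = inverse_probability_resample(sample, selection_prob, size=10000)
case_fraction = sum(r["stratum"] == "case" for r in resampled) / len(resampled)
print(case_fraction)  # close to the population share of cases, about 1/11
```

In the enriched sample half of the records are cases, whereas after resampling the case fraction is near the value implied by the selection probabilities. A classifier trained on the resampled data therefore sees class proportions closer to those of the target population.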
Esmaily, Habibollah; Tayefi, Maryam; Doosti, Hassan; Ghayour-Mobarhan, Majid; Nezami, Hossein; Amirabadizadeh, Alireza
2018-04-24
We aimed to identify the risk factors associated with type 2 diabetes mellitus (T2DM) using a data mining approach, applying decision tree and random forest techniques to data from the Mashhad Stroke and Heart Atherosclerotic Disorders (MASHAD) Study program. A cross-sectional study. The MASHAD study started in 2010 and will continue until 2020. Two data mining tools, namely decision trees and random forests, are used for predicting T2DM when some other characteristics are observed on 9528 subjects recruited from the MASHAD database. This paper makes a comparison between these two models in terms of accuracy, sensitivity, specificity and the area under the ROC curve. The prevalence rate of T2DM was 14% among these subjects. The decision tree model has 64.9% accuracy, 64.5% sensitivity, 66.8% specificity, and an area under the ROC curve of 68.6%, while the random forest model has 71.1% accuracy, 71.3% sensitivity, 69.9% specificity, and an area under the ROC curve of 77.3%. The random forest model, when used with demographic, clinical, anthropometric and biochemical measurements, can provide a simple tool to identify associated risk factors for type 2 diabetes. Such identification can be of substantial use in managing health policy to reduce the number of subjects with T2DM.
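Accuracy, sensitivity and specificity, the measures used above to compare the two models, all follow from the four cells of a confusion matrix. The counts in this sketch are hypothetical and are not taken from the MASHAD data.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, sensitivity and specificity from confusion counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),  # share of true cases that are detected
        "specificity": tn / (tn + fp),  # share of non-cases correctly cleared
    }

# Hypothetical counts: 100 subjects with the condition, 300 without.
metrics = classification_metrics(tp=70, fp=30, tn=270, fn=30)
print(metrics)
```

With a prevalence well below 50%, accuracy alone can look favorable for a model that misses many cases, which is why sensitivity and specificity are reported alongside it.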
Applications of random forest feature selection for fine-scale genetic population assignment.
Sylvester, Emma V A; Bentzen, Paul; Bradbury, Ian R; Clément, Marie; Pearce, Jon; Horne, John; Beiko, Robert G
2018-02-01
Genetic population assignment used to inform wildlife management and conservation efforts requires panels of highly informative genetic markers and sensitive assignment tests. We explored the utility of machine-learning algorithms (random forest, regularized random forest and guided regularized random forest) compared with F_ST ranking for selection of single nucleotide polymorphisms (SNP) for fine-scale population assignment. We applied these methods to an unpublished SNP data set for Atlantic salmon (Salmo salar) and a published SNP data set for Alaskan Chinook salmon (Oncorhynchus tshawytscha). In each species, we identified the minimum panel size required to obtain a self-assignment accuracy of at least 90% using each method to create panels of 50-700 markers. Panels of SNPs identified using random forest-based methods performed up to 7.8 and 11.2 percentage points better than F_ST-selected panels of similar size for the Atlantic salmon and Chinook salmon data, respectively. Self-assignment accuracy ≥90% was obtained with panels of 670 and 384 SNPs for each data set, respectively, a level of accuracy never reached for these species using F_ST-selected panels. Our results demonstrate a role for machine-learning approaches in marker selection across large genomic data sets to improve assignment for management and conservation of exploited populations.
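The baseline that the random forest methods are compared against, ranking SNPs by their differentiation between populations, can be sketched with a simple heterozygosity-based estimate. This is a simplified illustration assuming biallelic SNPs, equal-sized populations and known allele frequencies; the SNP names and frequencies are hypothetical, and the estimator used in the study may differ.

```python
def fst(freqs):
    """Heterozygosity-based differentiation for one biallelic SNP.

    freqs: allele frequency of the same allele in each population.
    Computed as (H_T - H_S) / H_T, with H_T the expected heterozygosity
    of the pooled populations and H_S the mean within-population value.
    """
    p_bar = sum(freqs) / len(freqs)
    h_t = 2 * p_bar * (1 - p_bar)
    h_s = sum(2 * p * (1 - p) for p in freqs) / len(freqs)
    return (h_t - h_s) / h_t if h_t > 0 else 0.0

# Hypothetical allele frequencies in two populations for three SNPs.
snps = {"snp_a": [0.9, 0.1], "snp_b": [0.5, 0.5], "snp_c": [0.6, 0.4]}

# Rank SNPs from most to least differentiated; a panel keeps the top ones.
ranked = sorted(snps, key=lambda name: fst(snps[name]), reverse=True)
print(ranked)
```

Such a ranking scores each SNP on its own, whereas random forest-based selection evaluates SNPs in the context of the others.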
Do little interactions get lost in dark random forests?
Wright, Marvin N; Ziegler, Andreas; König, Inke R
2016-03-31
Random forests have often been claimed to uncover interaction effects. However, if and how interaction effects can be differentiated from marginal effects remains unclear. In extensive simulation studies, we investigate whether random forest variable importance measures capture or detect gene-gene interactions. By capturing interactions, we mean the ability to identify a variable that acts through an interaction with another one, while detection is the ability to identify an interaction effect as such. Of the single importance measures, the Gini importance captured interaction effects in most of the simulated scenarios; however, they were masked by marginal effects in other variables. With the permutation importance, the proportion of captured interactions was lower in all cases. Pairwise importance measures performed about equally, with a slight advantage for the joint variable importance method. However, the overall fraction of detected interactions was low. In almost all scenarios the detection fraction in a model with only marginal effects was larger than in a model with an interaction effect only. Random forests are generally capable of capturing gene-gene interactions, but current variable importance measures are unable to detect them as interactions. In most of the cases, interactions are masked by marginal effects and interactions cannot be differentiated from marginal effects. Consequently, caution is warranted when claiming that random forests uncover interactions.
Nasejje, Justine B; Mwambi, Henry
2017-09-07
Uganda, just like any other Sub-Saharan African country, has a high under-five child mortality rate. To inform policy on intervention strategies, sound statistical methods are required to critically identify factors strongly associated with under-five child mortality rates. The Cox proportional hazards model has been a common choice in analysing data to understand factors strongly associated with high child mortality rates, taking age as the time-to-event variable. However, due to its restrictive proportional hazards (PH) assumption, some covariates of interest which do not satisfy the assumption are often excluded from the analysis to avoid mis-specifying the model. Otherwise, using covariates that clearly violate the assumption would mean invalid results. Survival trees and random survival forests are becoming increasingly popular in analysing survival data, particularly in the case of large survey data, and could be attractive alternatives to models with the restrictive PH assumption. In this article, we adopt random survival forests, which have never been used in understanding factors affecting under-five child mortality rates in Uganda, using Demographic and Health Survey data. Thus the first part of the analysis is based on the use of the classical Cox PH model and the second part of the analysis is based on the use of random survival forests in the presence of covariates that do not necessarily satisfy the PH assumption. Random survival forests and the Cox proportional hazards model agree that the sex of the household head, the sex of the child, and the number of births in the past 1 year are strongly associated with under-five child mortality in Uganda, given that all three covariates satisfy the PH assumption. Random survival forests further demonstrated that covariates that were originally excluded from the earlier analysis due to violation of the PH assumption were important in explaining under-five child mortality rates.
These covariates include the number of children under the age of five in a household, number of births in the past 5 years, wealth index, total number of children ever born and the child's birth order. The results further indicated that the predictive performance for random survival forests built using covariates including those that violate the PH assumption was higher than that for random survival forests built using only covariates that satisfy the PH assumption. Random survival forests are appealing methods in analysing public health data to understand factors strongly associated with under-five child mortality rates especially in the presence of covariates that violate the proportional hazards assumption.
Christopher W. Woodall; Jacques Rondeux; Pieter J. Verkerk; Göran Ståhl
2009-01-01
Efforts to assess forest ecosystem carbon stocks, biodiversity, and fire hazards have spurred the need for comprehensive assessments of forest ecosystem dead wood (DW) components around the world. Currently, information regarding the prevalence, status, and methods of DW inventories occurring in the world's forested landscapes is scattered. The goal of this study...
Timber resource statistics for the upper Tanana block, Tanana inventory unit, Alaska, 1974.
Karl M. Hegg
1983-01-01
This report for the 3.6-million-acre Upper Tanana block is the third of four on the 14-million-acre Tanana Valley forest inventory unit. Descriptions of area, climate, forest, general resource use, and inventory methodology are presented. Area and volume tables are provided for commercial and operable noncommercial forest lands. Estimates for commercial forest land...
Christopher W. Woodall; Jacques Rondeux; Pieter J. Verkerk; Goran Stahl
2009-01-01
Efforts to assess forest ecosystem carbon stocks, biodiversity, and fire hazards have spurred the need for comprehensive assessments of forest ecosystem dead wood (DW) attributes around the world. Currently, information regarding the prevalence, status, and methods of DW inventories occurring in the world's forested landscapes is scattered. The goal of this study is to...
Assessing net carbon sequestration on urban and community forests of northern New England, USA
Daolan Zheng; Mark J. Ducey; Linda S. Heath
2013-01-01
Urban and community forests play an important role in the overall carbon budget of the USA. Accurately quantifying carbon sequestration by these forests can provide insight for strategic planning to mitigate greenhouse gas effects on climate change. This study provides a new methodology to estimate net forest carbon sequestration (FCS) in urban and community lands of...
Comparing alternatives for increasing sampling intensity in forest inventories
J. Blackard; P. Patterson
2014-01-01
Each of the U.S. Forest Service's Forest Inventory and Analysis (FIA) regions has an occasional need to intensify the national sampling grid. A variety of methodologies exist within the various FIA regions and National Forest Systems regions for constructing plot intensifications, and there is no consensus on a national procedure. The primary objectives of this paper...
Recent advances in environmental data mining
NASA Astrophysics Data System (ADS)
Leuenberger, Michael; Kanevski, Mikhail
2016-04-01
Due to the large amount and complexity of data now available in geo- and environmental sciences, we need to develop and incorporate more robust and efficient methods for their analysis, modelling and visualization. An important part of these developments is the elaboration and application of a contemporary and coherent methodology that follows the process from data collection to the justification and communication of the results. Recent fundamental progress in machine learning (ML) can contribute considerably to the development of the emerging field of environmental data science. The present research highlights and investigates the issues that can occur when mining environmental data with cutting-edge machine learning algorithms. In particular, the main attention is paid to the description of a self-consistent methodology and two efficient algorithms, Random Forest (RF, Breiman, 2001) and Extreme Learning Machines (ELM, Huang et al., 2006), which have recently gained great popularity. Although they are based on two different concepts, i.e. decision trees vs artificial neural networks, both yield promising results for complex, high dimensional and non-linear data modelling. In addition, the study discusses several important issues of data driven modelling, including feature selection and uncertainties. The approach considered is accompanied by simulated and real data case studies from renewable resources assessment and natural hazards tasks. In conclusion, the current challenges and future developments in statistical environmental data learning are discussed.
References
- Breiman, L., 2001. Random Forests. Machine Learning 45 (1), 5-32.
- Huang, G.-B., Zhu, Q.-Y., Siew, C.-K., 2006. Extreme learning machine: theory and applications. Neurocomputing 70 (1-3), 489-501.
- Kanevski, M., Pozdnoukhov, A., Timonin, V., 2009. Machine Learning for Spatial Environmental Data. EPFL Press; Lausanne, Switzerland, p.392.
- Leuenberger, M., Kanevski, M., 2015. Extreme Learning Machines for spatial environmental data. Computers and Geosciences 85, 64-73.
Characterizing forest composition of the Allegheny Mountains using extensive forest inventory data
W. H. McWilliams; R. Riemann Hershey; D. A. Drake; C. L. Alerich
1993-01-01
There is a general lack of information that describes forest composition at the landscape level, e.g. for entire physiographic provinces. Studies in the ecological literature usually are local in scope and combining data from such studies is questionable due to different design methodology.
Estimating forest characteristics using NAIP imagery and ArcObjects
John S Hogland; Nathaniel M. Anderson; Woodam Chung; Lucas Wells
2014-01-01
Detailed, accurate, efficient, and inexpensive methods of estimating basal area, trees, and aboveground biomass per acre across broad extents are needed to effectively manage forests. In this study we present such a methodology using readily available National Agriculture Imagery Program imagery, Forest Inventory Analysis samples, a two stage classification and...
W. Devine; C. Aubry; J. Miller; K. Potter; A. Bower
2012-01-01
This guide provides a step-by-step description of the methodology used to apply the Forest Tree Genetic Risk Assessment System (ForGRAS; Potter and Crane 2010) to the tree species of the Pacific Northwest in a recent climate change vulnerability assessment (Devine et al. 2012). We describe our modified version of the ForGRAS model, and we review the model's basic...
Anantha M. Prasad; Louis R. Iverson; Andy Liaw
2006-01-01
We evaluated four statistical models - Regression Tree Analysis (RTA), Bagging Trees (BT), Random Forests (RF), and Multivariate Adaptive Regression Splines (MARS) - for predictive vegetation mapping under current and future climate scenarios according to the Canadian Climate Centre global circulation model.
Comparing spatial regression to random forests for large environmental data sets
Environmental data may be “large” due to number of records, number of covariates, or both. Random forests has a reputation for good predictive performance when using many covariates, whereas spatial regression, when using reduced rank methods, has a reputatio...
Landorf, Karl B; Menz, Hylton B; Armstrong, David G; Herbert, Robert D
2015-07-01
Randomized trials must be of high methodological quality to yield credible, actionable findings. The main aim of this project was to evaluate whether there has been an improvement in the methodological quality of randomized trials published in the Journal of the American Podiatric Medical Association (JAPMA). Randomized trials published in JAPMA during a 15-year period (January 1999 to December 2013) were evaluated. The methodological quality of randomized trials was evaluated using the PEDro scale (scores range from 0 to 10, with 0 being lowest quality). Linear regression was used to assess changes in methodological quality over time. A total of 1,143 articles were published in JAPMA between January 1999 and December 2013. Of these, 44 articles were reports of randomized trials. Although the number of randomized trials published each year increased, there was only minimal improvement in their methodological quality (mean rate of improvement = 0.01 points per year). The methodological quality of the trials studied was typically moderate, with a mean ± SD PEDro score of 5.1 ± 1.5. Although there were a few high-quality randomized trials published in the journal, most (84.1%) scored between 3 and 6. Although there has been an increase in the number of randomized trials published in JAPMA, there is substantial opportunity for improvement in the methodological quality of trials published in the journal. Researchers seeking to publish reports of randomized trials should seek to meet current best-practice standards in the conduct and reporting of their trials.
Solberg, Svein; Gizachew, Belachew; Næsset, Erik; Gobakken, Terje; Bollandsås, Ole Martin; Mauya, Ernest William; Olsson, Håkan; Malimbwi, Rogers; Zahabu, Eliakimu
2015-12-01
REDD+ implementation requires establishment of a system for measuring, reporting and verification (MRV) of forest carbon changes. A challenge for MRV is the lack of satellite based methods that can track not only deforestation, but also degradation and forest growth, as well as a lack of historical data that can serve as a basis for a reference emission level. Working in a miombo woodland in Tanzania, we here aim at demonstrating a novel 3D satellite approach based on interferometric processing of radar imagery (InSAR). Forest carbon changes are derived from changes in the forest canopy height obtained from InSAR, i.e. decreases represent carbon loss from logging and increases represent carbon sequestration through forest growth. We fitted a model of above-ground biomass (AGB) against InSAR height, and used this to convert height changes to biomass and carbon changes. The relationship between AGB and InSAR height was weak, as the individual plots were widely scattered around the model fit. However, we consider the approach to be unique and feasible for large-scale MRV efforts in REDD+ because the low accuracy was attributable partly to small plots and other limitations in the data set, and partly to a random pixel-to-pixel variation in trunk forms. Further processing of the InSAR data provides data on the categories of forest change. The combination of InSAR data from the Shuttle RADAR Topography Mission (SRTM) and the TanDEM-X satellite mission provided both historic baseline of change for the period 2000-2011, as well as annual change 2011-2012. A 3D data set from InSAR is a promising tool for MRV in REDD+. The temporal changes seen by InSAR data corresponded well with, but largely supplemented, the changes derived from Landsat data.
Remote sensing-based estimation of annual soil respiration at two contrasting forest sites
Gu, Lianhong; Huang, Ni; Black, T. Andrew; ...
2015-11-23
Soil respiration (Rs), an important component of the global carbon cycle, can be estimated using remotely sensed data, but the accuracy of this technique has not been thoroughly investigated. In this article, we proposed a methodology for the remote estimation of annual Rs at two contrasting FLUXNET forest sites (a deciduous broadleaf forest and an evergreen needleleaf forest).
Benjamin C. Bright; Andrew T. Hudak; Robert E. Kennedy; Arjan J. H. Meddens
2014-01-01
Bark beetle-caused tree mortality affects important forest ecosystem processes. Remote sensing methodologies that quantify live and dead basal area (BA) in bark beetle-affected forests can provide valuable information to forest managers and researchers. We compared the utility of light detection and ranging (lidar) and the Landsat-based detection of trends in...
Patrick D. Miles; Andrew D. Hill
2010-01-01
The U.S. Forest Service's Forest Inventory and Analysis (FIA) program collects sample plot data on all forest ownerships across the United States. This report documents the methodology used to estimate live-tree gross, net, and sound volume for the 24 States inventoried by the Northern Research Station's (NRS) FIA unit. Sound volume is of particular interest...
The relationship between urban forests and income: A meta-analysis.
Gerrish, Ed; Watkins, Shannon Lea
2018-02-01
Urban trees provide substantial public health and public environmental benefits. However, scholarly works suggest that urban trees may be unequally distributed among poor and minority urban communities, meaning that these communities are potentially being deprived of public environmental benefits, a form of environmental injustice. The evidence of this problem is not uniform however, and evidence of inequity varies in size and significance across studies. This variation in results suggests the need for a research synthesis and meta-analysis. We employed a systematic literature search to identify original studies which examined the relationship between urban forest cover and income (n=61) and coded each effect size (n=332). We used meta-analytic techniques to estimate the average (unconditional) relationship between urban forest cover and income and to estimate the impact that methodological choices, measurement, publication characteristics, and study site characteristics had on the magnitude of that relationship. We leveraged variation in study methodology to evaluate the extent to which results were sensitive to methodological choices often debated in the geographic and environmental justice literature but not yet evaluated in environmental amenities research. We found evidence of income-based inequity in urban forest cover (unconditional mean effect size = 0.098; s.e. = .017) that was robust across most measurement and methodological strategies in original studies and results did not differ systematically with study site characteristics. Studies that controlled for spatial autocorrelation, a violation of independent errors, found evidence of substantially less urban forest inequity; future research in this area should test and correct for spatial autocorrelation.
Random forest (RF) modeling has emerged as an important statistical learning method in ecology due to its exceptional predictive performance. However, for large and complex ecological datasets there is limited guidance on variable selection methods for RF modeling. Typically, e...
NASA Astrophysics Data System (ADS)
Ahmed, Oumer S.; Franklin, Steven E.; Wulder, Michael A.; White, Joanne C.
2015-03-01
Many forest management activities, including the development of forest inventories, require spatially detailed forest canopy cover and height data. Among the various remote sensing technologies, LiDAR (Light Detection and Ranging) offers the most accurate and consistent means for obtaining reliable canopy structure measurements. A potential solution to reduce the cost of LiDAR data is to integrate transects (samples) of LiDAR data with frequently acquired and spatially comprehensive optical remotely sensed data. Although multiple regression is commonly used for such modeling, often it does not fully capture the complex relationships between forest structure variables. This study investigates the potential of Random Forest (RF), a machine learning technique, to estimate LiDAR measured canopy structure using a time series of Landsat imagery. The study is implemented over a 2600 ha area of industrially managed coastal temperate forests on Vancouver Island, British Columbia, Canada. We implemented a trajectory-based approach to time series analysis that generates time since disturbance (TSD) and disturbance intensity information for each pixel and we used this information to stratify the forest land base into two strata: mature forests and young forests. Canopy cover and height for three forest classes (i.e. mature, young, and mature and young combined) were modeled separately using multiple regression and RF techniques. For all forest classes, the RF models provided improved estimates relative to the multiple regression models. The lowest validation error was obtained for the mature forest stratum in an RF model (R2 = 0.88, RMSE = 2.39 m and bias = -0.16 for canopy height; R2 = 0.72, RMSE = 0.068% and bias = -0.0049 for canopy cover). This study demonstrates the value of using disturbance and successional history to inform estimates of canopy structure and obtain improved estimates of forest canopy cover and height using the RF algorithm.
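The validation statistics quoted in this abstract (R2, RMSE and bias) can be computed from paired observed and predicted values. The following is a minimal sketch using only the Python standard library. It assumes R2 means the coefficient of determination and bias means the mean of predicted minus observed; the abstract does not state either convention, and the canopy heights below are hypothetical illustration values, not data from the study.

```python
import math


def regression_metrics(observed, predicted):
    """Return (R^2, RMSE, bias) for paired observed and predicted values.

    Assumed conventions (not stated in the abstract):
    R^2 = 1 - SS_res / SS_tot, bias = mean(predicted - observed).
    """
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / n
    residuals = [p - o for o, p in zip(observed, predicted)]
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    bias = sum(residuals) / n
    return r2, rmse, bias


# Hypothetical canopy heights in metres, for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 31.0, 39.0]
r2, rmse, bias = regression_metrics(observed, predicted)
print(f"R2={r2:.2f} RMSE={rmse:.2f} m bias={bias:.2f} m")
```

Reporting bias alongside RMSE separates systematic over- or under-prediction from overall scatter, which is why abstracts of this kind usually give both.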
Sankari, E Siva; Manimegalai, D
2017-12-21
Predicting membrane protein types is an important and challenging research area in bioinformatics and proteomics. Traditional biophysical methods are used to classify membrane protein types. Because of the large number of uncharacterized protein sequences in databases, traditional methods are very time consuming, expensive and susceptible to errors. Hence, it is highly desirable to develop a robust, reliable, and efficient method to predict membrane protein types. Imbalanced datasets and large datasets are often handled well by decision tree classifiers. Since imbalanced datasets are used, the performance of various decision tree classifiers such as Decision Tree (DT), Classification And Regression Tree (CART), C4.5, Random tree, and REP (Reduced Error Pruning) tree, and of ensemble methods such as Adaboost, RUS (Random Under Sampling) boost, Rotation forest and Random forest, is analysed. Among the various decision tree classifiers, Random forest performs well in less time with a good accuracy of 96.35%. Another finding is that the RUS boost decision tree classifier is able to classify one or two samples in the class with very few samples, while the other classifiers such as DT, Adaboost, Rotation forest and Random forest are not sensitive to the classes with fewer samples. The performance of the decision tree classifiers is also compared with SVM (Support Vector Machine) and Naive Bayes classifiers. Copyright © 2017 Elsevier Ltd. All rights reserved.
Xiao, Li-Hong; Chen, Pei-Ran; Gou, Zhong-Ping; Li, Yong-Zhong; Li, Mei; Xiang, Liang-Cheng; Feng, Ping
2017-01-01
The aim of this study is to evaluate the ability of the random forest algorithm that combines data on transrectal ultrasound findings, age, and serum levels of prostate-specific antigen to predict prostate carcinoma. Clinico-demographic data were analyzed for 941 patients with prostate diseases treated at our hospital, including age, serum prostate-specific antigen levels, transrectal ultrasound findings, and pathology diagnosis based on ultrasound-guided needle biopsy of the prostate. These data were compared between patients with and without prostate cancer using the Chi-square test, and then entered into the random forest model to predict diagnosis. Patients with and without prostate cancer differed significantly in age and serum prostate-specific antigen levels (P < 0.001), as well as in all transrectal ultrasound characteristics (P < 0.05) except uneven echo (P = 0.609). The random forest model based on age, prostate-specific antigen and ultrasound predicted prostate cancer with an accuracy of 83.10%, sensitivity of 65.64%, and specificity of 93.83%. Positive predictive value was 86.72%, and negative predictive value was 81.64%. By integrating age, prostate-specific antigen levels and transrectal ultrasound findings, the random forest algorithm shows better diagnostic performance for prostate cancer than either diagnostic indicator on its own. This algorithm may help improve diagnosis of the disease by identifying patients at high risk for biopsy.
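The five diagnostic measures reported above all derive from one 2x2 confusion matrix. The sketch below shows the arithmetic in plain Python. The abstract does not publish the confusion matrix, so the counts used here are an assumption: they were chosen only because they sum to the 941 patients mentioned and reproduce the reported percentages to two decimals, and they should not be read as the study's actual data.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Standard binary diagnostic measures from confusion-matrix counts."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }


# ASSUMED counts (not given in the abstract), consistent with n = 941.
metrics = diagnostic_metrics(tp=235, fn=123, fp=36, tn=547)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Under these assumed counts, sensitivity is much lower than specificity, which matches the pattern in the reported figures: the model misses a sizeable share of cancers while rarely flagging patients without cancer.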
NASA Astrophysics Data System (ADS)
Bayram, B.; Erdem, F.; Akpinar, B.; Ince, A. K.; Bozkurt, S.; Catal Reis, H.; Seker, D. Z.
2017-11-01
Coastal monitoring plays a vital role in environmental planning and hazard management. Since shorelines are fundamental data for environment management, disaster management, coastal erosion studies, modelling of sediment transport and coastal morphodynamics, various techniques have been developed to extract shorelines. Random Forest, one of these techniques, is used in this study for shoreline extraction. This algorithm is a machine learning method based on decision trees. Decision trees analyse classes of training data and create rules for classification. In this study, the Terkos region has been chosen for the proposed method within the scope of the TUBITAK project (Project No: 115Y718) titled "Integration of Unmanned Aerial Vehicles for Sustainable Coastal Zone Monitoring Model - Three-Dimensional Automatic Coastline Extraction and Analysis: Istanbul-Terkos Example". The Random Forest algorithm has been implemented to extract the shoreline of the Black Sea near the lake from LANDSAT-8 and GOKTURK-2 satellite imagery taken in 2015. The MATLAB environment was used for classification. To obtain land and water-body classes, the Random Forest method has been applied to the NIR bands of the LANDSAT-8 (5th band) and GOKTURK-2 (4th band) imagery. Each image has been digitized manually and shorelines obtained for accuracy assessment. According to the accuracy assessment results, the Random Forest method is efficient for shoreline extraction from both medium and high resolution images.
Spam comments prediction using stacking with ensemble learning
NASA Astrophysics Data System (ADS)
Mehmood, Arif; On, Byung-Won; Lee, Ingyu; Ashraf, Imran; Choi, Gyu Sang
2018-01-01
Deceptive comments about products or services mislead people in decision making. Current methodologies for predicting deceptive comments focus on feature design with a single training model. Indigenous features can show some linguistic phenomena but struggle to reveal the latent semantic meaning of the comments. We propose a prediction model on general features of documents using stacking with ensemble learning. Term Frequency/Inverse Document Frequency (TF/IDF) features are inputs to a stack of Random Forest and Gradient Boosted Trees, and the outputs of the base learners are combined by a decision tree for the final training of the model. The results show that our approach achieves an accuracy of 92.19%, which outperforms the state-of-the-art method.
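The TF/IDF features that feed the base learners can be illustrated with a small standard-library sketch. The abstract does not say which TF/IDF weighting or tokenizer was used, so this assumes one common variant (term count divided by document length, times the natural log of N over document frequency) and whitespace tokenization; the example comments are hypothetical, and the stacking stage itself is not shown.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    Assumed variant: tf = count / document length,
    idf = ln(number of documents / documents containing the term).
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))  # count each term once per document
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical comments, for illustration only.
comments = ["great product great price", "bad product", "great service"]
vectors = tf_idf(comments)
print(vectors[0])
```

Terms that appear in every document receive a weight of zero under this variant, so the features emphasize words that distinguish one comment from the rest.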
Preliminary results of the global forest biomass survey
S. Healey; E. Lindquist
2014-01-01
Many countries do not yet have well-established national forest inventories, and among those that do, significant methodological differences exist, particularly in the estimation of standing forest biomass. Global space-based LiDAR (Light Detection and Ranging) from NASA's now-completed ICESat mission provided consistent, high-quality measures of canopy height and...
Missouri Ozark Forest Ecosystem Project: the experiment
Steven L. Sheriff
2002-01-01
Missouri Ozark Forest Ecosystem Project (MOFEP) is a unique experiment to learn about the impacts of management practices on a forest system. Three forest management practices (uneven-aged management, even-aged management, and no-harvest management) as practiced by the Missouri Department of Conservation were randomly assigned to nine forest management sites using a...
NASA Technical Reports Server (NTRS)
Potter, Christopher S.
2014-01-01
The Landsat Ecosystem Disturbance Adaptive Processing System (LEDAPS) methodology was applied to detect changes in forest vegetation cover for areas burned by wildfires in the Sierra Nevada Mountains of California between the periods of 1975-1979 and 1995-1999. Results for areas burned by wildfire between 1995 and 1999 confirmed the importance of regrowing forest vegetation over 17% of the combined burned areas. A notable fraction (12%) of the entire 5-km (unburned) buffer area outside the 1995-1999 fire perimeters showed decline in forest cover, and not nearly as many regrowing forest areas, covering only 3% of all the 1995-1999 buffer areas combined. Areas burned by wildfire between 1975 and 1979 confirmed the importance of disturbed (or declining evergreen) vegetation covering 13% of the combined 1975-1979 burned areas. Based on comparison of these results to ground-based survey data, the LEDAPS methodology should be capable of fulfilling much of the need for consistent, low-cost monitoring of changes due to climate and biological factors in western forest regrowth following stand-replacing disturbances.
Random forest (RF) is popular in ecological and environmental modeling, in part, because of its insensitivity to correlated predictors and resistance to overfitting. Although variable selection has been proposed to improve both performance and interpretation of RF models, it is u...
Random Forests for Evaluating Pedagogy and Informing Personalized Learning
ERIC Educational Resources Information Center
Spoon, Kelly; Beemer, Joshua; Whitmer, John C.; Fan, Juanjuan; Frazee, James P.; Stronach, Jeanne; Bohonak, Andrew J.; Levine, Richard A.
2016-01-01
Random forests are presented as an analytics foundation for educational data mining tasks. The focus is on course- and program-level analytics including evaluating pedagogical approaches and interventions and identifying and characterizing at-risk students. As part of this development, the concept of individualized treatment effects (ITE) is…
USDA-ARS's Scientific Manuscript database
Palmer amaranth (Amaranthus palmeri S. Wats.) invasion negatively impacts cotton (Gossypium hirsutum L.) production systems throughout the United States. The objective of this study was to evaluate canopy hyperspectral narrowband data as input into the random forest machine learning algorithm to dis...
Campbell, J Elliott; Moen, Jeremie C; Ney, Richard A; Schnoor, Jerald L
2008-03-01
Estimates of forest soil organic carbon (SOC) have applications in carbon science, soil quality studies, carbon sequestration technologies, and carbon trading. Forest SOC has been modeled using a regression coefficient methodology that applies mean SOC densities (mass/area) to broad forest regions. A higher resolution model is based on an approach that employs a geographic information system (GIS) with soil databases and satellite-derived landcover images. Despite this advancement, the regression approach remains the basis of current state and federal level greenhouse gas inventories. Both approaches are analyzed in detail for Wisconsin forest soils from 1983 to 2001, applying rigorous error-fixing algorithms to soil databases. Resulting SOC stock estimates are 20% larger when determined using the GIS method rather than the regression approach. Average annual rates of increase in SOC stocks are 3.6 and 1.0 million metric tons of carbon per year for the GIS and regression approaches respectively.
Old-growth and mature forests near spotted owl nests in western Oregon
NASA Technical Reports Server (NTRS)
Ripple, William J.; Johnson, David H.; Hershey, K. T.; Meslow, E. Charles
1995-01-01
We investigated how the amount of old-growth and mature forest influences the selection of nest sites by northern spotted owls (Strix occidentalis caurina) in the Central Cascade Mountains of Oregon. We used 7 different plot sizes to compare the proportion of mature and old-growth forest between 30 nest sites and 30 random sites. The proportion of old-growth and mature forest was significantly greater at nest sites than at random sites for all plot sizes (P ≤ 0.01). Thus, management of the spotted owl might require setting the percentage of old-growth and mature forest retained from harvesting at least 1 standard deviation above the mean for the 30 nest sites we examined.
NASA Astrophysics Data System (ADS)
Polan, Daniel F.; Brady, Samuel L.; Kaufman, Robert A.
2016-09-01
There is a need for robust, fully automated whole body organ segmentation for diagnostic CT. This study investigates and optimizes a Random Forest algorithm for automated organ segmentation; explores the limitations of a Random Forest algorithm applied to the CT environment; and demonstrates segmentation accuracy in a feasibility study of pediatric and adult patients. To the best of our knowledge, this is the first study to investigate a trainable Weka segmentation (TWS) implementation using Random Forest machine-learning as a means to develop a fully automated tissue segmentation tool developed specifically for pediatric and adult examinations in a diagnostic CT environment. Current innovation in computed tomography (CT) is focused on radiomics, patient-specific radiation dose calculation, and image quality improvement using iterative reconstruction, all of which require specific knowledge of tissue and organ systems within a CT image. The purpose of this study was to develop a fully automated Random Forest classifier algorithm for segmentation of neck-chest-abdomen-pelvis CT examinations based on pediatric and adult CT protocols. Seven materials were classified: background, lung/internal air or gas, fat, muscle, solid organ parenchyma, blood/contrast enhanced fluid, and bone tissue using Matlab and the TWS plugin of FIJI. The following classifier feature filters of TWS were investigated: minimum, maximum, mean, and variance evaluated over a voxel radius of 2^n (n from 0 to 4), along with noise reduction and edge preserving filters: Gaussian, bilateral, Kuwahara, and anisotropic diffusion. The Random Forest algorithm used 200 trees with 2 features randomly selected per node. The optimized auto-segmentation algorithm resulted in 16 image features including features derived from maximum, mean, variance, Gaussian and Kuwahara filters.
Dice similarity coefficient (DSC) calculations between manually segmented and Random Forest algorithm segmented images from 21 patient image sections, were analyzed. The automated algorithm produced segmentation of seven material classes with a median DSC of 0.86 ± 0.03 for pediatric patient protocols, and 0.85 ± 0.04 for adult patient protocols. Additionally, 100 randomly selected patient examinations were segmented and analyzed, and a mean sensitivity of 0.91 (range: 0.82-0.98), specificity of 0.89 (range: 0.70-0.98), and accuracy of 0.90 (range: 0.76-0.98) were demonstrated. In this study, we demonstrate that this fully automated segmentation tool was able to produce fast and accurate segmentation of the neck and trunk of the body over a wide range of patient habitus and scan parameters.
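The Dice similarity coefficient used to score the segmentation is twice the overlap between two label sets divided by their combined size. A minimal sketch for one material class follows, using flat label sequences in plain Python. The voxel labels are hypothetical, and returning 1.0 when the class is absent from both segmentations is an assumed convention, not something the abstract specifies.

```python
def dice_coefficient(labels_a, labels_b, target):
    """Dice similarity coefficient for one class between two labelings.

    DSC = 2 * |A and B| / (|A| + |B|), where A and B are the sets of
    positions labelled `target` in each segmentation.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("segmentations must cover the same voxels")
    in_a = sum(1 for a in labels_a if a == target)
    in_b = sum(1 for b in labels_b if b == target)
    both = sum(1 for a, b in zip(labels_a, labels_b)
               if a == target and b == target)
    if in_a + in_b == 0:
        return 1.0  # assumed convention: class absent from both
    return 2.0 * both / (in_a + in_b)


# Hypothetical voxel labels: manual versus automated segmentation.
manual = ["fat", "fat", "bone", "muscle", "muscle", "muscle"]
auto = ["fat", "bone", "bone", "muscle", "muscle", "fat"]
for material in ("fat", "bone", "muscle"):
    print(material, round(dice_coefficient(manual, auto, material), 3))
```

A per-class score like this can then be summarized across classes and patients, for example by the median, as the abstract reports.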
NASA Astrophysics Data System (ADS)
Molinario, G.; Baraldi, A.; Altstatt, A. L.; Nackoney, J.
2011-12-01
The University of Maryland has been a USAID Central Africa Regional Program for the Environment (CARPE) cross-cutting partner for many years, providing remote sensing derived information on forest cover and forest cover changes in support of CARPE's objectives of diminishing forest degradation, forest loss and biodiversity loss that result from poor or nonexistent land use planning strategies. Together with South Dakota State University, Congo Basin-wide maps have been provided that map forest cover loss at a maximum of 60m resolution, using Landsat imagery and higher resolution imagery for algorithm training and validation. However, to better meet the needs within the CARPE Landscapes, which call for higher resolution, more accurate land cover change maps, UMD has been exploring the use of the SIAM automatic spectral-rule classifier together with pan-sharpened Landsat data (15m resolution) and Very High Resolution imagery from various sources. The pilot project is being developed in collaboration with the African Wildlife Foundation in the Maringa Lopori Wamba CARPE Landscape. If successful, this methodology will in future make the creation of high resolution change maps faster and easier, making it accessible to other entities in the Congo Basin that need accurate land cover and land use change maps in order, for example, to create sustainable land use plans, conserve biodiversity and resources and prepare Reducing Emissions from forest Degradation and Deforestation (REDD) Measurement, Reporting and Verification (MRV) projects. The paper describes the need for higher resolution land cover change maps that focus on forest change dynamics such as the cycling between primary forests, secondary forest, agriculture and other expanding and intensifying land uses in the Maringa Lopori Wamba CARPE Landscape in the Equateur Province of the Democratic Republic of Congo.
The methodology uses the SIAM remote sensing imagery automatic spectral rule classifier, together with pan-sharpened Landsat imagery with 15m resolution and Very High Resolution imagery from different sensors, obtained from the Department of Defense database that was recently opened to NASA and its Earth Observation partners. Particular emphasis is placed on the detection of agricultural fields and their expansion in primary forests or intensification in secondary forests and fallow fields, as this is the primary driver of deforestation in this area. Fields in this area are also very small and irregularly shaped, and often partly obscured by neighboring forest canopy, hence the technical challenge of correctly detecting them and tracking them through time. Finally, the potential for use of this methodology in other regions where information on land cover changes is needed for land use sustainability planning is also addressed.
Temporal changes in randomness of bird communities across Central Europe.
Renner, Swen C; Gossner, Martin M; Kahl, Tiemo; Kalko, Elisabeth K V; Weisser, Wolfgang W; Fischer, Markus; Allan, Eric
2014-01-01
Many studies have examined whether communities are structured by random or deterministic processes, and both are likely to play a role, but relatively few studies have attempted to quantify the degree of randomness in species composition. We quantified, for the first time, the degree of randomness in forest bird communities based on an analysis of spatial autocorrelation in three regions of Germany. The compositional dissimilarity between pairs of forest patches was regressed against the distance between them. We then calculated the y-intercept of the curve, i.e. the 'nugget', which represents the compositional dissimilarity at zero spatial distance. We therefore assume, following similar work on plant communities, that this represents the degree of randomness in species composition. We then analysed how the degree of randomness in community composition varied over time and with forest management intensity, which we expected to reduce the importance of random processes by increasing the strength of environmental drivers. We found that a high portion of the bird community composition could be explained by chance (overall mean of 0.63), implying that most of the variation in local bird community composition is driven by stochastic processes. Forest management intensity did not consistently affect the mean degree of randomness in community composition, perhaps because the bird communities were relatively insensitive to management intensity. We found a high temporal variation in the degree of randomness, which may indicate temporal variation in assembly processes and in the importance of key environmental drivers. We conclude that the degree of randomness in community composition should be considered in bird community studies, and the high values we find may indicate that bird community composition is relatively hard to predict at the regional scale.
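The "nugget" described above is the intercept of a regression of pairwise compositional dissimilarity on distance. The sketch below fits an ordinary least-squares straight line in plain Python and reads off the intercept. The abstract speaks of a "curve" without giving its functional form, so the straight line is an assumption, and the distances and dissimilarities are hypothetical values rather than study data.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need at least two paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical pairs of forest patches: distance (km) and dissimilarity.
distance_km = [1.0, 2.0, 3.0, 4.0]
dissimilarity = [0.65, 0.70, 0.75, 0.80]
nugget, slope = fit_line(distance_km, dissimilarity)
print(f"nugget (dissimilarity at zero distance): {nugget:.2f}")
```

Because the intercept is an extrapolation to zero distance, its value depends on the assumed form of the distance-decay relationship; a different curve would give a different nugget from the same data.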
NASA Astrophysics Data System (ADS)
Markman, Adam; Carnicer, Artur; Javidi, Bahram
2017-05-01
We overview our recent work [1] on utilizing three-dimensional (3D) optical phase codes for object authentication using the random forest classifier. A simple 3D optical phase code (OPC) is generated by combining multiple diffusers and glass slides. This tag is then placed on a quick-response (QR) code, which is a barcode capable of storing information and can be scanned under non-uniform illumination conditions, rotation, and slight degradation. A coherent light source illuminates the OPC and the transmitted light is captured by a CCD to record the unique signature. Feature extraction on the signature is performed and inputted into a pre-trained random-forest classifier for authentication.
Dennis P. Dykstra; Robert A. Monserud
2009-01-01
The purpose of the international conference from which these proceedings are drawn was to explore relationships between forest management activities and timber quality. Sessions were organized to explore models and simulation methodologies that contribute to an understanding of tree development over time and the ways that management and harvesting activities can...
Remote Sensing for Tropical Forest Assessment
AJR Gillespie
1994-01-01
The purpose of this workshop was to allow remote sensing experts from Latin America, the U.S.A., and FAO to discuss state-of-the-art methodology in remote sensing of forest environments, and to develop plans on how to better incorporate this technology into FAO and national forest inventory efforts. The workshop included numerous presentations of ongoing activities, as...
Timber resource statistics for the Kantishna block, Tanana inventory unit, Alaska, 1973.
Karl M. Hegg
1982-01-01
This report for the 2.9-million-acre Kantishna block is the second of four on the 14-million-acre Tanana Valley inventory unit. Comments are made on general landform, timber use, recreational potential, agricultural developments, forest defect, regeneration, and inventory methodology. Tables are provided for commercial forest land and for operable noncommercial forest...
Wan, Xiaoqing; Zhao, Chunhui
2017-06-01
As a competitive machine learning algorithm, the stacked sparse autoencoder (SSA) has become widely used for extracting high-level features for classification of hyperspectral images (HSIs). In general, in the SSA architecture, the nodes between adjacent layers are fully connected and need to be iteratively fine-tuned during the pretraining stage; however, nodes in more distant previous layers may be less likely to have a dense correlation with a given node of subsequent layers. Therefore, to reduce the classification error and increase the learning rate, this paper proposes a general framework of locally connected SSA; that is, a biologically inspired local receptive field (LRF) constrained SSA architecture is employed to simultaneously characterize the local correlations of spectral features and extract high-level feature representations of hyperspectral data. In addition, the appropriate receptive field constraint is concurrently updated by measuring the spatial distances from the neighbor nodes to the corresponding node. Finally, an efficient random forest classifier is cascaded to the last hidden layer of the SSA architecture as a benchmark classifier. Experimental results on two real HSI datasets demonstrate that the proposed hierarchical LRF constrained stacked sparse autoencoder and random forest (SSARF) provides encouraging results relative to the compared methods: overall accuracy improves by 0.72%-10.87% for the Indian Pines dataset and by 0.74%-7.90% for the Kennedy Space Center dataset; moreover, it requires less running time than similar SSARF-based methods.
Fast image interpolation via random forests.
Huang, Jun-Jie; Siu, Wan-Chi; Liu, Tian-Rui
2015-10-01
This paper proposes a two-stage framework for fast image interpolation via random forests (FIRF). The proposed FIRF method achieves high accuracy while requiring little computation. The underlying idea is to apply random forests to classify the natural image patch space into numerous subspaces and to learn a linear regression model for each subspace that maps a low-resolution image patch to a high-resolution image patch. The FIRF framework consists of two stages. Stage 1 of the framework removes most of the ringing and aliasing artifacts in the initial bicubic interpolated image, while Stage 2 further refines the Stage 1 interpolated image. By varying the number of decision trees in the random forests and the number of stages applied, the proposed FIRF method can realize computationally scalable image interpolation. Extensive experimental results show that the proposed FIRF(3, 2) method achieves more than 0.3 dB improvement in peak signal-to-noise ratio over the state-of-the-art nonlocal autoregressive modeling (NARM) method. Moreover, the proposed FIRF(1, 1) obtains results similar to or better than NARM while taking only 0.3% of its computation time.
CW-SSIM kernel based random forest for image classification
NASA Astrophysics Data System (ADS)
Fan, Guangzhe; Wang, Zhou; Wang, Jiheng
2010-07-01
The complex wavelet structural similarity (CW-SSIM) index has been proposed as a powerful image similarity metric that is robust to translation, scaling and rotation of images, but how to employ it in image classification applications has not been deeply investigated. In this paper, we incorporate CW-SSIM as a kernel function into a random forest learning algorithm. This leads to a novel image classification approach that does not require a feature extraction or dimension reduction stage at the front end. We use hand-written digit recognition as an example to demonstrate our algorithm. We compare the performance of the proposed approach with random forest learning based on other kernels, including the widely adopted Gaussian and inner product kernels. Empirical evidence shows that the proposed method is superior in its classification power. We also compare our proposed approach with the direct random forest method without a kernel and with the popular kernel-learning method, the support vector machine. Our test results based on both simulated and real-world data suggest that the proposed approach outperforms traditional methods without the feature selection procedure.
NASA Technical Reports Server (NTRS)
Williams, D. L.; Walthall, C. L.; Goward, S. N.
1984-01-01
An important part of fundamental remote sensing research is based on the measurement and analysis of spectral reflectance from earth surface materials in situ. It has been found that for an effective analysis of the target of interest, different applications of remotely sensed data require spectral measurements from different portions of the electromagnetic spectrum. It is pointed out that the detailed spectral reflectance characteristics of forest vegetation are currently not well understood, particularly in the middle infrared wavelength region. Details regarding the need for in situ forest canopy measurements are examined, taking into account certain difficulties arising in the case of satellite observations. Because of these difficulties, the present paper provides a discussion of methodology and preliminary spectra based on an experiment to use a helicopter as an observing platform for in situ forest canopy spectra measurement.
Random forests as cumulative effects models: A case study of lakes and rivers in Muskoka, Canada.
Jones, F Chris; Plewes, Rachel; Murison, Lorna; MacDougall, Mark J; Sinclair, Sarah; Davies, Christie; Bailey, John L; Richardson, Murray; Gunn, John
2017-10-01
Cumulative effects assessment (CEA) - a type of environmental appraisal - lacks effective methods for modeling cumulative effects, evaluating indicators of ecosystem condition, and exploring the likely outcomes of development scenarios. Random forests are an extension of classification and regression trees, which model response variables by recursive partitioning. Random forests were used to model a series of candidate ecological indicators that described lakes and rivers from a case study watershed (The Muskoka River Watershed, Canada). Suitability of the candidate indicators for use in cumulative effects assessment and watershed monitoring was assessed according to how well they could be predicted from natural habitat features and how sensitive they were to human land-use. The best models explained 75% of the variation in a multivariate descriptor of lake benthic-macroinvertebrate community structure, and 76% of the variation in the conductivity of river water. Similar results were obtained by cross-validation. Several candidate indicators detected a simulated doubling of urban land-use in their catchments, and a few were able to detect a simulated doubling of agricultural land-use. The paper demonstrates that random forests can be used to describe the combined and singular effects of multiple stressors and natural environmental factors, and furthermore, that random forests can be used to evaluate the performance of monitoring indicators. The numerical methods presented are applicable to any ecosystem and indicator type, and therefore represent a step forward for CEA. Crown Copyright © 2017. Published by Elsevier Ltd. All rights reserved.
Improved high-dimensional prediction with Random Forests by the use of co-data.
Te Beest, Dennis E; Mes, Steven W; Wilting, Saskia M; Brakenhoff, Ruud H; van de Wiel, Mark A
2017-12-28
Prediction in high-dimensional settings is difficult due to the large number of variables relative to the sample size. We demonstrate how auxiliary 'co-data' can be used to improve the performance of a Random Forest in such a setting. Co-data are incorporated in the Random Forest by replacing the uniform sampling probabilities that are used to draw candidate variables with co-data moderated sampling probabilities. Co-data here are defined as any type of information that is available on the variables of the primary data but does not use its response labels. Inspired by empirical Bayes, these moderated sampling probabilities are learned from the data at hand. We demonstrate the co-data moderated Random Forest (CoRF) with two examples. In the first example we aim to predict the presence of a lymph node metastasis with gene expression data. We demonstrate how a set of external p-values, a gene signature, and the correlation between gene expression and DNA copy number can improve the predictive performance. In the second example we demonstrate how the prediction of cervical (pre-)cancer with methylation data can be improved by including the location of the probe relative to the known CpG islands, the number of CpG sites targeted by a probe, and a set of p-values from a related study. The proposed method is able to utilize auxiliary co-data to improve the performance of a Random Forest.
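The central idea above, drawing candidate split variables with non-uniform probabilities, can be sketched in a few lines. This is only an illustration under stated assumptions: the weights are hypothetical placeholders for co-data moderated probabilities, the helper `draw_candidates` is not part of CoRF, and how the weights are learned (the empirical Bayes step) is not shown.

```python
import random


def draw_candidates(weights, mtry, rng):
    """Draw `mtry` distinct variable indices without replacement,
    each draw made with probability proportional to `weights`."""
    positive = sum(1 for w in weights if w > 0)
    if mtry > positive:
        raise ValueError("mtry exceeds the number of variables with positive weight")
    remaining = list(range(len(weights)))
    chosen = []
    for _ in range(mtry):
        pool_weights = [weights[i] for i in remaining]
        pick = rng.choices(remaining, weights=pool_weights, k=1)[0]
        chosen.append(pick)
        remaining.remove(pick)
    return chosen


rng = random.Random(0)
# Standard Random Forest behaviour: every variable is equally likely.
uniform = [1.0] * 6
# Hypothetical co-data weights: variables 0 and 1 are favoured,
# variables 4 and 5 have weight zero and are never drawn.
moderated = [5.0, 5.0, 1.0, 1.0, 0.0, 0.0]

uniform_draw = draw_candidates(uniform, 3, rng)
moderated_draw = draw_candidates(moderated, 4, rng)
```

With uniform weights this reduces to the usual random subset of candidate variables at each split; with moderated weights, variables supported by the co-data are offered to the tree more often.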
Le, Trang T; Simmons, W Kyle; Misaki, Masaya; Bodurka, Jerzy; White, Bill C; Savitz, Jonathan; McKinney, Brett A
2017-09-15
Classification of individuals into disease or clinical categories from high-dimensional biological data with low prediction error is an important challenge of statistical learning in bioinformatics. Feature selection can improve classification accuracy but must be incorporated carefully into cross-validation to avoid overfitting. Recently, feature selection methods based on differential privacy, such as differentially private random forests and reusable holdout sets, have been proposed. However, for domains such as bioinformatics, where the number of features is much larger than the number of observations (p ≫ n), these differential privacy methods are susceptible to overfitting. We introduce private Evaporative Cooling, a stochastic privacy-preserving machine learning algorithm that uses Relief-F for feature selection and random forest for privacy preserving classification that also prevents overfitting. We relate the privacy-preserving threshold mechanism to a thermodynamic Maxwell-Boltzmann distribution, where the temperature represents the privacy threshold. We use the thermal statistical physics concept of Evaporative Cooling of atomic gases to perform backward stepwise privacy-preserving feature selection. On simulated data with main effects and statistical interactions, we compare accuracies on holdout and validation sets for three privacy-preserving methods: the reusable holdout, reusable holdout with random forest, and private Evaporative Cooling, which uses Relief-F feature selection and random forest classification. In simulations where interactions exist between attributes, private Evaporative Cooling provides higher classification accuracy without overfitting based on an independent validation set. In simulations without interactions, thresholdout with random forest and private Evaporative Cooling give comparable accuracies. We also apply these privacy methods to human brain resting-state fMRI data from a study of major depressive disorder.
Code is available at http://insilico.utulsa.edu/software/privateEC. Contact: brett-mckinney@utulsa.edu. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press.
NASA Astrophysics Data System (ADS)
Clough, B.; Russell, M.; Domke, G. M.; Woodall, C. W.
2016-12-01
Uncertainty estimates are needed to establish confidence in national forest carbon stocks and to verify changes reported to the United Nations Framework Convention on Climate Change. Good practice guidance from the Intergovernmental Panel on Climate Change stipulates that uncertainty assessments should neither exaggerate nor underestimate the actual error within carbon stocks, yet methodological guidance for forests has been hampered by limited understanding of how complex dynamics give rise to errors across spatial scales (i.e., individuals to continents). This talk highlights efforts to develop a multi-scale, data-driven framework for assessing uncertainty within the United States (US) forest carbon inventory, and focuses on challenges and opportunities for improving the precision of national forest carbon stock estimates. Central to our approach is the calibration of allometric models with a newly established legacy biomass database for North American tree species, and the use of hierarchical models to link these data with the Forest Inventory and Analysis (FIA) database as well as remote sensing datasets. Our work suggests substantial risk for misestimating key sources of uncertainty including: (1) attributing more confidence in allometric models than what is warranted by the best available data; (2) failing to capture heterogeneity in biomass stocks due to environmental variation at regional scales; and (3) ignoring spatial autocorrelation and other random effects that are characteristic of national forest inventory data. Our results suggest these sources of error may be much higher than is generally assumed, though these results must be understood with the limited scope and availability of appropriate calibration data in mind. 
In addition to reporting on important sources of uncertainty, this talk will discuss opportunities to improve the precision of national forest carbon stocks that are motivated by our use of data-driven forecasting including: (1) improving the taxonomic and geographic scope of available biomass data; (2) direct attribution of landscape-level heterogeneity in biomass stocks to specific ecological processes; and (3) integration of expert opinion and meta-analysis to lessen the influence of often highly variable datasets on biomass stock forecasts.
Forest owners' perceptions of ecotourism: Integrating community values and forest conservation.
Rodríguez-Piñeros, Sandra; Mayett-Moreno, Yesica
2015-03-01
The use of forest land for ecotourism has been well accepted due to its ability to provide income to local people and to conserve the forest. Preparing the forest with infrastructure to attract and educate visitors has been reported to be important. This study applied Q methodology in a small rural community of the State of Puebla, Mexico, to reveal forest owners' perceptions of building infrastructure in their forest as part of their ecotourism project. It also reveals forest owners' underlying motives for using their forest for ecotourism. Ecotourism is perceived as a complementary activity to farming that would allow women to be involved in community development. Low impact infrastructure is desired owing to forest owners' wish to preserve the forest for the overall well-being of the community.
What does it take to get family forest owners to enroll in a forest stewardship-type program?
Michael A. Kilgore; Stephanie A. Snyder; Joseph Schertz; Steven J. Taff
2008-01-01
We estimated the probability of enrollment and factors influencing participation in a forest stewardship-type program, Minnesota's Sustainable Forest Incentives Act, using data from a mail survey of over 1000 randomly-selected Minnesota family forest owners. Of the 15 variables tested, only five were significant predictors of a landowner's interest in...
Karin Riley; Isaac C. Grenfell; Mark A. Finney
2016-01-01
Maps of the number, size, and species of trees in forests across the western United States are desirable for many applications such as estimating terrestrial carbon resources, predicting tree mortality following wildfires, and for forest inventory. However, detailed mapping of trees for large areas is not feasible with current technologies, but statistical...
NASA Astrophysics Data System (ADS)
Mangla, Rohit; Kumar, Shashi; Nandy, Subrata
2016-05-01
SAR and LiDAR remote sensing have already shown the potential of active sensors for forest parameter retrieval. A SAR sensor in fully polarimetric mode has the advantage of retrieving the scattering properties of different components of forest structure, and LiDAR can measure structural information with very high accuracy. This study focused on the retrieval of forest aboveground biomass (AGB) using Terrestrial Laser Scanner (TLS) based point clouds and the scattering properties of forest vegetation obtained from decomposition modelling of RISAT-1 fully polarimetric SAR data. TLS data were acquired for 14 plots of the Timli forest range, Uttarakhand, India. The forest area is dominated by Sal trees, and random sampling with a plot size of 0.1 ha (31.62 m × 31.62 m) was adopted for TLS and field data collection. RISAT-1 data were processed to retrieve SAR-based variables, and 3D imaging based on TLS point clouds was done to retrieve LiDAR-based variables. Surface scattering, double-bounce scattering, volume scattering, helix and wire scattering were the SAR-based variables retrieved from polarimetric decomposition. Tree heights and stem diameters were the LiDAR-based variables, retrieved with the single tree vertical height and least square circle fit methods, respectively. All the variables obtained for the forest plots were used as input to a machine learning based Random Forest Regression Model, which was developed in this study for forest AGB estimation. Modelled output for forest AGB showed reliable accuracy (RMSE = 27.68 t/ha), and a good coefficient of determination (0.63) was obtained through linear regression between modelled AGB and field-estimated AGB. The sensitivity analysis showed that the model was most sensitive to the major contributing variables (stem diameter and volume scattering), which were measured with two different remote sensing techniques. This study strongly recommends the integration of SAR and LiDAR data for forest AGB estimation.
Mapping of land cover in northern California with simulated hyperspectral satellite imagery
NASA Astrophysics Data System (ADS)
Clark, Matthew L.; Kilham, Nina E.
2016-09-01
Land-cover maps are important science products needed for natural resource and ecosystem service management, biodiversity conservation planning, and assessing human-induced and natural drivers of land change. Analysis of hyperspectral, or imaging spectrometer, imagery has shown an impressive capacity to map a wide range of natural and anthropogenic land cover. Applications have been mostly with single-date imagery from relatively small spatial extents. Future hyperspectral satellites will provide imagery at greater spatial and temporal scales, and there is a need to assess techniques for mapping land cover with these data. Here we used simulated multi-temporal HyspIRI satellite imagery over a 30,000 km2 area in the San Francisco Bay Area, California to assess its capabilities for mapping classes defined by the international Land Cover Classification System (LCCS). We employed a mapping methodology and analysis framework that is applicable to regional and global scales. We used the Random Forests classifier with three sets of predictor variables (reflectance, MNF, hyperspectral metrics), two temporal resolutions (summer, spring-summer-fall), two sample scales (pixel, polygon) and two levels of classification complexity (12, 20 classes). Hyperspectral metrics provided a 16.4-21.8% and 3.1-6.7% increase in overall accuracy relative to MNF and reflectance bands, respectively, depending on pixel or polygon scales of analysis. Multi-temporal metrics improved overall accuracy by 0.9-3.1% over summer metrics, yet increases were only significant at the pixel scale of analysis. Overall accuracy at pixel scales was 72.2% (Kappa 0.70) with three seasons of metrics. Anthropogenic and homogenous natural vegetation classes had relatively high confidence and producer and user accuracies were over 70%; in comparison, woodland and forest classes had considerable confusion. 
We next focused on plant functional types with relatively pure spectra by removing open-canopy shrublands, woodlands and mixed forests from the classification. This 12-class map had significantly improved accuracy of 85.1% (Kappa 0.83) and most classes had over 70% producer and user accuracies. Finally, we summarized important metrics from the multi-temporal Random Forests to infer the underlying chemical and structural properties that best discriminated our land-cover classes across seasons.
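The overall accuracy and Kappa values quoted in the abstract above are standard summaries of a confusion matrix. The sketch below shows how both are computed, assuming the usual Cohen's kappa definition and a matrix with reference classes in rows and mapped classes in columns. The function name and the 2-class toy matrix are hypothetical and unrelated to the study's 12- or 20-class results.

```python
def overall_accuracy_and_kappa(matrix):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    matrix[i][j] is the number of samples whose reference class is i
    and whose mapped (predicted) class is j.
    """
    k = len(matrix)
    total = sum(sum(row) for row in matrix)
    if total == 0:
        raise ValueError("empty confusion matrix")
    # Observed agreement: share of samples on the diagonal.
    observed = sum(matrix[i][i] for i in range(k)) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical toy matrix for two classes, 100 samples in total.
toy_matrix = [[40, 10],
              [10, 40]]
accuracy, kappa = overall_accuracy_and_kappa(toy_matrix)
```

Kappa discounts the agreement that would arise by chance from the class proportions alone, which is why it is reported alongside overall accuracy.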
Mapping Migratory Bird Prevalence Using Remote Sensing Data Fusion
Swatantran, Anu; Dubayah, Ralph; Goetz, Scott; Hofton, Michelle; Betts, Matthew G.; Sun, Mindy; Simard, Marc; Holmes, Richard
2012-01-01
Background: Improved maps of species distributions are important for effective management of wildlife under increasing anthropogenic pressures. Recent advances in lidar and radar remote sensing have shown considerable potential for mapping forest structure and habitat characteristics across landscapes. However, their relative efficacies and integrated use in habitat mapping remain largely unexplored. We evaluated the use of lidar, radar and multispectral remote sensing data in predicting multi-year bird detections or prevalence for 8 migratory songbird species in the unfragmented temperate deciduous forests of New Hampshire, USA. Methodology and Principal Findings: A set of 104 predictor variables describing vegetation vertical structure and variability from lidar, phenology from multispectral data and backscatter properties from radar data were derived. We tested the accuracies of these variables in predicting prevalence using Random Forests regression models. All data sets showed more than 30% predictive power with radar models having the lowest and multi-sensor synergy (“fusion”) models having highest accuracies. Fusion explained between 54% and 75% variance in prevalence for all the birds considered. Stem density from discrete return lidar and phenology from multispectral data were among the best predictors. Further analysis revealed different relationships between the remote sensing metrics and bird prevalence. Spatial maps of prevalence were consistent with known habitat preferences for the bird species. Conclusion and Significance: Our results highlight the potential of integrating multiple remote sensing data sets using machine-learning methods to improve habitat mapping. Multi-dimensional habitat structure maps such as those generated from this study can significantly advance forest management and ecological research by facilitating fine-scale studies at both stand and landscape level. PMID:22235254
Finley, Andrew O.; Banerjee, Sudipto; Cook, Bruce D.; Bradford, John B.
2013-01-01
In this paper we detail a multivariate spatial regression model that couples LiDAR, hyperspectral and forest inventory data to predict forest outcome variables at a high spatial resolution. The proposed model is used to analyze forest inventory data collected on the US Forest Service Penobscot Experimental Forest (PEF), ME, USA. In addition to helping meet the regression model's assumptions, results from the PEF analysis suggest that the addition of multivariate spatial random effects improves model fit and predictive ability, compared with two commonly applied modeling approaches. This improvement results from explicitly modeling the covariation among forest outcome variables and spatial dependence among observations through the random effects. Direct application of such multivariate models to even moderately large datasets is often computationally infeasible because of cubic order matrix algorithms involved in estimation. We apply a spatial dimension reduction technique to help overcome this computational hurdle without sacrificing richness in modeling.
Prediction of Nucleotide Binding Peptides Using Star Graph Topological Indices.
Liu, Yong; Munteanu, Cristian R; Fernández Blanco, Enrique; Tan, Zhiliang; Santos Del Riego, Antonino; Pazos, Alejandro
2015-11-01
The nucleotide binding proteins are involved in many important cellular processes, such as transmission of genetic information or energy transfer and storage. Therefore, screening new peptides for this biological function is an important research topic. The current study proposes a mixed methodology to obtain the first classification model able to predict new nucleotide binding peptides using only the amino acid sequence. The methodology combines Star graph molecular descriptors of the peptide sequences with Machine Learning techniques to find the best classifier. The best model is a Random Forest classifier based on two features of the embedded and non-embedded graphs. The performance of the model is excellent compared with similar models in the field, with an Area Under the Receiver Operating Characteristic Curve (AUROC) value of 0.938 and a true positive rate (TPR) of 0.886 (test subset). The prediction of new nucleotide binding peptides with this model could be useful for drug target studies in drug development. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
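The two performance figures reported above, AUROC and TPR, can be computed from classifier scores as sketched below. This is a generic illustration, not the authors' evaluation code: the function names, the 0.5 threshold, and the toy scores and labels are all hypothetical. AUROC is computed here through its pairwise-ranking interpretation, which is simple but quadratic in the number of samples.

```python
def auroc(scores, labels):
    """AUROC as the probability that a randomly chosen positive scores
    higher than a randomly chosen negative (ties count as one half)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def true_positive_rate(scores, labels, threshold):
    """Share of positive samples whose score reaches the threshold."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    if not positives:
        raise ValueError("need at least one positive sample")
    return sum(1 for s in positives if s >= threshold) / len(positives)


# Hypothetical classifier scores and true labels (1 = binding peptide).
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
toy_labels = [1, 1, 0, 1, 0, 0]
toy_auroc = auroc(toy_scores, toy_labels)
toy_tpr = true_positive_rate(toy_scores, toy_labels, threshold=0.5)
```

AUROC summarizes ranking quality over all thresholds, whereas TPR depends on the single threshold chosen, so the two numbers answer different questions about the same classifier.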
F. Rodríguez y Silva; J.R. Molina Martínez; Armando González-Cabán
2014-01-01
Traditional uses of the forest (timber, forage) have been giving way to other uses more in demand (recreation, ecosystem services). An observable consequence of this process of forest land use conversion is an increase in more difficult and extreme wildfires. Wildland forest management and protection program budgets are limited, and managers are requesting help in...
D.J. Hayes; W.B. Cohen
2006-01-01
This article describes the development of a methodology for scaling observations of changes in tropical forest cover to large areas at high temporal frequency from coarse-resolution satellite imagery. The approach for estimating proportional forest cover change as a continuous variable is based on a regression model that relates multispectral, multitemporal Moderate...
Recreation conflict potential and management in the northern/central Black Forest Nature Park
C. Mann; J. D. Absher
2008-01-01
This study explores conflict in recreational use of the Black Forest Nature Park (BFNP) by six different nature sports groups as a function of infrastructure, forest management and other users. A multi-step, methodological triangulation conflict model from US recreation management was applied and tested in the Park. Results from two groups, hikers and mountain bikers,...
Francisco Rodríguez y Silva; Armando González-Cabán
2013-01-01
The abandonment of land, the high energy load generated and accumulated by vegetation covers, climate change and interface scenarios in Mediterranean forest ecosystems are demanding serious attention to forest fire conditions. This is particularly true when dealing with the budget requirements for undertaking protection programs related to the state of current and...
Estimating erosion risks associated with logging and forest roads in northwestern California
Raymond M. Rice; Jack Lewis
1991-01-01
Erosion resulting from logging and road building has long been a concern to forest managers and the general public. An objective methodology was developed to estimate erosion risk on forest roads and in harvest areas on private land in northwestern California. It was based on 260 plots sampled from the area harvested under 415 Timber Harvest Plans...
Quantifying aboveground forest carbon pools and fluxes from repeat LiDAR surveys
Andrew T. Hudak; Eva K. Strand; Lee A. Vierling; John C. Byrne; Jan U. H. Eitel; Sebastian Martinuzzi; Michael J. Falkowski
2012-01-01
Sound forest policy and management decisions to mitigate rising atmospheric CO2 depend upon accurate methodologies to quantify forest carbon pools and fluxes over large tracts of land. LiDAR remote sensing is a rapidly evolving technology for quantifying aboveground biomass and thereby carbon pools; however, little work has evaluated the efficacy of repeat LiDAR...
The Random Forests Statistical Technique: An Examination of Its Value for the Study of Reading
ERIC Educational Resources Information Center
Matsuki, Kazunaga; Kuperman, Victor; Van Dyke, Julie A.
2016-01-01
Studies investigating individual differences in reading ability often involve data sets containing a large number of collinear predictors and a small number of observations. In this article, we discuss the method of Random Forests and demonstrate its suitability for addressing the statistical concerns raised by such data sets. The method is…
ERIC Educational Resources Information Center
Strobl, Carolin; Malley, James; Tutz, Gerhard
2009-01-01
Recursive partitioning methods have become popular and widely used tools for nonparametric regression and classification in many scientific fields. Especially random forests, which can deal with large numbers of predictor variables even in the presence of complex interactions, have been applied successfully in genetics, clinical medicine, and…
Random location of fuel treatments in wildland community interfaces: a percolation approach
Michael Bevers; Philip N. Omi; John G. Hof
2004-01-01
We explore the use of spatially correlated random treatments to reduce fuels in landscape patterns that appear somewhat natural while forming fully connected fuelbreaks between wildland forests and developed protection zones. From treatment zone maps partitioned into grids of hexagonal forest cells representing potential treatment sites, we selected cells to be treated...
Road Network State Estimation Using Random Forest Ensemble Learning
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hou, Yi; Edara, Praveen; Chang, Yohan
Network-scale travel time prediction not only enables traffic management centers (TMC) to proactively implement traffic management strategies, but also allows travelers to make informed decisions about route choices between various origins and destinations. In this paper, a random forest estimator was proposed to predict travel time in a network. The estimator was trained using two years of historical travel time data for a case study network in St. Louis, Missouri. Both temporal and spatial effects were considered in the modeling process. The random forest models predicted travel times accurately during both congested and uncongested traffic conditions. The computational times for the models were low, thus useful for real-time traffic management and traveler information applications.
Adaptive economic and ecological forest management under risk
Joseph Buongiorno; Mo Zhou
2015-01-01
Background: Forest managers must deal with inherently stochastic ecological and economic processes. The future growth of trees is uncertain, and so is their value. The randomness of low-impact, high frequency or rare catastrophic shocks in forest growth has significant implications in shaping the mix of tree species and the forest landscape...
A methodological framework to assess the carbon balance of tropical managed forests.
Piponiot, Camille; Cabon, Antoine; Descroix, Laurent; Dourdain, Aurélie; Mazzei, Lucas; Ouliac, Benjamin; Rutishauser, Ervan; Sist, Plinio; Hérault, Bruno
2016-12-01
Managed forests are a major component of tropical landscapes. Production forests as designated by national forest services cover up to 400 million ha, i.e. half of the forested area in the humid tropics. Forest management thus plays a major role in the global carbon budget, but there is no unified method for estimating carbon fluxes from tropical managed forests. In this study we propose a new time- and spatially-explicit methodology to estimate the above-ground carbon budget of selective logging at regional scale. The yearly balance of a logging unit, i.e. the elementary management unit of a forest estate, is modelled by aggregating three sub-models encompassing (i) emissions from extracted wood, (ii) emissions from logging damage and deforested areas and (iii) carbon storage from post-logging recovery. Models are parametrised and uncertainties are propagated through a MCMC algorithm. As a case study, we used 38 years of National Forest Inventories in French Guiana, northeastern Amazonia, to estimate the above-ground carbon balance (i.e. the net carbon exchange with the atmosphere) of selectively logged forests. Over this period, the net carbon balance of selective logging in the French Guianan Permanent Forest Estate is estimated to lie between 0.12 and 1.33 Tg C, with a median value of 0.64 Tg C. Model uncertainties could be reduced by improving the accuracy of both the logging damage and the large woody necromass decay submodels. We propose an innovative carbon accounting framework relying upon basic logging statistics. This flexible tool allows the carbon budget of tropical managed forests to be estimated in a wide range of tropical regions.
Karin L. Riley; Isaac C. Grenfell; Mark A. Finney
2015-01-01
Mapping the number, size, and species of trees in forests across the western United States has utility for a number of research endeavors, ranging from estimation of terrestrial carbon resources to tree mortality following wildfires. For landscape fire and forest simulations that use the Forest Vegetation Simulator (FVS), a tree-level dataset, or "tree list", is a...
NASA Astrophysics Data System (ADS)
Talab-Ou-Ali, Halima; Niculescu, Simona; Sellin, Vanessa; Bougault, Christophe
2017-10-01
This paper presents a methodology for monitoring vegetation in the Pays de Brest using a new series of Sentinel-1 satellite images combined with Sentinel-2 and SPOT-6 imagery. The work consists of establishing an interferometric method for the main types of vegetation in order to obtain the coherence of a multi-temporal Sentinel-1 radar image series, in SLC format (C band, VV and VH polarization), between 2015 and 2016. We then calculate the radar backscatter coefficient based on Sentinel-1 images in GRD format. Multi-date and multipolarized color compositions will be made to detect changes. The paper also shows the importance of data synergy for obtaining excellent accuracy with Random Forest classification.
Real-Time Detection of In-flight Aircraft Damage
Blair, Brenton; Lee, Herbert K. H.; Davies, Misty
2017-10-02
When there is damage to an aircraft, it is critical to be able to quickly detect and diagnose the problem so that the pilot can attempt to maintain control of the aircraft and land it safely. We develop methodology for real-time classification of flight trajectories to be able to distinguish between an undamaged aircraft and five different damage scenarios. Principal components analysis allows a lower-dimensional representation of multi-dimensional trajectory information in time. Random Forests provide a computationally efficient approach with sufficient accuracy to be able to detect and classify the different scenarios in real-time. We demonstrate our approach by classifying realizations of a 45 degree bank angle generated from the Generic Transport Model flight simulator in collaboration with NASA.
Real-Time Detection of In-flight Aircraft Damage
DOE Office of Scientific and Technical Information (OSTI.GOV)
Blair, Brenton; Lee, Herbert K. H.; Davies, Misty
When there is damage to an aircraft, it is critical to be able to quickly detect and diagnose the problem so that the pilot can attempt to maintain control of the aircraft and land it safely. We develop methodology for real-time classification of flight trajectories to be able to distinguish between an undamaged aircraft and five different damage scenarios. Principal components analysis allows a lower-dimensional representation of multi-dimensional trajectory information in time. Random Forests provide a computationally efficient approach with sufficient accuracy to be able to detect and classify the different scenarios in real-time. We demonstrate our approach by classifying realizations of a 45 degree bank angle generated from the Generic Transport Model flight simulator in collaboration with NASA.
Hubble, Lee J; Cooper, James S; Sosa-Pintos, Andrea; Kiiveri, Harri; Chow, Edith; Webster, Melissa S; Wieczorek, Lech; Raguse, Burkhard
2015-02-09
Chemiresistor sensor arrays are a promising technology to replace current laboratory-based analysis instrumentation, with the advantage of facile integration into portable, low-cost devices for in-field use. To increase the performance of chemiresistor sensor arrays a high-throughput fabrication and screening methodology was developed to assess different organothiol-functionalized gold nanoparticle chemiresistors. This high-throughput fabrication and testing methodology was implemented to screen a library consisting of 132 different organothiol compounds as capping agents for functionalized gold nanoparticle chemiresistor sensors. The methodology utilized an automated liquid handling workstation for the in situ functionalization of gold nanoparticle films and subsequent automated analyte testing of sensor arrays using a flow-injection analysis system. To test the methodology we focused on the discrimination and quantitation of benzene, toluene, ethylbenzene, p-xylene, and naphthalene (BTEXN) mixtures in water at low microgram per liter concentration levels. The high-throughput methodology identified a sensor array configuration consisting of a subset of organothiol-functionalized chemiresistors which in combination with random forests analysis was able to predict individual analyte concentrations with overall root-mean-square errors ranging between 8-17 μg/L for mixtures of BTEXN in water at the 100 μg/L concentration. The ability to use a simple sensor array system to quantitate BTEXN mixtures in water at the low μg/L concentration range has direct and significant implications to future environmental monitoring and reporting strategies. In addition, these results demonstrate the advantages of high-throughput screening to improve the performance of gold nanoparticle based chemiresistors for both new and existing applications.
Mapping the temporary and perennial character of whole river networks
NASA Astrophysics Data System (ADS)
González-Ferreras, A. M.; Barquín, J.
2017-08-01
Knowledge of the spatial distribution of temporary and perennial river channels in a whole catchment is important for effective integrated basin management and river biodiversity conservation. However, this information is usually not available or is incomplete. In this study, we present a statistically based methodology to classify river segments from a whole river network (Deva-Cares catchment, Northern Spain) as temporary or perennial. This method is based on an a priori classification of a subset of river segments as temporary or perennial, using field surveys and aerial images, and then running Random Forest models to predict classification membership for the rest of the river network. The independent variables and the river network were derived following a computer-based geospatial simulation of riverine landscapes. The model results show high values of overall accuracy, sensitivity, and specificity for the evaluation of the fitted model to the training and testing data set (≥0.9). The most important independent variables were catchment area, area occupied by broadleaf forest, minimum monthly precipitation in August, and average catchment elevation. The final map shows 7525 temporary river segments (1012.5 km) and 3731 perennial river segments (662.5 km). A subsequent validation of the mapping results using River Habitat Survey data and expert knowledge supported the validity of the proposed maps. We conclude that the proposed methodology is a valid method for mapping the limits of flow permanence that could substantially increase our understanding of the spatial links between terrestrial and aquatic interfaces, improving the research, management, and conservation of river biodiversity and functioning.
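The overall accuracy, sensitivity, and specificity reported in the abstract above are standard confusion-matrix summaries. The following is a minimal sketch of how they can be computed for a two-class temporary/perennial problem. The function name `binary_metrics` and the eight example labels are hypothetical and are not taken from the study; treating "temporary" as the positive class is also an assumption.

```python
def binary_metrics(y_true, y_pred, positive="temporary"):
    """Return overall accuracy, sensitivity, and specificity for two classes."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return {
        "accuracy": (tp + tn) / len(pairs),   # share of all segments classified correctly
        "sensitivity": tp / (tp + fn),        # share of temporary segments recovered
        "specificity": tn / (tn + fp),        # share of perennial segments recovered
    }


# Made-up field labels and model predictions for eight river segments.
observed = ["temporary", "temporary", "temporary", "perennial",
            "perennial", "temporary", "perennial", "perennial"]
predicted = ["temporary", "temporary", "perennial", "perennial",
             "perennial", "temporary", "temporary", "perennial"]

metrics = binary_metrics(observed, predicted)
print(metrics)
```

Reporting the three values side by side is useful when the classes are unbalanced, because a high overall accuracy alone can hide poor recovery of the rarer class.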
The Role of Satellite Data for the National Forest Monitoring Systems in the Context of REDD+
NASA Astrophysics Data System (ADS)
Jonckheere, Inge
2012-04-01
Reducing Emissions from Deforestation and Forest Degradation (REDD) is an effort to create a financial value for the carbon stored in forests, offering incentives for developing countries to reduce emissions from forested lands and invest in low-carbon paths to sustainable development. "REDD+" goes beyond deforestation and forest degradation, and includes the role of conservation, sustainable management of forests and enhancement of forest carbon stocks. In the framework of getting countries ready for REDD+, the UN-REDD Programme assists developing countries to prepare and implement national REDD+ strategies. For the monitoring, reporting and verification (MRV), FAO supports the countries to develop national forest monitoring systems (NFMS) based on satellite data that allow for credible MRV of REDD+ activities through time. The UN-REDD Programme, through a joint effort of FAO and Brazil's National Space Agency, INPE, is supporting countries to develop cost-effective, robust and compatible national monitoring and MRV systems, providing tools, methodologies, training and knowledge sharing that help countries to strengthen their technical and institutional capacity for effective MRV systems. The Brazilian forest monitoring system, TerraAmazon, which is used on a multi-user basis, allows countries to adapt it to country needs. With the technical assistance of FAO, INPE and other stakeholders, the countries will set up autonomous operational satellite forest monitoring systems. A beta version and the methodologies of the system for DRC and PNG are launched in Durban (SA) during COP 17, while Paraguay, Zambia and Viet Nam are in development in 2012.
Multiple filters affect tree species assembly in mid-latitude forest communities.
Kubota, Y; Kusumoto, B; Shiono, T; Ulrich, W
2018-05-01
Species assembly patterns of local communities are shaped by the balance between multiple abiotic/biotic filters and dispersal that both select individuals from species pools at the regional scale. Knowledge regarding functional assembly can provide insight into the relative importance of the deterministic and stochastic processes that shape species assembly. We evaluated the hierarchical roles of the α niche and β niches by analyzing the influence of environmental filtering relative to functional traits on geographical patterns of tree species assembly in mid-latitude forests. Using forest plot datasets, we examined the α niche traits (leaf and wood traits) and β niche properties (cold/drought tolerance) of tree species, and tested non-randomness (clustering/over-dispersion) of trait assembly based on null models that assumed two types of species pools related to biogeographical regions. For most plots, species assembly patterns fell within the range of random expectation. However, particularly for cold/drought tolerance-related β niche properties, deviation from randomness was frequently found; non-random clustering was predominant in higher latitudes with harsh climates. Our findings demonstrate that both randomness and non-randomness in trait assembly emerged as a result of the α and β niches, although we suggest the potential role of dispersal processes and/or species equalization through trait similarities in generating the prevalence of randomness. Clustering of β niche traits along latitudinal climatic gradients provides clear evidence of species sorting by filtering particular traits. Our results reveal that multiple filters through functional niches and stochastic processes jointly shape geographical patterns of species assembly across mid-latitude forests.
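Null-model tests of non-randomness, as described in the abstract above, are commonly summarised by a standardized effect size (SES): the observed trait dispersion in a plot is compared with the dispersion of random draws from a species pool. The sketch below illustrates that general idea only. The dispersion metric (mean pairwise trait distance), the pool of 40 trait values, and all function names are assumptions made for illustration and are not the authors' procedure.

```python
import random
import statistics


def mean_pairwise_distance(values):
    """Mean absolute trait difference over all pairs of species in a plot."""
    pairs = [(a, b) for i, a in enumerate(values) for b in values[i + 1:]]
    return sum(abs(a - b) for a, b in pairs) / len(pairs)


def standardized_effect_size(plot_traits, pool_traits, n_null=999, seed=0):
    """SES = (observed - mean of null) / sd of null; negative values suggest clustering."""
    rng = random.Random(seed)
    observed = mean_pairwise_distance(plot_traits)
    # Each null community draws as many species as the plot holds, without replacement.
    null = [
        mean_pairwise_distance(rng.sample(pool_traits, len(plot_traits)))
        for _ in range(n_null)
    ]
    return (observed - statistics.mean(null)) / statistics.stdev(null)


# Hypothetical cold-tolerance scores for a regional pool of 40 species.
pool = [float(i) for i in range(1, 41)]
# A plot whose species all have similar tolerance (trait clustering).
clustered_plot = [10.0, 11.0, 12.0, 13.0, 14.0]

ses = standardized_effect_size(clustered_plot, pool)
print(f"SES = {ses:.2f}")
```

An SES near zero falls within random expectation, strongly negative values indicate clustering, and strongly positive values indicate over-dispersion. Changing the species pool changes the null distribution, which is why the choice of pool matters in such tests.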
Hand pose estimation in depth image using CNN and random forest
NASA Astrophysics Data System (ADS)
Chen, Xi; Cao, Zhiguo; Xiao, Yang; Fang, Zhiwen
2018-03-01
Thanks to the availability of low-cost depth cameras such as Microsoft Kinect, 3D hand pose estimation has attracted special research attention in recent years. Due to the large variations in the hand's viewpoint and the high dimensionality of hand motion, 3D hand pose estimation is still challenging. In this paper we propose a two-stage framework that combines a CNN and a Random Forest to boost the performance of hand pose estimation. First, we use a standard Convolutional Neural Network (CNN) to regress the hand joints' locations. Second, we use a Random Forest to refine the joints from the first stage. In the second stage, we propose a pyramid feature which merges the information flow of the CNN. Specifically, we take the rough joint locations from the first stage and rotate the convolutional feature maps (and image). Then, for each joint, we map its location to each feature map (and image), crop features around that location in each feature map (and image), and pass the extracted features to the Random Forest for refinement. Experimentally, we evaluate our proposed method on the ICVL dataset and obtain a mean error of about 11 mm; our method also runs in real time on a desktop.
On Forests and Trees: A Response to Klingner.
ERIC Educational Resources Information Center
Neuman, Susan B.; Koskinen, Patricia
1993-01-01
Responds to criticisms raised in another article in this issue concerning a study of incidental word learning among second-language learners viewing captioned television. Suggests that the criticisms fail to "see the forest for the trees." Responds to specific methodological criticisms. (RS)
Recent drought conditions in the Conterminous United States
Frank H. Koch; William D. Smith; John W. Coulston
2013-01-01
Droughts are common in virtually all U.S. forests, but their frequency and intensity vary widely both between and within forest ecosystems (Hanson and Weltzin 2000). Forests in the Western United States generally exhibit a pattern of annual seasonal droughts. Forests in the Eastern United States tend to exhibit one of two prevailing patterns: random occasional droughts...
Santos, Alexandre Rosa Dos; Antonio Alvares Soares Ribeiro, Carlos; de Oliveira Peluzio, Telma Machado; Esteves Peluzio, João Batista; de Queiroz, Vagner Tebaldi; Figueira Branco, Elvis Ricardo; Lorenzon, Alexandre Simões; Domingues, Getulio Fonseca; Marcatti, Gustavo Eduardo; de Castro, Nero Lemos Martins; Teixeira, Thaisa Ribeiro; Dos Santos, Gleissy Mary Amaral Dino Alves; Santos Mota, Pedro Henrique; Ferreira da Silva, Samuel; Vargas, Rozimelia; de Carvalho, José Romário; Macedo, Leandro Levate; da Silva Araújo, Cintia; de Almeida, Samira Luns Hatum
2016-12-01
The Atlantic Forest biome is recognized for its biodiversity and is one of the most threatened biomes on the planet, with forest fragmentation increasing due to uncontrolled land use, land occupation, and population growth. The most serious aspect of the forest fragmentation process is the edge effect and the loss of biodiversity. In this context, the aim of this study was to evaluate the dynamics of forest fragmentation and select potential forest fragments with a higher degree of conservation for seed harvesting in the Itapemirim river basin, Espírito Santo State, Brazil. Image classification techniques, forest landscape ecology, and multi-criteria analysis were used to evaluate the evolution of forest fragmentation to develop the landscape metric indexes, and to select potential forest fragments for seed harvesting for the years 1985 and 2013. According to the results, there was a reduction of 2.55% of the occupancy of the fragments in the basin between the years 1985 and 2013. For the years 1985 and 2013, forest fragment units 2 and 3 were spatialized with a high potential for seed harvesting, representing 6.99% and 16.01% of the total fragments, respectively. The methodology used in this study has the potential to be used to support decisions for the selection of potential fragments for seed harvesting because selecting fragments in different environments by their spatial attributes provides a greater degree of conservation, contributing to the protection and conscious management of the forests. The proposed methodology can be adapted to other areas and different biomes of the world. Copyright © 2016 Elsevier Ltd. All rights reserved.
Sara A. Goeking; Paul L. Patterson
2013-01-01
The USDA Forest Service's Forest Inventory and Analysis (FIA) Program applies specific sampling and analysis procedures to estimate a variety of forest attributes. FIA's Interior West region uses post-stratification, where strata consist of forest/nonforest polygons based on MODIS imagery, and assumes that nonresponse plots are distributed at random across each stratum...
RF-Phos: A Novel General Phosphorylation Site Prediction Tool Based on Random Forest.
Ismail, Hamid D; Jones, Ahoi; Kim, Jung H; Newman, Robert H; Kc, Dukka B
2016-01-01
Protein phosphorylation is one of the most widespread regulatory mechanisms in eukaryotes. Over the past decade, phosphorylation site prediction has emerged as an important problem in the field of bioinformatics. Here, we report a new method, termed Random Forest-based Phosphosite predictor 2.0 (RF-Phos 2.0), to predict phosphorylation sites given only the primary amino acid sequence of a protein as input. RF-Phos 2.0, which uses random forest with sequence and structural features, is able to identify putative sites of phosphorylation across many protein families. In side-by-side comparisons based on 10-fold cross validation and an independent dataset, RF-Phos 2.0 compares favorably to other popular mammalian phosphosite prediction methods, such as PhosphoSVM, GPS2.1, and Musite.
NASA Astrophysics Data System (ADS)
Norajitra, Tobias; Meinzer, Hans-Peter; Maier-Hein, Klaus H.
2015-03-01
During image segmentation, 3D Statistical Shape Models (SSM) usually conduct a limited search for target landmarks within one-dimensional search profiles perpendicular to the model surface. In addition, landmark appearance is modeled only locally based on linear profiles and weak learners, altogether leading to segmentation errors from landmark ambiguities and limited search coverage. We present a new method for 3D SSM segmentation based on 3D Random Forest Regression Voting. For each surface landmark, a Random Regression Forest is trained that learns a 3D spatial displacement function between the according reference landmark and a set of surrounding sample points, based on an infinite set of non-local randomized 3D Haar-like features. Landmark search is then conducted omni-directionally within 3D search spaces, where voxelwise forest predictions on landmark position contribute to a common voting map which reflects the overall position estimate. Segmentation experiments were conducted on a set of 45 CT volumes of the human liver, of which 40 images were randomly chosen for training and 5 for testing. Without parameter optimization, using a simple candidate selection and a single resolution approach, excellent results were achieved, while faster convergence and better concavity segmentation were observed, altogether underlining the potential of our approach in terms of increased robustness from distinct landmark detection and from better search coverage.
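The voting step described in the abstract above can be pictured as follows: every sampled voxel contributes its predicted displacement to a shared voting map, and the cell with the most votes is taken as the landmark estimate. This is a minimal sketch under stated assumptions. The displacements are hard-coded stand-ins for the output of a trained regression forest, coordinates are integer voxel indices, and `vote_for_landmark` is a hypothetical name.

```python
from collections import Counter


def vote_for_landmark(sample_points, predicted_offsets):
    """Accumulate one vote per sample point at (point + predicted offset)."""
    votes = Counter()
    for (x, y, z), (dx, dy, dz) in zip(sample_points, predicted_offsets):
        votes[(x + dx, y + dy, z + dz)] += 1
    # The voxel with the most votes is the overall position estimate.
    position, count = votes.most_common(1)[0]
    return position, count, votes


# Four sample voxels around a landmark; in practice a regression forest
# would predict each displacement, here they are made-up values.
points = [(0, 0, 0), (2, 1, 0), (5, 5, 5), (4, 4, 3)]
offsets = [(3, 3, 3), (1, 2, 3), (-2, -2, -2), (0, 0, 0)]

position, count, votes = vote_for_landmark(points, offsets)
print(position, count)
```

Because votes come from all directions around the landmark rather than from a single profile line, one outlying prediction (the fourth point here) does not move the estimate.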
NASA Astrophysics Data System (ADS)
Zhang, H.; Roy, D. P.
2016-12-01
Classification is a fundamental process in remote sensing used to relate pixel values to land cover classes present on the surface. The state of the practice for large area land cover classification is to classify satellite time series metrics with a supervised (i.e., training data dependent) non-parametric classifier. Classification accuracy generally increases with training set size. However, training data collection is expensive and the optimal training distribution over large areas is unknown. The MODIS 500 m land cover product is available globally on an annual basis and so provides a potentially very large source of land cover training data. A novel methodology to classify large volume Landsat data using high quality training data derived automatically from the MODIS land cover product is demonstrated for all of the Conterminous United States (CONUS). The known misclassification accuracy of the MODIS land cover product and the scale difference between the 500 m MODIS and 30 m Landsat data are accommodated by a novel MODIS product filtering, Landsat pixel selection, and iterative training approach to balance the proportion of local and CONUS training data used. Three years of global Web-enabled Landsat data (WELD) for all of the CONUS are classified using a random forest classifier and the results assessed using random forest 'out-of-bag' training samples. The global WELD data are corrected to surface nadir BRDF-Adjusted Reflectance and are defined in 158 × 158 km tiles in the same projection and nested to the MODIS land cover products. This reduces the need to pre-process the considerable Landsat data volume (more than 14,000 Landsat 5 and 7 scenes per year over the CONUS covering 11,000 million 30 m pixels). The methodology is implemented in a parallel manner on a WELD tile by tile basis but provides a wall-to-wall seamless 30 m land cover product. Detailed tile and CONUS results are presented and the potential for global production using the recently available global WELD products is discussed.
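The 'out-of-bag' assessment mentioned in the abstract above relies on the fact that each tree in a random forest is grown on a bootstrap sample, so some training samples are left out of every tree and can serve as test data for it. The sketch below shows only that sampling step, not the classifier. The sample size of 1000 and the name `bootstrap_with_oob` are assumptions for illustration.

```python
import random


def bootstrap_with_oob(n_samples, rng):
    """Draw a bootstrap sample (with replacement) and list the indices it missed."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    out_of_bag = sorted(set(range(n_samples)) - set(in_bag))
    return in_bag, out_of_bag


rng = random.Random(42)
in_bag, out_of_bag = bootstrap_with_oob(1000, rng)

# Roughly 1/e (about 37%) of the samples are expected to be out-of-bag for
# any one tree, so each sample is scored by the trees that never saw it.
oob_fraction = len(out_of_bag) / 1000
print(f"out-of-bag fraction: {oob_fraction:.3f}")
```

This gives an accuracy estimate without setting aside a separate validation set, which matters when training data are expensive to collect.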
Nunes, Matheus Henrique
2016-01-01
Tree stem form in native tropical forests is very irregular, posing a challenge to establishing taper equations that can accurately predict the diameter at any height along the stem and subsequently merchantable volume. Artificial intelligence approaches can be useful techniques in minimizing estimation errors within complex variations of vegetation. We evaluated the performance of Random Forest® regression tree and Artificial Neural Network procedures in modelling stem taper. Diameters and volume outside bark were compared to a traditional taper-based equation across a tropical Brazilian savanna, a seasonal semi-deciduous forest and a rainforest. Neural network models were found to be more accurate than the traditional taper equation. Random forest showed trends in the residuals from the diameter prediction and provided the least precise and accurate estimations for all forest types. This study provides insights into the superiority of a neural network, which provided advantages regarding the handling of local effects. PMID:27187074
Electromagnetic wave extinction within a forested canopy
NASA Technical Reports Server (NTRS)
Karam, M. A.; Fung, A. K.
1989-01-01
A forested canopy is modeled by a collection of randomly oriented finite-length cylinders shaded by randomly oriented and distributed disk- or needle-shaped leaves. For a plane wave exciting the forested canopy, the extinction coefficient is formulated in terms of the extinction cross sections (ECSs) in the local frame of each forest component and the Eulerian angles of orientation (used to describe the orientation of each component). The ECSs in the local frame for the finite-length cylinders used to model the branches are obtained by using the forward-scattering theorem. ECSs in the local frame for the disk- and needle-shaped leaves are obtained by the summation of the absorption and scattering cross-sections. The behavior of the extinction coefficients with the incidence angle is investigated numerically for both deciduous and coniferous forest. The dependencies of the extinction coefficients on the orientation of the leaves are illustrated numerically.
Nunes, Matheus Henrique; Görgens, Eric Bastos
2016-01-01
Tree stem form in native tropical forests is very irregular, posing a challenge to establishing taper equations that can accurately predict the diameter at any height along the stem and subsequently merchantable volume. Artificial intelligence approaches can be useful techniques in minimizing estimation errors within complex variations of vegetation. We evaluated the performance of Random Forest® regression tree and Artificial Neural Network procedures in modelling stem taper. Diameters and volume outside bark were compared to a traditional taper-based equation across a tropical Brazilian savanna, a seasonal semi-deciduous forest and a rainforest. Neural network models were found to be more accurate than the traditional taper equation. Random forest showed trends in the residuals from the diameter prediction and provided the least precise and accurate estimations for all forest types. This study provides insights into the superiority of a neural network, which provided advantages regarding the handling of local effects.
E. Freeman; G. Moisen; J. Coulston; B. Wilson
2014-01-01
Random forests (RF) and stochastic gradient boosting (SGB), both involving an ensemble of classification and regression trees, are compared for modeling tree canopy cover for the 2011 National Land Cover Database (NLCD). The objectives of this study were twofold. First, sensitivity of RF and SGB to choices in tuning parameters was explored. Second, performance of the...
Relationship of field and LiDAR estimates of forest canopy cover with snow accumulation and melt
Mariana Dobre; William J. Elliot; Joan Q. Wu; Timothy E. Link; Brandon Glaza; Theresa B. Jain; Andrew T. Hudak
2012-01-01
At the Priest River Experimental Forest in northern Idaho, USA, snow water equivalent (SWE) was recorded over a period of six years on random, equally-spaced plots in ~4.5 ha small watersheds (n=10). Two watersheds were selected as controls and eight as treatments, with two watersheds randomly assigned per treatment as follows: harvest (2007) followed by mastication (...
L.R. Iverson; A.M. Prasad; A. Liaw
2004-01-01
More and better machine learning tools are becoming available for landscape ecologists to aid in understanding species-environment relationships and to map probable species occurrence now and potentially into the future. To that end, we evaluated three statistical models: Regression Tree Analysis (RTA), Bagging Trees (BT) and Random Forest (RF) for their utility in...
Elizabeth A. Freeman; Gretchen G. Moisen; John W. Coulston; Barry T. (Ty) Wilson
2015-01-01
As part of the development of the 2011 National Land Cover Database (NLCD) tree canopy cover layer, a pilot project was launched to test the use of high-resolution photography coupled with extensive ancillary data to map the distribution of tree canopy cover over four study regions in the conterminous US. Two stochastic modeling techniques, random forests (RF...
Chapter 4 - Drought patterns in the conterminous United States and Hawaii.
Frank H. Koch; William D. Smith; John W. Coulston
2014-01-01
Droughts are common in virtually all U.S. forests, but their frequency and intensity vary widely both between and within forest ecosystems (Hanson and Weltzin 2000). Forests in the Western United States generally exhibit a pattern of annual seasonal droughts. Forests in the Eastern United States tend to exhibit one of two prevailing patterns: random occasional droughts...
Steve Zack; William F. Laudenslayer; Luke George; Carl Skinner; William Oliver
1999-01-01
At two different locations in northeast California, an interdisciplinary team of scientists is initiating long-term studies to quantify the effects of forest manipulations intended to accelerate and/or enhance late-successional structure of eastside pine forest ecosystems. One study, at Blacks Mountain Experimental Forest, uses a split-plot, factorial, randomized block...
Probabilistic risk models for multiple disturbances: an example of forest insects and wildfires
Haiganoush K. Preisler; Alan A. Ager; Jane L. Hayes
2010-01-01
Building probabilistic risk models for highly random forest disturbances like wildfire and forest insect outbreaks is challenging. Modeling the interactions among natural disturbances is even more difficult. In the case of wildfire and forest insects, we looked at the probability of a large fire given an insect outbreak and also the incidence of insect outbreaks...
NASA Technical Reports Server (NTRS)
Yao, S. S. (Principal Investigator)
1981-01-01
The planning and scheduling of the use of remote sensing and computer technology to support the land management planning effort at the national forests level are outlined. The task planning and system capability development were reviewed. A user evaluation is presented along with technological transfer methodology. A land management planning pilot test of the San Juan National Forest is discussed.
Michez, Adrien; Piégay, Hervé; Lisein, Jonathan; Claessens, Hugues; Lejeune, Philippe
2016-03-01
Riparian forests are critically endangered by many anthropogenic pressures and natural hazards. The importance of riparian zones has been acknowledged by European Directives, involving multi-scale monitoring. The use of very-high-resolution and hyperspatial imagery in a multi-temporal approach is an emerging topic. The trend is reinforced by the recent and rapid growth of the use of the unmanned aerial system (UAS), which has prompted the development of innovative methodology. Our study proposes a methodological framework to explore how a set of multi-temporal images acquired during a vegetative period can differentiate some of the deciduous riparian forest species and their health conditions. More specifically, the developed approach intends to identify, through a process of variable selection, which variables derived from UAS imagery and which scale of image analysis are the most relevant to our objectives. The methodological framework is applied to two study sites to describe the riparian forest through two fundamental characteristics: the species composition and the health condition. These characteristics were selected not only because of their use as proxies for the riparian zone ecological integrity but also because of their use for river management. The comparison of various scales of image analysis identified the smallest object-based image analysis (OBIA) objects (ca. 1 m²) as the most relevant scale. Variables derived from spectral information (band ratios) were identified as the most appropriate, followed by variables related to the vertical structure of the forest. Classification results show good overall accuracies for the species composition of the riparian forest (five classes, 79.5 and 84.1% for site 1 and site 2). The classification scenario regarding the health condition of the black alders of site 1 performed the best (90.6%). The quality of the classification models developed with a UAS-based, cost-effective, and semi-automatic approach competes successfully with those developed using more expensive imagery, such as multi-spectral and hyperspectral airborne imagery. The high overall accuracy results obtained by the classification of the diseased alders open the door to applications dedicated to monitoring of the health conditions of riparian forest. Our methodological framework will allow UAS users to manage large imagery metric datasets derived from those dense time series.
Utilizing random forests imputation of forest plot data for landscape-level wildfire analyses
Karin L. Riley; Isaac C. Grenfell; Mark A. Finney; Nicholas L. Crookston
2014-01-01
Maps of the number, size, and species of trees in forests across the United States are desirable for a number of applications. For landscape-level fire and forest simulations that use the Forest Vegetation Simulator (FVS), a spatial tree-level dataset, or "tree list", is a necessity. FVS is widely used at the stand level for simulating fire effects on tree mortality,...
Chen, Xuexia; Liu, Shuguang; Zhu, Zhiliang; Vogelmann, James E.; Li, Zhengpeng; Ohlen, Donald O.
2011-01-01
The concentrations of CO2 and other greenhouse gases in the atmosphere have been increasing and greatly affecting global climate and socio-economic systems. Actively growing forests are generally considered to be a major carbon sink, but forest wildfires lead to large releases of biomass carbon into the atmosphere. Aboveground forest biomass carbon (AFBC), an important ecological indicator, and fire-induced carbon emissions at regional scales are highly relevant to forest sustainable management and climate change. It is challenging to accurately estimate the spatial distribution of AFBC across large areas because of the spatial heterogeneity of forest cover types and canopy structure. In this study, Forest Inventory and Analysis (FIA) data, Landsat, and Landscape Fire and Resource Management Planning Tools Project (LANDFIRE) data were integrated in a regression tree model for estimating AFBC at a 30-m resolution in the Utah High Plateaus. AFBC were calculated from 225 FIA field plots and used as the dependent variable in the model. Of these plots, 10% were held out for model evaluation with stratified random sampling, and the other 90% were used as training data to develop the regression tree model. Independent variable layers included Landsat imagery and the derived spectral indicators, digital elevation model (DEM) data and derivatives, biophysical gradient data, existing vegetation cover type and vegetation structure. The cross-validation correlation coefficient (r value) was 0.81 for the training model. Independent validation using withheld plot data was similar with r value of 0.82. This validated regression tree model was applied to map AFBC in the Utah High Plateaus and then combined with burn severity information to estimate loss of AFBC in the Longston fire of Zion National Park in 2001. The final dataset represented 24 forest cover types for a 4 million ha forested area. 
We estimated a total of 353 Tg AFBC with an average of 87 MgC/ha in the Utah High Plateaus. We also estimated that 8054 Mg AFBC were released from 2.24 km2 of burned forest area in the Longston fire. These results demonstrate that an AFBC spatial map and estimated biomass carbon consumption can readily be generated using existing databases. The methodology provides a consistent, practical, and inexpensive way of estimating AFBC at 30-m resolution over large areas throughout the United States.
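The evaluation design in this abstract (10% of plots held out by stratified random sampling, agreement reported as a correlation coefficient r) can be illustrated with a minimal Python sketch. Everything named here is an assumption for illustration: the function names, the use of cover type as the stratum, and the rule of holding out at least one plot per stratum are not taken from the study, and the plot records are synthetic.

```python
import math
import random
from collections import defaultdict


def stratified_holdout(plots, stratum_of, holdout_fraction=0.10, seed=0):
    """Split plots into (train, holdout), sampling the holdout within each stratum."""
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for plot in plots:
        by_stratum[stratum_of(plot)].append(plot)

    train, holdout = [], []
    for members in by_stratum.values():
        members = members[:]  # copy so the caller's list is not shuffled
        rng.shuffle(members)
        # Assumption: keep at least one plot per stratum in the holdout set.
        k = max(1, round(len(members) * holdout_fraction))
        holdout.extend(members[:k])
        train.extend(members[k:])
    return train, holdout


def pearson_r(xs, ys):
    """Pearson correlation between observed and predicted values."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Synthetic stand-in for 225 field plots: (plot id, hypothetical cover type).
plots = [(i, "aspen" if i % 3 == 0 else "conifer") for i in range(225)]
train, holdout = stratified_holdout(plots, stratum_of=lambda p: p[1])
```

In practice the model would be fitted on `train` only, and `pearson_r` would compare its predictions with the observed values on `holdout`.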
NASA Astrophysics Data System (ADS)
Ksoll, Victor F.; Gouliermis, Dimitrios A.; Klessen, Ralf S.; Grebel, Eva K.; Sabbi, Elena; Anderson, Jay; Lennon, Daniel J.; Cignoni, Michele; de Marchi, Guido; Smith, Linda J.; Tosi, Monica; van der Marel, Roeland P.
2018-05-01
The Hubble Tarantula Treasury Project (HTTP) has provided an unprecedented photometric coverage of the entire star-burst region of 30 Doradus down to the half Solar mass limit. We use the deep stellar catalogue of HTTP to identify all the pre-main-sequence (PMS) stars of the region, i.e., stars that have not started their lives on the main-sequence yet. The photometric distinction of these stars from the more evolved populations is not a trivial task due to several factors that alter their colour-magnitude diagram positions. The identification of PMS stars requires, thus, sophisticated statistical methods. We employ Machine Learning Classification techniques on the HTTP survey of more than 800,000 sources to identify the PMS stellar content of the observed field. Our methodology consists of 1) carefully selecting the most probable low-mass PMS stellar population of the star-forming cluster NGC2070, 2) using this sample to train classification algorithms to build a predictive model for PMS stars, and 3) applying this model in order to identify the most probable PMS content across the entire Tarantula Nebula. We employ Decision Tree, Random Forest and Support Vector Machine classifiers to categorise the stars as PMS and Non-PMS. The Random Forest and Support Vector Machine provided the most accurate models, predicting about 20,000 sources with a candidateship probability higher than 50 percent, and almost 10,000 PMS candidates with a probability higher than 95 percent. This is the richest and most accurate photometric catalogue of extragalactic PMS candidates across the extent of a whole star-forming complex.
What variables are important in predicting bovine viral diarrhea virus? A random forest approach.
Machado, Gustavo; Mendoza, Mariana Recamonde; Corbellini, Luis Gustavo
2015-07-24
Bovine viral diarrhea virus (BVDV) causes one of the most economically important diseases in cattle, and the virus is found worldwide. A better understanding of the disease-associated factors is a crucial step towards the definition of strategies for control and eradication. In this study we trained a random forest (RF) prediction model and performed variable importance analysis to identify factors associated with BVDV occurrence. In addition, we assessed the influence of feature selection on RF performance and evaluated its predictive power relative to other popular classifiers and to logistic regression. We found that the RF classification model resulted in an average error rate of 32.03% for the negative class (negative for BVDV) and 36.78% for the positive class (positive for BVDV). The RF model presented an area under the ROC curve equal to 0.702. Variable importance analysis revealed that important predictors of BVDV occurrence were: a) who inseminates the animals, b) number of neighboring farms that have cattle and c) rectal palpation performed routinely. Our results suggest that the use of machine learning algorithms, especially RF, is a promising methodology for the analysis of cross-sectional studies, presenting a satisfactory predictive power and the ability to identify predictors that represent potential risk factors for BVDV investigation. We examined classical predictors and found some new and hard-to-control practices that may lead to the spread of this disease within and among farms, mainly regarding poor or neglected reproduction management, which should be considered for disease control and eradication.
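The two performance measures quoted in this abstract, per-class error rate and area under the ROC curve, can be computed as in the following Python sketch. The function names and the toy labels and scores are illustrative assumptions, not the study's data or code. The AUC is computed from its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case.

```python
def per_class_error(labels, predictions):
    """Fraction of misclassified cases within each true class."""
    errors = {}
    for cls in set(labels):
        cases = [p for y, p in zip(labels, predictions) if y == cls]
        errors[cls] = sum(p != cls for p in cases) / len(cases)
    return errors


def roc_auc(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: 1 = positive herd, 0 = negative herd (hypothetical values).
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)                          # 3 of 4 pairs ordered correctly
errors = per_class_error(labels, [0, 1, 1, 1])         # one negative misclassified
```

The pairwise loop is quadratic in the number of cases, which is fine for a sketch; a sort-based rank formula would be preferable for large datasets.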
Webb, Samuel J; Hanser, Thierry; Howlin, Brendan; Krause, Paul; Vessey, Jonathan D
2014-03-25
A new algorithm has been developed to enable the interpretation of black box models. The developed algorithm is agnostic to learning algorithm and open to all structural based descriptors such as fragments, keys and hashed fingerprints. The algorithm has provided meaningful interpretation of Ames mutagenicity predictions from both random forest and support vector machine models built on a variety of structural fingerprints. A fragmentation algorithm is utilised to investigate the model's behaviour on specific substructures present in the query. An output is formulated summarising causes of activation and deactivation. The algorithm is able to identify multiple causes of activation or deactivation in addition to identifying localised deactivations where the prediction for the query is active overall. No loss in performance is seen as there is no change in the prediction; the interpretation is produced directly on the model's behaviour for the specific query. Models have been built using multiple learning algorithms including support vector machine and random forest. The models were built on public Ames mutagenicity data and a variety of fingerprint descriptors were used. These models produced a good performance in both internal and external validation with accuracies around 82%. The models were used to evaluate the interpretation algorithm. The interpretations obtained link closely with understood mechanisms for Ames mutagenicity. This methodology allows for a greater utilisation of the predictions made by black box models and can expedite further study based on the output for a (quantitative) structure activity model. Additionally the algorithm could be utilised for chemical dataset investigation and knowledge extraction/human SAR development.
Alternative methods to evaluate trial level surrogacy.
Abrahantes, Josè Cortiñas; Shkedy, Ziv; Molenberghs, Geert
2008-01-01
The evaluation and validation of surrogate endpoints have been extensively studied in the last decade. Prentice [1] and Freedman, Graubard and Schatzkin [2] laid the foundations for the evaluation of surrogate endpoints in randomized clinical trials. Later, Buyse et al. [5] proposed a meta-analytic methodology, producing different methods for different settings, which was further studied by Alonso and Molenberghs [9], in their unifying approach based on information theory. In this article, we focus our attention on the trial-level surrogacy and propose alternative procedures to evaluate such surrogacy measure, which do not pre-specify the type of association. A promising correction based on cross-validation is investigated, as well as the construction of confidence intervals for this measure. In order to avoid making assumptions about the type of relationship between the treatment effects and its distribution, a collection of alternative methods, based on regression trees, bagging, random forests, and support vector machines, combined with bootstrap-based confidence intervals and, should one wish, in conjunction with a cross-validation based correction, is proposed and applied. We apply the various strategies to data from three clinical studies: in ophthalmology, in advanced colorectal cancer, and in schizophrenia. The results obtained for the three case studies are compared; they indicate that using random forest or bagging models produces larger estimated values for the surrogacy measure, which are in general more stable, with narrower confidence intervals, than those from linear regression and support vector regression. For the advanced colorectal cancer studies, we even found the trial-level surrogacy is considerably different from what has been reported. In general the alternative methods are more computationally demanding, and the calculation of the confidence intervals in particular requires more computational time than the delta-method counterpart.
First, more flexible modeling techniques can be used, allowing for other types of association. Second, when no cross-validation-based correction is applied, overly optimistic trial-level surrogacy estimates will be found; thus cross-validation is highly recommendable. Third, the use of the delta method to calculate confidence intervals is not recommendable since it makes assumptions valid only in very large samples. It may also produce range-violating limits. We therefore recommend alternatives: bootstrap methods in general. Also, the information-theoretic approach produces results comparable with the bagging and random forest approaches when the cross-validation correction is applied. It is also important to observe that, even for the case in which the linear model might be a good option, bagging methods perform well too, and their confidence intervals were narrower.
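As an illustration of the bootstrap recommendation above, the following Python sketch computes a percentile bootstrap confidence interval for a trial-level R-squared by resampling trials with replacement. It is a simplified sketch under stated assumptions: it uses the linear-model R-squared as the statistic (the abstract's tree-based learners and cross-validation correction are not implemented), and the per-trial treatment effects are invented numbers. Because the interval is read off resampled values of the statistic itself, its limits stay within [0, 1], unlike delta-method limits, which can violate the range.

```python
import random


def r_squared(pairs):
    """Squared Pearson correlation between surrogate and true-endpoint effects."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    if sxx == 0 or syy == 0:
        return 0.0  # degenerate resample: no variation in one of the effects
    return min(1.0, sxy * sxy / (sxx * syy))  # clamp against float rounding


def percentile_bootstrap_ci(pairs, statistic, n_boot=2000, alpha=0.05, seed=0):
    """Percentile interval from resampling whole trials with replacement."""
    rng = random.Random(seed)
    stats = sorted(
        statistic([rng.choice(pairs) for _ in pairs]) for _ in range(n_boot)
    )
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Hypothetical (surrogate effect, true-endpoint effect) pairs, one per trial.
effects = [(0.1, 0.2), (0.4, 0.5), (0.3, 0.2), (0.8, 0.9),
           (0.6, 0.4), (0.9, 1.1), (0.2, 0.1), (0.5, 0.7)]
lower, upper = percentile_bootstrap_ci(effects, r_squared)
```

Any other estimator of the surrogacy measure could be passed as `statistic`, provided it accepts a list of trial-level pairs.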
Random Forest Application for NEXRAD Radar Data Quality Control
NASA Astrophysics Data System (ADS)
Keem, M.; Seo, B. C.; Krajewski, W. F.
2017-12-01
Identification and elimination of non-meteorological radar echoes (e.g., returns from ground, wind turbines, and biological targets) are the basic data quality control steps before radar data use in quantitative applications (e.g., precipitation estimation). Although WSR-88Ds' recent upgrade to dual-polarization has enhanced this quality control and echo classification, there are still challenges to detect some non-meteorological echoes that show precipitation-like characteristics (e.g., wind turbine or anomalous propagation clutter embedded in rain). With this in mind, a new quality control method using Random Forest is proposed in this study. This classification algorithm is known to produce reliable results with less uncertainty. The method introduces randomness into sampling and feature selections and integrates consequent multiple decision trees. The multidimensional structure of the trees can characterize the statistical interactions of involved multiple features in complex situations. The authors explore the performance of the Random Forest method for NEXRAD radar data quality control. Training datasets are selected using several clear cases of precipitation and non-precipitation (but with some non-meteorological echoes). The model is structured using available candidate features (from the NEXRAD data) such as horizontal reflectivity, differential reflectivity, differential phase shift, copolar correlation coefficient, and their horizontal textures (e.g., local standard deviation). The influence of each feature on classification results is quantified by variable importance measures that are automatically estimated by the Random Forest algorithm. Therefore, the number and types of features in the final forest can be examined based on the classification accuracy.
The authors demonstrate the capability of the proposed approach using several cases ranging from distinct to complex rain/no-rain events and compare the performance with the existing algorithms (e.g., MRMS). They also discuss operational feasibility based on the observed strengths and weaknesses of the method.
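The two sources of randomness described in this abstract (random sampling of training cases and random selection of features) and the combination of many trees by voting can be sketched in plain Python. This is a deliberately reduced illustration, not the authors' method: each "tree" is a one-split decision stump, so drawing a feature subset per tree coincides with drawing one per node; the feature values and the labels "rain" and "clutter" are invented; and variable importance is not computed.

```python
import random
from collections import Counter


def fit_stump(rows, labels, features):
    """Choose the (feature, threshold) split with the fewest training errors."""
    best = None
    for f in features:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0]
            errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    if best is None:  # no valid split: always predict the majority class
        majority = Counter(labels).most_common(1)[0][0]
        return (None, None, majority, majority)
    return best[1:]


def predict_stump(stump, row):
    f, t, l_lab, r_lab = stump
    if f is None:
        return l_lab
    return l_lab if row[f] <= t else r_lab


def fit_forest(rows, labels, n_trees=25, n_features=2, seed=0):
    """Bagging plus random feature subsets, one stump per bootstrap sample."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]           # bootstrap sample
        feats = rng.sample(range(len(rows[0])), n_features)  # random feature subset
        forest.append(fit_stump([rows[i] for i in idx],
                                [labels[i] for i in idx], feats))
    return forest


def predict_forest(forest, row):
    """Majority vote over all stumps."""
    votes = Counter(predict_stump(s, row) for s in forest)
    return votes.most_common(1)[0][0]


# Hypothetical radar gates: (reflectivity dBZ, correlation coefficient, texture).
rows = [(35.0, 0.98, 1.0), (28.0, 0.99, 0.8), (42.0, 0.97, 1.5), (31.0, 0.98, 1.1),
        (38.0, 0.70, 6.0), (30.0, 0.65, 7.5), (45.0, 0.80, 5.0), (33.0, 0.75, 6.5)]
labels = ["rain"] * 4 + ["clutter"] * 4
forest = fit_forest(rows, labels)
rain_like = predict_forest(forest, (36.0, 0.99, 0.5))
clutter_like = predict_forest(forest, (40.0, 0.60, 8.0))
```

A full random forest grows deep trees and redraws the feature subset at every node; the stump version keeps the sketch short while preserving the bootstrap, feature-subset, and voting steps.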
Economic vulnerability of timber resources to forest fires.
y Silva, Francisco Rodríguez; Molina, Juan Ramón; González-Cabán, Armando; Machuca, Miguel Ángel Herrera
2012-06-15
The temporal-spatial planning of activities for a territorial fire management program requires knowing the value of forest ecosystems. In this paper we extend the economic valuation principle to the concept of economic vulnerability and present a methodology for the economic valuation of forest production ecosystems. Forest vulnerability is analyzed using criteria intrinsically associated with the forest characterization and with the potential behavior of surface fires. Integrating a mapping process of fire potential and analytical valuation algorithms facilitates the implementation of fire prevention planning. The availability of cartography of economic vulnerability of the forest ecosystems is fundamental for budget optimization and for supporting the decision-making process. Published by Elsevier Ltd.
NASA Technical Reports Server (NTRS)
Spurce, Joseph P.; Hargrove, William; Ryan, Robert E.; Smooth, James C.; Prados, Don; McKellip, Rodney; Sader, Steven A.; Gasser, Jerry; May, George
2008-01-01
This viewgraph presentation reviews a project, the goal of which is to study the potential of MODIS data for monitoring historic gypsy moth defoliation. A NASA/USDA Forest Service (USFS) partnership was formed to perform the study. NASA is helping USFS to implement satellite data products into its emerging Forest Threat Early Warning System. The latter system is being developed by the USFS Eastern and Western Forest Threat Assessment Centers. The USFS Forest Threat Centers want to use MODIS time series data for regional monitoring of forest damage (e.g., defoliation) preferably in near real time. The study's methodology is described, and the results of the study are shown.
Fault Detection of Aircraft System with Random Forest Algorithm and Similarity Measure
Park, Wookje; Jung, Sikhang
2014-01-01
A fault detection algorithm was developed using a similarity measure and the random forest algorithm. The resulting algorithm was applied to an unmanned aerial vehicle (UAV) that we prepared. The similarity measure was designed using distance information, and its usefulness was verified by proof. The fault decision was made by calculating a weighted similarity measure. Twelve available coefficients from the healthy and faulty status data groups were used to determine the decision. The similarity measure weights were obtained through the random forest algorithm (RFA); RF provides data priority. To obtain a fast decision response, a limited number of coefficients was also considered. The relation between detection rate and amount of feature data was analyzed and illustrated. The useful amount of data was obtained by repeated trials of the similarity calculation. PMID:25057508
A primer on stand and forest inventory designs
H. Gyde Lund; Charles E. Thomas
1989-01-01
Covers designs for the inventory of stands and forests in detail and with worked-out examples. For stands, random sampling, line transects, ricochet plot, systematic sampling, single plot, cluster, subjective sampling and complete enumeration are discussed. For forests inventory, the main categories are subjective sampling, inventories without prior stand mapping,...
A methodology for mapping forest latent heat flux densities using remote sensing
NASA Technical Reports Server (NTRS)
Pierce, Lars L.; Congalton, Russell G.
1988-01-01
Surface temperatures and reflectances of an upper elevation Sierran mixed conifer forest were monitored using the Thematic Mapper Simulator sensor during the summer of 1985 in order to explore the possibility of using remote sensing to determine the distribution of solar energy on forested watersheds. The results show that the method is capable of quantifying the relative energy allocation relationships between the two cover types defined in the study. It is noted that the method also has the potential to map forest latent heat flux densities.
John Yarie
1983-01-01
The forest vegetation of 3,600,000 hectares in northeast interior Alaska was classified. A total of 365 plots located in a stratified random design were run through the ordination programs SIMORD and TWINSPAN. A total of 40 forest communities were described vegetatively and, to a limited extent, environmentally. The area covered by each community was similar, ranging...
JoAnn M. Hanowski; Gerald J. Niemi
1995-01-01
We established bird monitoring programs in two regions of Minnesota: the Chippewa National Forest and the Superior National Forest. The experimental design defined forest cover types as strata in which samples of forest stands were randomly selected. Subsamples (3 point counts) were placed in each stand to maximize field effort and to assess within-stand and between-...
Predicting live and dead tree basal area of bark beetle affected forests from discrete-return lidar
Benjamin C. Bright; Andrew T. Hudak; Robert McGaughey; Hans-Erik Andersen; Jose Negron
2013-01-01
Bark beetle outbreaks have killed large numbers of trees across North America in recent years. Lidar remote sensing can be used to effectively estimate forest biomass, but prediction of both live and dead standing biomass in beetle-affected forests using lidar alone has not been demonstrated. We developed Random Forest (RF) models predicting total, live, dead, and...
Valuing the Recreational Benefits from the Creation of Nature Reserves in Irish Forests
Riccardo Scarpa; Susan M. Chilton; W. George Hutchinson; Joseph Buongiorno
2000-01-01
Data from a large-scale contingent valuation study are used to investigate the effects of forest attributes on willingness to pay for forest recreation in Ireland. In particular, the presence of a nature reserve in the forest is found to significantly increase the visitors' willingness to pay. A random utility model is used to estimate the welfare change associated...
Elizabeth A. Freeman; Gretchen G. Moisen; Tracy S. Frescino
2012-01-01
Random Forests is frequently used to model species distributions over large geographic areas. Complications arise when data used to train the models have been collected in stratified designs that involve different sampling intensity per stratum. The modeling process is further complicated if some of the target species are relatively rare on the landscape leading to an...
Unbiased feature selection in learning random forests for high-dimensional data.
Nguyen, Thanh-Tung; Huang, Joshua Zhexue; Nguyen, Thuy Thi
2015-01-01
Random forests (RFs) have been widely used as a powerful classification method. However, with the randomization in both bagging samples and feature selection, the trees in the forest tend to select uninformative features for node splitting. This makes RFs have poor accuracy when working with high-dimensional data. Besides that, RFs have bias in the feature selection process where multivalued features are favored. Aiming at debiasing feature selection in RFs, we propose a new RF algorithm, called xRF, to select good features in learning RFs for high-dimensional data. We first remove the uninformative features using p-value assessment, and the subset of unbiased features is then selected based on some statistical measures. This feature subset is then partitioned into two subsets. A feature weighting sampling technique is used to sample features from these two subsets for building trees. This approach enables one to generate more accurate trees, while allowing one to reduce dimensionality and the amount of data needed for learning RFs. An extensive set of experiments has been conducted on 47 high-dimensional real-world datasets including image datasets. The experimental results have shown that RFs with the proposed approach outperformed the existing random forests in increasing the accuracy and the AUC measures.
A random forest learning assisted "divide and conquer" approach for peptide conformation search.
Chen, Xin; Yang, Bing; Lin, Zijing
2018-06-11
Computational determination of peptide conformations is challenging as it is a problem of finding minima in a high-dimensional space. The "divide and conquer" approach is promising for reliably reducing the search space size. A random forest learning model is proposed here to expand the scope of applicability of the "divide and conquer" approach. A random forest classification algorithm is used to characterize the distributions of the backbone φ-ψ units ("words"). A random forest supervised learning model is developed to analyze the combinations of the φ-ψ units ("grammar"). It is found that amino acid residues may be grouped as equivalent "words", while the φ-ψ combinations in low-energy peptide conformations follow a distinct "grammar". The finding of equivalent words empowers the "divide and conquer" method with the flexibility of fragment substitution. The learnt grammar is used to improve the efficiency of the "divide and conquer" method by removing unfavorable φ-ψ combinations without the need of dedicated human effort. The machine learning assisted search method is illustrated by efficiently searching the conformations of GGG/AAA/GGGG/AAAA/GGGGG through assembling the structures of GFG/GFGG. Moreover, the computational cost of the new method is shown to increase rather slowly with the peptide length.
Evaluation of methodology for detecting/predicting migration of forest species
Dale S. Solomon; William B. Leak
1996-01-01
Available methods for analyzing migration of forest species are evaluated, including simulation models, remeasured plots, resurveys, pollen/vegetation analysis, and age/distance trends. Simulation models have provided some of the most drastic estimates of species changes due to predicted changes in global climate. However, these models require additional testing...
Projecting Timber Inventory at the Product Level
Lawrence Teeter; Xiaoping Zhou
1999-01-01
Current timber inventory projections generally lack information on inventory by product classes. Most models available for inventory projection and linked to supply analyses are limited to projecting aggregate softwood and hardwood. The research presented describes a methodology for distributing the volume on each FIA (USDA Forest Service Forest Inventory and Analysis...
NASA Astrophysics Data System (ADS)
Liu, Chenguang; Cheng, Heng-Da; Zhang, Yingtao; Wang, Yuxuan; Xian, Min
2016-01-01
This paper presents a methodology for tracking multiple skaters in short track speed skating competitions. Nonrigid skaters move at high speed with severe occlusions happening frequently among them. The camera is panned quickly in order to capture the skaters in a large and dynamic scene. Automatically tracking the skaters and precisely outputting their trajectories is therefore a challenging task in object tracking. We employ the global rink information to compensate camera motion and obtain the global spatial information of skaters, utilize random forest to fuse multiple cues and predict the blob of each skater, and finally apply a silhouette- and edge-based template-matching and blob-evolving method to assign pixels to each skater. The effectiveness and robustness of the proposed method are verified through thorough experiments.
Robert G. Ribe
2013-01-01
Perceptions of public forests' acceptability can be influenced by aesthetic qualities, at both broad and project levels, affecting managers' social license to act. Legal and methodological issues related to measuring and managing forest aesthetics in NEPA and NFMA decision-making are discussed. It is argued that conventional visual impact assessments, using...
Hans-Erik Andersen; Jacob Strunk; Hailemariam Temesgen
2011-01-01
Airborne laser scanning, collected in a sampling mode, has the potential to be a valuable tool for estimating the biomass resources available to support bioenergy production in rural communities of interior Alaska. In this study, we present a methodology for estimating forest biomass over a 201,226-ha area (of which 163,913 ha are forested) in the upper Tanana valley...
Prediction of forest fires occurrences with area-level Poisson mixed models.
Boubeta, Miguel; Lombardía, María José; Marey-Pérez, Manuel Francisco; Morales, Domingo
2015-05-01
The number of fires in forest areas of Galicia (north-west of Spain) during the summer period is quite high. Local authorities are interested in analyzing the factors that explain this phenomenon. Poisson regression models are good tools for describing and predicting the number of fires per forest areas. This work employs area-level Poisson mixed models for treating real data about fires in forest areas. A parametric bootstrap method is applied for estimating the mean squared errors of fires predictors. The developed methodology and software are applied to a real data set of fires in forest areas of Galicia. Copyright © 2015 Elsevier Ltd. All rights reserved.
Tustison, Nicholas J; Shrinidhi, K L; Wintermark, Max; Durst, Christopher R; Kandel, Benjamin M; Gee, James C; Grossman, Murray C; Avants, Brian B
2015-04-01
Segmenting and quantifying gliomas from MRI is an important task for diagnosis, planning intervention, and for tracking tumor changes over time. However, this task is complicated by the lack of prior knowledge concerning tumor location, spatial extent, shape, possible displacement of normal tissue, and intensity signature. To accommodate such complications, we introduce a framework for supervised segmentation based on multiple modality intensity, geometry, and asymmetry feature sets. These features drive a supervised whole-brain and tumor segmentation approach based on random forest-derived probabilities. The asymmetry-related features (based on optimal symmetric multimodal templates) demonstrate excellent discriminative properties within this framework. We also gain performance by generating probability maps from random forest models and using these maps for a refining Markov random field regularized probabilistic segmentation. This strategy allows us to interface the supervised learning capabilities of the random forest model with regularized probabilistic segmentation using the recently developed ANTsR package--a comprehensive statistical and visualization interface between the popular Advanced Normalization Tools (ANTs) and the R statistical project. The reported algorithmic framework was the top-performing entry in the MICCAI 2013 Multimodal Brain Tumor Segmentation challenge. The challenge data were widely varying consisting of both high-grade and low-grade glioma tumor four-modality MRI from five different institutions. Average Dice overlap measures for the final algorithmic assessment were 0.87, 0.78, and 0.74 for "complete", "core", and "enhanced" tumor components, respectively.
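The Dice overlap reported at the end of this abstract compares a predicted segmentation with a reference segmentation as twice the size of their intersection divided by the sum of their sizes. A minimal Python sketch follows; the function name, the representation of a segmentation as a set of voxel coordinates, and the convention of returning 1.0 when both sets are empty are assumptions made here for illustration.

```python
def dice_overlap(predicted, reference):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two sets of voxel coordinates."""
    predicted, reference = set(predicted), set(reference)
    if not predicted and not reference:
        return 1.0  # assumed convention: two empty segmentations agree perfectly
    return 2 * len(predicted & reference) / (len(predicted) + len(reference))


# Toy 2-D example: voxels labelled "tumor" by the model and by the reference.
predicted = {(0, 0), (0, 1), (1, 0), (1, 1)}
reference = {(1, 0), (1, 1), (2, 0), (2, 1)}
score = dice_overlap(predicted, reference)  # 2 * 2 / (4 + 4)
```

A score of 1 means identical segmentations and 0 means no shared voxels, so the values 0.87, 0.78, and 0.74 quoted above indicate progressively weaker agreement across the three tumor components.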
Pasqualini, Vanina; Oberti, Pascal; Vigetta, Stéphanie; Riffard, Olivier; Panaïotis, Christophe; Cannac, Magali; Ferrat, Lila
2011-07-01
Forest management can benefit from decision support tools, including GIS-based multicriteria decision-aiding approaches. In the Mediterranean region, Pinus pinaster forests play a very important role in biodiversity conservation and offer many socioeconomic benefits. However, the conservation of this species is affected by the increase in forest fires and the expansion of Matsucoccus feytaudi. This paper proposes a methodology based on commonly available data for assessing the values and risks of P. pinaster forests and for generating maps to aid in decisions pertaining to fire and phytosanitary risk management. The criteria for assessing the values (land cover type, legislative tools for biodiversity conservation, environmental tourist sites and access routes, and timber yield) and the risks (fire and phytosanitation) of P. pinaster forests were obtained directly or by considering specific indicators, and they were subsequently aggregated by means of GIS-based multicriteria analysis. This approach was tested on the island of Corsica (France), and maps to aid in decisions pertaining to fire risk and phytosanitary risk (M. feytaudi) were obtained for P. pinaster forest management. Study results are used by the technical offices of the local administration, the Corsican Agricultural and Rural Development Agency (ODARC), for planning the conservation of P. pinaster forests with regard to fire prevention and safety and phytosanitary risks. The decision maker took part in the evaluation criteria study (weight, normalization, and classification of the values). The most suitable locations are identified to target public intervention. The methodology presented in this paper could be applied to other species and in other Mediterranean regions.
NASA Astrophysics Data System (ADS)
Molinario, G.; Hansen, M.; Potapov, P.
2016-12-01
High resolution satellite imagery obtained from the National Geospatial Intelligence Agency through NASA was used to photo-interpret sample areas within the DRC. The area sampled is a stratification of the forest cover loss from circa 2014 that occurred either completely within the previously mapped homogeneous area of the Rural Complex, at its interface with primary forest, or in isolated forest perforations. Previous research resulted in a map of these areas that contextualizes forest loss depending on where it occurs and with what spatial density, leading to a better understanding of the real impacts of livelihood shifting cultivation on forest degradation. The stratified random sampling approach of these areas allows the characterization of the constituent land cover types within these areas, and their variability throughout the DRC. Shifting cultivation has a variable forest degradation footprint in the DRC depending on many factors that drive it, but its role in forest degradation and deforestation had been disputed, leading us to investigate and quantify the clearing and reuse rates within the strata throughout the country.
Tehran Air Pollutants Prediction Based on Random Forest Feature Selection Method
NASA Astrophysics Data System (ADS)
Shamsoddini, A.; Aboodi, M. R.; Karami, J.
2017-09-01
Air pollution, as one of the most serious forms of environmental pollution, poses a huge threat to human life. Air pollution leads to environmental instability and has harmful and undesirable effects on the environment. Modern prediction methods of pollutant concentration are able to improve decision making and provide appropriate solutions. This study examines the performance of Random Forest feature selection in combination with multiple-linear regression and Multilayer Perceptron Artificial Neural Network methods, in order to achieve an efficient model to estimate carbon monoxide, nitrogen dioxide, sulfur dioxide and PM2.5 contents in the air. The results indicated that Artificial Neural Networks fed by the attributes selected by the Random Forest feature selection method performed more accurately than other models for the modeling of all pollutants. The estimation accuracy of sulfur dioxide emissions was lower than that of the other air contaminants, whereas nitrogen dioxide was predicted more accurately than the other pollutants.
NASA Astrophysics Data System (ADS)
Shi, Jing; Shi, Yunli; Tan, Jian; Zhu, Lei; Li, Hu
2018-02-01
Traditional power forecasting models cannot efficiently take various factors into account, nor can they identify the relevant factors. In this paper, mutual information from information theory and the random forests algorithm from artificial intelligence are introduced into medium- and long-term electricity demand prediction. Mutual information identifies the highly related factors based on the average mutual information between a variety of variables and electricity demand; different industries may be highly associated with different variables. The random forests algorithm was used to build forecasting models for the different industries according to their respective correlated factors. Electricity consumption data from Jiangsu Province are taken as a practical example, and the above methods are compared with methods that disregard mutual information and industry. The simulation results show that the proposed method is scientifically sound and effective, and can provide higher prediction accuracy.
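As a rough illustration of ranking candidate factors by their mutual information with demand, the sketch below computes mutual information for discretised series. The factor names and values are invented for illustration, and the abstract does not say how the authors discretised or estimated mutual information, so this is only one plausible reading.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information in bits between two equally long discrete sequences."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        p_joint = c / n
        # p(x, y) / (p(x) p(y)) written with raw counts
        ratio = c * n / (count_x[x] * count_y[y])
        mi += p_joint * math.log2(ratio)
    return mi


def rank_factors(factors, demand):
    """Sort factor names by mutual information with demand, highest first."""
    scores = {name: mutual_information(values, demand) for name, values in factors.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical discretised observations (eight periods).
demand = ["low", "low", "high", "high", "low", "high", "high", "low"]
factors = {
    "industrial_output": ["low", "low", "high", "high", "low", "high", "high", "low"],
    "rainfall": ["wet", "dry", "wet", "dry", "wet", "dry", "wet", "dry"],
}

ranking = rank_factors(factors, demand)
print(ranking)
```

In this toy data the first factor mirrors demand exactly and the second is statistically independent of it, so only the first would be passed on to a forecasting model.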
Hans-Erik Andersen; Jacob Strunk; Hailemariam Temesgen
2011-01-01
Airborne laser scanning, collected in a sampling mode, has the potential to be a valuable tool for estimating the biomass resources available to support bioenergy production in rural communities of interior Alaska. In this study, we present a methodology for estimating forest biomass over a 201,226-ha area (of which 163,913 ha are forested) in the upper Tanana valley...
Optimal Diameter Growth Equations for Major Tree Species of the Midsouth
Don C. Bragg
2003-01-01
Optimal diameter growth equations for 60 major tree species were fit using the potential relative increment (PRI) methodology. Almost 175,000 individuals from the Midsouth (Arkansas, Louisiana, Missouri, Oklahoma, and Texas) were selected from the USDA Forest Service's Eastwide Forest Inventory Database (EFIDB). These records were then reduced to the individuals...
Reliability and precision of pellet-group counts for estimating landscape-level deer density
David S. deCalesta
2013-01-01
This study provides hitherto unavailable methodology for reliably and precisely estimating deer density within forested landscapes, enabling quantitative rather than qualitative deer management. Reliability and precision of the deer pellet-group technique were evaluated in 1 small and 2 large forested landscapes. Density estimates, adjusted to reflect deer harvest and...
Campioli, M; Malhi, Y; Vicca, S; Luyssaert, S; Papale, D; Peñuelas, J; Reichstein, M; Migliavacca, M; Arain, M A; Janssens, I A
2016-12-14
The eddy-covariance (EC) micro-meteorological technique and the ecology-based biometric methods (BM) are the primary methodologies to quantify CO2 exchange between terrestrial ecosystems and the atmosphere (net ecosystem production, NEP) and its two components, ecosystem respiration and gross primary production. Here we show that EC and BM provide different estimates of NEP, but comparable ecosystem respiration and gross primary production for forest ecosystems globally. Discrepancies between methods are not related to environmental or stand variables, but are consistently more pronounced for boreal forests where carbon fluxes are smaller. BM estimates are prone to underestimation of net primary production and overestimation of leaf respiration. EC biases are not apparent across sites, suggesting the effectiveness of standard post-processing procedures. Our results increase confidence in EC, show in which conditions EC and BM estimates can be integrated, and which methodological aspects can improve the convergence between EC and BM.
Campioli, M.; Malhi, Y.; Vicca, S.; Luyssaert, S.; Papale, D.; Peñuelas, J.; Reichstein, M.; Migliavacca, M.; Arain, M. A.; Janssens, I. A.
2016-01-01
The eddy-covariance (EC) micro-meteorological technique and the ecology-based biometric methods (BM) are the primary methodologies to quantify CO2 exchange between terrestrial ecosystems and the atmosphere (net ecosystem production, NEP) and its two components, ecosystem respiration and gross primary production. Here we show that EC and BM provide different estimates of NEP, but comparable ecosystem respiration and gross primary production for forest ecosystems globally. Discrepancies between methods are not related to environmental or stand variables, but are consistently more pronounced for boreal forests where carbon fluxes are smaller. BM estimates are prone to underestimation of net primary production and overestimation of leaf respiration. EC biases are not apparent across sites, suggesting the effectiveness of standard post-processing procedures. Our results increase confidence in EC, show in which conditions EC and BM estimates can be integrated, and which methodological aspects can improve the convergence between EC and BM. PMID:27966534
ERIC Educational Resources Information Center
Ndirangu, Caroline
2017-01-01
This study aims to evaluate teachers' attitude towards implementation of learner-centered methodology in science education in Kenya. The study used a survey design methodology, adopting the purposive, stratified random and simple random sampling procedures and hypothesised that there was no significant relationship between the head teachers'…
Relating Vegetation Aerodynamic Roughness Length to Interferometric SAR Measurements
NASA Technical Reports Server (NTRS)
Saatchi, Sassan; Rodriquez, Ernesto
1998-01-01
In this paper, we investigate the feasibility of estimating aerodynamic roughness parameter from interferometric SAR (INSAR) measurements. The relation between the interferometric correlation and the rms height of the surface is presented analytically. Model simulations performed over realistic canopy parameters obtained from field measurements in boreal forest environment demonstrate the capability of the INSAR measurements for estimating and mapping surface roughness lengths over forests and/or other vegetation types. The procedure for estimating this parameter over boreal forests using the INSAR data is discussed and the possibility of extending the methodology over tropical forests is examined.
Application of Remote Sensing for Forest Management in Nepal
NASA Astrophysics Data System (ADS)
Bajracharya, B.; Matin, M. A.
2016-12-01
A large area of the Hindu Kush Himalayan (HKH) region is covered by forest, which plays a vital role in addressing the challenges of climate change and in providing livelihood options for a growing population. Effective management of forest cover requires a regular forest monitoring system. Supporting REDD assessment requires a reliable baseline assessment of forest biomass and its monitoring at multiple scales. Adapting forests to climate change requires an understanding of the vulnerability of forests and of the dependence of local communities on them. We present here different forest monitoring products developed under the SERVIR-Himalaya programme to address these issues. Landsat 30-meter images were used for decadal land cover change assessment and annual monitoring of forest change hotspots. A methodology was developed for biomass estimation at national and sub-national levels. A decision support system was developed for analysis of forest vulnerability and dependence and for selection of adaptation options based on resource availability. These products form the basis for the development of an integrated system that will be very useful for comprehensive forest monitoring and long-term strategy development for sustainable forest management.
2012-01-01
Background Forests of the Midwest U.S. provide numerous ecosystem services. Two of these, carbon sequestration and wood production, are often portrayed as conflicting. Currently, carbon management and biofuel policies are being developed to reduce atmospheric CO2 and national dependence on foreign oil, and increase carbon storage in ecosystems. However, the biological and industrial forest carbon cycles are rarely studied in a whole-system structure. The forest system carbon balance is the difference between the biological (net ecosystem production) and industrial (net emissions from forest industry) forest carbon cycles, but to date this critical whole system analysis is lacking. This study presents a model of the forest system, uses it to compute the carbon balance, and outlines a methodology to maximize future carbon uptake in a managed forest region. Results We used a coupled forest ecosystem process and forest products life cycle inventory model for a regional temperate forest in the Midwestern U.S., and found the net system carbon balance for this 615,000 ha forest was positive (2.29 t C ha⁻¹ yr⁻¹). The industrial carbon budget was typically less than 10% of the biological system annually, and averaged 0.082 t C ha⁻¹ yr⁻¹. Net C uptake over the next 100 years increased by 22% or 0.33 t C ha⁻¹ yr⁻¹ relative to the current harvest rate in the study region under the optimized harvest regime. Conclusions The current and future carbon uptake capacity of the forest's biological ecosystem is largely determined by forest harvest practices that occurred over a century ago, but we show that an optimized harvesting strategy would increase future carbon sequestration, or wood production, by 20-30%, reduce long transportation chain emissions, and maintain many desirable stand structural attributes that are correlated with biodiversity.
Our results for this forest region suggest that increasing harvest over the next 100 years increases the strength of the carbon sink, and that carbon sequestration and wood production are not conflicting for this particular forest ecosystem. The optimal harvest strategy found here may not be the same for all forests, but the methodology is applicable anywhere sufficient forest inventory data exist. PMID:22713794
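The definition of the forest system carbon balance given in this abstract (biological net ecosystem production minus net industrial emissions) can be worked through with the two reported figures. The sketch assumes that the reported 2.29 is the net system balance and that 0.082 is the industrial term on the same per-hectare basis; the implied net ecosystem production is derived here and is not a number stated in the abstract.

```python
def system_carbon_balance(nep, industrial_emissions):
    """Forest system balance = biological NEP minus net industrial emissions."""
    return nep - industrial_emissions


# Values reported in the abstract, in t C per hectare per year.
net_balance = 2.29   # net system carbon balance (assumed to be the net figure)
industrial = 0.082   # average industrial carbon budget

# Derived under the assumption above, not reported by the authors.
implied_nep = net_balance + industrial
industrial_share = industrial / implied_nep

print(round(implied_nep, 3))             # implied biological uptake
print(round(100 * industrial_share, 1))  # industrial term as a percentage of NEP
```

Under this reading the industrial term is a few percent of the biological uptake, which is consistent with the abstract's statement that it was typically less than 10% of the biological system.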
Characterization of Canopy Layering in Forested Ecosystems Using Full Waveform Lidar
NASA Technical Reports Server (NTRS)
Whitehurst, Amanda S.; Swatantran, Anu; Blair, J. Bryan; Hofton, Michelle A.; Dubayah, Ralph
2013-01-01
Canopy structure, the vertical distribution of canopy material, is an important element of forest ecosystem dynamics and habitat preference. Although vertical stratification, or "canopy layering," is a basic characterization of canopy structure for research and forest management, it is difficult to quantify at landscape scales. In this paper we describe canopy structure and develop methodologies to map forest vertical stratification in a mixed temperate forest using full-waveform lidar. Two definitions (one categorical and one continuous) are used to map canopy layering over Hubbard Brook Experimental Forest, New Hampshire with lidar data collected in 2009 by NASA's Laser Vegetation Imaging Sensor (LVIS). The two resulting canopy layering datasets describe variation of canopy layering throughout the forest and show that layering varies with terrain elevation and canopy height. This information should provide increased understanding of vertical structure variability and aid habitat characterization and other forest management activities.
Michael G. Shelton
1995-01-01
Five forest floor weights (0, 10, 20, 30, and 40 Mg/ha), three forest floor compositions (pine, pine-hardwood, and hardwood), and two seed placements (forest floor and soil surface) were tested in a three-factorial, split-plot design with four incomplete, randomized blocks. The experiment was conducted in a nursery setting and used wooden frames to define 0.145-m
Extrapolating intensified forest inventory data to the surrounding landscape using landsat
Evan B. Brooks; John W. Coulston; Valerie A. Thomas; Randolph H. Wynne
2015-01-01
In 2011, a collection of spatially intensified plots was established on three of the Experimental Forests and Ranges (EFRs) sites with the intent of facilitating FIA program objectives for regional extrapolation. Characteristic coefficients from harmonic regression (HR) analysis of associated Landsat stacks are used as inputs into a conditional random forests model to...
Randall J. Wilk; Timothy B. Harrington; Robert A. Gitzen; Chris C. Maguire
2015-01-01
We evaluated the two-year effects of variable-retention harvest on chipmunk (Tamias spp.) abundance (N̂) and habitat in mature coniferous forests in western Oregon and Washington because wildlife responses to density/pattern of retained trees remain largely unknown. In a randomized complete-block design, six...
Highlights of the national evaluation of the Forest Stewardship Planning Program
R.J. Moulton; J.D. Esseks
2001-01-01
In 1998 and 1999, a nationwide random sample of 1238 nonindustrial private (NIPF) landowners with approved multiple resource Forest Stewardship Plans were interviewed to determine if this program is meeting its Congressional mandate of promoting sustainable management of forest resources on NIPF ownerships. It was found that two-thirds of program participants had never...
Ownership and ecosystem as sources of spatial heterogeneity in a forested landscape, Wisconsin, USA
Thomas R. Crow; George E. Host; David J. Mladenoff
1999-01-01
The interaction between physical environment and land ownership in creating spatial heterogeneity was studied in largely forested landscapes of northern Wisconsin, USA. A stratified random approach was used in which 2500-ha plots representing two ownerships (National Forest and private non-industrial) were located within two regional ecosystems (extremely well-drained...
2012-03-01
with each SVM discriminating between a pair of the N total speakers in the data set. The (N(N + 1))/2 classifiers then vote on the final...classification of a test sample. The Random Forest classifier is an ensemble classifier that votes amongst decision trees generated with each node using...Forest vote, and the effects of overtraining will be mitigated by the fact that each decision tree is overtrained differently (due to the random
Probability machines: consistent probability estimation using nonparametric learning machines.
Malley, J D; Kruppa, J; Dasgupta, A; Malley, K G; Ziegler, A
2012-01-01
Most machine learning approaches only provide a classification for binary responses. However, probabilities are required for risk estimation using individual patient characteristics. It has been shown recently that every statistical learning machine known to be consistent for a nonparametric regression problem is a probability machine that is provably consistent for this estimation problem. The aim of this paper is to show how random forests and nearest neighbors can be used for consistent estimation of individual probabilities. Two random forest algorithms and two nearest neighbor algorithms are described in detail for estimation of individual probabilities. We discuss the consistency of random forests, nearest neighbors and other learning machines in detail. We conduct a simulation study to illustrate the validity of the methods. We exemplify the algorithms by analyzing two well-known data sets on the diagnosis of appendicitis and the diagnosis of diabetes in Pima Indians. Simulations demonstrate the validity of the method. With the real data application, we show the accuracy and practicality of this approach. We provide sample code from R packages in which the probability estimation is already available. This means that all calculations can be performed using existing software. Random forest algorithms as well as nearest neighbor approaches are valid machine learning methods for estimating individual probabilities for binary responses. Freely available implementations are available in R and may be used for applications.
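One way to read the nearest-neighbor side of this abstract is that the estimated probability for an individual is the proportion of positive outcomes among that individual's k nearest training cases. The sketch below is a minimal illustration of that idea with invented one-dimensional data; it is not the R code the authors refer to, and the choice of squared Euclidean distance is an assumption.

```python
def knn_probability(train, query, k):
    """Estimate P(y = 1 | x = query) as the share of positives among the k nearest points.

    train: list of (feature_tuple, label) pairs with labels 0 or 1.
    """
    def squared_distance(item):
        features, _ = item
        return sum((a - b) ** 2 for a, b in zip(features, query))

    neighbours = sorted(train, key=squared_distance)[:k]
    return sum(label for _, label in neighbours) / k


# Hypothetical training data: one feature, binary outcome.
train = [
    ((1.0,), 0), ((1.2,), 0), ((0.8,), 0),
    ((1.9,), 1), ((3.0,), 1), ((3.2,), 1),
]

print(knn_probability(train, (2.5,), k=5))  # probability estimate, not a hard class
print(knn_probability(train, (1.0,), k=3))
```

The function returns a probability rather than a class label, which is the distinction the abstract draws between classification and individual risk estimation.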
A random forest algorithm for nowcasting of intense precipitation events
NASA Astrophysics Data System (ADS)
Das, Saurabh; Chakraborty, Rohit; Maitra, Animesh
2017-09-01
Automatic nowcasting of convective initiation and thunderstorms has potential applications in several sectors, including aviation planning and disaster management. In this paper, a random forest based machine learning algorithm is tested for nowcasting of convective rain with a ground-based radiometer. Brightness temperatures measured at 14 frequencies (7 frequencies in the 22-31 GHz band and 7 frequencies in the 51-58 GHz band) are used as the inputs of the model. The lower frequency band is associated with water vapor absorption, whereas the upper frequency band relates to oxygen absorption; together they provide information on the temperature and humidity of the atmosphere. The synthetic minority over-sampling technique is used to balance the data set, and 10-fold cross validation is used to assess the performance of the model. Results indicate that the random forest algorithm with fixed alarm generation times of 30 min and 60 min performs quite well (probability of detection for all types of weather condition ∼90%) with low false alarms. It is also observed, however, that reducing the alarm generation time improves the threat score significantly and decreases false alarms. The proposed model is found to be very sensitive to boundary layer instability, as indicated by the variable importance measure. The study shows the suitability of a random forest algorithm for nowcasting applications that use a large number of input parameters from diverse sources, and the algorithm can be applied to other forecasting problems.
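The verification measures named in this abstract (probability of detection, false alarms, threat score) are commonly computed from a contingency table of forecast alarms against observed events. The sketch below uses the usual textbook definitions with invented counts; the abstract does not give the underlying counts, and reading "false alarms" as the false alarm ratio is an assumption.

```python
def nowcast_scores(hits, misses, false_alarms):
    """Return (probability of detection, false alarm ratio, threat score).

    Assumes at least one observed event and at least one issued alarm,
    so that no denominator is zero.
    """
    pod = hits / (hits + misses)                    # share of events that were warned
    far = false_alarms / (hits + false_alarms)      # share of alarms that were wrong
    threat = hits / (hits + misses + false_alarms)  # also called critical success index
    return pod, far, threat


# Hypothetical verification counts for one alarm lead time.
pod, far, threat = nowcast_scores(hits=90, misses=10, false_alarms=20)
print(round(pod, 3), round(far, 3), round(threat, 3))
```

The threat score penalises both misses and false alarms, so it can improve at a shorter lead time even when the probability of detection is already high.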
Learning accurate and interpretable models based on regularized random forests regression
2014-01-01
Background Many biology-related studies combine data from multiple sources in an effort to understand the underlying problems. It is important to find and interpret the most important information from these sources. An effective algorithm that can simultaneously extract decision rules and select critical features for good interpretation while preserving prediction performance would therefore be beneficial. Methods In this study, we focus on regression problems for biological data where target outcomes are continuous. In general, models constructed from linear regression approaches are relatively easy to interpret. However, many practical biological applications are nonlinear in essence, and a direct linear relationship between input and output can hardly be found. Nonlinear regression techniques can reveal nonlinear relationships in data, but are generally hard for humans to interpret. We propose a rule-based regression algorithm that uses 1-norm regularized random forests. The proposed approach simultaneously extracts a small number of rules from generated random forests and eliminates unimportant features. Results We tested the approach on some biological data sets. The proposed approach is able to construct a significantly smaller set of regression rules using a subset of attributes while achieving prediction performance comparable to that of random forests regression. Conclusion The approach demonstrates high potential for aiding prediction and interpretation of nonlinear relationships in the subject being studied. PMID:25350120
How knowledge influences a MCDM analysis: WOCAT Portuguese experience on prevention of forest fires
NASA Astrophysics Data System (ADS)
Carreiras, M.; Ferreira, A. J. D.; Moreira, J.; Esteves, T. C. J.; Valente, S.; Soares, J.; Coelho, C. O. A.; Schwilch, G.; Bachmann, F.
2012-04-01
Forest management is a major concern for land managers because of its impact on biomass production, surface water quality and landscape beauty. In pursuing a holistic view of the issue (considering economic, environmental and social aspects), an appreciation of the variety of policies and techniques is considered essential because of its importance for sustainability. In this context, multi-criteria decision-making (MCDM) could be an important tool for establishing how the forest is used. It could be used to draw on the preferences of decision-makers, stakeholders, or environmental experts, obtaining economic values for impacts whose monetization remains problematic. WOCAT has developed a framework for Sustainable Land Management knowledge, covering all steps from data collection and database implementation to decision support. The WOCAT methodology supports knowledge of environmental risks as well as stakeholder participation and involvement. It opens discussion of the territory's issues and, through a participatory, integrative, holistic and impartial process, identifies environmental problems. In the end, guidelines and actions for the territory are settled based on the problems identified. Being actively participatory in nature, the process proves to be an excellent vehicle for public participation. The methodology also brings the territory's decision-makers into contact with the stakeholders. The procedure for identification, assessment and selection of strategies was developed by the EU project DESIRE in collaboration with WOCAT. The methodology was tested by DESIRE in 16 study sites around the world. As an outcome of the procedure, the methodology may serve as a basis for prioritizing land-use policies, conservation measures and research at regional and national levels, and it integrates several exercises for this purpose.
In Portugal, forest fires are one of the major drivers of land degradation. Affecting large areas every year, they also have serious human, socio-economic and psychological impacts. Under the DESIRE project two Portuguese study sites were selected: Góis and Mação. Both study sites are located in Central Portugal and are frequently affected by forest fires. At present, the solutions applied at the local level relate to the prevention, combat and mitigation of forest fires. At a higher level of analysis, the main solution is the diversification of land uses, mainly through a mixture of cropland, pastures and forest areas. However, the selection of techniques is not yet an open, participative and effective process, and the interests of land users are not represented most of the time. This paper presents the WOCAT approach to forest fire prevention in Portugal and its results, considering stakeholders' perspectives and policy recommendations, and how the approach has evolved with increased knowledge.
The contribution of competition to tree mortality in old-growth coniferous forests
Das, A.; Battles, J.; Stephenson, N.L.; van Mantgem, P.J.
2011-01-01
Competition is a well-documented contributor to tree mortality in temperate forests, with numerous studies documenting a relationship between tree death and the competitive environment. Models frequently rely on competition as the only non-random mechanism affecting tree mortality. However, for mature forests, competition may cease to be the primary driver of mortality. We use a large, long-term dataset to study the importance of competition in determining tree mortality in old-growth forests on the western slope of the Sierra Nevada of California, U.S.A. We make use of the comparative spatial configuration of dead and live trees, changes in tree spatial pattern through time, and field assessments of contributors to an individual tree's death to quantify competitive effects. Competition was apparently a significant contributor to tree mortality in these forests. Trees that died tended to be in more competitive environments than trees that survived, and suppression frequently appeared as a factor contributing to mortality. On the other hand, based on spatial pattern analyses, only three of 14 plots demonstrated compelling evidence that competition was dominating mortality. Most of the rest of the plots fell within the expectation for random mortality, and three fit neither the random nor the competition model. These results suggest that while competition is often playing a significant role in tree mortality processes in these forests it only infrequently governs those processes. In addition, the field assessments indicated a substantial presence of biotic mortality agents in trees that died. While competition is almost certainly important, demographics in these forests cannot accurately be characterized without a better grasp of other mortality processes. In particular, we likely need a better understanding of biotic agents and their interactions with one another and with competition. © 2011.
Allometric scaling theory applied to FIA biomass estimation
David C. Chojnacky
2002-01-01
Tree biomass estimates in the Forest Inventory and Analysis (FIA) database are derived from numerous methodologies whose abundance and complexity raise questions about consistent results throughout the U.S. A new model based on allometric scaling theory ("WBE") offers simplified methodology and a theoretically sound basis for improving the reliability and...
Analysis of Machine Learning Techniques for Heart Failure Readmissions.
Mortazavi, Bobak J; Downing, Nicholas S; Bucholz, Emily M; Dharmarajan, Kumar; Manhapra, Ajay; Li, Shu-Xia; Negahban, Sahand N; Krumholz, Harlan M
2016-11-01
The current ability to predict readmissions in patients with heart failure is modest at best. It is unclear whether machine learning techniques that address higher dimensional, nonlinear relationships among variables would enhance prediction. We sought to compare the effectiveness of several machine learning algorithms for predicting readmissions. Using data from the Telemonitoring to Improve Heart Failure Outcomes trial, we compared the effectiveness of random forests, boosting, random forests combined hierarchically with support vector machines or logistic regression (LR), and Poisson regression against traditional LR to predict 30- and 180-day all-cause readmissions and readmissions because of heart failure. We randomly selected 50% of patients for a derivation set, and a validation set comprised the remaining patients, validated using 100 bootstrapped iterations. We compared C statistics for discrimination and distributions of observed outcomes in risk deciles for predictive range. In 30-day all-cause readmission prediction, the best performing machine learning model, random forests, provided a 17.8% improvement over LR (mean C statistics, 0.628 and 0.533, respectively). For readmissions because of heart failure, boosting improved the C statistic by 24.9% over LR (mean C statistic 0.678 and 0.543, respectively). For 30-day all-cause readmission, the observed readmission rates in the lowest and highest deciles of predicted risk with random forests (7.8% and 26.2%, respectively) showed a much wider separation than LR (14.2% and 16.4%, respectively). Machine learning methods improved the prediction of readmission after hospitalization for heart failure compared with LR and provided the greatest predictive range in observed readmission rates. © 2016 American Heart Association, Inc.
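The percentage improvements quoted in this abstract are relative changes in the mean C statistic, which can be checked directly from the reported values. The sketch below also includes a small pairwise implementation of the C statistic for a binary outcome; the scores and outcomes in that part are invented, and the study's own estimation procedure (100 bootstrapped iterations) is not reproduced.

```python
def c_statistic(scores, outcomes):
    """Share of (event, non-event) pairs in which the event has the higher score.

    Ties count as half a concordant pair. outcomes are 0 or 1.
    """
    events = [s for s, y in zip(scores, outcomes) if y == 1]
    non_events = [s for s, y in zip(scores, outcomes) if y == 0]
    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


def relative_improvement(new, baseline):
    """Relative change of `new` over `baseline`, in percent."""
    return (new - baseline) / baseline * 100


# Mean C statistics reported in the abstract.
print(round(relative_improvement(0.628, 0.533), 1))  # random forests vs LR, 30-day all-cause
print(round(relative_improvement(0.678, 0.543), 1))  # boosting vs LR, heart failure readmission

# Hypothetical predicted risks and observed readmissions.
print(c_statistic([0.10, 0.40, 0.35, 0.80], [0, 0, 1, 1]))
```

The two printed percentages reproduce the 17.8% and 24.9% figures in the abstract, which confirms that they are relative rather than absolute differences in the C statistic.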
NASA Astrophysics Data System (ADS)
Sheet, Debdoot; Karamalis, Athanasios; Kraft, Silvan; Noël, Peter B.; Vag, Tibor; Sadhu, Anup; Katouzian, Amin; Navab, Nassir; Chatterjee, Jyotirmoy; Ray, Ajoy K.
2013-03-01
Breast cancer is the most common form of cancer in women. Early diagnosis can significantly improve life expectancy and allow different treatment options. Clinicians favor 2D ultrasonography for breast tissue abnormality screening because of its high sensitivity and specificity compared with competing technologies. However, inter- and intra-observer variability in visual assessment and reporting of lesions often handicaps its performance. Existing Computer Assisted Diagnosis (CAD) systems, although able to detect solid lesions, are often restricted in performance. These restrictions are the inability to (1) detect lesions of multiple sizes and shapes, and (2) differentiate hypo-echoic lesions from their posterior acoustic shadowing. In this work we present a completely automatic system for detection and segmentation of breast lesions in 2D ultrasound images. We employ random forests to learn a tissue-specific primal that discriminates breast lesions from surrounding normal tissues. This enables the system to detect lesions of multiple shapes and sizes, and to discriminate hypo-echoic lesions from associated posterior acoustic shadowing. The primal comprises (i) multiscale estimated ultrasonic statistical physics and (ii) scale-space characteristics. The random forest learns the lesion vs. background primal from a database of 2D ultrasound images with labeled lesions. For segmentation, the posterior probabilities of lesion pixels estimated by the learnt random forest are hard thresholded to provide a random walks segmentation stage with starting seeds. Our method achieves detection with 99.19% accuracy and segmentation with mean contour-to-contour error < 3 pixels on a set of 40 images with 49 lesions.
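The seeding step described in this abstract, in which per-pixel posterior probabilities are hard thresholded to give a random walks stage its starting seeds, might look roughly like the sketch below. The two threshold values, the use of a separate background threshold, and the tiny probability map are assumptions made for illustration; the abstract does not state the thresholds used.

```python
LESION = 1
BACKGROUND = 0
UNLABELED = None


def seeds_from_posterior(posterior, lesion_thr=0.9, background_thr=0.1):
    """Turn a 2D map of lesion probabilities into seed labels.

    Pixels at or above lesion_thr become lesion seeds, pixels at or below
    background_thr become background seeds, and the rest stay unlabeled
    for a later segmentation stage to resolve.
    """
    seeds = []
    for row in posterior:
        seed_row = []
        for p in row:
            if p >= lesion_thr:
                seed_row.append(LESION)
            elif p <= background_thr:
                seed_row.append(BACKGROUND)
            else:
                seed_row.append(UNLABELED)
        seeds.append(seed_row)
    return seeds


# Hypothetical 3x3 posterior map, as a classifier might produce per pixel.
posterior = [
    [0.05, 0.20, 0.95],
    [0.50, 0.92, 0.97],
    [0.02, 0.08, 0.60],
]

for row in seeds_from_posterior(posterior):
    print(row)
```

Only confidently classified pixels become seeds, which leaves the ambiguous boundary region to the segmentation stage rather than to the classifier alone.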
Golkarian, Ali; Naghibi, Seyed Amir; Kalantar, Bahareh; Pradhan, Biswajeet
2018-02-17
The ever-increasing demand for water resources for different purposes makes a better understanding of those resources essential. Groundwater is one of the main water resources, especially in countries with arid climates. This study therefore seeks to provide groundwater potential maps (GPMs) using new algorithms, and aims to validate the performance of the C5.0, random forest (RF), and multivariate adaptive regression splines (MARS) algorithms for generating GPMs in the eastern part of Mashhad Plain, Iran. For this purpose, a dataset was produced consisting of spring locations as the indicator and groundwater-conditioning factors (GCFs) as input. In this research, 13 GCFs were selected, including altitude, slope aspect, slope angle, plan curvature, profile curvature, topographic wetness index (TWI), slope length, distance from rivers and faults, rivers and faults density, land use, and lithology. The dataset was split into training and validation subsets containing 70% and 30% of the springs, respectively. The C5.0, RF, and MARS algorithms were then run in R statistical software, and the final values were transformed into GPMs. Finally, two evaluation criteria, Kappa and the area under the receiver operating characteristics curve (AUC-ROC), were calculated. According to the findings of this research, MARS had the best performance with an AUC-ROC of 84.2%, followed by the RF and C5.0 algorithms with AUC-ROC values of 79.7% and 77.3%, respectively. The AUC-ROC values of all the employed models exceed 70%, which indicates acceptable performance. In conclusion, the methodology could be used in other geographical areas, and GPMs could be used by water resource managers and related organizations to accelerate and facilitate water resource exploitation.
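Two steps in this abstract lend themselves to a short sketch: the 70/30 split of spring locations into training and validation subsets, and the Kappa criterion computed from a binary confusion table. The spring identifiers and confusion counts below are invented, and the study itself used R rather than Python, so this only illustrates the calculations.

```python
import random


def split_train_validation(items, train_fraction=0.7, seed=0):
    """Shuffle a copy of the items and cut it into training and validation parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed for a repeatable split
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def cohens_kappa(tp, fp, fn, tn):
    """Cohen's kappa for a 2x2 table of predicted vs. observed classes."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Agreement expected by chance from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical spring identifiers.
springs = [f"spring-{i}" for i in range(10)]
training, validation = split_train_validation(springs)
print(len(training), len(validation))

# Hypothetical validation counts.
print(round(cohens_kappa(tp=40, fp=10, fn=10, tn=40), 3))
```

Kappa corrects the raw agreement for chance, so in this toy table 80% agreement corresponds to a kappa of 0.6.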
Bush encroachment monitoring using multi-temporal Landsat data and random forests
NASA Astrophysics Data System (ADS)
Symeonakis, E.; Higginbottom, T.
2014-11-01
It is widely accepted that land degradation and desertification (LDD) are serious global threats to humans and the environment. Around a third of savannahs in Africa are affected by LDD processes that may lead to substantial declines in ecosystem functioning and services. Indirectly, LDD can be monitored using relevant indicators. The encroachment of woody plants into grasslands, and the subsequent conversion of savannahs and open woodlands into shrublands, has attracted a lot of attention over the last decades and has been identified as a potential indicator of LDD. Mapping bush encroachment over large areas can only effectively be done using Earth Observation (EO) data and techniques. However, the accurate assessment of large-scale savannah degradation through bush encroachment with satellite imagery remains a formidable task due to the fact that on the satellite data vegetation variability in response to highly variable rainfall patterns might obscure the underlying degradation processes. Here, we present a methodological framework for the monitoring of bush encroachment-related land degradation in a savannah environment in the Northwest Province of South Africa. We utilise multi-temporal Landsat TM and ETM+ (SLC-on) data from 1989 until 2009, mostly from the dry-season, and ancillary data in a GIS environment. We then use the machine learning classification approach of random forests to identify the extent of encroachment over the 20-year period. The results show that in the area of study, bush encroachment is as alarming as permanent vegetation loss. The classification of the year 2009 is validated yielding low commission and omission errors and high k-statistic values for the grasses and woody vegetation classes. Our approach is a step towards a rigorous and effective savannah degradation assessment.
Application of Random Forests Methods to Diabetic Retinopathy Classification Analyses
Casanova, Ramon; Saldana, Santiago; Chew, Emily Y.; Danis, Ronald P.; Greven, Craig M.; Ambrosius, Walter T.
2014-01-01
Background Diabetic retinopathy (DR) is one of the leading causes of blindness in the United States and world-wide. DR is a silent disease that may go unnoticed until it is too late for effective treatment. Therefore, early detection could improve the chances of therapeutic interventions that would alleviate its effects. Methodology Graded fundus photography and systemic data from 3443 ACCORD-Eye Study participants were used to estimate Random Forest (RF) and logistic regression classifiers. We studied the impact of sample size on classifier performance and the possibility of using RF generated class conditional probabilities as metrics describing DR risk. RF measures of variable importance are used to detect factors that affect classification performance. Principal Findings Both types of data were informative when discriminating participants with or without DR. RF based models produced much higher classification accuracy than those based on logistic regression. Combining both types of data did not increase accuracy but did increase statistical discrimination of healthy participants who subsequently did or did not have DR events during four years of follow-up. RF variable importance criteria revealed that microaneurysm counts in both eyes seemed to play the most important role in discrimination among the graded fundus variables, while the number of medicines and diabetes duration were the most relevant among the systemic variables. Conclusions and Significance We have introduced RF methods to DR classification analyses based on fundus photography data. In addition, we propose an approach to DR risk assessment based on metrics derived from graded fundus photography and systemic data. Our results suggest that RF methods could be a valuable tool to diagnose DR and evaluate its progression. PMID:24940623
Automated diagnoses of attention deficit hyperactive disorder using magnetic resonance imaging
Eloyan, Ani; Muschelli, John; Nebel, Mary Beth; Liu, Han; Han, Fang; Zhao, Tuo; Barber, Anita D.; Joel, Suresh; Pekar, James J.; Mostofsky, Stewart H.; Caffo, Brian
2012-01-01
Successful automated diagnoses of attention deficit hyperactive disorder (ADHD) using imaging and functional biomarkers would have fundamental consequences on the public health impact of the disease. In this work, we show results on the predictability of ADHD using imaging biomarkers and discuss the scientific and diagnostic impacts of the research. We created a prediction model using the landmark ADHD 200 data set focusing on resting state functional connectivity (rs-fc) and structural brain imaging. We predicted ADHD status and subtype, obtained by behavioral examination, using imaging data, intelligence quotients and other covariates. The novel contributions of this manuscript include a thorough exploration of prediction and image feature extraction methodology on this form of data, including the use of singular value decompositions (SVDs), CUR decompositions, random forest, gradient boosting, bagging, voxel-based morphometry, and support vector machines as well as important insights into the value, and potentially lack thereof, of imaging biomarkers of disease. The key results include the CUR-based decomposition of the rs-fc-fMRI along with gradient boosting and the prediction algorithm based on a motor network parcellation and random forest algorithm. We conjecture that the CUR decomposition is largely diagnosing common population directions of head motion. Of note, a byproduct of this research is a potential automated method for detecting subtle in-scanner motion. The final prediction algorithm, a weighted combination of several algorithms, had an external test set specificity of 94% with sensitivity of 21%. The most promising imaging biomarker was a correlation graph from a motor network parcellation. In summary, we have undertaken a large-scale statistical exploratory prediction exercise on the unique ADHD 200 data set. The exercise produced several potential leads for future scientific exploration of the neurological basis of ADHD. PMID:22969709
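The reported pairing of high specificity (94%) with low sensitivity (21%) is easier to read against the definitions of the two measures. The sketch below uses hypothetical confusion-matrix counts, chosen only so that they reproduce the reported rates; they are not the study's actual test-set counts, and the function names are illustrative.

```python
def sensitivity(tp, fn):
    """Fraction of true cases that the classifier flags (true positive rate)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of non-cases that the classifier correctly clears (true negative rate)."""
    return tn / (tn + fp)


# Hypothetical counts for 100 cases and 100 controls (NOT the study's data).
tp, fn = 21, 79   # cases: detected / missed
tn, fp = 94, 6    # controls: correctly cleared / falsely flagged

sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
accuracy = (tp + tn) / (tp + fn + tn + fp)
print(sens, spec, accuracy)
```

With these assumed, balanced counts the overall accuracy is only modestly above one half, which shows how a classifier can look strong on specificity while still missing most true cases.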
Diagnostic Value of the Impairment of Olfaction in Parkinson's Disease
Casjens, Swaantje; Eckert, Angelika; Woitalla, Dirk; Ellrichmann, Gisa; Turewicz, Michael; Stephan, Christian; Eisenacher, Martin; May, Caroline; Meyer, Helmut E.; Brüning, Thomas; Pesch, Beate
2013-01-01
Background Olfactory impairment is increasingly recognized as an early symptom in the development of Parkinson's disease. Testing olfactory function is a non-invasive method but can be time-consuming which restricts its application in clinical settings and epidemiological studies. Here, we investigate odor identification as a supportive diagnostic tool for Parkinson's disease and estimate the performance of odor subsets to allow a more rapid testing of olfactory impairment. Methodology/Principal Findings Odor identification was assessed with 16 Sniffin' sticks in 148 Parkinson patients and 148 healthy controls. Risks of olfactory impairment were estimated with proportional odds models. Random forests were applied to classify Parkinson and non-Parkinson patients. Parkinson patients were rarely normosmic (identification of more than 12 odors; 16.8%) and identified on average seven odors whereas the reference group identified 12 odors and showed a higher prevalence of normosmy (31.1%). Parkinson patients with rigidity dominance had a twofold greater prevalence of olfactory impairment. Disease severity was associated with impairment of odor identification (per score point of the Hoehn and Yahr rating OR 1.87, 95% CI 1.26–2.77). Age-related impairment of olfaction showed a steeper gradient in Parkinson patients. Coffee, peppermint, and anise showed the largest difference in odor identification between Parkinson patients and controls. Random forests estimated a misclassification rate of 22.4% when comparing Parkinson patients with healthy controls using all 16 odors. A similar rate (23.8%) was observed when only the three aforementioned odors were applied. Conclusions/Significance Our findings indicate that testing odor identification can be a supportive diagnostic tool for Parkinson's disease. 
The application of only three odors performed well in discriminating Parkinson patients from controls, which can facilitate a wider application of this method as a point-of-care test. PMID:23696904
Deciphering the Routes of invasion of Drosophila suzukii by Means of ABC Random Forest.
Fraimout, Antoine; Debat, Vincent; Fellous, Simon; Hufbauer, Ruth A; Foucaud, Julien; Pudlo, Pierre; Marin, Jean-Michel; Price, Donald K; Cattel, Julien; Chen, Xiao; Deprá, Marindia; François Duyck, Pierre; Guedot, Christelle; Kenis, Marc; Kimura, Masahito T; Loeb, Gregory; Loiseau, Anne; Martinez-Sañudo, Isabel; Pascual, Marta; Polihronakis Richmond, Maxi; Shearer, Peter; Singh, Nadia; Tamura, Koichiro; Xuéreb, Anne; Zhang, Jinping; Estoup, Arnaud
2017-04-01
Deciphering invasion routes from molecular data is crucial to understanding biological invasions, including identifying bottlenecks in population size and admixture among distinct populations. Here, we unravel the invasion routes of the invasive pest Drosophila suzukii using a multi-locus microsatellite dataset (25 loci on 23 worldwide sampling locations). To do this, we use approximate Bayesian computation (ABC), which has improved the reconstruction of invasion routes, but can be computationally expensive. We use our study to illustrate the use of a new, more efficient, ABC method, ABC random forest (ABC-RF) and compare it to a standard ABC method (ABC-LDA). We find that Japan emerges as the most probable source of the earliest recorded invasion into Hawaii. Southeast China and Hawaii together are the most probable sources of populations in western North America, which then in turn served as sources for those in eastern North America. European populations are genetically more homogeneous than North American populations, and their most probable source is northeast China, with evidence of limited gene flow from the eastern US as well. All introduced populations passed through bottlenecks, and analyses reveal five distinct admixture events. These findings can inform hypotheses concerning how this species evolved between different and independent source and invasive populations. Methodological comparisons indicate that ABC-RF and ABC-LDA show concordant results if ABC-LDA is based on a large number of simulated datasets but that ABC-RF out-performs ABC-LDA when using a comparable and more manageable number of simulated datasets, especially when analyzing complex introduction scenarios. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Comparing ensemble learning methods based on decision tree classifiers for protein fold recognition.
Bardsiri, Mahshid Khatibi; Eftekhari, Mahdi
2014-01-01
In this paper, some methods for ensemble learning of protein fold recognition based on a decision tree (DT) are compared and contrasted against each other over three datasets taken from the literature. According to previously reported studies, the features of the datasets are divided into some groups. Then, for each of these groups, three ensemble classifiers, namely, random forest, rotation forest and AdaBoost.M1 are employed. Also, some fusion methods are introduced for combining the ensemble classifiers obtained in the previous step. After this step, three classifiers are produced based on the combination of classifiers of types random forest, rotation forest and AdaBoost.M1. Finally, the three different classifiers achieved are combined to make an overall classifier. Experimental results show that the overall classifier obtained by the genetic algorithm (GA) weighting fusion method, is the best one in comparison to previously applied methods in terms of classification accuracy.
Polarimetric signatures of a coniferous forest canopy based on vector radiative transfer theory
NASA Technical Reports Server (NTRS)
Karam, M. A.; Fung, A. K.; Amar, F.; Mougin, E.; Lopes, A.; Beaudoin, A.
1992-01-01
Complete polarization signatures of a coniferous forest canopy are studied by the iterative solution of the vector radiative transfer equations up to the second order. The forest canopy constituents (leaves, branches, stems, and trunk) are embedded in a multi-layered medium over a rough interface. The branches, stems and trunk scatterers are modeled as finite randomly oriented cylinders. The leaves are modeled as randomly oriented needles. For a plane wave exciting the canopy, the average Mueller matrix is formulated in terms of the iterative solution of the radiative transfer solution and used to determine the linearly polarized backscattering coefficients, the co-polarized and cross-polarized power returns, and the phase difference statistics. Numerical results are presented to investigate the effect of transmitting and receiving antenna configurations on the polarimetric signature of a pine forest. Comparison is made with measurements.
Field evaluation of a random forest activity classifier for wrist-worn accelerometer data.
Pavey, Toby G; Gilson, Nicholas D; Gomersall, Sjaan R; Clark, Bronwyn; Trost, Stewart G
2017-01-01
Wrist-worn accelerometers are convenient to wear and associated with greater wear-time compliance. Previous work has generally relied on choreographed activity trials to train and test classification models. However, validity in free-living contexts is starting to emerge. Study aims were: (1) train and test a random forest activity classifier for wrist accelerometer data; and (2) determine if models trained on laboratory data perform well under free-living conditions. Twenty-one participants (mean age=27.6±6.2) completed seven lab-based activity trials and a 24h free-living trial (N=16). Participants wore a GENEActiv monitor on the non-dominant wrist. Classification models recognising four activity classes (sedentary, stationary+, walking, and running) were trained using time and frequency domain features extracted from 10-s non-overlapping windows. Model performance was evaluated using leave-one-out-cross-validation. Models were implemented using the randomForest package within R. Classifier accuracy during the 24h free living trial was evaluated by calculating agreement with concurrently worn activPAL monitors. Overall classification accuracy for the random forest algorithm was 92.7%. Recognition accuracy for sedentary, stationary+, walking, and running was 80.1%, 95.7%, 91.7%, and 93.7%, respectively for the laboratory protocol. Agreement with the activPAL data (stepping vs. non-stepping) during the 24h free-living trial was excellent and, on average, exceeded 90%. The ICC for stepping time was 0.92 (95% CI=0.75-0.97). However, sensitivity and positive predictive values were modest. Mean bias was 10.3min/d (95% LOA=-46.0 to 25.4min/d). 
The random forest classifier for wrist accelerometer data yielded accurate group-level predictions under controlled conditions, but was less accurate at identifying stepping versus non-stepping behaviour in free-living conditions. Future studies should conduct more rigorous field-based evaluations using observation as a criterion measure. Copyright © 2016 Sports Medicine Australia. Published by Elsevier Ltd. All rights reserved.
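The feature-extraction step in the abstract above (time and frequency domain features from 10-s non-overlapping windows) could look roughly like the following sketch. The study used R and does not list its features here, so the three features shown (mean, standard deviation, dominant frequency from a naive DFT), the 20 Hz sampling rate, and the synthetic sine-wave signal are all assumptions made purely for illustration.

```python
import cmath
import math
import statistics


def window_features(signal, sample_rate_hz, window_s=10):
    """Split a 1D signal into non-overlapping windows and summarise each one."""
    size = sample_rate_hz * window_s
    features = []
    # Step by a full window so windows never overlap; drop any partial tail.
    for start in range(0, len(signal) - size + 1, size):
        window = signal[start:start + size]
        mean = statistics.fmean(window)
        centred = [x - mean for x in window]
        # Naive DFT: pick the non-zero frequency bin with the largest magnitude.
        best_bin = max(
            range(1, size // 2 + 1),
            key=lambda k: abs(sum(
                x * cmath.exp(-2j * math.pi * k * n / size)
                for n, x in enumerate(centred)
            )),
        )
        features.append({
            "mean": mean,
            "sd": statistics.pstdev(window),
            "dominant_hz": best_bin * sample_rate_hz / size,
        })
    return features


rate = 20  # hypothetical sampling rate in Hz
# 25 s of a synthetic 2 Hz oscillation standing in for one accelerometer axis.
signal = [math.sin(2 * math.pi * 2.0 * n / rate) for n in range(rate * 25)]
feats = window_features(signal, rate)
print(len(feats), feats[0]["dominant_hz"])
```

Each resulting feature dictionary would correspond to one row of a training table for a classifier; the trailing 5 s of the toy signal is discarded because it does not fill a whole window.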
Benchmarking dairy herd health status using routinely recorded herd summary data.
Parker Gaddis, K L; Cole, J B; Clay, J S; Maltecca, C
2016-02-01
Genetic improvement of dairy cattle health through the use of producer-recorded data has been determined to be feasible. Low estimated heritabilities indicate that genetic progress will be slow. Variation observed in lowly heritable traits can largely be attributed to nongenetic factors, such as the environment. More rapid improvement of dairy cattle health may be attainable if herd health programs incorporate environmental and managerial aspects. More than 1,100 herd characteristics are regularly recorded on farm test-days. We combined these data with producer-recorded health event data, and parametric and nonparametric models were used to benchmark herd and cow health status. Health events were grouped into 3 categories for analyses: mastitis, reproductive, and metabolic. Both herd incidence and individual incidence were used as dependent variables. Models implemented included stepwise logistic regression, support vector machines, and random forests. At both the herd and individual levels, random forest models attained the highest accuracy for predicting health status in all health event categories when evaluated with 10-fold cross-validation. Accuracy (SD) ranged from 0.61 (0.04) to 0.63 (0.04) when using random forest models at the herd level. Accuracy of prediction (SD) at the individual cow level ranged from 0.87 (0.06) to 0.93 (0.001) with random forest models. Highly significant variables and key words from logistic regression and random forest models were also investigated. All models identified several of the same key factors for each health event category, including movement out of the herd, size of the herd, and weather-related variables. We concluded that benchmarking health status using routinely collected herd data is feasible. Nonparametric models were better suited to handle this complex data with numerous variables. 
These data mining techniques were able to perform prediction of health status and could add evidence to personal experience in herd management. Copyright © 2016 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Toward a methodical framework for comprehensively assessing forest multifunctionality.
Trogisch, Stefan; Schuldt, Andreas; Bauhus, Jürgen; Blum, Juliet A; Both, Sabine; Buscot, François; Castro-Izaguirre, Nadia; Chesters, Douglas; Durka, Walter; Eichenberg, David; Erfmeier, Alexandra; Fischer, Markus; Geißler, Christian; Germany, Markus S; Goebes, Philipp; Gutknecht, Jessica; Hahn, Christoph Zacharias; Haider, Sylvia; Härdtle, Werner; He, Jin-Sheng; Hector, Andy; Hönig, Lydia; Huang, Yuanyuan; Klein, Alexandra-Maria; Kühn, Peter; Kunz, Matthias; Leppert, Katrin N; Li, Ying; Liu, Xiaojuan; Niklaus, Pascal A; Pei, Zhiqin; Pietsch, Katherina A; Prinz, Ricarda; Proß, Tobias; Scherer-Lorenzen, Michael; Schmidt, Karsten; Scholten, Thomas; Seitz, Steffen; Song, Zhengshan; Staab, Michael; von Oheimb, Goddert; Weißbecker, Christina; Welk, Erik; Wirth, Christian; Wubet, Tesfaye; Yang, Bo; Yang, Xuefei; Zhu, Chao-Dong; Schmid, Bernhard; Ma, Keping; Bruelheide, Helge
2017-12-01
Biodiversity-ecosystem functioning (BEF) research has extended its scope from communities that are short-lived or reshape their structure annually to structurally complex forest ecosystems. The establishment of tree diversity experiments poses specific methodological challenges for assessing the multiple functions provided by forest ecosystems. In particular, methodological inconsistencies and nonstandardized protocols impede the analysis of multifunctionality within, and comparability across the increasing number of tree diversity experiments. By providing an overview on key methods currently applied in one of the largest forest biodiversity experiments, we show how methods differing in scale and simplicity can be combined to retrieve consistent data allowing novel insights into forest ecosystem functioning. Furthermore, we discuss and develop recommendations for the integration and transferability of diverse methodical approaches to present and future forest biodiversity experiments. We identified four principles that should guide basic decisions concerning method selection for tree diversity experiments and forest BEF research: (1) method selection should be directed toward maximizing data density to increase the number of measured variables in each plot. (2) Methods should cover all relevant scales of the experiment to consider scale dependencies of biodiversity effects. (3) The same variable should be evaluated with the same method across space and time for adequate larger-scale and longer-time data analysis and to reduce errors due to changing measurement protocols. (4) Standardized, practical and rapid methods for assessing biodiversity and ecosystem functions should be promoted to increase comparability among forest BEF experiments. We demonstrate that currently available methods provide us with a sophisticated toolbox to improve a synergistic understanding of forest multifunctionality. 
However, these methods require further adjustment to the specific requirements of structurally complex and long-lived forest ecosystems. By applying methods connecting relevant scales, trophic levels, and above- and belowground ecosystem compartments, knowledge gain from large tree diversity experiments can be optimized.
Chapter 10: Marbled Murrelet Inland Patterns of Activity: Defining Detections and Behavior
Peter W.C. Paton
1995-01-01
This chapter summarizes terminology and methodology used by Marbled Murrelet (Brachyramphus marmoratus) biologists when surveying inland forests. Information is included on the types of behaviors used to determine if murrelets may be nesting in an area, and the various types of detections used to quantify murrelet use of forest stands. Problems with...
Estimating aboveground net primary productivity in forest-dominated ecosystems
Brian D. Kloeppel; Mark E. Harmon; Timothy J. Fahey
2007-01-01
The measurement of net primary productivity (NPP) in forest ecosystems presents a variety of challenges because of the large and complex dimensions of trees and the difficulties of quantifying several components of NPP. As summarized by Clark et al. (2001a), these methodological challenges can be overcome, and more reliable spatial and temporal comparisons can be...
Shape selection in Landsat time series: A tool for monitoring forest dynamics
Gretchen G. Moisen; Mary C. Meyer; Todd A. Schroeder; Xiyue Liao; Karen G. Schleeweis; Elizabeth A. Freeman; Chris Toney
2016-01-01
We present a new methodology for fitting nonparametric shape-restricted regression splines to time series of Landsat imagery for the purpose of modeling, mapping, and monitoring annual forest disturbance dynamics over nearly three decades. For each pixel and spectral band or index of choice in temporal Landsat data, our method delivers a smoothed rendition of...
Chapter 6: Incorporating rural community characteristics into forest management decisions
Mindy S. Crandall; Jane L. Harrison; Claire A. Montgomery
2014-01-01
As part of the Integrated Landscape Assessment Project, we developed a methodology for managers to include potential community benefits when considering forest management treatments. To do this, we created a watershed impact score that scores each watershed (potential source of wood material) with respect to the communities that are likely to benefit from increased...
Protecting Oregon old-growth forests from fires: how much is it worth?
Armando González-Cabán; John Loomis; Robin Gregory
1995-01-01
Current fire management policies in the USDA Forest Service include traditional multiple uses, but these policies do not adequately incorporate non-traditional uses such as preservation of biodiversity and related nongame and endangered animals. A contingent valuation methodology was used for valuing the general public's desire to know that rare and unique...
Wang, Wei; Liu, Juan; Sun, Lin
2016-07-01
Protein-DNA bindings are critical to many biological processes. However, the structural mechanisms underlying these interactions are not fully understood. Here, we analyzed the residues shape (peak, flat, or valley) and the surrounding environment of double-stranded DNA-binding proteins (DSBs) and single-stranded DNA-binding proteins (SSBs) in protein-DNA interfaces. In the results, we found that the interface shapes, hydrogen bonds, and the surrounding environment present significant differences between the two kinds of proteins. Built on the investigation results, we constructed a random forest (RF) classifier to distinguish DSBs and SSBs with satisfying performance. In conclusion, we present a novel methodology to characterize protein interfaces, which will deepen our understanding of the specificity of proteins binding to ssDNA (single-stranded DNA) or dsDNA (double-stranded DNA). Proteins 2016; 84:979-989. © 2016 Wiley Periodicals, Inc. © 2016 Wiley Periodicals, Inc.
González-Durruthy, Michael; Monserrat, Jose M; Rasulev, Bakhtiyor; Casañola-Martín, Gerardo M; Barreiro Sorrivas, José María; Paraíso-Medina, Sergio; Maojo, Víctor; González-Díaz, Humberto; Pazos, Alejandro; Munteanu, Cristian R
2017-11-11
This study presents the impact of carbon nanotubes (CNTs) on mitochondrial oxygen mass flux (J_m) under three experimental conditions. New experimental results and a new methodology are reported for the first time and they are based on CNT Raman spectra star graph transform (spectral moments) and perturbation theory. The experimental measures of J_m showed that no tested CNT family can inhibit the oxygen consumption profiles of mitochondria. The best model for the prediction of J_m for other CNTs was provided by random forest using eight features, obtaining test R-squared (R²) of 0.863 and test root-mean-square error (RMSE) of 0.0461. The results demonstrate the capability of encoding CNT information into spectral moments of the Raman star graphs (SG) transform with a potential applicability as predictive tools in nanotechnology and material risk assessments.
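The two test metrics quoted in the abstract above, R-squared and RMSE, follow standard definitions that can be sketched in a few lines. The function names and the toy observed/predicted values below are illustrative assumptions and have no connection to the study's measurements.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Toy held-out data (hypothetical values).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 3.0, 3.5]

test_rmse = rmse(observed, predicted)
test_r2 = r_squared(observed, predicted)
print(round(test_rmse, 4), round(test_r2, 3))
```

RMSE is expressed in the units of the predicted quantity, whereas R-squared is unitless, so the two are usually reported together as in the abstract.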
Estimating daily forest carbon fluxes using a combination of ground and remotely sensed data
NASA Astrophysics Data System (ADS)
Chirici, Gherardo; Chiesi, Marta; Corona, Piermaria; Salvati, Riccardo; Papale, Dario; Fibbi, Luca; Sirca, Costantino; Spano, Donatella; Duce, Pierpaolo; Marras, Serena; Matteucci, Giorgio; Cescatti, Alessandro; Maselli, Fabio
2016-02-01
Several studies have demonstrated that Monteith's approach can efficiently predict forest gross primary production (GPP), while the modeling of net ecosystem production (NEP) is more critical, requiring the additional simulation of forest respirations. The NEP of different forest ecosystems in Italy was currently simulated by the use of a remote sensing driven parametric model (modified C-Fix) and a biogeochemical model (BIOME-BGC). The outputs of the two models, which simulate forests in quasi-equilibrium conditions, are combined to estimate the carbon fluxes of actual conditions using information regarding the existing woody biomass. The estimates derived from the methodology have been tested against daily reference GPP and NEP data collected through the eddy correlation technique at five study sites in Italy. The first test concerned the theoretical validity of the simulation approach at both annual and daily time scales and was performed using optimal model drivers (i.e., collected or calibrated over the site measurements). Next, the test was repeated to assess the operational applicability of the methodology, which was driven by spatially extended data sets (i.e., data derived from existing wall-to-wall digital maps). A good estimation accuracy was generally obtained for GPP and NEP when using optimal model drivers. The use of spatially extended data sets worsens the accuracy to a varying degree, which is properly characterized. The model drivers with the most influence on the flux modeling strategy are, in increasing order of importance, forest type, soil features, meteorology, and forest woody biomass (growing stock volume).
Serdukova, Larissa; Zheng, Yayun; Duan, Jinqiao; Kurths, Jürgen
2017-08-24
For the tipping elements in the Earth's climate system, the most important issue to address is how stable the desirable state is against random perturbations. Extreme biotic and climatic events pose severe hazards to tropical rainforests. Their local effects are extremely stochastic and difficult to measure. Moreover, the direction and intensity of the response of forest trees to such perturbations are unknown, especially given the lack of efficient dynamical vegetation models to evaluate forest tree cover changes over time. In this study, we consider randomness in the mathematical modelling of forest trees by incorporating uncertainty through a stochastic differential equation. According to field-based evidence, the interactions between fires and droughts are a more direct mechanism that may describe sudden forest degradation in the south-eastern Amazon. In modeling the Amazonian vegetation system, we include symmetric α-stable Lévy perturbations. We report results of stability analysis of the metastable fertile forest state. We conclude that even a very slight threat to the forest state stability represents Lévy noise with large jumps of low intensity, which can be interpreted as a fire occurring in a non-drought year. During years of severe drought, high-intensity fires significantly accelerate the transition between a forest and savanna state.
Mi, Xiangcheng; Swenson, Nathan G; Jia, Qi; Rao, Mide; Feng, Gang; Ren, Haibao; Bebber, Daniel P; Ma, Keping
2016-09-07
Deterministic and stochastic processes jointly determine the community dynamics of forest succession. However, it has been widely held in previous studies that deterministic processes dominate forest succession. Furthermore, inference of mechanisms for community assembly may be misleading if based on a single axis of diversity alone. In this study, we evaluated the relative roles of deterministic and stochastic processes along a disturbance gradient by integrating species, functional, and phylogenetic beta diversity in a subtropical forest chronosequence in Southeastern China. We found a general pattern of increasing species turnover, but little-to-no change in phylogenetic and functional turnover over succession at two spatial scales. Meanwhile, the phylogenetic and functional beta diversity were not significantly different from random expectation. This result suggested a dominance of stochastic assembly, contrary to the general expectation that deterministic processes dominate forest succession. On the other hand, we found significant interactions of environment and disturbance and limited evidence for significant deviations of phylogenetic or functional turnover from random expectations for different size classes. This result provided weak evidence of deterministic processes over succession. Stochastic assembly of forest succession suggests that post-disturbance restoration may be largely unpredictable and difficult to control in subtropical forests.
Ostashev, Vladimir E; Wilson, D Keith; Muhlestein, Michael B; Attenborough, Keith
2018-02-01
Although sound propagation in a forest is important in several applications, there are currently no rigorous yet computationally tractable prediction methods. Due to the complexity of sound scattering in a forest, it is natural to formulate the problem stochastically. In this paper, it is demonstrated that the equations for the statistical moments of the sound field propagating in a forest have the same form as those for sound propagation in a turbulent atmosphere if the scattering properties of the two media are expressed in terms of the differential scattering and total cross sections. Using the existing theories for sound propagation in a turbulent atmosphere, this analogy enables the derivation of several results for predicting forest acoustics. In particular, the second-moment parabolic equation is formulated for the spatial correlation function of the sound field propagating above an impedance ground in a forest with micrometeorology. Effective numerical techniques for solving this equation have been developed in atmospheric acoustics. In another example, formulas are obtained that describe the effect of a forest on the interference between the direct and ground-reflected waves. The formulated correspondence between wave propagation in discrete and continuous random media can also be used in other fields of physics.
David B. Clark; Paulo C. Olivas; Steven F. Oberbauer; Deborah A. Clark; Michael G. Ryan
2008-01-01
Leaf Area Index (leaf area per unit ground area, LAI) is a key driver of forest productivity but has never previously been measured directly at the landscape scale in tropical rain forest (TRF). We used a modular tower and stratified random sampling to harvest all foliage from forest floor to canopy top in 55 vertical transects (4.6 m2) across 500 ha of old growth in...
NASA Astrophysics Data System (ADS)
Gilani, H., Sr.; Ganguly, S.; Zhang, G.; Koju, U. A.; Murthy, M. S. R.; Nemani, R. R.; Manandhar, U.; Thapa, G. J.
2015-12-01
Nepal is a landlocked country with forest covering 39% of its total land area (147,181 km2). Under the Forest Carbon Partnership Facility (FCPF), implemented by the World Bank (WB), Nepal was chosen as one of four countries best suited for a results-based payment system under the Reducing Emissions from Deforestation and Forest Degradation (REDD and REDD+) scheme. Landsat-based national estimates show that forest area declined by 2% (1467 km2) from 1990 to 2000, whereas from 2000 to 2010 it declined by only 0.12% (176 km2). A cost-effective monitoring and evaluation system for REDD+ requires a balanced approach of remote sensing and ground measurements. This paper provides, for Nepal, a cost-effective and operational 30 m Above Ground Biomass (AGB) estimation and mapping methodology using freely available satellite data integrated with field inventory. Leaf Area Index (LAI) was generated using the methodology proposed by Ganguly et al. (2012) from cloud-free Landsat-8 OLI images. To generate a tree canopy height map, a density scatter graph was built between the maximum height estimated by the Geoscience Laser Altimeter System (GLAS) on the Ice, Cloud, and Land Elevation Satellite (ICESat) and the Landsat LAI nearest to the center coordinates of the GLAS shots; it showed a moderate but significant exponential correlation (31.211*LAI^0.4593, R2 = 0.33, RMSE = 13.25 m). In the field, 1124 well-distributed circular plots (750 m2 and 500 m2; 0.001% representation of forest cover) were measured and used to estimate AGB (ton/ha) with the equations proposed by Sharma et al. (1990) for all tree species of Nepal. A satisfactory linear relationship (AGB = 8.7018*Hmax-101.24, R2 = 0.67, RMSE = 7.2 ton/ha) was achieved between maximum canopy height (Hmax) and AGB (ton/ha). This cost-effective and operational methodology is replicable over 5-10 years with minimal ground samples through the integration of satellite images.
The developed AGB was used to produce optimum fuel wood scenarios using population and road accessibility datasets.
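The two regressions reported in this abstract can be chained into a small worked example. The sketch below takes the height relation to be a power law, 31.211 × LAI^0.4593 (an interpretation of the reported expression), and assumes heights in metres and AGB in ton/ha; the function names are illustrative and not from the paper.

```python
def canopy_height_from_lai(lai):
    """Maximum canopy height (m, assumed) from LAI, using the reported fit."""
    return 31.211 * lai ** 0.4593

def agb_from_height(h_max):
    """Above-ground biomass (ton/ha) from maximum canopy height, reported linear fit."""
    return 8.7018 * h_max - 101.24

def agb_from_lai(lai):
    """Chain the two fits: LAI -> Hmax -> AGB."""
    return agb_from_height(canopy_height_from_lai(lai))

for lai in (0.5, 1.0, 2.0, 4.0):
    h = canopy_height_from_lai(lai)
    print(f"LAI={lai:.1f}  Hmax={h:.2f}  AGB={agb_from_height(h):.2f}")
```

Because the linear fit has a negative intercept, it returns negative biomass for canopy heights below about 11.6 m, so a chained estimate like this is only meaningful inside the height range covered by the field plots. With R2 = 0.33 for the height relation, the chained value should also be read as a coarse estimate.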
Gu, Jing; Wang, Qi; Wang, Xiaogang; Li, Hailong; Gu, Mei; Ming, Haixia; Dong, Xiaoli; Yang, Kehu; Wu, Hongyan
2014-01-01
Background. This review provides the first methodological information assessment of protocols of acupuncture RCTs registered in the WHO International Clinical Trials Registry Platform (ICTRP). Methods. All records of acupuncture RCTs registered in the ICTRP were collected. The methodological design assessment examined whether the randomization methods, allocation concealment, and blinding were adequate, based on the information in the registration records (protocols of acupuncture RCTs). Results. A total of 453 records, found in 11 registries, were examined. Methodological details were insufficient in registration records; 76.4%, 89.0%, and 21.4% of records did not provide information on randomization methods, allocation concealment, and blinding, respectively. The proportions of adequate randomization methods, allocation concealment, and blinding were only 107 (23.6%), 48 (10.6%), and 210 (46.4%), respectively. The methodological design improved year by year, especially after 2007. Additionally, the methodology of RCTs with ethics approval was clearly superior to that of RCTs without ethics approval, and it differed among registries. Conclusions. The overall methodological design based on registration records of acupuncture RCTs is not very good but has improved year by year. The insufficient information on randomization methods, allocation concealment, and blinding may be because the relevant description is not taken seriously in the registration of acupuncture RCTs. PMID:24688591
NASA Astrophysics Data System (ADS)
Szatmári, Gábor; Laborczi, Annamária; Takács, Katalin; Pásztor, László
2017-04-01
Knowledge about soil organic carbon (SOC) baselines and changes, and the detection of vulnerable hot spots for SOC losses and gains under climate change and changed land management, is still fairly limited. Thus the Global Soil Partnership (GSP) has been requested to develop a global SOC mapping campaign by 2017. GSP's concept builds on official national data sets; therefore, a bottom-up (country-driven) approach is pursued. The elaborated Hungarian methodology suits the general specifications of GSOC17 provided by GSP. The input data for the GSOC17@HU mapping approach involved legacy soil databases, as well as proper environmental covariates related to the main soil forming factors, such as climate, organisms, relief and parent material. Nowadays, digital soil mapping (DSM) relies heavily on the assumption that soil properties of interest can be modelled as the sum of a deterministic and a stochastic component, which can be treated and modelled separately. We also adopted this assumption in our methodology. In practice, multiple regression techniques are commonly used to model the deterministic part. However, these global (and usually linear) models commonly oversimplify the often complex and non-linear relationship, which has a crucial effect on the resulting soil maps. Thus, we integrated machine learning algorithms (namely random forest and quantile regression forest) into the elaborated methodology, supposing them to be more suitable for the problem at hand. This approach has enabled us to model the GSOC17 soil properties in forms as complex and non-linear as the soil itself. Furthermore, it has enabled us to model and assess the uncertainty of the results, which is highly relevant in decision making. The applied methodology used a geostatistical approach to model the stochastic part of the spatial variability of the soil properties of interest. We created the GSOC17@HU map with 1 km grid resolution according to GSP's specifications.
The map contributes to GSP's GSOC17 proposals, as well as to the development of a global soil information system under GSP Pillar 4 on soil data and information. However, we elaborated our adherent code (created in the R software environment) in such a way that it can be improved, specified and applied for further uses. Hence, it opens the door to creating countrywide map(s) with higher grid resolution for SOC (or other soil related properties) using the advanced methodology, as well as to contributing to and supporting SOC (or other soil) related country level decision making. Our paper will present the soil mapping methodology itself, the resulting GSOC17@HU map, some of our conclusions drawn from the experiences and their effects on the further uses. Acknowledgement: Our work was supported by the Hungarian National Scientific Research Foundation (OTKA, Grant No. K105167).
Dynamics of Tree Species Diversity in Unlogged and Selectively Logged Malaysian Forests.
Shima, Ken; Yamada, Toshihiro; Okuda, Toshinori; Fletcher, Christine; Kassim, Abdul Rahman
2018-01-18
Selective logging, which is commonly conducted in tropical forests, may change tree species diversity. In rarely disturbed tropical forests, locally rare species exhibit higher survival rates. If this non-random process occurs in a logged forest, the forest will rapidly recover its tree species diversity. Here we determined whether a forest in the Pasoh Forest Reserve, Malaysia, which was selectively logged 40 years ago, recovered its original species diversity (species richness and composition). To explore this, we compared the dynamics of species diversity between an unlogged forest plot (18.6 ha) and a logged forest plot (5.4 ha). We found that 40 years are not sufficient to recover species diversity after logging. Unlike in unlogged forests, tree deaths and recruitments did not contribute to increased diversity in the selectively logged forests. Our results predict that selectively logged forests require longer than our observation period (40 years) to regain their diversity.
Assessing change in large-scale forest area by visually interpreting Landsat images
Jerry D. Greer; Frederick P. Weber; Raymond L. Czaplewski
2000-01-01
As part of the Forest Resources Assessment 1990, the Food and Agriculture Organization of the United Nations visually interpreted a stratified random sample of 117 Landsat scenes to estimate global status and change in tropical forest area. Images from 1980 and 1990 were interpreted by a group of widely experienced technical people in many different tropical countries...
A ground-based method of assessing urban forest structure and ecosystem services
David J. Nowak; Daniel E. Crane; Jack C. Stevens; Robert E. Hoehn; Jeffrey T. Walton; Jerry Bond
2008-01-01
To properly manage urban forests, it is essential to have data on this important resource. An efficient means to obtain this information is to randomly sample urban areas. To help assess the urban forest structure (e.g., number of trees, species composition, tree sizes, health) and several functions (e.g., air pollution removal, carbon storage and sequestration), the...
Spatially random mortality in old-growth red pine forests of northern Minnesota
Tuomas Aakala; Shawn Fraver; Brian J. Palik; Anthony W. D' Amato
2012-01-01
Characterizing the spatial distribution of tree mortality is critical to understanding forest dynamics, but empirical studies on these patterns under old-growth conditions are rare. This rarity is due in part to low mortality rates in old-growth forests, the study of which necessitates long observation periods, and the confounding influence of tree in-growth during...
ERIC Educational Resources Information Center
Wells, George A.; Shea, Beverley; Higgins, Julian P. T.; Sterne, Jonathan; Tugwell, Peter; Reeves, Barnaby C.
2013-01-01
Background: There is increasing interest from review authors about including non-randomized studies (NRS) in their systematic reviews of health care interventions. This series from the Ottawa Non-Randomized Studies Workshop consists of six papers identifying methodological issues when doing this. Aim: To format the guidance from the preceding…
Su, Xiaogang; Peña, Annette T; Liu, Lei; Levine, Richard A
2018-04-29
Assessing heterogeneous treatment effects is a growing interest in advancing precision medicine. Individualized treatment effects (ITEs) play a critical role in such an endeavor. Concerning experimental data collected from randomized trials, we put forward a method, termed random forests of interaction trees (RFIT), for estimating ITE on the basis of interaction trees. To this end, we propose a smooth sigmoid surrogate method, as an alternative to greedy search, to speed up tree construction. The RFIT outperforms the "separate regression" approach in estimating ITE. Furthermore, standard errors for the estimated ITE via RFIT are obtained with the infinitesimal jackknife method. We assess and illustrate the use of RFIT via both simulation and the analysis of data from an acupuncture headache trial. Copyright © 2018 John Wiley & Sons, Ltd.
NASA Astrophysics Data System (ADS)
Polichtchouk, Yuri; Tokareva, Olga; Bulgakova, Irina V.
2003-03-01
Methodological problems of processing satellite images to assess the impact of atmospheric pollution on forest ecosystems using geoinformation systems are addressed. The approach to quantitative assessment of the impact of atmospheric pollution on forest ecosystems is based on calculating the relative areas of forest landscapes that lie inside atmospheric pollution zones. The landscape structure of forested territories in the southern part of Western Siberia is determined by processing medium-resolution satellite images from the spaceborne Resource-O. Particular features of modeling the atmospheric pollution zones caused by gas flaring on oil field territories are considered. Pollution zones were identified by modeling the dispersal of contaminants in the atmosphere with standard models. The areas of polluted landscapes are calculated as a function of the atmospheric pollution level.
CRF: detection of CRISPR arrays using random forest.
Wang, Kai; Liang, Chun
2017-01-01
CRISPRs (clustered regularly interspaced short palindromic repeats) are particular repeat sequences found in a wide range of bacterial and archaeal genomes. Several tools are available for detecting CRISPR arrays in the genomes of both domains. Here we developed a new web-based CRISPR detection tool named CRF (CRISPR Finder by Random Forest). Unlike other CRISPR detection tools, CRF uses a random forest classifier to filter out invalid CRISPR arrays from all putative candidates, which enhances detection accuracy. In particular, triplet elements that combine both sequence content and structure information were extracted from CRISPR repeats for classifier training. The classifier achieved high accuracy and sensitivity. Moreover, CRF offers a highly interactive web interface for robust data visualization that is not available among other CRISPR detection tools. After detection, the query sequence, CRISPR array architecture, and the sequences and secondary structures of CRISPR repeats and spacers can be visualized for visual examination and validation. CRF is freely available at http://bioinfolab.miamioh.edu/crf/home.php.
Do bioclimate variables improve performance of climate envelope models?
Watling, James I.; Romañach, Stephanie S.; Bucklin, David N.; Speroterra, Carolina; Brandt, Laura A.; Pearlstine, Leonard G.; Mazzotti, Frank J.
2012-01-01
Climate envelope models are widely used to forecast potential effects of climate change on species distributions. A key issue in climate envelope modeling is the selection of predictor variables that most directly influence species. To determine whether model performance and spatial predictions were related to the selection of predictor variables, we compared models using bioclimate variables with models constructed from monthly climate data for twelve terrestrial vertebrate species in the southeastern USA using two different algorithms (random forests or generalized linear models), and two model selection techniques (using uncorrelated predictors or a subset of user-defined biologically relevant predictor variables). There were no differences in performance between models created with bioclimate or monthly variables, but one metric of model performance was significantly greater using the random forest algorithm compared with generalized linear models. Spatial predictions between maps using bioclimate and monthly variables were very consistent using the random forest algorithm with uncorrelated predictors, whereas we observed greater variability in predictions using generalized linear models.
Clustering Single-Cell Expression Data Using Random Forest Graphs.
Pouyan, Maziyar Baran; Nourani, Mehrdad
2017-07-01
Complex tissues such as brain and bone marrow are made up of multiple cell types. As the study of biological tissue structure progresses, the role of cell-type-specific research becomes increasingly important. Novel sequencing technology such as single-cell cytometry provides researchers access to valuable biological data. Applying machine-learning techniques to these high-throughput datasets provides deep insights into the cellular landscape of the tissue where those cells are a part of. In this paper, we propose the use of random-forest-based single-cell profiling, a new machine-learning-based technique, to profile different cell types of intricate tissues using single-cell cytometry data. Our technique utilizes random forests to capture cell marker dependences and model the cellular populations using the cell network concept. This cellular network helps us discover what cell types are in the tissue. Our experimental results on public-domain datasets indicate promising performance and accuracy of our technique in extracting cell populations of complex tissues.
Comparative analysis of used car price evaluation models
NASA Astrophysics Data System (ADS)
Chen, Chuancan; Hao, Lulu; Xu, Cong
2017-05-01
An accurate used car price evaluation is a catalyst for the healthy development of the used car market. Data mining has been applied to predict used car prices in several articles. However, little has been studied on the comparison of different algorithms in used car price estimation. This paper collects more than 100,000 used car dealing records throughout China for an empirical comparison of two algorithms: linear regression and random forest. These two algorithms are used to predict used car price in three different models: a model for a certain car make, a model for a certain car series, and a universal model. Results show that random forest has a stable but not ideal effect in the price evaluation model for a certain car make, but it shows a great advantage over linear regression in the universal model. This indicates that random forest is an optimal algorithm when handling complex models with a large number of variables and samples, yet it shows no obvious advantage when coping with simple models with fewer variables.
Random forest models to predict aqueous solubility.
Palmer, David S; O'Boyle, Noel M; Glen, Robert C; Mitchell, John B O
2007-01-01
Random Forest regression (RF), Partial-Least-Squares (PLS) regression, Support Vector Machines (SVM), and Artificial Neural Networks (ANN) were used to develop QSPR models for the prediction of aqueous solubility, based on experimental data for 988 organic molecules. The Random Forest regression model predicted aqueous solubility more accurately than those created by PLS, SVM, and ANN and offered methods for automatic descriptor selection, an assessment of descriptor importance, and an in-parallel measure of predictive ability, all of which serve to recommend its use. The prediction of log molar solubility for an external test set of 330 molecules that are solid at 25 degrees C gave an r2 = 0.89 and RMSE = 0.69 log S units. For a standard data set selected from the literature, the model performed well with respect to other documented methods. Finally, the diversity of the training and test sets are compared to the chemical space occupied by molecules in the MDL drug data report, on the basis of molecular descriptors selected by the regression analysis.
Pandis, Nikolaos; Polychronopoulou, Argy; Eliades, Theodore
2011-12-01
Randomization is a key step in reducing selection bias during the treatment allocation phase in randomized clinical trials. The process of randomization follows specific steps, which include generation of the randomization list, allocation concealment, and implementation of randomization. The phenomenon in the dental and orthodontic literature of characterizing treatment allocation as random is frequent; however, often the randomization procedures followed are not appropriate. Randomization methods assign, at random, treatment to the trial arms without foreknowledge of allocation by either the participants or the investigators thus reducing selection bias. Randomization entails generation of random allocation, allocation concealment, and the actual methodology of implementing treatment allocation randomly and unpredictably. Most popular randomization methods include some form of restricted and/or stratified randomization. This article introduces the reasons, which make randomization an integral part of solid clinical trial methodology, and presents the main randomization schemes applicable to clinical trials in orthodontics.
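One common form of the restricted randomization mentioned in this abstract is permuted-block randomization, which keeps the arms balanced after every completed block. The following is a minimal sketch under stated assumptions: two arms labelled "A" and "B", a fixed block size of 4, and illustrative function and parameter names that do not come from the article. A real trial would generate the list with a validated tool and conceal it from recruiters.

```python
import random

def block_randomization(n_participants, arms=("A", "B"), block_size=4, seed=None):
    """Return an allocation list built from randomly permuted, balanced blocks."""
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)
    allocation = []
    while len(allocation) < n_participants:
        # Each block holds every arm equally often, in random order.
        block = list(arms) * (block_size // len(arms))
        rng.shuffle(block)
        allocation.extend(block)
    return allocation[:n_participants]

schedule = block_randomization(12, seed=2011)
print(schedule)
```

Stratified randomization can be sketched on top of this by keeping one such list per stratum (for example, per age group), so that balance holds within each stratum.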
Study on the methodology of road carbon sink forest
NASA Astrophysics Data System (ADS)
Wan, Lijuan; Zhang, Yi; Cheng, Dongxiang; Huang, Yanan
2017-01-01
Advanced concepts of forest carbon sink and forestry carbon sequestration are introduced in the road carbon sink forest project, and the measurement and carbon monitoring of road carbon sink forest are explored. Experience and technology are accumulated, and a set of carbon sequestration forestation and carbon measurement and monitoring technology systems for both sides of the road is formed. To update the green concept, improve the forestation quality along roads, and enhance sequestration and ecological efficiency, it is important to realize low-carbon traffic, energy saving, and emission reduction. Using scientific planting and monitoring methods, soil properties, carbon sequestration of the soil organic carbon pool, and the carbon sequestration capacity of different tree species were studied and monitored. High carbon sequestration species selection, silvicultural management, measurement of carbon sink and carbon monitoring are explored.
Fouad Fanous; Jeremy May; Terry Wipf; Michael Ritter
2011-01-01
Increased use of timber bridges in the U.S. transportation system has required additional research to improve the current design methodology of these bridges. For this reason, the U.S. Forest Service, Forest Products Laboratory (FPL), and the Federal Highway Administration have supported several research programs to attain the objective listed above. This report is a...
NCFES
1966-01-01
Included are (1) 22 technical papers (by researchers from many sections of the United States and Canada) pertaining to selection and progeny testing, radiation genetics, intraspecific variation, natural and artificial hybridization, breeding systems, breeding methodology and specialized tree breeding techniques, and applied breeding and allied fields; (2) concise...
D.B.H. and Survival Analysis: A New Methodology for Assessing Forest Inventory Mortality
Christopher W. Woodall; Patricia L. Grambsch; William Thomas
2005-01-01
Tree mortality has typically been assessed in Forest Inventory and Analysis (FIA) studies through summaries of mortality by location, species, and causal agents. Although these methods have historically been used for most of FIA's tree mortality analyses, they are inadequate for robust assessment of mortality trends and dynamics. To offer a new method of analyzing...
Methodology used in Cuba for estimating economic losses caused by forest fires
Marcos Pedro Ramos Rodríguez; Raúl González Rodríguez
2013-01-01
Assessment of economic losses caused by forest fires is a highly complex but important activity. It is complicated first by the large number of effects, in different periods, brought about in the social, economic and environmental fields. Secondly, the difficulty of assigning a market value to resources such as biodiversity or endangered species should be mentioned. It...
Randall S. Rosenberger; John B. Loomis
2001-01-01
We present an annotated bibliography that provides information on and reference to the literature on outdoor recreation use valuation studies. This information is presented by study source, benefit measures, recreation activity, valuation methodology, and USDA Forest Service region. Tables are provided that reference the bibliography for each activity, enabling easy...
Sandra Brown
2013-01-01
Two methodologies for estimating net emissions from forest harvesting practices (for timber and possibly fuel) are presented: (1) a standard approach of using medium resolution imagery to monitor the expansion of logging infrastructure into non-logged areas for activity data combined with ground plots and the stock-change method for emission factors; and (2) a...
John G. Hof; Curtis H. Flather; Tony J. Baltic; Rudy M. King
2004-01-01
This article reports the methodology and results of a data envelopment analysis (DEA) that attempts to identify areas in the country where there is maximum potential for improving the forest and rangeland condition, based on 12 indicator variables. This analysis differs from previous DEA studies in that the primary variables are measures of human activity and...
Spatial allocation of market and nonmarket values in wildland fire management: A case study
John W. Benoit; Armando González-Cabán; Francis M. Fujioka; Shyh-Chin Chen; José J. Sanchez
2013-01-01
We developed a methodology to evaluate the efficacy of fuel treatments by estimating their costs and potential costs/losses with and without treatments in the San Jacinto Ranger District of the San Bernardino National Forest, California. This district is a typical southern California forest complex containing a large amount of high-valued real estate. We chose four...
Predicting Coastal Flood Severity using Random Forest Algorithm
NASA Astrophysics Data System (ADS)
Sadler, J. M.; Goodall, J. L.; Morsy, M. M.; Spencer, K.
2017-12-01
Coastal floods have become more common recently and are predicted to further increase in frequency and severity due to sea level rise. Predicting floods in coastal cities can be difficult due to the number of environmental and geographic factors which can influence flooding events. Built stormwater infrastructure and irregular urban landscapes add further complexity. This paper demonstrates the use of machine learning algorithms in predicting street flood occurrence in an urban coastal setting. The model is trained and evaluated using data from Norfolk, Virginia USA from September 2010 - October 2016. Rainfall, tide levels, water table levels, and wind conditions are used as input variables. Street flooding reports made by city workers after named and unnamed storm events, ranging from 1-159 reports per event, are the model output. Results show that Random Forest provides predictive power in estimating the number of flood occurrences given a set of environmental conditions with an out-of-bag root mean squared error of 4.3 flood reports and a mean absolute error of 0.82 flood reports. The Random Forest algorithm performed much better than Poisson regression. From the Random Forest model, total daily rainfall was by far the most important factor in flood occurrence prediction, followed by daily low tide and daily higher high tide. The model demonstrated here could be used to predict flood severity based on forecast rainfall and tide conditions and could be further enhanced using more complete street flooding data for model training.
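The out-of-bag error and the two error measures quoted in this abstract can be made concrete with a short sketch. It shows how a bootstrap sample leaves some observations out of bag (these are the ones a tree can be scored on), and how RMSE and MAE are computed. The flood-report counts below are made-up illustrative values, not data from the study, and all function names are hypothetical.

```python
import math
import random

def bootstrap_with_oob(n_samples, rng):
    """Draw a bootstrap sample of indices and list the out-of-bag indices."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    oob = sorted(set(range(n_samples)) - set(in_bag))
    return in_bag, oob

def rmse(observed, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))

def mae(observed, predicted):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

rng = random.Random(0)
in_bag, oob = bootstrap_with_oob(1000, rng)
print(f"out-of-bag fraction: {len(oob) / 1000:.3f}")  # tends toward 1/e

observed = [0, 0, 1, 0, 12]   # hypothetical flood reports per event
predicted = [0, 1, 1, 0, 4]   # hypothetical model output
print(rmse(observed, predicted), mae(observed, predicted))
```

RMSE is never smaller than MAE, and a few badly missed large events widen the gap between them. That is one possible reading of the difference between the reported 4.3 and 0.82, given that most events have few reports.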
Differentiation of fat, muscle, and edema in thigh MRIs using random forest classification
NASA Astrophysics Data System (ADS)
Kovacs, William; Liu, Chia-Ying; Summers, Ronald M.; Yao, Jianhua
2016-03-01
There are many diseases that affect the distribution of muscles, including Duchenne and facioscapulohumeral dystrophy among other myopathies. In these disease cases, it is important to quantify both the muscle and fat volumes to track the disease progression. There has also been evidence that abnormal signal intensity on the MR images, which often is an indication of edema or inflammation, can be a good predictor of muscle deterioration. We present a fully automated method that examines magnetic resonance (MR) images of the thigh and identifies the fat, muscle, and edema using a random forest classifier. First, the thigh regions are automatically segmented using the T1 sequence. Then, inhomogeneity artifacts are corrected using the N3 technique. The T1 and STIR (short tau inverse recovery) images are then aligned using landmark-based registration with the bone marrow. The normalized T1 and STIR intensity values are used to train the random forest. Once trained, the random forest can accurately classify the aforementioned classes. This method was evaluated on MR images of 9 patients. The precision values are 0.91+/-0.06, 0.98+/-0.01 and 0.50+/-0.29 for muscle, fat, and edema, respectively. The recall values are 0.95+/-0.02, 0.96+/-0.03 and 0.43+/-0.09 for muscle, fat, and edema, respectively. This demonstrates the feasibility of utilizing information from multiple MR sequences for the accurate quantification of fat, muscle and edema.
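The per-class precision and recall figures reported here follow the usual one-versus-rest definitions, which can be sketched as below. The voxel labels are invented toy data, and the function name is illustrative rather than taken from the paper.

```python
def precision_recall(true_labels, predicted_labels, positive_class):
    """One-vs-rest precision and recall for a single class."""
    tp = fp = fn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if pred == positive_class and truth == positive_class:
            tp += 1          # correctly labelled as the class
        elif pred == positive_class:
            fp += 1          # labelled as the class, but is something else
        elif truth == positive_class:
            fn += 1          # belongs to the class, but was missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical voxel labels, not data from the study.
truth = ["fat", "muscle", "edema", "muscle", "fat", "edema"]
pred = ["fat", "muscle", "muscle", "muscle", "fat", "edema"]
for cls in ("muscle", "fat", "edema"):
    print(cls, precision_recall(truth, pred, cls))
```

In the toy data one edema voxel is mislabelled as muscle, which lowers edema recall and muscle precision at the same time. Confusion of this kind between two classes is one way the much lower edema scores in the abstract could arise.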
AUTOCLASSIFICATION OF THE VARIABLE 3XMM SOURCES USING THE RANDOM FOREST MACHINE LEARNING ALGORITHM
DOE Office of Scientific and Technical Information (OSTI.GOV)
Farrell, Sean A.; Murphy, Tara; Lo, Kitty K., E-mail: s.farrell@physics.usyd.edu.au
In the current era of large surveys and massive data sets, autoclassification of astrophysical sources using intelligent algorithms is becoming increasingly important. In this paper we present the catalog of variable sources in the Third XMM-Newton Serendipitous Source catalog (3XMM) autoclassified using the Random Forest machine learning algorithm. We used a sample of manually classified variable sources from the second data release of the XMM-Newton catalogs (2XMMi-DR2) to train the classifier, obtaining an accuracy of ∼92%. We also evaluated the effectiveness of identifying spurious detections using a sample of spurious sources, achieving an accuracy of ∼95%. Manual investigation of a random sample of classified sources confirmed these accuracy levels and showed that the Random Forest machine learning algorithm is highly effective at automatically classifying 3XMM sources. Here we present the catalog of classified 3XMM variable sources. We also present three previously unidentified unusual sources that were flagged as outlier sources by the algorithm: a new candidate supergiant fast X-ray transient, a 400 s X-ray pulsar, and an eclipsing 5 hr binary system coincident with a known Cepheid.
Advances in remote sensing of forest background reflectance with MODIS BRDF data across Europe
NASA Astrophysics Data System (ADS)
Pisek, Jan; Alikas, Krista; Lukeš, Petr; Lundin, Lars; Kobler, Johannes; Santos-Reis, Margarida; Chen, Jing
2017-04-01
Spatial and temporal patterns of forest background (understory) reflectance are crucial for retrieving biophysical parameters of forest canopies (overstory) and subsequently for ecosystem modeling. However, systematic reflectance data covering different site types are almost missing. This presentation will focus on the validation of background reflectance retrievals using MODIS bidirectional reflectance distribution function (BRDF) data against in-situ understory reflectance measurements covering a diverse set of long-term ecological research (LTER) sites distributed along a wide latitudinal and elevational gradient across Europe: protected coniferous blueberry forest in Sweden, karst forest system in Austria, floodplain broadleaf forest and coniferous forest in the Czech Republic, and Mediterranean agro-sylvo-pastoral woodlands in Portugal. The multi-angle remote sensing data-based methodology was originally developed for the forest background signal retrieval in a boreal region. Here its performance will be tested across diverse forest conditions and moments during the growing season, which is a necessary step before conducting extensive mapping over forested areas. The results can be also used as an input for improved modeling of local carbon and energy fluxes.
Calibration and Application of FOREST-BGC in NorthWestern of Portugal
NASA Astrophysics Data System (ADS)
Rodrigues, M. A.; Lopes, D. M.; Leite, M. S.; Tabuada, V. M.
2010-05-01
Net primary production (NPP) is one of the most important variables in ecosystem inventory and management, because it quantifies ecosystem growth and reflects the impact of the biotic and abiotic factors that could affect it. Interest in NPP has increased recently because of growing interest in climate change and the need to understand its impact on the environment. Ecophysiological models such as Forest-BGC allow NPP to be estimated. These types of models offer a possible methodology to test these phenomena at temporal and spatial scales not available with traditional inventory methodologies. To analyze Forest-BGC performance, NPP data obtained with the model were compared with data collected in the field in the same sampling plots. For the parameterization and validation of FOREST-BGC, this study was based on 500 m² sampling plots from the National Forest Inventory 2006, located in several municipalities of the district of Vila Real, Portugal (Montalegre, Chaves, Valpaços, Boticas, Vila Pouca de Aguiar, Murça, Mondim de Basto, Alijó, Sabrosa and Vila Real). In order to quantify biomass dynamics, we selected 45 sampling plots: 19 from Pinus pinaster stands, 17 from Quercus pyrenaica and 10 from mixed stands of Quercus and Pinus. Adaptation strategies for climate change impacts can be proposed based on these research results.
No evidence for intervention-dependent influence of methodological features on treatment effect.
Jacobs, Wilco C H; Kruyt, Moyo C; Moojen, Wouter A; Verbout, Ab J; Oner, F Cumhur
2013-12-01
The goal of this systematic review was to evaluate if the influence of methodological features on treatment effect differs between types of intervention. MEDLINE, Embase, Web of Science, Cochrane methodology register, and reference lists were searched for meta-epidemiologic studies on the influence of methodological features on treatment effect. Studies analyzing the influence of methodological features related to internal validity were included. We made a distinction among surgical, pharmaceutical, and therapeutical as separate types of intervention. Heterogeneity was calculated to identify differences among these types. Fourteen meta-epidemiologic studies were found with 51 estimates of influence of methodological features on treatment effect. Heterogeneity was observed among the intervention types for randomization. Surgical intervention studies showed a larger treatment effect when randomized; this was in contrast to pharmaceutical studies that found the opposite. For allocation concealment and double blinding, the influence of methodological features on the treatment effect was comparable across different types of intervention. For the remaining methodological features, there were insufficient observations. The influence of allocation concealment and double blinding on the treatment effect is consistent across studies of different interventional types. The influence of randomization, however, may differ between surgical and nonsurgical studies. Copyright © 2013 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Bassa, Zaakirah; Bob, Urmilla; Szantoi, Zoltan; Ismail, Riyad
2016-01-01
In recent years, the popularity of tree-based ensemble methods for land cover classification has increased significantly. Using WorldView-2 image data, we evaluate the potential of the oblique random forest algorithm (oRF) to classify a highly heterogeneous protected area. In contrast to the random forest (RF) algorithm, the oRF algorithm builds multivariate trees by learning the optimal split using a supervised model. The oRF binary algorithm is adapted to a multiclass land cover and land use application using both the "one-against-one" and "one-against-all" combination approaches. Results show that the oRF algorithms are capable of achieving high classification accuracies (>80%). However, there was no statistical difference in classification accuracies obtained by the oRF algorithms and the more popular RF algorithm. For all the algorithms, user accuracies (UAs) and producer accuracies (PAs) >80% were recorded for most of the classes. Both the RF and oRF algorithms poorly classified the indigenous forest class as indicated by the low UAs and PAs. Finally, the results from this study advocate and support the utility of the oRF algorithm for land cover and land use mapping of protected areas using WorldView-2 image data.
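The two combination approaches named above can be sketched as follows. This is a generic illustration of turning binary decisions into a multiclass prediction, not the oRF implementation used in the study. The `score` and `duel` callables, the class names, and the toy centroids are hypothetical stand-ins for trained binary classifiers.

```python
from collections import Counter
from itertools import combinations

def one_against_all(x, classes, score):
    """score(x, c) gives the confidence that x belongs to c rather than the rest."""
    return max(classes, key=lambda c: score(x, c))

def one_against_one(x, classes, duel):
    """duel(x, a, b) returns whichever of a or b the pairwise classifier prefers."""
    votes = Counter(duel(x, a, b) for a, b in combinations(classes, 2))
    return votes.most_common(1)[0][0]

# Hypothetical 1-D "classifiers" based on distance to a class centroid.
centroids = {"water": 0.1, "grass": 0.5, "forest": 0.9}
classes = list(centroids)

def score(x, c):
    return -abs(x - centroids[c])

def duel(x, a, b):
    return a if abs(x - centroids[a]) <= abs(x - centroids[b]) else b

ova_label = one_against_all(0.8, classes, score)
ovo_label = one_against_one(0.8, classes, duel)
```

With k classes, one-against-all needs k binary models while one-against-one needs k(k-1)/2, so the latter trains more models, each on less data.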
NASA Astrophysics Data System (ADS)
Santos, E. G.; Jorge, A.; Shimabukuro, Y. E.; Gasparini, K.
2017-12-01
The State of Mato Grosso (MT) has the second largest area of degraded forest among the states of the Brazilian Legal Amazon. Land use and land cover change processes that occur in this region cause the loss of forest biomass, releasing greenhouse gases that contribute to the increase of temperature on Earth. These degraded forest areas lose biomass according to the intensity and magnitude of the degradation type. The estimate of forest biomass, commonly performed by forest inventory through sample plots, shows high variance in degraded forest areas. Due to this variance and the complexity of tropical forests, the aim of this work was to estimate forest biomass using LiDAR point clouds in three distinct forest areas: one degraded by fire, another by selective logging and one area of intact forest. The approach applied in these areas was Individual Tree Detection (ITD). To isolate the trees, we generated Canopy Height Model (CHM) images, which are obtained by subtracting the Digital Terrain Model (MDT) from the Digital Elevation Model (MDE), both created from the LiDAR point cloud. The trees in the CHM images are isolated by an algorithm provided by the Quantitative Ecology research group at the School of Forestry at Northern Arizona University (SILVA, 2015). With these points, metrics were calculated for some areas, which were used in the model of biomass estimation. The methodology used in this work was expected to reduce the error in the biomass estimate in the study area. The point clouds of the most representative trees were analyzed, and thus field data were correlated with the individual trees found by the proposed algorithm. In a pilot study, the proposed methodology was applied, generating the individual tree metrics: total height and area of the crown. When correlating 339 isolated trees, an unsatisfactory R² was obtained, as heights found by the algorithm were lower than those obtained in the field, with an average difference of 2.43 m. This shows that the algorithm used to isolate trees in temperate areas did not obtain satisfactory results in the tropical forest of Mato Grosso State. Due to this, in future work two algorithms, one developed by Dalponte et al. (2015) and another by Li et al. (2012), will be used.
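The canopy height model step described above can be sketched on a small grid. This is a minimal illustration only. The 3x3 local-maximum search is a simple stand-in for tree isolation and is not the algorithm the authors used, and the grids and the `min_height` threshold are hypothetical.

```python
def canopy_height_model(surface, terrain):
    """Subtract terrain elevation from surface elevation cell by cell (floored at 0)."""
    return [[max(s - t, 0.0) for s, t in zip(s_row, t_row)]
            for s_row, t_row in zip(surface, terrain)]

def local_maxima(chm, min_height):
    """Return (row, col, height) for cells strictly taller than all their neighbours."""
    rows, cols = len(chm), len(chm[0])
    tops = []
    for r in range(rows):
        for c in range(cols):
            h = chm[r][c]
            if h < min_height:
                continue
            neighbours = [chm[i][j]
                          for i in range(max(r - 1, 0), min(r + 2, rows))
                          for j in range(max(c - 1, 0), min(c + 2, cols))
                          if (i, j) != (r, c)]
            if all(h > n for n in neighbours):
                tops.append((r, c, h))
    return tops

# Hypothetical elevations in metres.
surface = [[105.0, 112.0, 106.0],
           [104.0, 107.0, 105.0],
           [103.0, 104.0, 118.0]]
terrain = [[100.0, 100.0, 101.0],
           [100.0, 100.0, 101.0],
           [100.0, 101.0, 101.0]]
chm = canopy_height_model(surface, terrain)
tree_tops = local_maxima(chm, min_height=10.0)
```

A fixed window like this tends to merge neighbouring crowns in dense canopies, which is one plausible way a detector tuned on other forest types could underperform in a tropical stand.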
Predicting healthcare associated infections using patients' experiences
NASA Astrophysics Data System (ADS)
Pratt, Michael A.; Chu, Henry
2016-05-01
Healthcare associated infections (HAI) are a major threat to patient safety and are costly to health systems. Our goal is to predict the HAI performance of a hospital using the patients' experience responses as input. We use four classifiers, viz. random forest, naive Bayes, artificial feedforward neural networks, and the support vector machine, to perform the prediction of six types of HAI. The six types include blood stream, urinary tract, surgical site, and intestinal infections. Experiments show that the random forest and support vector machine perform well across the six types of HAI.
Sánchez-Ribas, Jordi; Oliveira-Ferreira, Joseli; Rosa-Freitas, Maria Goreti; Trilla, Lluís; Silva-do-Nascimento, Teresa Fernandes
2015-09-01
Here we present the first in a series of articles about the ecology of immature stages of anophelines in the Brazilian Yanomami area. We propose a new larval habitat classification and a new larval sampling methodology. We also report some preliminary results illustrating the applicability of the methodology based on data collected in the Brazilian Amazon rainforest in a longitudinal study of two remote Yanomami communities, Parafuri and Toototobi. In these areas, we mapped and classified 112 natural breeding habitats located in low-order river systems based on their association with river flood pulses, seasonality and exposure to sun. Our classification rendered seven types of larval habitats: lakes associated with the river, which are subdivided into oxbow lakes and nonoxbow lakes, flooded areas associated with the river, flooded areas not associated with the river, rainfall pools, small forest streams, medium forest streams and rivers. The methodology for larval sampling was based on the accurate quantification of the effective breeding area, taking into account the area of the perimeter and subtypes of microenvironments present per larval habitat type using a laser range finder and a small portable inflatable boat. The new classification and new sampling methodology proposed herein may be useful in vector control programs. PMID:26517655
Andrew T. Hudak; Jeffrey S. Evans; Nicholas L. Crookston; Michael J. Falkowski; Brant K. Steigers; Rob Taylor; Halli Hemingway
2008-01-01
Stand exams are the principal means by which timber companies monitor and manage their forested lands. Airborne LiDAR surveys sample forest stands at much finer spatial resolution and broader spatial extent than is practical on the ground. In this paper, we developed models that leverage spatially intensive and extensive LiDAR data and a stratified random sample of...
John D. Baldridge; James T. Sylvester; William T. Borrie
2005-01-01
Local, state, and national agencies charged with managing wildlands in the United States are now seeking to learn more about the public's preferences for managing forests. For this reason agency wildland managers are making use of survey research to supplement their public input processes. Agency managers often choose random-digit dial telephone surveys because of...
Effect of the federal estate tax on nonindustrial private forest holdings
John L. Greene; Steven H. Bullard; Tamara L. Cushing; Theodore Beauvais
2006-01-01
Data for this study were collected using a questionnaire mailed to randomly selected members of two forest owner organizations. Among the key findings is that 38% of forest estates owed federal estate tax, a rate many times higher than US estates in general. In 28% of the cases where estate tax was due, timber or land was sold because other assets were not adequate. In...
Acorn Production on the Missouri Ozark Forest Ecosystem Project Study Sites: Pre-treatment Data
Larry D. Vangilder
1997-01-01
In the pre-treatment phase of a study to determine if even- and uneven-aged forest management affects the production of acorns on the Missouri Ozark Forest Ecosystem Project (MOFEP) study sites, acorn production was measured on the nine study sites by randomly placing from 2 to 6 plots in each of four ecological land type (ELT) groupings (N=130 plots). A split-plot...
Sonia Wharton; Matt Schroeder; Kyaw Tha Paw U; Matthias Falk; Ken Bible
2009-01-01
Carbon dioxide (CO2), water vapor, and energy fluxes were measured using eddy covariance (EC) methodology over three adjacent evergreen forests in southern Washington State to identify stand-level age-effects on ecosystem exchange. The sites represent Douglas-fir forest ecosystems at two contrasting successional stages: old-growth (OG) and early...
David N. Bengston; David P. Fan
1999-01-01
This article presents an innovative methodology for evaluating strategic planning goals in a public agency. Computer-coded content analysis was used to evaluate attitudes expressed in about 28,000 on-line news media stories about the U.S. Department of Agriculture Forest Service and its strategic goal of conservation leadership. Three dimensions of conservation...
Riparian buffers and forest thinning: Effects on headwater vertebrates 10 years after thinning
Deanna H. Olson; Jeffery B. Leirness; Patrick G. Cunningham; E. Ashley Steel
2014-01-01
We monitored instream vertebrate and stream-bank-dwelling amphibian counts during a stand-scale experiment of the effect of riparian buffer width with upland forest thinning in western Oregon, USA using a before/after/control methodology. We analyzed animal counts along 45 streams at 8 study sites, distributed from the foothills of Mount Hood to Coos Bay, Oregon using...
Francisco Rodríguez y Silva; Juan Ramón Molina Martínez; Miguel Castillo Soto
2013-01-01
Assessing areas affected by forest fires requires comprehensive studies covering a wide range of analyses. From an economic standpoint, assessing the affected area in monetary terms is crucial. Determining the degree of loss in the value of natural resources, both those of a tangible and intangible nature, enables knowing the residual value remaining after a fire, i.e...
Andrew Moldenke; Becky Fichter
1988-01-01
A fully illustrated key is presented for identifying genera of oribatid mites known from or suspected of occurring in the Pacific Northwest. The manual includes an introduction detailing sampling methodology; an illustrated glossary of all terminology used; two color plates of all taxa from the H. J. Andrews Experimental Forest; a diagrammatic key to the 16 major...
Katharine White; Jennifer Pontius; Paul Schaberg
2014-01-01
Current remote sensing studies of phenology have been limited to coarse spatial or temporal resolution and often lack a direct link to field measurements. To address this gap, we compared remote sensing methodologies using Landsat Thematic Mapper (TM) imagery to extensive field measurements in a mixed northern hardwood forest. Five vegetation indices, five mathematical...
Effects of foliage clumping on the estimation of global terrestrial gross primary productivity
NASA Astrophysics Data System (ADS)
Chen, Jing M.; Mo, Gang; Pisek, Jan; Liu, Jane; Deng, Feng; Ishizawa, Misa; Chan, Douglas
2012-03-01
Sunlit and shaded leaf separation proposed by Norman (1982) is an effective way to upscale from leaf to canopy in modeling vegetation photosynthesis. The Boreal Ecosystem Productivity Simulator (BEPS) makes use of this methodology, and has been shown to be reliable in modeling the gross primary productivity (GPP) derived from CO2 flux and tree ring measurements. In this study, we use BEPS to investigate the effect of canopy architecture on the global distribution of GPP. For this purpose, we use not only leaf area index (LAI) but also the first ever global map of the foliage clumping index derived from the multiangle satellite sensor POLDER at 6 km resolution. The clumping index, which characterizes the degree of the deviation of 3-dimensional leaf spatial distributions from the random case, is used to separate sunlit and shaded LAI values for a given LAI. Our model results show that global GPP in 2003 was 132 ± 22 Pg C. Relative to this baseline case, our results also show: (1) global GPP is overestimated by 12% when accurate LAI is available but clumping is ignored, and (2) global GPP is underestimated by 9% when the effective LAI is available and clumping is ignored. The clumping effects in both cases are statistically significant (p < 0.001). The effective LAI is often derived from remote sensing by inverting the measured canopy gap fraction to LAI without considering the clumping. Global GPP would therefore be generally underestimated when remotely sensed LAI (actually effective LAI by our definition) is used. This is due to the underestimation of the shaded LAI and therefore the contribution of shaded leaves to GPP. We found that shaded leaves contribute 50%, 38%, 37%, 39%, 26%, 29% and 21% to the total GPP for broadleaf evergreen forest, broadleaf deciduous forest, evergreen conifer forest, deciduous conifer forest, shrub, C4 vegetation, and other vegetation, respectively. The global average of this ratio is 35%.
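To illustrate how a clumping index can separate sunlit from shaded LAI, the sketch below uses a widely cited form of the sunlit/shaded partition, in which sunlit LAI is 2 cos(θ) (1 − exp(−0.5 Ω L / cos θ)) and shaded LAI is the remainder. The abstract does not state the equation, so treat this form, the function name, and the example values as assumptions rather than the exact BEPS implementation.

```python
import math

def sunlit_shaded_lai(lai, clumping, solar_zenith_deg):
    """Split total LAI into (sunlit, shaded) parts for a given clumping index.

    clumping = 1.0 corresponds to randomly distributed leaves;
    values below 1.0 correspond to clumped foliage.
    """
    cos_theta = math.cos(math.radians(solar_zenith_deg))
    sunlit = 2.0 * cos_theta * (1.0 - math.exp(-0.5 * clumping * lai / cos_theta))
    return sunlit, lai - sunlit

# Hypothetical canopy: LAI of 4 with the sun 30 degrees from zenith.
sun_random, shade_random = sunlit_shaded_lai(4.0, 1.0, 30.0)
sun_clumped, shade_clumped = sunlit_shaded_lai(4.0, 0.6, 30.0)
```

Under this form, assuming random leaves when the canopy is actually clumped assigns too much leaf area to the sunlit class, which is consistent in direction with the overestimate reported when accurate LAI is used but clumping is ignored.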
Functional Redundancy and Complementarities of Seed Dispersal by the Last Neotropical Megafrugivores
Bueno, Rafael S.; Guevara, Roger; Ribeiro, Milton C.; Culot, Laurence; Bufalo, Felipe S.; Galetti, Mauro
2013-01-01
Background: Functional redundancy has been widely debated in ecology and conservation, yet we lack detailed empirical studies on the roles of functionally similar species in ecosystem function. Large-bodied frugivores may disperse similar plant species and have a strong impact on plant recruitment in tropical forests. The two largest frugivores in the neotropics, tapirs (Tapirus terrestris) and muriquis (Brachyteles arachnoides), are potential candidates for functional redundancy in seed dispersal effectiveness. Here we provide a comparison of the quantitative, qualitative and spatial effects on seed dispersal by these megafrugivores in a continuous Brazilian Atlantic forest. Methodology/Principal Findings: We found a low overlap of plant species dispersed by both muriquis and tapirs. A group of 35 muriquis occupied an area of 850 ha and dispersed 5 times more plant species, and 13 times more seeds, than 22 tapirs living in the same area. Muriquis dispersed 2.4 times more seeds in any random position than tapirs. This can be explained mainly because seed deposition by muriquis leaves less empty space than that by tapirs. However, tapirs are able to disperse larger seeds than muriquis and move them into sites not reached by primates, such as large forest gaps, open areas and fragments nearby. Based on published information we found 302 plant species that are dispersed by at least one of these megafrugivores in the Brazilian Atlantic forest. Conclusions/Significance: Our study showed that both megafrugivores play complementary rather than redundant roles as seed dispersers. Although tapirs disperse fewer seeds and species than muriquis, they disperse larger-seeded species and in places not used by primates. The selective extinction of these megafrugivores will change the spatial seed rain they generate and may have negative effects on the recruitment of several plant species, particularly those with large seeds that have muriquis and tapirs as the last living seed dispersers.
PMID:23409161
Charney, Noah D; Babst, Flurin; Poulter, Benjamin; Record, Sydne; Trouet, Valerie M; Frank, David; Enquist, Brian J; Evans, Margaret E K
2016-09-01
Predicting long-term trends in forest growth requires accurate characterisation of how the relationship between forest productivity and climatic stress varies across climatic regimes. Using a network of over two million tree-ring observations spanning North America and a space-for-time substitution methodology, we forecast climate impacts on future forest growth. We explored differing scenarios of increased water-use efficiency (WUE) due to CO2-fertilisation, which we simulated as increased effective precipitation. In our forecasts: (1) climate change negatively impacted forest growth rates in the interior west and positively impacted forest growth along the western, southeastern and northeastern coasts; (2) shifting climate sensitivities offset positive effects of warming on high-latitude forests, leaving no evidence for continued 'boreal greening'; and (3) it took a 72% WUE enhancement to compensate for continentally averaged growth declines under RCP 8.5. Our results highlight the importance of locally adapted forest management strategies to handle regional differences in growth responses to climate change. © 2016 John Wiley & Sons Ltd/CNRS.
Smith, W Brad; Cuenca Lara, Rubí Angélica; Delgado Caballero, Carina Edith; Godínez Valdivia, Carlos Isaías; Kapron, Joseph S; Leyva Reyes, Juan Carlos; Meneses Tovar, Carmen Lourdes; Miles, Patrick D; Oswalt, Sonja N; Ramírez Salgado, Mayra; Song, Xilong Alex; Stinson, Graham; Villela Gaytán, Sergio Armando
2018-05-21
Forests cannot be managed sustainably without reliable data to inform decisions. National Forest Inventories (NFI) tend to report national statistics, with sub-national stratification based on domestic ecological classification systems. It is becoming increasingly important to be able to report statistics on ecosystems that span international borders, as global change and globalization expand stakeholders' spheres of concern. The state of a transnational ecosystem can only be properly assessed by examining the entire ecosystem. In global forest resource assessments, it may be useful to break national statistics down by ecosystem, especially for large countries. The Inventory and Monitoring Working Group (IMWG) of the North American Forest Commission (NAFC) has begun developing a harmonized North American Forest Database (NAFD) for managing forest inventory data, enabling consistent, continental-scale forest assessment supporting ecosystem-level reporting and relational queries. The first iteration of the database contains data describing 1.9 billion ha, including 677.5 million ha of forest. Data harmonization is made challenging by the existence of definitions and methodologies tailored to suit national circumstances, emerging from each country's professional forestry development. This paper reports the methods used to synchronize three national forest inventories, starting with a small suite of variables and attributes.
Co-Benefits of Sustainable Forest Management in Biodiversity Conservation and Carbon Sequestration
Imai, Nobuo; Samejima, Hiromitsu; Langner, Andreas; Ong, Robert C.; Kita, Satoshi; Titin, Jupiri; Chung, Arthur Y. C.; Lagan, Peter; Lee, Ying Fah; Kitayama, Kanehiro
2009-01-01
Background: Sustainable forest management (SFM), which has been recently introduced to tropical natural production forests, is beneficial in maintaining timber resources, but information about the co-benefits for biodiversity conservation and carbon sequestration is currently lacking. Methodology/Principal Findings: We estimated the diversity of medium to large-bodied forest-dwelling vertebrates using a heat-sensor camera trapping system and the amount of above-ground, fine-root, and soil organic carbon by a combination of ground surveys and aerial-imagery interpretations. This research was undertaken in both SFM-applied and conventionally logged production forests in Sabah, Malaysian Borneo. Our carbon estimation revealed that the application of SFM resulted in a net gain of 54 Mg C ha-1 on a landscape scale. Overall vertebrate diversity was greater in the SFM-applied forest than in the conventionally logged forest. Specifically, several vertebrate species (6 of the 36 recorded species) showed higher frequency in the SFM-applied forest than in the conventionally logged forest. Conclusions/Significance: The application of SFM to degraded natural production forests could result in greater diversity and abundance of vertebrate species as well as increasing carbon storage in the tropical rain forest ecosystems. PMID:20011516
Community turnover of wood-inhabiting fungi across hierarchical spatial scales.
Abrego, Nerea; García-Baquero, Gonzalo; Halme, Panu; Ovaskainen, Otso; Salcedo, Isabel
2014-01-01
For efficient use of conservation resources it is important to determine how species diversity changes across spatial scales. In many poorly known species groups little is known about the spatial scales at which conservation efforts should be focused. Here we examined how the community turnover of wood-inhabiting fungi is realised at three hierarchical levels, and how much of community variation is explained by variation in resource composition and spatial proximity. The hierarchical study design consisted of management type (fixed factor), forest site (random factor, nested within management type) and study plots (randomly placed plots within each study site). To examine how species richness varied across the three hierarchical scales, randomized species accumulation curves and additive partitioning of species richness were applied. To analyse variation in wood-inhabiting species and dead wood composition at each scale, linear and Permanova modelling approaches were used. Wood-inhabiting fungal communities were dominated by rare and infrequent species. The similarity of fungal communities was higher within sites and within management categories than among sites or between the two management categories, and it decreased with increasing distance among the sampling plots and with decreasing similarity of dead wood resources. However, only a small part of community variation could be explained by these factors. The species present in managed forests were to a large extent a subset of those species present in natural forests. Our results suggest that in particular the protection of rare species requires a large total area. As managed forests have only little additional value complementing the diversity of natural forests, the conservation of natural forests is the key to ecologically effective conservation. 
As the dissimilarity of fungal communities increases with distance, the conserved natural forest sites should be broadly distributed in space, yet the individual conserved areas should be large enough to ensure local persistence. PMID:25058128
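Additive partitioning of species richness, mentioned in the abstract above, decomposes total (gamma) richness into mean within-unit (alpha) richness plus among-unit (beta) components at each level of the hierarchy. A minimal two-level sketch follows, assuming equal numbers of plots per site. The site names and species codes are hypothetical and the function is not the authors' analysis.

```python
def additive_partition(sites):
    """sites maps a site name to a list of plots, each plot a set of species.

    Returns (alpha_plot, beta_among_plots, beta_among_sites, gamma), where
    alpha_plot + beta_among_plots + beta_among_sites == gamma.
    """
    plots = [plot for site_plots in sites.values() for plot in site_plots]
    site_pools = [set().union(*site_plots) for site_plots in sites.values()]
    alpha_plot = sum(len(p) for p in plots) / len(plots)
    alpha_site = sum(len(s) for s in site_pools) / len(site_pools)
    gamma = len(set().union(*site_pools))
    return alpha_plot, alpha_site - alpha_plot, gamma - alpha_site, gamma

# Hypothetical species lists for two sites with two plots each.
sites = {
    "site1": [{"a", "b"}, {"b", "c"}],
    "site2": [{"a", "d"}, {"e"}],
}
alpha_plot, beta_plots, beta_sites, gamma = additive_partition(sites)
```

A large among-site component relative to the plot-level alpha indicates that much of the richness comes from turnover between sites, which is the kind of pattern that motivates spreading conserved areas across space.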
Decision tree modeling using R.
Zhang, Zhongheng
2016-08-01
In the machine learning field, the decision tree learner is powerful and easy to interpret. It employs a recursive binary partitioning algorithm that splits the sample on the partitioning variable with the strongest association with the response variable. The process continues until some stopping criteria are met. In the example, I focus on the conditional inference tree, which incorporates tree-structured regression models into conditional inference procedures. Because a single tree is sensitive to small changes in the training data, the random forests procedure is introduced to address this problem. The sources of diversity for random forests come from random sampling and the restricted set of input variables available for selection. Finally, I introduce R functions to perform model-based recursive partitioning. This method incorporates recursive partitioning into conventional parametric model building.
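The binary split and the two sources of randomness described above can be sketched in a few lines. Note the substitution: this sketch scores splits by squared error (CART-style) rather than by the conditional inference tests the article focuses on, and it is written in Python rather than R. All names and the toy data are hypothetical.

```python
import random

def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split(rows, targets, feature_ids):
    """Return (feature, threshold, score) with the lowest child SSE, or None."""
    best = None
    for f in feature_ids:
        # Skip the smallest value so that neither child is empty.
        for threshold in sorted({row[f] for row in rows})[1:]:
            left = [t for row, t in zip(rows, targets) if row[f] < threshold]
            right = [t for row, t in zip(rows, targets) if row[f] >= threshold]
            score = sse(left) + sse(right)
            if best is None or score < best[2]:
                best = (f, threshold, score)
    return best

def randomised_root_split(rows, targets, mtry, rng):
    """Root split of one forest tree: bootstrap the rows, restrict the features."""
    n = len(rows)
    sample = [rng.randrange(n) for _ in range(n)]      # sampling with replacement
    features = rng.sample(range(len(rows[0])), mtry)   # restricted variable set
    return best_split([rows[i] for i in sample],
                      [targets[i] for i in sample], features)

rows = [(1.0, 5.0), (2.0, 3.0), (3.0, 8.0), (4.0, 1.0)]
targets = [10.0, 10.0, 20.0, 20.0]
full_split = best_split(rows, targets, [0, 1])
tree_split = randomised_root_split(rows, targets, mtry=1, rng=random.Random(0))
```

Growing a full tree repeats `best_split` on each child until a stopping rule is met. A forest repeats the randomised version many times and averages the trees, which dampens the instability of any single tree.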
NASA Astrophysics Data System (ADS)
Hoffman, A.; Forest, C. E.; Kemanian, A.
2016-12-01
A significant number of food-insecure nations exist in regions of the world where dust plays a large role in the climate system. While the impacts of common climate variables (e.g. temperature, precipitation, ozone, and carbon dioxide) on crop yields are relatively well understood, the impact of mineral aerosols on yields has not yet been thoroughly investigated. This research aims to develop the data and tools to advance our understanding of mineral aerosol impacts on crop yields. Suspended dust affects crop yields by altering the amount and type of radiation reaching the plant and by modifying local temperature and precipitation, while dust events (i.e. dust storms) affect crop yields by depleting the soil of nutrients or by defoliation via particle abrasion. The impact of dust on yields is modeled statistically because we are uncertain which impacts will dominate the response on the national and regional scales considered in this study. Multiple linear regression is used in a number of large-scale statistical crop modeling studies to estimate yield responses to various climate variables. In alignment with previous work, we develop linear crop models, but build upon this simple method of regression with machine-learning techniques (e.g. random forests) to identify important statistical predictors and isolate how dust affects yields on the scales of interest. To perform this analysis, we develop a crop-climate dataset for maize, soybean, groundnut, sorghum, rice, and wheat for the regions of West Africa, East Africa, South Africa, and the Sahel. Random forest regression models consistently model historic crop yields better than the linear models. In several instances, the random forest models accurately capture the temperature and precipitation threshold behavior in crops. Additionally, improving agricultural technology has caused a well-documented positive trend that dominates time series of global and regional yields. 
This trend is often removed before regression with traditional crop models, but likely at the cost of removing climate information. Our random forest models consistently discover the positive trend without removing any additional data. The application of random forests as a statistical crop model provides insight into understanding the impact of dust on yields in marginal food producing regions.
NASA Astrophysics Data System (ADS)
Overstreet, B. T.; Legleiter, C. J.
2012-12-01
The Snake River in Grand Teton National Park is a dam-regulated but highly dynamic gravel-bed river that alternates between a single thread and a multithread planform. Identifying key drivers of channel change on this river could improve our understanding of 1) how flow regulation at Jackson Lake Dam has altered the character of the river over time; 2) how changes in the distribution of various types of vegetation impact river dynamics; and 3) how the Snake River will respond to future human and climate driven disturbances. Despite the importance of monitoring planform changes over time, automated channel extraction and understanding the physical drivers contributing to channel change continue to be challenging yet critical steps in the remote sensing of riverine environments. In this study we use the random forest statistical technique to first classify land cover within the Snake River corridor and then extract channel features from a sequence of high-resolution multispectral images of the Snake River spanning the period from 2006 to 2012, which encompasses both exceptionally dry years and near-record runoff in 2011. We show that the random forest technique can be used to classify images with as few as four spectral bands with far greater accuracy than traditional single-tree classification approaches. Secondly, we couple random forest derived land cover maps with LiDAR derived topography, bathymetry, and canopy height to explore physical drivers contributing to observed channel changes on the Snake River. In conclusion, we show that the random forest technique is a powerful tool for classifying multispectral images of rivers. Moreover, we hypothesize that with sufficient data for calculating spatially distributed metrics of channel form and more frequent channel monitoring, this tool can also be used to identify areas with high probabilities of channel change. 
Land cover maps of a portion of the Snake River produced from digital aerial photography from 2010 and a 2011 WorldView2 satellite image. This pair of maps thus captures changes that occurred during the 2011 runoff
NASA Astrophysics Data System (ADS)
Saenz, Edward J.
Forests provide vital ecosystem functions and services that maintain the integrity of our natural and human environment. Understanding the structural components of forests (extent, tree density, heights of multi-story canopies, biomass, etc.) provides necessary information to preserve ecosystem services. Increasingly, remote sensing resources have been used to map and monitor forests globally. However, traditional satellite and airborne multi-angle imagery only provide information about the top of the canopy and little about the forest structure and understory. In this research, we investigate the use of rapidly evolving lidar technology, and how the fusion of aerial and terrestrial lidar data can be utilized to better characterize forest stand information. We further apply a novel terrestrial lidar methodology to characterize a Hemlock Woolly Adelgid infestation in Harvard Forest, Massachusetts, and adapt a dynamic terrestrial lidar sampling scheme to identify key structural vegetation profiles of tropical rainforests in La Selva, Costa Rica.
Pigmented skin lesion detection using random forest and wavelet-based texture
NASA Astrophysics Data System (ADS)
Hu, Ping; Yang, Tie-jun
2016-10-01
The incidence of cutaneous malignant melanoma, a disease of worldwide distribution and the deadliest form of skin cancer, has been rapidly increasing over the last few decades. Because advanced cutaneous melanoma is still incurable, early detection is an important step toward a reduction in mortality. Dermoscopy photographs are commonly used in melanoma diagnosis and can capture detailed features of a lesion. Great variability exists in the visual appearance of pigmented skin lesions. Therefore, in order to minimize the diagnostic errors that result from the difficulty and subjectivity of visual interpretation, an automatic detection approach is required. The objectives of this paper were to propose a hybrid method using random forest and Gabor wavelet transformation to accurately differentiate the lesion area from the non-lesion area in a dermoscopy photograph, and to analyze segmentation accuracy. A random forest classifier consisting of a set of decision trees was used for classification. The Gabor wavelet transformation is a mathematical model of the visual cortical cells of the mammalian brain, and it can decompose an image into multiple scales and multiple orientations. The Gabor function has been recognized as a very useful tool in texture analysis, due to its optimal localization properties in both the spatial and frequency domains. Texture features based on the Gabor wavelet transformation are extracted from the Gabor-filtered image. Experimental results indicate the following: (1) the proposed algorithm based on random forest outperformed the state of the art in pigmented skin lesion detection, and (2) the inclusion of Gabor wavelet transformation based texture features improved segmentation accuracy significantly.
NASA Astrophysics Data System (ADS)
Sadler, J. M.; Goodall, J. L.; Morsy, M. M.; Spencer, K.
2018-04-01
Sea level rise has already caused more frequent and severe coastal flooding and this trend will likely continue. Flood prediction is an essential part of a coastal city's capacity to adapt to and mitigate this growing problem. Complex coastal urban hydrological systems, however, do not always lend themselves easily to physically-based flood prediction approaches. This paper presents a method for using a data-driven approach to estimate flood severity in an urban coastal setting using crowd-sourced data, a non-traditional but growing data source, along with environmental observation data. Two data-driven models, Poisson regression and Random Forest regression, are trained to predict the number of flood reports per storm event as a proxy for flood severity, given extensive environmental data (i.e., rainfall, tide, groundwater table level, and wind conditions) as input. The method is demonstrated using data from Norfolk, Virginia, USA, from September 2010 to October 2016. Quality-controlled, crowd-sourced street flooding reports ranging from 1 to 159 per storm event for 45 storm events are used to train and evaluate the models. Random Forest performed better than Poisson regression at predicting the number of flood reports and had a lower false negative rate. From the Random Forest model, total cumulative rainfall was by far the most dominant input variable in predicting flood severity, followed by low tide and lower low tide. These methods serve as a first step toward using data-driven methods for spatially and temporally detailed coastal urban flood prediction.
Hu, Chen; Steingrimsson, Jon Arni
2018-01-01
A crucial component of making individualized treatment decisions is to accurately predict each patient's disease risk. In clinical oncology, disease risks are often measured through time-to-event data, such as overall survival and progression/recurrence-free survival, and are often subject to censoring. Risk prediction models based on recursive partitioning methods are becoming increasingly popular largely due to their ability to handle nonlinear relationships, higher-order interactions, and/or high-dimensional covariates. The most popular recursive partitioning methods are versions of the Classification and Regression Tree (CART) algorithm, which builds a simple interpretable tree structured model. With the aim of increasing prediction accuracy, the random forest algorithm averages multiple CART trees, creating a flexible risk prediction model. Risk prediction models used in clinical oncology commonly use both traditional demographic and tumor pathological factors as well as high-dimensional genetic markers and treatment parameters from multimodality treatments. In this article, we describe the most commonly used extensions of the CART and random forest algorithms to right-censored outcomes. We focus on how they differ from the methods for noncensored outcomes, and how the different splitting rules and methods for cost-complexity pruning impact these algorithms. We demonstrate these algorithms by analyzing a randomized Phase III clinical trial of breast cancer. We also conduct Monte Carlo simulations to compare the prediction accuracy of survival forests with more commonly used regression models under various scenarios. These simulation studies aim to evaluate how sensitive the prediction accuracy is to the underlying model specifications, the choice of tuning parameters, and the degrees of missing covariates.
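The averaging step described in this abstract can be illustrated with a minimal sketch. It is not the authors' method: it assumes a single numeric covariate, fully observed (uncensored) outcomes, and one-split trees ("stumps"), and it omits the random feature subsetting and the censoring-aware splitting rules that the article discusses. All function names are hypothetical.

```python
import random


def fit_stump(xs, ys):
    """Fit a one-split regression tree on a single numeric feature."""
    overall = sum(ys) / len(ys)
    best = (float("inf"), None, overall, overall)  # (sse, threshold, left, right)
    values = sorted(set(xs))
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = sum((y - left_mean) ** 2 for y in left)
        sse += sum((y - right_mean) ** 2 for y in right)
        if sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


def predict_stump(stump, x):
    """Predict with one stump; fall back to the mean if no split was found."""
    threshold, left_mean, right_mean = stump
    if threshold is None:  # the bootstrap sample had one distinct x value
        return left_mean
    return left_mean if x <= threshold else right_mean


def fit_forest(xs, ys, n_trees=25, seed=0):
    """Fit each stump on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    n = len(xs)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return forest


def predict_forest(forest, x):
    """The ensemble prediction is the plain average over all trees."""
    return sum(predict_stump(stump, x) for stump in forest) / len(forest)
```

Averaging over bootstrap-trained trees smooths the abrupt jump a single tree would produce near its split point, which is the source of the added flexibility.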
Climate Model Diagnostic Analyzer Web Service System
NASA Astrophysics Data System (ADS)
Lee, S.; Pan, L.; Zhai, C.; Tang, B.; Kubar, T. L.; Li, J.; Zhang, J.; Wang, W.
2015-12-01
Both the National Research Council Decadal Survey and the latest Intergovernmental Panel on Climate Change Assessment Report stressed the need for the comprehensive and innovative evaluation of climate models with the synergistic use of global satellite observations in order to improve our weather and climate simulation and prediction capabilities. The abundance of satellite observations for fundamental climate parameters and the availability of coordinated model outputs from CMIP5 for the same parameters offer a great opportunity to understand and diagnose model biases in climate models. In addition, the Obs4MIPs efforts have created several key global observational datasets that are readily usable for model evaluations. However, a model diagnostic evaluation process requires physics-based multi-variable comparisons that typically involve large-volume and heterogeneous datasets, making them both computationally- and data-intensive. In response, we have developed a novel methodology to diagnose model biases in contemporary climate models and implemented the methodology as a web-service based, cloud-enabled, provenance-supported climate-model evaluation system. The evaluation system is named Climate Model Diagnostic Analyzer (CMDA), which is the product of the research and technology development investments of several current and past NASA ROSES programs. The current technologies and infrastructure of CMDA are designed and selected to address several technical challenges that the Earth science modeling and model analysis community faces in evaluating and diagnosing climate models. In particular, we have three key technology components: (1) diagnostic analysis methodology; (2) web-service based, cloud-enabled technology; (3) provenance-supported technology. The diagnostic analysis methodology includes random forest feature importance ranking, conditional probability distribution function, conditional sampling, and time-lagged correlation map. 
We have implemented the new methodology as web services and incorporated the system into the Cloud. We have also developed a provenance management system for CMDA where CMDA service semantics modeling, service search and recommendation, and service execution history management are designed and implemented.
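Of the diagnostics listed in this abstract, the time-lagged correlation is the simplest to illustrate. The sketch below is an assumption about what one cell of such a map might compute (the Pearson correlation between one series and a later-shifted copy of another); it is not taken from CMDA, and the function names are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def lagged_correlation(x, y, lag):
    """Correlate x(t) with y(t + lag) for a non-negative integer lag."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    if lag < 0 or lag >= len(x) - 1:
        raise ValueError("lag must satisfy 0 <= lag < len(x) - 1")
    if lag == 0:
        return pearson(x, y)
    return pearson(x[:-lag], y[lag:])


def best_lag(x, y, max_lag):
    """Return the lag in 0..max_lag with the largest absolute correlation."""
    return max(range(max_lag + 1),
               key=lambda k: abs(lagged_correlation(x, y, k)))
```

A map would repeat this calculation for the pair of series at every grid cell and display either the correlation at a fixed lag or the best lag itself.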
Patch forest: a hybrid framework of random forest and patch-based segmentation
NASA Astrophysics Data System (ADS)
Xie, Zhongliu; Gillies, Duncan
2016-03-01
The development of an accurate, robust and fast segmentation algorithm has long been a research focus in medical computer vision. State-of-the-art practices often involve non-rigidly registering a target image with a set of training atlases for label propagation over the target space to perform segmentation, a.k.a. multi-atlas label propagation (MALP). In recent years, the patch-based segmentation (PBS) framework has gained wide attention due to its advantage of relaxing the strict voxel-to-voxel correspondence to a series of pair-wise patch comparisons for contextual pattern matching. Despite a high accuracy reported in many scenarios, computational efficiency has consistently been a major obstacle for both approaches. Inspired by recent work on random forest, in this paper we propose a patch forest approach, which by equipping the conventional PBS with a fast patch search engine, is able to boost segmentation speed significantly while retaining an equal level of accuracy. In addition, a fast forest training mechanism is also proposed, with the use of a dynamic grid framework to efficiently approximate data compactness computation and a 3D integral image technique for fast box feature retrieval.
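The 3D integral image mentioned at the end of this abstract can be sketched as follows. This is a generic summed-volume table, not the authors' implementation: it assumes the volume is a nested list indexed as volume[z][y][x], and the function names are hypothetical. Once the table is built, the sum over any axis-aligned box needs only eight lookups, which is what makes box features cheap to retrieve.

```python
def integral_image_3d(volume):
    """Build a zero-padded 3D summed-volume table.

    table[z][y][x] holds the sum of volume[0:z][0:y][0:x], so the table is
    one element larger than the volume along every axis.
    """
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    table = [[[0] * (width + 1) for _ in range(height + 1)]
             for _ in range(depth + 1)]
    for z in range(depth):
        for y in range(height):
            for x in range(width):
                # 3D inclusion-exclusion over the seven preceding corners.
                table[z + 1][y + 1][x + 1] = (
                    volume[z][y][x]
                    + table[z][y + 1][x + 1]
                    + table[z + 1][y][x + 1]
                    + table[z + 1][y + 1][x]
                    - table[z][y][x + 1]
                    - table[z][y + 1][x]
                    - table[z + 1][y][x]
                    + table[z][y][x]
                )
    return table


def box_sum(table, z0, z1, y0, y1, x0, x1):
    """Sum of the volume over the half-open box [z0,z1) x [y0,y1) x [x0,x1)."""
    return (
        table[z1][y1][x1]
        - table[z0][y1][x1]
        - table[z1][y0][x1]
        - table[z1][y1][x0]
        + table[z0][y0][x1]
        + table[z0][y1][x0]
        + table[z1][y0][x0]
        - table[z0][y0][x0]
    )
```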
Developing a methodology to predict oak wilt distribution using classification tree analysis
Marla C. Downing; Vernon L. Thomas; Robin M. Reich
2006-01-01
Oak wilt (Ceratocystis fagacearum), a fungal disease that causes some species of oak trees to wilt and die rapidly, is a threat to oak forested resources in 22 states in the United States. We developed a methodology for predicting the Potential Distribution of Oak Wilt (PDOW) using Anoka County, Minnesota as our study area. The PDOW utilizes GIS; the...
Multi-label spacecraft electrical signal classification method based on DBN and random forest
Li, Ke; Yu, Nan; Li, Pengfei; Song, Shimin; Wu, Yalei; Li, Yang; Liu, Meng
2017-01-01
In spacecraft electrical signal characteristic data, there exists a large amount of data with high-dimensional features, a high degree of computational complexity, and a low identification rate, which causes great difficulty in fault diagnosis of spacecraft electronic load systems. This paper proposes a feature extraction method based on deep belief networks (DBN) and a classification method based on the random forest (RF) algorithm. The proposed algorithm mainly employs a multi-layer neural network to reduce the dimension of the original data, and then classification is applied. First, we use wavelet denoising to pre-process the data. Second, the deep belief network is used to reduce the feature dimension and improve the classification rate for the electrical characteristics data. Finally, we use the random forest algorithm to classify the data and compare it with other algorithms. The experimental results show that, compared with other algorithms, the proposed method shows excellent performance in terms of accuracy, computational efficiency, and stability in addressing spacecraft electrical signal data. PMID:28486479
Intelligent Fault Diagnosis of HVCB with Feature Space Optimization-Based Random Forest
Ma, Suliang; Wu, Jianwen; Wang, Yuhao; Jia, Bowen; Jiang, Yuan
2018-01-01
Mechanical faults of high-voltage circuit breakers (HVCBs) inevitably occur over long-term operation, so extracting the fault features and identifying the fault type have become key issues for ensuring the security and reliability of the power supply. Based on wavelet packet decomposition technology and the random forest algorithm, an effective identification system was developed in this paper. First, in place of Shannon entropy, which gives an incomplete description, the wavelet packet time-frequency energy rate (WTFER) was adopted as the input vector for the classifier model in the feature selection procedure. Then, a random forest classifier was used to diagnose the HVCB fault, assess the importance of the feature variables and optimize the feature space. Finally, the approach was verified on actual HVCB vibration signals by considering six typical fault classes. The comparative experiment results show that the classification accuracy of the proposed method reached 93.33% with the original feature space and up to 95.56% with the optimized input feature vector of the classifier. This indicates that the feature optimization procedure is successful, and that the proposed diagnosis algorithm has higher efficiency and robustness than traditional methods. PMID:29659548
Klingensmith, Jon D; Haggard, Asher; Fedewa, Russell J; Qiang, Beidi; Cummings, Kenneth; DeGrande, Sean; Vince, D Geoffrey; Elsharkawy, Hesham
2018-04-19
Spectral analysis of ultrasound radiofrequency backscatter has the potential to identify intercostal blood vessels during ultrasound-guided placement of paravertebral nerve blocks and intercostal nerve blocks. Autoregressive models were used for spectral estimation, and bandwidth, autoregressive order and region-of-interest size were evaluated. Eight spectral parameters were calculated and used to create random forests. An autoregressive order of 10, bandwidth of 6 dB and region-of-interest size of 1.0 mm resulted in the minimum out-of-bag error. An additional random forest, using these chosen values, was created from 70% of the data and evaluated independently from the remaining 30% of data. The random forest achieved a predictive accuracy of 92% and Youden's index of 0.85. These results suggest that spectral analysis of ultrasound radiofrequency backscatter has the potential to identify intercostal blood vessels.
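Youden's index, reported above alongside accuracy, is sensitivity plus specificity minus one. The sketch below shows how both summary figures follow from a binary confusion matrix. The function names are hypothetical, and the counts in the usage note are illustrative only; they are not taken from the study.

```python
def sensitivity(tp, fn):
    """True positive rate: detected positives over all actual positives."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True negative rate: rejected negatives over all actual negatives."""
    return tn / (tn + fp)


def youden_index(tp, fn, tn, fp):
    """Youden's J = sensitivity + specificity - 1 (0 is chance, 1 is perfect)."""
    return sensitivity(tp, fn) + specificity(tn, fp) - 1.0


def accuracy(tp, fn, tn, fp):
    """Fraction of all cases classified correctly."""
    return (tp + tn) / (tp + fn + tn + fp)
```

For example, with made-up counts of 45 true positives, 5 false negatives, 40 true negatives and 10 false positives, sensitivity is 0.9, specificity is 0.8, Youden's index is 0.7 and accuracy is 0.85. Unlike accuracy, the index does not change when the proportion of positive cases in the test set changes.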
RandomForest4Life: a Random Forest for predicting ALS disease progression.
Hothorn, Torsten; Jung, Hans H
2014-09-01
We describe a method for predicting disease progression in amyotrophic lateral sclerosis (ALS) patients. The method was developed as a submission to the DREAM Phil Bowen ALS Prediction Prize4Life Challenge of summer 2012. Based on repeated patient examinations over a three-month period, we used a random forest algorithm to predict future disease progression. The procedure was set up and internally evaluated using data from 1197 ALS patients. External validation by an expert jury was based on undisclosed information of an additional 625 patients; all patient data were obtained from the PRO-ACT database. In terms of prediction accuracy, the approach described here ranked third best. Our interpretation of the prediction model confirmed previous reports suggesting that past disease progression is a strong predictor of future disease progression measured on the ALS functional rating scale (ALSFRS). We also found that larger variability in initial ALSFRS scores is linked to faster future disease progression. The results reported here furthermore suggested that approaches taking the multidimensionality of the ALSFRS into account promise some potential for improved ALS disease prediction.
RAQ–A Random Forest Approach for Predicting Air Quality in Urban Sensing Systems
Yu, Ruiyun; Yang, Yu; Yang, Leyou; Han, Guangjie; Move, Oguti Ann
2016-01-01
Air quality information such as the concentration of PM2.5 is of great significance for human health and city management. It affects how people travel, urban planning, government policies and so on. However, in major cities there is typically only a limited number of air quality monitoring stations. Meanwhile, air quality varies across urban areas and there can be large differences, even between closely neighboring regions. In this paper, a random forest approach for predicting air quality (RAQ) is proposed for urban sensing systems. The data generated by urban sensing include meteorology data, road information, real-time traffic status and point of interest (POI) distribution. The random forest algorithm is exploited for data training and prediction. The performance of RAQ is evaluated with real city data. Compared with three other algorithms, this approach achieves better prediction precision. The experiments show that air quality can be inferred with high accuracy from the data obtained from urban sensing. PMID:26761008
PET-CT image fusion using random forest and à-trous wavelet transform.
Seal, Ayan; Bhattacharjee, Debotosh; Nasipuri, Mita; Rodríguez-Esparragón, Dionisio; Menasalvas, Ernestina; Gonzalo-Martin, Consuelo
2018-03-01
New image fusion rules for multimodal medical images are proposed in this work. Image fusion rules are defined by a random forest learning algorithm and a translation-invariant à-trous wavelet transform (AWT). The proposed method is threefold. First, source images are decomposed into approximation and detail coefficients using AWT. Second, random forest is used to choose pixels from the approximation and detail coefficients for forming the approximation and detail coefficients of the fused image. Lastly, inverse AWT is applied to reconstruct the fused image. All experiments have been performed on 198 slices of both computed tomography and positron emission tomography images of a patient. A traditional fusion method based on Mallat wavelet transform has also been implemented on these slices. A new image fusion performance measure along with 4 existing measures has been presented, which helps to compare the performance of 2 pixel level fusion methods. The experimental results clearly indicate that the proposed method outperforms the traditional method in terms of visual and quantitative qualities and the new measure is meaningful.
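The decomposition and reconstruction steps (the first and third of the three above) can be sketched in one dimension. This is an illustration under stated assumptions, not the authors' code: it assumes the commonly used B3-spline smoothing kernel and clamped borders, works on a 1D signal rather than an image, and leaves out the random forest pixel-selection step entirely. The function names are hypothetical.

```python
# Assumed smoothing kernel (B3 spline); the abstract does not name the filter.
B3_SPLINE = (1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16)


def smooth_with_holes(signal, level):
    """Convolve with the kernel whose taps are 2**level samples apart."""
    step = 2 ** level
    last = len(signal) - 1
    smoothed = []
    for i in range(len(signal)):
        total = 0.0
        for tap, weight in enumerate(B3_SPLINE):
            j = i + (tap - 2) * step
            j = min(max(j, 0), last)  # clamp indices at the borders
            total += weight * signal[j]
        smoothed.append(total)
    return smoothed


def atrous_decompose(signal, levels):
    """Return (details, approximation); every plane keeps the input length."""
    approximation = [float(v) for v in signal]
    details = []
    for level in range(levels):
        smoother = smooth_with_holes(approximation, level)
        details.append([a - s for a, s in zip(approximation, smoother)])
        approximation = smoother
    return details, approximation


def atrous_reconstruct(details, approximation):
    """Inverse transform: add all detail planes back onto the approximation."""
    rebuilt = list(approximation)
    for plane in details:
        rebuilt = [r + d for r, d in zip(rebuilt, plane)]
    return rebuilt
```

Because nothing is downsampled, every coefficient plane has the same length as the input, which is what makes the transform translation-invariant and lets a fusion rule pick coefficients position by position.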
GPURFSCREEN: a GPU based virtual screening tool using random forest classifier.
Jayaraj, P B; Ajay, Mathias K; Nufail, M; Gopakumar, G; Jaleel, U C A
2016-01-01
In-silico methods are an integral part of the modern drug discovery paradigm. Virtual screening, an in-silico method, is used to refine data models and reduce the chemical space on which wet lab experiments need to be performed. Virtual screening of a ligand data model requires large scale computations, making it a highly time consuming task. This process can be sped up by implementing parallelized algorithms on a Graphical Processing Unit (GPU). Random Forest is a robust classification algorithm that can be employed in virtual screening. A ligand based virtual screening tool (GPURFSCREEN) that uses random forests on GPU systems has been proposed and evaluated in this paper. This tool produces optimized results at a lower execution time for large bioassay data sets. The quality of results produced by our tool on GPU is the same as that in a regular serial environment. Considering the magnitude of data to be screened, the parallelized virtual screening has a significantly lower running time at high throughput. The proposed parallel tool outperforms its serial counterpart by successfully screening billions of molecules in training and prediction phases.
Methodological reporting of randomized trials in five leading Chinese nursing journals.
Shi, Chunhu; Tian, Jinhui; Ren, Dan; Wei, Hongli; Zhang, Lihuan; Wang, Quan; Yang, Kehu
2014-01-01
Randomized controlled trials (RCTs) are not always well reported, especially in terms of their methodological descriptions. This study aimed to investigate the adherence of methodological reporting to CONSORT and to explore associated trial-level variables in the Chinese nursing care field. In June 2012, we identified RCTs published in five leading Chinese nursing journals and included trials with details of randomized methods. The quality of methodological reporting was measured through the methods section of the CONSORT checklist and the overall CONSORT methodological items score was calculated and expressed as a percentage. Meanwhile, we hypothesized that some general and methodological characteristics were associated with reporting quality and conducted a regression with these data to explore the correlation. The descriptive and regression statistics were calculated via SPSS 13.0. In total, 680 RCTs were included. The overall CONSORT methodological items score was 6.34 ± 0.97 (Mean ± SD). No RCT reported descriptions and changes in "trial design," changes in "outcomes" and "implementation," or descriptions of the similarity of interventions for "blinding." Poor reporting was found in detailing the "settings of participants" (13.1%), "type of randomization sequence generation" (1.8%), calculation methods of "sample size" (0.4%), explanation of any interim analyses and stopping guidelines for "sample size" (0.3%), "allocation concealment mechanism" (0.3%), additional analyses in "statistical methods" (2.1%), and targeted subjects and methods of "blinding" (5.9%). More than 50% of trials described randomization sequence generation, the eligibility criteria of "participants," "interventions," and definitions of the "outcomes" and "statistical methods." The regression analysis found that publication year and ITT analysis were weakly associated with CONSORT score. 
The completeness of methodological reporting of RCTs in the Chinese nursing care field is poor, especially with regard to the reporting of trial design, changes in outcomes, sample size calculation, allocation concealment, blinding, and statistical methods.
Monitoring tropical forest degradation using time series analysis of Landsat and Sentinel-2 data
NASA Astrophysics Data System (ADS)
Bullock, E.; Woodcock, C. E.
2017-12-01
Tropical forest loss is expected to contribute 5 to 15% of anthropogenic carbon emissions in the coming century. The wide range of expected emissions is indicative of the large uncertainties that exist in the terrestrial carbon cycle. Total carbon loss from forest conversion consists of loss from deforestation plus loss from degradation. There have been significant improvements in the ability to relate plot-level estimates of carbon stocks to remote sensing-derived calculations of deforestation to estimate total carbon emissions from forest loss. These approaches, however, have been limited in their ability to assess the magnitude, extent, and overall impact of forest degradation. The causes of tropical degradation include selective logging, fuel wood collection, fires, and the development of forest plantations. This study demonstrates a newly developed methodology for detecting subtle changes in forest structure and condition using time series analysis of Landsat and Sentinel-2 data. The research shows how the ability to detect small changes in forest biomass, in addition to changes in forest composition, can be improved by incorporating historical context and multi-sensor data fusion. Results are demonstrated from two climatically unique tropical forests in Thailand and Brazil.
Armijo-Olivo, Susan; Cummings, Greta G.; Amin, Maryam; Flores-Mir, Carlos
2017-01-01
Objectives: To examine the risks of bias, risks of random errors, reporting quality, and methodological quality of randomized clinical trials of oral health interventions and the development of these aspects over time. Methods: We included 540 randomized clinical trials from 64 selected systematic reviews. We extracted, in duplicate, details from each of the selected randomized clinical trials with respect to publication and trial characteristics, reporting and methodologic characteristics, and Cochrane risk of bias domains. We analyzed data using logistic regression and Chi-square statistics. Results: Sequence generation was assessed to be inadequate (at unclear or high risk of bias) in 68% (n = 367) of the trials, while allocation concealment was inadequate in the majority of trials (n = 464; 85.9%). Blinding of participants and blinding of the outcome assessment were judged to be inadequate in 28.5% (n = 154) and 40.5% (n = 219) of the trials, respectively. A sample size calculation before the initiation of the study was not performed/reported in 79.1% (n = 427) of the trials, while the sample size was assessed as adequate in only 17.6% (n = 95) of the trials. Two thirds of the trials were not described as double blinded (n = 358; 66.3%), while the method of blinding was appropriate in 53% (n = 286) of the trials. We identified a significant decrease over time (1955–2013) in the proportion of trials assessed as having inadequately addressed methodological quality items (P < 0.05) in 30 out of the 40 quality criteria, or as being inadequate (at high or unclear risk of bias) in five domains of the Cochrane risk of bias tool: sequence generation, allocation concealment, incomplete outcome data, other sources of bias, and overall risk of bias. 
Conclusions: The risks of bias, risks of random errors, reporting quality, and methodological quality of randomized clinical trials of oral health interventions have improved over time; however, further efforts that contribute to the development of more stringent methodology and detailed reporting of trials are still needed. PMID:29272315
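The percentages in this abstract can be recomputed from the stated counts and the total of 540 trials. The short check below uses only the numbers printed in the abstract; the item labels, function names and the 0.1 percentage-point tolerance are assumptions made for this sketch.

```python
TOTAL_TRIALS = 540

# (item, count of trials, percentage as printed in the abstract)
REPORTED = [
    ("sequence generation inadequate", 367, 68.0),
    ("allocation concealment inadequate", 464, 85.9),
    ("participant blinding inadequate", 154, 28.5),
    ("outcome assessment blinding inadequate", 219, 40.5),
    ("sample size calculation not reported", 427, 79.1),
    ("sample size adequate", 95, 17.6),
    ("not described as double blinded", 358, 66.3),
    ("method of blinding appropriate", 286, 53.0),
]


def percentage(count, total=TOTAL_TRIALS):
    """Share of trials as a percentage of the total."""
    return 100.0 * count / total


def discrepancies(rows, tolerance=0.1):
    """Rows whose recomputed percentage is off by more than the tolerance."""
    return [
        (name, count, printed, round(percentage(count), 2))
        for name, count, printed in rows
        if abs(percentage(count) - printed) > tolerance
    ]
```

With this tolerance, every printed percentage agrees with its count. The largest gap is for outcome assessment blinding, where 219 of 540 is about 40.56% and the abstract prints 40.5%.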
Saltaji, Humam; Armijo-Olivo, Susan; Cummings, Greta G; Amin, Maryam; Flores-Mir, Carlos
2017-01-01
To examine the risks of bias, risks of random errors, reporting quality, and methodological quality of randomized clinical trials of oral health interventions and the development of these aspects over time. We included 540 randomized clinical trials from 64 selected systematic reviews. We extracted, in duplicate, details from each of the selected randomized clinical trials with respect to publication and trial characteristics, reporting and methodologic characteristics, and Cochrane risk of bias domains. We analyzed data using logistic regression and Chi-square statistics. Sequence generation was assessed to be inadequate (at unclear or high risk of bias) in 68% (n = 367) of the trials, while allocation concealment was inadequate in the majority of trials (n = 464; 85.9%). Blinding of participants and blinding of the outcome assessment were judged to be inadequate in 28.5% (n = 154) and 40.5% (n = 219) of the trials, respectively. A sample size calculation before the initiation of the study was not performed/reported in 79.1% (n = 427) of the trials, while the sample size was assessed as adequate in only 17.6% (n = 95) of the trials. Two thirds of the trials were not described as double blinded (n = 358; 66.3%), while the method of blinding was appropriate in 53% (n = 286) of the trials. We identified a significant decrease over time (1955-2013) in the proportion of trials assessed as having inadequately addressed methodological quality items (P < 0.05) in 30 out of the 40 quality criteria, or as being inadequate (at high or unclear risk of bias) in five domains of the Cochrane risk of bias tool: sequence generation, allocation concealment, incomplete outcome data, other sources of bias, and overall risk of bias. 
The risks of bias, risks of random errors, reporting quality, and methodological quality of randomized clinical trials of oral health interventions have improved over time; however, further efforts that contribute to the development of more stringent methodology and detailed reporting of trials are still needed.
Multiscale habitat use and selection in cooperatively breeding Micronesian kingfishers
Kesler, D.C.; Haig, S.M.
2007-01-01
Information about the interaction between behavior and landscape resources is key to directing conservation management for endangered species. We studied multi-scale occurrence, habitat use, and selection in a cooperatively breeding population of Micronesian kingfishers (Todiramphus cinnamominus) on the island of Pohnpei, Federated States of Micronesia. At the landscape level, point-transect surveys resulted in kingfisher detection frequencies that were higher than those reported in 1994, although they remained 15-40% lower than 1983 indices. Integration of spatially explicit vegetation information with survey results indicated that kingfisher detections were positively associated with the amount of wet forest and grass-urban vegetative cover, and they were negatively associated with agricultural forest, secondary vegetation, and upland forest cover types. We used radiotelemetry and remote sensing to evaluate habitat use by individual kingfishers at the home-range scale. A comparison of habitats in Micronesian kingfisher home ranges with those in randomly placed polygons illustrated that birds used more forested areas than were randomly available in the immediate surrounding area. Further, members of cooperatively breeding groups included more forest in their home ranges than birds in pair-breeding territories, and forested portions of study areas appeared to be saturated with territories. Together, these results suggested that forest habitats were limited for Micronesian kingfishers. Thus, protecting and managing forests is important for the restoration of Micronesian kingfishers to the island of Guam (United States Territory), where they are currently extirpated, as well as to maintaining kingfisher populations on the islands of Pohnpei and Palau. Results further indicated that limited forest resources may restrict dispersal opportunities and, therefore, play a role in delayed dispersal and cooperative behaviors in Micronesian kingfishers.
Carbon Budget and its Dynamics over Northern Eurasia Forest Ecosystems
NASA Astrophysics Data System (ADS)
Shvidenko, Anatoly; Schepaschenko, Dmitry; Kraxner, Florian; Maksyutov, Shamil
2016-04-01
The presentation contains an overview of recent findings and results of assessment of carbon cycling of forest ecosystems of Northern Eurasia. From a methodological point of view, there is a clear trend towards recognizing the need for a Full and Verified Carbon Account (FCA), i.e. for reliable assessment of uncertainties for all modules and all stages of the FCA. The FCA is considered as a fuzzy (underspecified) system that requires a systems integration of the major methods of carbon cycling study (landscape-ecosystem approach, LEA; process-based models; eddy covariance; and inverse modelling). The landscape-ecosystem approach 1) serves for accumulation of all relevant knowledge of landscapes and ecosystems; 2) serves for the strict systems design of the account; 3) contains all relevant spatially distributed empirical and semi-empirical data and models; and 4) is presented in the form of an Integrated Land Information System (ILIS). The ILIS includes a hybrid land cover in a spatially and temporally explicit way and corresponding attributive databases. The forest mask is provided by utilizing multi-sensor remote sensing data, geographically weighted regression and validation within the GEO-wiki platform. By-pixel parametrization of forest cover is based on special optimization algorithms using all available knowledge and information sources (data of forest inventory and different surveys, observations in situ, official statistics of forest management etc.). Major carbon fluxes within the LEA (NPP, HR, disturbances etc.) are estimated based on fusion of empirical data and aggregations with process-based elements by sets of regionally distributed models. Uncertainties within the LEA are assessed for each module and at each step of the account. The within-method results of the LEA and the corresponding uncertainties are harmonized and mutually constrained with independent outputs received by other methods based on the Bayesian approach.
The above methodology has been applied to the carbon account of Russian forests for 2000-2012. The Net Ecosystem Carbon Budget (NECB) of Russian forests for this period was in the range of 0.5-0.7 Pg C yr-1, with a slight negative trend during the period due to acceleration of disturbance regimes and negative impacts of weather extremes (heat waves etc.). Uncertainties of the FCA for individual years were estimated at about 25% (CI 0.9). Some models (e.g. the majority of DGVMs) do not describe some processes on permafrost satisfactorily, while results of applications of ensembles of inverse models are on average close to empirical assessments. The most important conclusion from this experience is that future improvement of knowledge of carbon cycling of Northern Eurasia forests requires development of an integrated observing system as a unified information background, as well as systemic methodological improvements of all methods of studying carbon cycling.
Modelling above Ground Biomass of Mangrove Forest Using SENTINEL-1 Imagery
NASA Astrophysics Data System (ADS)
Labadisos Argamosa, Reginald Jay; Conferido Blanco, Ariel; Balidoy Baloloy, Alvin; Gumbao Candido, Christian; Lovern Caboboy Dumalag, John Bart; Carandang Dimapilis, Lady Lee; Camero Paringit, Enrico
2018-04-01
Many studies have been conducted in the estimation of forest above ground biomass (AGB) using features from synthetic aperture radar (SAR). Specifically, L-band ALOS/PALSAR (wavelength 23 cm) data is often used. However, few studies have been made on the use of shorter wavelengths (e.g., C-band, 3.75 cm to 7.5 cm) for forest mapping especially in tropical forests since higher attenuation is observed for volumetric objects where energy propagated is absorbed. This study aims to model AGB estimates of mangrove forest using information derived from Sentinel-1 C-band SAR data. Combinations of polarisations (VV, VH), its derivatives, grey level co-occurrence matrix (GLCM), and its principal components were used as features for modelling AGB. Five models were tested with varying combinations of features; a) sigma nought polarisations and its derivatives; b) GLCM textures; c) the first five principal components; d) combination of models a-c; and e) the identified important features by Random Forest variable importance algorithm. Random Forest was used as regressor to compute for the AGB estimates to avoid over fitting caused by the introduction of too many features in the model. Model e obtained the highest r2 of 0.79 and an RMSE of 0.44 Mg using only four features, namely, σ°VH GLCM variance, σ°VH GLCM contrast, PC1, and PC2. This study shows that Sentinel-1 C-band SAR data could be used to produce acceptable AGB estimates in mangrove forest to compensate for the unavailability of longer wavelength SAR.
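The two scores quoted in the abstract above can be computed as in the following minimal sketch. It assumes that "r2" denotes the coefficient of determination (the abstract does not say whether it is that or a squared correlation), and the observed and predicted values are made-up illustrative numbers, not data from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(observed))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical AGB values (illustration only).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 1.5, 3.5, 3.5]

model_rmse = rmse(observed, predicted)
model_r2 = r_squared(observed, predicted)
```

Comparing candidate feature sets, as the study does for models a to e, would amount to computing these two scores for each model's predictions on the same reference plots.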
ERIC Educational Resources Information Center
Wong, Vivian C.; Steiner, Peter M.
2015-01-01
Across the disciplines of economics, political science, public policy, and now, education, the randomized controlled trial (RCT) is the preferred methodology for establishing causal inference about program impacts. But randomized experiments are not always feasible because of ethical, political, and/or practical considerations, so non-experimental…
David C. Chojnacky; Randolph H. Wynne; Christine E. Blinn
2009-01-01
Methodology is lacking to easily map Forest Inventory and Analysis (FIA) inventory statistics for all attribute variables without having to develop separate models and methods for each variable. We developed a mapping method that can directly transfer tabular data to a map on which pixels can be added any way desired to estimate carbon (or any other variable) for a...
John B. Loomis; Armando González-Cabán; Robin Gregory
1996-01-01
A contingent valuation methodology was applied to old-growth forests and critical habitat units for the Northern Spotted Owl in Oregon to estimate the economic value to the public in knowing that rare and unique ecosystems will be protected from fire for current and future generations. Generalizing to the whole state, the total annual willingness-to-pay of Oregon...
Chapter 3 - At the roadside: Forest resources
Bryce Stokes; Timothy G. Rials; Leonard R. Johnson; Karen L. Abt; Prakash Nepal; Kenneth E. Skog; Robert C. Abt; Lixia He; Burton C. English
2016-01-01
Chapter 3 assesses the availability of forest resources to the roadside. Not all woody feedstocks are discussed in this chapter. Logging residues and wholetree biomass are included. Other feedstock categories have been moved to chapter 5 or are redefined to be included in the whole-tree biomass category. New methodologies and data are used in the assessment to
TOC and TRIZ: using a dual-methodological approach to solve a forest harvesting problem
Ian Conradie
2005-01-01
Although cut-to-length forest harvesting with harvesters and forwarders is hardly used in some parts of the world, it has many advantages over conventional harvesting systems. Research has shown that the core reason for the low adoption of CTL in the southeastern USA is the complexity of the equipment to optimize value recovery. In this paper we delve deeper into this...
Bonnie Ruefenacht; Robert Benton; Vicky Johnson; Tanushree Biswas; Craig Baker; Mark Finco; Kevin Megown; John Coulston; Ken Winterberger; Mark Riley
2015-01-01
A tree canopy cover (TCC) layer is one of three elements in the National Land Cover Database (NLCD) 2011 suite of nationwide geospatial data layers. In 2010, the USDA Forest Service (USFS) committed to creating the TCC layer as a member of the Multi-Resolution Land Cover (MRLC) consortium. A general methodology for creating the TCC layer was reported at the 2012 FIA...
Trade-associated pathways of alien forest insect entries in Canada
Denys Yemshanov; Frank H. Koch; Mark Ducey; Klaus Koehler
2012-01-01
Long-distance introductions of new invasive species have often been driven by socioeconomic factors, such that traditional “biological” invasion models may not be capable of estimating spread fully and reliably. In this study we present a new methodology to characterize and predict pathways of human-assisted entries of alien forest insects. We have developed a...
Bruce Sims; Jim Piatt; Lee Johnson; Carol Purchase; John Phillips
1996-01-01
Personnel on the Santa Fe National Forest used methodologies adapted from Bevenger and King (1995) to collect base line particle size data on streams within grazing allotments currently scheduled for permit reissuance. This information was used to determine the relative current health of the watersheds as well as being used in the development of potential alternatives...
Integrating climate change criteria in reforestation projects using a hybrid decision-support system
NASA Astrophysics Data System (ADS)
Curiel-Esparza, Jorge; Gonzalez-Utrillas, Nuria; Canto-Perello, Julian; Martin-Utrillas, Manuel
2015-09-01
The selection of appropriate species in a reforestation project has always been a complex decision-making problem in which, due mostly to government policies and other stakeholders, not only economic criteria but also other environmental issues interact. Climate change has not usually been taken into account in traditional reforestation decision-making strategies and management procedures. Moreover, there is a lack of agreement on the percentage of each one of the species in reforestation planning, which is usually calculated in a discretionary way. In this context, an effective multicriteria technique has been developed in order to improve the process of selecting species for reforestation in the Mediterranean region of Spain. A hybrid Delphi-AHP methodology is proposed, which includes a consistency analysis in order to reduce random choices. As a result, this technique provides an optimal percentage distribution of the appropriate species to be used in reforestation planning. The highest weights among the subcriteria corresponded to FR (fire forest response) and PR (pests and diseases risk), because of the increasing importance of the impact of climate change on the forest. However, CB (conservation of biodiversity) was in third position, in line with the aim of reforestation. Therefore, the most suitable species were Quercus faginea (19.75%) and Quercus ilex (19.35%), which offer a good balance between all the factors affecting the success and viability of reforestation.
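As a rough illustration of the consistency analysis mentioned in the abstract above, the sketch below derives AHP priority weights from a pairwise comparison matrix by power iteration and computes Saaty's consistency ratio. The 3x3 matrix of judgments is hypothetical and is not taken from the study; only the criterion labels FR, PR and CB come from the abstract, and the function names are illustrative.

```python
# Hypothetical pairwise comparison matrix (illustrative judgments only).
# A[i][j] states how much more important criterion i is than criterion j
# on Saaty's 1-9 scale; A[j][i] is its reciprocal.
CRITERIA = ["FR", "PR", "CB"]
A = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]

# Saaty's random consistency index for matrix orders 3 to 5.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12}


def ahp_weights(matrix, iterations=100):
    """Approximate the principal eigenvector and eigenvalue by power iteration."""
    n = len(matrix)
    w = [1.0 / n] * n
    for _ in range(iterations):
        v = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(v)
        w = [x / total for x in v]
    # Estimate lambda_max as the mean of (A w)_i / w_i.
    aw = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / w[i] for i in range(n)) / n
    return w, lambda_max


def consistency_ratio(matrix):
    """Consistency ratio CR = CI / RI, with CI = (lambda_max - n) / (n - 1)."""
    n = len(matrix)
    _, lambda_max = ahp_weights(matrix)
    consistency_index = (lambda_max - n) / (n - 1)
    return consistency_index / RANDOM_INDEX[n]


weights, lambda_max = ahp_weights(A)
cr = consistency_ratio(A)
```

The example matrix is perfectly consistent, so the ratio is zero up to rounding. With real expert judgments the ratio is positive, and a value below 0.10 is the conventional threshold for accepting the comparisons.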
Analysis of forest disturbance using TM and AVHRR data
NASA Technical Reports Server (NTRS)
Spanner, Michael A.; Hlavka, Christine A.; Pierce, Lars L.
1989-01-01
A methodology that will be used to determine the proportions of undisturbed, successional vegetation and recently disturbed land cover within coniferous forests using remotely sensed data from the advanced very high resolution radiometer (AVHRR) is presented. The method uses thematic mapper (TM) data to determine the proportions of the three stages of forest disturbance and regrowth for each AVHRR pixel in the sample areas, and is then applied to interpret all AVHRR imagery. Preliminary results indicate that there are predictable relationships between TM spectral response and the disturbance classes. Analysis of ellipse plots from a TM classification of the disturbed forested landscape indicates that the forest classes are separable in the red (0.63-0.69 micron) and near-infrared (0.76-0.90 micron) bands, providing evidence that the proportion of disturbance classes may be determined from AVHRR data.
Shape selection in Landsat time series: a tool for monitoring forest dynamics.
Moisen, Gretchen G; Meyer, Mary C; Schroeder, Todd A; Liao, Xiyue; Schleeweis, Karen G; Freeman, Elizabeth A; Toney, Chris
2016-10-01
We present a new methodology for fitting nonparametric shape-restricted regression splines to time series of Landsat imagery for the purpose of modeling, mapping, and monitoring annual forest disturbance dynamics over nearly three decades. For each pixel and spectral band or index of choice in temporal Landsat data, our method delivers a smoothed rendition of the trajectory constrained to behave in an ecologically sensible manner, reflecting one of seven possible 'shapes'. It also provides parameters summarizing the patterns of each change including year of onset, duration, magnitude, and pre- and postchange rates of growth or recovery. Through a case study featuring fire, harvest, and bark beetle outbreak, we illustrate how resultant fitted values and parameters can be fed into empirical models to map disturbance causal agent and tree canopy cover changes coincident with disturbance events through time. We provide our code in the R package ShapeSelectForest on the Comprehensive R Archive Network and describe our computational approaches for running the method over large geographic areas. We also discuss how this methodology is currently being used for forest disturbance and attribute mapping across the conterminous United States. Published 2016. This article is a U.S. Government work and is in the public domain in the USA.
Use of DNA markers in forest tree improvement research
D.B. Neale; M.E. Devey; K.D. Jermstad; M.R. Ahuja; M.C. Alosi; K.A. Marshall
1992-01-01
DNA markers are rapidly being developed for forest trees. The most important markers are restriction fragment length polymorphisms (RFLPs), polymerase chain reaction- (PCR) based markers such as random amplified polymorphic DNA (RAPD), and fingerprinting markers. DNA markers can supplement isozyme markers for monitoring tree improvement activities such as; estimating...
Cooperative forestry inventory project for Nevada
NASA Technical Reports Server (NTRS)
Thornhill, R.
1981-01-01
A forest inventory project employing computerized classification of LANDSAT data to inventory vegetation types in western Nevada is described. The methodology and applicability of the resulting survey are summarized.
Bee (Hymenoptera: Apoidea) Diversity and Sampling Methodology in a Midwestern USA Deciduous Forest.
McCravy, Kenneth W; Ruholl, Jared D
2017-08-04
Forests provide potentially important bee habitat, but little research has been done on forest bee diversity and the relative effectiveness of bee sampling methods in this environment. Bee diversity and sampling methodology were studied in an Illinois, USA upland oak-hickory forest using elevated and ground-level pan traps, malaise traps, and vane traps. 854 bees and 55 bee species were collected. Elevated pan traps collected the greatest number of bees (473), but ground-level pan traps collected greater species diversity (based on Simpson's diversity index) than did elevated pan traps. Elevated and ground-level pan traps collected the greatest bee species richness, with 43 and 39 species, respectively. An estimated sample size increase of over 18-fold would be required to approach minimum asymptotic richness using ground-level pan traps. Among pan trap colors/elevations, elevated yellow pan traps collected the greatest number of bees (266) but the lowest diversity. Malaise traps were relatively ineffective, collecting only 17 bees. Vane traps collected relatively low species richness (14 species), and Chao1 and abundance coverage estimators suggested that minimum asymptotic species richness was approached for that method. Bee species composition differed significantly between elevated pan traps, ground-level pan traps, and vane traps. Indicator species were significantly associated with each of these trap types, as well as with particular pan trap colors/elevations. These results indicate that Midwestern deciduous forests provide important bee habitat, and that the performance of common bee sampling methods varies substantially in this environment.
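The diversity and richness measures named in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions: Simpson's index is taken in its Gini-Simpson form (1 minus the sum of squared proportions), Chao1 is taken in its bias-corrected form, and the per-species counts are invented for the example rather than drawn from the study.

```python
def simpson_diversity(counts):
    """Gini-Simpson diversity: 1 - sum(p_i^2) over species proportions p_i."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def chao1(counts):
    """Bias-corrected Chao1 estimate of minimum asymptotic species richness.

    S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)), where F1 and F2 are the numbers
    of species observed exactly once (singletons) and twice (doubletons).
    """
    s_obs = sum(1 for c in counts if c > 0)
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


# Hypothetical individuals caught per species by one trap type.
trap_counts = [10, 5, 2, 2, 1, 1, 1]

diversity = simpson_diversity(trap_counts)
richness_estimate = chao1(trap_counts)
```

When the Chao1 estimate is close to the observed number of species, as the abstract reports for vane traps, additional sampling with that method is expected to add few new species.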
Multi-Temporal Classification and Change Detection Using Uav Images
NASA Astrophysics Data System (ADS)
Makuti, S.; Nex, F.; Yang, M. Y.
2018-05-01
In this paper different methodologies for the classification and change detection of UAV image blocks are explored. UAV is not only the cheapest platform for image acquisition but it is also the easiest platform to operate in repeated data collections over a changing area like a building construction site. Two change detection techniques have been evaluated in this study: the pre-classification and the post-classification algorithms. These methods are based on three main steps: feature extraction, classification and change detection. A set of state-of-the-art features have been used in the tests: colour features (HSV), textural features (GLCM) and 3D geometric features. For classification purposes Conditional Random Field (CRF) has been used: the unary potential was determined using the Random Forest algorithm while the pairwise potential was defined by the fully connected CRF. In the performed tests, different feature configurations and settings have been considered to assess the performance of these methods in such a challenging task. Experimental results showed that the post-classification approach outperforms the pre-classification change detection method. This was analysed using the overall accuracy: post-classification reached an accuracy of up to 62.6%, whereas pre-classification change detection reached 46.5%. These results represent a first useful indication for future works and developments.
Holliday, Jason A; Wang, Tongli; Aitken, Sally
2012-09-01
Climate is the primary driver of the distribution of tree species worldwide, and the potential for adaptive evolution will be an important factor determining the response of forests to anthropogenic climate change. Although association mapping has the potential to improve our understanding of the genomic underpinnings of climatically relevant traits, the utility of adaptive polymorphisms uncovered by such studies would be greatly enhanced by the development of integrated models that account for the phenotypic effects of multiple single-nucleotide polymorphisms (SNPs) and their interactions simultaneously. We previously reported the results of association mapping in the widespread conifer Sitka spruce (Picea sitchensis). In the current study we used the recursive partitioning algorithm 'Random Forest' to identify optimized combinations of SNPs to predict adaptive phenotypes. After adjusting for population structure, we were able to explain 37% and 30% of the phenotypic variation, respectively, in two locally adaptive traits--autumn budset timing and cold hardiness. For each trait, the leading five SNPs captured much of the phenotypic variation. To determine the role of epistasis in shaping these phenotypes, we also used a novel approach to quantify the strength and direction of pairwise interactions between SNPs and found such interactions to be common. Our results demonstrate the power of Random Forest to identify subsets of markers that are most important to climatic adaptation, and suggest that interactions among these loci may be widespread.
Hydrological modelling for flood forecasting: Calibrating the post-fire initial conditions
NASA Astrophysics Data System (ADS)
Papathanasiou, C.; Makropoulos, C.; Mimikou, M.
2015-10-01
Floods and forest fires are two of the most devastating natural hazards with severe socioeconomic, environmental as well as aesthetic impacts on the affected areas. Traditionally, these hazards are examined from different perspectives and are thus investigated through different, independent systems, overlooking the fact that they are tightly interrelated phenomena. In fact, the same flood event is more severe, i.e. associated with increased runoff discharge and peak flow and decreased time to peak, if it occurs over a burnt area than if it occurs over land not affected by fire. Mediterranean periurban areas, where forests covered with flammable vegetation coexist with agricultural land and urban zones, are typical areas particularly prone to the combined impact of floods and forest fires. Hence, the accurate assessment and effective management of post-fire flood risk becomes an issue of priority. The research presented in this paper aims to develop a robust methodological framework, using state-of-the-art tools and modern technologies to support the estimation of the change in time of five representative hydrological parameters for post-fire conditions. The proposed methodology considers both longer- and short-term initial conditions in order to assess the dynamic evolution of the selected parameters. The research focuses on typical Mediterranean periurban areas that are subjected to both hazards and concludes with a set of equations that associate post-fire and pre-fire conditions for five Fire Severity (FS) classes and three soil moisture states. The methodology has been tested for several flood events on the Rafina catchment, a periurban catchment in Eastern Attica (Greece). In order to validate the methodology, simulated hydrographs were produced and compared against available observed data. Results indicate a close convergence of observed and simulated flows.
The proposed methodology is particularly flexible and thus easily adaptable to catchments with similar hydrometeorological and geomorphological features.
National Satellite Forest Monitoring systems for REDD+
NASA Astrophysics Data System (ADS)
Jonckheere, I. G.
2012-12-01
Reducing Emissions from Deforestation and Forest Degradation (REDD) is an effort to create a financial value for the carbon stored in forests, offering incentives for developing countries to reduce emissions from forested lands and invest in low-carbon paths to sustainable development. "REDD+" goes beyond deforestation and forest degradation, and includes the role of conservation, sustainable management of forests and enhancement of forest carbon stocks. In the framework of getting countries ready for REDD+, the UN-REDD Programme assists developing countries to prepare and implement national REDD+ strategies. For the monitoring, reporting and verification, FAO supports the countries to develop national satellite forest monitoring systems that allow for credible measurement, reporting and verification (MRV) of REDD+ activities. These are among the most critical elements for the successful implementation of any REDD+ mechanism. The UN-REDD Programme through a joint effort of FAO and Brazil's National Space Agency, INPE, is supporting countries to develop cost-effective, robust and compatible national monitoring and MRV systems, providing tools, methodologies, training and knowledge sharing that help countries to strengthen their technical and institutional capacity for effective MRV systems. To develop strong nationally-owned forest monitoring systems, technical and institutional capacity building is key. The UN-REDD Programme, through FAO, has taken on intensive training together with INPE, and has provided technical help and assistance for in-country training and implementation for national satellite forest monitoring. The goal of the support to UN-REDD pilot countries in this capacity building effort is the training of technical forest people and IT persons from interested REDD+ countries, and to set up the national satellite forest monitoring systems.
The Brazilian forest monitoring system, TerraAmazon, which is used as a basis for this initiative, allows countries to adapt it to country needs, and the training on the TerraAmazon system is a tool to enhance existing capacity on carbon monitoring systems. The support with the National Forest Monitoring System will allow these countries to follow all actions related to the implementation of their national REDD+ policies and measures. The monitoring system will work as a platform to obtain information on their REDD+ results and actions, related directly or indirectly to national REDD+ strategies, and may also include actions unrelated to carbon assessment, such as forest law enforcement. With the technical assistance of FAO, INPE and other stakeholders, the countries will set up an autonomous operational forest monitoring system. An initial version and the methodologies of the system for DRC and PNG were launched in Durban, South Africa during COP 17, and in 2012 the systems for Paraguay, Viet Nam and Zambia will be launched in Doha, Qatar at COP 18. Access to high-quality satellite data is crucial for these countries in setting up the systems.
Organic carbon stock modelling for the quantification of the carbon sinks in terrestrial ecosystems
NASA Astrophysics Data System (ADS)
Durante, Pilar; Algeet, Nur; Oyonarte, Cecilio
2017-04-01
Given the recent environmental policies derived from the serious threats caused by global change, practical measures to decrease net CO2 emissions have to be put in place. In this regard, carbon sequestration is a major measure to reduce atmospheric CO2 concentrations in the short and medium term, with terrestrial ecosystems playing a basic role as carbon sinks. Developing tools for the quantification, assessment and management of organic carbon in ecosystems at different scales and under different management scenarios is essential to achieve these commitments. The aim of this study is to establish a methodological framework for the modelling of such a tool, applied to sustainable land use planning and management at spatial and temporal scales. The methodology for carbon stock estimation in ecosystems is based on techniques that merge the carbon stored in soils and in aerial biomass. For this purpose, both a spatial variability map of soil organic carbon (SOC) and algorithms for calculating forest species biomass will be created. To model the SOC spatial distribution at different map scales, it is necessary to harmonize and screen the available information from legacy soil databases. Subsequently, SOC modelling will be based on the SCORPAN model, a quantitative model used to assess the correlation among soil-forming factors measured at the same site location. These factors will be selected from both static variables (terrain morphometric variables) and dynamic variables (climatic variables and NDVI vegetation indexes), giving the model its spatio-temporal character. After the predictive model is built, spatial inference techniques (automated random forest regression kriging) will be used to produce the final map and to extrapolate the data to areas without information. The estimated uncertainty will be calculated to assess the model performance at different scales.
Organic carbon in aerial biomass will be estimated using LiDAR (Light Detection And Ranging) algorithms applied to the available LiDAR databases. LiDAR statistics (which describe the LiDAR point cloud data used to calculate forest stand parameters) will be correlated with different canopy cover variables. The regression models applied to the total area will produce a continuous geo-information map for each canopy variable. The CO2 estimate will be calculated using dry-mass conversion factors for each forest species (kg C to kg CO2 equivalent). The result is organic carbon modelling at spatio-temporal scale, with different levels of uncertainty associated with the predictive models and with the different levels of map detail. However, one of the main expected problems is the heterogeneous spatial distribution of the soil information, which influences the predictions of the models at different spatial scales and, consequently, the SOC map scale. Besides this, the variability and mixture of forest species in the aerial biomass decrease the accuracy of the organic carbon assessment.
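The dry-mass conversion step described above can be sketched as follows. The carbon fractions and species keys are hypothetical placeholders, since the abstract gives no species-specific factors; the only fixed quantity is the molar mass ratio of CO2 to carbon (44/12).

```python
# Molar mass ratio of CO2 (44 g/mol) to carbon (12 g/mol).
CO2_PER_C = 44.0 / 12.0

# Hypothetical carbon fraction of dry biomass per species (placeholders,
# not values from the study).
CARBON_FRACTION = {
    "species_a": 0.50,
    "species_b": 0.48,
}


def co2_equivalent_kg(dry_biomass_kg, species):
    """Convert dry biomass (kg) to kg CO2 equivalent via its carbon content."""
    carbon_kg = dry_biomass_kg * CARBON_FRACTION[species]
    return carbon_kg * CO2_PER_C


# 1200 kg of dry biomass at a 0.50 carbon fraction holds 600 kg of carbon.
example_co2_kg = co2_equivalent_kg(1200.0, "species_a")
```

Applied per pixel to a biomass map, the same two multiplications would yield a CO2-equivalent map, with the uncertainty of the species-specific factor carried through alongside the biomass uncertainty.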
Review of Recent Methodological Developments in Group-Randomized Trials: Part 1—Design
Li, Fan; Gallis, John A.; Prague, Melanie; Murray, David M.
2017-01-01
In 2004, Murray et al. reviewed methodological developments in the design and analysis of group-randomized trials (GRTs). We have highlighted the developments of the past 13 years in design with a companion article to focus on developments in analysis. As a pair, these articles update the 2004 review. We have discussed developments in the topics of the earlier review (e.g., clustering, matching, and individually randomized group-treatment trials) and in new topics, including constrained randomization and a range of randomized designs that are alternatives to the standard parallel-arm GRT. These include the stepped-wedge GRT, the pseudocluster randomized trial, and the network-randomized GRT, which, like the parallel-arm GRT, require clustering to be accounted for in both their design and analysis. PMID:28426295
Review of Recent Methodological Developments in Group-Randomized Trials: Part 1-Design.
Turner, Elizabeth L; Li, Fan; Gallis, John A; Prague, Melanie; Murray, David M
2017-06-01
In 2004, Murray et al. reviewed methodological developments in the design and analysis of group-randomized trials (GRTs). We have highlighted the developments of the past 13 years in design with a companion article to focus on developments in analysis. As a pair, these articles update the 2004 review. We have discussed developments in the topics of the earlier review (e.g., clustering, matching, and individually randomized group-treatment trials) and in new topics, including constrained randomization and a range of randomized designs that are alternatives to the standard parallel-arm GRT. These include the stepped-wedge GRT, the pseudocluster randomized trial, and the network-randomized GRT, which, like the parallel-arm GRT, require clustering to be accounted for in both their design and analysis.
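To make concrete why clustering must be accounted for at the design stage, the sketch below uses the standard design effect for equal-sized groups, 1 + (m - 1) * ICC, to inflate an individually randomized sample size. This is a generic textbook calculation rather than a method taken from the review, and the group size, intraclass correlation and baseline sample size are hypothetical.

```python
import math


def design_effect(group_size, icc):
    """Variance inflation from clustering with equal group sizes."""
    return 1.0 + (group_size - 1) * icc


def groups_per_arm(n_individual, group_size, icc):
    """Groups needed per arm after inflating an individually randomized n."""
    inflated_n = n_individual * design_effect(group_size, icc)
    return math.ceil(inflated_n / group_size)


# Hypothetical inputs: 200 participants per arm would suffice under
# individual randomization; groups have 20 members; ICC is 0.05.
deff = design_effect(20, 0.05)
needed_groups = groups_per_arm(200, 20, 0.05)
```

Even a small intraclass correlation nearly doubles the required sample here, which is why designs such as the stepped-wedge GRT still need the clustering built into both the sample size calculation and the analysis.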
An empirical, integrated forest biomass monitoring system
NASA Astrophysics Data System (ADS)
Kennedy, Robert E.; Ohmann, Janet; Gregory, Matt; Roberts, Heather; Yang, Zhiqiang; Bell, David M.; Kane, Van; Hughes, M. Joseph; Cohen, Warren B.; Powell, Scott; Neeti, Neeti; Larrue, Tara; Hooper, Sam; Kane, Jonathan; Miller, David L.; Perkins, James; Braaten, Justin; Seidl, Rupert
2018-02-01
The fate of live forest biomass is largely controlled by growth and disturbance processes, both natural and anthropogenic. Thus, biomass monitoring strategies must characterize both the biomass of the forests at a given point in time and the dynamic processes that change it. Here, we describe and test an empirical monitoring system designed to meet those needs. Our system uses a mix of field data, statistical modeling, remotely-sensed time-series imagery, and small-footprint lidar data to build and evaluate maps of forest biomass. It ascribes biomass change to specific change agents, and attempts to capture the impact of uncertainty in methodology. We find that: • A common image framework for biomass estimation and for change detection allows for consistent comparison of both state and change processes controlling biomass dynamics. • Regional estimates of total biomass agree well with those from plot data alone. • The system tracks biomass densities up to 450-500 Mg ha-1 with little bias, but begins underestimating true biomass as densities increase further. • Scale considerations are important. Estimates at the 30 m grain size are noisy, but agreement at broad scales is good. Further investigation to determine the appropriate scales is underway. • Uncertainty from methodological choices is evident, but much smaller than uncertainty based on choice of allometric equation used to estimate biomass from tree data. • In this forest-dominated study area, growth and loss processes largely balance in most years, with loss processes dominated by human removal through harvest. In years with substantial fire activity, however, overall biomass loss greatly outpaces growth. Taken together, our methods represent a unique combination of elements foundational to an operational landscape-scale forest biomass monitoring program.
NASA Astrophysics Data System (ADS)
Hudak, A. T.; Crookston, N.; Kennedy, R. E.; Domke, G. M.; Fekety, P.; Falkowski, M. J.
2017-12-01
Commercial off-the-shelf lidar collections associated with tree measures in field plots allow aboveground biomass (AGB) estimation with high confidence. Predictive models developed from such datasets are used operationally to map AGB across lidar project areas. We use a random selection of these pixel-level AGB predictions as training for predicting AGB annually across Idaho and western Montana, primarily from Landsat time series imagery processed through LandTrendr. At both the landscape and regional scales, Random Forests is used for predictive AGB modeling. To project future carbon dynamics, we use Climate-FVS (Forest Vegetation Simulator), the tree growth engine used by foresters to inform forest planning decisions, under either constant or changing climate scenarios. Disturbance data compiled from LandTrendr (Kennedy et al. 2010) using TimeSync (Cohen et al. 2010) in forested lands of Idaho (n=509) and western Montana (n=288) are used to generate probabilities of disturbance (harvest, fire, or insect) by land ownership class (public, private) as well as the magnitude of disturbance. Our verification approach is to aggregate the regional, annual AGB predictions at the county level and compare them to annual county-level AGB summarized independently from systematic, field-based, annual inventories conducted by the US Forest Inventory and Analysis (FIA) Program nationally. This analysis shows that when federal lands are disturbed the magnitude is generally high and when other lands are disturbed the magnitudes are more moderate. The probability of disturbance in corporate lands is higher than in other lands but the magnitudes are generally lower. This is consistent with the much higher prevalence of fire and insects occurring on federal lands, and greater harvest activity on private lands. We found large forest carbon losses in drier southern Idaho, only partially offset by carbon gains in wetter northern Idaho, due to anticipated climate change. 
Public and private forest managers can use these forest carbon projections to 2117 to inform 2017 decisions on which tree species and seed sources to select for planting, and implement forest management strategies now that may seek to maximize forest carbon sequestration for greenhouse gas abatement a century from now.
Blencowe, Natalie S; Cook, Jonathan A; Pinkney, Thomas; Rogers, Chris; Reeves, Barnaby C; Blazeby, Jane M
2017-04-01
Randomized controlled trials in surgery are notoriously difficult to design and conduct due to numerous methodological and cultural challenges. Over the last 5 years, several UK-based surgical trial-related initiatives have been funded to address these issues. These include the development of Surgical Trials Centers and Surgical Specialty Leads (individual surgeons responsible for championing randomized controlled trials in their specialist fields), both funded by the Royal College of Surgeons of England; networks of research-active surgeons in training; and investment in methodological research relating to surgical randomized controlled trials (to address issues such as recruitment, blinding, and the selection and standardization of interventions). This article discusses these initiatives in more detail and provides exemplar cases to illustrate how the methodological challenges have been tackled. The initiatives have surpassed expectations, resulting in a renaissance in surgical research throughout the United Kingdom, such that the number of patients entering surgical randomized controlled trials has doubled.
E.M. (Ted) Bilek
2007-01-01
The model ChargeOut! was developed to determine charge-out rates or rates of return for machines and capital equipment. This paper introduces a costing methodology and applies it to a piece of capital equipment. Although designed for the forest industry, the methodology is readily transferable to other sectors. Based on discounted cash-flow analysis, ChargeOut!...
Calculation of Dynamic Loads Due to Random Vibration Environments in Rocket Engine Systems
NASA Technical Reports Server (NTRS)
Christensen, Eric R.; Brown, Andrew M.; Frady, Greg P.
2007-01-01
An important part of rocket engine design is the calculation of random dynamic loads resulting from internal engine "self-induced" sources. These loads are random in nature and can greatly influence the weight of many engine components. Several methodologies for calculating random loads are discussed and then compared to test results using a dynamic testbed consisting of a 60K thrust engine. The engine was tested in a free-free condition with known random force inputs from shakers attached to three locations near the main noise sources on the engine. Accelerations and strains were measured at several critical locations on the engines and then compared to the analytical results using two different random response methodologies.
Tools and methodologies to support more sustainable biofuel feedstock production.
Dragisic, Christine; Ashkenazi, Erica; Bede, Lucio; Honzák, Miroslav; Killeen, Tim; Paglia, Adriano; Semroc, Bambi; Savy, Conrad
2011-02-01
Increasingly, government regulations, voluntary standards, and company guidelines require that biofuel production complies with sustainability criteria. For some stakeholders, however, compliance with these criteria may seem complex, costly, or unfeasible. What existing tools, then, might facilitate compliance with a variety of biofuel-related sustainability criteria? This paper presents four existing tools and methodologies that can help stakeholders assess (and mitigate) potential risks associated with feedstock production, and can thus facilitate compliance with requirements under different requirement systems. These include the Integrated Biodiversity Assessment Tool (IBAT), the ARtificial Intelligence for Ecosystem Services (ARIES) tool, the Responsible Cultivation Areas (RCA) methodology, and the related Biofuels + Forest Carbon (Biofuel + FC) methodology.
Eric H. Wharton; Tiberius Cunia
1987-01-01
Proceedings of a workshop co-sponsored by the USDA Forest Service, the State University of New York, and the Society of American Foresters. Presented were papers on the methodology of sample tree selection, tree biomass measurement, construction of biomass tables and estimation of their error, and combining the error of biomass tables with that of the sample plots or...
Aspen, climate, and sudden decline in western USA
Gerald E. Rehfeldt; Dennis E. Ferguson; Nicholas L. Crookston
2009-01-01
A bioclimate model predicting the presence or absence of aspen, Populus tremuloides, in western USA from climate variables was developed by using the Random Forests classification tree on Forest Inventory data from about 118,000 permanent sample plots. A reasonably parsimonious model used eight predictors to describe aspen's climate profile. Classification errors...
Variation in Local-Scale Edge Effects: Mechanisms and Landscape Context
Therese M. Donovan; Peter W. Jones; Elizabeth M. Annand; Frank R. Thompson III
1997-01-01
Ecological processes near habitat edges often differ from processes away from edges. Yet, the generality of "edge effects" has been hotly debated because results vary tremendously. To understand the factors responsible for this variation, we described nest predation and cowbird distribution patterns in forest edge and forest core habitats on 36 randomly...
Mitigating budget constraints on visitation volume surveys: the case of U.S. National forests
Ashley E. Askew; Donald B.K. English; Stanley J. Zarnoch; Neelam C. Poudyal; J.M. Bowker
2014-01-01
Stratified random sampling (SRS) provides a scientifically based estimate of a population comprising mutually exclusive, homogenous subgroups. In the National Visitor Use Monitoring (NVUM) program, SRS is used to estimate recreation visitation and visitor characteristics across activities on National forests. However, with rising costs and declining budgets, carrying...
Jerry J. Vaske; Maureen P. Donnelly; Daniel R. Williams; Sandra Jonker
2001-01-01
Using the cognitive hierarchy as the theoretical foundation, this article examines the predictive influence of individuals' demographic characteristics on environmental value orientations and normative beliefs about national forest management. Data for this investigation were obtained from a random sample of Colorado residents (n = 960). As predicted by theory, a...
In Defense of the Randomized Controlled Trial for Health Promotion Research
Rosen, Laura; Manor, Orly; Engelhard, Dan; Zucker, David
2006-01-01
The overwhelming evidence about the role lifestyle plays in mortality, morbidity, and quality of life has pushed the young field of modern health promotion to center stage. The field is beset with intense debate about appropriate evaluation methodologies. Increasingly, randomized designs are considered inappropriate for health promotion research. We have reviewed criticisms against randomized trials that raise philosophical and practical issues, and we will show how most of these criticisms can be overcome with minor design modifications. By providing rebuttal to arguments against randomized trials, our work contributes to building a sound methodological base for health promotion research. PMID:16735622
Subtyping cognitive profiles in Autism Spectrum Disorder using a Functional Random Forest algorithm.
Feczko, E; Balba, N M; Miranda-Dominguez, O; Cordova, M; Karalunas, S L; Irwin, L; Demeter, D V; Hill, A P; Langhorst, B H; Grieser Painter, J; Van Santen, J; Fombonne, E J; Nigg, J T; Fair, D A
2018-05-15
DSM-5 Autism Spectrum Disorder (ASD) comprises a set of neurodevelopmental disorders characterized by deficits in social communication and interaction and repetitive behaviors or restricted interests, and may both affect and be affected by multiple cognitive mechanisms. This study attempts to identify and characterize cognitive subtypes within the ASD population using our Functional Random Forest (FRF) machine learning classification model. This model trained a traditional random forest model on measures from seven tasks that reflect multiple levels of information processing. 47 ASD diagnosed and 58 typically developing (TD) children between the ages of 9 and 13 participated in this study. Our RF model was 72.7% accurate, with 80.7% specificity and 63.1% sensitivity. Using the random forest model, the FRF then measures the proximity of each subject to every other subject, generating a distance matrix between participants. This matrix is then used in a community detection algorithm to identify subgroups within the ASD and TD groups, and revealed 3 ASD and 4 TD putative subgroups with unique behavioral profiles. We then examined differences in functional brain systems between diagnostic groups and putative subgroups using resting-state functional connectivity magnetic resonance imaging (rsfcMRI). Chi-square tests revealed a significantly greater number of between group differences (p < .05) within the cingulo-opercular, visual, and default systems as well as differences in inter-system connections in the somato-motor, dorsal attention, and subcortical systems. Many of these differences were primarily driven by specific subgroups suggesting that our method could potentially parse the variation in brain mechanisms affected by ASD. Copyright © 2017. Published by Elsevier Inc.
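The proximity step described in the abstract above can be illustrated with a minimal sketch. It assumes the common random forest definition of proximity (the fraction of trees in which two subjects land in the same terminal leaf) and takes distance as one minus proximity; the abstract does not state the exact formula, and the function names and toy leaf assignments below are hypothetical.

```python
def proximity_matrix(leaves):
    """Pairwise proximities from per-tree leaf assignments.

    leaves[i][t] is the index of the terminal leaf that subject i
    reaches in tree t (assumed to come from an already fitted forest).
    """
    n_subjects = len(leaves)
    n_trees = len(leaves[0])
    prox = [[0.0] * n_subjects for _ in range(n_subjects)]
    for i in range(n_subjects):
        for j in range(n_subjects):
            # Count the trees in which both subjects share a leaf.
            shared = sum(1 for a, b in zip(leaves[i], leaves[j]) if a == b)
            prox[i][j] = shared / n_trees
    return prox


def distance_matrix(prox):
    """Assumed conversion: distance = 1 - proximity."""
    return [[1.0 - p for p in row] for row in prox]


# Hypothetical leaf indices for 3 subjects across 4 trees.
toy_leaves = [
    [3, 1, 4, 2],
    [3, 1, 4, 5],
    [0, 2, 6, 5],
]
prox = proximity_matrix(toy_leaves)
dist = distance_matrix(prox)
```

A matrix like `dist` is the kind of input a community detection algorithm would then group into putative subgroups; that step is not shown here.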
van der Meer, D; Hoekstra, P J; van Donkelaar, M; Bralten, J; Oosterlaan, J; Heslenfeld, D; Faraone, S V; Franke, B; Buitelaar, J K; Hartman, C A
2017-01-01
Identifying genetic variants contributing to attention-deficit/hyperactivity disorder (ADHD) is complicated by the involvement of numerous common genetic variants with small effects, interacting with each other as well as with environmental factors, such as stress exposure. Random forest regression is well suited to explore this complexity, as it allows for the analysis of many predictors simultaneously, taking into account any higher-order interactions among them. Using random forest regression, we predicted ADHD severity, measured by Conners’ Parent Rating Scales, from 686 adolescents and young adults (of which 281 were diagnosed with ADHD). The analysis included 17 374 single-nucleotide polymorphisms (SNPs) across 29 genes previously linked to hypothalamic–pituitary–adrenal (HPA) axis activity, together with information on exposure to 24 individual long-term difficulties or stressful life events. The model explained 12.5% of variance in ADHD severity. The most important SNP, which also showed the strongest interaction with stress exposure, was located in a region regulating the expression of telomerase reverse transcriptase (TERT). Other high-ranking SNPs were found in or near NPSR1, ESR1, GABRA6, PER3, NR3C2 and DRD4. Chronic stressors were more influential than single, severe, life events. Top hits were partly shared with conduct problems. We conclude that random forest regression may be used to investigate how multiple genetic and environmental factors jointly contribute to ADHD. It is able to implicate novel SNPs of interest, interacting with stress exposure, and may explain inconsistent findings in ADHD genetics. This exploratory approach may be best combined with more hypothesis-driven research; top predictors and their interactions with one another should be replicated in independent samples. PMID:28585928
Simple to complex modeling of breathing volume using a motion sensor.
John, Dinesh; Staudenmayer, John; Freedson, Patty
2013-06-01
To compare simple and complex modeling techniques to estimate categories of low, medium, and high ventilation (VE) from ActiGraph™ activity counts. Vertical axis ActiGraph™ GT1M activity counts, oxygen consumption and VE were measured during treadmill walking and running, sports, household chores and labor-intensive employment activities. Categories of low (<19.3 l/min), medium (19.3 to 35.4 l/min) and high (>35.4 l/min) VEs were derived from activity intensity classifications (light <2.9 METs, moderate 3.0 to 5.9 METs and vigorous >6.0 METs). We examined the accuracy of two simple techniques (multiple regression and activity count cut-point analyses) and one complex (random forest technique) modeling technique in predicting VE from activity counts. Prediction accuracy of the complex random forest technique was marginally better than the simple multiple regression method. Both techniques accurately predicted VE categories almost 80% of the time. The multiple regression and random forest techniques were more accurate (85 to 88%) in predicting medium VE. Both techniques predicted the high VE (70 to 73%) with greater accuracy than low VE (57 to 60%). Actigraph™ cut-points for light, medium and high VEs were <1381, 1381 to 3660 and >3660 cpm. There were minor differences in prediction accuracy between the multiple regression and the random forest technique. This study provides methods to objectively estimate VE categories using activity monitors that can easily be deployed in the field. Objective estimates of VE should provide a better understanding of the dose-response relationship between internal exposure to pollutants and disease. Copyright © 2013 Elsevier B.V. All rights reserved.
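The cut-point approach reported above amounts to a threshold lookup. The sketch below encodes the reported cut-points (<1381, 1381 to 3660 and >3660 cpm); the function and constant names are hypothetical, and treating both boundary values as "medium" follows the ranges as written in the abstract.

```python
LOW_UPPER_CPM = 1381    # counts per minute below this are "low"
HIGH_LOWER_CPM = 3660   # counts per minute above this are "high"


def ve_category(counts_per_minute):
    """Map activity counts (cpm) to a ventilation category."""
    if counts_per_minute < LOW_UPPER_CPM:
        return "low"
    if counts_per_minute > HIGH_LOWER_CPM:
        return "high"
    return "medium"
```

For example, `ve_category(2500)` falls between the two cut-points and is classed as "medium".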
Tillman, Fred; Anning, David W.; Heilman, Julian A.; Buto, Susan G.; Miller, Matthew P.
2018-01-01
Elevated concentrations of dissolved-solids (salinity) including calcium, sodium, sulfate, and chloride, among others, in the Colorado River cause substantial problems for its water users. Previous efforts to reduce dissolved solids in upper Colorado River basin (UCRB) streams often focused on reducing suspended-sediment transport to streams, but few studies have investigated the relationship between suspended sediment and salinity, or evaluated which watershed characteristics might be associated with this relationship. Are there catchment properties that may help in identifying areas where control of suspended sediment will also reduce salinity transport to streams? A random forests classification analysis was performed on topographic, climate, land cover, geology, rock chemistry, soil, and hydrologic information in 163 UCRB catchments. Two random forests models were developed in this study: one for exploring stream and catchment characteristics associated with stream sites where dissolved solids increase with increasing suspended-sediment concentration, and the other for predicting where these sites are located in unmonitored reaches. Results of variable importance from the exploratory random forests models indicate that no simple source, geochemical process, or transport mechanism can easily explain the relationship between dissolved solids and suspended sediment concentrations at UCRB monitoring sites. Among the most important watershed characteristics in both models were measures of soil hydraulic conductivity, soil erodibility, minimum catchment elevation, catchment area, and the silt component of soil in the catchment. Predictions at key locations in the basin were combined with observations from selected monitoring sites, and presented in map-form to give a complete understanding of where catchment sediment control practices would also benefit control of dissolved solids in streams.
NASA Astrophysics Data System (ADS)
Chemura, Abel; Mutanga, Onisimo; Dube, Timothy
2017-08-01
Water management is an important component in agriculture, particularly for perennial tree crops such as coffee. Proper detection and monitoring of water stress therefore plays an important role not only in mitigating the associated adverse impacts on crop growth and productivity but also in reducing expensive and environmentally unsustainable irrigation practices. Current methods for water stress detection in coffee production mainly involve monitoring plant physiological characteristics and soil conditions. In this study, we tested the ability of selected wavebands in the VIS/NIR range to predict plant water content (PWC) in coffee using the random forest algorithm. An experiment was set up such that coffee plants were exposed to different levels of water stress, and reflectance and plant water content were measured. In selecting appropriate parameters, cross-correlation identified 11 wavebands, reflectance difference identified 16 and reflectance sensitivity identified 22 variables related to PWC. Only three wavebands (485 nm, 670 nm and 885 nm) were identified by at least two methods as significant. The selected wavebands were trained (n = 36) and tested on independent data (n = 24) after being integrated into the random forest algorithm to predict coffee PWC. The results showed that the reflectance sensitivity selected bands performed the best in water stress detection (r = 0.87, RMSE = 4.91% and pBias = 0.9%), when compared to reflectance difference (r = 0.79, RMSE = 6.19 and pBias = 2.5%) and cross-correlation selected wavebands (r = 0.75, RMSE = 6.52 and pBias = 1.6). These results indicate that it is possible to reliably predict PWC using wavebands in the VIS/NIR range that correspond with many of the available multispectral scanners using random forests, and that further research at field and landscape scale is required to operationalize these findings.
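The three accuracy statistics quoted above (r, RMSE and pBias) can be computed as in the following sketch. The abstract does not give its formulas, so the definitions here are assumptions: Pearson correlation for r, root mean squared error in the units of the observations, and percent bias as 100 times the summed prediction error divided by the summed observations. The function names and sample values are hypothetical.

```python
import math


def pearson_r(observed, predicted):
    """Pearson correlation between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


def rmse(observed, predicted):
    """Root mean squared error, in the units of the observations."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)


def percent_bias(observed, predicted):
    """Assumed definition: 100 * sum(predicted - observed) / sum(observed)."""
    return 100.0 * sum(p - o for o, p in zip(observed, predicted)) / sum(observed)


# Hypothetical plant water content values (%), for illustration only.
obs = [10.0, 20.0, 30.0]
pred = [12.0, 18.0, 33.0]
example_rmse = rmse(obs, pred)
example_pbias = percent_bias(obs, pred)
```

Sign conventions for percent bias differ between tools, so a positive value here means over-prediction only under the definition assumed above.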
Properties of Protein Drug Target Classes
Bull, Simon C.; Doig, Andrew J.
2015-01-01
Accurate identification of drug targets is a crucial part of any drug development program. We mined the human proteome to discover properties of proteins that may be important in determining their suitability for pharmaceutical modulation. Data was gathered concerning each protein’s sequence, post-translational modifications, secondary structure, germline variants, expression profile and drug target status. The data was then analysed to determine features for which the target and non-target proteins had significantly different values. This analysis was repeated for subsets of the proteome consisting of all G-protein coupled receptors, ion channels, kinases and proteases, as well as proteins that are implicated in cancer. Machine learning was used to quantify the proteins in each dataset in terms of their potential to serve as a drug target. This was accomplished by first inducing a random forest that could distinguish between its targets and non-targets, and then using the random forest to quantify the drug target likeness of the non-targets. The properties that can best differentiate targets from non-targets were primarily those that are directly related to a protein’s sequence (e.g. secondary structure). Germline variants, expression levels and interactions between proteins had minimal discriminative power. Overall, the best indicators of drug target likeness were found to be the proteins’ hydrophobicities, in vivo half-lives, propensity for being membrane bound and the fraction of non-polar amino acids in their sequences. In terms of predicting potential targets, datasets of proteases, ion channels and cancer proteins were able to induce random forests that were highly capable of distinguishing between targets and non-targets. The non-target proteins predicted to be targets by these random forests comprise the set of the most suitable potential future drug targets, and should therefore be prioritised when building a drug development programme. PMID:25822509
Sarica, Alessia; Cerasa, Antonio; Quattrone, Aldo
2017-01-01
Objective: Machine learning classification has been the most important computational development in recent years to satisfy the primary need of clinicians for automatic early diagnosis and prognosis. Nowadays, the Random Forest (RF) algorithm has been successfully applied for reducing high-dimensional and multi-source data in many scientific realms. Our aim was to explore the state of the art of the application of RF on single and multi-modal neuroimaging data for the prediction of Alzheimer's disease. Methods: A systematic review following PRISMA guidelines was conducted on this field of study. In particular, we constructed an advanced query using boolean operators as follows: ("random forest" OR "random forests") AND neuroimaging AND ("alzheimer's disease" OR alzheimer's OR alzheimer) AND (prediction OR classification). The query was then searched in four well-known scientific databases: PubMed, Scopus, Google Scholar and Web of Science. Results: Twelve articles, published between 2007 and 2017, have been included in this systematic review after a quantitative and qualitative selection. The lessons learnt from these works suggest that when RF was applied on multi-modal data for prediction of Alzheimer's disease (AD) conversion from Mild Cognitive Impairment (MCI), it produced one of the best accuracies to date. Moreover, RF has important advantages in terms of robustness to overfitting, ability to handle highly non-linear data, stability in the presence of outliers and opportunity for efficient parallel processing, mainly when applied on multi-modality neuroimaging data, such as MRI morphometric, diffusion tensor imaging, and PET images. Conclusions: We discussed the strengths of RF, also considering possible limitations and encouraging further studies on the comparisons of this algorithm with other commonly used classification approaches, particularly in the early prediction of the progression from MCI to AD.
Wearn, Oliver R.; Rowcliffe, J. Marcus; Carbone, Chris; Bernard, Henry; Ewers, Robert M.
2013-01-01
The proliferation of camera-trapping studies has led to a spate of extensions in the known distributions of many wild cat species, not least in Borneo. However, we still do not have a clear picture of the spatial patterns of felid abundance in Southeast Asia, particularly with respect to the large areas of highly-disturbed habitat. An important obstacle to increasing the usefulness of camera trap data is the widespread practice of setting cameras at non-random locations. Non-random deployment interacts with non-random space-use by animals, causing biases in our inferences about relative abundance from detection frequencies alone. This may be a particular problem if surveys do not adequately sample the full range of habitat features present in a study region. Using camera-trapping records and incidental sightings from the Kalabakan Forest Reserve, Sabah, Malaysian Borneo, we aimed to assess the relative abundance of felid species in highly-disturbed forest, as well as investigate felid space-use and the potential for biases resulting from non-random sampling. Although the area has been intensively logged over three decades, it was found to still retain the full complement of Bornean felids, including the bay cat Pardofelis badia, a poorly known Bornean endemic. Camera-trapping using strictly random locations detected four of the five Bornean felid species and revealed inter- and intra-specific differences in space-use. We compare our results with an extensive dataset of >1,200 felid records from previous camera-trapping studies and show that the relative abundance of the bay cat, in particular, may have previously been underestimated due to the use of non-random survey locations. Further surveys for this species using random locations will be crucial in determining its conservation status. 
We advocate the more widespread use of random survey locations in future camera-trapping surveys in order to increase the robustness and generality of inferences that can be made. PMID:24223717
Yu, Dan-Dan; Xie, Yan-Ming; Liao, Xing; Zhi, Ying-Jie; Jiang, Jun-Jie; Chen, Wei
2018-02-01
To evaluate the methodological quality and reporting quality of randomized controlled trials (RCTs) published in China Journal of Chinese Materia Medica, we searched CNKI and the China Journal of Chinese Materia Medica webpage to collect RCTs since the establishment of the magazine. The Cochrane risk of bias assessment tool was used to evaluate the methodological quality of RCTs. The CONSORT 2010 list was adopted as the reporting quality evaluation tool. Finally, 184 RCTs were included and evaluated methodologically, of which 97 RCTs were evaluated for reporting quality. For the methodological evaluation, 62 trials (33.70%) reported the random sequence generation; 9 trials (4.89%) reported the allocation concealment; 25 trials (13.59%) adopted the method of blinding; 30 trials (16.30%) reported the number of patients withdrawing, dropping out and those lost to follow-up; 2 trials (1.09%) reported trial registration and none of the trials reported the trial protocol; only 8 trials (4.35%) reported the sample size estimation in detail. For the reporting quality appraisal, 3 of the 25 reporting items were rated as high quality, including abstract, participant eligibility criteria, and statistical methods; 4 reporting items were of medium quality, including purpose, intervention, random sequence method, and data collection sites and locations; 9 reporting items were of low quality, including title, background, random sequence types, allocation concealment, blinding, recruitment of subjects, baseline data, harms, and funding; the rest of the items were of extremely low quality (compliance rate of reporting item <10%). On the whole, the methodological and reporting quality of RCTs published in the magazine is generally low. Further improvement in both methodological and reporting quality for RCTs of traditional Chinese medicine is warranted. It is recommended that the international standards and procedures for RCT design be strictly followed to conduct high-quality trials.
At the same time, in order to improve the reporting quality of randomized controlled trials, CONSORT standards should be adopted in the preparation of research reports and submissions. Copyright© by the Chinese Pharmaceutical Association.
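The methodological percentages in the abstract above follow directly from the reported counts over the 184 included trials. The short check below reproduces that arithmetic; the dictionary keys and function name are hypothetical labels, while the counts and the total are taken from the abstract.

```python
TOTAL_TRIALS = 184  # RCTs included in the methodological evaluation

# Number of trials reporting each methodological item, as stated in the abstract.
reported_counts = {
    "random sequence generation": 62,
    "allocation concealment": 9,
    "blinding": 25,
    "withdrawals, dropouts and losses to follow-up": 30,
    "trial registration": 2,
    "sample size estimation": 8,
}


def compliance_rate(count, total=TOTAL_TRIALS):
    """Percentage of trials reporting an item, rounded to two decimals."""
    return round(100 * count / total, 2)


rates = {item: compliance_rate(n) for item, n in reported_counts.items()}
```

Each value in `rates` matches the corresponding percentage quoted in the abstract to two decimal places.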
Ko, Kyung Dae; El-Ghazawi, Tarek; Kim, Dongkyu; Morizono, Hiroki
2014-05-01
Motor neuron diseases (MNDs) are a class of progressive neurological diseases that damage the motor neurons. An accurate diagnosis is important for the treatment of patients with MNDs because there is no standard cure for MNDs. However, the rates of false positive and false negative diagnoses are still very high in this class of diseases. In the case of Amyotrophic Lateral Sclerosis (ALS), current estimates indicate 10% of diagnoses are false positives, while 44% appear to be false negatives. In this study, we developed a new methodology to profile specific medical information from patient medical records for predicting the progression of motor neuron diseases. We implemented a system using HBase and the random forest classifier of Apache Mahout to profile medical records provided by the Pooled Resource Open-Access ALS Clinical Trials Database (PRO-ACT) site, and we achieved 66% accuracy in the prediction of ALS progression.
NASA Technical Reports Server (NTRS)
Iverson, Louis R.; Cook, Elizabeth A.; Graham, Robin L.; Olson, Jerry S.; Frank, Thomas; Ke, Ying; Treworgy, Colin; Risser, Paul G.
1987-01-01
This report summarizes progress made in our investigation of forest productivity assessment using TM and other biogeographical data during the third six-month period of the grant. Data acquisition and methodology hurdles are largely complete. Four study areas for which the appropriate TM and ancillary data were available are currently being intensively analyzed. Significant relationships have been found on a site by site basis to suggest that forest productivity can be qualitatively assessed using TM band values and site characteristics. Perhaps the most promising results relate TM unsupervised classes to forest productivity, with enhancement from elevation data. During the final phases of the research, multi-temporal and regional comparisons of results will be addressed, as well as the predictability of forest productivity patterns over a large region using TM data and/or TM nested within AVHRR data.
Comparison results of forest cover mapping of Peninsular Malaysia using geospatial technology
NASA Astrophysics Data System (ADS)
Hamid, Wan Abdul; Abd Rahman, Shukri B. Wan
2016-06-01
Climate change and global warming arise from several factors. Among them is deforestation, which occurs mostly in developing countries, including Malaysia, where forested areas are converted to other land uses for tangible economic returns and, to a smaller extent, for the subsistence of local communities. In response to this concern, the World Resources Institute (WRI) and World Wildlife Fund (WWF) have undertaken efforts to monitor forest loss using geospatial technology, interpreting time-based remote sensing imagery and producing statistics of forested areas lost since 2001. In Peninsular Malaysia, the Forestry Department of Peninsular Malaysia (FDPM) has conducted forest cover mapping for the region using the same technology since 2011, producing GIS maps for 2009-2010, 2011-2012, 2013-2014 and 2015. This paper presents a comparative study of the results generated from the WRI, WWF and FDPM interpretations between 2010 and 2015, covering the methodologies used, the similarities and differences, the challenges, and recommendations for future enhancement of forest cover mapping techniques.
Grietens, Koen Peeters; Xuan, Xa Nguyen; Ribera, Joan; Duc, Thang Ngo; Bortel, Wim van; Ba, Nhat Truong; Van, Ky Pham; Xuan, Hung Le; D'Alessandro, Umberto; Erhart, Annette
2012-01-01
Long-lasting insecticidal hammocks (LLIHs) are being evaluated as an additional malaria prevention tool in settings where standard control strategies have a limited impact. This is the case among the Ra-glai ethnic minority communities of Ninh Thuan, one of the forested and mountainous provinces of Central Vietnam where malaria morbidity persists due to the sylvatic nature of the main malaria vector An. dirus and the dependence of the population on the forest for subsistence, as is the case for many impoverished ethnic minorities in Southeast Asia. A social science study was carried out ancillary to a community-based cluster randomized trial on the effectiveness of LLIHs to control forest malaria. The social science research strategy consisted of a mixed methods study triangulating qualitative data from focused ethnography and quantitative data collected during a malariometric cross-sectional survey on a random sample of 2,045 study participants. To meet work requirements during the labor-intensive malaria transmission and rainy season, Ra-glai slash and burn farmers combine living in government supported villages along the road with a second home at their fields located in the forest. LLIH use was evaluated in both locations. During daytime, LLIH use at village level was reported by 69.3% of all respondents, and in forest fields this was 73.2%. In the evening, 54.1% used the LLIHs in the villages, while at the fields this was 20.7%. At night, LLIH use was minimal, regardless of the location (village 4.4%; forest 6.4%). Despite the free distribution of insecticide-treated nets (ITNs) and LLIHs, around half the local population remains largely unprotected when sleeping in their forest plot huts. In order to tackle forest malaria more effectively, control policies should explicitly target forest fields where ethnic minority farmers are more vulnerable to malaria.
Muela Ribera, Joan; Ngo Duc, Thang; van Bortel, Wim; Truong Ba, Nhat; Van, Ky Pham; Le Xuan, Hung; D'Alessandro, Umberto; Erhart, Annette
2012-01-01
Background Long-lasting insecticidal hammocks (LLIHs) are being evaluated as an additional malaria prevention tool in settings where standard control strategies have a limited impact. This is the case among the Ra-glai ethnic minority communities of Ninh Thuan, one of the forested and mountainous provinces of Central Vietnam where malaria morbidity persists due to the sylvatic nature of the main malaria vector An. dirus and the dependence of the population on the forest for subsistence - as is the case for many impoverished ethnic minorities in Southeast Asia. Methods A social science study was carried out ancillary to a community-based cluster randomized trial on the effectiveness of LLIHs to control forest malaria. The social science research strategy consisted of a mixed methods study triangulating qualitative data from focused ethnography and quantitative data collected during a malariometric cross-sectional survey on a random sample of 2,045 study participants. Results To meet work requirements during the labor-intensive malaria transmission and rainy season, Ra-glai slash and burn farmers combine living in government supported villages along the road with a second home at their fields located in the forest. LLIH use was evaluated in both locations. During daytime, LLIH use at village level was reported by 69.3% of all respondents, and in forest fields this was 73.2%. In the evening, 54.1% used the LLIHs in the villages, while at the fields this was 20.7%. At night, LLIH use was minimal, regardless of the location (village 4.4%; forest 6.4%). Discussion Despite the free distribution of insecticide-treated nets (ITNs) and LLIHs, around half the local population remains largely unprotected when sleeping in their forest plot huts. In order to tackle forest malaria more effectively, control policies should explicitly target forest fields where ethnic minority farmers are more vulnerable to malaria. PMID:22253852
Fifty years dynamics of Russian forests: Impacts on the earth system
NASA Astrophysics Data System (ADS)
Shvidenko, Anatoly; Schepaschenko, Dmitry; Kraxner, Florian
2015-04-01
The paper presents a succinct history of Russian forests during the period 1960-2010 and a reanalysis of their impacts on global carbon and nitrogen cycles. We present the dynamics of land cover change (including major categories of forest land) and biometric characteristics of forests (species composition, age structure, growing stock volume etc.) based on reconciling all relevant information (data of forest and land inventories, official forest management statistics, multi-sensor remote sensing products, data of forest pathological monitoring etc.). The completeness and reliability of background information varied over the period of the study. Forest inventory data and official statistics were partially modified based on relevant auxiliary information and used for 1960-2000. The analysis for 2001-2010 relied crucially on multi-sensor remote sensing data. For this last period a hybrid forest mask was developed at a resolution of 230 m by integrating 8 remote sensing products and using geographically weighted regression and crowdsourced data. During the 50 years considered, the forested area of Russia increased substantially until the mid-1990s and declined slightly (by about 5%) thereafter. Indicators needed for the assessment of carbon and nitrogen cycles of forest ecosystems were defined for the entire period (aggregated estimates by decade for 1960-2000 and yearly for 2001-2010) based on a unified methodology, with some peculiarities following from the availability of information. Major results were obtained by the landscape-ecosystem method, which uses empirical and semi-empirical information on ecosystems and landscapes that is as comprehensive as possible, in the form of an Integrated Land Information System, and complementarily combines pool- and flux-based methods.
We discuss and quantify major drivers of forest cover change (socio-economic, environmental and climatic) including forest management (harvest, reforestation and afforestation), impacts of seasonal weather on carbon fluxes (Net Primary Production, Heterotrophic Respiration), disturbances (fire, outbreaks of insects and diseases), and industrial pressure (land change, air pollution, water and soil contamination). During the entire period Russian forests provided a net carbon sink in the range of 350-700 Tg C yr-1, with inter-annual variability within 10-15% for the entire country. The overall sink results from the superposition of trends in major carbon fluxes (caused by removal of harvested wood and use of forest products; land cover change; impact of climatic trends; change of disturbance regimes) and the inter-annual variation of seasonal weather. Major indicators of the nitrogen cycle are assessed and discussed in connection with the carbon cycle. We provide a comparative analysis of other results published for the considered period, taking into account successive improvements in the information and methodology used for studying the major biogeochemical cycles.
NASA Astrophysics Data System (ADS)
Jeuck, James A.
This dissertation consists of research projects related to forest land use / land cover (LULC): (1) factors predicting LULC change and (2) methodology to predict particular forest use, or "potential working timberland" (PWT), from current forms of land data. The first project resulted in a published paper, a meta-analysis of 64 econometric models from 47 studies predicting forest land use changes. The response variables, representing some form of forest land change, were organized into four groups: forest conversion to agriculture (F2A), forestland to development (F2D), forestland to non-forested (F2NF) and undeveloped (including forestland) to developed (U2D) land. Over 250 independent econometric variables were identified, from 21 F2A models, 21 F2D models, 12 F2NF models, and 10 U2D models. These variables were organized into a hierarchy of 119 independent variable groups, 15 categories, and 4 econometric drivers suitable for conducting simple vote count statistics. Vote counts were summarized at the independent variable group level and formed into ratios estimating the predictive success of each variable group. Two ratio estimates were developed based on (1) the proportion of times independent variables successfully achieved statistical significance (p ≤ 0.10), and (2) the proportion of times independent variables successfully met the original researchers' expectations. In F2D models, popular independent variables such as population, income, and urban proximity often achieved statistical significance. In F2A models, popular independent variables such as forest and agricultural rents and costs, governmental programs, and site quality often achieved statistical significance. In U2D models, successful independent variables included urban rents and costs, zoning issues concerning forestland loss, site quality, urban proximity, population, and income. In F2NF models, the high-success variables were found to be agricultural rents, site quality, population, and income.
This meta-analysis provides insight into the general success of econometric independent variables for future forest use or cover change research. The second part of this dissertation developed a method for predicting area estimates and spatial distribution of PWT in the US South. This technique determined land use from USFS Forest Inventory and Analysis (FIA) and land cover from the National Land Cover Database (NLCD). Three dependent variable forms (DV Forms) were derived from the FIA data: DV Form 1, timberland, other; DV Form 2, short timberland, tall timberland, agriculture, other; and DV Form 3, short hardwood (HW) timberland, tall HW timberland, short softwood (SW) timberland, tall SW timberland, agriculture, other. The prediction accuracy of each DV Form was investigated using both random forest model and logistic regression model specifications and data optimization techniques. Model verification employing a "leave-group-out" Monte Carlo simulation determined the selection of a stratified version of the random forest model using one-year NLCD observations with an overall accuracy of 0.53-0.94. The lower accuracy side of the range was when predictions were made from an aggregated NLCD land cover class "grass_shrub". The selected model specification was run using 2011 NLCD and the other predictor variables to produce three levels of timberland prediction and probability maps for the US South. Spatial masks removed areas unlikely to be working forests (protected and urbanized lands) resulting in PWT maps. The area of the resulting maps compared well with USFS area estimates and masked PWT maps and had an 8-11% reduction of the USFS timberland estimate for the US South compared to the DV Form. 
Change analysis of the 2011 NLCD to PWT showed (1) the majority of the short timberland came from NLCD grass_shrub; (2) the majority of NLCD grass_shrub predicted into tall timberland, and (3) NLCD grass_shrub was more strongly associated with timberland in the Coastal Plain. Resulting map products provide practical analytical tools for those interested in studying the area and distribution of PWT in the US South.
Advanced analysis of forest fire clustering
NASA Astrophysics Data System (ADS)
Kanevski, Mikhail; Pereira, Mario; Golay, Jean
2017-04-01
Analysis of point pattern clustering is an important topic in spatial statistics and for many applications: biodiversity, epidemiology, natural hazards, geomarketing, etc. There are several fundamental approaches used to quantify spatial data clustering using topological, statistical and fractal measures. In the present research, the recently introduced multi-point Morisita index (mMI) is applied to study the spatial clustering of forest fires in Portugal. The data set consists of more than 30000 fire events covering the time period from 1975 to 2013. The distribution of forest fires is very complex and highly variable in space. mMI is a multi-point extension of the classical two-point Morisita index. In essence, mMI is estimated by covering the region under study by a grid and by computing how many times more likely it is that m points selected at random will be from the same grid cell than it would be in the case of a complete random Poisson process. By changing the number of grid cells (size of the grid cells), mMI characterizes the scaling properties of spatial clustering. From mMI, the data intrinsic dimension (fractal dimension) of the point distribution can be estimated as well. In this study, the mMI of forest fires is compared with the mMI of random patterns (RPs) generated within the validity domain defined as the forest area of Portugal. It turns out that the forest fires are highly clustered inside the validity domain in comparison with the RPs. Moreover, they demonstrate different scaling properties at different spatial scales. The results obtained from the mMI analysis are also compared with those of fractal measures of clustering - box counting and sand box counting approaches. REFERENCES Golay J., Kanevski M., Vega Orozco C., Leuenberger M., 2014: The multipoint Morisita index for the analysis of spatial patterns. Physica A, 406, 191-202. Golay J., Kanevski M. 2015: A new estimator of intrinsic dimension based on the multipoint Morisita index. 
Pattern Recognition, 48, 4070-4081.
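The grid-based estimate described in the abstract above can be sketched in a few lines. This is a minimal illustration, assuming the usual falling-factorial form of the index, I = Q^(m-1) · Σ n_i(n_i-1)...(n_i-m+1) / [N(N-1)...(N-m+1)], for Q grid cells with counts n_i and N points in total; the function name, the square study domain, and the `bounds` parameter are hypothetical choices made here, not taken from the cited papers.

```python
from collections import Counter
from math import prod


def multipoint_morisita(points, cells_per_side, m=2, bounds=(0.0, 1.0)):
    """m-point Morisita index on a square grid of cells_per_side**2 cells.

    Assumes (for illustration only) a square domain [lo, hi] x [lo, hi].
    Values near 1 suggest a random pattern, values above 1 clustering,
    and values below 1 a regular (dispersed) pattern.
    """
    lo, hi = bounds
    n_cells = cells_per_side * cells_per_side
    n_total = len(points)
    if n_total < m:
        raise ValueError("need at least m points")
    width = (hi - lo) / cells_per_side

    # Count the points falling into each grid cell.
    counts = Counter()
    for x, y in points:
        ix = min(int((x - lo) / width), cells_per_side - 1)
        iy = min(int((y - lo) / width), cells_per_side - 1)
        counts[(ix, iy)] += 1

    def falling(n):
        # n * (n - 1) * ... * (n - m + 1)
        return prod(n - k for k in range(m))

    same_cell = sum(falling(n) for n in counts.values())
    return n_cells ** (m - 1) * same_cell / falling(n_total)
```

Calling the function for a series of `cells_per_side` values gives the index as a function of cell size, which is how the scaling behaviour mentioned in the abstract would be examined.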
EDITORIAL: Special section on foliage penetration
NASA Astrophysics Data System (ADS)
Fiddy, M. A.; Lang, R.; McGahan, R. V.
2004-04-01
Waves in Random Media was founded in 1991 to provide a forum for papers dealing with electromagnetic and acoustic waves as they propagate and scatter through media or objects having some degree of randomness. This is a broad charter since, in practice, all scattering obstacles and structures have roughness or randomness, often on the scale of the wavelength being used to probe them. Including this random component leads to some quite different methods for describing propagation effects, for example, when propagating through the atmosphere or the ground. This special section on foliage penetration (FOPEN) focuses on the problems arising from microwave propagation through foliage and vegetation. Applications of such studies include the estimation for forest biomass and the moisture of the underlying soil, as well as detecting objects hidden therein. In addition to the so-called `direct problem' of trying to describe energy propagating through such media, the complementary inverse problem is of great interest and much harder to solve. The development of theoretical models and associated numerical algorithms for identifying objects concealed by foliage has applications in surveillance, ranging from monitoring drug trafficking to targeting military vehicles. FOPEN can be employed to map the earth's surface in cases when it is under a forest canopy, permitting the identification of objects or targets on that surface, but the process for doing so is not straightforward. There has been an increasing interest in foliage penetration synthetic aperture radar (FOPEN or FOPENSAR) over the last 10 years and this special section provides a broad overview of many of the issues involved. The detection, identification, and geographical location of targets under foliage or otherwise obscured by poor visibility conditions remains a challenge. 
In particular, a trade-off often needs to be appreciated, namely that diminishing the deleterious effects of multiple scattering from leaves is typically associated with a significant loss in target resolution. Foliage is more or less transparent to some radar frequencies, but longer wavelengths found in the VHF (30 to 300 MHz) and UHF (300 MHz to 3 GHz) portions of the microwave spectrum have more chance of penetrating foliage than do wavelengths at the X band (8 to 12 GHz). Reflection and multiple scattering occur for some other frequencies and models of the processes involved are crucial. Two topical reviews can be found in this issue, one on the microwave radiometry of forests (page S275) and another describing ionospheric effects on space-based radar (page S189). Subsequent papers present new results on modelling coherent backscatter from forests (page S299), modelling forests as discrete random media over a random interface (page S359) and interpreting ranging scatterometer data from forests (page S317). Cloude et al present research on identifying targets beneath foliage using polarimetric SAR interferometry (page S393) while Treuhaft and Siqueira use interferometric radar to describe forest structure and biomass (page S345). Vechhia et al model scattering from leaves (page S333) and Semichaevsky et al address the problem of the trade-off between increasing wavelength, reduction in multiple scattering, and target resolution (page S415).
Spatio-temporal Change Patterns of Tropical Forests from 2000 to 2014 Using MOD09A1 Dataset
NASA Astrophysics Data System (ADS)
Qin, Y.; Xiao, X.; Dong, J.
2016-12-01
Large-scale deforestation and forest degradation in the tropical region have resulted in extensive carbon emissions and biodiversity loss. However, restricted by the availability of good-quality observations, large uncertainty exists in mapping the spatial distribution of forests and their spatio-temporal changes. In this study, we proposed a pixel- and phenology-based algorithm to identify and map annual tropical forests from 2000 to 2014, using the 8-day, 500-m MOD09A1 (v005) product, with the support of Google cloud computing (Google Earth Engine). A temporal filter was applied to reduce random noise and to identify the spatio-temporal changes of forests. We then built a confusion matrix and assessed the accuracy of the annual forest maps against ground reference data interpreted from high spatial resolution images in Google Earth. The resultant forest maps showed consistent forest/non-forest, forest loss, and forest gain patterns in the pan-tropical zone during 2000-2014. The proposed algorithm showed potential for tropical forest mapping, and the resultant forest maps are important for the estimation of carbon emissions and biodiversity loss.
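As a sketch of the accuracy assessment step mentioned in the abstract above, the following shows how overall, producer's, and user's accuracy are typically derived from a confusion matrix. The function name and the row/column convention (rows are reference classes, columns are mapped classes) are assumptions made here; the abstract does not report the matrix or the metrics used.

```python
def accuracy_from_confusion(matrix):
    """Return (overall, producers, users) accuracy from a square matrix.

    Assumed convention: matrix[i][j] is the number of reference samples
    of class i that the map labelled as class j.
    """
    k = len(matrix)
    total = sum(sum(row) for row in matrix)
    diagonal = [matrix[i][i] for i in range(k)]

    overall = sum(diagonal) / total
    # Producer's accuracy: correct samples / reference total of that class.
    producers = [diagonal[i] / sum(matrix[i]) for i in range(k)]
    # User's accuracy: correct samples / mapped total of that class.
    users = [diagonal[j] / sum(matrix[i][j] for i in range(k)) for j in range(k)]
    return overall, producers, users
```

For a two-class forest/non-forest map, the input would be a 2 x 2 list of counts; any numbers passed in are illustrative and not results from the study.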
Building National Capacity To Implement National Forest Monitoring System In Africa By GLAD
NASA Astrophysics Data System (ADS)
Lola Amani, P. K.
2017-12-01
Earth Observation data provide a wealth of information on the earth and its phenomena from space. They also offer the ability to compile and analyze information at global or local scales in a timely manner. However, to use them, it is important to develop methods that enable the extraction of the desired information. Such methods should be robust and consistent enough to be considered for national monitoring systems. At the University of Maryland, the Global Land Analysis and Discovery (GLAD) Laboratory, led by Dr. Hansen, has developed automatic methods using Landsat data that have been applied for the Global Forest Change (GFC) product in collaboration with the World Resources Institute (WRI), Google and others to provide information on tree cover loss throughout the globe on a yearly basis, along with a daily tree cover loss alert system that improves transparency and is accessible on the Global Forest Watch (GFW) Initiative website. Following the increasing interest in utilizing the GFC data, the GLAD Laboratory is working closely with the national governments of different countries to reinforce their capacity to use the data in the best way and to implement the methodological framework supporting their national forest monitoring, notification, and reporting (MNV) systems. More precisely, the Lab supports the countries step by step in developing their reference emission levels and/or forest reference levels based on country-specific needs, goals, and requirements, including the definition of forest. Once in place, the methodology can easily be extended to other applications, such as monitoring drought events. Here, we present the work accomplished with the national agencies of some countries in Africa, such as Cameroon, the Republic of Congo and Madagascar, with the support of the Silva-Carbon and USAID-CARPE Programs and WRI. These countries are engaged at different levels of the REDD+ process.
Keywords: Earth Observation, Landsat data, Global Forest Change, National Monitoring System, Capacity Building, Africa
Exploring prediction uncertainty of spatial data in geostatistical and machine learning Approaches
NASA Astrophysics Data System (ADS)
Klump, J. F.; Fouedjio, F.
2017-12-01
Geostatistical methods such as kriging with external drift as well as machine learning techniques such as quantile regression forest have been intensively used for modelling spatial data. In addition to providing predictions for target variables, both approaches are able to deliver a quantification of the uncertainty associated with the prediction at a target location. Geostatistical approaches are, by essence, adequate for providing such prediction uncertainties and their behaviour is well understood. However, they often require significant data pre-processing and rely on assumptions that are rarely met in practice. Machine learning algorithms such as random forest regression, on the other hand, require less data pre-processing and are non-parametric. This makes the application of machine learning algorithms to geostatistical problems an attractive proposition. The objective of this study is to compare kriging with external drift and quantile regression forest with respect to their ability to deliver reliable prediction uncertainties of spatial data. In our comparison we use both simulated and real world datasets. Apart from classical performance indicators, comparisons make use of accuracy plots, probability interval width plots, and the visual examinations of the uncertainty maps provided by the two approaches. By comparing random forest regression to kriging we found that both methods produced comparable maps of estimated values for our variables of interest. However, the measure of uncertainty provided by random forest seems to be quite different to the measure of uncertainty provided by kriging. In particular, the lack of spatial context can give misleading results in areas without ground truth data. These preliminary results raise questions about assessing the risks associated with decisions based on the predictions from geostatistical and machine learning algorithms in a spatial context, e.g. mineral exploration.
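The accuracy plots and probability interval width plots mentioned in the abstract above are built from two simple quantities: the fraction of held-out observations that fall inside their predicted interval, and the average interval width. A minimal sketch follows; the function name and the list-based inputs are hypothetical, and the study's exact procedure is not given in the abstract.

```python
def interval_diagnostics(observed, lower, upper):
    """Empirical coverage and mean width of prediction intervals.

    observed, lower, upper are equal-length sequences: the held-out
    values and the interval bounds predicted for the same locations.
    """
    inside = [lo <= y <= hi for y, lo, hi in zip(observed, lower, upper)]
    coverage = sum(inside) / len(inside)
    mean_width = sum(hi - lo for lo, hi in zip(lower, upper)) / len(lower)
    return coverage, mean_width
```

Repeating the calculation for several nominal probability levels (for example 50%, 80%, 90%) and plotting empirical against nominal coverage gives an accuracy plot; a well-calibrated method stays close to the diagonal, and between two calibrated methods the one with narrower intervals is usually preferred.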
Assessing Impacts of Climate Change on Forests: The State of Biological Modeling
DOE R&D Accomplishments Database
Dale, V. H.; Rauscher, H. M.
1993-04-06
Models that address the impacts to forests of climate change are reviewed by four levels of biological organization: global, regional or landscape, community, and tree. The models are compared as to their ability to assess changes in greenhouse gas flux, land use, maps of forest type or species composition, forest resource productivity, forest health, biodiversity, and wildlife habitat. No one model can address all of these impacts, but landscape transition models and regional vegetation and land-use models consider the largest number of impacts. Developing landscape vegetation dynamics models of functional groups is suggested as a means to integrate the theory of both landscape ecology and individual tree responses to climate change. Risk assessment methodologies can be adapted to deal with the impacts of climate change at various spatial and temporal scales. Four areas of research development are identified: (1) linking socioeconomic and ecologic models, (2) interfacing forest models at different scales, (3) obtaining data on susceptibility of trees and forest to changes in climate and disturbance regimes, and (4) relating information from different scales.
Xu, Ge Xi; Shi, Zuo Min; Tang, Jing Chao; Liu, Shun; Ma, Fan Qiang; Xu, Han; Liu, Shi Rong; Li, Yi de
2016-11-18
Based on three 1-hm² plots of Jianfengling tropical montane rainforest on Hainan Island, 11 commonly used functional traits of canopy trees were measured. After combining these with topographical factors and tree census data from the three plots, we compared the impacts of weighting by species abundance on two functional dispersion indices, mean pairwise distance (MPD) and mean nearest taxon distance (MNTD), using single- and multi-dimensional traits, respectively. The relationship between functional richness of the forest canopies and species abundance was analyzed. We used a null model approach to explore the variations in standardized effect sizes of MPD and MNTD, which were weighted by species abundance and eliminated the influence of species richness differences among communities, and assessed functional diversity patterns of the forest canopies and their responses to local habitat heterogeneity at the community level. The results showed that variation in MPD was greatly dependent on the dimensionality of functional traits as well as on species abundance. The correlations between weighted and non-weighted MPD based on different dimensional traits were relatively weak (R=0.359-0.628). In contrast, functional traits and species abundance had relatively weak effects on MNTD, which resulted in stronger correlations between weighted and non-weighted MNTD based on different dimensional traits (R=0.746-0.820). Functional dispersion of the forest canopies was generally overestimated when using non-weighted MPD and MNTD. Functional richness of the forest canopies showed an exponential relationship with species abundance (F=128.20; R²=0.632; AIC=97.72; P<0.001), suggesting that a species abundance threshold may exist. Patterns of functional diversity of the forest canopies based on different dimensional functional traits, and their habitat responses, varied to some degree.
Forest canopies in the valley usually had relatively stronger biological competition, and their functional diversity was higher than the expected functional diversity randomized by the null model, which indicated a dispersed distribution of functional traits among canopy tree species in this habitat. However, the functional diversity of the forest canopies tended to be close to or lower than the randomized expectation in the other habitat types, which indicated a random or clustered distribution of the functional traits among canopy tree species.
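The dispersion indices and the null-model standardization discussed in the abstract above can be sketched as follows. This is a minimal illustration under one common abundance-weighting convention (pairwise distances weighted by the product of abundances for MPD, nearest distances weighted by each species' abundance for MNTD); the study's exact weighting is not given in the abstract, and all function names here are hypothetical.

```python
import statistics


def mpd(dist, abundance=None):
    """Mean pairwise trait distance, optionally abundance-weighted."""
    n = len(dist)
    w = abundance or [1.0] * n
    num = den = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                num += dist[i][j] * w[i] * w[j]
                den += w[i] * w[j]
    return num / den


def mntd(dist, abundance=None):
    """Mean distance to the nearest other species, optionally weighted."""
    n = len(dist)
    w = abundance or [1.0] * n
    nearest = [min(dist[i][j] for j in range(n) if j != i) for i in range(n)]
    return sum(a * d for a, d in zip(w, nearest)) / sum(w)


def standardized_effect_size(observed, null_values):
    """(observed - mean of null) / sd of null, from null-model draws."""
    return (observed - statistics.mean(null_values)) / statistics.stdev(null_values)
```

Here `dist` is a symmetric matrix of trait distances between the species in one community, and `null_values` would come from recomputing the index on randomized communities. A positive standardized effect size points to dispersed traits, a negative one to clustered traits.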
Dimitriadis, Stavros I; Liparas, Dimitris
2018-06-01
Neuroinformatics is a fascinating research field that applies computational models and analytical tools to high dimensional experimental neuroscience data for a better understanding of how the brain functions or dysfunctions in brain diseases. Neuroinformaticians work at the intersection of neuroscience and informatics, supporting the integration of various sub-disciplines (behavioural neuroscience, genetics, cognitive psychology, etc.) working on brain research. Neuroinformaticians are the pathway of information exchange between informaticians and clinicians for a better understanding of the outcome of computational models and the clinical interpretation of the analysis. Machine learning is one of the most significant computational developments of the last decade, giving tools to neuroinformaticians and ultimately to radiologists and clinicians for automatic and early diagnosis and prognosis of brain disease. The random forest (RF) algorithm has been successfully applied to high-dimensional neuroimaging data for feature reduction and has also been applied to classify the clinical label of a subject using single or multi-modal neuroimaging datasets. Our aim was to review the studies where RF was applied to predict Alzheimer's disease (AD) and conversion from mild cognitive impairment (MCI), and to review its robustness to overfitting and outliers and its handling of non-linear data. Finally, we described our RF-based model that gave us the 1st position in an international challenge for automated prediction of MCI from MRI data.
NASA Astrophysics Data System (ADS)
Deng, Chengbin; Wu, Changshan
2013-12-01
Urban impervious surface information is essential for urban and environmental applications at the regional/national scales. As a popular image processing technique, spectral mixture analysis (SMA) has rarely been applied to coarse-resolution imagery due to the difficulty of deriving endmember spectra using traditional endmember selection methods, particularly within heterogeneous urban environments. To address this problem, we derived endmember signatures through a least squares solution (LSS) technique with known abundances of sample pixels, and integrated these endmember signatures into SMA for mapping large-scale impervious surface fraction. In addition, with the same sample set, we carried out objective comparative analyses among SMA (i.e. fully constrained and unconstrained SMA) and machine learning (i.e. Cubist regression tree and Random Forests) techniques. Analysis of results suggests three major conclusions. First, with the extrapolated endmember spectra from stratified random training samples, the SMA approaches performed relatively well, as indicated by small MAE values. Second, Random Forests yields more reliable results than Cubist regression tree, and its accuracy is improved with increased sample sizes. Finally, comparative analyses suggest a tentative guide for selecting an optimal approach for large-scale fractional imperviousness estimation: unconstrained SMA might be a favorable option with a small number of samples, while Random Forests might be preferred if a large number of samples are available.
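The least squares step described in the abstract above can be written out as a short derivation. This is a sketch under the standard linear mixing model; the symbols are introduced here for illustration and are not the authors' notation. Let $X$ be the $n \times b$ matrix of spectra for $n$ sample pixels in $b$ bands, $A$ the $n \times k$ matrix of their known abundances for $k$ endmembers, and $E$ the unknown $k \times b$ matrix of endmember signatures.

```latex
% Linear mixing model for the sample pixels
X = A E + \varepsilon

% Least squares estimate of the endmember signatures from known abundances
\hat{E} = (A^{\top} A)^{-1} A^{\top} X

% Unconstrained SMA for a new pixel with spectrum x (a 1 x b row vector)
\hat{f} = x \hat{E}^{\top} \bigl( \hat{E} \hat{E}^{\top} \bigr)^{-1}

% Fully constrained SMA solves the same least squares problem subject to
\hat{f}_i \ge 0 \quad (i = 1, \dots, k), \qquad \sum_{i=1}^{k} \hat{f}_i = 1
```

The first estimate needs at least $k$ sample pixels with linearly independent abundance vectors so that $A^{\top} A$ is invertible, and the second needs $b \ge k$ linearly independent signatures. The constrained version has no closed form in general and is usually solved numerically.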
Prediction of aquatic toxicity mode of action using linear discriminant and random forest models.
Martin, Todd M; Grulke, Christopher M; Young, Douglas M; Russom, Christine L; Wang, Nina Y; Jackson, Crystal R; Barron, Mace G
2013-09-23
The ability to determine the mode of action (MOA) for a diverse group of chemicals is a critical part of ecological risk assessment and chemical regulation. However, existing MOA assignment approaches in ecotoxicology have been limited to relatively few MOAs, have high uncertainty, or rely on professional judgment. In this study, machine based learning algorithms (linear discriminant analysis and random forest) were used to develop models for assigning aquatic toxicity MOA. These methods were selected since they have been shown to be able to correlate diverse data sets and provide an indication of the most important descriptors. A data set of MOA assignments for 924 chemicals was developed using a combination of high confidence assignments, international consensus classifications, ASTER (ASsessment Tools for the Evaluation of Risk) predictions, and weight of evidence professional judgment based on an assessment of structure and literature information. The overall data set was randomly divided into a training set (75%) and a validation set (25%) and then used to develop linear discriminant analysis (LDA) and random forest (RF) MOA assignment models. The LDA and RF models had high internal concordance and specificity and were able to produce overall prediction accuracies ranging from 84.5 to 87.7% for the validation set. These results demonstrate that computational chemistry approaches can be used to determine the acute toxicity MOAs across a large range of structures and mechanisms.
Löfgren, Stefan; Fröberg, Mats; Yu, Jun; Nisell, Jakob; Ranneby, Bo
2014-12-01
From a policy perspective, it is important to understand forestry effects on surface waters at the landscape scale. The EU Water Framework Directive demands remedial actions if good ecological status is not achieved. In Sweden, 44 % of the surface water bodies have moderate ecological status or worse. Many of these drain catchments with a mosaic of managed forests. It is important for the forestry sector and water authorities to be able to identify where, in the forested landscape, special precautions are necessary. The aim of this study was to quantify the relations between forestry parameters and headwater stream concentrations of nutrients, organic matter and acid-base chemistry. The results are put into the context of regional climate, sulphur and nitrogen deposition, as well as marine influences. Water chemistry was measured in 179 randomly selected headwater streams from two regions in southwest and central Sweden, corresponding to 10 % of the Swedish land area. Forest status was determined from satellite images and Swedish National Forest Inventory data using the probabilistic classifier method, which was used to model stream water chemistry with Bayesian model averaging. The results indicate that concentrations of e.g. nitrogen, phosphorus and organic matter are related to factors associated with forest production but that it is not forestry per se that causes the excess losses. Instead, factors simultaneously affecting forest production and stream water chemistry, such as climate, extensive soil pools and nitrogen deposition, are the most likely candidates. The relationships with clear-felled and wetland areas are likely to be direct effects.
NASA Astrophysics Data System (ADS)
Rahmani, K.; Mayer, H.
2018-05-01
In this paper we present a pipeline for high quality semantic segmentation of building facades using Structured Random Forest (SRF), Region Proposal Network (RPN) based on a Convolutional Neural Network (CNN) as well as rectangular fitting optimization. Our main contribution is that we employ features created by the RPN as channels in the SRF. We empirically show that this is very effective, especially for doors and windows. Our pipeline is evaluated on two datasets where we outperform current state-of-the-art methods. Additionally, we quantify the contribution of the RPN and the rectangular fitting optimization to the accuracy of the result.
Bridging the gap between formal and experience-based knowledge for context-aware laparoscopy.
Katić, Darko; Schuck, Jürgen; Wekerle, Anna-Laura; Kenngott, Hannes; Müller-Stich, Beat Peter; Dillmann, Rüdiger; Speidel, Stefanie
2016-06-01
Computer assistance is increasingly common in surgery. However, the amount of information is bound to overload the processing abilities of surgeons. We propose methods to recognize the current phase of a surgery for context-aware information filtering. The purpose is to select the most suitable subset of information for surgical situations which require special assistance. We combine formal knowledge, represented by an ontology, and experience-based knowledge, represented by training samples, to recognize phases. For this purpose, we have developed two different methods. Firstly, we use formal knowledge about possible phase transitions to create a composition of random forests. Secondly, we propose a method based on cultural optimization to infer formal rules from experience to recognize phases. The proposed methods are compared with a purely formal knowledge-based approach using rules and a purely experience-based one using regular random forests. The comparative evaluation on laparoscopic pancreas resections and adrenalectomies employs a consistent set of quality criteria on clean and noisy input. The rule-based approaches proved best with noise-free data. The random forest-based ones were more robust in the presence of noise. Formal and experience-based knowledge can be successfully combined for robust phase recognition.
Guo, Rui; Wang, Yiqin; Yan, Hanxia; Yan, Jianjun; Yuan, Fengyin; Xu, Zhaoxia; Liu, Guoping; Xu, Wenjie
2015-01-01
Objective. This research provides objective and quantitative parameters of the traditional Chinese medicine (TCM) pulse conditions for distinguishing between patients with the coronary heart disease (CHD) and normal people by using the proposed classification approach based on Hilbert-Huang transform (HHT) and random forest. Methods. The energy and the sample entropy features were extracted by applying the HHT to TCM pulse by treating these pulse signals as time series. By using the random forest classifier, the extracted two types of features and their combination were, respectively, used as input data to establish classification model. Results. Statistical results showed that there were significant differences in the pulse energy and sample entropy between the CHD group and the normal group. Moreover, the energy features, sample entropy features, and their combination were inputted as pulse feature vectors; the corresponding average recognition rates were 84%, 76.35%, and 90.21%, respectively. Conclusion. The proposed approach could be appropriately used to analyze pulses of patients with CHD, which can lay a foundation for research on objective and quantitative criteria on disease diagnosis or Zheng differentiation. PMID:26180536
Studies of the DIII-D disruption database using Machine Learning algorithms
NASA Astrophysics Data System (ADS)
Rea, Cristina; Granetz, Robert; Meneghini, Orso
2017-10-01
A Random Forests Machine Learning algorithm, trained on a large database of both disruptive and non-disruptive DIII-D discharges, predicts disruptive behavior in DIII-D with about 90% accuracy. Several algorithms have been tested and Random Forests was found to perform best for this particular task. Over 40 plasma parameters are included in the database, with data for each of the parameters taken from 500k time slices. We focused on a subset of non-dimensional plasma parameters, deemed to be good predictors based on physics considerations. Both binary (disruptive/non-disruptive) and multi-label (label based on the elapsed time before disruption) classification problems are investigated. The Random Forests algorithm provides insight into the available dataset by ranking the relative importance of the input features. It is found that q95 and Greenwald density fraction (n/nG) are the most relevant parameters for discriminating between DIII-D disruptive and non-disruptive discharges. A comparison with the Gradient Boosted Trees algorithm is shown and the first results coming from the application of regression algorithms are presented. Work supported by the US Department of Energy under DE-FC02-04ER54698, DE-SC0014264 and DE-FG02-95ER54309.
Analysis of landslide hazard area in Ludian earthquake based on Random Forests
NASA Astrophysics Data System (ADS)
Xie, J.-C.; Liu, R.; Li, H.-W.; Lai, Z.-L.
2015-04-01
With the development of machine learning theory, more and more algorithms are being evaluated for seismic landslide assessment. After the Ludian earthquake, the research team combined the special geological structure of the Ludian area with the seismic field exploration results and selected slope (PODU), river distance (HL), fault distance (DC), seismic intensity (LD), the digital elevation model (DEM), and the normalized difference vegetation index (NDVI) derived from remote sensing images as evaluation factors. Because the relationships among these factors are fuzzy and the data are noisy and high-dimensional, we introduce the random forest algorithm to tolerate these difficulties and obtain the evaluation result for the Ludian landslide areas. To verify the accuracy of the result, ROC graphs were used as the evaluation standard: the AUC is 0.918, and the Out Of Bag (OOB) estimate of the random forest's generalization error rate decreases to 0.08 as the number of classification trees increases. From the final landslide inversion results, the paper reaches the statistical conclusion that nearly 80% of all landslides and dilapidations are in areas of high or moderate susceptibility, showing that the forecast results are reasonable and can be adopted.
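The AUC used as the evaluation standard here can be computed without drawing the ROC curve. A minimal sketch, assuming binary labels (1 = landslide, 0 = no landslide) and a susceptibility score per cell; the function name and the toy values are hypothetical. It uses the rank interpretation of AUC: the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one, with ties counted as one half.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positive and negative scores."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(positives) * len(negatives))


# Hypothetical scores for six cells, three with observed landslides.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)
```

In this toy case eight of the nine positive-negative pairs are ordered correctly, so the AUC is 8/9. The pairwise loop is quadratic in the sample count, which is fine for illustration but slow for full raster maps.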
2011-01-01
Background Dementia and cognitive impairment associated with aging are a major medical and social concern. Neuropsychological testing is a key element in the diagnostic procedures of Mild Cognitive Impairment (MCI), but presently has limited value in the prediction of progression to dementia. We advance the hypothesis that newer statistical classification methods derived from data mining and machine learning methods like Neural Networks, Support Vector Machines and Random Forests can improve accuracy, sensitivity and specificity of predictions obtained from neuropsychological testing. Seven nonparametric classifiers derived from data mining methods (Multilayer Perceptrons Neural Networks, Radial Basis Function Neural Networks, Support Vector Machines, CART, CHAID and QUEST Classification Trees and Random Forests) were compared to three traditional classifiers (Linear Discriminant Analysis, Quadratic Discriminant Analysis and Logistic Regression) in terms of overall classification accuracy, specificity, sensitivity, area under the ROC curve and Press' Q. Model predictors were 10 neuropsychological tests currently used in the diagnosis of dementia. Statistical distributions of classification parameters obtained from a 5-fold cross-validation were compared using Friedman's nonparametric test. Results Press' Q test showed that all classifiers performed better than chance alone (p < 0.05). Support Vector Machines showed the largest overall classification accuracy (Median (Me) = 0.76) and area under the ROC (Me = 0.90). However, this method showed high specificity (Me = 1.0) but low sensitivity (Me = 0.3). Random Forest ranked second in overall accuracy (Me = 0.73), with high area under the ROC (Me = 0.73), specificity (Me = 0.73) and sensitivity (Me = 0.64). Linear Discriminant Analysis also showed acceptable overall accuracy (Me = 0.66), with acceptable area under the ROC (Me = 0.72), specificity (Me = 0.66) and sensitivity (Me = 0.64).
The remaining classifiers showed overall classification accuracy above a median value of 0.63, but for most sensitivity was around or even lower than a median value of 0.5. Conclusions When taking into account sensitivity, specificity and overall classification accuracy Random Forests and Linear Discriminant analysis rank first among all the classifiers tested in prediction of dementia using several neuropsychological tests. These methods may be used to improve accuracy, sensitivity and specificity of Dementia predictions from neuropsychological testing. PMID:21849043
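The comparison criteria in this abstract can be made concrete with a small sketch. The counts below are hypothetical and only chosen to reproduce the pattern reported for one classifier (high specificity, low sensitivity); they are not the study's data. The Press' Q expression is the commonly cited form, (N - nK)^2 / (N(K - 1)), compared against the chi-square critical value with one degree of freedom; treat that form as an assumption, since the abstract does not spell it out.

```python
def binary_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and overall accuracy from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),          # true positive rate
        "specificity": tn / (tn + fp),          # true negative rate
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


def press_q(n_total, n_correct, n_groups):
    """Press' Q statistic: (N - n*K)^2 / (N * (K - 1))."""
    return (n_total - n_correct * n_groups) ** 2 / (n_total * (n_groups - 1))


# Hypothetical test fold: 10 converters to dementia, 20 non-converters.
metrics = binary_metrics(tp=3, fn=7, tn=20, fp=0)
q_value = press_q(n_total=30, n_correct=23, n_groups=2)

# Chi-square critical value for 1 degree of freedom at the 0.05 level.
CHI2_CRITICAL_1DF_005 = 3.841
better_than_chance = q_value > CHI2_CRITICAL_1DF_005
```

The toy classifier is right for 23 of 30 cases and passes the chance test, yet it detects only 3 of 10 positive cases. That is why the abstract weighs sensitivity and specificity alongside overall accuracy when ranking classifiers.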
On the information content of hydrological signatures and their relationship to catchment attributes
NASA Astrophysics Data System (ADS)
Addor, Nans; Clark, Martyn P.; Prieto, Cristina; Newman, Andrew J.; Mizukami, Naoki; Nearing, Grey; Le Vine, Nataliya
2017-04-01
Hydrological signatures, which are indices characterizing hydrologic behavior, are increasingly used for the evaluation, calibration and selection of hydrological models. Their key advantage is to provide more direct insights into specific hydrological processes than aggregated metrics (e.g., the Nash-Sutcliffe efficiency). A plethora of signatures now exists, which enables the characterization of a variety of hydrograph features but also makes the selection of signatures for new studies challenging. Here we propose that the selection of signatures should be based on their information content, which we estimated using several approaches, all leading to similar conclusions. To explore the relationship between hydrological signatures and the landscape, we extended a previously published data set of hydrometeorological time series for 671 catchments in the contiguous United States, by characterizing the climatic conditions, topography, soil, vegetation and stream network of each catchment. This new catchment attributes data set will soon be in open access, and we are looking forward to introducing it to the community. We used this data set in a data-learning algorithm (random forests) to explore whether hydrological signatures could be inferred from catchment attributes alone. We find that some signatures can be predicted remarkably well by random forests and, interestingly, the same signatures are well captured when simulating discharge using a conceptual hydrological model. We discuss what this result reveals about our understanding of hydrological processes shaping hydrological signatures. We also identify which catchment attributes exert the strongest control on catchment behavior, in particular during extreme hydrological events. Overall, climatic attributes have the most significant influence, and strongly condition how well hydrological signatures can be predicted by random forests and simulated by the hydrological model.
In contrast, soil characteristics at the catchment scale are not found to be significant predictors by random forests, which raises questions on how to best use soil data for hydrological modeling, for instance for parameter estimation. We finally demonstrate that signatures with high spatial variability are poorly captured by random forests and model simulations, which makes their regionalization delicate. We conclude with a ranking of signatures based on their information content, and propose that the signatures with high information content are best suited for model calibration, model selection and understanding hydrologic similarity.
Methodological Reporting of Randomized Trials in Five Leading Chinese Nursing Journals
Shi, Chunhu; Tian, Jinhui; Ren, Dan; Wei, Hongli; Zhang, Lihuan; Wang, Quan; Yang, Kehu
2014-01-01
Background Randomized controlled trials (RCTs) are not always well reported, especially in terms of their methodological descriptions. This study aimed to investigate the adherence of methodological reporting complying with CONSORT and explore associated trial level variables in the Chinese nursing care field. Methods In June 2012, we identified RCTs published in five leading Chinese nursing journals and included trials with details of randomized methods. The quality of methodological reporting was measured through the methods section of the CONSORT checklist and the overall CONSORT methodological items score was calculated and expressed as a percentage. Meanwhile, we hypothesized that some general and methodological characteristics were associated with reporting quality and conducted a regression with these data to explore the correlation. The descriptive and regression statistics were calculated via SPSS 13.0. Results In total, 680 RCTs were included. The overall CONSORT methodological items score was 6.34±0.97 (Mean ± SD). No RCT reported descriptions and changes in “trial design,” changes in “outcomes” and “implementation,” or descriptions of the similarity of interventions for “blinding.” Poor reporting was found in detailing the “settings of participants” (13.1%), “type of randomization sequence generation” (1.8%), calculation methods of “sample size” (0.4%), explanation of any interim analyses and stopping guidelines for “sample size” (0.3%), “allocation concealment mechanism” (0.3%), additional analyses in “statistical methods” (2.1%), and targeted subjects and methods of “blinding” (5.9%). More than 50% of trials described randomization sequence generation, the eligibility criteria of “participants,” “interventions,” and definitions of the “outcomes” and “statistical methods.” The regression analysis found that publication year and ITT analysis were weakly associated with CONSORT score. 
Conclusions The completeness of methodological reporting of RCTs in the Chinese nursing care field is poor, especially with regard to the reporting of trial design, changes in outcomes, sample size calculation, allocation concealment, blinding, and statistical methods. PMID:25415382
Liu, Quan; Ma, Li; Fan, Shou-Zen; Abbod, Maysam F; Shieh, Jiann-Shing
2018-01-01
Estimating the depth of anaesthesia (DoA) in operations has always been a challenging issue due to the underlying complexity of the brain mechanisms. Electroencephalogram (EEG) signals are undoubtedly the most widely used signals for measuring DoA. In this paper, a novel EEG-based index is proposed to evaluate DoA for 24 patients receiving general anaesthesia with different levels of unconsciousness. Sample Entropy (SampEn) algorithm was utilised in order to acquire the chaotic features of the signals. After calculating the SampEn from the EEG signals, Random Forest was utilised for developing learning regression models with Bispectral index (BIS) as the target. Correlation coefficient, mean absolute error, and area under the curve (AUC) were used to verify the perioperative performance of the proposed method. Validation comparisons with typical nonstationary signal analysis methods (i.e., recurrence analysis and permutation entropy) and regression methods (i.e., neural network and support vector machine) were conducted. To further verify the accuracy and validity of the proposed methodology, the data is divided into four unconsciousness-level groups on the basis of BIS levels. Subsequently, analysis of variance (ANOVA) was applied to the corresponding index (i.e., regression output). Results indicate that the correlation coefficient improved to 0.72 ± 0.09 after filtering and to 0.90 ± 0.05 after regression from the initial values of 0.51 ± 0.17. Similarly, the final mean absolute error dramatically declined to 5.22 ± 2.12. In addition, the ultimate AUC increased to 0.98 ± 0.02, and the ANOVA analysis indicates that each of the four groups of different anaesthetic levels demonstrated significant difference from the nearest levels. Furthermore, the Random Forest output was extensively linear in relation to BIS, thus with better DoA prediction accuracy. 
In conclusion, the proposed method provides a concrete basis for monitoring patients' anaesthetic level during surgeries.
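Sample entropy, the feature extracted from the EEG signals here, can be sketched in a few lines. This is a minimal, unoptimized illustration of the usual definition, SampEn = -ln(A / B), where B counts pairs of length-m templates within tolerance r (Chebyshev distance) and A counts the same for length m + 1. The function name, the absolute tolerance, and the toy sequence are assumptions; in practice r is often set relative to the signal's standard deviation, and the paper's exact settings are not given in the abstract.

```python
import math


def sample_entropy(signal, m=2, r=0.2):
    """Sample entropy of a 1-D sequence with template length m and absolute tolerance r."""
    n = len(signal)

    def count_matches(length):
        # Use n - m templates for both lengths so the two counts are comparable.
        templates = [signal[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):  # skip self-matches
                distance = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if distance <= r:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")  # undefined in the strict sense; no matches found
    return -math.log(a / b)


# A perfectly regular sequence has zero sample entropy.
regular_value = sample_entropy([0, 1] * 10)

# A short irregular toy sequence gives a positive value.
irregular_value = sample_entropy([0, 1, 0, 1, 0, 0, 1, 1, 0, 1])
```

For the irregular toy sequence there are six matching template pairs of length 2 and two of length 3, so the value is ln 3. Lower values indicate a more regular signal; the resulting numbers would then serve as inputs to the regression model.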
Hydrologic Landscape Regionalisation Using Deductive Classification and Random Forests
Brown, Stuart C.; Lester, Rebecca E.; Versace, Vincent L.; Fawcett, Jonathon; Laurenson, Laurie
2014-01-01
Landscape classification and hydrological regionalisation studies are being increasingly used in ecohydrology to aid in the management and research of aquatic resources. We present a methodology for classifying hydrologic landscapes based on spatial environmental variables by employing non-parametric statistics and hybrid image classification. Our approach differed from previous classifications which have required the use of an a priori spatial unit (e.g. a catchment) which necessarily results in the loss of variability that is known to exist within those units. The use of a simple statistical approach to identify an appropriate number of classes eliminated the need for large amounts of post-hoc testing with different number of groups, or the selection and justification of an arbitrary number. Using statistical clustering, we identified 23 distinct groups within our training dataset. The use of a hybrid classification employing random forests extended this statistical clustering to an area of approximately 228,000 km2 of south-eastern Australia without the need to rely on catchments, landscape units or stream sections. This extension resulted in a highly accurate regionalisation at both 30-m and 2.5-km resolution, and a less-accurate 10-km classification that would be more appropriate for use at a continental scale. A smaller case study, of an area covering 27,000 km2, demonstrated that the method preserved the intra- and inter-catchment variability that is known to exist in local hydrology, based on previous research. Preliminary analysis linking the regionalisation to streamflow indices is promising suggesting that the method could be used to predict streamflow behaviour in ungauged catchments. 
Our work therefore simplifies current classification frameworks that are becoming more popular in ecohydrology, while better retaining small-scale variability in hydrology, thus enabling future attempts to explain and visualise broad-scale hydrologic trends at the scale of catchments and continents. PMID:25396410
NASA Astrophysics Data System (ADS)
Molinario, G.
2015-12-01
Conflict in the Democratic Republic of Congo (DRC) and neighboring countries has caused the displacement of people internally and internationally, sometimes leading to drastic changes in the impact that traditional slash and burn shifting cultivation has on the forest ecosystem. In other areas, the lack of infrastructure and governance has isolated and protected areas of core forest from large scale exploitation. Observing specific patterns of forest fragmentation caused either by the expansion of existing rural complex areas or of isolated forest perforations has allowed us to track the differential growth of the human footprint throughout the forested area of the country during the period 2000-2010. Our methodological approach involved the development of a model of shifting cultivation and forest fragmentation in which spatial rules applied morphological image processing to the Forêts d'Afrique Centrale Évaluées par Télédétection (FACET) product. The result is a disaggregated classification of the primary forest into patch, edge, perforated, fragmented and core forest subtypes, which we subsequently re-aggregated into homogeneous anthropogenic macro-areas of rural complex and isolated forest perforations. We tracked how subsequent forest loss observed in 2005 and 2010 grew or shrank these areas, presumably with differential impacts on the forest ecosystem. Using this approach we were able to map forest degradation by contextualizing the contribution of forest loss to change in different types of areas, highlighting how it can be greatly underestimated by a non-contextualized per-pixel assessment of forest cover loss.
Wolfslehner, Bernhard; Seidl, Rupert
2010-12-01
The decision-making environment in forest management (FM) has changed drastically during the last decades. Forest management planning is facing increasing complexity due to a widening portfolio of forest goods and services, a societal demand for a rational, transparent decision process and rising uncertainties concerning future environmental conditions (e.g., climate change). Methodological responses to these challenges include an intensified use of ecosystem models to provide an enriched, quantitative information base for FM planning. Furthermore, multi-criteria methods are increasingly used to amalgamate information, preferences, expert judgments and value expressions, in support of the participatory and communicative dimensions of modern forestry. Although the potential of combining these two approaches has been demonstrated in a number of studies, methodological aspects in interfacing forest ecosystem models (FEM) and multi-criteria decision analysis (MCDA) are scarcely addressed explicitly. In this contribution we review the state of the art in FEM and MCDA in the context of FM planning and highlight some of the crucial issues when combining ecosystem and preference modeling. We discuss issues and requirements in selecting approaches suitable for supporting FM planning problems from the growing body of FEM and MCDA concepts. We furthermore identify two major challenges in a harmonized application of FEM-MCDA: (i) the design and implementation of an indicator-based analysis framework capturing ecological and social aspects and their interactions relevant for the decision process, and (ii) holistic information management that supports consistent use of different information sources, provides meta-information as well as information on uncertainties throughout the planning process.
NASA Astrophysics Data System (ADS)
García-Santos, Glenda; Madruga de Brito, Mariana; Höllermann, Britta; Taft, Linda; Almoradie, Adrian; Evers, Mariele
2018-06-01
Understanding the interactions between water resources and their social dimensions is crucial for effective and sustainable water management. The identification of sensitive control variables and feedback loops of a specific human-hydro-scape can enhance the knowledge about the potential factors and/or agents leading to the current state of water resources and ecosystems, which in turn supports decision-making about desirable futures. Our study presents the utility of a system dynamics modeling approach for water management and decision-making for the case of a forest ecosystem under risk of wildfires. We use the pluralistic water research concept to explore different scenarios and simulate the emergent behaviour of water interception and net precipitation after a wildfire in a forest ecosystem. Through a case study, we illustrate the applicability of this new methodology.
Daolan Zheng; Linda S. Heath; Mark J. Ducey; James E. Smith
2011-01-01
We examined spatial patterns of changes in forest area and nonsoil carbon (C) dynamics affected by land use/cover change (LUC) and harvests in 24 northern states of the United States using an integrated methodology combining remote sensing and ground inventory data between 1992 and 2001. We used the Retrofit Change Product from the Multi-Resolution Land Characteristics...
Effect of inventory method on niche models: random versus systematic error
Heather E. Lintz; Andrew N. Gray; Bruce McCune
2013-01-01
Data from large-scale biological inventories are essential for understanding and managing Earth's ecosystems. The Forest Inventory and Analysis Program (FIA) of the U.S. Forest Service is the largest biological inventory in North America; however, the FIA inventory recently changed from an amalgam of different approaches to a nationally-standardized approach in...
Marcus V. Warwell; Gerald E. Rehfeldt; Nicholas L. Crookston
2010-01-01
The Random Forests multiple regression tree was used to develop an empirically based bioclimatic model of the presence-absence of species occupying small geographic distributions in western North America. The species assessed were subalpine larch (Larix lyallii), smooth Arizona cypress (Cupressus arizonica ssp. glabra...
Determining soil erosion from roads in coastal plain of Alabama
McFero Grace; W.J. Elliot
2008-01-01
This paper reports soil losses and observed sediment deposition for 16 randomly selected forest road sections in the National Forests of Alabama. Visible sediment deposition zones were tracked along the stormwater flow path to the most remote location as a means of quantifying soil loss from road sections. Volumes of sediment in deposition zones were determined by...
Quantifying the abundance of co-occurring conifers along Inland Northwest (USA) climate gradients
Gerald E. Rehfeldt; Dennis E. Ferguson; Nicholas L. Crookston
2008-01-01
The occurrence and abundance of conifers along climate gradients in the Inland Northwest (USA) was assessed using data from 5082 field plots, 81% of which were forested. Analyses using the Random Forests classification tree revealed that the sequential distribution of species along an altitudinal gradient could be predicted with reasonable accuracy from a single...
Susan J. Crocker; Dacia M. Meneguzzo; Greg C. Liknes
2010-01-01
Landscape metrics, including host abundance and population density, were calculated using forest inventory and land cover data to assess the relationship between landscape pattern and the presence or absence of the emerald ash borer (EAB) (Agrilus planipennis Fairmaire). The Random Forests classification algorithm in the R statistical environment was...
Quantitative Trait Inheritance in a Forty-Year-Old Longleaf Pine Partial Diallel Test
Michael Stine; Jim Roberds; C. Dana Nelson; David P. Gwaze; Todd Shupe; Les Groom
2002-01-01
A longleaf pine (Pinus palustris Mill.) 13 parent partial diallel field experiment was established at two locations on the Harrison Experimental Forest in 1960. Parent trees were randomly selected from a natural population growing on the Harrison Experimental Forest, near Gulfport, Miss. Distance between trees chosen as parents ranged from 13 to 357...
Babcock, Chad; Finley, Andrew O.; Bradford, John B.; Kolka, Randall K.; Birdsey, Richard A.; Ryan, Michael G.
2015-01-01
Many studies and production inventory systems have shown the utility of coupling covariates derived from Light Detection and Ranging (LiDAR) data with forest variables measured on georeferenced inventory plots through regression models. The objective of this study was to propose and assess the use of a Bayesian hierarchical modeling framework that accommodates both residual spatial dependence and non-stationarity of model covariates through the introduction of spatial random effects. We explored this objective using four forest inventory datasets that are part of the North American Carbon Program, each comprising point-referenced measures of above-ground forest biomass and discrete LiDAR. For each dataset, we considered at least five regression model specifications of varying complexity. Models were assessed based on goodness of fit criteria and predictive performance using a 10-fold cross-validation procedure. Results showed that the addition of spatial random effects to the regression model intercept improved fit and predictive performance in the presence of substantial residual spatial dependence. Additionally, in some cases, allowing either some or all regression slope parameters to vary spatially, via the addition of spatial random effects, further improved model fit and predictive performance. In other instances, models showed improved fit but decreased predictive performance—indicating over-fitting and underscoring the need for cross-validation to assess predictive ability. The proposed Bayesian modeling framework provided access to pixel-level posterior predictive distributions that were useful for uncertainty mapping, diagnosing spatial extrapolation issues, revealing missing model covariates, and discovering locally significant parameters.
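The 10-fold cross-validation step mentioned in this abstract can be sketched as follows. This is only an illustration of the fold-and-hold-out procedure, not the authors' Bayesian spatial model: the data are synthetic, the stand-in model is a one-covariate least-squares line, and the names `fit_line` and `k_fold_rmse` are hypothetical.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x (a stand-in model)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def k_fold_rmse(xs, ys, k=10, seed=0):
    """Hold out each of k folds once and return the out-of-sample RMSE."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint folds covering all points
    sq_err = 0.0
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        sq_err += sum((ys[i] - (a + b * xs[i])) ** 2 for i in held_out)
    return math.sqrt(sq_err / len(xs))


# Synthetic data: a hypothetical LiDAR height metric vs. plot biomass.
rng = random.Random(42)
lidar = [rng.uniform(5, 30) for _ in range(100)]
biomass = [10 + 4 * h + rng.gauss(0, 5) for h in lidar]
cv_rmse = k_fold_rmse(lidar, biomass, k=10)
print(f"10-fold CV RMSE: {cv_rmse:.2f}")
```

Because every plot is predicted by a model that never saw it, a specification that improves in-sample fit but worsens this out-of-sample error is showing the over-fitting pattern the abstract describes.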
Hernández, Jaime; Núñez, Ignacia; Bacigalupo, Antonella; Cattan, Pedro E
2013-05-31
Chagas disease is caused by the protozoan Trypanosoma cruzi, which is transmitted to mammal hosts by triatomine insect vectors. The goal of this study was to model the spatial distribution of triatomine species in an endemic area. Vector's locations were obtained with a rural householders' survey. This information was combined with environmental data obtained from remote sensors, land use maps and topographic SRTM data, using the machine learning algorithm Random Forests to model species distribution. We analysed the combination of variables on three scales: 10 km, 5 km and 2.5 km cell size grids. The best estimation, explaining 46.2% of the triatomines spatial distribution, was obtained for 5 km of spatial resolution. Presence probability distribution increases from central Chile towards the north, tending to cover the central-coastal region and avoiding areas of the Andes range. The methodology presented here was useful to model the distribution of triatomines in an endemic area; it is best explained using 5 km of spatial resolution, and their presence increases in the northern part of the study area. This study's methodology can be replicated in other countries with Chagas disease or other vectorial transmitted diseases, and be used to locate high risk areas and to optimize resource allocation, for prevention and control of vectorial diseases.
2013-01-01
Background Chagas disease is caused by the protozoan Trypanosoma cruzi, which is transmitted to mammal hosts by triatomine insect vectors. The goal of this study was to model the spatial distribution of triatomine species in an endemic area. Methods Vector’s locations were obtained with a rural householders’ survey. This information was combined with environmental data obtained from remote sensors, land use maps and topographic SRTM data, using the machine learning algorithm Random Forests to model species distribution. We analysed the combination of variables on three scales: 10 km, 5 km and 2.5 km cell size grids. Results The best estimation, explaining 46.2% of the triatomines spatial distribution, was obtained for 5 km of spatial resolution. Presence probability distribution increases from central Chile towards the north, tending to cover the central-coastal region and avoiding areas of the Andes range. Conclusions The methodology presented here was useful to model the distribution of triatomines in an endemic area; it is best explained using 5 km of spatial resolution, and their presence increases in the northern part of the study area. This study’s methodology can be replicated in other countries with Chagas disease or other vectorial transmitted diseases, and be used to locate high risk areas and to optimize resource allocation, for prevention and control of vectorial diseases. PMID:23724993
Accounting for host cell protein behavior in anion-exchange chromatography.
Swanson, Ryan K; Xu, Ruo; Nettleton, Daniel S; Glatz, Charles E
2016-11-01
Host cell proteins (HCP) are a problematic set of impurities in downstream processing (DSP) as they behave most similarly to the target protein during separation. Approaching DSP with the knowledge of HCP separation behavior would be beneficial for the production of high purity recombinant biologics. Therefore, this work was aimed at characterizing the separation behavior of complex mixtures of HCP during a commonly used method: anion-exchange chromatography (AEX). An additional goal was to evaluate the performance of a statistical methodology, based on the characterization data, as a tool for predicting protein separation behavior. Aqueous two-phase partitioning followed by two-dimensional electrophoresis provided data on the three physicochemical properties most commonly exploited during DSP for each HCP: pI (isoelectric point), molecular weight, and surface hydrophobicity. The protein separation behaviors of two alternative expression host extracts (corn germ and E. coli) were characterized. A multivariate random forest (MVRF) statistical methodology was then applied to the database of characterized proteins creating a tool for predicting the AEX behavior of a mixture of proteins. The accuracy of the MVRF method was determined by calculating a root mean squared error value for each database. This measure never exceeded a value of 0.045 (fraction of protein populating each of the multiple separation fractions) for AEX. © 2016 American Institute of Chemical Engineers Biotechnol. Prog., 32:1453-1463, 2016. © 2016 American Institute of Chemical Engineers.
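The accuracy measure named in this abstract, a root mean squared error on the fraction of protein in each separation fraction, can be illustrated with a short sketch. The fraction values below are invented for illustration and are not taken from the study; `rmse` is a hypothetical helper name.

```python
import math


def rmse(predicted, observed):
    """Root mean squared error between two equal-length sequences."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have the same length")
    return math.sqrt(
        sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)
    )


# Hypothetical share of one protein found in each of four AEX fractions.
observed = [0.10, 0.60, 0.25, 0.05]
predicted = [0.12, 0.55, 0.28, 0.05]
error = rmse(predicted, observed)
print(f"RMSE = {error:.4f}")
```

Since the inputs are fractions between 0 and 1, the error is on the same scale, which is how a bound such as 0.045 can be read.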
Mo, Xiao-Xue; Shi, Ling-Ling; Zhang, Yong-Jiang; Zhu, Hua; Slik, J W Ferry
2013-01-01
Tropical rainforests in Southeast Asia are facing increasing and ever more intense human disturbance that often negatively affects biodiversity. The aim of this study was to determine how tree species phylogenetic diversity is affected by traditional forest management types and to understand the change in community phylogenetic structure during succession. Four types of forests with different management histories were selected for this purpose: old growth forests, understorey planted old growth forests, old secondary forests (∼200-years after slash and burn), and young secondary forests (15-50-years after slash and burn). We found that tree phylogenetic community structure changed from clustering to over-dispersion from early to late successional forests and finally became random in old-growth forest. We also found that the phylogenetic structure of the tree overstorey and understorey responded differentially to change in environmental conditions during succession. In addition, we show that slash and burn agriculture (swidden cultivation) can increase landscape level plant community evolutionary information content.
Mo, Xiao-Xue; Shi, Ling-Ling; Zhang, Yong-Jiang; Zhu, Hua; Slik, J. W. Ferry
2013-01-01
Tropical rainforests in Southeast Asia are facing increasing and ever more intense human disturbance that often negatively affects biodiversity. The aim of this study was to determine how tree species phylogenetic diversity is affected by traditional forest management types and to understand the change in community phylogenetic structure during succession. Four types of forests with different management histories were selected for this purpose: old growth forests, understorey planted old growth forests, old secondary forests (∼200-years after slash and burn), and young secondary forests (15–50-years after slash and burn). We found that tree phylogenetic community structure changed from clustering to over-dispersion from early to late successional forests and finally became random in old-growth forest. We also found that the phylogenetic structure of the tree overstorey and understorey responded differentially to change in environmental conditions during succession. In addition, we show that slash and burn agriculture (swidden cultivation) can increase landscape level plant community evolutionary information content. PMID:23936268
The structure of tropical forests and sphere packings
Jahn, Markus Wilhelm; Dobner, Hans-Jürgen; Wiegand, Thorsten; Huth, Andreas
2015-01-01
The search for simple principles underlying the complex architecture of ecological communities such as forests still challenges ecological theorists. We use tree diameter distributions—fundamental for deriving other forest attributes—to describe the structure of tropical forests. Here we argue that tree diameter distributions of natural tropical forests can be explained by stochastic packing of tree crowns representing a forest crown packing system: a method usually used in physics or chemistry. We demonstrate that tree diameter distributions emerge accurately from a surprisingly simple set of principles that include site-specific tree allometries, random placement of trees, competition for space, and mortality. The simple static model also successfully predicted the canopy structure, revealing that most trees in our two studied forests grow up to 30–50 m in height and that the highest packing density of about 60% is reached between the 25- and 40-m height layer. Our approach is an important step toward identifying a minimal set of processes responsible for generating the spatial structure of tropical forests. PMID:26598678
Forest structure in low-diversity tropical forests: a study of Hawaiian wet and dry forests.
Ostertag, Rebecca; Inman-Narahari, Faith; Cordell, Susan; Giardina, Christian P; Sack, Lawren
2014-01-01
The potential influence of diversity on ecosystem structure and function remains a topic of significant debate, especially for tropical forests where diversity can range widely. We used Center for Tropical Forest Science (CTFS) methodology to establish forest dynamics plots in montane wet forest and lowland dry forest on Hawai'i Island. We compared the species diversity, tree density, basal area, biomass, and size class distributions between the two forest types. We then examined these variables across tropical forests within the CTFS network. Consistent with other island forests, the Hawai'i forests were characterized by low species richness and very high relative dominance. The two Hawai'i forests were floristically distinct, yet similar in species richness (15 vs. 21 species) and stem density (3078 vs. 3486/ha). While these forests were selected for their low invasive species cover relative to surrounding forests, both forests averaged 5->50% invasive species cover; ongoing removal will be necessary to reduce or prevent competitive impacts, especially from woody species. The montane wet forest had much larger trees, resulting in eightfold higher basal area and above-ground biomass. Across the CTFS network, the Hawaiian montane wet forest was similar to other tropical forests with respect to diameter distributions, density, and aboveground biomass, while the Hawai'i lowland dry forest was similar in density to tropical forests with much higher diversity. These findings suggest that forest structural variables can be similar across tropical forests independently of species richness. The inclusion of low-diversity Pacific Island forests in the CTFS network provides an ∼80-fold range in species richness (15-1182 species), six-fold variation in mean annual rainfall (835-5272 mm yr(-1)) and 1.8-fold variation in mean annual temperature (16.0-28.4°C). 
Thus, the Hawaiian forest plots expand the global forest plot network to enable testing of ecological theory for links among species diversity, environmental variation and ecosystem function.
Forest Structure in Low-Diversity Tropical Forests: A Study of Hawaiian Wet and Dry Forests
Ostertag, Rebecca; Inman-Narahari, Faith; Cordell, Susan; Giardina, Christian P.; Sack, Lawren
2014-01-01
The potential influence of diversity on ecosystem structure and function remains a topic of significant debate, especially for tropical forests where diversity can range widely. We used Center for Tropical Forest Science (CTFS) methodology to establish forest dynamics plots in montane wet forest and lowland dry forest on Hawai‘i Island. We compared the species diversity, tree density, basal area, biomass, and size class distributions between the two forest types. We then examined these variables across tropical forests within the CTFS network. Consistent with other island forests, the Hawai‘i forests were characterized by low species richness and very high relative dominance. The two Hawai‘i forests were floristically distinct, yet similar in species richness (15 vs. 21 species) and stem density (3078 vs. 3486/ha). While these forests were selected for their low invasive species cover relative to surrounding forests, both forests averaged 5–>50% invasive species cover; ongoing removal will be necessary to reduce or prevent competitive impacts, especially from woody species. The montane wet forest had much larger trees, resulting in eightfold higher basal area and above-ground biomass. Across the CTFS network, the Hawaiian montane wet forest was similar to other tropical forests with respect to diameter distributions, density, and aboveground biomass, while the Hawai‘i lowland dry forest was similar in density to tropical forests with much higher diversity. These findings suggest that forest structural variables can be similar across tropical forests independently of species richness. The inclusion of low-diversity Pacific Island forests in the CTFS network provides an ∼80-fold range in species richness (15–1182 species), six-fold variation in mean annual rainfall (835–5272 mm yr−1) and 1.8-fold variation in mean annual temperature (16.0–28.4°C). 
Thus, the Hawaiian forest plots expand the global forest plot network to enable testing of ecological theory for links among species diversity, environmental variation and ecosystem function. PMID:25162731
Boehnke, Denise; Gebhardt, Reiner; Petney, Trevor; Norra, Stefan
2017-11-06
Ecological field research on the influence of meteorological parameters on a forest-inhabiting species is confronted with the complex relations between measured data and the real conditions the species is exposed to. This study highlights this complexity for the example of Ixodes ricinus. This species lives mainly in forest habitats near the ground, but field research on impacts of meteorological conditions on population dynamics is often based on data from nearby official weather stations or occasional in situ measurements. In addition, studies use very different data approaches to analyze comparable research questions. This study is an extensive examination of the methodology used to analyze the impact of meteorological parameters on Ixodes ricinus and proposes a methodological approach that tackles the underlying complexity. Our specifically developed measurement concept was implemented at 25 forest study sites across Baden-Württemberg, Germany. Meteorological weather stations recorded data in situ and continuously between summer 2012 and autumn 2015, including relative humidity measures in the litter layer and different heights above it (50 cm, 2 m). Hourly averages of relative humidity were calculated and compared with data from the nearest official weather station. Data measured directly in the forest can differ dramatically from conditions recorded at official weather stations. In general, data indicate a remarkable relative humidity decrease from inside to outside the forest and from ground to atmosphere. Relative humidity values measured in the litter layer were, on average, 24% higher than the official data and were much more balanced, especially in summer. The results illustrate the need for, and benefit of, continuous in situ measurements to grasp the complex relative humidity conditions in forests.
Data from official weather stations do not accurately represent actual humidity conditions in forest stands and the explanatory power of short period and fragmentary in situ measurements is extremely limited. However, which kind of meteorological data is necessary to answer specific questions in tick research remains an open question. The comparison of research findings was hindered by the variety of information provided, which is why we propose details for future reporting.
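The comparison described in this abstract, hourly averages of in situ relative humidity set against the nearest official station, might look like the sketch below. All readings, timestamps and the 20-minute logging interval are invented for illustration; the abstract does not state the logging interval or whether differences were computed in percentage points.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def hourly_means(readings):
    """Average (timestamp, value) readings within each clock hour."""
    buckets = defaultdict(list)
    for ts, value in readings:
        buckets[ts.replace(minute=0, second=0, microsecond=0)].append(value)
    return {hour: mean(vals) for hour, vals in sorted(buckets.items())}


# Hypothetical litter-layer readings (relative humidity, %).
litter = [
    (datetime(2013, 7, 1, 12, 0), 92.0),
    (datetime(2013, 7, 1, 12, 20), 90.0),
    (datetime(2013, 7, 1, 12, 40), 91.0),
    (datetime(2013, 7, 1, 13, 0), 88.0),
    (datetime(2013, 7, 1, 13, 20), 86.0),
    (datetime(2013, 7, 1, 13, 40), 87.0),
]
# Hypothetical hourly values from the nearest official station (%).
station = {
    datetime(2013, 7, 1, 12): 70.0,
    datetime(2013, 7, 1, 13): 65.0,
}

litter_hourly = hourly_means(litter)
# Compare only the hours present in both series.
differences = {
    hour: litter_hourly[hour] - station[hour]
    for hour in litter_hourly
    if hour in station
}
mean_difference = mean(differences.values())
print(f"Mean litter-minus-station difference: {mean_difference:.1f} points")
```

Aligning both series to the same clock hours before differencing keeps gaps in either record from biasing the comparison.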
NASA Astrophysics Data System (ADS)
Jonckheere, I. G.; FAO UN-REDD Team Forestry Department
2011-12-01
Reducing Emissions from Deforestation and Forest Degradation (REDD) is an effort to create a financial value for the carbon stored in forests, offering incentives for developing countries to reduce emissions from forested lands and invest in low-carbon paths to sustainable development. "REDD+" goes beyond deforestation and forest degradation, and includes the role of conservation, sustainable management of forests and enhancement of forest carbon stocks. In the framework of getting countries ready for REDD+, the UN-REDD Programme, a partnership between UNEP, FAO and UNDP, assists developing countries to prepare and implement national REDD+ strategies. Designed collaboratively by a broad range of stakeholders, national UN-REDD Programmes are informed by the technical expertise of FAO, UNDP and UNEP. For monitoring, reporting and verification, FAO supports countries in developing satellite forest monitoring systems that allow for credible measurement, reporting and verification (MRV) of REDD+ activities. These are among the most critical elements for the successful implementation of any REDD+ mechanism, also following the COP 16 decisions in Cancun last year. The UN-REDD Programme through a joint effort of FAO and Brazil's National Space Agency, INPE, is supporting countries to develop cost-effective, robust and compatible national monitoring and MRV systems, providing tools, methodologies, training and knowledge sharing that help countries to strengthen their technical and institutional capacity for effective MRV systems. To develop strong nationally-owned forest monitoring systems, technical and institutional capacity building is key. The UN-REDD Programme, through FAO, has taken on intensive training together with INPE, and has provided technical help and assistance for in-country training and implementation for national satellite forest monitoring.
The goal of the start-up phase for DRC and Papua New Guinea (PNG) in this capacity building effort is to train forestry technicians and IT staff from these two interested REDD+ countries and to set up the national satellite forest monitoring systems. The Brazilian forest monitoring system, TerraAmazon, which is used as a basis for this initiative, can be adapted to country needs, and training on the TerraAmazon system is a tool to enhance existing capacity on carbon monitoring systems. The start-up phase of the National Forest Monitoring System for DRC and PNG will allow these countries to follow all actions related to the implementation of their national REDD+ policies and measures. The monitoring system will work as a platform to obtain information on their REDD+ results and actions, related directly or indirectly to national REDD+ strategies, and may also include actions unrelated to carbon assessment, such as forest law enforcement. With the technical assistance of FAO, INPE and other stakeholders, the countries will set up an autonomous operational forest monitoring system. An initial version and the methodologies of these systems will be launched in Durban, South Africa during COP 17 and are presented here.
ERIC Educational Resources Information Center
Lacey, John H.; Kelley-Baker, Tara; Voas, Robert B.; Romano, Eduardo; Furr-Holden, C. Debra; Torres, Pedro; Berning, Amy
2011-01-01
This article describes the methodology used in the 2007 U.S. National Roadside Survey to estimate the prevalence of alcohol- and drug-impaired driving and alcohol- and drug-involved driving. This study involved randomly stopping drivers at 300 locations across the 48 continental U.S. states at sites selected through a stratified random sampling…
Lees, Alexander C; Peres, Carlos A
2008-04-01
Forest corridors are often considered the main instrument with which to offset the effects of habitat loss and fragmentation. Brazilian forestry legislation requires that all riparian zones on private landholdings be maintained as permanent reserves and sets fixed minimum widths of riparian forest buffers to be retained alongside rivers and perennial streams. We investigated the effects of corridor width and degradation status of 37 riparian forest sites (including 24 corridors connected to large source-forest patches, 8 unconnected forest corridors, and 5 control riparian zones embedded within continuous forest patches) on bird and mammal species richness in a hyper-fragmented forest landscape surrounding Alta Floresta, Mato Grosso, Brazil. We used point-count and track-sampling methodology, coupled with an intensive forest-quality assessment that combined satellite imagery and ground truthed data. Vertebrate use of corridors was highly species-specific, but broad trends emerged depending on species life histories and their sensitivity to disturbance. Narrow and/or highly disturbed riparian corridors retained only a depauperate vertebrate assemblage that was typical of deforested habitats, whereas wide, well-preserved corridors retained a nearly complete species assemblage. Restriction of livestock movement along riparian buffers and their exclusion from key areas alongside deforested streams would permit corridor regeneration and facilitate restoration of connectivity.
Manterola, Carlos; Torres, Rodrigo; Burgos, Luis; Vial, Manuel; Pineda, Viviana
2006-07-01
Surgery is a curative treatment for gastric cancer (GC). As relapse is frequent, adjuvant therapies such as postoperative chemoradiotherapy have been tried. In Chile, some hospitals adopted Macdonald's study as a protocol for the treatment of GC. The aim was to determine the methodological quality and the internal and external validity of the Macdonald study. Three instruments that assess methodological quality were applied. A critical appraisal was done and the internal and external validity of the methodological quality was analyzed with two scales: MINCIR (Methodology and Research in Surgery), valid for therapy studies, and CONSORT (Consolidated Standards of Reporting Trials), valid for randomized controlled trials (RCT). Guides and scales were applied by 5 researchers with training in clinical epidemiology. The reader's guide verified that the Macdonald study was not directed to answer a clearly defined question. There was random assignment, but the method used is not described and the patients were not considered until the end of the study (36% of the group with surgery plus chemoradiotherapy did not complete treatment). The MINCIR scale confirmed a multicentric RCT, not blinded, with an unclear randomized sequence, erroneous sample size estimation, vague objectives and no exclusion criteria. The CONSORT system showed the lack of a working hypothesis and specific objectives as well as an absence of exclusion criteria and identification of the primary variable, an imprecise estimation of sample size, ambiguities in the randomization process, no blinding, an absence of statistical adjustment and the omission of a subgroup analysis. The instruments applied demonstrated methodological shortcomings that compromise the internal and external validity of the.
Peh, Kelvin S.-H.; Sonké, Bonaventure; Séné, Olivier; Djuikouo, Marie-Noël K.; Nguembou, Charlemagne K.; Taedoumg, Hermann; Begne, Serge K.; Lewis, Simon L.
2014-01-01
Background Traits of non-dominant mixed-forest tree species and their synergies for successful co-occurrence in monodominant Gilbertiodendron dewevrei forest have not yet been investigated. Here we compared the tree species diversity of the monodominant forest with its adjacent mixed forest and then determined which fitness proxies and life history traits of the mixed-forest tree species were most associated with successful co-existence in the monodominant forest. Methodology/Principal Findings We sampled all trees (diameter at breast height [dbh] ≥10 cm) within 6×1 ha topographically homogenous areas of intact central African forest in SE Cameroon, three independent patches of G. dewevrei-dominated forest and three adjacent areas (450–800 m apart). Monodominant G. dewevrei forest had lower sample-controlled species richness, species density and population density than its adjacent mixed forest in terms of stems with dbh ≥10 cm. Analysis of a suite of population-level characteristics, such as relative abundance and geographical distribution, and traits such as wood density, height, diameter at breast height, fruit/seed dispersal mechanism and light requirement, revealed that, after controlling for phylogeny, species that co-occur with G. dewevrei tend to have higher abundance in adjacent mixed forest, higher wood density and a lower light requirement. Conclusions/Significance Our results suggest that certain traits (wood density and light requirement) and population-level characteristics (relative abundance) may increase the invasibility of a tree species into a tropical closed-canopy system. Such knowledge may assist in the pre-emptive identification of invasive tree species. PMID:24844914
2006-12-01
92–101. Bovee, K. D. 1982. A guide to stream habitat analysis using the instream flow incremental methodology. Instream Flow Information Paper No...Thames. 1991. Hydrology and the management of watersheds. Iowa State University Press, Ames, IA. Brown, J. K. 1974. Handbook for inventorying downed...woody material. General Technical Report INT-16, U.S. Department of Agriculture, Forest Service. Brown, J. K., R. D. Oberheu, and C. M. Johnston
NASA Technical Reports Server (NTRS)
Spanner, Michael A.; Pierce, Lars L.; Running, Steven W.; Peterson, David L.
1990-01-01
Consideration is given to the effects of canopy closure, understory vegetation, and background reflectance on the relationship between Landsat TM data and the leaf area index (LAI) of temperate coniferous forests in the western U.S. A methodology for correcting TM data for atmospheric conditions and sun-surface-sensor geometry is discussed. Strong inverse curvilinear relationships were found between coniferous forest LAI and TM bands 3 and 5. It is suggested that these inverse relationships are due to increased reflectance of understory vegetation and background in open stands of lower LAI and decreased reflectance of the overstory in closed canopy stands with higher LAI.
NASA Technical Reports Server (NTRS)
Neumann, Maxim; Hensley, Scott; Lavalle, Marco; Ahmed, Razi
2013-01-01
This paper concerns forest remote sensing using JPL's multi-baseline polarimetric interferometric UAVSAR data. It presents exemplary results and analyzes the possibilities and limitations of using SAR Tomography and Polarimetric SAR Interferometry (PolInSAR) techniques for the estimation of forest structure. Performance and error indicators for the applicability and reliability of the used multi-baseline (MB) multi-temporal (MT) PolInSAR random volume over ground (RVoG) model are discussed. Experimental results are presented based on JPL's L-band repeat-pass polarimetric interferometric UAVSAR data over temperate and tropical forest biomes in the Harvard Forest, Massachusetts, and in the La Amistad Park, Panama and Costa Rica. The results are partially compared with ground field measurements and with air-borne LVIS lidar data.
Nagasawa, Shinji; Al-Naamani, Eman; Saeki, Akinori
2018-05-17
Owing to the diverse chemical structures, organic photovoltaic (OPV) applications with a bulk heterojunction framework have greatly evolved over the last two decades, which has produced numerous organic semiconductors exhibiting improved power conversion efficiencies (PCEs). Despite the recent fast progress in materials informatics and data science, data-driven molecular design of OPV materials remains challenging. We report a screening of conjugated molecules for polymer-fullerene OPV applications by supervised learning methods (artificial neural network (ANN) and random forest (RF)). Approximately 1000 experimental parameters including PCE, molecular weight, and electronic properties are manually collected from the literature and subjected to machine learning with digitized chemical structures. Contrary to the low correlation coefficient in ANN, RF yields an acceptable accuracy, which is twice that of random classification. We demonstrate the application of RF screening for the design, synthesis, and characterization of a conjugated polymer, which facilitates a rapid development of optoelectronic materials.
Quantifying and mapping spatial variability in simulated forest plots
Gavin R. Corral; Harold E. Burkhart
2016-01-01
We used computer simulations to test the efficacy of multivariate statistical methods to detect, quantify, and map spatial variability of forest stands. Simulated stands were developed as regularly spaced plantations of loblolly pine (Pinus taeda L.). We assumed no effects of competition or mortality, but random variability was added to individual tree characteristics...
Courtney Flint; Hua Qin; Michael Daab
2008-01-01
The US Forest Service, Pacific Northwest Research Station funded research to assess community responses to forest disturbance by mountain pine beetles (Dendroctonus ponderosae) and public reaction to invasive plants in north central Colorado. In the spring of 2007, 4,027 16-page questionnaires were mailed to randomly selected households with addresses in Breckenridge,...
D. Jordan; F., Jr. Ponder; V. C. Hubbard
2003-01-01
A greenhouse study examined the effects of soil compaction and forest leaf litter on the growth and nitrogen (N) uptake and recovery of red oak (Quercus rubra L.) and scarlet oak (Quercus coccinea Muencch) seedlings and selected microbial activity over a 6-month period. The experiment had a randomized complete block design with...
Stemflow estimation in a redwood forest using model-based stratified random sampling
Jack Lewis
2003-01-01
Model-based stratified sampling is illustrated by a case study of stemflow volume in a redwood forest. The approach is actually a model-assisted sampling design in which auxiliary information (tree diameter) is utilized in the design of stratum boundaries to optimize the efficiency of a regression or ratio estimator. The auxiliary information is utilized in both the...
Estimating erosion risk on forest lands using improved methods of discriminant analysis
J. Lewis; R. M. Rice
1990-01-01
A population of 638 timber harvest areas in northwestern California was sampled for data related to the occurrence of critical amounts of erosion (>153 m3 within 0.81 ha). Separate analyses were done for forest roads and logged areas. Linear discriminant functions were computed in each analysis to contrast site conditions at critical plots with randomly selected...
Sample-based estimation of tree species richness in a wet tropical forest compartment
Steen Magnussen; Raphael Pelissier
2007-01-01
Petersen's capture-recapture ratio estimator and the well-known bootstrap estimator are compared across a range of simulated low-intensity simple random sampling with fixed-area plots of 100 m² in a rich wet tropical forest compartment with 93 tree species in the Western Ghats of India. Petersen's ratio estimator was uniformly superior to the bootstrap...
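As a rough illustration of the capture-recapture idea named in this abstract, the sketch below applies the classical Lincoln-Petersen ratio to species lists from two independent plot samples. The function name and the toy species lists are hypothetical, and the exact estimator form used by the authors may differ.

```python
def petersen_richness(sample_a, sample_b):
    """Lincoln-Petersen style estimate of total species richness.

    sample_a, sample_b: iterables of species names observed in two
    independent samples (hypothetical inputs for illustration).
    """
    a, b = set(sample_a), set(sample_b)
    shared = len(a & b)  # species "recaptured" in the second sample
    if shared == 0:
        raise ValueError("no shared species; the ratio is undefined")
    return len(a) * len(b) / shared


# Invented toy data: 6 species in the first sample, 5 in the second, 3 shared.
first = ["sp1", "sp2", "sp3", "sp4", "sp5", "sp6"]
second = ["sp4", "sp5", "sp6", "sp7", "sp8"]
estimate = petersen_richness(first, second)  # 6 * 5 / 3
```

The fewer species the two samples share, the larger the estimated richness, which is why the ratio is undefined when nothing is shared.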
Rates and Implications of Rainfall Interception in a Coastal Redwood Forest
Leslie M. Reid; Jack Lewis
2007-01-01
Throughfall was measured for a year at five-min intervals in 11 collectors randomly located on two plots in a second-growth redwood forest at the Caspar Creek Experimental Watersheds. Monitoring at one plot continued two more years, during which stemflow from 24 trees was also measured. Comparison of throughfall and stemflow to rainfall measured in adjacent clearings...
Jonatha L. Horton; Barton D. Clinton; John F. Walker; Colin M. Beir; Erik T. Nilsen
2009-01-01
Ericaceous shrubs can influence soil properties in many ecosystems. In this study, we examined how soil and forest floor properties vary among sites with different ericaceous evergreen shrub basal area in the southern Appalachian mountains. We randomly located plots along transects that included open understories and understories with varying amounts of Rhododendron...
Shannon L. Savage; Rick L. Lawrence; John R. Squires
2015-01-01
Ecological and land management applications would often benefit from maps of relative canopy cover of each species present within a pixel, instead of traditional remote-sensing based maps of either dominant species or percent canopy cover without regard to species composition. Widely used statistical models for remote sensing, such as randomForest (RF),...
'Pygmy' old-growth redwood characteristics on an edaphic ecotone in Mendocino County, California
Will Russell; Suzie Woolhouse
2012-01-01
The 'pygmy forest' is a specialized community that is adapted to highly acidic, hydrophobic, nutrient deprived soils, and exists in pockets within the coast redwood forest in Mendocino County. While coast redwood is known as an exceptionally tall tree, stunted trees exhibit unusual growth-forms on pygmy soils. We used a stratified random sampling procedure to...
Ecological impacts and management strategies for western larch in the face of climate-change
Gerald E. Rehfeldt; Barry C. Jaquish
2010-01-01
Approximately 185,000 forest inventory and ecological plots from both USA and Canada were used to predict the contemporary distribution of western larch (Larix occidentalis Nutt.) from climate variables. The random forests algorithm, using an 8-variable model, produced an overall error rate of about 2.9 %, nearly all of which consisted of predicting presence at...
Simulation of long-term landscape-level fuel treatment effects on large wildfires
Mark A. Finney; Rob C. Seli; Charles W. McHugh; Alan A. Ager; Bernhard Bahro; James K. Agee
2008-01-01
A simulation system was developed to explore how fuel treatments placed in topologically random and optimal spatial patterns affect the growth and behaviour of large fires when implemented at different rates over the course of five decades. The system consisted of a forest and fuel dynamics simulation module (Forest Vegetation Simulator, FVS), logic for deriving fuel...
NASA Astrophysics Data System (ADS)
Soulard, C. E.; Acevedo, W.; Yang, Z.; Cohen, W. B.; Stehman, S. V.; Taylor, J. L.
2015-12-01
A wide range of spatial forest disturbance data exist for the conterminous United States, yet inconsistencies between map products arise because of differing programmatic objectives and methodologies. Researchers on the Land Change Research Project (LCRP) are working to assess spatial agreement, characterize uncertainties, and resolve discrepancies between these national level datasets, in regard to forest disturbance. Disturbance maps from the Global Forest Change (GFC), Landfire Vegetation Disturbance (LVD), National Land Cover Dataset (NLCD), Vegetation Change Tracker (VCT), Web-enabled Landsat Data (WELD), and Monitoring Trends in Burn Severity (MTBS) were harmonized using a pixel-based data fusion process. The harmonization process reconciled forest harvesting, forest fire, and remaining forest disturbance across four intervals (1986-1992, 1992-2001, 2001-2006, and 2006-2011) by relying on convergence of evidence across all datasets available for each interval. Pixels with high agreement across datasets were retained, while moderate-to-low agreement pixels were visually assessed and either manually edited using reference imagery or discarded from the final disturbance map(s). National results show that annual rates of forest harvest and overall fire have increased over the past 25 years. Overall, this study shows that leveraging the best elements of readily-available data improves forest loss monitoring relative to using a single dataset to monitor forest change, particularly by reducing commission errors.
Foster, Jane R.; D'Amato, Anthony W.; Bradford, John B.
2014-01-01
Forest biomass growth is almost universally assumed to peak early in stand development, near canopy closure, after which it will plateau or decline. The chronosequence and plot remeasurement approaches used to establish the decline pattern suffer from limitations and coarse temporal detail. We combined annual tree ring measurements and mortality models to address two questions: first, how do assumptions about tree growth and mortality influence reconstructions of biomass growth? Second, under what circumstances does biomass production follow the model that peaks early, then declines? We integrated three stochastic mortality models with a census tree-ring data set from eight temperate forest types to reconstruct stand-level biomass increments (in Minnesota, USA). We compared growth patterns among mortality models, forest types and stands. Timing of peak biomass growth varied significantly among mortality models, peaking 20–30 years earlier when mortality was random with respect to tree growth and size, than when mortality favored slow-growing individuals. Random or u-shaped mortality (highest in small or large trees) produced peak growth 25–30 % higher than the surviving tree sample alone. Growth trends for even-aged, monospecific Pinus banksiana or Acer saccharum forests were similar to the early peak and decline expectation. However, we observed continually increasing biomass growth in older, low-productivity forests of Quercus rubra, Fraxinus nigra, and Thuja occidentalis. Tree-ring reconstructions estimated annual changes in live biomass growth and identified more diverse development patterns than previous methods. These detailed, long-term patterns of biomass development are crucial for detecting recent growth responses to global change and modeling future forest dynamics.
Fernández-de-las-Peñas, César; Alonso-Blanco, Cristina; San-Roman, Jesús; Miangolarra-Page, Juan C
2006-03-01
Literature review of quality of clinical trials. To determine the methodological quality of published randomized controlled trials that used spinal manipulation and/or mobilization to treat patients with tension-type headache (TTH), cervicogenic headache (CeH), and migraine (M) in the last decade. TTH, CeH, and M are the most prevalent types of headaches seen in adults. Individuals who have headaches frequently use physical therapy, manual therapy, or chiropractic care. Randomized controlled trials are considered an optimal method with which to assess the efficacy of any intervention. Computerized literature searches were performed in MEDLINE, EMBASE, COCHRANE, AMED, MANTIS, CINAHL, and PEDro databases. Randomized controlled trials in which spinal manipulation and/or mobilization had been used for TTH, CeH, and M published in a peer-reviewed journal as full text, and with at least 1 clinically relevant outcome measure (ie, headache intensity, duration, or frequency) were reviewed. The methodological quality of the studies was assessed independently by 2 reviewers using a set of predefined criteria. Only 8 studies met all the inclusion criteria. One clinical trial evaluated spinal manipulation and mobilization together, and the remaining 7 assessed spinal manipulative therapy. No controlled trials analyzing exclusively the effects of spinal mobilization were found. Methodological scores ranged from 35 to 56 points out of a theoretical maximum of 100 points, indicating an overall poor methodology of the studies. Only 2 studies obtained a high-quality score (greater than 50 points). No significant differences in quality scores were found based on the type of headache investigated. Methodological quality was not associated with the year of publication (before 2000, or later) nor with the results (positive, neutral, negative) reported in the studies. 
The most common flaws were a small sample size, the absence of a placebo control group, lack of blinded patients, and no description of the manipulative procedure. There are few published randomized controlled trials analyzing the effectiveness of spinal manipulation and/or mobilization for TTH, CeH, and M in the last decade. In addition, the methodological quality of these papers is typically low. Clearly, there is a need for high-quality randomized controlled trials assessing the effectiveness of these interventions in these headache disorders.
NASA Astrophysics Data System (ADS)
Kaskhedikar, Apoorva Prakash
According to the U.S. Energy Information Administration, commercial buildings represent about 40% of the United States' energy consumption, of which office buildings consume a major portion. Gauging the extent to which an individual building consumes energy in excess of its peers is the first step in initiating energy efficiency improvement. Energy benchmarking offers initial building energy performance assessment without rigorous evaluation. Energy benchmarking tools based on the Commercial Buildings Energy Consumption Survey (CBECS) database are investigated in this thesis. This study proposes a new benchmarking methodology based on decision trees, where a relationship between the energy use intensities (EUI) and building parameters (continuous and categorical) is developed for different building types. This methodology was applied to medium office and school building types contained in the CBECS database. The Random Forest technique was used to find the most influential parameters that impact building energy use intensities. Subsequently, significant correlations were identified between EUIs and CBECS variables. Other than floor area, some of the important variables were number of workers, location, number of PCs and main cooling equipment. The coefficient of variation was used to evaluate the effectiveness of the new model. The customization technique proposed in this thesis was compared with another benchmarking model that is widely used by building owners and designers, namely ENERGY STAR's Portfolio Manager. This tool relies on standard linear regression methods, which are only able to handle continuous variables. The proposed model uses a data mining technique and was found to perform slightly better than the Portfolio Manager. 
The broader impact of the proposed benchmarking methodology is that it allows important categorical variables to be identified and then incorporated in a local, as opposed to a global, model framework for EUI pertinent to the building type. The ability to identify and rank the important variables is of great importance in practical implementation of the benchmarking tools which rely on query-based building and HVAC variable filters specified by the user.
Hybrid analysis for indicating patients with breast cancer using temperature time series.
Silva, Lincoln F; Santos, Alair Augusto S M D; Bravo, Renato S; Silva, Aristófanes C; Muchaluat-Saade, Débora C; Conci, Aura
2016-07-01
Breast cancer is the most common cancer among women worldwide. Diagnosis and treatment in early stages increase cure chances. The temperature of cancerous tissue is generally higher than that of healthy surrounding tissues, making thermography an option to be considered in screening strategies of this cancer type. This paper proposes a hybrid methodology for analyzing dynamic infrared thermography in order to indicate patients with risk of breast cancer, using unsupervised and supervised machine learning techniques, which characterizes the methodology as hybrid. The dynamic infrared thermography monitors or quantitatively measures temperature changes on the examined surface, after a thermal stress. In the dynamic infrared thermography execution, a sequence of breast thermograms is generated. In the proposed methodology, this sequence is processed and analyzed by several techniques. First, the region of the breasts is segmented and the thermograms of the sequence are registered. Then, temperature time series are built and the k-means algorithm is applied on these series using various values of k. Clustering formed by k-means algorithm, for each k value, is evaluated using clustering validation indices, generating values treated as features in the classification model construction step. A data mining tool was used to solve the combined algorithm selection and hyperparameter optimization (CASH) problem in classification tasks. Besides the classification algorithm recommended by the data mining tool, classifiers based on Bayesian networks, neural networks, decision rules and decision tree were executed on the data set used for evaluation. Test results support that the proposed analysis methodology is able to indicate patients with breast cancer. Among 39 tested classification algorithms, K-Star and Bayes Net presented 100% classification accuracy. 
Furthermore, among the Bayes Net, multi-layer perceptron, decision table and random forest classification algorithms, an average accuracy of 95.38% was obtained. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Selection of forest canopy gaps by male Cerulean Warblers in West Virginia
Perkins, Kelly A.; Wood, Petra Bohall
2014-01-01
Forest openings, or canopy gaps, are an important resource for many forest songbirds, such as Cerulean Warblers (Setophaga cerulea). We examined canopy gap selection by this declining species to determine if male Cerulean Warblers selected particular sizes, vegetative heights, or types of gaps. We tested whether these parameters differed among territories, territory core areas, and randomly-placed sample plots. We used enhanced territory mapping techniques (burst sampling) to define habitat use within the territory. Canopy gap densities were higher within core areas of territories than within territories or random plots, indicating that Cerulean Warblers selected habitat within their territories with the highest gap densities. Selection of regenerating gaps with woody vegetation >12 m within the gap, and canopy heights >24 m surrounding the gap, occurred within territory core areas. These findings differed between two sites, indicating that gap selection may vary based on forest structure. Differences were also found regarding the placement of territories with respect to gaps. Larger gaps, such as wildlife food plots, were located on the periphery of territories more often than other types and sizes of gaps, while smaller gaps, such as treefalls, were located within territory boundaries more often than expected. The creation of smaller canopy gaps (<100 m²) within dense stands is likely compatible with forest management for this species.
Ramírez, J; Górriz, J M; Segovia, F; Chaves, R; Salas-Gonzalez, D; López, M; Alvarez, I; Padilla, P
2010-03-19
This letter presents a computer-aided diagnosis (CAD) technique for the early detection of Alzheimer's disease (AD) by means of single photon emission computed tomography (SPECT) image classification. The proposed method is based on a partial least squares (PLS) regression model and a random forest (RF) predictor. The curse of dimensionality is addressed by reducing the large dimensionality of the input data: the SPECT images are downscaled and score features are extracted using PLS. An RF predictor then forms an ensemble of classification and regression tree (CART)-like classifiers, with its output determined by a majority vote of the trees in the forest. A baseline principal component analysis (PCA) system is also developed for reference. The experimental results show that the combined PLS-RF system yields a generalization error that converges to a limit when increasing the number of trees in the forest. Thus, the generalization error is reduced when using PLS and depends on the strength of the individual trees in the forest and the correlation between them. Moreover, PLS feature extraction is found to be more effective than PCA for extracting discriminative information from the data, yielding peak sensitivity, specificity and accuracy values of 100%, 92.7%, and 96.9%, respectively. In addition, the proposed CAD system outperformed several other recently developed AD CAD systems. Copyright 2010 Elsevier Ireland Ltd. All rights reserved.
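The majority-vote step described in this abstract can be sketched in a few lines. This is only an illustration of how per-tree labels are combined, not the authors' implementation. The function name and the toy votes are hypothetical.

```python
from collections import Counter


def majority_vote(tree_predictions):
    """Combine per-tree class labels into one forest prediction.

    tree_predictions: list with one predicted label per tree.
    Returns the winning label and the share of trees that voted for it.
    """
    counts = Counter(tree_predictions)
    label, votes = counts.most_common(1)[0]
    return label, votes / len(tree_predictions)


# Invented votes from a five-tree forest for a single image.
votes = ["AD", "normal", "AD", "AD", "normal"]
label, share = majority_vote(votes)
```

The vote share can serve as a crude confidence score for the forest's decision.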
Evaluating Plot Designs for the Tropics
Paul C. van Deusen; Bruce Bayle
1991-01-01
Theory and procedures are reviewed for determining the best type of plot for a given forest inventory. A general methodology is given that clarifies the relationship between different plot designs and the associated methods to produce the inventory estimates.
Schindler, Dirk; Grebhan, Karin; Albrecht, Axel; Schönborn, Jochen; Kohnle, Ulrich
2012-01-01
Data on storm damage attributed to the two high-impact winter storms 'Wiebke' (28 February 1990) and 'Lothar' (26 December 1999) were used for GIS-based estimation and mapping (in a 50 × 50 m resolution grid) of the winter storm damage probability (P(DAM)) for the forests of the German federal state of Baden-Wuerttemberg (Southwest Germany). The P(DAM)-calculation was based on weights of evidence (WofE) methodology. A combination of information on forest type, geology, soil type, soil moisture regime, and topographic exposure, as well as maximum gust wind speed field was used to compute P(DAM) across the entire study area. Given the condition that maximum gust wind speed during the two storm events exceeded 35 m s⁻¹, the highest P(DAM) values computed were primarily where coniferous forest grows in severely exposed areas on temporarily moist soils on bunter sandstone formations. Such areas are found mainly in the mountainous ranges of the northern Black Forest, the eastern Forest of Odes, in the Virngrund area, and in the southwestern Alpine Foothills.
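For readers unfamiliar with weights of evidence, the following minimal sketch shows the standard calculation for a single binary evidence layer and how the weight updates a prior damage probability. It assumes the textbook WofE formulation with conditionally independent layers. The function names and cell counts are invented for illustration and are not taken from the study.

```python
import math


def weights_of_evidence(n_bd, n_d, n_b, n_total):
    """Positive and negative weights for one binary evidence layer.

    n_bd: damaged cells where the evidence is present
    n_d: all damaged cells
    n_b: all cells where the evidence is present
    n_total: all cells in the study area
    """
    n_nd = n_total - n_d              # undamaged cells
    p_b_d = n_bd / n_d                # P(evidence | damage)
    p_b_nd = (n_b - n_bd) / n_nd      # P(evidence | no damage)
    w_plus = math.log(p_b_d / p_b_nd)
    w_minus = math.log((1 - p_b_d) / (1 - p_b_nd))
    return w_plus, w_minus


def posterior_probability(prior, weights):
    """Add the applicable weights to the prior log-odds of damage."""
    logit = math.log(prior / (1 - prior)) + sum(weights)
    return 1 / (1 + math.exp(-logit))


# Toy counts: 1000 cells, 100 damaged, 300 conifer cells, 60 of them damaged.
w_plus, w_minus = weights_of_evidence(n_bd=60, n_d=100, n_b=300, n_total=1000)
p_dam = posterior_probability(prior=100 / 1000, weights=[w_plus])
```

With a single layer the result equals the observed damage share among cells carrying the evidence (60 of 300). With several layers, one weight per layer is summed for each grid cell.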
NASA Astrophysics Data System (ADS)
Zafari, A.; Zurita-Milla, R.; Izquierdo-Verdiguier, E.
2017-10-01
Crop maps are essential inputs for the agricultural planning done at various governmental and agribusiness agencies. Remote sensing offers timely and cost-efficient technologies to identify and map crop types over large areas. Among the plethora of classification methods, Support Vector Machine (SVM) and Random Forest (RF) are widely used because of their proven performance. In this work, we study the synergic use of both methods by introducing a random forest kernel (RFK) in an SVM classifier. A time series of multispectral WorldView-2 images acquired over Mali (West Africa) in 2014 was used to develop our case study. Ground truth containing five common crop classes (cotton, maize, millet, peanut, and sorghum) was collected at 45 farms and used to train and test the classifiers. An SVM with the standard Radial Basis Function (RBF) kernel, an RF, and an SVM-RFK were trained and tested over 10 random training and test subsets generated from the ground data. Results show that the newly proposed SVM-RFK classifier can compete with both RF and SVM-RBF. The overall accuracies based on the spectral bands only are 83, 82 and 83%, respectively. Adding vegetation indices to the analysis results in classification accuracies of 82, 81 and 84% for SVM-RFK, RF, and SVM-RBF, respectively. Overall, it can be observed that the newly tested RFK can compete with SVM-RBF and RF classifiers in terms of classification accuracy.
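The abstract does not spell out how the random forest kernel is built. One common construction, assumed here, takes the kernel value between two samples to be the share of trees in which they fall into the same terminal node. The sketch below computes that matrix from precomputed leaf indices. The function name and the toy leaf table are hypothetical.

```python
def rf_kernel(leaves_x, leaves_y):
    """Proximity-style kernel between two sets of samples.

    leaves_x[i][t] is the index of the leaf that sample i reaches in
    tree t (hypothetical precomputed input from an already fitted forest).
    Returns K where K[i][j] is the share of trees in which sample i of
    leaves_x and sample j of leaves_y land in the same leaf.
    """
    n_trees = len(leaves_x[0])
    return [
        [sum(a == b for a, b in zip(row_x, row_y)) / n_trees
         for row_y in leaves_y]
        for row_x in leaves_x
    ]


# Invented leaf indices for three samples in a four-tree forest.
leaves = [
    [0, 2, 1, 3],
    [0, 2, 4, 3],
    [5, 1, 4, 0],
]
K = rf_kernel(leaves, leaves)
```

A matrix of this kind could then be handed to an SVM implementation that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`.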
Review of Recent Methodological Developments in Group-Randomized Trials: Part 2-Analysis.
Turner, Elizabeth L; Prague, Melanie; Gallis, John A; Li, Fan; Murray, David M
2017-07-01
In 2004, Murray et al. reviewed methodological developments in the design and analysis of group-randomized trials (GRTs). We have updated that review with developments in analysis of the past 13 years, with a companion article to focus on developments in design. We discuss developments in the topics of the earlier review (e.g., methods for parallel-arm GRTs, individually randomized group-treatment trials, and missing data) and in new topics, including methods to account for multiple-level clustering and alternative estimation methods (e.g., augmented generalized estimating equations, targeted maximum likelihood, and quadratic inference functions). In addition, we describe developments in analysis of alternative group designs (including stepped-wedge GRTs, network-randomized trials, and pseudocluster randomized trials), which require clustering to be accounted for in their design and analysis.
Diversity of Medicinal Plants among Different Forest-use Types of the Pakistani Himalaya.
Adnan, Muhammad; Hölscher, Dirk
2012-12-01
Medicinal plants collected in Himalayan forests play a vital role in the livelihoods of regional rural societies and are also increasingly recognized at the international level. However, these forests are being heavily transformed by logging. Here we ask how forest transformation influences the diversity and composition of medicinal plants in northwestern Pakistan, where we studied old-growth forests, forests degraded by logging, and regrowth forests. First, an approximate map indicating these forest types was established and then 15 study plots per forest type were randomly selected. We found a total of 59 medicinal plant species consisting of herbs and ferns, most of which occurred in the old-growth forest. Species number was lowest in forest degraded by logging and intermediate in regrowth forest. The most valuable economic species, including six Himalayan endemics, occurred almost exclusively in old-growth forest. Species composition and abundance of forest degraded by logging differed markedly from that of old-growth forest, while regrowth forest was more similar to old-growth forest. The density of medicinal plants positively correlated with tree canopy cover in old-growth forest and negatively in degraded forest, which indicates that species adapted to open conditions dominate in logged forest. Thus, old-growth forests are important as refuge for vulnerable endemics. Forest degraded by logging has the lowest diversity of relatively common medicinal plants. Forest regrowth may foster the reappearance of certain medicinal species valuable to local livelihoods and as such promote acceptance of forest expansion and medicinal plants conservation in the region. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1007/s12231-012-9213-4) contains supplementary material, which is available to authorized users.
The Impact of Afforestation on the Carbon Stocks of Mineral Soils Across the Republic of Ireland.
NASA Astrophysics Data System (ADS)
Wellock, M.; Laperle, C.; Kiely, G.; Reidy, B.; Duffy, C.; Tobin, B.
2009-04-01
At the beginning of the twentieth century forests accounted for only 1% of the total Irish land cover (Pilcher & Mac an tSaoir, 1995). However, due to the efforts of successive governments there has been rapid afforestation since the 1960s resulting in a 10.0% forest land cover as of 2007 (The Department of Agriculture, Fisheries, and Food, 2007). A large proportion of this afforestation took place after the mid-1980s and was fueled by government grant incentive schemes targeted at private landowners (Renou & Farrell 2005). Consequently, 54% of forests are less than 20 years old (Byrne, 2006). This specific land use change provides an opportunity for Ireland to meet international obligations set forth by the United Nations Framework Convention on Climate Change (UNFCCC, 1992). These obligations include the limitation of greenhouse gas emissions to 13% above 1990 levels. In order to promote accountability for these commitments, the UNFCCC treaty and the Kyoto Protocol (Kyoto Protocol, 1997) mandate signatories to publish greenhouse gas (GHG) emissions inventories for both greenhouse gas sources and removals by sinks. Article 3.3 of the Kyoto Protocol allows changes in C stocks due to afforestation, reforestation, and deforestation since 1990 to be used to offset inventory emissions. Therefore, due to the rapid rate of afforestation and its increased carbon sequestration since 1990, Ireland has the potential to significantly offset GHG emissions. There is little known as to the impacts of afforestation on the carbon stocks in soils over time, and even less known about the impact on Irish soils. The FORESTC project aims to analyse this impact by undertaking a nationwide study using a method similar to that of the paired plot method in Davis and Condron, 2002. The study will examine 42 forest sites across Ireland selected randomly from the National Forest Inventory (National Forest Inventory, 2007). 
These 42 sites will be grouped based on the forest type which includes conifer, broadleaf, and mixed (broadleaf and conifer) and soil type: brown earth, podzol, brown podzolic, gley and brown earth. The paired plot method involves selecting a second site that represents the same soil type and physical characteristics as the forest site. The only difference between the two sites should be the current land-use of the pair site, which should represent the pre-afforestation land-use of the forest site. Each forest site and its pair site will be sampled in the top 30 cm of soil for bulk density and organic carbon %, while litter and F/H layer samples will be taken and analysed for carbon. This data should provide an analysis of the carbon stocks of the soil and litter of both the forest site and its pair site allowing for comparison and thus the impact of afforestation on carbon stocks. References. Byrne, K.A., & Milne, R. (2006). Carbon stocks and sequestration in plantation forests in the Republic of Ireland. Forestry, 79, no. 4: 361. Davis, M.R., & Condron, L.M. (2002). Impact of grassland afforestation on soil carbon in New Zealand: a review of paired-site studies. Australian Journal of Soil Research, 40, no. 4: 675-690. Kyoto Protocol. (1997). Kyoto Protocol to the United Nations Framework Convention on Climate Change. FCCC/CP/1997/7/Add.1, Decision 1/CP.3, Annex 7. UN. National Forest Inventory: NFI Methodology. (2007). Forest Service, The Department of Agriculture, Fisheries, and Food, Wexford, Ireland. Pilcher, J.R. & Mac an tSaoir, S. (1995). Wood, Trees and Forests in Ireland. Royal Irish Academy, Dublin. Renou, F. & Farrell, E.P. (2005). Reclaiming peatlands for forestry: the Irish experience. In: Stanturf, J.A. and Madsen, P.A. (eds.). Restoration of boreal and temperate forests. CRC Press, Boca Raton. p.541-557. UNFCCC. (1992). United Nations Framework Convention on Climate Change. Palais des Nations, Geneva. http://www.unfccc.de/index.html
Mhaskar, Rahul; Djulbegovic, Benjamin; Magazin, Anja; Soares, Heloisa P.; Kumar, Ambuj
2011-01-01
Objectives: To assess whether the reported methodological quality of randomized controlled trials (RCTs) reflects the actual methodological quality, and to evaluate the association of effect size (ES) and sample size with methodological quality. Study design: Systematic review. Setting: Retrospective analysis of all consecutive phase III RCTs published by 8 National Cancer Institute Cooperative Groups until year 2006. Data were extracted from protocols (actual quality) and publications (reported quality) for each study. Results: 429 RCTs met the inclusion criteria. Overall reporting of methodological quality was poor and did not reflect the actual high methodological quality of RCTs. The results showed no association between sample size and actual methodological quality of a trial. Poor reporting of allocation concealment and blinding exaggerated the ES by 6% (ratio of hazard ratio [RHR]: 0.94, 95%CI: 0.88, 0.99) and 24% (RHR: 1.24, 95%CI: 1.05, 1.43), respectively. However, actual quality assessment showed no association between ES and methodological quality. Conclusion: The largest study to date shows that poor quality of reporting does not reflect the actual high methodological quality. Assessment of the impact of quality on the ES based on reported quality can produce misleading results. PMID:22424985
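The percentages quoted alongside the ratios of hazard ratios follow from the distance of each RHR from 1, the value that indicates no difference. A small worked sketch, with a hypothetical helper name:

```python
def percent_shift(rhr):
    """Relative shift in effect size implied by a ratio of hazard ratios.

    An RHR of 1.0 means reported quality made no difference; the distance
    from 1.0, expressed in percent, is the size of the shift.
    """
    return abs(rhr - 1.0) * 100


# Values quoted in the abstract above.
concealment = round(percent_shift(0.94))  # allocation concealment
blinding = round(percent_shift(1.24))     # blinding
```

Rounding is applied only because binary floating point cannot represent 0.94 and 1.24 exactly.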
Hydrological processes in major types of Chinese forest
NASA Astrophysics Data System (ADS)
Wei, X.; Liu, S.; Zhou, G.; Wang, C.
2005-01-01
Overexploitation of forest resources in China has caused serious concerns over its negative impacts on water resources, biodiversity, soil erosion, wildlife habitat and community stability. One key concern is the impact of forestry practices on hydrological processes, particularly the effect of forest harvest on water quality and quantity. Since the mid 1980s, a series of scientific studies on forest hydrology have been initiated in major types of forest across the country, including Korean pine (Pinus koraiensis), Chinese fir (Cunninghamia lanceolata), oak (Quercus mongolica), larch (Larix gmelinii), faber fir (Abies fabri), Chinese pine (Pinus tabulaeformis), armand pine (Pinus armandii), birch (Betula platyphylla) and some tropical forests. These studies measured rainfall interception, streamflow, evapotranspiration and impacts of forest management (clearcutting and reforestation). This paper reviews key findings from these forest hydrological studies conducted over the past 20 years in China.
NASA Astrophysics Data System (ADS)
Beguet, Benoit; Guyon, Dominique; Boukir, Samia; Chehata, Nesrine
2014-10-01
The main goal of this study is to design a method to describe the structure of forest stands from Very High Resolution (VHR) satellite imagery, relying on some typical variables such as crown diameter, tree height, trunk diameter, tree density and tree spacing. The emphasis is placed on automating the identification of the most relevant image features for the forest structure retrieval task, exploiting both spectral and spatial information. Our approach is based on linear regressions between the forest structure variables to be estimated and various spectral and Haralick's texture features. The main drawback of this well-known texture representation is its underlying parameters, which are extremely difficult to set due to the spatial complexity of the forest structure. To tackle this major issue, an automated feature selection process is proposed which is based on statistical modeling, exploring a wide range of parameter values. It provides texture measures of diverse spatial parameters, hence implicitly inducing a multi-scale texture analysis. A new feature selection technique, which we call Random PRiF, is proposed. It relies on random sampling in feature space and carefully addresses the multicollinearity issue in multiple linear regression while ensuring accurate prediction of forest variables. Our automated forest variable estimation scheme was tested on Quickbird and Pléiades panchromatic and multispectral images, acquired at different periods on the maritime pine stands of two sites in South-Western France. It outperforms two well-established variable subset selection techniques. It has been successfully applied to identify the best texture features in modeling the five considered forest structure variables.
The RMSE of all predicted forest variables is improved by combining multispectral and panchromatic texture features, with various parameterizations, highlighting the potential of a multi-resolution approach for retrieving forest structure variables from VHR satellite images. Thus an average prediction error of ~1.1 m is expected on crown diameter, ~0.9 m on tree spacing, ~3 m on height and ~0.06 m on diameter at breast height.
NASA Astrophysics Data System (ADS)
Suiter, Ashley Elizabeth
Multi-spectral imagery provides a robust and low-cost dataset for assessing wetland extent and quality over broad regions and is frequently used for wetland inventories. However, in forested wetlands, hydrology is obscured by tree canopy, making it difficult to detect with multi-spectral imagery alone. Because of this, classification of forested wetlands often includes greater errors than that of other wetland types. Elevation and terrain derivatives have been shown to be useful for modelling wetland hydrology, but few studies have addressed the use of LiDAR intensity data for detecting hydrology in forested wetlands. Due to the tendency of the LiDAR signal to be attenuated by water, this research proposed the fusion of LiDAR intensity data with LiDAR elevation, terrain data, and aerial imagery for the detection of forested wetland hydrology. We examined the utility of LiDAR intensity data and determined whether the fusion of LiDAR-derived data with multi-spectral imagery increased the accuracy of forested wetland classification compared with a classification performed with only multi-spectral imagery. Four classifications were performed: Classification A -- All Imagery, Classification B -- All LiDAR, Classification C -- LiDAR without Intensity, and Classification D -- Fusion of All Data. These classifications were performed using random forest and each resulted in a 3-foot resolution thematic raster of forested upland and forested wetland locations in Vermilion County, Illinois. The accuracies of these classifications were compared using the Kappa Coefficient of Agreement. Importance statistics produced within the random forest classifier were evaluated in order to understand the contribution of individual datasets. Classification D, which used the fusion of LiDAR and multi-spectral imagery as input variables, had moderate to strong agreement between reference data and classification results.
It was found that Classification B, performed using all the LiDAR data and its derivatives (intensity, elevation, slope, aspect, curvatures, and Topographic Wetness Index), was the most accurate classification with Kappa: 78.04%, indicating moderate to strong agreement. However, Classification C, performed with LiDAR derivatives without intensity data, had less agreement than would be expected by chance, indicating that LiDAR intensity contributed significantly to the accuracy of Classification B.
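The Kappa Coefficient of Agreement used to compare the classifications above can be illustrated with a minimal sketch. The function name `cohen_kappa` and both confusion matrices are hypothetical and are not taken from the study; they only show how kappa relates observed agreement to the agreement expected by chance, and how a value below zero corresponds to "less agreement than would be expected by chance". A kappa reported as 78.04% corresponds to 0.7804 on this scale.

```python
def cohen_kappa(confusion):
    """Cohen's kappa for a square confusion matrix.

    Rows are reference classes, columns are predicted classes.
    """
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: share of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(k)) / total
    # Chance agreement: product of the marginal totals, summed over classes.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / total ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical counts: [upland, wetland] reference rows vs. predicted columns.
good_map = [[45, 5], [10, 40]]    # mostly correct classification
poor_map = [[10, 40], [40, 10]]   # mostly wrong classification

kappa_good = cohen_kappa(good_map)
kappa_poor = cohen_kappa(poor_map)
print(round(kappa_good, 4), round(kappa_poor, 4))
```

With these made-up counts the first matrix gives a clearly positive kappa, while the second gives a negative one, the situation the abstract describes for Classification C.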
Cadogan, Beresford L; Scharbach, Roger D
2003-04-01
A field trial using true replicates was conducted successfully in a boreal forest in 1996 to evaluate the efficacy of two aerially applied Bacillus thuringiensis formulations, ABG 6429 and ABG 6430. A completely randomized design with four replicates per treatment was chosen. Twelve to 15 balsam fir (Abies balsamea [L.] Mill.) per plot were randomly selected as sample trees. Interplot buffer zones, ≥ 200 m wide, adequately prevented cross contamination from sprays that were atomized with four rotary atomizers (volume median diameters ranging from 64.6 to 139.4 µm) and released approximately 30 m above the ground. The B. thuringiensis formulations were not significantly different (P > 0.05) from each other in reducing spruce budworm (Choristoneura fumiferana [Clem.]) populations and protecting balsam trees from defoliation, but both formulations were significantly more efficacious than the controls. The results suggest that true replicates are a feasible alternative to pseudoreplication in experimental forest aerial applications.
Enhancing Multimedia Imbalanced Concept Detection Using VIMP in Random Forests.
Sadiq, Saad; Yan, Yilin; Shyu, Mei-Ling; Chen, Shu-Ching; Ishwaran, Hemant
2016-07-01
Recent developments in social media and cloud storage have led to exponential growth in the amount of multimedia data, which increases the complexity of managing, storing, indexing, and retrieving information from such big data. Many current content-based concept detection approaches fall short of bridging the semantic gap. To solve this problem, a multi-stage random forest framework is proposed to generate predictor variables based on multivariate regressions using variable importance (VIMP). By fine-tuning the forests and significantly reducing the predictor variables, the concept detection scores are evaluated when the concept of interest is rare and imbalanced, i.e., having little collaboration with other high-level concepts. Using classical multivariate statistics, estimating the value of one coordinate using other coordinates standardizes the covariates, and it depends upon the variance of the correlations instead of the mean. Thus, conditional dependence on the data being normally distributed is eliminated. Experimental results demonstrate that the proposed framework outperforms the compared approaches in terms of Mean Average Precision (MAP) values.
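Variable importance (VIMP) in random forests is commonly estimated by permuting one predictor and measuring how much the prediction error grows. The sketch below shows that general idea only; it is not the multi-stage framework described above. The names `mse` and `permutation_importance`, the toy data, and the stand-in `predict` function (used in place of a fitted forest) are all assumptions made for illustration, and a real forest would typically compute this on out-of-bag samples per tree.

```python
import random


def mse(predict, rows, targets):
    """Mean squared error of a prediction function over a dataset."""
    return sum((predict(r) - y) ** 2 for r, y in zip(rows, targets)) / len(rows)


def permutation_importance(predict, rows, targets, column, seed=0):
    """Increase in MSE after shuffling the values of one predictor column."""
    rng = random.Random(seed)
    baseline = mse(predict, rows, targets)
    shuffled = [r[column] for r in rows]
    rng.shuffle(shuffled)
    # Rebuild each row with the shuffled value substituted in the chosen column.
    permuted = [r[:column] + [v] + r[column + 1:] for r, v in zip(rows, shuffled)]
    return mse(predict, permuted, targets) - baseline


# Toy data (hypothetical): column 0 drives the target, column 1 is noise.
rows = [[float(i), float((i * 7) % 10)] for i in range(10)]
targets = [2.0 * r[0] for r in rows]


def predict(row):
    """Stand-in for a fitted model; it only uses column 0."""
    return 2.0 * row[0]


vimp_signal = permutation_importance(predict, rows, targets, column=0)
vimp_noise = permutation_importance(predict, rows, targets, column=1)
print(vimp_signal, vimp_noise)
```

A predictor whose permutation leaves the error unchanged, like column 1 here, is a candidate for removal when reducing the predictor set.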
Cai, Tianxi; Karlson, Elizabeth W.
2013-01-01
Objectives: To test whether data extracted from full-text patient visit notes from an electronic medical record (EMR) would improve the classification of psoriatic arthritis (PsA) compared to an algorithm based on codified data. Methods: From the > 1,350,000 adults in a large academic EMR, all 2318 patients with a billing code for PsA were extracted and 550 were randomly selected for chart review and algorithm training. Using codified data and phrases extracted from narrative data using natural language processing (NLP), 31 predictors were extracted and three random forest algorithms were trained using coded, narrative, and combined predictors. The receiver operating characteristic (ROC) curve was used to identify the optimal algorithm, and a cut point was chosen to achieve the maximum sensitivity possible at a 90% positive predictive value (PPV). The algorithm was then used to classify the remaining 1768 charts and finally validated in a random sample of 300 cases predicted to have PsA. Results: The PPV of a single PsA code was 57% (95%CI 55%–58%). Using a combination of coded data and NLP, the random forest algorithm reached a PPV of 90% (95%CI 86%–93%) at a sensitivity of 87% (95%CI 83%–91%) in the training data. The PPV was 93% (95%CI 89%–96%) in the validation set. Adding NLP predictors to codified data increased the area under the ROC curve (p < 0.001). Conclusions: Using NLP with text notes from electronic medical records improved the performance of the prediction algorithm significantly. Random forests were a useful tool to accurately classify psoriatic arthritis cases to enable epidemiological research. PMID:20701955
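Choosing a cut point that maximizes sensitivity subject to a minimum PPV, as described in the Methods above, can be sketched as follows. This is an illustrative assumption of how such a search might look, not the authors' procedure; the function name `pick_cut_point`, the scores, and the labels are hypothetical.

```python
def pick_cut_point(scores, labels, target_ppv=0.90):
    """Return (threshold, ppv, sensitivity) with the highest sensitivity
    among thresholds whose PPV meets the target, or None if none does.

    A case is predicted positive when its score is >= threshold.
    Assumes labels are 0/1 and at least one label is 1.
    """
    positives = sum(labels)
    best = None
    for threshold in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        if tp + fp == 0:
            continue  # nothing predicted positive at this threshold
        ppv = tp / (tp + fp)
        sensitivity = tp / positives
        if ppv >= target_ppv and (best is None or sensitivity > best[2]):
            best = (threshold, ppv, sensitivity)
    return best


# Hypothetical classifier scores and chart-review labels (1 = true case).
scores = [0.95, 0.90, 0.85, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20]
labels = [1, 1, 1, 1, 1, 0, 1, 0, 0, 0]

cut = pick_cut_point(scores, labels, target_ppv=0.90)
print(cut)
```

Lowering the threshold raises sensitivity but admits more false positives, so the search keeps the lowest threshold that still satisfies the PPV constraint.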