Science.gov

Sample records for classification tree approach

  1. The decision tree approach to classification

    NASA Technical Reports Server (NTRS)

    Wu, C.; Landgrebe, D. A.; Swain, P. H.

    1975-01-01

    A class of multistage decision tree classifiers is proposed and studied relative to the classification of multispectral remotely sensed data. The decision tree classifiers are shown to have the potential for improving both the classification accuracy and the computation efficiency. Dimensionality in pattern recognition is discussed and two theorems on the lower bound of logic computation for multiclass classification are derived. The automatic or optimization approach is emphasized. Experimental results on real data are reported, which clearly demonstrate the usefulness of decision tree classifiers.

  2. Learning classification trees

    NASA Technical Reports Server (NTRS)

    Buntine, Wray

    1991-01-01

    Algorithms for learning classification trees have had successes in artificial intelligence and statistics over many years. How a tree learning algorithm can be derived from Bayesian decision theory is outlined. This introduces Bayesian techniques for splitting, smoothing, and tree averaging. The splitting rule turns out to be similar to Quinlan's information gain splitting rule, while smoothing and averaging replace pruning. Comparative experiments with reimplementations of a minimum encoding approach, Quinlan's C4, and Breiman et al.'s CART show that the full Bayesian algorithm is consistently as good as, or more accurate than, these other approaches, though at a computational price.
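
For readers unfamiliar with the information gain rule mentioned in this abstract, the following is a minimal sketch of its textbook form: the entropy of the parent node minus the size-weighted entropy of the child nodes. It is not the Bayesian splitting rule of the paper; the function names and the toy labels are assumptions made for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(labels, groups):
    """Parent entropy minus the size-weighted entropy of the child groups."""
    n = len(labels)
    weighted = sum(len(g) / n * entropy(g) for g in groups)
    return entropy(labels) - weighted

# Hypothetical toy node with two classes, split in two different ways.
parent = ["a", "a", "b", "b"]
perfect_split = [["a", "a"], ["b", "b"]]   # children are pure
useless_split = [["a", "b"], ["a", "b"]]   # children are as mixed as the parent

gain_perfect = information_gain(parent, perfect_split)
gain_useless = information_gain(parent, useless_split)
```

A split that separates the classes completely recovers the full parent entropy, while a split that leaves the class mix unchanged gains nothing, which is why tree learners prefer the former.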

  3. Decision tree approach for classification of remotely sensed satellite data using open source support

    NASA Astrophysics Data System (ADS)

    Sharma, Richa; Ghosh, Aniruddha; Joshi, P. K.

    2013-10-01

    In this study, an attempt has been made to develop a decision tree classification (DTC) algorithm for classification of remotely sensed satellite data (Landsat TM) using open source support. The decision tree is constructed by recursively partitioning the spectral distribution of the training dataset using WEKA, open source data mining software. The classified image is compared with the images classified using the classical ISODATA clustering and Maximum Likelihood Classifier (MLC) algorithms. The classification result based on the DTC method provided a better visual depiction than the results produced by ISODATA clustering or by the MLC algorithm. The overall accuracy was found to be 90% (kappa = 0.88) using the DTC, 76.67% (kappa = 0.72) using the MLC, and 57.5% (kappa = 0.49) using the ISODATA clustering method. Based on the overall accuracy and kappa statistics, DTC was found to be the preferred classification approach.
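
The overall accuracy and kappa statistics quoted in this abstract are normally derived from a confusion matrix. The sketch below shows the standard definitions (observed agreement, and Cohen's kappa as agreement corrected for chance). The matrix values are hypothetical and are not the study's data; the function name is an assumption.

```python
def overall_accuracy_and_kappa(confusion):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] counts samples of reference class i assigned to class j.
    """
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

# Hypothetical two-class confusion matrix (not taken from the study).
example = [
    [45, 5],
    [5, 45],
]
accuracy, kappa = overall_accuracy_and_kappa(example)
```

Kappa is lower than overall accuracy whenever some agreement would be expected by chance, which is why the two figures are usually reported together.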

  4. Tree Classification Software

    NASA Technical Reports Server (NTRS)

    Buntine, Wray

    1993-01-01

    This paper introduces the IND Tree Package to prospective users. IND does supervised learning using classification trees. This learning task is a basic tool used in the development of diagnosis, monitoring and expert systems. The IND Tree Package was developed as part of a NASA project to semi-automate the development of data analysis and modelling algorithms using artificial intelligence techniques. The IND Tree Package integrates features from CART and C4 with newer Bayesian and minimum encoding methods for growing classification trees and graphs. The IND Tree Package also provides an experimental control suite on top. The newer features give improved probability estimates often required in diagnostic and screening tasks. The package comes with a manual, Unix 'man' entries, and a guide to tree methods and research. The IND Tree Package is implemented in C under Unix and was beta-tested at university and commercial research laboratories in the United States.

  5. Event-based prediction of stream turbidity using a combined cluster analysis and classification tree approach

    NASA Astrophysics Data System (ADS)

    Mather, Amanda L.; Johnson, Richard L.

    2015-11-01

    Stream turbidity typically increases during streamflow events; however, similar event hydrographs can produce markedly different event turbidity behaviors because many factors influence turbidity in addition to streamflow, including antecedent moisture conditions, season, and supply of turbidity-causing materials. Modeling of sub-hourly turbidity as a function of streamflow shows that event model parameters vary on an event-by-event basis. Here we examine the extent to which stream turbidity can be predicted through the prediction of event model parameters. Using three mid-sized streams from the Mid-Atlantic region of the U.S., we show the model parameter set for each event can be predicted based on the event characteristics (e.g., hydrologic, meteorologic and antecedent moisture conditions) using a combined cluster analysis and classification tree approach. The results suggest that the ratio of beginning event discharge to peak event discharge (an estimate of the event baseflow index), as well as catchment antecedent moisture, are important factors in the prediction of event turbidity. Indicators of antecedent moisture, particularly those derived from antecedent discharge, account for the majority of the splitting nodes in the classification trees for all three streams. For this study, prediction of turbidity during streamflow events is based upon observed data (e.g., measured streamflow, precipitation and air temperature). However, the results also suggest that the methods presented here can, in future work, be used in conjunction with forecasts of streamflow, precipitation and air temperature to forecast stream turbidity.

  6. Predictive Classification Trees

    NASA Astrophysics Data System (ADS)

    Dlugosz, Stephan; Müller-Funk, Ulrich

    CART (Breiman et al., Classification and Regression Trees, Chapman and Hall, New York, 1984) and (exhaustive) CHAID (Kass, Appl Stat 29:119-127, 1980) figure prominently among the procedures actually used in data based management, etc. CART is a well-established procedure that produces binary trees. CHAID, in contrast, admits multiple splittings, a feature that allows the splitting variable to be exploited more extensively. On the other hand, that procedure depends on premises that are questionable in practical applications. This can be put down to the fact that CHAID relies on simultaneous chi-square or F-tests, respectively. The null distribution of the second test statistic, for instance, relies on the normality assumption, which is not plausible in a data mining context. Moreover, none of these procedures - as implemented in SPSS, for instance - takes ordinal dependent variables into account. In the paper we suggest an alternative tree-algorithm that: Requires explanatory categorical variables

  7. Snow event classification with a 2D video disdrometer - A decision tree approach

    NASA Astrophysics Data System (ADS)

    Bernauer, F.; Hürkamp, K.; Rühm, W.; Tschiersch, J.

    2016-05-01

    Snowfall classification according to crystal type or degree of riming of the snowflakes is important for many atmospheric processes, e.g. wet deposition of aerosol particles. 2D video disdrometers (2DVD) have recently proved their capability to measure microphysical parameters of snowfall. The present work has the aim of classifying snowfall according to microphysical properties of single hydrometeors (e.g. shape and fall velocity) measured by means of a 2DVD. The constraints for the shape and velocity parameters, which are used in a decision tree for classification of the 2DVD measurements, are derived from detailed on-site observations, combining automatic 2DVD classification with visual inspection. The developed decision tree algorithm subdivides the detected events into three classes of dominating crystal type (single crystals, complex crystals and pellets) and three classes of dominating degree of riming (weak, moderate and strong). The classification results for the crystal type were validated with an independent data set proving the unambiguousness of the classification. In addition, for three long-term events, good agreement of the classification results with independently measured maximum dimension of snowflakes, snowflake bulk density and surrounding temperature was found. The developed classification algorithm is applicable for wind speeds below 5.0 m s⁻¹ and has the advantage of being easily implemented by other users.

  8. Chronic subdural hematoma: Surgical management and outcome in 986 cases: A classification and regression tree approach

    PubMed Central

    Rovlias, Aristedis; Theodoropoulos, Spyridon; Papoutsakis, Dimitrios

    2015-01-01

    Background: Chronic subdural hematoma (CSDH) is one of the most common clinical entities in daily neurosurgical practice which carries a most favorable prognosis. However, because of the advanced age and medical problems of patients, surgical therapy is frequently associated with various complications. This study evaluated the clinical features, radiological findings, and neurological outcome in a large series of patients with CSDH. Methods: A classification and regression tree (CART) technique was employed in the analysis of data from 986 patients who were operated on at Asclepeion General Hospital of Athens from January 1986 to December 2011. Burr hole evacuation with closed system drainage has been the operative technique of first choice at our institution for 29 consecutive years. A total of 27 prognostic factors were examined to predict the outcome at 3 months postoperatively. Results: Our results indicated that neurological status on admission was the best predictor of outcome. With regard to the other data, age, brain atrophy, thickness and density of hematoma, subdural accumulation of air, and antiplatelet and anticoagulant therapy were found to correlate significantly with prognosis. The overall cross-validated predictive accuracy of the CART model was 85.34%, with a cross-validated relative error of 0.326. Conclusions: Methodologically, the CART technique is quite different from the more commonly used methods, with the primary benefit of illustrating the important prognostic variables as related to outcome. Since the ideal therapy for the treatment of CSDH is still under debate, this technique may prove useful in developing new therapeutic strategies and approaches for patients with CSDH. PMID:26257985

  9. Quantification of chemical peptide reactivity for screening contact allergens: a classification tree model approach.

    PubMed

    Gerberick, G Frank; Vassallo, Jeffrey D; Foertsch, Leslie M; Price, Brad B; Chaney, Joel G; Lepoittevin, Jean-Pierre

    2007-06-01

    In the interest of reducing animal use, in vitro alternatives for skin sensitization testing are under development. One unifying characteristic of chemical allergens is the requirement that they react with proteins for the effective induction of skin sensitization. The majority of chemical allergens are electrophilic and react with nucleophilic amino acids. To determine whether and to what extent reactivity correlates with skin sensitization potential, 82 chemicals comprising allergens of different potencies and nonallergenic chemicals were evaluated for their ability to react with reduced glutathione (GSH) or with two synthetic peptides containing either a single cysteine or lysine. Following a 15-min reaction time with GSH, or a 24-h reaction time with the two synthetic peptides, the samples were analyzed by high-performance liquid chromatography. UV detection was used to monitor the depletion of GSH or the peptides. The peptide reactivity data were compared with existing local lymph node assay data using recursive partitioning methodology to build a classification tree that allowed a ranking of reactivity as minimal, low, moderate, and high. Generally, nonallergens and weak allergens demonstrated minimal to low peptide reactivity, whereas moderate to extremely potent allergens displayed moderate to high peptide reactivity. Classifying minimal reactivity as nonsensitizers and low, moderate, and high reactivity as sensitizers, it was determined that a model based on cysteine and lysine gave a prediction accuracy of 89%. The results of these investigations reveal that measurement of peptide reactivity has considerable potential utility as a screening approach for skin sensitization testing, and thereby for reducing reliance on animal-based test methods. PMID:17400584

  10. Applying an Ensemble Classification Tree Approach to the Prediction of Completion of a 12-Step Facilitation Intervention with Stimulant Abusers

    PubMed Central

    Doyle, Suzanne R.; Donovan, Dennis M.

    2014-01-01

    Aims: The purpose of this study was to explore the selection of predictor variables in the evaluation of drug treatment completion using an ensemble approach with classification trees. The basic methodology is reviewed and the subagging procedure of random subsampling is applied. Methods: Among 234 individuals with stimulant use disorders randomized to a 12-Step facilitative intervention shown to increase stimulant use abstinence, 67.52% were classified as treatment completers. A total of 122 baseline variables were used to identify factors associated with completion. Findings: The number of types of self-help activity involvement prior to treatment was the predominant predictor. Other effective predictors included better coping self-efficacy for substance use in high-risk situations, more days of prior meeting attendance, greater acceptance of the Disease model, higher confidence for not resuming use following discharge, lower ASI Drug and Alcohol composite scores, negative urine screens for cocaine or marijuana, and fewer employment problems. Conclusions: The application of an ensemble subsampling regression tree method utilizes the fact that classification trees are unstable but, on average, produce an improved prediction of the completion of drug abuse treatment. The results support the notion there are early indicators of treatment completion that may allow for modification of approaches more tailored to fitting the needs of individuals and potentially provide more successful treatment engagement and improved outcomes. PMID:25134038

  11. Applying an ensemble classification tree approach to the prediction of completion of a 12-step facilitation intervention with stimulant abusers.

    PubMed

    Doyle, Suzanne R; Donovan, Dennis M

    2014-12-01

    The purpose of this study was to explore the selection of predictor variables in the evaluation of drug treatment completion using an ensemble approach with classification trees. The basic methodology is reviewed, and the subagging procedure of random subsampling is applied. Among 234 individuals with stimulant use disorders randomized to a 12-step facilitative intervention shown to increase stimulant use abstinence, 67.52% were classified as treatment completers. A total of 122 baseline variables were used to identify factors associated with completion. The number of types of self-help activity involvement prior to treatment was the predominant predictor. Other effective predictors included better coping self-efficacy for substance use in high-risk situations, more days of prior meeting attendance, greater acceptance of the Disease model, higher confidence for not resuming use following discharge, lower Addiction Severity Index (ASI) Drug and Alcohol composite scores, negative urine screens for cocaine or marijuana, and fewer employment problems. The application of an ensemble subsampling regression tree method utilizes the fact that classification trees are unstable but, on average, produce an improved prediction of the completion of drug abuse treatment. The results support the notion there are early indicators of treatment completion that may allow for modification of approaches more tailored to fitting the needs of individuals and potentially provide more successful treatment engagement and improved outcomes. PMID:25134038
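
The subagging (subsample aggregating) procedure named in this abstract can be sketched as follows: fit many base learners on random subsamples drawn without replacement, then combine them by majority vote. This is only an illustration under stated assumptions. One-split "stumps" stand in for full classification trees, and the number of models, the subsample fraction, the seed, and the toy data are all hypothetical; the abstract does not give the study's settings.

```python
import random
from collections import Counter

def fit_stump(rows):
    """Fit a one-split tree: choose the (feature, threshold) with fewest training errors.

    rows is a list of (feature_tuple, label) pairs.
    """
    best = None
    for f in range(len(rows[0][0])):
        for x, _ in rows:
            t = x[f]
            left = [y for xi, y in rows if xi[f] <= t]   # never empty: contains x itself
            right = [y for xi, y in rows if xi[f] > t]
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0] if right else left_label
            errors = sum(y != left_label for y in left) + sum(y != right_label for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, left_label, right_label)
    return best[1:]

def predict_stump(stump, x):
    f, t, left_label, right_label = stump
    return left_label if x[f] <= t else right_label

def subagging_fit(rows, n_models=25, fraction=0.75, seed=0):
    """Fit n_models stumps, each on a random subsample drawn WITHOUT replacement."""
    rng = random.Random(seed)
    m = max(1, int(len(rows) * fraction))
    return [fit_stump(rng.sample(rows, m)) for _ in range(n_models)]

def subagging_predict(models, x):
    """Aggregate the individual predictions by majority vote."""
    votes = Counter(predict_stump(s, x) for s in models)
    return votes.most_common(1)[0][0]

# Hypothetical one-feature data: values 0..9 are class 0, values 10..19 are class 1.
data = [((i,), 0) for i in range(10)] + [((i,), 1) for i in range(10, 20)]
ensemble = subagging_fit(data)
low_prediction = subagging_predict(ensemble, (0,))
high_prediction = subagging_predict(ensemble, (19,))
```

Sampling without replacement is what distinguishes subagging from bagging, which draws bootstrap samples with replacement; in both cases averaging over many unstable trees is intended to stabilise the prediction.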

  12. DIF Trees: Using Classification Trees to Detect Differential Item Functioning

    ERIC Educational Resources Information Center

    Vaughn, Brandon K.; Wang, Qiu

    2010-01-01

    A nonparametric tree classification procedure is used to detect differential item functioning for items that are dichotomously scored. Classification trees are shown to be an alternative procedure to detect differential item functioning other than the use of traditional Mantel-Haenszel and logistic regression analysis. A nonparametric…

  13. Mapping trees outside forests using high-resolution aerial imagery: a comparison of pixel- and object-based classification approaches.

    PubMed

    Meneguzzo, Dacia M; Liknes, Greg C; Nelson, Mark D

    2013-08-01

    Discrete trees and small groups of trees in nonforest settings are considered an essential resource around the world and are collectively referred to as trees outside forests (ToF). ToF provide important functions across the landscape, such as protecting soil and water resources, providing wildlife habitat, and improving farmstead energy efficiency and aesthetics. Despite the significance of ToF, forest and other natural resource inventory programs and geospatial land cover datasets that are available at a national scale do not include comprehensive information regarding ToF in the United States. Additional ground-based data collection and acquisition of specialized imagery to inventory these resources are expensive alternatives. As a potential solution, we identified two remote sensing-based approaches that use free high-resolution aerial imagery from the National Agriculture Imagery Program (NAIP) to map all tree cover in an agriculturally dominant landscape. We compared the results obtained using an unsupervised per-pixel classifier (independent component analysis [ICA]) and an object-based image analysis (OBIA) procedure in Steele County, Minnesota, USA. Three types of accuracy assessments were used to evaluate how each method performed in terms of: (1) producing a county-level estimate of total tree-covered area, (2) correctly locating tree cover on the ground, and (3) how tree cover patch metrics computed from the classified outputs compared to those delineated by a human photo interpreter. Both approaches were found to be viable for mapping tree cover over a broad spatial extent and could serve to supplement ground-based inventory data. The ICA approach produced an estimate of total tree cover more similar to the photo-interpreted result, but the output from the OBIA method was more realistic in terms of describing the actual observed spatial pattern of tree cover. PMID:23255169

  14. Phylogenetic classification and the universal tree.

    PubMed

    Doolittle, W F

    1999-06-25

    From comparative analyses of the nucleotide sequences of genes encoding ribosomal RNAs and several proteins, molecular phylogeneticists have constructed a "universal tree of life," taking it as the basis for a "natural" hierarchical classification of all living things. Although confidence in some of the tree's early branches has recently been shaken, new approaches could still resolve many methodological uncertainties. More challenging is evidence that most archaeal and bacterial genomes (and the inferred ancestral eukaryotic nuclear genome) contain genes from multiple sources. If "chimerism" or "lateral gene transfer" cannot be dismissed as trivial in extent or limited to special categories of genes, then no hierarchical universal classification can be taken as natural. Molecular phylogeneticists will have failed to find the "true tree," not because their methods are inadequate or because they have chosen the wrong genes, but because the history of life cannot properly be represented as a tree. However, taxonomies based on molecular sequences will remain indispensable, and understanding of the evolutionary process will ultimately be enriched, not impoverished. PMID:10381871

  15. Predicting 'very poor' beach water quality gradings using classification tree.

    PubMed

    Thoe, Wai; Choi, King Wah; Lee, Joseph Hun-wei

    2016-02-01

    A beach water quality prediction system has been developed in Hong Kong using multiple linear regression (MLR) models. However, linear models are found to be weak at capturing the infrequent 'very poor' water quality occasions when Escherichia coli (E. coli) concentration exceeds 610 counts/100 mL. This study uses a classification tree to increase the accuracy in predicting the 'very poor' water quality events at three Hong Kong beaches affected either by non-point source or point source pollution. Binary-output classification trees (to predict whether E. coli concentration exceeds 610 counts/100 mL) are developed over the periods before and after the implementation of the Harbour Area Treatment Scheme, when systematic changes in water quality were observed. Results show that classification trees can capture more 'very poor' events in both periods when compared to the corresponding linear models, with an increase in correct positives by an average of 20%. Classification trees are also developed at two beaches to predict the four-category Beach Water Quality Indices. They perform worse than the binary tree and give excessive false alarms of 'very poor' events. Finally, a combined modelling approach using both MLR model and classification tree is proposed to enhance the beach water quality prediction system for Hong Kong. PMID:26837834

  16. Classification based on full decision trees

    NASA Astrophysics Data System (ADS)

    Genrikhov, I. E.; Djukova, E. V.

    2012-04-01

    The ideas underlying a series of the authors' studies dealing with the design of classification algorithms based on full decision trees are further developed. It is shown that the decision tree construction under consideration takes into account all the features satisfying a branching criterion. Full decision trees with an entropy branching criterion are studied as applied to precedent-based pattern recognition problems with real-valued data. Recognition procedures are constructed for solving problems with incomplete data (gaps in the feature descriptions of the objects) in the case when the learning objects are nonuniformly distributed over the classes. The authors' basic results previously obtained in this area are overviewed.

  17. Fast Image Texture Classification Using Decision Trees

    NASA Technical Reports Server (NTRS)

    Thompson, David R.

    2011-01-01

    Texture analysis would permit improved autonomous, onboard science data interpretation for adaptive navigation, sampling, and downlink decisions. These analyses would assist with terrain analysis and instrument placement in both macroscopic and microscopic image data products. Unfortunately, most state-of-the-art texture analysis demands computationally expensive convolutions of filters involving many floating-point operations. This makes them infeasible for radiation-hardened computers and spaceflight hardware. A new method approximates traditional texture classification of each image pixel with a fast decision-tree classifier. The classifier uses image features derived from simple filtering operations involving integer arithmetic. The texture analysis method is therefore amenable to implementation on FPGA (field-programmable gate array) hardware. Image features based on the "integral image" transform produce descriptive and efficient texture descriptors. Training the decision tree on a set of training data yields a classification scheme that produces reasonable approximations of optimal "texton" analysis at a fraction of the computational cost. A decision-tree learning algorithm employing the traditional k-means criterion of inter-cluster variance is used to learn tree structure from training data. The result is an efficient and accurate summary of surface morphology in images. This work is an evolutionary advance that unites several previous algorithms (k-means clustering, integral images, decision trees) and applies them to a new problem domain (morphology analysis for autonomous science during remote exploration). Advantages include order-of-magnitude improvements in runtime, feasibility for FPGA hardware, and significant improvements in texture classification accuracy.
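
The "integral image" transform mentioned in this abstract is what makes box-filter features cheap in integer arithmetic. The sketch below shows the general technique (a summed-area table and a four-lookup box sum), not the author's implementation; the function names and the 3x3 image are assumptions.

```python
def integral_image(img):
    """Summed-area table padded with a zero row and column.

    ii[r][c] holds the sum of all pixels img[y][x] with y < r and x < c.
    """
    rows, cols = len(img), len(img[0])
    ii = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0
        for c in range(cols):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii

def box_sum(ii, top, left, bottom, right):
    """Sum of pixels in rows top..bottom-1 and columns left..right-1 (four lookups)."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]

# Hypothetical 3x3 integer image.
image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
table = integral_image(image)
whole = box_sum(table, 0, 0, 3, 3)         # every pixel
lower_right = box_sum(table, 1, 1, 3, 3)   # the 2x2 block 5, 6, 8, 9
```

Once the table is built, the sum over any rectangle costs three additions or subtractions regardless of its size, with no floating-point work.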

  18. Seasonal Effect on Tree Species Classification in an Urban Environment Using Hyperspectral Data, LiDAR, and an Object-Oriented Approach

    PubMed Central

    Voss, Matthew; Sugumaran, Ramanathan

    2008-01-01

    The objective of the current study was to analyze the seasonal effect on differentiating tree species in an urban environment using multi-temporal hyperspectral data, Light Detection And Ranging (LiDAR) data, and a tree species database collected from the field. Two Airborne Imaging Spectrometer for Applications (AISA) hyperspectral images were collected, covering the Summer and Fall seasons. In order to make both datasets spatially and spectrally compatible, several preprocessing steps, including band reduction and a spatial degradation, were performed. An object-oriented classification was performed on both images using training data collected randomly from the tree species database. The seven dominant tree species (Gleditsia triacanthos, Acer saccharum, Tilia americana, Quercus palustris, Pinus strobus and Picea glauca) were used in the classification. The results from this analysis did not show any major difference in overall accuracy between the two seasons. Overall accuracy was approximately 57% for the Summer dataset and 56% for the Fall dataset. However, the Fall dataset provided more consistent results for all tree species while the Summer dataset had a few higher individual class accuracies. Further, adding LiDAR into the classification improved the results by 19% for both Fall and Summer. This is mainly due to the removal of shadow effect and the addition of elevation data to separate low and high vegetation.

  19. Voxel classification based airway tree segmentation

    NASA Astrophysics Data System (ADS)

    Lo, Pechin; de Bruijne, Marleen

    2008-03-01

    This paper presents a voxel classification based method for segmenting the human airway tree in volumetric computed tomography (CT) images. In contrast to standard methods that use only voxel intensities, our method uses a more complex appearance model based on a set of local image appearance features and Kth nearest neighbor (KNN) classification. The optimal set of features for classification is selected automatically from a large set of features describing the local image structure at several scales. The use of multiple features enables the appearance model to differentiate between airway tree voxels and other voxels of similar intensities in the lung, thus making the segmentation robust to pathologies such as emphysema. The classifier is trained on imperfect segmentations that can easily be obtained using region growing with a manual threshold selection. Experiments show that the proposed method results in a more robust segmentation that can grow into the smaller airway branches without leaking into emphysematous areas, and is able to segment many branches that are not present in the training set.
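
As a rough illustration of the nearest-neighbor classification step described in this abstract, the sketch below labels a query feature vector by majority vote among its k closest training samples. The two-dimensional feature vectors, the labels, and the value of k are hypothetical; the paper's actual features and selection procedure are not reproduced here.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Label `query` by majority vote among its k nearest training samples.

    train is a list of (feature_tuple, label) pairs; distance is Euclidean.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Hypothetical per-voxel feature vectors with their labels.
training_voxels = [
    ((0.0, 0.0), "background"),
    ((0.1, 0.2), "background"),
    ((0.2, 0.1), "background"),
    ((1.0, 1.0), "airway"),
    ((0.9, 1.1), "airway"),
    ((1.1, 0.9), "airway"),
]
label_near_airway = knn_predict(training_voxels, (0.95, 1.0))
label_near_background = knn_predict(training_voxels, (0.05, 0.1))
```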

  20. Semi-supervised SVM for individual tree crown species classification

    NASA Astrophysics Data System (ADS)

    Dalponte, Michele; Ene, Liviu Theodor; Marconcini, Mattia; Gobakken, Terje; Næsset, Erik

    2015-12-01

    In this paper a novel semi-supervised SVM classifier is presented, specifically developed for tree species classification at individual tree crown (ITC) level. In ITC tree species classification, all the pixels belonging to an ITC should have the same label. This assumption is used in the learning of the proposed semi-supervised SVM classifier (ITC-S3VM). This method exploits the information contained in the unlabeled ITC samples in order to improve the classification accuracy of a standard SVM. The ITC-S3VM method can be easily implemented using freely available software libraries. The datasets used in this study include hyperspectral imagery and laser scanning data acquired over two boreal forest areas characterized by the presence of three information classes (Pine, Spruce, and Broadleaves). The experimental results quantify the effectiveness of the proposed approach, which provides classification accuracies significantly higher (from 2% to above 27%) than those obtained by the standard supervised SVM and by a state-of-the-art semi-supervised SVM (S3VM). Particularly, by reducing the number of training samples (i.e. from 100% to 25%, and from 100% to 5% for the two datasets, respectively) the proposed method still exhibits results comparable to the ones of a supervised SVM trained with the full available training set. This property of the method makes it particularly suitable for practical forest inventory applications in which collection of in situ information can be very expensive both in terms of cost and time.

  1. Prediction of healthy blood with data mining classification by using Decision Tree, Naive Baysian and SVM approaches

    NASA Astrophysics Data System (ADS)

    Khalilinezhad, Mahdieh; Minaei, Behrooz; Vernazza, Gianni; Dellepiane, Silvana

    2015-03-01

    Data mining (DM) is the process of discovering knowledge from large databases. Applications of data mining in Blood Transfusion Organizations could be useful for improving the performance of blood donation services. The aim of this research is to predict the healthiness of blood donors in a Blood Transfusion Organization (BTO). For this goal, three well-known algorithms (Decision Tree C4.5, Naïve Bayesian classifier, and Support Vector Machine) have been chosen and applied to a real database of 11006 donors. Seven fields (sex, age, job, education, marital status, type of donor, and results of blood tests, i.e. doctors' comments and lab results about healthy or unhealthy blood donors) have been selected as input to these algorithms. The results of the three algorithms have been compared and an error cost analysis has been performed. According to this research and the obtained results, the best algorithm, with low error cost and high accuracy, is SVM. This research helps the BTO to build a model of blood donors in each area in order to predict whether donors' blood is healthy or unhealthy. This research could be useful if used in parallel with laboratory tests to better separate unhealthy blood.

  2. Consensus of classification trees for skin sensitisation hazard prediction.

    PubMed

    Asturiol, D; Casati, S; Worth, A

    2016-10-01

    Since March 2013, it is no longer possible to market in the European Union (EU) cosmetics containing new ingredients tested on animals. Although several in silico alternatives are available and achievements have been made in the development and regulatory adoption of skin sensitisation non-animal tests, there is not yet a generally accepted approach for skin sensitisation assessment that would fully substitute the need for animal testing. The aim of this work was to build a defined approach (i.e. a predictive model based on readouts from various information sources that uses a fixed procedure for generating a prediction) for skin sensitisation hazard prediction (sensitiser/non-sensitiser) using Local Lymph Node Assay (LLNA) results as reference classifications. To derive the model, we built a dataset with high quality data from in chemico (DPRA) and in vitro (KeratinoSens™ and h-CLAT) methods, and it was complemented with predictions from several software packages. The modelling exercise showed that skin sensitisation hazard was better predicted by classification trees based on in silico predictions. The defined approach consists of a consensus of two classification trees that are based on descriptors that account for protein reactivity and structural features. The model showed an accuracy of 0.93, sensitivity of 0.98, and specificity of 0.85 for 269 chemicals. In addition, the defined approach provides a measure of confidence associated with the prediction. PMID:27458072
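
The accuracy, sensitivity, and specificity figures in this abstract follow the usual definitions for a binary classifier. A minimal sketch of those definitions is given below; the counts are hypothetical and are not the study's results, and the function name is an assumption.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion counts.

    tp/fn: sensitisers predicted correctly/incorrectly.
    tn/fp: non-sensitisers predicted correctly/incorrectly.
    """
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    sensitivity = tp / (tp + fn)   # share of true sensitisers that are detected
    specificity = tn / (tn + fp)   # share of true non-sensitisers that are cleared
    return accuracy, sensitivity, specificity

# Hypothetical counts for 200 chemicals (not taken from the study).
acc, sens, spec = binary_metrics(tp=90, fn=10, tn=80, fp=20)
```

For hazard screening, sensitivity is typically the more critical figure, because a missed sensitiser is a more serious error than a false alarm.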

  3. Evaluating multimedia chemical persistence: Classification and regression tree analysis

    SciTech Connect

    Bennett, D.H.; McKone, T.E.; Kastenberg, W.E.

    2000-04-01

    For the thousands of chemicals continuously released into the environment, it is desirable to make prospective assessments of those likely to be persistent. Widely distributed persistent chemicals are impossible to remove from the environment and remediation by natural processes may take decades, which is problematic if adverse health or ecological effects are discovered after prolonged release into the environment. A tiered approach using a classification scheme and a multimedia model for determining persistence is presented. Using specific criteria for persistence, a classification tree is developed to classify a chemical as persistent or nonpersistent based on the chemical properties. In this approach, the classification is derived from the results of a standardized unit world multimedia model. Thus, the classifications are more robust for multimedia pollutants than classifications using a single medium half-life. The method can be readily implemented and provides insight without requiring extensive and often unavailable data. This method can be used to classify chemicals when only a few properties are known and can be used to direct further data collection. Case studies are presented to demonstrate the advantages of the approach.

  4. Tree Classification with Fused Mobile Laser Scanning and Hyperspectral Data

    PubMed Central

    Puttonen, Eetu; Jaakkola, Anttoni; Litkey, Paula; Hyyppä, Juha

    2011-01-01

    Mobile Laser Scanning data were collected simultaneously with hyperspectral data using the Finnish Geodetic Institute Sensei system. The data were tested for tree species classification. The test area was an urban garden in the City of Espoo, Finland. Point clouds representing 168 individual tree specimens of 23 tree species were determined manually. The classification of the trees was done using first only the spatial data from point clouds, then with only the spectral data obtained with a spectrometer, and finally with the combined spatial and hyperspectral data from both sensors. Two classification tests were performed: the separation of coniferous and deciduous trees, and the identification of individual tree species. All determined tree specimens were used in distinguishing coniferous and deciduous trees. A subset of 133 trees and 10 tree species was used in the tree species classification. The best classification results for the fused data were 95.8% for the separation of the coniferous and deciduous classes. The best overall tree species classification succeeded with 83.5% accuracy for the best tested fused data feature combination. The respective results for paired structural features derived from the laser point cloud were 90.5% for the separation of the coniferous and deciduous classes and 65.4% for the species classification. Classification accuracies with paired hyperspectral reflectance value data were 90.5% for the separation of coniferous and deciduous classes and 62.4% for different species. The results are among the first of their kind and they show that mobile collected fused data outperformed single-sensor data in both classification tests and by a significant margin. PMID:22163894

  5. Sensitivity of missing values in classification tree for large sample

    NASA Astrophysics Data System (ADS)

    Hasan, Norsida; Adam, Mohd Bakri; Mustapha, Norwati; Abu Bakar, Mohd Rizam

    2012-05-01

    Missing values, either in predictor or in response variables, are a very common problem in statistics and data mining. Cases with missing values are often ignored, which results in loss of information and possible bias. The objective of our research was to investigate the sensitivity of the classification tree model to missing data for large samples. Data were obtained from one of the high-level educational institutions in Malaysia. Students' background data were randomly eliminated and a classification tree was used to predict students' degree classification. The results showed that, for large samples, the structure of the classification tree was sensitive to missing values, especially for samples containing more than ten percent missing values.

  6. A Mixtures-of-Trees Framework for Multi-Label Classification

    PubMed Central

    Hong, Charmgil; Batal, Iyad; Hauskrecht, Milos

    2015-01-01

    We propose a new probabilistic approach for multi-label classification that aims to represent the class posterior distribution P(Y|X). Our approach uses a mixture of tree-structured Bayesian networks, which can leverage the computational advantages of conditional tree-structured models and the abilities of mixtures to compensate for tree-structured restrictions. We develop algorithms for learning the model from data and for performing multi-label predictions using the learned model. Experiments on multiple datasets demonstrate that our approach outperforms several state-of-the-art multi-label classification methods. PMID:25927011

  7. [Automatic classification method of star spectrum data based on classification pattern tree].

    PubMed

    Zhao, Xu-Jun; Cai, Jiang-Hui; Zhang, Ji-Fu; Yang, Hai-Feng; Ma, Yang

    2013-10-01

    Frequent patterns, which appear frequently in a data set, play an important role in data mining. For stellar spectrum classification tasks, a classification rule mining method based on a classification pattern tree is presented, building on frequent patterns. The procedure is as follows. First, a new tree structure, i.e., the classification pattern tree, is introduced based on the different frequencies of stellar spectral attributes in the database and their different importance for classification. The related concepts and the construction method of the classification pattern tree are also described in this paper. Then, the characteristics of the stellar spectrum are mapped to the classification pattern tree. Two traversal modes, top-down and bottom-up, are used to traverse the classification pattern tree and extract the classification rules. Meanwhile, the concept of pattern capability is introduced to adjust the number of classification rules and improve the construction efficiency of the classification pattern tree. Finally, the SDSS (Sloan Digital Sky Survey) stellar spectral data provided by the National Astronomical Observatory are used to verify the accuracy of the method. The results show that a higher classification accuracy was obtained. PMID:24409754

  8. Watershed Merge Tree Classification for Electron Microscopy Image Segmentation

    SciTech Connect

    Liu, Ting; Jurrus, Elizabeth R.; Seyedhosseini, Mojtaba; Ellisman, Mark; Tasdizen, Tolga

    2012-11-11

    Automated segmentation of electron microscopy (EM) images is a challenging problem. In this paper, we present a novel method that utilizes a hierarchical structure and boundary classification for 2D neuron segmentation. With a membrane detection probability map, a watershed merge tree is built for the representation of hierarchical region merging from the watershed algorithm. A boundary classifier is learned with non-local image features to predict each potential merge in the tree, upon which merge decisions are made with consistency constraints in the sense of optimization to acquire the final segmentation. Independent of classifiers and decision strategies, our approach proposes a general framework for efficient hierarchical segmentation with statistical learning. We demonstrate that our method leads to a substantial improvement in segmentation accuracy.

  9. Classification of Liss IV Imagery Using Decision Tree Methods

    NASA Astrophysics Data System (ADS)

    Verma, Amit Kumar; Garg, P. K.; Prasad, K. S. Hari; Dadhwal, V. K.

    2016-06-01

    Image classification is a compulsory step in any remote sensing research. Classification uses the spectral information represented by the digital numbers in one or more spectral bands and attempts to classify each individual pixel based on this spectral information. Crop classification is a main concern of remote sensing applications for developing sustainable agriculture systems. Vegetation indices computed from satellite images give a good indication of the presence of vegetation; they describe the greenness, density and health of vegetation. Texture is also an important characteristic used to identify objects or regions of interest in an image. This paper illustrates the use of the decision tree method to classify land into crop land and non-crop land and to classify different crops. We evaluate the possibility of crop classification using an integrated approach based on texture properties combined with different vegetation indices for single-date LISS IV sensor data with 5.8 meter spatial resolution. Eleven vegetation indices (NDVI, DVI, GEMI, GNDVI, MSAVI2, NDWI, NG, NR, NNIR, OSAVI and VI green) were generated using the green, red and NIR bands, and the image was then classified using the decision tree method. The other approach integrates texture features (mean, variance, kurtosis and skewness) with these vegetation indices. The two methods were compared. The results indicate that including textural features with vegetation indices can be effectively implemented to produce classified maps with 8.33% higher accuracy for Indian satellite IRS-P6, LISS IV sensor images.
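
    Three of the indices listed above (NDVI, GNDVI and DVI) have simple, widely used band-arithmetic definitions. The sketch below assumes per-pixel reflectance values given as plain floats; the function names are illustrative, and the abstract does not state the exact formulations the authors used for the remaining indices.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or 0.0 when the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0


def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    return normalized_difference(nir, red)


def gndvi(nir, green):
    """Green NDVI: (NIR - Green) / (NIR + Green)."""
    return normalized_difference(nir, green)


def dvi(nir, red):
    """Difference Vegetation Index: NIR - Red."""
    return nir - red


# Hypothetical reflectances for one vegetated pixel.
print(ndvi(0.5, 0.1), gndvi(0.5, 0.2), dvi(0.5, 0.1))
```

    Index values computed this way, one per pixel, would then serve as input features to a decision tree classifier.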

  10. Classification Based on Tree-Structured Allocation Rules

    ERIC Educational Resources Information Center

    Vaughn, Brandon K.; Wang, Qui

    2008-01-01

    The authors consider the problem of classifying an unknown observation into 1 of several populations by using tree-structured allocation rules. Although many parametric classification procedures are robust to certain assumption violations, there is need for classification procedures that can be used regardless of the group-conditional…

  11. Urban Tree Classification Using Full-Waveform Airborne Laser Scanning

    NASA Astrophysics Data System (ADS)

    Koma, Zs.; Koenig, K.; Höfle, B.

    2016-06-01

    Vegetation mapping in urban environments plays an important role in biological research and urban management. Airborne laser scanning provides detailed 3D geodata, which allows to classify single trees into different taxa. Until now, research dealing with tree classification focused on forest environments. This study investigates the object-based classification of urban trees at taxonomic family level, using full-waveform airborne laser scanning data captured in the city centre of Vienna (Austria). The data set is characterised by a variety of taxa, including deciduous trees (beeches, mallows, plane trees and soapberries) and the coniferous pine species. A workflow for tree object classification is presented using geometric and radiometric features. The derived features are related to point density, crown shape and radiometric characteristics. For the derivation of crown features, a prior detection of the crown base is performed. The effects of interfering objects (e.g. fences and cars which are typical in urban areas) on the feature characteristics and the subsequent classification accuracy are investigated. The applicability of the features is evaluated by Random Forest classification and exploratory analysis. The most reliable classification is achieved by using the combination of geometric and radiometric features, resulting in 87.5% overall accuracy. By using radiometric features only, a reliable classification with accuracy of 86.3% can be achieved. The influence of interfering objects on feature characteristics is identified, in particular for the radiometric features. The results indicate the potential of using radiometric features in urban tree classification and show its limitations due to anthropogenic influences at the same time.

  12. Decision tree methods: applications for classification and prediction.

    PubMed

    Song, Yan-Yan; Lu, Ying

    2015-04-25

    Decision tree methodology is a commonly used data mining method for establishing classification systems based on multiple covariates or for developing prediction algorithms for a target variable. This method classifies a population into branch-like segments that construct an inverted tree with a root node, internal nodes, and leaf nodes. The algorithm is non-parametric and can efficiently deal with large, complicated datasets without imposing a complicated parametric structure. When the sample size is large enough, study data can be divided into training and validation datasets: the training dataset is used to build a decision tree model, and the validation dataset is used to decide on the appropriate tree size needed to achieve the optimal final model. This paper introduces frequently used algorithms for developing decision trees (including CART, C4.5, CHAID, and QUEST) and describes the SPSS and SAS programs that can be used to visualize tree structure. PMID:26120265
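
    To make the node-splitting idea concrete, the sketch below shows how a single split on one numeric covariate can be chosen by minimizing weighted Gini impurity, the criterion commonly associated with CART. It is a simplified, pure-Python illustration with hypothetical function names, not the procedure of any of the packages named above.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for a pure or empty node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_threshold(values, labels):
    """Return (threshold, weighted_impurity) of the best binary split.

    Candidate thresholds are midpoints between consecutive distinct values.
    Returns (None, parent_impurity) when no split improves on the parent node.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # identical values cannot be separated by a threshold
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
        if weighted < best[1]:
            best = (threshold, weighted)
    return best


print(best_threshold([1, 2, 3, 4], ["a", "a", "b", "b"]))
```

    A full tree repeats this search over all covariates at every node; a held-out validation set can then be used to choose how far to grow or prune the tree.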

  13. Automatic Classification of Trees from Laser Scanning Point Clouds

    NASA Astrophysics Data System (ADS)

    Sirmacek, B.; Lindenbergh, R.

    2015-08-01

    Development of laser scanning technologies has promoted tree monitoring studies to a new level, as laser scanning point clouds enable accurate 3D measurements in a fast and environmentally friendly manner. In this paper, we introduce an algorithm based on probability matrix computation for automatically classifying laser scanning point clouds into 'tree' and 'non-tree' classes. Our method uses the 3D coordinates of the laser scanning points as input and generates a new point cloud which holds a label for each point indicating whether it belongs to the 'tree' or 'non-tree' class. To do so, a grid surface is assigned to the lowest height level of the point cloud. The grid cells are filled with probability values which are calculated by checking the point density above each cell. Since the tree trunk locations appear with very high values in the probability matrix, selecting the local maxima of the grid surface helps to detect the tree trunks. Further points are assigned to tree trunks if they appear in the close proximity of trunks. Since heavy mathematical computations (such as point cloud organization, detailed 3D shape detection methods, graph network generation) are not required, the proposed algorithm works very fast compared to the existing methods. The tree classification results are found to be reliable even on point clouds of cities containing many different objects. As the most significant weakness, false detection of light poles, traffic signs and other objects close to trees cannot be prevented. Nevertheless, the experimental results on mobile and airborne laser scanning point clouds indicate the possible usage of the algorithm as an important step for tree growth observation, tree counting and similar applications. While the laser scanning point cloud makes it possible to classify even very small trees, the accuracy of the results is reduced in low point density areas farther away from the scanning location. These advantages and disadvantages of two laser scanning point

  14. Ultraviolet stellar spectral classification using a multilevel tree neural network

    NASA Astrophysics Data System (ADS)

    Gulati, R. K.; Gupta, R.; Gothoskar, P.; Khobragade, S.

    Here we present a pattern classification technique based on an Artificial Neural Network (ANN) in a multi-level tree configuration to classify ultraviolet stellar spectra from the IUE Low-Dispersion Spectra Reference Atlas. Preliminary results of this technique show that 94% of the spectra have been classified correctly with an accuracy of one sub-class. A conventional $\overline{\chi}^2$ minimization scheme has also been applied to the data to compare the classification obtained from these schemes with that of the IUE catalog classification.

  15. Using Classification Trees to Predict Alumni Giving for Higher Education

    ERIC Educational Resources Information Center

    Weerts, David J.; Ronca, Justin M.

    2009-01-01

    As the relative level of public support for higher education declines, colleges and universities aim to maximize alumni-giving to keep their programs competitive. Anchored in a utility maximization framework, this study employs the classification and regression tree methodology to examine characteristics of alumni donors and non-donors at a…

  16. Growth in Mathematics Achievement: Analysis with Classification and Regression Trees

    ERIC Educational Resources Information Center

    Ma, Xin

    2005-01-01

    A recently developed statistical technique, often referred to as classification and regression trees (CART), holds great potential for researchers to discover how student-level (and school-level) characteristics interactively affect growth in mathematics achievement. CART is a host of advanced statistical methods that statistically cluster…

  17. A Section-based Method For Tree Species Classification Using Airborne LiDAR Discrete Points In Urban Areas

    NASA Astrophysics Data System (ADS)

    Chunjing, Y. C.; Hui, T.; Zhongjie, R.; Guikai, B.

    2015-12-01

    As a new approach to forest inventory, LiDAR remote sensing has become an important research topic. LiDAR research initially concentrated on mapping forests at the tree level and identifying important structural parameters, such as tree height, crown size, crown base height, individual tree species, and stem volume. For virtual city visualization and mapping, however, the traditional methods of tree classification cannot cope with the more complex conditions. Recently, advances in LiDAR technology have produced full-waveform scanners that provide a higher point density and additional information about the reflecting characteristics of trees. Subsequently, it was demonstrated that it is feasible to detect individual overstorey trees in forests and classify species. However, important steps such as the calibration and the decomposition of full-waveform data with a series of Gaussian functions usually take a lot of work. Moreover, the vegetation detection and classification results rely heavily on the prior outcomes. For these reasons, a section-based method for tree species classification using small-footprint, high-sampling-density LiDAR data is proposed in this paper, which can overcome the tree species classification issues in urban areas. The specific objectives are to: (1) use local maximum height decision and four-direction section certification methods to get the precise locations of the trees; (2) develop new LiDAR-derived feature processing techniques for characterizing the section structure of individual tree crowns; (3) investigate several techniques for filtering and analyzing vertical profiles of individual trees to classify the trees, using expert decision skills based on percentile analysis; (4) assess the accuracy of estimating tree species for each tree; and (5) investigate which type of LiDAR data, point frequency or intensity, provides the most accurate estimate of tree species

  18. Combining QuickBird, LiDAR, and GIS topography indices to identify a single native tree species in a complex landscape using an object-based classification approach

    NASA Astrophysics Data System (ADS)

    Pham, Lien T. H.; Brabyn, Lars; Ashraf, Salman

    2016-08-01

    There is now a wide range of techniques that can be combined for image analysis. These include the use of object-based classifications rather than pixel-based classifiers, the use of LiDAR to determine vegetation height and vertical structure, as well as terrain variables such as topographic wetness index and slope that can be calculated using GIS. This research investigates the benefits of combining these techniques to identify individual tree species. A QuickBird image and low point density LiDAR data for a coastal region in New Zealand were used to examine the possibility of mapping Pohutukawa trees, which are regarded as an iconic tree in New Zealand. The study area included a mix of buildings and vegetation types. After image and LiDAR preparation, single tree objects were identified using a range of techniques including: a threshold of above-ground height to eliminate ground-based objects; Normalised Difference Vegetation Index and elevation difference between the first and last return of LiDAR data to distinguish vegetation from buildings; geometric information to separate clusters of trees from single trees; and treetop identification and region growing techniques to separate tree clusters into single tree crowns. Important feature variables were identified using Random Forest, and the Support Vector Machine provided the classification. The combined techniques using LiDAR and spectral data produced an overall accuracy of 85.4% (Kappa 80.6%). Classification using just the spectral data produced an overall accuracy of 75.8% (Kappa 67.8%). The research findings demonstrate how combining LiDAR and spectral data improves classification for Pohutukawa trees.
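
    Overall accuracy and the Kappa statistic quoted above are both derived from a confusion matrix of reference classes against predicted classes. The sketch below uses the standard definition of Cohen's kappa, (p_o - p_e) / (1 - p_e); the function name and the example matrix are illustrative assumptions and are not taken from the study.

```python
def cohens_kappa(matrix):
    """Cohen's kappa for a square confusion matrix (rows: reference, columns: predicted)."""
    size = len(matrix)
    total = sum(sum(row) for row in matrix)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    # Observed agreement: share of samples on the diagonal (the overall accuracy).
    observed = sum(matrix[i][i] for i in range(size)) / total
    # Chance agreement: sum over classes of (row total * column total) / total^2.
    expected = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix) for i in range(size)
    ) / total ** 2
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical two-class matrix: overall accuracy 0.70, chance agreement 0.50.
print(cohens_kappa([[20, 5], [10, 15]]))
```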

  19. Multiple Spectral-Spatial Classification Approach for Hyperspectral Data

    NASA Technical Reports Server (NTRS)

    Tarabalka, Yuliya; Benediktsson, Jon Atli; Chanussot, Jocelyn; Tilton, James C.

    2010-01-01

    A new multiple classifier approach for spectral-spatial classification of hyperspectral images is proposed. Several classifiers are used independently to classify an image. For every pixel, if all the classifiers have assigned this pixel to the same class, the pixel is kept as a marker, i.e., a seed of the spatial region, with the corresponding class label. We propose to use spectral-spatial classifiers at the preliminary step of the marker selection procedure, each of them combining the results of a pixel-wise classification and a segmentation map. Different segmentation methods based on dissimilar principles lead to different classification results. Furthermore, a minimum spanning forest is built, where each tree is rooted on a classification-driven marker and forms a region in the spectral-spatial classification map. Experimental results are presented for two hyperspectral airborne images. The proposed method significantly improves classification accuracies when compared to previously proposed classification techniques.
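
    The marker selection rule described above (keep a pixel only when every classifier assigns it the same class) can be sketched as follows. This is a minimal illustration that assumes each classification map is a list of rows of class labels with identical shape; the function name and the use of None for unmarked pixels are assumptions, and the minimum spanning forest step is not covered.

```python
def select_markers(label_maps):
    """Return a map holding the class label where all maps agree, else None."""
    first = label_maps[0]
    markers = []
    for r, row in enumerate(first):
        marker_row = []
        for c, label in enumerate(row):
            # A pixel becomes a marker only under unanimous agreement.
            unanimous = all(other[r][c] == label for other in label_maps[1:])
            marker_row.append(label if unanimous else None)
        markers.append(marker_row)
    return markers


# Three hypothetical 2x2 classification maps.
maps = [
    [[1, 2], [3, 3]],
    [[1, 2], [3, 1]],
    [[1, 0], [3, 3]],
]
print(select_markers(maps))
```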

  20. A novel transferable individual tree crown delineation model based on Fishing Net Dragging and boundary classification

    NASA Astrophysics Data System (ADS)

    Liu, Tao; Im, Jungho; Quackenbush, Lindi J.

    2015-12-01

    This study provides a novel approach to individual tree crown delineation (ITCD) using airborne Light Detection and Ranging (LiDAR) data in dense natural forests using two main steps: crown boundary refinement based on a proposed Fishing Net Dragging (FiND) method, and segment merging based on boundary classification. FiND starts with approximate tree crown boundaries derived using a traditional watershed method with Gaussian filtering and refines these boundaries using an algorithm that mimics how a fisherman drags a fishing net. Random forest machine learning is then used to classify boundary segments into two classes: boundaries between trees and boundaries between branches that belong to a single tree. Three groups of LiDAR-derived features, two from the pseudo waveform generated along with crown boundaries and one from a canopy height model (CHM), were used in the classification. The proposed ITCD approach was tested using LiDAR data collected over a mountainous region in the Adirondack Park, NY, USA. Overall accuracy of boundary classification was 82.4%. Features derived from the CHM were generally more important in the classification than the features extracted from the pseudo waveform. A comprehensive accuracy assessment scheme for ITCD was also introduced by considering both area of crown overlap and crown centroids. Accuracy assessment using this new scheme shows that the proposed ITCD achieved overall accuracies of 74% and 78% for deciduous and mixed forest, respectively.

  1. An automated approach to the design of decision tree classifiers

    NASA Technical Reports Server (NTRS)

    Argentiero, P.; Chin, P.; Beaudet, P.

    1980-01-01

    The classification of large dimensional data sets arising from the merging of remote sensing data with more traditional forms of ancillary data is considered. Decision tree classification, a popular approach to the problem, is characterized by the property that samples are subjected to a sequence of decision rules before they are assigned to a unique class. An automated technique for effective decision tree design which relies only on a priori statistics is presented. This procedure utilizes a set of two-dimensional canonical transforms and Bayes table look-up decision rules. An optimal design at each node is derived based on the associated decision table. A procedure for computing the global probability of correct classification is also provided. An example is given in which class statistics obtained from an actual LANDSAT scene are used as input to the program. The resulting decision tree design has an associated probability of correct classification of 0.76, compared to the theoretically optimum 0.79 probability of correct classification associated with a full dimensional Bayes classifier. Recommendations for future research are included.

  2. A Dynamic Classification Approach for Nursing

    PubMed Central

    Hardiker, Nicholas R.; Kim, Tae Youn; Coenen, Amy M.; Jansen, Kay R.

    2011-01-01

    Nursing has a long tradition of classification, stretching back at least 150 years. The introduction of computers into health care towards the end of the 20th Century helped to focus efforts, culminating in the development of a range of standardized classifications. Many of these classifications are still in use today and, while content is periodically updated, the underlying classification structures remain relatively static. In this paper an approach to classification that is relatively new to nursing is presented; an approach that uses formal Web Ontology Language definitions for classes, and computer-based reasoning on those classes, to determine automatically classification structures that more flexibly meet the needs of users. A new proposed classification structure for the International Classification for Nursing Practice is derived under the new approach to provide a new view on the next release of the classification and to contribute to broader quality improvement processes. PMID:22195109

  3. Flood-type classification in mountainous catchments using crisp and fuzzy decision trees

    NASA Astrophysics Data System (ADS)

    Sikorska, Anna E.; Viviroli, Daniel; Seibert, Jan

    2015-10-01

    Floods are governed by largely varying processes and thus exhibit various behaviors. Classification of flood events into flood types and the determination of their respective frequency is therefore important for a better understanding and prediction of floods. This study presents a flood classification for identifying flood patterns at a catchment scale by means of a fuzzy decision tree. Hence, events are represented as a spectrum of six main possible flood types that are attributed with their degree of acceptance. Considered types are flash, short rainfall, long rainfall, snow-melt, rainfall on snow and, in high alpine catchments, glacier-melt floods. The fuzzy decision tree also makes it possible to acknowledge the uncertainty present in the identification of flood processes and thus allows for more reliable flood class estimates than using a crisp decision tree, which identifies one flood type per event. Based on the data set in nine Swiss mountainous catchments, it was demonstrated that this approach is less sensitive to uncertainties in the classification attributes than the classical crisp approach. These results show that the fuzzy approach bears additional potential for analyses of flood patterns at a catchment scale and thereby it provides more realistic representation of flood processes.

  4. Data mining in psychological treatment research: a primer on classification and regression trees.

    PubMed

    King, Matthew W; Resick, Patricia A

    2014-10-01

    Data mining of treatment study results can reveal unforeseen but critical insights, such as who receives the most benefit from treatment and under what circumstances. The usefulness and legitimacy of exploratory data analysis have received relatively little recognition, however, and analytic methods well suited to the task are not widely known in psychology. With roots in computer science and statistics, statistical learning approaches offer a credible option: These methods take a more inductive approach to building a model than is done in traditional regression, allowing the data greater role in suggesting the correct relationships between variables rather than imposing them a priori. Classification and regression trees are presented as a powerful, flexible exemplar of statistical learning methods. Trees allow researchers to efficiently identify useful predictors of an outcome and discover interactions between predictors without the need to anticipate and specify these in advance, making them ideal for revealing patterns that inform hypotheses about treatment effects. Trees can also provide a predictive model for forecasting outcomes as an aid to clinical decision making. This primer describes how tree models are constructed, how the results are interpreted and evaluated, and how trees overcome some of the complexities of traditional regression. Examples are drawn from randomized clinical trial data and highlight some interpretations of particular interest to treatment researchers. The limitations of tree models are discussed, and suggestions for further reading and choices in software are offered. PMID:24588404

  5. Classification of dopamine, serotonin, and dual antagonists by decision trees.

    PubMed

    Kim, Hye-Jung; Choo, Hyunah; Cho, Yong Seo; Koh, Hun Yeong; No, Kyoung Tai; Pae, Ae Nim

    2006-04-15

    Dopamine antagonists (DA), serotonin antagonists (SA), and serotonin-dopamine dual antagonists (Dual) are being used as antipsychotics. A lot of dopamine and serotonin antagonists reveal non-selective binding affinity against these two receptors because the antagonists share structurally common features originated from conserved residues of binding site of the aminergic receptor family. Therefore, classification of dopamine and serotonin antagonists into their own receptors can be useful in the designing of selective antagonist for individual therapy of antipsychotic disorders. Data set containing 1135 dopamine antagonists (D2, D3, and D4), 1251 serotonin antagonists (5-HT1A, 5-HT2A, and 5-HT2C), and 386 serotonin-dopamine dual antagonists was collected from the MDDR database. Cerius2 descriptors were employed to develop a classification model for the 2772 compounds with antipsychotic activity. LDA (linear discriminant analysis), SIMCA (soft independent modeling of class analogy), RP (recursive partitioning), and ANN (artificial neural network) algorithms successfully classified the active class of each compound at the average 73.6% and predicted at the average 69.8%. The decision trees from RP, the best model, were generated to identify and interpret those descriptors that discriminate the active classes more easily. These classification models could be used as a virtual screening tool to predict the active class of new candidates. PMID:16387502

  6. Superiority of Classification Tree versus Cluster, Fuzzy and Discriminant Models in a Heartbeat Classification System

    PubMed Central

    Krasteva, Vessela; Jekova, Irena; Leber, Remo; Schmid, Ramun; Abächerli, Roger

    2015-01-01

    This study presents a 2-stage heartbeat classifier of supraventricular (SVB) and ventricular (VB) beats. Stage 1 makes computationally-efficient classification of SVB-beats, using simple correlation threshold criterion for finding close match with a predominant normal (reference) beat template. The non-matched beats are next subjected to measurement of 20 basic features, tracking the beat and reference template morphology and RR-variability for subsequent refined classification in SVB or VB-class by Stage 2. Four linear classifiers are compared: cluster, fuzzy, linear discriminant analysis (LDA) and classification tree (CT), all subjected to iterative training for selection of the optimal feature space among extended 210-sized set, embodying interactive second-order effects between 20 independent features. The optimization process minimizes at equal weight the false positives in SVB-class and false negatives in VB-class. The training with European ST-T, AHA, MIT-BIH Supraventricular Arrhythmia databases found the best performance settings of all classification models: Cluster (30 features), Fuzzy (72 features), LDA (142 coefficients), CT (221 decision nodes) with top-3 best scored features: normalized current RR-interval, higher/lower frequency content ratio, beat-to-template correlation. Unbiased test-validation with MIT-BIH Arrhythmia database rates the classifiers in descending order of their specificity for SVB-class: CT (99.9%), LDA (99.6%), Cluster (99.5%), Fuzzy (99.4%); sensitivity for ventricular ectopic beats as part from VB-class (commonly reported in published beat-classification studies): CT (96.7%), Fuzzy (94.4%), LDA (94.2%), Cluster (92.4%); positive predictivity: CT (99.2%), Cluster (93.6%), LDA (93.0%), Fuzzy (92.4%). CT has superior accuracy by 0.3–6.8% points, with the advantage for easy model complexity configuration by pruning the tree consisted of easy interpretable ‘if-then’ rules. PMID:26461492
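
    The Stage 1 gating step described above can be sketched as follows. This is a minimal illustration, assuming a Pearson correlation between a beat and the reference template; the threshold value, the function names, and the sample waveforms are hypothetical and are not taken from the study.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # flat signal: treat as "no match"
    return cov / math.sqrt(var_a * var_b)

CORR_THRESHOLD = 0.95  # hypothetical value, not reported in the abstract

def stage1(beat, template, threshold=CORR_THRESHOLD):
    """Return 'SVB' on a close template match, else None to defer to Stage 2."""
    return "SVB" if pearson(beat, template) >= threshold else None

# Hypothetical sampled waveforms
template = [0.0, 0.1, 1.0, 0.2, 0.0]
similar = [0.0, 0.2, 2.0, 0.4, 0.0]      # same shape, larger amplitude
different = [0.0, 1.0, 0.1, -0.8, 0.0]   # different morphology

matched = stage1(similar, template)
deferred = stage1(different, template)
```

    Because correlation ignores amplitude scaling, the gate responds to beat shape only; beats that return None would go on to feature measurement in Stage 2.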

  7. Superiority of Classification Tree versus Cluster, Fuzzy and Discriminant Models in a Heartbeat Classification System.

    PubMed

    Krasteva, Vessela; Jekova, Irena; Leber, Remo; Schmid, Ramun; Abächerli, Roger

    2015-01-01

    This study presents a 2-stage heartbeat classifier of supraventricular (SVB) and ventricular (VB) beats. Stage 1 makes computationally-efficient classification of SVB-beats, using simple correlation threshold criterion for finding close match with a predominant normal (reference) beat template. The non-matched beats are next subjected to measurement of 20 basic features, tracking the beat and reference template morphology and RR-variability for subsequent refined classification in SVB or VB-class by Stage 2. Four linear classifiers are compared: cluster, fuzzy, linear discriminant analysis (LDA) and classification tree (CT), all subjected to iterative training for selection of the optimal feature space among extended 210-sized set, embodying interactive second-order effects between 20 independent features. The optimization process minimizes at equal weight the false positives in SVB-class and false negatives in VB-class. The training with European ST-T, AHA, MIT-BIH Supraventricular Arrhythmia databases found the best performance settings of all classification models: Cluster (30 features), Fuzzy (72 features), LDA (142 coefficients), CT (221 decision nodes) with top-3 best scored features: normalized current RR-interval, higher/lower frequency content ratio, beat-to-template correlation. Unbiased test-validation with MIT-BIH Arrhythmia database rates the classifiers in descending order of their specificity for SVB-class: CT (99.9%), LDA (99.6%), Cluster (99.5%), Fuzzy (99.4%); sensitivity for ventricular ectopic beats as part from VB-class (commonly reported in published beat-classification studies): CT (96.7%), Fuzzy (94.4%), LDA (94.2%), Cluster (92.4%); positive predictivity: CT (99.2%), Cluster (93.6%), LDA (93.0%), Fuzzy (92.4%). CT has superior accuracy by 0.3-6.8% points, with the advantage for easy model complexity configuration by pruning the tree consisted of easy interpretable 'if-then' rules. PMID:26461492

  8. Decision Tree Classifier for Classification of Plant and Animal Micro RNA's

    NASA Astrophysics Data System (ADS)

    Pant, Bhasker; Pant, Kumud; Pardasani, K. R.

    Gene expression is regulated by miRNAs, or micro RNAs, which can be 21-23 nucleotides in length. They are non-coding RNAs that control gene expression either by translational repression or mRNA degradation. Both plants and animals contain miRNAs, which have been classified by wet-lab techniques. These techniques are expensive, labour-intensive, and time-consuming. Hence, faster and more economical computational approaches are needed. In view of the above, a machine learning model has been developed for classification of plant and animal miRNAs using a decision tree classifier. The model has been tested on available data and gives results with 91% accuracy.

  9. Support-vector-machine tree-based domain knowledge learning toward automated sports video classification

    NASA Astrophysics Data System (ADS)

    Xiao, Guoqiang; Jiang, Yang; Song, Gang; Jiang, Jianmin

    2010-12-01

    We propose a support-vector-machine (SVM) tree to hierarchically learn from domain knowledge represented by low-level features toward automatic classification of sports videos. The proposed SVM tree adopts a binary tree structure to exploit the nature of SVM's binary classification, where each internal node is a single SVM learning unit, and each external node represents the classified output type. Such a SVM tree presents a number of advantages, which include: 1. low computing cost; 2. integrated learning and classification while preserving individual SVM's learning strength; and 3. flexibility in both structure and learning modules, where different numbers of nodes and features can be added to address specific learning requirements, and various learning models can be added as individual nodes, such as neural networks, AdaBoost, hidden Markov models, dynamic Bayesian networks, etc. Experiments support that the proposed SVM tree achieves good performances in sports video classifications.
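
    The binary tree structure described above, with a binary decision unit at each internal node and a class label at each external node, might look like the sketch below. The linear decision functions are simple stand-ins for trained SVMs, and the feature layout and sport labels are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Union

@dataclass
class Leaf:
    label: str  # external node: the classified output type

@dataclass
class Node:
    decide: Callable[[Sequence[float]], bool]  # stand-in for one trained binary SVM
    if_true: Union["Node", Leaf]
    if_false: Union["Node", Leaf]

def linear(weights, bias):
    """Build a linear decision function w.x + b > 0 (a placeholder for an SVM)."""
    return lambda x: sum(w * v for w, v in zip(weights, x)) + bias > 0

def classify(tree, features):
    """Walk from the root to a leaf, applying one binary decision per level."""
    while isinstance(tree, Node):
        tree = tree.if_true if tree.decide(features) else tree.if_false
    return tree.label

# Hypothetical two-feature tree with three output classes
tree = Node(
    decide=linear([1.0, 0.0], -0.5),
    if_true=Node(
        decide=linear([0.0, 1.0], -0.5),
        if_true=Leaf("football"),
        if_false=Leaf("tennis"),
    ),
    if_false=Leaf("swimming"),
)

result = classify(tree, [0.8, 0.9])
```

    Any callable that returns a boolean can serve as a node, which mirrors the abstract's point that other learning models can be placed at individual nodes.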

  10. Graduates employment classification using data mining approach

    NASA Astrophysics Data System (ADS)

    Aziz, Mohd Tajul Rizal Ab; Yusof, Yuhanis

    2016-08-01

    Data mining is a platform for extracting hidden knowledge from a collection of data. This study investigates suitable classification models for classifying graduate employment at one of the MARA Professional Colleges (KPM) in Malaysia. The aim is to classify graduates as employed, unemployed, or in further study. Five data mining algorithms offered in WEKA were used: Naïve Bayes, logistic regression, multilayer perceptron, k-nearest neighbor, and the J48 decision tree. The results show that logistic regression produces the highest classification accuracy, 92.5%, obtained using 80% of the data for training and 20% for testing. The resulting classification model will benefit the management of the college, as it provides insight into the quality of the graduates it produces and how its curriculum can be improved to cater to the needs of industry.

  11. Discriminative Hierarchical K-Means Tree for Large-Scale Image Classification.

    PubMed

    Chen, Shizhi; Yang, Xiaodong; Tian, Yingli

    2015-09-01

    A key challenge in large-scale image classification is how to achieve efficiency in terms of both computation and memory without compromising classification accuracy. Learning-based classifiers achieve state-of-the-art accuracies, but have been criticized for computational complexity that grows linearly with the number of classes. Nonparametric nearest neighbor (NN)-based classifiers naturally handle large numbers of categories, but incur prohibitively expensive computation and memory costs. In this brief, we present a novel classification scheme, i.e., discriminative hierarchical K-means tree (D-HKTree), which combines the advantages of both learning-based and NN-based classifiers. The complexity of the D-HKTree grows only sublinearly with the number of categories, which is much better than recent hierarchical support vector machine-based methods. The memory requirement is an order of magnitude less than that of recent Naïve Bayesian NN-based approaches. The proposed D-HKTree classification scheme is evaluated on several challenging benchmark databases and achieves state-of-the-art accuracies with significantly lower computation cost and memory requirement. PMID:25420271

  12. Real-time classification of humans versus animals using profiling sensors and hidden Markov tree model

    NASA Astrophysics Data System (ADS)

    Hossen, Jakir; Jacobs, Eddie L.; Chari, Srikant

    2015-07-01

    Linear pyroelectric array sensors have enabled useful classifications of objects such as humans and animals to be performed with relatively low-cost hardware in border and perimeter security applications. Ongoing research has sought to improve the performance of these sensors through signal processing algorithms. In the research presented here, we introduce the use of hidden Markov tree (HMT) models for object recognition in images generated by linear pyroelectric sensors. HMTs are trained to statistically model the wavelet features of individual objects through an expectation-maximization learning process. Human versus animal classification for a test object is made by evaluating its wavelet features against the trained HMTs using the maximum-likelihood criterion. The classification performance of this approach is compared to two other techniques; a texture, shape, and spectral component features (TSSF) based classifier and a speeded-up robust feature (SURF) classifier. The evaluation indicates that among the three techniques, the wavelet-based HMT model works well, is robust, and has improved classification performance compared to a SURF-based algorithm in equivalent computation time. When compared to the TSSF-based classifier, the HMT model has a slightly degraded performance but almost an order of magnitude improvement in computation time enabling real-time implementation.

  13. Integration of Classification Tree Analyses and Spatial Metrics to Assess Changes in Supraglacial Lakes in the Karakoram Himalaya

    NASA Astrophysics Data System (ADS)

    Bulley, H. N.; Bishop, M. P.; Shroder, J. F.; Haritashya, U. K.

    2007-12-01

    Alpine glacier responses to climate change reveal increases in retreat with corresponding increases in production of glacier melt water and development of supraglacial lakes. The rate of occurrence and spatial extent of lakes in the Himalaya are difficult to determine because current spectral-based image analyses of glacier surfaces are limited by anisotropic reflectance and a lack of high-quality digital elevation models. Additionally, the limitations of multivariate classification algorithms to adequately segregate glacier features in satellite imagery have led to an increased interest in non-parametric methods, such as classification and regression trees. Our objectives are to demonstrate the utility of a semi-automated approach that integrates classification-tree-based image segmentation and object-oriented analysis to differentiate supraglacial lakes from glacier debris, ice cliffs, and lateral and medial moraines. The classification-tree process involves a binary, recursive-partitioning, non-parametric method that can account for non-linear relationships. We used 2002 and 2004 ASTER VNIR and SWIR imagery to assess the Baltoro Glacier in the Karakoram Himalaya. Other input variables include the normalized difference water index (NDWI), ratio images, Moran's I image, and fractal dimension. The classification tree was used to generate initial image segments and it was particularly effective in differentiating glacier features. The object-oriented analysis included the use of shape and spatial metrics to refine the classification-tree output. Classification-tree results show that NDWI is the most important single variable for characterizing the glacier-surface features, followed by NIR/IR ratio, IR band, and IR/Red ratio variables. Lake features extracted from both images show there were 142 lakes in 2002 as compared to 188 lakes in 2004. In general, there was a significant increase in planimetric area from 2002 to 2004, and we documented the formation of 46 new

  14. Tree species classification in subtropical forests using small-footprint full-waveform LiDAR data

    NASA Astrophysics Data System (ADS)

    Cao, Lin; Coops, Nicholas C.; Innes, John L.; Dai, Jinsong; Ruan, Honghua; She, Guanghui

    2016-07-01

    The accurate classification of tree species is critical for the management of forest ecosystems, particularly subtropical forests, which are highly diverse and complex ecosystems. While airborne Light Detection and Ranging (LiDAR) technology offers significant potential to estimate forest structural attributes, the capacity of this new tool to classify species is less well known. In this research, full-waveform metrics were extracted by a voxel-based composite waveform approach and examined with a Random Forests classifier to discriminate six subtropical tree species (i.e., Masson pine (Pinus massoniana Lamb.), Chinese fir (Cunninghamia lanceolata (Lamb.) Hook.), Slash pines (Pinus elliottii Engelm.), Sawtooth oak (Quercus acutissima Carruth.) and Chinese holly (Ilex chinensis Sims.)) at three levels of discrimination. As part of the analysis, the optimal voxel size for modelling the composite waveforms was investigated, the most important predictor metrics for species classification assessed and the effect of scan angle on species discrimination examined. Results demonstrate that all tree species were classified with relatively high accuracy (68.6% for six classes, 75.8% for four main species and 86.2% for conifers and broadleaved trees). Full-waveform metrics (based on height of median energy, waveform distance and number of waveform peaks) demonstrated high classification importance and were stable among various voxel sizes. The results also suggest that the voxel-based approach can alleviate some of the issues associated with large scan angles. In summary, the results indicate that full-waveform LiDAR data have significant potential for tree species classification in the subtropical forests.

  15. Aneurysmal subarachnoid hemorrhage prognostic decision-making algorithm using classification and regression tree analysis

    PubMed Central

    Lo, Benjamin W. Y.; Fukuda, Hitoshi; Angle, Mark; Teitelbaum, Jeanne; Macdonald, R. Loch; Farrokhyar, Forough; Thabane, Lehana; Levine, Mitchell A. H.

    2016-01-01

    Background: Classification and regression tree analysis involves the creation of a decision tree by recursive partitioning of a dataset into more homogeneous subgroups. Thus far, there is scarce literature on using this technique to create clinical prediction tools for aneurysmal subarachnoid hemorrhage (SAH). Methods: The classification and regression tree analysis technique was applied to the multicenter Tirilazad database (3551 patients) in order to create the decision-making algorithm. In order to elucidate prognostic subgroups in aneurysmal SAH, neurologic, systemic, and demographic factors were taken into account. The dependent variable used for analysis was the dichotomized Glasgow Outcome Score at 3 months. Results: Classification and regression tree analysis revealed seven prognostic subgroups. Neurological grade, occurrence of post-admission stroke, occurrence of post-admission fever, and age represented the explanatory nodes of this decision tree. Split sample validation revealed classification accuracy of 79% for the training dataset and 77% for the testing dataset. In addition, the occurrence of fever at 1-week post-aneurysmal SAH is associated with increased odds of post-admission stroke (odds ratio: 1.83, 95% confidence interval: 1.56–2.45, P < 0.01). Conclusions: A clinically useful classification tree was generated, which serves as a prediction tool to guide bedside prognostication and clinical treatment decision making. This prognostic decision-making algorithm also shed light on the complex interactions between a number of risk factors in determining outcome after aneurysmal SAH. PMID:27512607
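
    Recursive partitioning into more homogeneous subgroups, as described above, can be sketched in a few lines. This is a minimal illustration that assumes Gini impurity as the homogeneity measure and midpoint thresholds on numeric predictors; the abstract does not state which splitting criterion was used, and the predictor columns and records below are hypothetical, not drawn from the Tirilazad database.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly homogeneous)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return (feature, threshold, weighted_gini) for the best binary split, or None."""
    best = None
    n = len(rows)
    for j in range(len(rows[0])):
        values = sorted(set(r[j] for r in rows))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # midpoint between adjacent observed values
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (j, t, score)
    return best

def grow(rows, labels, depth=0, max_depth=3, min_size=2):
    """Recursively partition the data; leaves hold the majority class."""
    leaf = {"leaf": Counter(labels).most_common(1)[0][0], "n": len(rows)}
    if depth >= max_depth or len(rows) < min_size or gini(labels) == 0.0:
        return leaf
    split = best_split(rows, labels)
    if split is None or split[2] >= gini(labels):
        return leaf  # no split improves homogeneity
    j, t, _ = split
    left = [(r, y) for r, y in zip(rows, labels) if r[j] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[j] > t]
    return {
        "feature": j,
        "threshold": t,
        "left": grow([r for r, _ in left], [y for _, y in left], depth + 1, max_depth, min_size),
        "right": grow([r for r, _ in right], [y for _, y in right], depth + 1, max_depth, min_size),
    }

def predict(tree, row):
    """Follow the splits down to a leaf and return its majority class."""
    while "leaf" not in tree:
        tree = tree["left"] if row[tree["feature"]] <= tree["threshold"] else tree["right"]
    return tree["leaf"]

# Hypothetical records: [neurological grade, age] with a dichotomized outcome
rows = [[1, 40], [1, 70], [2, 45], [2, 75], [4, 50], [5, 60], [4, 35], [5, 80]]
labels = ["good", "good", "good", "poor", "poor", "poor", "poor", "poor"]
tree = grow(rows, labels)
```

    Each leaf of the resulting tree corresponds to one subgroup, and the path of splits leading to it reads as an if-then rule.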

  16. Biosensor Approach to Psychopathology Classification

    PubMed Central

    Koshelev, Misha; Lohrenz, Terry; Vannucci, Marina; Montague, P. Read

    2010-01-01

    We used a multi-round, two-party exchange game in which a healthy subject played a subject diagnosed with a DSM-IV (Diagnostic and Statistics Manual-IV) disorder, and applied a Bayesian clustering approach to the behavior exhibited by the healthy subject. The goal was to characterize quantitatively the style of play elicited in the healthy subject (the proposer) by their DSM-diagnosed partner (the responder). The approach exploits the dynamics of the behavior elicited in the healthy proposer as a biosensor for cognitive features that characterize the psychopathology group at the other side of the interaction. Using a large cohort of subjects (n = 574), we found statistically significant clustering of proposers' behavior overlapping with a range of DSM-IV disorders including autism spectrum disorder, borderline personality disorder, attention deficit hyperactivity disorder, and major depressive disorder. To further validate these results, we developed a computer agent to replace the human subject in the proposer role (the biosensor) and show that it can also detect these same four DSM-defined disorders. These results suggest that the highly developed social sensitivities that humans bring to a two-party social exchange can be exploited and automated to detect important psychopathologies, using an interpersonal behavioral probe not directly related to the defining diagnostic criteria. PMID:20975934

  17. An automated approach to the design of decision tree classifiers

    NASA Technical Reports Server (NTRS)

    Argentiero, P.; Chin, R.; Beaudet, P.

    1982-01-01

    An automated technique is presented for designing effective decision tree classifiers predicated only on a priori class statistics. The procedure relies on linear feature extractions and Bayes table look-up decision rules. Associated error matrices are computed and utilized to provide an optimal design of the decision tree at each so-called 'node'. A by-product of this procedure is a simple algorithm for computing the global probability of correct classification assuming the statistical independence of the decision rules. Attention is given to a more precise definition of decision tree classification, the mathematical details on the technique for automated decision tree design, and an example of a simple application of the procedure using class statistics acquired from an actual Landsat scene.
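
    One way to read the by-product mentioned above is sketched here: if the decision rules at the nodes are statistically independent, the probability that a sample of a given class is routed correctly is the product of the per-node probabilities along its path, and the global figure is the prior-weighted sum over classes. This reading is an assumption rather than the authors' exact algorithm, and the class names, priors, and per-node probabilities are hypothetical.

```python
from math import prod

def global_correct_probability(priors, path_probs):
    """Prior-weighted probability of correct classification.

    priors:     class -> a priori probability (should sum to 1)
    path_probs: class -> per-node probabilities of a correct decision along
                the path from the root to that class's leaf
    Assumes the decision rules at different nodes are independent.
    """
    return sum(priors[c] * prod(path_probs[c]) for c in priors)

# Hypothetical three-class tree: 'water' is resolved at the root, while
# 'forest' and 'urban' each need a second decision.
priors = {"water": 0.2, "forest": 0.5, "urban": 0.3}
path_probs = {
    "water": [0.99],
    "forest": [0.95, 0.90],
    "urban": [0.95, 0.85],
}

p_correct = global_correct_probability(priors, path_probs)
```

    With these numbers the class terms are 0.2 x 0.99, 0.5 x 0.855, and 0.3 x 0.8075, which sum to 0.86775.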

  18. Information theoretic approach for accounting classification

    NASA Astrophysics Data System (ADS)

    Ribeiro, E. M. S.; Prataviera, G. A.

    2014-12-01

    In this paper we consider an information theoretic approach for the accounting classification process. We propose a matrix formalism and an algorithm for calculating information theoretic measures associated with accounting classification. The formalism may be useful for further generalizations and computer-based implementation. Two information theoretic measures, mutual information and symmetric uncertainty, were evaluated for daily transactions recorded in the chart of accounts of a small company over two years. Variation in the information measures due to the aggregation of data in the process of accounting classification is observed. In particular, the symmetric uncertainty seems to be a useful parameter for comparing companies over time, across sectors, or under different accounting choices and standards.
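
    The two measures named above can be computed from paired categorical observations as in the sketch below. It uses the standard definitions I(X;Y) = H(X) + H(Y) - H(X,Y) and SU = 2 I(X;Y) / (H(X) + H(Y)); the paper's matrix formalism is not reproduced, and the account labels are hypothetical.

```python
import math
from collections import Counter

def entropy(counts):
    """Shannon entropy in bits of a collection of non-negative counts."""
    counts = list(counts)
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def mutual_information(pairs):
    """Return (I(X;Y), H(X), H(Y)) for a list of (x, y) observations."""
    hx = entropy(Counter(x for x, _ in pairs).values())
    hy = entropy(Counter(y for _, y in pairs).values())
    hxy = entropy(Counter(pairs).values())
    return hx + hy - hxy, hx, hy

def symmetric_uncertainty(pairs):
    """Mutual information normalized to [0, 1] by the mean marginal entropy."""
    mi, hx, hy = mutual_information(pairs)
    if hx + hy == 0:
        return 0.0  # both variables are constant
    return 2 * mi / (hx + hy)

# Hypothetical (debited account, credited account) pairs
dependent = [("cash", "sales"), ("stock", "payables")] * 2
independent = [("cash", "sales"), ("cash", "payables"),
               ("stock", "sales"), ("stock", "payables")]

su_dependent = symmetric_uncertainty(dependent)
su_independent = symmetric_uncertainty(independent)
```

    Symmetric uncertainty is 1 when each variable fully determines the other and 0 when they are independent, which is what makes it convenient for comparisons across data sets of different sizes.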

  19. A statistical approach to root system classification

    PubMed Central

    Bodner, Gernot; Leitner, Daniel; Nakhforoosh, Alireza; Sobotik, Monika; Moder, Karl; Kaul, Hans-Peter

    2013-01-01

    Plant root systems have a key role in ecology and agronomy. Despite the rapid increase in root studies, there is still no classification that allows distinguishing among distinctive characteristics within the diversity of rooting strategies. Our hypothesis is that a multivariate approach for “plant functional type” identification in ecology can be applied to the classification of root systems. The classification method presented is based on a data-defined statistical procedure without a priori decision on the classifiers. The study demonstrates that principal-component-based rooting types provide efficient and meaningful multi-trait classifiers. The classification method is exemplified with simulated root architectures and morphological field data. Simulated root architectures showed that morphological attributes with spatial distribution parameters capture the most distinctive features within root system diversity. While developmental type (tap vs. shoot-borne systems) is a strong but coarse classifier, topological traits provide the most detailed differentiation among distinctive groups. Adequacy of commonly available morphologic traits for classification is supported by field data. Rooting types emerging from measured data were mainly distinguished as diameter/weight-dominated and density-dominated types. Similarity of root systems within distinctive groups was the joint result of phylogenetic relation and environmental as well as human selection pressure. We concluded that the data-defined classification is appropriate for integration of knowledge obtained with different root measurement methods and at various scales. Currently, root morphology is the most promising basis for classification due to widely used common measurement protocols. To capture details of root diversity, efforts in architectural measurement techniques are essential. PMID:23914200

  20. A statistical approach to root system classification.

    PubMed

    Bodner, Gernot; Leitner, Daniel; Nakhforoosh, Alireza; Sobotik, Monika; Moder, Karl; Kaul, Hans-Peter

    2013-01-01

    Plant root systems have a key role in ecology and agronomy. Despite the rapid increase in root studies, there is still no classification that allows distinguishing among distinctive characteristics within the diversity of rooting strategies. Our hypothesis is that a multivariate approach for "plant functional type" identification in ecology can be applied to the classification of root systems. The classification method presented is based on a data-defined statistical procedure without a priori decision on the classifiers. The study demonstrates that principal-component-based rooting types provide efficient and meaningful multi-trait classifiers. The classification method is exemplified with simulated root architectures and morphological field data. Simulated root architectures showed that morphological attributes with spatial distribution parameters capture the most distinctive features within root system diversity. While developmental type (tap vs. shoot-borne systems) is a strong but coarse classifier, topological traits provide the most detailed differentiation among distinctive groups. Adequacy of commonly available morphologic traits for classification is supported by field data. Rooting types emerging from measured data were mainly distinguished as diameter/weight-dominated and density-dominated types. Similarity of root systems within distinctive groups was the joint result of phylogenetic relation and environmental as well as human selection pressure. We concluded that the data-defined classification is appropriate for integration of knowledge obtained with different root measurement methods and at various scales. Currently, root morphology is the most promising basis for classification due to widely used common measurement protocols. To capture details of root diversity, efforts in architectural measurement techniques are essential. PMID:23914200

  1. An evaluation of popular hyperspectral images classification approaches

    NASA Astrophysics Data System (ADS)

    Kuznetsov, Andrey; Myasnikov, Vladislav

    2015-12-01

    This work is devoted to the problem of selecting the best hyperspectral image classification algorithm. The following algorithms are compared: decision tree using full cross-validation; C4.5 decision tree; Bayesian classifier; maximum-likelihood method; MSE minimization classifier, including a special case, classification by conjugation; spectral angle classifier (for empirical mean and nearest neighbor); spectral mismatch classifier; and support vector machine (SVM). AVIRIS and SpecTIR hyperspectral images are used to conduct the experiments.

  2. A neural network approach to cloud classification

    NASA Technical Reports Server (NTRS)

    Lee, Jonathan; Weger, Ronald C.; Sengupta, Sailes K.; Welch, Ronald M.

    1990-01-01

    It is shown that, using high-spatial-resolution data, very high cloud classification accuracies can be obtained with a neural network approach. A texture-based neural network classifier using only single-channel visible Landsat MSS imagery achieves an overall cloud identification accuracy of 93 percent. Cirrus can be distinguished from boundary layer cloudiness with an accuracy of 96 percent, without the use of an infrared channel. Stratocumulus is retrieved with an accuracy of 92 percent, cumulus at 90 percent. The use of the neural network does not improve cirrus classification accuracy. Rather, its main effect is in the improved separation between stratocumulus and cumulus cloudiness. While most cloud classification algorithms rely on linear parametric schemes, the present study is based on a nonlinear, nonparametric four-layer neural network approach. A three-layer neural network architecture, the nonparametric K-nearest neighbor approach, and the linear stepwise discriminant analysis procedure are compared. A significant finding is that significantly higher accuracies are attained with the nonparametric approaches using only 20 percent of the database as training data, compared to 67 percent of the database in the linear approach.

  3. A modified decision tree algorithm based on genetic algorithm for mobile user classification problem.

    PubMed

    Liu, Dong-sheng; Fan, Shu-jiang

    2014-01-01

    In order to offer mobile customers better service, we should first classify mobile users. To address the limitations of previous classification methods, this paper puts forward a modified decision tree algorithm for mobile user classification, which introduces a genetic algorithm to optimize the results of the decision tree algorithm. We also take context information as a classification attribute for the mobile user, and we classify the context into public and private context classes. We then analyze the processes and operators of the algorithm. Finally, we conduct an experiment on mobile users with the algorithm: we classify them into Basic service user, E-service user, Plus service user, and Total service user classes and also obtain some rules about mobile users. Compared to the C4.5 decision tree algorithm and the SVM algorithm, the proposed algorithm has higher accuracy and greater simplicity. PMID:24688389

  4. A Modified Decision Tree Algorithm Based on Genetic Algorithm for Mobile User Classification Problem

    PubMed Central

    Liu, Dong-sheng; Fan, Shu-jiang

    2014-01-01

    In order to offer mobile customers better service, we should first classify mobile users. To address the limitations of previous classification methods, this paper puts forward a modified decision tree algorithm for mobile user classification, which introduces a genetic algorithm to optimize the results of the decision tree algorithm. We also take context information as a classification attribute for the mobile user, and we classify the context into public and private context classes. We then analyze the processes and operators of the algorithm. Finally, we conduct an experiment on mobile users with the algorithm: we classify them into Basic service user, E-service user, Plus service user, and Total service user classes and also obtain some rules about mobile users. Compared to the C4.5 decision tree algorithm and the SVM algorithm, the proposed algorithm has higher accuracy and greater simplicity. PMID:24688389

  5. Tree Species Classification By Multiseasonal High Resolution Satellite Data

    NASA Astrophysics Data System (ADS)

    Elatawneh, Alata; Wallner, Adelheid; Straub, Christoph; Schneider, Thomas; Knoke, Thomas

    2013-12-01

    Accurate forest tree species mapping is a fundamental issue for sustainable forest management and planning. Forest tree species mapping by means of remote sensing data is still a topic under investigation. The Bavarian State Institute of Forestry is investigating the potential of using digital aerial images for forest management purposes. However, using aerial images is still cost- and time-consuming, in addition to their acquisition restrictions. New spaceborne sensor generations such as RapidEye, which offer multiseasonal data at very high temporal resolution, have the potential to improve forest tree species mapping. In this study, we investigated the potential of multiseasonal RapidEye data for mapping tree species in a Central European forest in southern Germany. The RapidEye data of level A3 were collected on ten different dates in the years 2009, 2010 and 2011. For data analysis, a model was developed that combines the Spectral Angle Mapper technique with 10-fold cross-validation. The analysis succeeded in differentiating four tree species: Norway spruce (Picea abies L.), Silver Fir (Abies alba Mill.), European beech (Fagus sylvatica) and Maple (Acer pseudoplatanus). Model success was evaluated using digital aerial images acquired in 2009 and inventory point records from the 2008/09 inventory. Model results of the multiseasonal RapidEye data analysis achieved an overall accuracy of 76%. However, the success of the model was evaluated only for all identified species together and not for individual species.
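
    The Spectral Angle Mapper step named above can be illustrated with a short sketch. It shows only the standard angle computation and nearest-reference assignment; the study's cross-validation procedure is not reproduced, and the five-band reference spectra and species keys are hypothetical values chosen for illustration.

```python
import math

def spectral_angle(pixel, reference):
    """Angle in radians between two spectra treated as vectors."""
    dot = sum(p * r for p, r in zip(pixel, reference))
    norm_p = math.sqrt(sum(p * p for p in pixel))
    norm_r = math.sqrt(sum(r * r for r in reference))
    if norm_p == 0 or norm_r == 0:
        raise ValueError("spectra must be non-zero vectors")
    cos = dot / (norm_p * norm_r)
    return math.acos(max(-1.0, min(1.0, cos)))  # clamp against rounding error

def classify(pixel, references):
    """Assign the class whose reference spectrum makes the smallest angle."""
    return min(references, key=lambda name: spectral_angle(pixel, references[name]))

# Hypothetical reflectances in five bands (blue, green, red, red edge, NIR)
references = {
    "spruce": [0.03, 0.05, 0.04, 0.15, 0.30],
    "beech": [0.04, 0.08, 0.05, 0.25, 0.50],
}

# A pixel with the spruce spectral shape at twice the brightness
pixel = [0.06, 0.10, 0.08, 0.30, 0.60]
label = classify(pixel, references)
```

    Because the angle depends only on the direction of the spectral vector, the measure is insensitive to overall brightness, which is why the brighter pixel still matches its reference.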

  6. Iqpc 2015 Track: Tree Separation and Classification in Mobile Mapping LIDAR Data

    NASA Astrophysics Data System (ADS)

    Gorte, B.; Oude Elberink, S.; Sirmacek, B.; Wang, J.

    2015-08-01

    The European FP7 project IQmulus yearly organizes several processing contests, where submissions are requested for novel algorithms for point cloud and other big geodata processing. This paper describes the set-up and execution of a contest having the purpose to evaluate state-of-the-art algorithms for Mobile Mapping System point clouds, in order to detect and identify (individual) trees. By the nature of MMS these are trees in the vicinity of the road network (rather than in forests). Therefore, part of the challenge is distinguishing between trees and other objects, such as buildings, street furniture, cars etc. Three submitted segmentation and classification algorithms are thus evaluated.

  7. A Systematic Approach to Subgroup Classification in Intellectual Disability

    ERIC Educational Resources Information Center

    Schalock, Robert L.; Luckasson, Ruth

    2015-01-01

    This article describes a systematic approach to subgroup classification based on a classification framework and sequential steps involved in the subgrouping process. The sequential steps are stating the purpose of the classification, identifying the classification elements, using relevant information, and using clearly stated and purposeful…

  8. Flow Analysis: A Novel Approach For Classification.

    PubMed

    Vakh, Christina; Falkova, Marina; Timofeeva, Irina; Moskvin, Alexey; Moskvin, Leonid; Bulatov, Andrey

    2016-09-01

    We suggest a novel approach for classification of flow analysis methods according to the conditions under which the mass transfer processes and chemical reactions take place in the flow mode: dispersion-convection flow methods and forced-convection flow methods. The first group includes continuous flow analysis, flow injection analysis, all injection analysis, sequential injection analysis, sequential injection chromatography, cross injection analysis, multi-commutated flow analysis, multi-syringe flow injection analysis, multi-pumping flow systems, loop flow analysis, and simultaneous injection effective mixing flow analysis. The second group includes segmented flow analysis, zone fluidics, flow batch analysis, sequential injection analysis with a mixing chamber, stepwise injection analysis, and multi-commutated stepwise injection analysis. The offered classification allows systematizing a large number of flow analysis methods. Recent developments and applications of dispersion-convection flow methods and forced-convection flow methods are presented. PMID:26364745

  9. Classification and concentration estimation of explosive precursors using nanowires sensor array and decision tree learning

    NASA Astrophysics Data System (ADS)

    Cho, Junghwan; Li, Xiaopeng; Gu, Zhiyong; Kurup, Pradeep

    2011-09-01

    This paper aims to classify and estimate concentrations of explosive precursors using a nanowire sensor array and a decision tree learning algorithm. The nanowire sensor array consists of tin oxide sensors with four different additives: platinum (Pt), copper (Cu), indium (In), and nickel (Ni). The nanowire sensor array was tested using the vapors from four explosive precursors (acetone, nitrobenzene, nitrotoluene, and octane) with 10 different concentration levels each. A pattern recognition technique based on decision tree learning was applied to classify the explosive precursors and estimate their concentration. Classification and regression tree (CART) analysis was used for classification. The CART was also utilized for the purpose of structure identification in a Sugeno fuzzy inference system (FIS) for estimating the concentration of the precursors. Two CARTs were trained and their testing results were investigated.

  10. Automatic Approach to VHR Satellite Image Classification

    NASA Astrophysics Data System (ADS)

    Kupidura, P.; Osińska-Skotak, K.; Pluto-Kossakowska, J.

    2016-06-01

    In this paper, we present a proposal for a fully automatic classification of VHR satellite images. Unlike the most widespread approaches (supervised classification, which requires prior definition of class signatures, and unsupervised classification, which must be followed by an interpretation of its results), the proposed method requires no human intervention except for the setting of the initial parameters. The presented approach is based on both spectral and textural analysis of the image and consists of 3 steps. The first step, the analysis of spectral data, relies on NDVI values. Its purpose is to distinguish between basic classes, such as water, vegetation and non-vegetation, which all differ significantly spectrally and can therefore be easily extracted by spectral analysis. The second step relies on granulometric maps. These are the product of local granulometric analysis of an image and present information on the texture of each pixel neighbourhood, depending on the texture grain. The purpose of texture analysis is to distinguish between classes that are spectrally similar yet of different texture, e.g. bare soil from a built-up area, or low vegetation from a wooded area. Due to the use of granulometric analysis, based on the mathematical morphology operations of opening and closing, the results are resistant to the border effect (qualifying borders of objects in an image as spaces of high texture), which affects other methods of texture analysis like GLCM statistics or fractal analysis. Therefore, the effectiveness of the analysis is relatively high. Several indices based on values of different granulometric maps have been developed to simplify the extraction of classes of different texture. The third and final step of the process relies on a vegetation index, based on near infrared and blue bands. Its purpose is to correct partially misclassified pixels. All the indices used in the classification model developed relate to reflectance values, so the preliminary step
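
    The first step described above, separating basic classes by NDVI, can be sketched as follows. This is only an illustration under stated assumptions: the abstract gives no thresholds, so `water_max` and `vegetation_min` are hypothetical parameters, and the function names are invented.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    total = nir + red
    if total == 0:
        return 0.0  # avoid division by zero for pixels with no signal
    return (nir - red) / total


def basic_class(nir, red, water_max=0.0, vegetation_min=0.3):
    """Assign one of three basic classes from NDVI.

    The two thresholds are hypothetical defaults, not values from the paper.
    """
    value = ndvi(nir, red)
    if value < water_max:
        return "water"
    if value >= vegetation_min:
        return "vegetation"
    return "non-vegetation"


# Invented reflectance pairs (nir, red) for three pixels.
labels = [basic_class(0.50, 0.10), basic_class(0.02, 0.05), basic_class(0.25, 0.20)]
```

    In practice the thresholds would correspond to the "initial parameters" the abstract says must still be set by the user.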

  11. A Classification of Recent Widespread Tree Mortality in the Western US

    NASA Astrophysics Data System (ADS)

    Hicke, J. A.; Anderegg, W.; Allen, C. D.; Stephenson, N.

    2015-12-01

    Widespread tree mortality has been documented across the western United States in recent decades. Climate change has been implicated in these events, in particular warming and associated effects on tree stress and biotic disturbance agents. Given projected future warming, the capability of accurately predicting future tree mortality is critical. However, sufficient ecological understanding is needed to do so. Here we describe differences in various mortality types associated with spatial characteristics and climate drivers. We loosely classify mortality types into four categories: 1) widespread but low severity background mortality that has been increasing mainly because of greater stress associated with rising climatic water deficit; 2) tree die-offs that are driven by severe, hotter drought in which biotic agents play minor roles, such as sudden aspen decline; 3) tree die-offs in which hotter droughts combined with outbreaks of biotic agents, often less aggressive bark beetles, to cause mortality, such as piñon pine mortality in the Southwest; and 4) tree die-offs that were initiated or facilitated by droughts but which were associated with aggressive biotic agents that can kill healthy trees at high populations, such as mountain pine beetle outbreaks. An important use of this classification is the different pathways by which climate change can cause tree mortality. For some classes (background and primarily drought-driven mortality), predictions may be sufficiently accurate based on climate (drought) metrics. For classes in which biotic agents play a role, the direct warming effect on insects may occur through mechanisms not related to drought, and therefore predictions may need to include mechanisms other than drought. We note that this is a simplistic classification designed to facilitate understanding of tree mortality, and that overlap occurs among categories.

  12. Decision Tree Approach for Soil Liquefaction Assessment

    PubMed Central

    Gandomi, Amir H.; Fridline, Mark M.; Roke, David A.

    2013-01-01

    In the current study, the performances of some decision tree (DT) techniques are evaluated for postearthquake soil liquefaction assessment. A database containing 620 records of seismic parameters and soil properties is used in this study. Three decision tree techniques are used here in two different ways, considering statistical and engineering points of view, to develop decision rules. The DT results are compared to the logistic regression (LR) model. The results of this study indicate that the DTs not only successfully predict liquefaction but they can also outperform the LR model. The best DT models are interpreted and evaluated based on an engineering point of view. PMID:24489498

  13. Automatic template-guided classification of remnant trees

    NASA Astrophysics Data System (ADS)

    Kennedy, Peter

    Spectral features within satellite images change so frequently and unpredictably that spectral definitions of land cover are often only accurate for a single image. Consequently, land-cover maps are expensive, because the superior pattern recognition skills of human analysts are required to manually tune spectral definitions of land cover to individual images. To reduce mapping costs, this study developed the Template-Guided Classification (TGC) algorithm, which classifies land cover automatically by reusing class information embedded in freely available large-area land-cover maps. TGC was applied to map remnant forest within six 10-m resolution SPOT images of the Vermilion River watershed in Alberta, Canada. Although the accuracy of the resulting forest maps was low (58 % forest user's accuracy and 67 % forest producer's accuracy), there were 25 % and 8 % fewer errors of omission and commission than the original maps, respectively. This improvement would be very useful if it could be obtained automatically over large areas.

  14. Effects of sample survey design on the accuracy of classification tree models in species distribution models

    USGS Publications Warehouse

    Edwards, T.C., Jr.; Cutler, D.R.; Zimmermann, N.E.; Geiser, L.; Moisen, G.G.

    2006-01-01

    We evaluated the effects of probabilistic (hereafter DESIGN) and non-probabilistic (PURPOSIVE) sample surveys on resultant classification tree models for predicting the presence of four lichen species in the Pacific Northwest, USA. Models derived from both survey forms were assessed using an independent data set (EVALUATION). Measures of accuracy as gauged by resubstitution rates were similar for each lichen species irrespective of the underlying sample survey form. Cross-validation estimates of prediction accuracies were lower than resubstitution accuracies for all species and both design types, and in all cases were closer to the true prediction accuracies based on the EVALUATION data set. We argue that greater emphasis should be placed on calculating and reporting cross-validation accuracy rates rather than simple resubstitution accuracy rates. Evaluation of the DESIGN and PURPOSIVE tree models on the EVALUATION data set shows significantly lower prediction accuracy for the PURPOSIVE tree models relative to the DESIGN models, indicating that non-probabilistic sample surveys may generate models with limited predictive capability. These differences were consistent across all four lichen species, with 11 of the 12 possible species and sample survey type comparisons having significantly lower accuracy rates. Some differences in accuracy were as large as 50%. The classification tree structures also differed considerably both among and within the modelled species, depending on the sample survey form. Overlap in the predictor variables selected by the DESIGN and PURPOSIVE tree models ranged from only 20% to 38%, indicating the classification trees fit the two evaluated survey forms on different sets of predictor variables. The magnitude of these differences in predictor variables throws doubt on ecological interpretation derived from prediction models based on non-probabilistic sample surveys. © 2006 Elsevier B.V. All rights reserved.
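
    The gap between resubstitution and cross-validation accuracy that this abstract emphasizes can be reproduced on a toy problem. The sketch below deliberately swaps the classification tree for a 1-nearest-neighbour rule, because that rule memorizes its training data and makes the optimism of resubstitution easy to see; the data and function names are invented.

```python
def nearest_label(x, train):
    """Label of the training point closest to x (1-nearest-neighbour)."""
    return min(train, key=lambda item: abs(item[0] - x))[1]


def resubstitution_accuracy(data):
    """Accuracy when the model is scored on the data it was fitted to."""
    hits = sum(nearest_label(x, data) == y for x, y in data)
    return hits / len(data)


def loo_cv_accuracy(data):
    """Leave-one-out cross-validation: each point is predicted without itself."""
    hits = 0
    for i, (x, y) in enumerate(data):
        held_out = data[:i] + data[i + 1:]
        hits += nearest_label(x, held_out) == y
    return hits / len(data)


# Invented presence/absence records along a single hypothetical gradient.
data = [
    (0.0, "absent"), (1.0, "absent"), (2.0, "present"), (3.0, "absent"),
    (4.0, "present"), (5.0, "present"), (6.0, "absent"), (7.0, "present"),
]

resub = resubstitution_accuracy(data)
cv = loo_cv_accuracy(data)
```

    On these records the resubstitution score is perfect because every point is its own nearest neighbour, while the leave-one-out estimate is much lower, which mirrors the direction of the difference reported in the abstract.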

  15. Stroke Damage Detection Using Classification Trees on Electrical Bioimpedance Cerebral Spectroscopy Measurements

    PubMed Central

    Atefi, Seyed Reza; Seoane, Fernando; Thorlin, Thorleif; Lindecrantz, Kaj

    2013-01-01

    After cancer and cardio-vascular disease, stroke is the third greatest cause of death worldwide. Given the limitations of the current imaging technologies used for stroke diagnosis, the need for portable, non-invasive and less expensive diagnostic tools is crucial. Previous studies have suggested that electrical bioimpedance (EBI) measurements from the head might contain useful clinical information related to changes produced in the cerebral tissue after the onset of stroke. In this study, we recorded 720 EBI Spectroscopy (EBIS) measurements from two different head regions of 18 hemispheres of nine subjects. Three of these subjects had suffered a unilateral haemorrhagic stroke. A number of features based on structural and intrinsic frequency-dependent properties of the cerebral tissue were extracted. These features were then fed into a classification tree. The results show that a full classification of damaged and undamaged cerebral tissue was achieved after three hierarchical classification steps. Lastly, the performance of the classification tree was assessed using Leave-One-Out Cross Validation (LOO-CV). Despite the fact that the results of this study are limited to a small database, and the observations obtained must be verified further with a larger cohort of patients, these findings confirm that EBI measurements contain useful information for assessing the health of brain tissue after stroke and support the hypothesis that classification features based on Cole parameters, spectral information and the geometry of EBIS measurements are useful to differentiate between healthy and stroke-damaged brain tissue. PMID:23966181

  16. Computer-aided diagnosis of Alzheimer's disease using support vector machines and classification trees

    NASA Astrophysics Data System (ADS)

    Salas-Gonzalez, D.; Górriz, J. M.; Ramírez, J.; López, M.; Álvarez, I.; Segovia, F.; Chaves, R.; Puntonet, C. G.

    2010-05-01

    This paper presents a computer-aided diagnosis technique for improving the accuracy of early diagnosis of Alzheimer-type dementia. The proposed methodology is based on the selection of voxels for which the Welch's t-test statistic between the two classes, normal and Alzheimer images, is greater than a given threshold. The mean and standard deviation of intensity values are calculated for the selected voxels. These are used as feature vectors for two different classifiers: support vector machines with linear kernel and classification trees. The proposed methodology reaches greater than 95% accuracy in the classification task.
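
    The voxel selection step can be sketched with the standard Welch's t statistic. This is a minimal sketch under assumptions: the abstract does not say whether the threshold is applied to the signed or absolute statistic (the absolute value is assumed here), images are represented as flat lists of intensities, and all names and data are invented.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two samples with possibly unequal variances."""
    return (mean(a) - mean(b)) / math.sqrt(variance(a) / len(a) + variance(b) / len(b))


def select_voxels(normal, alzheimer, threshold):
    """Indices of voxels whose |t| between the two groups exceeds threshold.

    Each group is a list of images; each image is a flat list of intensities.
    Using the absolute value of t is an assumption, not stated in the abstract.
    """
    selected = []
    for v in range(len(normal[0])):
        a = [image[v] for image in normal]
        b = [image[v] for image in alzheimer]
        if abs(welch_t(a, b)) > threshold:
            selected.append(v)
    return selected


# Invented three-voxel "images": only voxel 0 differs between the groups.
normal = [[10, 5, 7], [11, 6, 3], [9, 4, 5]]
alzheimer = [[5, 5, 6], [6, 6, 4], [4, 4, 5]]

chosen = select_voxels(normal, alzheimer, threshold=2.0)
```

    The mean and standard deviation of the intensities at the chosen indices would then form the feature vector passed to the classifiers.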

  17. The PhyloFacts FAT-CAT web server: ortholog identification and function prediction using fast approximate tree classification.

    PubMed

    Afrasiabi, Cyrus; Samad, Bushra; Dineen, David; Meacham, Christopher; Sjölander, Kimmen

    2013-07-01

    The PhyloFacts 'Fast Approximate Tree Classification' (FAT-CAT) web server provides a novel approach to ortholog identification using subtree hidden Markov model-based placement of protein sequences to phylogenomic orthology groups in the PhyloFacts database. Results on a data set of microbial, plant and animal proteins demonstrate FAT-CAT's high precision at separating orthologs and paralogs and robustness to promiscuous domains. We also present results documenting the precision of ortholog identification based on subtree hidden Markov model scoring. The FAT-CAT phylogenetic placement is used to derive a functional annotation for the query, including confidence scores and drill-down capabilities. PhyloFacts' broad taxonomic and functional coverage, with >7.3 M proteins from across the Tree of Life, enables FAT-CAT to predict orthologs and assign function for most sequence inputs. Four pipeline parameter presets are provided to handle different sequence types, including partial sequences and proteins containing promiscuous domains; users can also modify individual parameters. PhyloFacts trees matching the query can be viewed interactively online using the PhyloScope Javascript tree viewer and are hyperlinked to various external databases. The FAT-CAT web server is available at http://phylogenomics.berkeley.edu/phylofacts/fatcat/. PMID:23685612

  18. Semi-automatic approach for music classification

    NASA Astrophysics Data System (ADS)

    Zhang, Tong

    2003-11-01

    Audio categorization is essential when managing a music database, either a professional library or a personal collection. However, a complete automation in categorizing music into proper classes for browsing and searching is not yet supported by today's technology. Also, the issue of music classification is subjective to some extent as each user may have his own criteria for categorizing music. In this paper, we propose the idea of semi-automatic music classification. With this approach, a music browsing system is set up which contains a set of tools for separating music into a number of broad types (e.g. male solo, female solo, string instruments performance, etc.) using existing music analysis methods. With results of the automatic process, the user may further cluster music pieces in the database into finer classes and/or adjust misclassifications manually according to his own preferences and definitions. Such a system may greatly improve the efficiency of music browsing and retrieval, while at the same time guarantee accuracy and user's satisfaction of the results. Since this semi-automatic system has two parts, i.e. the automatic part and the manual part, they are described separately in the paper, with detailed descriptions and examples of each step of the two parts included.

  19. A Nonparametric Approach to Estimate Classification Accuracy and Consistency

    ERIC Educational Resources Information Center

    Lathrop, Quinn N.; Cheng, Ying

    2014-01-01

    When cut scores for classifications occur on the total score scale, popular methods for estimating classification accuracy (CA) and classification consistency (CC) require assumptions about a parametric form of the test scores or about a parametric response model, such as item response theory (IRT). This article develops an approach to estimate CA…

  20. Identification and classification of dynamic event tree scenarios via possibilistic clustering: application to a steam generator tube rupture event.

    PubMed

    Mercurio, D; Podofillini, L; Zio, E; Dang, V N

    2009-11-01

    This paper illustrates a method to identify and classify scenarios generated in a dynamic event tree (DET) analysis. Identification and classification are carried out by means of an evolutionary possibilistic fuzzy C-means clustering algorithm which takes into account not only the final system states but also the timing of the events and the process evolution. An application is considered with regards to the scenarios generated following a steam generator tube rupture in a nuclear power plant. The scenarios are generated by the accident dynamic simulator (ADS), coupled to a RELAP code that simulates the thermo-hydraulic behavior of the plant and to an operators' crew model, which simulates their cognitive and procedures-guided responses. A set of 60 scenarios has been generated by the ADS DET tool. The classification approach has grouped the 60 scenarios into 4 classes of dominant scenarios, one of which was not anticipated a priori but was "discovered" by the classifier. The proposed approach may be considered as a first effort towards the application of identification and classification approaches to scenarios post-processing for real-scale dynamic safety assessments. PMID:19819366

  1. The minimum distance approach to classification

    NASA Technical Reports Server (NTRS)

    Wacker, A. G.; Landgrebe, D. A.

    1971-01-01

    Work to advance the state-of-the-art of minimum distance classification is reported. This is accomplished through a combination of theoretical and comprehensive experimental investigations based on multispectral scanner data. A survey of the literature for suitable distance measures was conducted and the results of this survey are presented. It is shown that minimum distance classification, using density estimators and Kullback-Leibler numbers as the distance measure, is equivalent to a form of maximum likelihood sample classification. It is also shown that for the parametric case, minimum distance classification is equivalent to nearest neighbor classification in the parameter space.
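
    A small parametric illustration of minimum distance sample classification with a Kullback-Leibler distance: a whole sample is summarized by an estimated Gaussian and assigned to the class whose distribution is closest. This is a sketch under assumptions, not the report's procedure: one-dimensional Gaussians are assumed, the direction of the divergence (sample to class) is a choice made here, and the class parameters are invented.

```python
import math
from statistics import mean, pstdev


def kl_gaussian(mu0, sigma0, mu1, sigma1):
    """KL divergence KL(N(mu0, sigma0^2) || N(mu1, sigma1^2)) for 1-D Gaussians."""
    return (
        math.log(sigma1 / sigma0)
        + (sigma0 ** 2 + (mu0 - mu1) ** 2) / (2 * sigma1 ** 2)
        - 0.5
    )


def classify_sample(sample, classes):
    """Assign a set of observations to the class with the smallest KL divergence.

    classes maps a label to a (mean, standard deviation) pair.
    """
    mu, sigma = mean(sample), pstdev(sample)
    return min(classes, key=lambda name: kl_gaussian(mu, sigma, *classes[name]))


# Hypothetical class distributions and an invented sample of observations.
classes = {"A": (0.0, 1.0), "B": (5.0, 1.0)}
label = classify_sample([4.8, 5.1, 5.3, 4.9], classes)
```

    Here each class is a point (mean, standard deviation) in parameter space and the sample is assigned to the closest one, which is the sense in which the parametric case resembles nearest neighbor classification.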

  2. An object-oriented approach for agricultural land classification using RapidEye imagery

    NASA Astrophysics Data System (ADS)

    Sang, H.; Zhai, L.; Zhang, J.; An, F.

    2015-06-01

    With the improvement of remote sensing technology, the spatial, structural and texture information of land covers is clearly present in high resolution imagery, which enhances the ability of crop mapping. Since the RapidEye satellite was launched in 2009, high resolution multispectral imagery together with a wide red edge band has been utilized in vegetation monitoring. Vegetation indices related to the broad red edge band improved land use classification and vegetation studies. RapidEye high resolution imagery acquired on May 29 and August 9 of 2012 was used in this study to evaluate the potential of the red edge band in agricultural land cover/use mapping using an object-oriented classification approach. A new object-oriented decision tree classifier was introduced in this study to map agricultural lands in the study area. Besides the five bands of the RapidEye image, the vegetation indices derived from spectral bands and the structural and texture features are utilized as inputs for agricultural land cover/use mapping in the study. The optimization of input features for classification by reducing redundant information improves the mapping precision by over 9% for AdaTree.WL and 5% for SVM; the accuracy is over 90% for both approaches. The time phase characteristic is very important in different agricultural lands, and it improves the classification accuracy by 7% for AdaTree.WL and 6% for SVM.

  3. Classification tree and minimum-volume ellipsoid analyses of the distribution of ponderosa pine in the western USA

    USGS Publications Warehouse

    Norris, Jodi R.; Jackson, Stephen T.; Betancourt, Julio L.

    2006-01-01

    Aim: Ponderosa pine (Pinus ponderosa Douglas ex Lawson & C. Lawson) is an economically and ecologically important conifer that has a wide geographic range in the western USA, but is mostly absent from the geographic centre of its distribution - the Great Basin and adjoining mountain ranges. Much of its modern range was achieved by migration of geographically distinct Sierra Nevada (P. ponderosa var. ponderosa) and Rocky Mountain (P. ponderosa var. scopulorum) varieties in the last 10,000 years. Previous research has confirmed genetic differences between the two varieties, and measurable genetic exchange occurs where their ranges now overlap in western Montana. A variety of approaches in bioclimatic modelling is required to explore the ecological differences between these varieties and their implications for historical biogeography and impending changes in western landscapes. Location: Western USA. Methods: We used a classification tree analysis and a minimum-volume ellipsoid as models to explain the broad patterns of distribution of ponderosa pine in modern environments using climatic and edaphic variables. Most biogeographical modelling assumes that the target group represents a single, ecologically uniform taxonomic population. Classification tree analysis does not require this assumption because it allows the creation of pathways that predict multiple positive and negative outcomes. Thus, classification tree analysis can be used to test the ecological uniformity of the species. In addition, a multidimensional ellipsoid was constructed to describe the niche of each variety of ponderosa pine, and distances from the niche were calculated and mapped on a 4-km grid for each ecological variable. Results: The resulting classification tree identified three dominant pathways predicting ponderosa pine presence. Two of these three pathways correspond roughly to the distribution of var. ponderosa, and the third pathway generally corresponds to the distribution of var

  4. Classification of Tree Species in Overstorey Canopy of Subtropical Forest Using QuickBird Images

    PubMed Central

    Lin, Chinsu; Popescu, Sorin C.; Thomson, Gavin; Tsogt, Khongor; Chang, Chein-I

    2015-01-01

    This paper proposes a supervised classification scheme to identify 40 tree species (2 coniferous, 38 broadleaf) belonging to 22 families and 36 genera in high spatial resolution QuickBird multispectral images (HMS). Overall kappa coefficient (OKC) and species conditional kappa coefficients (SCKC) were used to evaluate classification performance in training samples and estimate accuracy and uncertainty in test samples. Baseline classification performance using HMS images and vegetation index (VI) images were evaluated with an OKC value of 0.58 and 0.48 respectively, but performance improved significantly (up to 0.99) when used in combination with an HMS spectral-spatial texture image (SpecTex). One of the 40 species had very high conditional kappa coefficient performance (SCKC ≥ 0.95) using 4-band HMS and 5-band VIs images, but, only five species had lower performance (0.68 ≤ SCKC ≤ 0.94) using the SpecTex images. When SpecTex images were combined with a Visible Atmospherically Resistant Index (VARI), there was a significant improvement in performance in the training samples. The same level of improvement could not be replicated in the test samples indicating that a high degree of uncertainty exists in species classification accuracy which may be due to individual tree crown density, leaf greenness (inter-canopy gaps), and noise in the background environment (intra-canopy gaps). These factors increase uncertainty in the spectral texture features and therefore represent potential problems when using pixel-based classification techniques for multi-species classification. PMID:25978466
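
    For readers unfamiliar with the kappa statistic used in this abstract, the sketch below computes an overall Cohen's kappa from a confusion matrix. It is a generic illustration with an invented two-class matrix, not the paper's 40-species data; the species conditional kappa is not shown.

```python
def cohen_kappa(confusion):
    """Cohen's kappa from a square confusion matrix.

    Rows are reference classes and columns are predicted classes.
    """
    size = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: proportion of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(size)) / total
    # Chance agreement: sum over classes of (row total * column total) / total^2.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion) for i in range(size)
    ) / (total * total)
    return (observed - expected) / (1 - expected)


# Invented two-class confusion matrix with 100 samples.
kappa = cohen_kappa([[45, 5], [10, 40]])
```

    Kappa discounts the agreement expected by chance, so it is lower than the raw proportion of correctly classified samples whenever chance agreement is non-zero.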

  5. Automatic lung nodule classification with radiomics approach

    NASA Astrophysics Data System (ADS)

    Ma, Jingchen; Wang, Qian; Ren, Yacheng; Hu, Haibo; Zhao, Jun

    2016-03-01

    Lung cancer is the leading cause of cancer deaths. Malignant lung nodules have extremely high mortality, while some benign nodules do not need any treatment. Thus, accurate discrimination between benign and malignant nodules is necessary. Notably, although an additional invasive biopsy or a second CT scan 3 months later may currently help radiologists to make judgments, easier diagnostic approaches are urgently needed. In this paper, we propose a novel CAD method to distinguish benign from malignant lung nodules directly from CT images, which can not only improve the efficiency of tumor diagnosis but also greatly decrease the pain and risk to patients in the biopsy collection process. Briefly, following the state-of-the-art radiomics approach, 583 features were used in the first step to measure the nodules' intensity, shape, heterogeneity and multi-frequency information. Further, with the Random Forest method, we distinguish the benign nodules from malignant nodules by analyzing all these features. Notably, our proposed scheme was tested on all 79 CT scans with diagnosis data available in The Cancer Imaging Archive (TCIA), which contain 127 nodules; each nodule is annotated by at least one of four radiologists participating in the project. This method achieved 82.7% accuracy in the classification of malignant primary lung nodules and benign nodules. We believe it would bring much value to routine lung cancer diagnosis in CT imaging and provide improved decision support at much lower cost.

  6. Application of Decision Tree Algorithm for classification and identification of natural minerals using SEM-EDS

    NASA Astrophysics Data System (ADS)

    Akkaş, Efe; Akin, Lutfiye; Evren Çubukçu, H.; Artuner, Harun

    2015-07-01

    A mineral is a natural, homogeneous solid with a definite chemical composition and a highly ordered atomic arrangement. Recently, fast and accurate mineral identification/classification has become a necessity. Energy Dispersive X-ray Spectrometers integrated with Scanning Electron Microscopes (SEM) are used to obtain rapid and reliable elemental analysis or chemical characterization of a solid. However, mineral identification is challenging since there is a wide range of spectral datasets for natural minerals. The more mineralogical data are acquired, the more time is required for classification procedures. Moreover, the instrumental conditions applied on a SEM-EDS differ for various applications, affecting the produced X-ray patterns even for the same mineral. This study aims to test whether the C5.0 Decision Tree is a rapid and reliable algorithm for classification and identification of various natural magmatic minerals. Ten distinct mineral groups (olivine, orthopyroxene, clinopyroxene, apatite, amphibole, plagioclase, K-feldspar, zircon, magnetite, biotite) from different igneous rocks have been analyzed on SEM-EDS. 4601 elemental X-ray intensity data have been collected under various instrumental conditions. 2400 elemental data have been used for training and the remaining 2201 data have been used to test the identification of the minerals. The vast majority of the test data have been classified accurately. Additionally, high accuracy has been reached on the minerals with similar chemical composition, such as olivine ((Mg,Fe)2[SiO4]) and orthopyroxene ((Mg,Fe)2[Si2O6]). Furthermore, two members from the amphibole group (magnesiohastingsite, tschermakite) and two from the clinopyroxene group (diopside, hedenbergite) have been accurately identified by the Decision Tree Algorithm. These results demonstrate that the C5.0 Decision Tree Algorithm is an efficient method for mineral group classification and the identification of mineral members.

  7. Stratification of the severity of critically ill patients with classification trees

    PubMed Central

    2009-01-01

    Background Development of three classification trees (CT) based on the CART (Classification and Regression Trees), CHAID (Chi-Square Automatic Interaction Detection) and C4.5 methodologies for the calculation of probability of hospital mortality; the comparison of the results with the APACHE II, SAPS II and MPM II-24 scores, and with a model based on multiple logistic regression (LR). Methods Retrospective study of 2864 patients. Random partition (70:30) into a Development Set (DS) n = 1808 and Validation Set (VS) n = 808. Their properties of discrimination are compared with the ROC curve (AUC CI 95%), Percent of correct classification (PCC CI 95%); and the calibration with the Calibration Curve and the Standardized Mortality Ratio (SMR CI 95%). Results CTs are produced with a different selection of variables and decision rules: CART (5 variables and 8 decision rules), CHAID (7 variables and 15 rules) and C4.5 (6 variables and 10 rules). The common variables were: inotropic therapy, Glasgow, age, (A-a)O2 gradient and antecedent of chronic illness. In VS: all the models achieved acceptable discrimination with AUC above 0.7. CT: CART (0.75(0.71-0.81)), CHAID (0.76(0.72-0.79)) and C4.5 (0.76(0.73-0.80)). PCC: CART (72(69-75)), CHAID (72(69-75)) and C4.5 (76(73-79)). Calibration (SMR) better in the CT: CART (1.04(0.95-1.31)), CHAID (1.06(0.97-1.15)) and C4.5 (1.08(0.98-1.16)). Conclusion With different methodologies of CTs, trees are generated with different selection of variables and decision rules. The CTs are easy to interpret, and they stratify the risk of hospital mortality. The CTs should be taken into account for the classification of the prognosis of critically ill patients. PMID:20003229
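
    Two of the measures reported above can be sketched in a few lines: the Standardized Mortality Ratio (observed deaths divided by the deaths expected from the predicted probabilities) and the percent of correct classification. This is an illustration only: the 0.5 cutoff, the function names and the five patient records are assumptions, and confidence intervals are not computed.

```python
def standardized_mortality_ratio(died, predicted_risk):
    """Observed deaths divided by expected deaths (sum of predicted risks)."""
    return sum(died) / sum(predicted_risk)


def percent_correct(died, predicted_risk, cutoff=0.5):
    """Percent of patients whose outcome matches the thresholded prediction.

    The cutoff of 0.5 is a hypothetical default, not taken from the study.
    """
    hits = sum((risk >= cutoff) == bool(d) for d, risk in zip(died, predicted_risk))
    return 100.0 * hits / len(died)


# Invented outcomes (1 = died in hospital) and predicted mortality probabilities.
died = [1, 0, 0, 1, 0]
predicted_risk = [0.8, 0.1, 0.3, 0.6, 0.2]

smr = standardized_mortality_ratio(died, predicted_risk)
pcc = percent_correct(died, predicted_risk)
```

    An SMR close to 1 indicates that the model predicts about as many deaths as were observed, which is how the calibration figures in the abstract are read.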

  8. An Optimized NBC Approach in Text Classification

    NASA Astrophysics Data System (ADS)

    Yao, Zhao; Zhi-Min, Chen

    State-of-the-art text classification algorithms are good at categorizing Web documents into a few categories. But such a classification method does not give very detailed topic-related class information for the user because the first two levels are often too coarse in large-scale text hierarchies. In this paper, we propose a method named DNB which can improve the performance of classification effectively in experimental results.

  9. Comparing Methodologies for Developing an Early Warning System: Classification and Regression Tree Model versus Logistic Regression. REL 2015-077

    ERIC Educational Resources Information Center

    Koon, Sharon; Petscher, Yaacov

    2015-01-01

    The purpose of this report was to explicate the use of logistic regression and classification and regression tree (CART) analysis in the development of early warning systems. It was motivated by state education leaders' interest in maintaining high classification accuracy while simultaneously improving practitioner understanding of the rules…

  10. A systematic approach to the classification of diseases.

    PubMed

    Murthy, A R

    1993-01-01

    Ayurvedic texts have adopted multiple approaches to the classification of diseases. Caraka, while choosing a binary classification in the Vimana sthana, declares that classifications may be numerable or innumerable depending on the criteria chosen for such classification. He gives the individual full liberty to devise newer classifications, provided the criteria are different. Taking a cue from this statement, an attempt has been made to categorize the diseases mentioned in Ayurvedic texts under different systems, in keeping with current practice in the Western medical sciences. PMID:22556612

  11. A SYSTEMATIC APPROACH TO THE CLASSIFICATION OF DISEASES

    PubMed Central

    Murthy, A.R.V.

    1993-01-01

    Ayurvedic texts have adopted multiple approaches to the classification of diseases. Caraka, while choosing a binary classification in the Vimana sthana, declares that classifications may be numerable or innumerable depending on the criteria chosen for such classification. He gives the individual full liberty to devise newer classifications, provided the criteria are different. Taking a cue from this statement, an attempt has been made to categorize the diseases mentioned in Ayurvedic texts under different systems, in keeping with current practice in the Western medical sciences. PMID:22556612

  12. Applying Classification Trees to Hospital Administrative Data to Identify Patients with Lower Gastrointestinal Bleeding

    PubMed Central

    Siddique, Juned; Ruhnke, Gregory W.; Flores, Andrea; Prochaska, Micah T.; Paesch, Elizabeth; Meltzer, David O.; Whelan, Chad T.

    2015-01-01

    Background Lower gastrointestinal bleeding (LGIB) is a common cause of acute hospitalization. Currently, there is no accepted standard for identifying patients with LGIB in hospital administrative data. The objective of this study was to develop and validate a set of classification algorithms that use hospital administrative data to identify LGIB. Methods Our sample consists of patients admitted between July 1, 2001 and June 30, 2003 (derivation cohort) and July 1, 2003 and June 30, 2005 (validation cohort) to the general medicine inpatient service of the University of Chicago Hospital, a large urban academic medical center. Confirmed cases of LGIB in both cohorts were determined by reviewing the charts of those patients who had at least 1 of 36 principal or secondary International Classification of Diseases, Ninth revision, Clinical Modification (ICD-9-CM) diagnosis codes associated with LGIB. Classification trees were used on the data of the derivation cohort to develop a set of decision rules for identifying patients with LGIB. These rules were then applied to the validation cohort to assess their performance. Results Three classification algorithms were identified and validated: a high specificity rule with 80.1% sensitivity and 95.8% specificity, a rule that balances sensitivity and specificity (87.8% sensitivity, 90.9% specificity), and a high sensitivity rule with 100% sensitivity and 91.0% specificity. Conclusion These classification algorithms can be used in future studies to evaluate resource utilization and assess outcomes associated with LGIB without the use of chart review. PMID:26406318
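
    The sensitivity and specificity figures quoted for the three rules follow from a standard two-by-two comparison of rule output against chart review. The sketch below shows that calculation; the counts are hypothetical placeholders, not the study's validation-cohort counts, and the function name is illustrative.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp: true cases flagged by the rule     fn: true cases missed
    tn: non-cases correctly not flagged    fp: non-cases wrongly flagged
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one true case and one non-case")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical counts for illustration only.
sens, spec = sensitivity_specificity(tp=90, fn=10, tn=180, fp=20)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```

    Tightening a rule typically reduces false positives at the cost of more missed cases, which is the kind of trade-off the record describes between its high-specificity and high-sensitivity rules.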

  13. A discrete element modelling approach for block impacts on trees

    NASA Astrophysics Data System (ADS)

    Toe, David; Bourrier, Franck; Olmedo, Ignatio; Berger, Frederic

    2015-04-01

    In the past few years, rockfall models that explicitly account for block shape, especially those using the Discrete Element Method (DEM), have shown a good ability to predict rockfall trajectories. Integrating forest effects into those models still remains challenging. This study aims to use a DEM approach to model impacts of blocks on trees and to identify the key parameters controlling the block kinematics after the impact on a tree. A DEM impact model of a block on a tree was developed and validated using laboratory experiments. Then, key parameters were assessed using a global sensitivity analysis. Modelling the impact of a block on a tree using DEM allows large displacements, material non-linearities and contacts between the block and the tree to be taken into account. Tree stems are represented by flexible cylinders modelled as plastic beams sustaining normal, shearing, bending, and twisting loading. Root-soil interactions are modelled using a rotational stiffness acting on the bending moment at the bottom of the tree and a limit bending moment to account for tree overturning. The crown is taken into account using an additional mass distributed uniformly over the upper part of the tree. The block is represented by a sphere. The contact model between the block and the stem is an elastic frictional model. The DEM model was validated using laboratory impact tests carried out on 41 fresh beech (Fagus sylvatica) stems. Each stem was 1.3 m long with a diameter between 3 and 7 cm. Wood stems were clamped on a rigid structure and impacted by a 149 kg Charpy pendulum. Finally, an intensive simulation campaign of blocks impacting trees was carried out to identify the input parameters controlling the block kinematics after the impact on a tree. Twenty input parameters were considered in the DEM simulation model: 12 parameters were related to the tree and 8 parameters to the block. The results highlight that the impact velocity, the stem diameter, and the block volume are the three input

  14. Analysis of Maryland Poisoning Deaths Using Classification And Regression Tree (CART) Analysis

    PubMed Central

    Pamer, Carol; Serpi, Tracey; Finkelstein, Joseph

    2008-01-01

    Our study is a cross-sectional analysis of Maryland poisoning deaths for years 2003 and 2004. We used Classification and Regression Tree (CART) methodology to classify 1,204 Maryland undetermined intent poisoning deaths as either unintentional or suicidal poisonings. The predictive ability of the selected set of variables (i.e., poisoned in the home or workplace, location type where poisoned, place of death, poison type, victim race and age, year of death) was extremely good. Of the 301 test cases, only eight were misclassified by the CART regression tree. Of 1,204 undetermined intent poisoning deaths, CART classified 903 as suicides and 301 as unintentional deaths. The major strength of our study is the use of CART to differentiate with a high degree of accuracy between unintentional and suicidal poisoning deaths among Maryland undetermined intent poisoning deaths. PMID:18999168

  15. Predicting Chemically Induced Duodenal Ulcer and Adrenal Necrosis with Classification Trees

    NASA Astrophysics Data System (ADS)

    Giampaolo, Casimiro; Gray, Andrew T.; Olshen, Richard A.; Szabo, Sandor

    1991-07-01

    Binary tree-structured statistical classification algorithms and properties of 56 model alkyl nucleophiles were brought to bear on two problems of experimental pharmacology and toxicology. Each rat of a learning sample of 745 was administered one compound and autopsied to determine the presence of duodenal ulcer or adrenal hemorrhagic necrosis. The cited statistical classification schemes were then applied to these outcomes and 67 features of the compounds to ascertain those characteristics that are associated with biologic activity. For predicting duodenal ulceration, dipole moment, melting point, and solubility in octanol are particularly important, while for predicting adrenal necrosis, important features include the number of sulfhydryl groups and double bonds. These methods may constitute inexpensive but powerful ways to screen untested compounds for possible organ-specific toxicity. Mechanisms for the etiology and pathogenesis of the duodenal and adrenal lesions are suggested, as are additional avenues for drug design.

  16. Using the PDD Behavior Inventory as a Level 2 Screener: A Classification and Regression Trees Analysis.

    PubMed

    Cohen, Ira L; Liu, Xudong; Hudson, Melissa; Gillis, Jennifer; Cavalari, Rachel N S; Romanczyk, Raymond G; Karmel, Bernard Z; Gardner, Judith M

    2016-09-01

    In order to improve discrimination accuracy between Autism Spectrum Disorder (ASD) and similar neurodevelopmental disorders, a data mining procedure, Classification and Regression Trees (CART), was used on a large multi-site sample of PDD Behavior Inventory (PDDBI) forms on children with and without ASD. Discrimination accuracy exceeded 80%, generalized to an independent validation set and across age groups and sites, and agreed well with ADOS classifications. Parent PDDBIs yielded better results than teacher PDDBIs but, when CART predictions agreed across informants, sensitivity increased. Results also revealed three subtypes of ASD: minimally verbal, verbal, and atypical; and two relatively common subtypes of non-ASD children: social pragmatic problems and good social skills. These subgroups corresponded to differences in behavior profiles and associated bio-medical findings. PMID:27318809

  17. Internal Carbon Recycling in Trees - New Approach, Findings, and Implications

    NASA Astrophysics Data System (ADS)

    Angert, A.; Hilman, B.

    2012-12-01

    The CO2 emitted by respiration in a tree's woody tissue (stem, branch, or root) is usually assumed to diffuse directly out to the atmosphere. Given that the internal concentrations of CO2 are one to two orders of magnitude higher than the atmospheric concentration, reuse of this respired carbon can be beneficial to plants. We have developed a new method to track the fraction of respired CO2 not emitted from stems and branches, based on the ratio of the CO2 efflux to the O2 influx. This ratio, which we define as the apparent respiratory quotient (ARQ), is expected to equal 1.0 if carbohydrates are the substrate for respiration and all respired CO2 is directly emitted. Using this approach, we have recently shown that ~30% of the CO2 respired by Amazon forest tree stems was not directly emitted. In the current study, we applied this approach to 5 tree species living in a Mediterranean climate and performed seasonal and diurnal ARQ measurements at different heights along the stem and branches. We found different seasonal variations in the ARQ of riparian versus drought-resilient trees. In addition, the ARQ diurnal cycle, together with the measurements at different heights, indicates that a considerable fraction of the CO2 not emitted is recycled within the tree.
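
    The ARQ described in this record can be written out as a short calculation. The sketch below assumes, as the abstract does, that carbohydrates are the respiratory substrate (expected quotient 1.0), and it further assumes that any shortfall of ARQ below that value is due entirely to CO2 that was respired but not emitted. The function names and flux values are hypothetical.

```python
def apparent_respiratory_quotient(co2_efflux, o2_influx):
    """ARQ = CO2 efflux / O2 influx, both in the same units (e.g. umol m-2 s-1)."""
    if o2_influx <= 0:
        raise ValueError("O2 influx must be positive")
    return co2_efflux / o2_influx


def fraction_not_emitted(arq, expected_rq=1.0):
    """Share of respired CO2 not emitted locally.

    Assumption: the whole gap between ARQ and the expected quotient
    (1.0 for carbohydrates) is CO2 retained or transported away.
    """
    return 1.0 - arq / expected_rq


# Hypothetical stem-chamber fluxes, not measurements from the study.
arq = apparent_respiratory_quotient(co2_efflux=0.7, o2_influx=1.0)
retained = fraction_not_emitted(arq)
print(f"ARQ = {arq:.2f}, fraction not emitted = {retained:.0%}")
```

    Under these assumptions an ARQ of about 0.7 corresponds to roughly 30% of the respired CO2 not being emitted directly.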

  18. The Learning Tree Montessori Child Care: An Approach to Diversity

    ERIC Educational Resources Information Center

    Wick, Laurie

    2006-01-01

    In this article the author describes how she and her partners started The Learning Tree Montessori Child Care, a Montessori program with a different approach in Seattle in 1979. The author also relates that the other area Montessori schools then offered half-day programs, and as a result the children who attended were, for the most part,…

  19. A class-oriented model for hyperspectral image classification through hierarchy-tree-based selection

    NASA Astrophysics Data System (ADS)

    Tang, Zhongqi; Fu, Guangyuan; Zhao, XiaoLin; Chen, Jin; Zhang, Li

    2016-03-01

    With the development of hyperspectral sensors over the last few decades, hyperspectral images (HSIs) face new challenges in the field of data analysis. Because the data are high-dimensional, the most challenging issue is to select an effective yet minimal subset from a mass of bands. This paper proposes a class-oriented model to solve the task of classification by incorporating the spectral prior of the target, since different targets have different characteristics in spectral correlation. This model performs feature selection after partitioning the hyperspectral data into groups along the spectral dimension. In the process of spectral partition, we group the raw data into several subsets by a hierarchy tree structure. In each group, band selection is performed via recursive support vector machine (R-SVM) learning, which reduces the computational cost while preserving the accuracy of classification. To ensure the robustness of the result, we also present a weight-voting strategy for result merging, in which both spectral independence and classification effectiveness are considered. Extensive experiments show that our model achieves better performance than the existing methods in task-dependent classifications, such as target detection and identification.

  20. Classification

    ERIC Educational Resources Information Center

    Clary, Renee; Wandersee, James

    2013-01-01

    In this article, Renee Clary and James Wandersee describe the beginnings of "Classification," which lies at the very heart of science and depends upon pattern recognition. Clary and Wandersee approach patterns by first telling the story of the "Linnaean classification system," introduced by Carl Linnaeus (1707-1778), who is…

  1. Classification of savanna tree species, in the Greater Kruger National Park region, by integrating hyperspectral and LiDAR data in a Random Forest data mining environment

    NASA Astrophysics Data System (ADS)

    Naidoo, L.; Cho, M. A.; Mathieu, R.; Asner, G.

    2012-04-01

    The accurate classification and mapping of individual trees at species level in the savanna ecosystem can provide numerous benefits for the managerial authorities. Such benefits include the mapping of economically useful tree species, which are a key source of food production and fuel wood for the local communities, and of problematic alien invasive and bush encroaching species, which can threaten the integrity of the environment and livelihoods of the local communities. Species level mapping is particularly challenging in African savannas, which are complex, heterogeneous, and open environments with high intra-species spectral variability due to differences in geology, topography, rainfall, herbivory and human impacts within relatively short distances. Savanna vegetation is also highly irregular in canopy and crown shape, height and other structural dimensions, with a combination of open grassland patches and dense woody thicket - a stark contrast to the more homogeneous forest vegetation. This study classified eight common savanna tree species in the Greater Kruger National Park region, South Africa, using a combination of hyperspectral and Light Detection and Ranging (LiDAR)-derived structural parameters, in the form of seven predictor datasets, in an automated Random Forest modelling approach. The most important predictors, which were found to play an important role in the different classification models and contributed to the success of the hybrid dataset model when combined, were species tree height; NDVI; the chlorophyll b wavelength (466 nm) and a selection of raw, continuum removed and Spectral Angle Mapper (SAM) bands. It was also concluded that the hybrid predictor dataset Random Forest model yielded the highest classification accuracy and prediction success for the eight savanna tree species, with an overall classification accuracy of 87.68% and KHAT value of 0.843.
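
    Overall accuracy and the KHAT (kappa) statistic reported in records like this one are both derived from a confusion matrix of reference versus predicted classes. The sketch below uses the standard Cohen's kappa formula on a small two-class matrix invented for illustration; it does not reproduce the study's eight-class results, and the function name is hypothetical.

```python
def overall_accuracy_and_kappa(matrix):
    """Compute overall accuracy and Cohen's kappa from a square confusion matrix.

    matrix[i][j] = number of samples of reference class i predicted as class j.
    """
    n = sum(sum(row) for row in matrix)
    if n == 0:
        raise ValueError("confusion matrix is empty")
    observed = sum(matrix[i][i] for i in range(len(matrix))) / n
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical two-class confusion matrix (rows: reference, columns: predicted).
confusion = [[45, 5],
             [5, 45]]
accuracy, kappa = overall_accuracy_and_kappa(confusion)
print(f"overall accuracy = {accuracy:.2%}, kappa = {kappa:.2f}")
```

    Kappa is lower than overall accuracy here because it discounts the agreement that would be expected by chance from the class totals alone.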

  2. Random subwindows and extremely randomized trees for image classification in cell biology

    PubMed Central

    Marée, Raphaël; Geurts, Pierre; Wehenkel, Louis

    2007-01-01

    Background With the improvements in biosensors and high-throughput image acquisition technologies, life science laboratories are able to perform an increasing number of experiments that involve the generation of a large number of images at different imaging modalities/scales. This stresses the need for computer vision methods that automate image classification tasks. Results We illustrate the potential of our image classification method in cell biology by evaluating it on four datasets of images related to protein distributions or subcellular localizations, and red-blood cell shapes. Accuracy results are quite good without any specific pre-processing or incorporation of domain knowledge. The method is implemented in Java and available upon request for evaluation and research purposes. Conclusion Our method is directly applicable to any image classification problem. We foresee the use of this automatic approach as a baseline method and first try on various biological image classification problems. PMID:17634092

  3. Exploiting machine learning algorithms for tree species classification in a semiarid woodland using RapidEye image

    NASA Astrophysics Data System (ADS)

    Adelabu, Samuel; Mutanga, Onisimo; Adam, Elhadi; Cho, Moses Azong

    2013-01-01

    Classification of different tree species in semiarid areas can be challenging as a result of the change in leaf structure and orientation due to soil moisture constraints. Tree species mapping is, however, a key parameter for forest management in semiarid environments. In this study, we examined the suitability of 5-band RapidEye satellite data for the classification of five tree species in mopane woodland of Botswana using machine learning algorithms with limited training samples. We performed classification using random forest (RF) and support vector machines (SVM) based on EnMap box. The overall accuracies for classifying the five tree species were 88.75% and 85% for SVM and RF, respectively. We also demonstrated that the new red-edge band in the RapidEye sensor has the potential for classifying tree species in semiarid environments when integrated with other standard bands. Similarly, we observed that where there are limited training samples, SVM is preferred over RF. Finally, we demonstrated that the two accuracy measures of quantity and allocation disagreement are simpler and more helpful for the vast majority of remote sensing classification processes than the kappa coefficient. Overall, high species classification accuracy can be achieved using strategically located RapidEye bands integrated with advanced processing algorithms.

  4. The Tree of Life and a New Classification of Bony Fishes

    PubMed Central

    Betancur-R., Ricardo; Broughton, Richard E.; Wiley, Edward O.; Carpenter, Kent; López, J. Andrés; Li, Chenhong; Holcroft, Nancy I.; Arcila, Dahiana; Sanciangco, Millicent; Cureton II, James C; Zhang, Feifei; Buser, Thaddaeus; Campbell, Matthew A.; Ballesteros, Jesus A; Roa-Varon, Adela; Willis, Stuart; Borden, W. Calvin; Rowley, Thaine; Reneau, Paulette C.; Hough, Daniel J.; Lu, Guoqing; Grande, Terry; Arratia, Gloria; Ortí, Guillermo

    2013-01-01

    The tree of life of fishes is in a state of flux because we still lack a comprehensive phylogeny that includes all major groups. The situation is most critical for a large clade of spiny-finned fishes, traditionally referred to as percomorphs, whose uncertain relationships have plagued ichthyologists for over a century. Most of what we know about the higher-level relationships among fish lineages has been based on morphology, but rapid influx of molecular studies is changing many established systematic concepts. We report a comprehensive molecular phylogeny for bony fishes that includes representatives of all major lineages. DNA sequence data for 21 molecular markers (one mitochondrial and 20 nuclear genes) were collected for 1410 bony fish taxa, plus four tetrapod species and two chondrichthyan outgroups (total 1416 terminals). Bony fish diversity is represented by 1093 genera, 369 families, and all traditionally recognized orders. The maximum likelihood tree provides unprecedented resolution and high bootstrap support for most backbone nodes, defining for the first time a global phylogeny of fishes. The general structure of the tree is in agreement with expectations from previous morphological and molecular studies, but significant new clades arise. Most interestingly, the high degree of uncertainty among percomorphs is now resolved into nine well-supported supraordinal groups. The order Perciformes, considered by many a polyphyletic taxonomic waste basket, is defined for the first time as a monophyletic group in the global phylogeny. A new classification that reflects our phylogenetic hypothesis is proposed to facilitate communication about the newly found structure of the tree of life of fishes. Finally, the molecular phylogeny is calibrated using 60 fossil constraints to produce a comprehensive time tree. The new time-calibrated phylogeny will provide the basis for and stimulate new comparative studies to better understand the evolution of the amazing

  5. A Novel Modulation Classification Approach Using Gabor Filter Network

    PubMed Central

    Ghauri, Sajjad Ahmed; Qureshi, Ijaz Mansoor; Cheema, Tanveer Ahmed; Malik, Aqdas Naveed

    2014-01-01

    A Gabor filter network based approach is used for feature extraction and classification of digitally modulated signals by adaptively tuning the parameters of the Gabor filter network. Modulation classification of digitally modulated signals is done under the influence of additive white Gaussian noise (AWGN). The modulations considered for classification are PSK 2 to 64, FSK 2 to 64, and QAM 4 to 64. The Gabor filter network uses a two-layer structure: the first layer, which is the input layer, constitutes the adaptive feature extraction part, and the second layer constitutes the signal classification part. The Gabor atom parameters are tuned using the delta rule, and the weights of the Gabor filter are updated using the least mean square (LMS) algorithm. The simulation results show that the proposed modulation classification algorithm has high classification accuracy at low signal to noise ratio (SNR) on an AWGN channel. PMID:25126603
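
    The least mean square (LMS) weight update named in this record can be illustrated with the textbook rule w <- w + mu * e * x, where e is the difference between the desired and actual output. The sketch below is a generic linear LMS filter on made-up data, not the authors' Gabor filter network; the function name, step size and samples are all assumptions.

```python
def lms_step(weights, inputs, target, step_size):
    """One textbook LMS update for a linear combiner.

    Returns the updated weights and the error measured before the update.
    """
    output = sum(w * x for w, x in zip(weights, inputs))
    error = target - output
    new_weights = [w + step_size * error * x for w, x in zip(weights, inputs)]
    return new_weights, error


# Hypothetical training pairs generated by the linear map y = 2*x0 - 1*x1.
samples = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0)]

weights = [0.0, 0.0]
for _epoch in range(200):
    for inputs, target in samples:
        weights, _error = lms_step(weights, inputs, target, step_size=0.1)

print([round(w, 4) for w in weights])
```

    With a small enough step size the weights move steadily toward the values that generated the targets; too large a step size makes the update diverge.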

  6. Application of classification-tree methods to identify nitrate sources in ground water

    USGS Publications Warehouse

    Spruill, T.B.; Showers, W.J.; Howe, S.S.

    2002-01-01

    A study was conducted to determine if nitrate sources in ground water (fertilizer on crops, fertilizer on golf courses, irrigation spray from hog (Sus scrofa) wastes, and leachate from poultry litter and septic systems) could be classified with 80% or greater success. Two statistical classification-tree models were devised from 48 water samples containing nitrate from five source categories. Model 1 was constructed by evaluating 32 variables and selecting four primary predictor variables (δ15N, nitrate to ammonia ratio, sodium to potassium ratio, and zinc) to identify nitrate sources. A δ15N value of nitrate plus potassium 18.2 indicated inorganic or soil organic N. A nitrate to ammonia ratio 575 indicated nitrate from golf courses. A sodium to potassium ratio 3.2 indicated spray or poultry wastes. A value for zinc 2.8 indicated poultry wastes. Model 2 was devised by using all variables except δ15N. This model also included four variables (sodium plus potassium, nitrate to ammonia ratio, calcium to magnesium ratio, and sodium to potassium ratio) to distinguish categories. Both models were able to distinguish all five source categories with better than 80% overall success and with 71 to 100% success in individual categories using the learning samples. Seventeen water samples that were not used in model development were tested using Model 2 for three categories, and all were correctly classified. Classification-tree models show great potential in identifying sources of contamination and variables important in the source-identification process.

  7. Non-Destructive Classification Approaches for Equilibrated Ordinary Chondrites

    NASA Astrophysics Data System (ADS)

    Righter, K.; Harrington, R.; Schroeder, C.; Morris, R. V.

    2013-09-01

    In order to compare a few non-destructive classification techniques with the standard approaches, we have characterized a group of chondrites from the Larkman Nunatak region using magnetic susceptibility and Mössbauer spectroscopy.

  8. Generation of 2D Land Cover Maps for Urban Areas Using Decision Tree Classification

    NASA Astrophysics Data System (ADS)

    Höhle, J.

    2014-09-01

    A 2D land cover map can automatically and efficiently be generated from high-resolution multispectral aerial images. First, a digital surface model is produced, and each cell of the elevation model is then supplemented with attributes. A decision tree classification is applied to extract map objects like buildings, roads, grassland, trees, hedges, and walls from such an "intelligent" point cloud. The decision tree is derived from training areas whose borders are digitized on top of a false-colour orthoimage. The produced 2D land cover map with six classes is subsequently refined by using image analysis techniques. The proposed methodology is described step by step. The classification, assessment, and refinement are carried out with the open source software "R"; the dense and accurate digital surface model is generated by the "Match-T DSM" program of the Trimble Company. A practical example of a 2D land cover map generation is carried out. Images of a multispectral medium-format aerial camera covering an urban area in Switzerland are used. The assessment of the produced land cover map is based on class-wise stratified sampling, where reference values of samples are determined by means of stereo-observations of false-colour stereopairs. The stratified statistical assessment of the produced land cover map with six classes, based on 91 points per class, reveals a high thematic accuracy for classes "building" (99 %, 95 % CI: 95 %-100 %) and "road and parking lot" (90 %, 95 % CI: 83 %-95 %). Some other accuracy measures (overall accuracy, kappa value) and their 95 % confidence intervals are derived as well. The proposed methodology has a high potential for automation and fast processing and may be applied to other scenes and sensors.

  9. Eating Disorder Diagnoses: Empirical Approaches to Classification

    ERIC Educational Resources Information Center

    Wonderlich, Stephen A.; Joiner, Thomas E., Jr.; Keel, Pamela K.; Williamson, Donald A.; Crosby, Ross D.

    2007-01-01

    Decisions about the classification of eating disorders have significant scientific and clinical implications. The eating disorder diagnoses in the Diagnostic and Statistical Manual of Mental Disorders (4th ed.; DSM-IV; American Psychiatric Association, 1994) reflect the collective wisdom of experts in the field but are frequently not supported in…

  10. Identifying Transferable Skills: A Task Classification Approach.

    ERIC Educational Resources Information Center

    Ashley, William L.; Ammerman, Harry L.

    The feasibility of classifying occupational tasks as a basis for understanding better the occupational transferability of job skills was examined. To show general skill relationships among occupations, 5 classification schemes were applied to 50 selected task statements for each of 12 occupations. Ratings by five reasonably knowledgeable people…

  11. Advanced fractal approach for unsupervised classification of SAR images

    NASA Astrophysics Data System (ADS)

    Pant, Triloki; Singh, Dharmendra; Srivastava, Tanuja

    2010-06-01

    Unsupervised classification of Synthetic Aperture Radar (SAR) images is the alternative approach when little or no a priori information about the image is available. Therefore, in the present paper an attempt has been made to develop an unsupervised classification scheme for SAR images based on textural information. For extraction of textural features, two properties are used, viz. fractal dimension D and Moran's I. Using these indices, an algorithm is proposed for contextual classification of SAR images. The novelty of the algorithm is that it exploits the textural information available in the SAR image with the help of these two texture measures. For estimation of D, the Two Dimensional Variation Method (2DVM) has been revised and implemented, and its performance is compared with that of another method, the Triangular Prism Surface Area Method (TPSAM). The classification accuracy was also checked for various window sizes in order to determine the effect of window size on classification accuracy and to optimize the window size for best classification. The algorithm is applied to four SAR images of the Hardwar region, India, and classification accuracy has been computed. A comparison of the proposed algorithm, using both fractal dimension estimation methods, with the K-Means algorithm is discussed. The maximum overall classification accuracy with K-Means is 53.26%, whereas the overall classification accuracy with the proposed algorithm is 66.16% for TPSAM and 61.26% for 2DVM.
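
    Moran's I, one of the two texture measures named in this record, is a spatial autocorrelation index that can be computed over an image window. The sketch below uses the standard formula with binary rook (4-neighbour) weights; that neighbourhood choice, the function name and the 2 x 2 test window are assumptions, since the record does not state how the authors defined adjacency.

```python
def morans_i(grid):
    """Moran's I of a 2D window using binary rook (4-neighbour) weights.

    I = (N / W) * sum_ij w_ij * (x_i - mean) * (x_j - mean) / sum_i (x_i - mean)^2
    where W is the total weight, i.e. the number of ordered neighbour pairs.
    """
    rows, cols = len(grid), len(grid[0])
    values = [v for row in grid for v in row]
    n = len(values)
    mean = sum(values) / n
    dev = [[v - mean for v in row] for row in grid]
    denom = sum(d * d for row in dev for d in row)
    if denom == 0:
        raise ValueError("Moran's I is undefined for a constant window")
    cross = 0.0
    weight_sum = 0
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    cross += dev[r][c] * dev[rr][cc]
                    weight_sum += 1
    return (n / weight_sum) * (cross / denom)


# A 2 x 2 checkerboard: every pixel differs from all of its rook neighbours.
checkerboard = [[1, 0],
                [0, 1]]
print(morans_i(checkerboard))
```

    Values near +1 indicate smooth, clustered texture, values near -1 indicate alternating texture such as the checkerboard, and values near 0 indicate no spatial structure.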

  12. Prediction of radiation levels in residences: A methodological comparison of CART (Classification and Regression Tree Analysis) and conventional regression

    SciTech Connect

    Janssen, I.; Stebbings, J.H.

    1990-01-01

    In environmental epidemiology, trace and toxic substance concentrations frequently have very highly skewed distributions ranging over one or more orders of magnitude, and prediction by conventional regression is often poor. Classification and Regression Tree Analysis (CART) is an alternative in such contexts. To compare the techniques, two Pennsylvania data sets and three independent variables are used: house radon progeny (RnD) and gamma levels as predicted by construction characteristics in 1330 houses; and approximately 200 house radon (Rn) measurements as predicted by topographic parameters. CART may identify structural variables of interest not identified by conventional regression, and vice versa, but in general the regression models are similar. CART has major advantages in dealing with other common characteristics of environmental data sets, such as missing values, continuous variables requiring transformations, and large sets of potential independent variables. CART is most useful in the identification and screening of independent variables, greatly reducing the need for cross-tabulations and nested breakdown analyses. There is no need to discard cases with missing values for the independent variables because surrogate variables are intrinsic to CART. The tree-structured approach is also independent of the scale on which the independent variables are measured, so that transformations are unnecessary. CART identifies important interactions as well as main effects. The major advantages of CART appear to be in exploring data. Once the important variables are identified, conventional regressions seem to lead to results similar but more interpretable by most audiences. 12 refs., 8 figs., 10 tabs.

  13. The creation of a digital soil map for Cyprus using decision-tree classification techniques

    NASA Astrophysics Data System (ADS)

    Camera, Corrado; Zomeni, Zomenia; Bruggeman, Adriana; Noller, Joy; Zissimos, Andreas

    2014-05-01

    Considering the increasing threats soil are experiencing especially in semi-arid, Mediterranean environments like Cyprus (erosion, contamination, sealing and salinisation), producing a high resolution, reliable soil map is essential for further soil conservation studies. This study aims to create a 1:50.000 soil map covering the area under the direct control of the Republic of Cyprus (5.760 km2). The study consists of two major steps. The first is the creation of a raster database of predictive variables selected according to the scorpan formula (McBratney et al., 2003). It is of particular interest the possibility of using, as soil properties, data coming from three older island-wide soil maps and the recently published geochemical atlas of Cyprus (Cohen et al., 2011). Ten highly characterizing elements were selected and used as predictors in the present study. For the other factors usual variables were used: temperature and aridity index for climate; total loss on ignition, vegetation and forestry types maps for organic matter; the DEM and related relief derivatives (slope, aspect, curvature, landscape units); bedrock, surficial geology and geomorphology (Noller, 2009) for parent material and age; and a sub-watershed map to better bound location related to parent material sources. In the second step, the digital soil map is created using the Random Forests package in R. Random Forests is a decision tree classification technique where many trees, instead of a single one, are developed and compared to increase the stability and the reliability of the prediction. The model is trained and verified on areas where a 1:25.000 published soil maps obtained from field work is available and then it is applied for predictive mapping to the other areas. Preliminary results obtained in a small area in the plain around the city of Lefkosia, where eight different soil classes are present, show very good capacities of the method. 
The Random Forest approach leads to reproduce soil
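
    The abstract above describes Random Forests as many trees whose outputs are compared. A minimal sketch of that aggregation step, in Python rather than the R package the authors used, might look like the following. The three hand-written rules, the class names `class_A` and `class_B`, and the predictor values are all hypothetical; a real forest would learn its trees from bootstrap samples of the training data.

```python
from collections import Counter


def forest_predict(trees, sample):
    """Return (majority class, vote share) over all trees for one sample."""
    votes = Counter(tree(sample) for tree in trees)
    label, count = votes.most_common(1)[0]
    return label, count / len(trees)


# Hypothetical stand-ins for learned trees: each maps a dict of predictor
# values to a class label. Thresholds and labels are invented for illustration.
trees = [
    lambda s: "class_A" if s["slope"] < 5 else "class_B",
    lambda s: "class_A" if s["aridity"] > 0.3 else "class_B",
    lambda s: "class_B" if s["elevation"] > 400 else "class_A",
]

sample = {"slope": 3, "aridity": 0.5, "elevation": 600}
label, share = forest_predict(trees, sample)
print(label, round(share, 2))
```

    The vote share gives a rough indication of how strongly the trees agree on a cell, which is one way to read the "stability and reliability" mentioned in the abstract.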

  14. Morphological and molecular characteristics do not confirm popular classification of the Brazil nut tree in Acre, Brazil.

    PubMed

    Sujii, P S; Fernandes, E T M B; Azevedo, V C R; Ciampi, A Y; Martins, K; de O Wadt, L H

    2013-01-01

    In the State of Acre, the Brazil nut tree, Bertholletia excelsa (Lecythidaceae), is classified by the local population into two types according to morphological characteristics, including color and quality of wood, shape of the trunk and crown, and fruit production. We examined the reliability of this classification by comparing morphological and molecular data of four populations of Brazil nut trees from Vale do Rio Acre in the Brazilian Amazon. For the morphological analysis, we evaluated qualitative and quantitative information of the trees, fruits, and seeds. The molecular analysis was performed using RAPD and ISSR markers, with cluster analysis. Significant differences were found between the two types of Brazil nut trees for the characters diameter at breast height, fruit yield, fruit size, and number of seeds per fruit. Despite the significant correlation between the morphological characteristics and the popular classification, we observed all possible combinations of morphological characteristics in both types of Brazil nut trees. In some individuals, the classification did not correspond to any of the characteristics. The results obtained with molecular markers showed that the two locally classified types of Brazil nut trees did not differ genetically, indicating that there is no consistent separation between them. PMID:24089091

  15. Tree-Level Hydrodynamic Approach for Improved Stomatal Conductance Parameterization

    NASA Astrophysics Data System (ADS)

    Mirfenderesgi, G.; Bohrer, G.; Matheny, A. M.; Ivanov, V. Y.

    2014-12-01

    Land-surface models do not mechanistically resolve hydrodynamic processes within the tree. The Finite-Elements Tree-Crown Hydrodynamics model version 2 (FETCH2) is based on the previous FETCH model approach, but with finite difference numerics and a simplified single-beam conduit system. FETCH2 simulates water flow through the tree as a simplified system of porous media conduits. It explicitly resolves spatiotemporal hydraulic stresses throughout the tree's vertical extent that cannot be easily represented using other stomatal-conductance models. Empirical equations relate water potential at the stem to stomatal conductance at leaves connected to the stem (through unresolved branches) at that height. While highly simplified, this approach brings some realism to the simulation of stomatal conductance because the stomata can respond to stem water potential, rather than an assumed direct relationship with soil moisture, as is currently the case in almost all models. By enabling mechanistic simulation of hydrological traits, such as xylem conductivity, conductive area per DBH, vertical distribution of leaf area and maximal and minimal water content in the xylem, and their effect on the dynamics of water flow in the tree system, the FETCH2 modeling system enhanced our understanding of the role of hydraulic limitations on an experimental forest plot short-term water stresses that lead to tradeoffs between water and light availability for transpiring leaves in forest ecosystems. FETCH2 is particularly suitable to resolve the effects of structural differences between tree and species and size groups, and the consequences of differences in hydraulic strategies of different species. We leverage a large dataset of sap flow from 60 trees of 4 species at our experimental plot at the University of Michigan Biological Station. 
Comparison of the sap flow and transpiration patterns in this site and an undisturbed control site shows significant difference in hydraulic strategies

  16. Tree Crown Delineation on VHR Aerial Imagery with SVM Classification Technique Optimized by Taguchi Method: a Case Study in Zagros Woodlands

    NASA Astrophysics Data System (ADS)

    Erfanifard, Y.; Behnia, N.; Moosavi, V.

    2013-09-01

    The Support Vector Machine (SVM) is a theoretically superior machine learning methodology with great results in classification of remotely sensed datasets. Determination of optimal parameters applied in SVM is still vague to some scientists. In this research, it is suggested to use the Taguchi method to optimize these parameters. The objective of this study was to detect tree crowns on very high resolution (VHR) aerial imagery in Zagros woodlands by SVM optimized by Taguchi method. A 30 ha plot of Persian oak (Quercus persica) coppice trees was selected in Zagros woodlands, Iran. The VHR aerial imagery of the plot with 0.06 m spatial resolution was obtained from National Geographic Organization (NGO), Iran, to extract the crowns of Persian oak trees in this study. The SVM parameters were optimized by Taguchi method and thereafter, the imagery was classified by the SVM with optimal parameters. The results showed that the Taguchi method is a very useful approach to optimize the combination of parameters of SVM. It was also concluded that the SVM method could detect the tree crowns with a KHAT coefficient of 0.961 which showed a great agreement with the observed samples and overall accuracy of 97.7% that showed the accuracy of the final map. Finally, the authors suggest applying this method to optimize the parameters of classification techniques like SVM.
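
    The two map-quality figures quoted above, overall accuracy and the KHAT (kappa) coefficient, are both computed from a confusion matrix. A minimal sketch in Python follows; the 2x2 matrix of counts is invented for illustration and is not the study's data.

```python
def accuracy_and_kappa(matrix):
    """Overall accuracy and Cohen's kappa from a square confusion matrix.

    matrix[i][j] is the number of samples of reference class i
    that were assigned to class j.
    """
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance, from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical counts: rows = reference (crown, background), columns = predicted.
confusion = [[45, 5],
             [5, 45]]
acc, kappa = accuracy_and_kappa(confusion)
print(acc, kappa)
```

    Kappa discounts the agreement that would be expected by chance, which is why it is lower than the overall accuracy for the same matrix.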

  17. A simulation approach for change-points on phylogenetic trees.

    PubMed

    Persing, Adam; Jasra, Ajay; Beskos, Alexandros; Balding, David; De Iorio, Maria

    2015-01-01

    We observe n sequences at each of m sites and assume that they have evolved from an ancestral sequence that forms the root of a binary tree of known topology and branch lengths, but the sequence states at internal nodes are unknown. The topology of the tree and branch lengths are the same for all sites, but the parameters of the evolutionary model can vary over sites. We assume a piecewise constant model for these parameters, with an unknown number of change-points and hence a transdimensional parameter space over which we seek to perform Bayesian inference. We propose two novel ideas to deal with the computational challenges of such inference. Firstly, we approximate the model based on the time machine principle: the top nodes of the binary tree (near the root) are replaced by an approximation of the true distribution; as more nodes are removed from the top of the tree, the cost of computing the likelihood is reduced linearly in n. The approach introduces a bias, which we investigate empirically. Secondly, we develop a particle marginal Metropolis-Hastings (PMMH) algorithm, that employs a sequential Monte Carlo (SMC) sampler and can use the first idea. Our time-machine PMMH algorithm copes well with one of the bottle-necks of standard computational algorithms: the transdimensional nature of the posterior distribution. The algorithm is implemented on simulated and real data examples, and we empirically demonstrate its potential to outperform competing methods based on approximate Bayesian computation (ABC) techniques. PMID:25506749

  18. Identifying Population Groups with Low Palliative Care Program Enrolment Using Classification and Regression Tree Analysis

    PubMed Central

    Gao, Jun; Lavergne, M. Ruth; McIntyre, Paul

    2013-01-01

    Classification and regression tree (CART) analysis was used to identify subpopulations with lower palliative care program (PCP) enrolment rates. CART analysis uses recursive partitioning to group predictors. The PCP enrolment rate was 72 percent for the 6,892 adults who died of cancer between 2000 and 2005 in two counties in Nova Scotia, Canada. The lowest PCP enrolment rates were for nursing home residents over 82 years (27 percent), a group residing more than 43 kilometres from the PCP (31 percent), and another group living less than two weeks after their cancer diagnosis (37 percent). The highest rate (86 percent) was for the 2,118 persons who received palliative radiation. Findings from multiple logistic regression (MLR) were provided for comparison. CART findings identified low PCP enrolment subpopulations that were defined by interactions among demographic, social, medical, and health system predictors. PMID:21805944
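
    Recursive partitioning, as used by CART, repeatedly searches for the cut-point on a predictor that best separates the outcome. A minimal sketch of one such search for a binary outcome, using Gini impurity as the criterion, is given below in Python. The ages and enrolment flags are toy values, not the study's data, and the function names are illustrative.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2 * p * (1 - p)


def best_split(values, labels):
    """Return (threshold, weighted impurity) of the best 'value <= threshold' cut."""
    best = (None, float("inf"))
    # The largest value is skipped because it would leave the right side empty.
    for threshold in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (threshold, score)
    return best


# Toy data: age in years and whether the person enrolled (1) or not (0).
ages = [60, 70, 80, 85, 90, 95]
enrolled = [1, 1, 1, 0, 0, 0]
print(best_split(ages, enrolled))
```

    A full tree applies the same search to each resulting subgroup, over all predictors, until a stopping rule is met.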

  19. Genetic Algorithms and Classification Trees in Feature Discovery: Diabetes and the NHANES database

    SciTech Connect

    Heredia-Langner, Alejandro; Jarman, Kristin H.; Amidan, Brett G.; Pounds, Joel G.

    2013-09-01

    This paper presents a feature selection methodology that can be applied to datasets containing a mixture of continuous and categorical variables. Using a Genetic Algorithm (GA), this method explores a dataset and selects a small set of features relevant for the prediction of a binary (1/0) response. Binary classification trees and an objective function based on conditional probabilities are used to measure the fitness of a given subset of features. The method is applied to health data in order to find factors useful for the prediction of diabetes. Results show that our algorithm is capable of narrowing down the set of predictors to around 8 factors that can be validated using reputable medical and public health resources.

  20. Classification tree models for predicting distributions of Michigan stream fish from landscape variables

    USGS Publications Warehouse

    Steen, P.J.; Zorn, T.G.; Seelbach, P.W.; Schaeffer, J.S.

    2008-01-01

    Traditionally, fish habitat requirements have been described from local-scale environmental variables. However, recent studies have shown that studying landscape-scale processes improves our understanding of what drives species assemblages and distribution patterns across the landscape. Our goal was to learn more about constraints on the distribution of Michigan stream fish by examining landscape-scale habitat variables. We used classification trees and landscape-scale habitat variables to create and validate presence-absence models and relative abundance models for Michigan stream fishes. We developed 93 presence-absence models that on average were 72% correct in making predictions for an independent data set, and we developed 46 relative abundance models that were 76% correct in making predictions for independent data. The models were used to create statewide predictive distribution and abundance maps that have the potential to be used for a variety of conservation and scientific purposes. © Copyright by the American Fisheries Society 2008.

  1. Classification of oxide glasses: A polarizability approach

    SciTech Connect

    Dimitrov, Vesselin; Komatsu, Takayuki (e-mail: komatsu@chem.nagaokaut.ac.jp)

    2005-03-15

    A classification of binary oxide glasses has been proposed taking into account the values obtained on their refractive index-based oxide ion polarizability $\alpha_{O^{2-}}(n_0)$, optical basicity $\lambda(n_0)$, metallization criterion $M(n_0)$, interaction parameter $A(n_0)$, and ion's effective charges as well as O1s and metal binding energies determined by XPS. Four groups of oxide glasses have been established: glasses formed by two glass-forming acidic oxides; glasses formed by glass-forming acidic oxide and modifier's basic oxide; glasses formed by glass-forming acidic and conditional glass-forming basic oxide; glasses formed by two basic oxides. The role of electronic ion polarizability in chemical bonding of oxide glasses has been also estimated. Good agreement has been found with the previous results concerning classification of simple oxides. The results obtained probably provide good basis for prediction of type of bonding in oxide glasses on the basis of refractive index as well as for prediction of new nonlinear optical materials.

  2. Identification of Sexually Abused Female Adolescents at Risk for Suicidal Ideations: A Classification and Regression Tree Analysis

    ERIC Educational Resources Information Center

    Brabant, Marie-Eve; Hebert, Martine; Chagnon, Francois

    2013-01-01

    This study explored the clinical profiles of 77 female teenager survivors of sexual abuse and examined the association of abuse-related and personal variables with suicidal ideations. Analyses revealed that 64% of participants experienced suicidal ideations. Findings from classification and regression tree analysis indicated that depression,…

  3. Discriminating Geriatric and Nongeriatric Patients Using Functional Status Information: An Example of Classification Tree Analysis via UniODA.

    ERIC Educational Resources Information Center

    Yarnold, Paul R.

    1996-01-01

    A procedure is described that involves iterative use of univariable optimal discriminant analysis (UniODA) to construct a classification tree model for discriminating observations from different groups. The procedure is illustrated using an application that involved discriminating 125 geriatric and nongeriatric patients on the basis of their…

  4. Tree species classification in the Southern Sierra Nevada Mountains based on MASTER and LIDAR imagery

    NASA Astrophysics Data System (ADS)

    Gibbons, S.; Grigsby, S.; Ustin, S.

    2013-12-01

    NASA recently collected MASTER (MODIS/ASTER) imagery over the Southern Sierra Nevada Mountains as part of the HyspIRI (Hyperspectral Infrared Imager) preparatory campaign, a location that was chosen for its distinct changes in vegetative species with elevation. Differentiation between functional types based on spectral data has been successful, however, classification between individual species is more difficult to accomplish with only the visible and near infrared portions of the spectrum. I used MASTER imagery in combination with Critical Zone Observatory LIDAR data to map species across both a low and high elevation site in the San Joaquin Experimental Range. While the visible and thermal bands of MASTER images provided an improved classification over shortwave bands, the physical characteristics from the LIDAR data showed the most contrast between the land covers, including tree species. The National Ecological Observation Network (NEON) plans to use LIDAR and spectral data to monitor 20 domains, including the San Joaquin Experimental Range, for the next thirty years. Understanding the current species distributions not only provides insight on the available resources of the area but will also act as a baseline to determine the effects of environmental changes on vegetation using future NEON data.

  5. Effect of training characteristics on object classification: An application using Boosted Decision Trees

    NASA Astrophysics Data System (ADS)

    Sevilla-Noarbe, I.; Etayo-Sotos, P.

    2015-06-01

    We present an application of a particular machine-learning method (Boosted Decision Trees, BDTs, using AdaBoost) to separate stars and galaxies in photometric images using their catalog characteristics. BDTs are a well-established machine learning technique used for classification purposes. They have been widely used, especially in the field of particle and astroparticle physics, and we use them here in an optical astronomy application. This algorithm is able to improve on simple thresholding cuts on standard separation variables that may be affected by local effects such as blending or badly calculated background levels, or that do not include information in other bands. The improvements are shown using the Sloan Digital Sky Survey Data Release 9, with respect to the type photometric classifier. We obtain an improvement in the impurity of the galaxy sample of a factor 2-4 for this particular dataset, adjusting for the same efficiency of the selection. Another main goal of this study is to verify the effects that different input vectors and training sets have on the classification performance, the results being of wider use to other machine learning techniques.
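
    To illustrate how boosting can improve on a single threshold cut, here is a minimal AdaBoost sketch in Python that uses one-variable decision stumps as weak learners. It is not the authors' implementation: the single feature, the six toy values and the +1/-1 labels are invented, and real BDTs use deeper trees over many catalog variables.

```python
import math


def stump_predict(x, threshold, polarity):
    """A one-cut classifier: returns polarity if x > threshold, else -polarity."""
    return polarity if x > threshold else -polarity


def fit_stump(xs, ys, weights):
    """Pick the (weighted error, threshold, polarity) with the lowest weighted error."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if stump_predict(x, threshold, polarity) != y)
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best


def adaboost(xs, ys, rounds):
    """Train `rounds` stumps, re-weighting misclassified samples each round."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, threshold, polarity = fit_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, threshold, polarity))
        weights = [w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(a * stump_predict(x, t, p) for a, t, p in ensemble)
    return 1 if score > 0 else -1


# Toy data: no single cut separates the classes, but three boosted cuts do.
xs = [1, 2, 3, 4, 5, 6]
ys = [1, 1, -1, -1, 1, 1]
ensemble = adaboost(xs, ys, rounds=3)
print([predict(ensemble, x) for x in xs])
```

    On this toy set the best single stump misclassifies two of the six points, while the weighted vote of three stumps classifies all six correctly.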

  6. Object classification in images for Epo doping control based on fuzzy decision trees

    NASA Astrophysics Data System (ADS)

    Bajla, Ivan; Hollander, Igor; Heiss, Dorothea; Granec, Reinhard; Minichmayr, Markus

    2005-02-01

    Erythropoietin (Epo) is a hormone which can be misused as a doping substance. Its detection involves analysis of images containing specific objects (bands), whose position and intensity are critical for doping positivity. Within a research project of the World Anti-Doping Agency (WADA) we are implementing the GASepo software that should serve for Epo testing in doping control laboratories world-wide. For identification of the bands we have developed a segmentation procedure based on a sequence of filters and edge detectors. Whereas all true bands are properly segmented, the procedure generates a relatively high number of false positives (artefacts). To separate these artefacts we suggested a post-segmentation supervised classification using real-valued geometrical measures of objects. The method is based on the ID3 (Ross Quinlan's) rule generation method, where fuzzy representation is used for linking the linguistic terms to quantitative data. The fuzzy modification of the ID3 method provides a framework that generates fuzzy decision trees, as well as fuzzy sets for input data. Using the MLTTM software (Machine Learning Framework) we have generated a set of fuzzy rules explicitly describing bands and artefacts. The method eliminated most of the artefacts. The contribution includes a comparison of the obtained misclassification errors to the errors produced by some other statistical classification methods.

  7. Autoimmune hemolytic anemia: classification and therapeutic approaches.

    PubMed

    Sève, Pascal; Philippe, Pierre; Dufour, Jean-François; Broussolle, Christiane; Michel, Marc

    2008-12-01

    Autoimmune hemolytic anemia (AIHA) is a relatively uncommon cause of anemia. Classifications of AIHA include warm AIHA, cold AIHA (including mainly chronic cold agglutinin disease and paroxysmal cold hemoglobinuria), mixed-type AIHA and drug-induced AIHA. AIHA may also be further subdivided on the basis of etiology. Management of AIHA is based mainly on empirical data and on small, retrospective, uncontrolled studies. The therapeutic options for treating AIHA are increasing with monoclonal antibodies and, potentially, complement inhibitory drugs. Based on data available in the literature and our experience, we propose algorithms for the treatment of warm AIHA and cold agglutinin disease in adults. Therapeutic trials are needed in order to better stratify treatment, taking into account the promising efficacy of rituximab. PMID:21082924

  8. Color Image Magnification: Geometrical Pattern Classification Approach

    NASA Astrophysics Data System (ADS)

    Yong, Tien Fui; Choo, Wou Onn; Meian Kok, Hui

    In an era where technology keeps advancing, it is vital that high-resolution images are available to produce high-quality displayed images and fine-quality prints. The problem is that it is quite impossible to produce high-resolution images with acceptable clarity even with the latest digital cameras. Therefore, there is a need to enlarge the original images using an effective and efficient algorithm. The main contribution of this paper is to produce an enlarged color image with high visual quality, up to four times the original size of a 100x100 pixel image. In the classification phase, the basic idea is to separate the interpolation region in the form of a geometrical shape. Then, in the intensity determination phase, the interpolator assigns a proper color intensity value to the undefined pixel inside the interpolation region. This paper discusses the problem statement, literature review, research methodology, research outcome, initial results, and finally, the conclusion.

  9. A Neuro-Fuzzy Approach in the Classification of Students' Academic Performance

    PubMed Central

    2013-01-01

    Classifying the student academic performance with high accuracy facilitates admission decisions and enhances educational services at educational institutions. The purpose of this paper is to present a neuro-fuzzy approach for classifying students into different groups. The neuro-fuzzy classifier used previous exam results and other related factors as input variables and labeled students based on their expected academic performance. The results showed that the proposed approach achieved a high accuracy. The results were also compared with those obtained from other well-known classification approaches, including support vector machine, Naive Bayes, neural network, and decision tree approaches. The comparative analysis indicated that the neuro-fuzzy approach performed better than the others. It is expected that this work may be used to support student admission procedures and to strengthen the services of educational institutions. PMID:24302928

  10. Identifying tree crown delineation shapes and need for remediation on high resolution imagery using an evidence based approach

    NASA Astrophysics Data System (ADS)

    Leckie, Donald G.; Walsworth, Nicholas; Gougeon, François A.

    2016-04-01

    In order to fully realize the benefits of automated individual tree mapping for tree species, health, forest inventory attribution and forest management decision making, the tree delineations should be as good as possible. The concept of identifying poorly delineated tree crowns and suggesting likely types of remediation was investigated. Delineations (isolations or isols) were classified into shape types reflecting whether they were realistic tree shapes and the likely kind of remediation needed. Shape type was classified by an evidence based rules approach using primitives based on isol size, shape indices, morphology, the presence of local maxima, and matches with template models representing trees of different sizes. A test set containing 50,000 isols based on an automated tree delineation of 40 cm multispectral airborne imagery of a diverse temperate-boreal forest site was used. Isolations representing single trees or several trees were the focus, as opposed to cases where a tree is split into several isols. For eight shape classes from regular through to convolute, shape classification accuracy was in the order of 62%; simplifying to six classes, accuracy was 83%. Shape type did give an indication of the type of remediation and there were 6% false alarms (i.e., isols classed as needing remediation that did not need it). Conversely, there were 5% omissions (i.e., isols of regular shape and not earmarked for remediation that did need remediation). The usefulness of the concept of identifying poor delineations in need of remediation was demonstrated and one suite of methods developed and shown to be effective.

  11. Annual Crop Type Classification of the U.S. Great Plains for 2000 - 2011: An Application of Classification Tree Modeling using Remote Sensing and Ancillary Environmental Data (Invited)

    NASA Astrophysics Data System (ADS)

    Howard, D. M.; Wylie, B. K.

    2013-12-01

    The purpose of this study was to increase spatial and temporal availability of crop classification data using reliable source data that have the potential of being applied on local, regional, national, and global levels. This study implemented classification tree modeling to map annual crop types throughout the U.S. Great Plains from 2000 - 2011. Classification tree modeling has been shown in numerous studies to be an effective tool for developing classification models. In this study, nearly 18 million crop observation points, derived from annual U.S. Department of Agriculture (USDA) National Agriculture Statistics Service (NASS) Cropland Data Layers (CDLs), were used in the training, development, and validation of a classification tree crop type model (CTM). Each observation point was further defined by weekly Normalized Difference Vegetation Index (NDVI) readings, annual climatic conditions, soil conditions, and a number of other biogeophysical environmental characteristics. The CTM accounted for the most prevalent crop types in the area, including corn, soybeans, winter wheat, spring wheat, cotton, sorghum, and alfalfa. Other crops that did not fit into any of these classes were identified and grouped into a miscellaneous class. An 87% success rate was achieved on the classification of 1.8 million observation points (10% of total observation points) that were withheld from training. The CTM was applied to create annual crop maps of the U.S. Great Plains for 2000 - 2011 at a spatial resolution of 250 meters. Product validation was performed by comparing county acreage derived from the modeled crop maps and county acreage data from the USDA NASS Survey Program for each crop type and each year. More than 15,000 county records from 2001 - 2010 were compared, with a Pearson's correlation coefficient of r = 0.87.
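
    The product validation described above compares modeled and surveyed county acreage using Pearson's correlation coefficient. A minimal sketch of that calculation in Python follows; the four acreage pairs are invented for illustration and are not the study's county records.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical county acreages (thousands of acres): modeled map vs. survey.
modeled = [120, 340, 560, 80]
survey = [100, 360, 540, 90]
print(round(pearson_r(modeled, survey), 3))
```

    A value near 1 indicates that counties with more modeled acreage also report more surveyed acreage; it does not by itself show that the two agree in absolute terms.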

  12. A Hybrid Sensing Approach for Pure and Adulterated Honey Classification

    PubMed Central

    Subari, Norazian; Saleh, Junita Mohamad; Shakaff, Ali Yeon Md; Zakaria, Ammar

    2012-01-01

    This paper presents a comparison between data from single modality and fusion methods to classify Tualang honey as pure or adulterated using Linear Discriminant Analysis (LDA) and Principal Component Analysis (PCA) statistical classification approaches. Ten different brands of certified pure Tualang honey were obtained throughout peninsular Malaysia and Sumatera, Indonesia. Various concentrations of two types of sugar solution (beet and cane sugar) were used in this investigation to create honey samples of 20%, 40%, 60% and 80% adulteration concentrations. Honey data extracted from an electronic nose (e-nose) and Fourier Transform Infrared Spectroscopy (FTIR) were gathered, analyzed and compared based on fusion methods. Visual observation of classification plots revealed that the PCA approach was able to distinguish pure and adulterated honey samples better than the LDA technique. Overall, the validated classification results based on FTIR data (88.0%) gave higher classification accuracy than e-nose data (76.5%) using the LDA technique. Honey classification based on normalized low-level and intermediate-level FTIR and e-nose fusion data scored classification accuracies of 92.2% and 88.7%, respectively, using the Stepwise LDA method. The results suggested that pure and adulterated honey samples were better classified using FTIR and e-nose fusion data than single modality data. PMID:23202033

  13. Rational approaches to improving the isolation of endophytic actinobacteria from Australian native trees.

    PubMed

    Kaewkla, Onuma; Franco, Christopher M M

    2013-02-01

    In recent years, new actinobacterial species have been isolated as endophytes of plants and shrubs and are sought after both for their role as potential producers of new drug candidates for the pharmaceutical industry and as biocontrol inoculants for sustainable agriculture. Molecular-based approaches to the study of microbial ecology generally reveal a broader microbial diversity than can be obtained by cultivation methods. This study aimed to improve the success of isolating individual members of the actinobacterial population as pure cultures as well as improving the ability to characterise the large numbers obtained in pure culture. To achieve this objective, our study successfully employed rational and holistic approaches including the use of isolation media with low concentrations of nutrients normally available to the microorganism in the plant, plating larger quantities of plant sample, incubating isolation plates for up to 16 weeks, excising colonies when they are visible and choosing Australian endemic trees as the source of the actinobacteria. A hierarchy of polyphasic methods based on culture morphology, amplified 16S rRNA gene restriction analysis and limited sequencing was used to classify all 576 actinobacterial isolates from leaf, stem and root samples of two eucalypts: a Grey Box and Red Gum, a native apricot tree and a native pine tree. The classification revealed that, in addition to 413 Streptomyces spp., isolates belonged to 16 other actinobacterial genera: Actinomadura (two strains), Actinomycetospora (six), Actinopolymorpha (two), Amycolatopsis (six), Gordonia (one), Kribbella (25), Micromonospora (six), Nocardia (ten), Nocardioides (11), Nocardiopsis (one), Nonomuraea (one), Polymorphospora (two), Promicromonospora (51), Pseudonocardia (36), Williamsia (two) and a novel genus Flindersiella (one). In order to prove novelty, 12 strains were characterised fully to the species level based on polyphasic taxonomy. One strain represented a novel

  14. An efficient tree classifier ensemble-based approach for pedestrian detection.

    PubMed

    Xu, Yanwu; Cao, Xianbin; Qiao, Hong

    2011-02-01

    Classification-based pedestrian detection systems (PDSs) are currently a hot research topic in the field of intelligent transportation. A PDS detects pedestrians in real time on moving vehicles. A practical PDS demands not only high detection accuracy but also high detection speed. However, most of the existing classification-based approaches mainly seek high detection accuracy, while the detection speed is not purposely optimized for practical application. At the same time, the performance, particularly the speed, is primarily tuned based on experiments without theoretical foundations, leading to a long training procedure. This paper starts with measuring and optimizing detection speed, and then a practical classification-based pedestrian detection solution with high detection speed and training speed is described. First, an extended classification/detection speed metric, named feature-per-object (fpo), is proposed to measure the detection speed independently from execution. Then, an fpo minimization model with accuracy constraints is formulated based on a tree classifier ensemble, where the minimum fpo can guarantee the highest detection speed. Finally, the minimization problem is solved efficiently by using nonlinear fitting based on radial basis function neural networks. In addition, the optimal solution is directly used to instruct classifier training; thus, the training speed could be accelerated greatly. Therefore, a rapid and accurate classification-based detection technique is proposed for the PDS. Experimental results on urban traffic videos show that the proposed method has a high detection speed with an acceptable detection rate and a false-alarm rate for onboard detection; moreover, the training procedure is also very fast. PMID:20457550

  15. Multidisciplinary approach to tumors of the pancreas and biliary tree.

    PubMed

    Brown, Kimberly M

    2009-02-01

    Tumors of the pancreas and biliary tree remain formidable challenges to patients and clinicians. These tumors elude early detection, rapidly spread locally and systemically, and frequently recur despite apparently complete resection. Cystic tumors of the pancreas, however, may represent a subset of patients who do not uniformly require aggressive resection, and a thoughtful, evidence-based approach to work-up allows for the rational application of surgical therapy. Increasing evidence supports treating patients who have pancreaticobiliary disease in a multidisciplinary setting. PMID:19186234

  16. Impact of atmospheric correction and image filtering on hyperspectral classification of tree species using support vector machine

    NASA Astrophysics Data System (ADS)

    Shahriari Nia, Morteza; Wang, Daisy Zhe; Bohlman, Stephanie Ann; Gader, Paul; Graves, Sarah J.; Petrovic, Milenko

    2015-01-01

    Hyperspectral images can be used to identify savanna tree species at the landscape scale, which is a key step in measuring biomass and carbon, and tracking changes in species distributions, including invasive species, in these ecosystems. Before automated species mapping can be performed, image processing and atmospheric correction are often performed, which can potentially affect the performance of classification algorithms. We determine how three processing and correction techniques (atmospheric correction, Gaussian filters, and shade/green vegetation filters) affect the prediction accuracy of classification of tree species at pixel level from airborne visible/infrared imaging spectrometer imagery of longleaf pine savanna in Central Florida, United States. Species classification using fast line-of-sight atmospheric analysis of spectral hypercubes (FLAASH) atmospheric correction outperformed ATCOR in the majority of cases. Green vegetation (normalized difference vegetation index) and shade (near-infrared) filters did not increase classification accuracy when applied to large and continuous patches of specific species. Finally, applying a Gaussian filter reduces interband noise and increases species classification accuracy. Using the optimal preprocessing steps, our classification accuracy of six species classes is about 75%.
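
    The green vegetation and shade filters mentioned above can be pictured as per-pixel masks: one on the normalized difference vegetation index, NDVI = (NIR - red) / (NIR + red), and one on near-infrared reflectance. The Python sketch below shows the idea only; the threshold values `ndvi_min` and `nir_min` and the reflectance numbers are assumptions, since the abstract does not state the values used.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    if nir + red == 0:
        return 0.0
    return (nir - red) / (nir + red)


def keep_pixel(nir, red, ndvi_min=0.5, nir_min=0.2):
    """True if the pixel looks like sunlit green vegetation.

    ndvi_min and nir_min are hypothetical thresholds chosen for illustration.
    """
    return ndvi(nir, red) >= ndvi_min and nir >= nir_min


# Hypothetical (NIR, red) reflectance pairs.
pixels = {"sunlit crown": (0.5, 0.1), "shaded crown": (0.1, 0.02), "bare soil": (0.3, 0.25)}
for name, (nir, red) in pixels.items():
    print(name, keep_pixel(nir, red))
```

    Note that the shaded pixel has a high NDVI but low near-infrared reflectance, which is why the two filters are applied together.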

  17. An approach for quantifying the efficacy of ecological classification schemes as management tools

    NASA Astrophysics Data System (ADS)

    Flanagan, A. M.; Cerrato, R. M.

    2015-10-01

    Rigorous assessments of ecological classification schemes being applied to submerged environments are needed to evaluate their utility as management tools. Verification that a scheme can quantitatively capture habitat and community variation would be of considerable value to individuals responsible for making difficult management decisions relevant to widespread environmental challenges including those in fisheries, preservation or restoration of critical habitats, and climate change. In this paper, an assessment approach that evaluates a scheme by treating it like a quantitative statistical model is presented. It couples two direct gradient, multivariate statistical techniques, multivariate regression trees (MRT) and redundancy analysis (RDA), with a modelling protocol involving model formulation, model selection, parameter estimation, and measurement of precision to produce a very flexible strategy for analyzing structure in ecological data. To illustrate the proposed approach, the assessment focused on benthic infauna and evaluating the Folk grain size classification scheme, along with some alternative grain size models. Analysis of data sets revealed that while it was fairly easy to uncover biotic-environmental relationships that were over-fitted, the community structure inherent in the data tended to be robustly discernible and preserved across all grain size models, but rigidly parameterized models (i.e., a one size fits all approach for grain size characterization with fixed boundaries) were generally ineffective. The proposed approach provided a clear, detailed, and rigorous assessment of Folk and several alternative models and can be used for the quantitative evaluation of existing ecological classification schemes and/or in the development of new schemes.

  18. An overview of the phase-modular fault tree approach to phased mission system analysis

    NASA Technical Reports Server (NTRS)

    Meshkat, L.; Xing, L.; Donohue, S. K.; Ou, Y.

    2003-01-01

    We look at how fault tree analysis (FTA), a primary means of performing reliability analysis of PMS, can meet this challenge in this paper by presenting an overview of the modular approach to solving fault trees that represent PMS.

  19. Classification

    NASA Astrophysics Data System (ADS)

    Oza, Nikunj

    2012-03-01

    would represent one sunspot’s classification (y_i) and the corresponding set of measurements (x_i). The output of a supervised learning algorithm is a model h that approximates the unknown mapping from the inputs to the outputs. In our example, h would map from the sunspot measurements to the type of sunspot. We may have a test set S, a set of examples not used in training that we use to test how well the model h predicts the outputs on new examples. Just as with the examples in T, the examples in S are assumed to be independent and identically distributed (i.i.d.) draws from the distribution D. We measure the error of h on the test set as the proportion of test cases that h misclassifies: $\frac{1}{|S|} \sum_{(x,y) \in S} I(h(x) \neq y)$, where I(v) is the indicator function: it returns 1 if v is true and 0 otherwise. In our sunspot classification example, we would identify additional examples of sunspots that were not used in generating the model, and use these to determine how accurate the model is, that is, the fraction of the test samples that the model classifies correctly. An example of a classification model is the decision tree shown in Figure 23.1. We will discuss the decision tree learning algorithm in more detail later; for now, we assume that, given a training set with examples of sunspots, this decision tree is derived. This can be used to classify previously unseen examples of sunspots. For example, if a new sunspot’s inputs indicate that its "Group Length" is in the range 10-15, then the decision tree would classify the sunspot as being of type "E," whereas if the "Group Length" is "NULL," the "Magnetic Type" is "bipolar," and the "Penumbra" is "rudimentary," then it would be classified as type "C." In this chapter, we will add to the above description of classification problems. We will discuss decision trees and several other classification models. In particular, we will discuss the learning algorithms that generate these classification models, how to use them to
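
    The test-set error described in this abstract can be illustrated with a minimal Python sketch. The toy model encodes only the two decision paths quoted in the text (the full tree of Figure 23.1 is not reproduced here), and the function names, dictionary keys, and test examples are hypothetical, made up purely for illustration.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple

Example = Dict[str, Optional[str]]


def toy_sunspot_model(x: Example) -> str:
    """Hypothetical stand-in for h: only the two paths quoted in the text."""
    if x.get("group_length") == "10-15":
        return "E"
    if (
        x.get("group_length") is None  # "NULL" in the text
        and x.get("magnetic_type") == "bipolar"
        and x.get("penumbra") == "rudimentary"
    ):
        return "C"
    return "unknown"  # every other branch of the real tree is omitted


def misclassification_rate(
    h: Callable[[Example], str], S: Iterable[Tuple[Example, str]]
) -> float:
    """Proportion of (x, y) pairs in S for which h(x) != y."""
    pairs = list(S)
    if not pairs:
        raise ValueError("test set S must not be empty")
    return sum(1 for x, y in pairs if h(x) != y) / len(pairs)


# Made-up test set: two examples the toy model gets right, two it gets wrong.
test_set = [
    ({"group_length": "10-15"}, "E"),
    ({"group_length": None, "magnetic_type": "bipolar",
      "penumbra": "rudimentary"}, "C"),
    ({"group_length": "10-15"}, "C"),
    ({"group_length": None, "magnetic_type": "unipolar",
      "penumbra": "none"}, "A"),
]

error = misclassification_rate(toy_sunspot_model, test_set)
accuracy = 1.0 - error
```

    Accuracy, the fraction of test samples classified correctly, is one minus this error.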

  20. [It is normal for classification approaches to be diverse].

    PubMed

    Pavlinov, I Ia

    2003-01-01

    It is asserted that the postmodern concept of science, unlike the classical ideal, presumes necessary existence of various classification approaches (schools) in taxonomy, each corresponding to a particular aspect of consideration of the "taxic reality". They are set up by diversity of initial epistemological and ontological backgrounds which fix in a certain way a) fragments of that reality allowable for investigation, and b) allowable methods of exploration of the fragments being fixed. It makes it possible to define a taxonomic school as a unity of the above backgrounds together with consideration aspect delimited by them. Two extreme positions of these backgrounds could be recognized in recent taxonomic thought. One of them follows the scholastic tradition of elaboration of a formal and, hence, universal classificatory method ("new typology", numerical phenetics, pattern cladistics). Another one asserts dependence of classificatory approach on the judgment of the nature of taxic reality (natural philosophy, evolutionary schools of taxonomy). Some arguments are put forward in favor of significant impact of evolutionary thinking onto the theory of modern taxonomy. This impact is manifested by the correspondence principle which makes classificatory algorithms (and hence resulting classifications) depending onto initial assumptions about causes of taxic diversity. It is asserted that criteria of "quality" of both classifications proper and classificatory methods can be correctly formulated within the framework of a particular consideration aspect only. For any group of organisms, several particular classifications are rightful to exist, each corresponding to a particular consideration aspect. These classifications could not be arranged along the "better-worse" scale, as they reflect different fragments of the taxic reality. Their mutual interpretation depends on degree of compatibility of background assumptions and of the tasks being resolved. Extensionally

  1. Aerial Images from an UAV System: 3D Modeling and Tree Species Classification in a Park Area

    NASA Astrophysics Data System (ADS)

    Gini, R.; Passoni, D.; Pinto, L.; Sona, G.

    2012-07-01

    The use of aerial imagery acquired by Unmanned Aerial Vehicles (UAVs) is scheduled within the FoGLIE project (Fruition of Goods Landscape in Interactive Environment): it starts from the need to enhance the natural, artistic and cultural heritage, to produce a better usability of it by employing audiovisual movable systems of 3D reconstruction and to improve monitoring procedures, by using new media for integrating the fruition phase with the preservation ones. The pilot project focuses on a test area, Parco Adda Nord, which encloses various goods' types (small buildings, agricultural fields and different tree species and bushes). Multispectral high resolution images were taken by two digital compact cameras: a Pentax Optio A40 for RGB photos and a Sigma DP1 modified to acquire the NIR band. Then, some tests were performed in order to analyze the UAV images' quality with both photogrammetric and photo-interpretation purposes, to validate the vector-sensor system, the image block geometry and to study the feasibility of tree species classification. Many pre-signalized Control Points were surveyed through GPS to allow accuracy analysis. Aerial Triangulations (ATs) were carried out with photogrammetric commercial software, Leica Photogrammetry Suite (LPS) and PhotoModeler, with manual or automatic selection of Tie Points, to pick out pros and cons of each package in managing non-conventional aerial imagery as well as the differences in the modeling approach. Further analyses were done on the differences between the EO parameters and the corresponding data coming from the on board UAV navigation system.

  2. A conceptual approach to approximate tree root architecture in infinite slope models

    NASA Astrophysics Data System (ADS)

    Schmaltz, Elmar; Glade, Thomas

    2016-04-01

    paraboloids represent a cordate-root-system with radius r, height h and a constant, species-independent curvature. This procedure simplifies the classification of tree species into the three defined geometric solids. In this study we introduce a conceptual approach to estimate the 2- and 3-dimensional distribution of different tree root systems, and to implement it in a raster environment, as it is used in infinite slope models. To this end, we used the PCRaster extension in a Python framework. The results show that root distribution and root growth are spatially reproducible in a simple raster framework. The outputs exhibit significant effects for a synthetically generated slope on local scale for equal time-steps. The preliminary results depict an initial step to develop a vegetation module that can be coupled with hydro-mechanical slope stability models. This approach is expected to yield a valuable contribution to the implementation of vegetation-related properties, in particular effects of root-reinforcement, into physically-based approaches using infinite slope models.

  3. Classification Algorithms for Big Data Analysis, a Map Reduce Approach

    NASA Astrophysics Data System (ADS)

    Ayma, V. A.; Ferreira, R. S.; Happ, P.; Oliveira, D.; Feitosa, R.; Costa, G.; Plaza, A.; Gamba, P.

    2015-03-01

    For many years, the scientific community has been concerned with how to increase the accuracy of different classification methods, and major achievements have been made so far. Besides this issue, the increasing amount of data that is being generated every day by remote sensors raises more challenges to be overcome. In this work, a tool within the scope of InterIMAGE Cloud Platform (ICP), which is an open-source, distributed framework for automatic image interpretation, is presented. The tool, named ICP: Data Mining Package, is able to perform supervised classification procedures on huge amounts of data, usually referred to as big data, on a distributed infrastructure using Hadoop MapReduce. The tool has four classification algorithms implemented, taken from WEKA's machine learning library, namely: Decision Trees, Naïve Bayes, Random Forest and Support Vector Machines (SVM). The results of an experimental analysis using an SVM classifier on data sets of different sizes for different cluster configurations demonstrate the potential of the tool, as well as aspects that affect its performance.

  4. Trees

    ERIC Educational Resources Information Center

    Al-Khaja, Nawal

    2007-01-01

    This is a thematic lesson plan for young learners about palm trees and the importance of taking care of them. The two part lesson teaches listening, reading and speaking skills. The lesson includes parts of a tree; the modal auxiliary, can; dialogues and a role play activity.

  5. Unified framework for triaxial accelerometer-based fall event detection and classification using cumulants and hierarchical decision tree classifier.

    PubMed

    Kambhampati, Satya Samyukta; Singh, Vishal; Manikandan, M Sabarimalai; Ramkumar, Barathram

    2015-08-01

    In this Letter, the authors present a unified framework for fall event detection and classification using the cumulants extracted from the acceleration (ACC) signals acquired using a single waist-mounted triaxial accelerometer. The main objective of this Letter is to find suitable representative cumulants and classifiers in effectively detecting and classifying different types of fall and non-fall events. It was discovered that the first level of the proposed hierarchical decision tree algorithm implements fall detection using fifth-order cumulants and support vector machine (SVM) classifier. In the second level, the fall event classification algorithm uses the fifth-order cumulants and SVM. Finally, human activity classification is performed using the second-order cumulants and SVM. The detection and classification results are compared with those of the decision tree, naive Bayes, multilayer perceptron and SVM classifiers with different types of time-domain features including the second-, third-, fourth- and fifth-order cumulants and the signal magnitude vector and signal magnitude area. The experimental results demonstrate that the second- and fifth-order cumulant features and SVM classifier can achieve optimal detection and classification rates of above 95%, as well as the lowest false alarm rate of 1.03%. PMID:26609414
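
    Some of the time-domain features named in this abstract can be sketched in a few lines of Python. The definitions below are the ones commonly used in accelerometer work and are an assumption here: the Letter may define the signal magnitude vector and signal magnitude area differently (for example over sliding windows). The function names are hypothetical. The second-order cumulant of a signal is its variance; higher-order cumulants are not shown.

```python
import math
from typing import List, Sequence


def signal_magnitude_vector(
    ax: Sequence[float], ay: Sequence[float], az: Sequence[float]
) -> List[float]:
    """Per-sample Euclidean norm of the three acceleration axes."""
    return [math.sqrt(x * x + y * y + z * z) for x, y, z in zip(ax, ay, az)]


def signal_magnitude_area(
    ax: Sequence[float], ay: Sequence[float], az: Sequence[float]
) -> float:
    """Mean over the window of |x| + |y| + |z| (assumed discrete form)."""
    n = len(ax)
    if n == 0:
        raise ValueError("window must contain at least one sample")
    return sum(abs(x) + abs(y) + abs(z) for x, y, z in zip(ax, ay, az)) / n


def second_order_cumulant(samples: Sequence[float]) -> float:
    """Second-order cumulant, i.e. the (population) variance."""
    n = len(samples)
    if n == 0:
        raise ValueError("samples must not be empty")
    mean = sum(samples) / n
    return sum((s - mean) ** 2 for s in samples) / n


# Tiny made-up window of two samples per axis.
ax, ay, az = [3.0, 1.0], [4.0, -2.0], [0.0, 2.0]
smv = signal_magnitude_vector(ax, ay, az)
sma = signal_magnitude_area(ax, ay, az)
```

    Features like these would be computed per window and then passed to the classifier at each level of the hierarchy.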

  6. Improving Crop Classification Techniques Using Optical Remote Sensing Imagery, High-Resolution Agriculture Resource Inventory Shapefiles and Decision Trees

    NASA Astrophysics Data System (ADS)

    Melnychuk, A. L.; Berg, A. A.; Sweeney, S.

    2010-12-01

    Recognition of anthropogenic effects of land use management practices on bodies of water is important for remediating and preventing eutrophication. In the case of Lake Simcoe, Ontario, the main surrounding land use is agriculture. To better manage the nutrient flow into the lake, knowledge of the management of the agricultural land is important. For this basin, a comprehensive agricultural resource inventory is required for assessment of policy and for input into water quality management and assessment tools. Supervised decision tree classification schemes, used in many previous applications, have yielded reliable classifications in agricultural land-use systems. However, when using these classification techniques the user is confronted with numerous data sources. In this study we use a large inventory of optical satellite image products (Landsat, AWiFS, SPOT and MODIS) and ancillary data sources (temporal MODIS-NDVI product signatures, digital elevation models and soil maps) at various spatial and temporal resolutions in a decision tree classification scheme. The sensitivity of the classification accuracy to various products is assessed to identify optimal data sources for classifying crop systems.

  7. Unified framework for triaxial accelerometer-based fall event detection and classification using cumulants and hierarchical decision tree classifier

    PubMed Central

    Kambhampati, Satya Samyukta; Singh, Vishal; Ramkumar, Barathram

    2015-01-01

    In this Letter, the authors present a unified framework for fall event detection and classification using the cumulants extracted from the acceleration (ACC) signals acquired using a single waist-mounted triaxial accelerometer. The main objective of this Letter is to find suitable representative cumulants and classifiers in effectively detecting and classifying different types of fall and non-fall events. It was discovered that the first level of the proposed hierarchical decision tree algorithm implements fall detection using fifth-order cumulants and support vector machine (SVM) classifier. In the second level, the fall event classification algorithm uses the fifth-order cumulants and SVM. Finally, human activity classification is performed using the second-order cumulants and SVM. The detection and classification results are compared with those of the decision tree, naive Bayes, multilayer perceptron and SVM classifiers with different types of time-domain features including the second-, third-, fourth- and fifth-order cumulants and the signal magnitude vector and signal magnitude area. The experimental results demonstrate that the second- and fifth-order cumulant features and SVM classifier can achieve optimal detection and classification rates of above 95%, as well as the lowest false alarm rate of 1.03%. PMID:26609414

  8. Pattern classification approach to rocket engine diagnostics

    SciTech Connect

    Tulpule, S.

    1989-01-01

    This paper presents a systems level approach to integrate state-of-the-art rocket engine technology with advanced computational techniques to develop an integrated diagnostic system (IDS) for future rocket propulsion systems. The key feature of this IDS is the use of advanced diagnostic algorithms for failure detection as opposed to the current practice of redline-based failure detection methods. The paper presents a top-down analysis of rocket engine diagnostic requirements, rocket engine operation, applicable diagnostic algorithms, and algorithm design techniques, which serve as a basis for the IDS. The concepts of hierarchical, model-based information processing are described, together with the use of signal processing, pattern recognition, and artificial intelligence techniques which are an integral part of this diagnostic system. 27 refs.

  9. Comparing ANNs, EAs, and Trees: a basic machine-learning approach to predictive environmental models.

    NASA Astrophysics Data System (ADS)

    Williams, J.; Poff, N.

    2005-05-01

    Machine learning techniques for ecological applications or "eco-informatics" are becoming increasingly useful and accessible for ecologists. We evaluated the predictive ability of three commercially available (i.e. user-friendly) software packages for artificial neural networks (ANNs), evolutionary algorithms (EAs), and classification/regression trees (Trees). We analyzed fish and habitat data for streams in the mid-Atlantic region of the U.S., which was collected by the U.S. Environmental Protection Agency (EPA). The data includes over 200 environmental descriptors summarizing watershed, stream, and water chemistry characteristics in addition to derived fish community metrics (i.e. richness, IBI scores, % exotics). In our analysis we predicted individual species presence/absence and fish community metrics as a function of these local and regional scale habitat variables. Predictive ability is evaluated with independent validation data. These approaches could prove especially useful for conservation or management applications where ecologists seek to utilize the most comprehensive data to make predictions at various scales. By employing "user-friendly" software we hope to show that ecologists, without extensive knowledge of computational science, can benefit from these techniques by extracting more information about complex ecosystems. Relative strengths and weaknesses of these three approaches are compared and recommendations for their use in conservation applications are presented.

  10. Simulating California reservoir operation using the classification and regression-tree algorithm combined with a shuffled cross-validation scheme

    NASA Astrophysics Data System (ADS)

    Yang, Tiantian; Gao, Xiaogang; Sorooshian, Soroosh; Li, Xin

    2016-03-01

    The controlled outflows from a reservoir or dam are highly dependent on the decisions made by the reservoir operators, instead of a natural hydrological process. Differences exist between the natural upstream inflows to reservoirs and the controlled outflows from reservoirs that supply the downstream users. With the decision maker's awareness of changing climate, reservoir management requires adaptable means to incorporate more information into decision making, such as water delivery requirement, environmental constraints, dry/wet conditions, etc. In this paper, a robust reservoir outflow simulation model is presented, which incorporates one of the well-developed data-mining models (Classification and Regression Tree) to predict the complicated human-controlled reservoir outflows and extract the reservoir operation patterns. A shuffled cross-validation approach is further implemented to improve CART's predictive performance. An application study of nine major reservoirs in California is carried out. Results produced by the enhanced CART, original CART, and random forest are compared with observation. The statistical measurements show that the enhanced CART and random forest outperform the CART control run in general, and the enhanced CART algorithm gives better predictive performance than random forest in simulating the peak flows. The results also show that the proposed model is able to consistently and reasonably predict the expert release decisions. Experiments indicate that the release operation in the Oroville Lake is significantly dominated by SWP allocation amount and reservoirs with low elevation are more sensitive to inflow amount than others.
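
    As a rough illustration of what a shuffled cross-validation split can look like, the Python sketch below shuffles the sample indices before dividing them into k folds. This is only one plain reading of the term and an assumption: the abstract does not describe the paper's actual scheme, which may differ. The function name and parameters are hypothetical.

```python
import random
from typing import Iterator, List, Tuple


def shuffled_k_fold(
    n_samples: int, k: int, seed: int = 0
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) for k folds over shuffled indices."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once, reproducibly
    folds = [indices[i::k] for i in range(k)]  # k disjoint folds
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Example: 10 samples split into 5 folds; a tree model would be fitted on
# each training part and scored on the held-out part.
splits = list(shuffled_k_fold(10, 5))
```

    Shuffling before splitting keeps each fold from being a contiguous block of the record, which matters for time-ordered data such as reservoir releases.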

  11. Multicenter study on caries risk assessment in adults using survival Classification and Regression Trees.

    PubMed

    Arino, Masumi; Ito, Ataru; Fujiki, Shozo; Sugiyama, Seiichi; Hayashi, Mikako

    2016-01-01

    Dental caries is an important public health problem worldwide. This study aims to prove how preventive therapies reduce the onset of caries in adult patients, and to identify patients with high or low risk of caries by using Classification and Regression Trees based survival analysis (survival CART). A clinical data set of 732 patients aged 20 to 64 years in nine Japanese general practices was analyzed with the following parameters: age, DMFT, number of mutans streptococci (SM) and Lactobacilli (LB), secretion rate and buffer capacity of saliva, and compliance with a preventive program. Results showed the incidence of primary carious lesion was affected by SM, LB and compliance with a preventive program; secondary carious lesion was affected by DMFT, SM and LB. Survival CART identified high-risk patients for primary carious lesion according to their poor compliance with a preventive program and SM (≥10^6 CFU/ml) with a hazard ratio of 3.66 (p = 0.0002). In the case of secondary caries, patients with LB (≥10^5 CFU/ml) and DMFT (>15) were identified as high risk with a hazard ratio of 3.50 (p < 0.0001). We conclude that preventive programs can be effective in limiting the incidence of primary carious lesion. PMID:27381750

  12. Prediction of cadmium enrichment in reclaimed coastal soils by classification and regression tree

    NASA Astrophysics Data System (ADS)

    Ru, Feng; Yin, Aijing; Jin, Jiaxin; Zhang, Xiuying; Yang, Xiaohui; Zhang, Ming; Gao, Chao

    2016-08-01

    Reclamation of coastal land is one of the most common ways to obtain land resources in China. However, it has long been acknowledged that the artificial interference with coastal land has disadvantageous effects, such as heavy metal contamination. This study aimed to develop a prediction model for cadmium enrichment levels and assess the importance of affecting factors in typical reclaimed land in Eastern China (DFCL: Dafeng Coastal Land). Two hundred and twenty seven surficial soil/sediment samples were collected and analyzed to identify the enrichment levels of cadmium and the possible affecting factors in soils and sediments. The classification and regression tree (CART) model was applied in this study to predict cadmium enrichment levels. The prediction results showed that cadmium enrichment levels assessed by the CART model had an accuracy of 78.0%. The CART model could extract more information on factors affecting the environmental behavior of cadmium than correlation analysis. The integration of correlation analysis and the CART model showed that fertilizer application and organic carbon accumulation were the most important factors affecting soil/sediment cadmium enrichment levels, followed by particle size effects (Al2O3, TFe2O3 and SiO2), contents of Cl and S, surrounding construction areas and reclamation history.

  13. Study and Ranking of Determinants of Taenia solium Infections by Classification Tree Models

    PubMed Central

    Mwape, Kabemba E.; Phiri, Isaac K.; Praet, Nicolas; Dorny, Pierre; Muma, John B.; Zulu, Gideon; Speybroeck, Niko; Gabriël, Sarah

    2015-01-01

    Taenia solium taeniasis/cysticercosis is an important public health problem occurring mainly in developing countries. This work aimed to study the determinants of human T. solium infections in the Eastern province of Zambia and rank them in order of importance. A household (HH)-level questionnaire was administered to 680 HHs from 53 villages in two rural districts and the taeniasis and cysticercosis status determined. A classification tree model (CART) was used to define the relative importance and interactions between different predictor variables in their effect on taeniasis and cysticercosis. The Katete study area had a significantly higher taeniasis and cysticercosis prevalence than the Petauke area. The CART analysis for Katete showed that the most important determinant for cysticercosis infections was the number of HH inhabitants (6 to 10) and for taeniasis was the number of HH inhabitants > 6. The most important determinant in Petauke for cysticercosis was the age of head of household > 32 years and for taeniasis it was age < 55 years. The CART analysis showed that the most important determinant for both taeniasis and cysticercosis infections was the number of HH inhabitants (6 to 10) in Katete district and age in Petauke. The results suggest that control measures should target HHs with a high number of inhabitants and older individuals. PMID:25404073

  14. Multicenter study on caries risk assessment in adults using survival Classification and Regression Trees

    PubMed Central

    Arino, Masumi; Ito, Ataru; Fujiki, Shozo; Sugiyama, Seiichi; Hayashi, Mikako

    2016-01-01

    Dental caries is an important public health problem worldwide. This study aims to prove how preventive therapies reduce the onset of caries in adult patients, and to identify patients with high or low risk of caries by using Classification and Regression Trees based survival analysis (survival CART). A clinical data set of 732 patients aged 20 to 64 years in nine Japanese general practices was analyzed with the following parameters: age, DMFT, number of mutans streptococci (SM) and Lactobacilli (LB), secretion rate and buffer capacity of saliva, and compliance with a preventive program. Results showed the incidence of primary carious lesion was affected by SM, LB and compliance with a preventive program; secondary carious lesion was affected by DMFT, SM and LB. Survival CART identified high-risk patients for primary carious lesion according to their poor compliance with a preventive program and SM (≥10^6 CFU/ml) with a hazard ratio of 3.66 (p = 0.0002). In the case of secondary caries, patients with LB (≥10^5 CFU/ml) and DMFT (>15) were identified as high risk with a hazard ratio of 3.50 (p < 0.0001). We conclude that preventive programs can be effective in limiting the incidence of primary carious lesion. PMID:27381750

  15. Study and ranking of determinants of Taenia solium infections by classification tree models.

    PubMed

    Mwape, Kabemba E; Phiri, Isaac K; Praet, Nicolas; Dorny, Pierre; Muma, John B; Zulu, Gideon; Speybroeck, Niko; Gabriël, Sarah

    2015-01-01

    Taenia solium taeniasis/cysticercosis is an important public health problem occurring mainly in developing countries. This work aimed to study the determinants of human T. solium infections in the Eastern province of Zambia and rank them in order of importance. A household (HH)-level questionnaire was administered to 680 HHs from 53 villages in two rural districts and the taeniasis and cysticercosis status determined. A classification tree model (CART) was used to define the relative importance and interactions between different predictor variables in their effect on taeniasis and cysticercosis. The Katete study area had a significantly higher taeniasis and cysticercosis prevalence than the Petauke area. The CART analysis for Katete showed that the most important determinant for cysticercosis infections was the number of HH inhabitants (6 to 10) and for taeniasis was the number of HH inhabitants > 6. The most important determinant in Petauke for cysticercosis was the age of head of household > 32 years and for taeniasis it was age < 55 years. The CART analysis showed that the most important determinant for both taeniasis and cysticercosis infections was the number of HH inhabitants (6 to 10) in Katete district and age in Petauke. The results suggest that control measures should target HHs with a high number of inhabitants and older individuals. PMID:25404073

  16. Knowledge-based approach to video content classification

    NASA Astrophysics Data System (ADS)

    Chen, Yu; Wong, Edward K.

    2001-01-01

    A framework for video content classification using a knowledge-based approach is herein proposed. This approach is motivated by the fact that videos are rich in semantic contents, which can best be interpreted and analyzed by human experts. We demonstrate the concept by implementing a prototype video classification system using the rule-based programming language CLIPS 6.05. Knowledge for video classification is encoded as a set of rules in the rule base. The left-hand-sides of rules contain high level and low level features, while the right-hand-sides of rules contain intermediate results or conclusions. Our current implementation includes features computed from motion, color, and text extracted from video frames. Our current rule set allows us to classify input video into one of five classes: news, weather, reporting, commercial, basketball and football. We use MYCIN's inexact reasoning method for combining evidences, and to handle the uncertainties in the features and in the classification results. We obtained good results in a preliminary experiment, and it demonstrated the validity of the proposed approach.

  17. Knowledge-based approach to video content classification

    NASA Astrophysics Data System (ADS)

    Chen, Yu; Wong, Edward K.

    2000-12-01

    A framework for video content classification using a knowledge-based approach is herein proposed. This approach is motivated by the fact that videos are rich in semantic contents, which can best be interpreted and analyzed by human experts. We demonstrate the concept by implementing a prototype video classification system using the rule-based programming language CLIPS 6.05. Knowledge for video classification is encoded as a set of rules in the rule base. The left-hand-sides of rules contain high level and low level features, while the right-hand-sides of rules contain intermediate results or conclusions. Our current implementation includes features computed from motion, color, and text extracted from video frames. Our current rule set allows us to classify input video into one of five classes: news, weather, reporting, commercial, basketball and football. We use MYCIN's inexact reasoning method for combining evidences, and to handle the uncertainties in the features and in the classification results. We obtained good results in a preliminary experiment, and it demonstrated the validity of the proposed approach.

  18. New Approach for Segmentation and Extraction of Single Tree from Point Clouds Data and Aerial Images

    NASA Astrophysics Data System (ADS)

    Homainejad, A. S.

    2016-06-01

    This paper addresses a new approach for reconstructing 3D models of single trees from Airborne Laser Scanner (ALS) data and aerial images. The approach detects and extracts single trees from ALS data and aerial images. Existing approaches are able to provide bulk segmentation of a group of trees; however, some methods have focused on detection and extraction of a particular tree from ALS data and images. Segmentation of a single tree within a group of trees is mostly considered infeasible, since detecting the boundary lines between the trees is a tedious job. In this approach an experimental formula based on the height of the trees was developed and applied in order to define the boundary lines between the trees. As a result, each single tree was segmented and extracted and later a 3D model was created. Trees extracted with this approach have a unique identification and attributes. The output has applications in various fields of science and engineering such as forestry, urban planning, and agriculture. For example, in forestry the result can be used for studies of ecological diversity, biodiversity and ecosystems.

  19. Land cover and forest formation distributions for St. Kitts, Nevis, St. Eustatius, Grenada and Barbados from decision tree classification of cloud-cleared satellite imagery

    USGS Publications Warehouse

    Helmer, E.H.; Kennaway, T.A.; Pedreros, D.H.; Clark, M.L.; Marcano-Vega, H.; Tieszen, L.L.; Ruzycki, T.R.; Schill, S.R.; Carrington, C.M.S.

    2008-01-01

    Satellite image-based mapping of tropical forests is vital to conservation planning. Standard methods for automated image classification, however, limit classification detail in complex tropical landscapes. In this study, we test an approach to Landsat image interpretation on four islands of the Lesser Antilles, including Grenada and St. Kitts, Nevis and St. Eustatius, testing a more detailed classification than earlier work in the latter three islands. Secondly, we estimate the extents of land cover and protected forest by formation for five islands and ask how land cover has changed over the second half of the 20th century. The image interpretation approach combines image mosaics and ancillary geographic data, classifying the resulting set of raster data with decision tree software. Cloud-free image mosaics for one or two seasons were created by applying regression tree normalization to scene dates that could fill cloudy areas in a base scene. Such mosaics are also known as cloud-filled, cloud-minimized or cloud-cleared imagery, mosaics, or composites. The approach accurately distinguished several classes that more standard methods would confuse; the seamless mosaics aided reference data collection; and the multiseason imagery allowed us to separate drought deciduous forests and woodlands from semi-deciduous ones. Cultivated land areas declined 60 to 100 percent from about 1945 to 2000 on several islands. Meanwhile, forest cover has increased 50 to 950%. This trend will likely continue where sugar cane cultivation has dominated. Like the island of Puerto Rico, most higher-elevation forest formations are protected in formal or informal reserves. Also similarly, lowland forests, which are drier forest types on these islands, are not well represented in reserves. Former cultivated lands in lowland areas could provide lands for new reserves of drier forest types. 
The land-use history of these islands may provide insight for planners in countries currently considering

  20. ADHD classification using bag of words approach on network features

    NASA Astrophysics Data System (ADS)

    Solmaz, Berkan; Dey, Soumyabrata; Rao, A. Ravishankar; Shah, Mubarak

    2012-02-01

    Attention Deficit Hyperactivity Disorder (ADHD) is receiving considerable attention, mainly because it is one of the common brain disorders among children and little is known about its cause. In this study, we propose a novel approach for automatic classification of ADHD subjects and control subjects using functional Magnetic Resonance Imaging (fMRI) data of resting-state brains. For this purpose, we compute the correlation between every possible pair of voxels within a subject over the time frame of the experimental protocol. A network of voxels is constructed by representing a high correlation value between any two voxels as an edge. A Bag-of-Words (BoW) approach is used to represent each subject as a histogram of network features, such as the degree of each voxel. The classification is done using a Support Vector Machine (SVM). We also investigate the use of raw intensity values in the time series for each voxel. Here, every subject is represented as a combined histogram of network and raw intensity features. Experimental results verified that the classification accuracy improves when the combined histogram is used. We tested our approach on a highly challenging dataset released by NITRC for the ADHD-200 competition and obtained promising results. The dataset is not only large but also includes subjects from different demographic and age groups. To the best of our knowledge, this is the first paper to propose a BoW approach in any functional brain disorder classification, and we believe that this approach will be useful in the analysis of many brain-related conditions.
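
    The network construction described in this abstract can be illustrated with a small sketch: correlate every pair of voxel time series, keep an edge where the correlation exceeds a threshold, and summarize the voxel degrees as a histogram. The threshold, the bin count, and the toy time series below are assumptions for illustration only; the abstract does not state the values the authors used, and the SVM step is not shown.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length series (undefined for constant series)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def degree_histogram(series, threshold, n_bins):
    """Build a thresholded correlation network and histogram the voxel degrees."""
    n = len(series)
    degree = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            # An edge joins two voxels whose time series are highly correlated.
            if pearson(series[i], series[j]) > threshold:
                degree[i] += 1
                degree[j] += 1
    hist = [0] * n_bins
    for d in degree:
        # Degrees beyond the last bin are counted in the last bin.
        hist[min(d, n_bins - 1)] += 1
    return degree, hist


# Toy data: four hypothetical voxels, four time points each.
voxels = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1], [1, 3, 2, 4]]
degree, hist = degree_histogram(voxels, threshold=0.9, n_bins=3)
```

    In this toy case only the first two voxels are linked, so two voxels have degree 1 and two have degree 0; the resulting histogram is the kind of per-subject feature vector a classifier would receive.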

  1. Using PPI network autocorrelation in hierarchical multi-label classification trees for gene function prediction

    PubMed Central

    2013-01-01

    Background Ontologies and catalogs of gene functions, such as the Gene Ontology (GO) and MIPS-FUN, assume that functional classes are organized hierarchically, that is, general functions include more specific ones. This has recently motivated the development of several machine learning algorithms for gene function prediction that leverage this hierarchical organization, where instances may belong to multiple classes. In addition, it is possible to exploit relationships among examples, since it is plausible that related genes tend to share functional annotations. Although these relationships have been identified and extensively studied in the area of protein-protein interaction (PPI) networks, they have not received much attention in hierarchical and multi-class gene function prediction. Relations between genes introduce autocorrelation in functional annotations and violate the assumption that instances are independently and identically distributed (i.i.d.), which underlies most machine learning algorithms. Although the explicit consideration of these relations brings additional complexity to the learning process, we expect substantial benefits in predictive accuracy of learned classifiers. Results This article demonstrates the benefits (in terms of predictive accuracy) of considering autocorrelation in multi-class gene function prediction. We develop a tree-based algorithm for considering network autocorrelation in the setting of Hierarchical Multi-label Classification (HMC). We empirically evaluate the proposed algorithm, called NHMC (Network Hierarchical Multi-label Classification), on 12 yeast datasets using each of the MIPS-FUN and GO annotation schemes and exploiting 2 different PPI networks. The results clearly show that taking autocorrelation into account improves the predictive performance of the learned models for predicting gene function.
Conclusions Our newly developed method for HMC takes into account network information in the learning phase: When

  2. Is protein classification necessary? Towards alternative approaches to function annotation

    PubMed Central

    Petrey, Donald; Honig, Barry

    2009-01-01

    The current non-redundant protein sequence database contains over seven million entries and the number of individual functional domains is significantly larger than this value. The vast quantity of data associated with these proteins poses enormous challenges to any attempt at function annotation. Classification of proteins into sequence and structural groups has been widely used as an approach to simplifying the problem. In this article we question such strategies. We describe how the multi-functionality and structural diversity of even closely related proteins confounds efforts to assign function based on overall sequence or structural similarity. Rather, we suggest that strategies that avoid classification may offer a more robust approach to protein function annotation. PMID:19269161

  3. Non-Destructive Classification Approaches for Equilibrated Ordinary Chondrites

    NASA Technical Reports Server (NTRS)

    Righter, K.; Harrington, R.; Schroeder, C.; Morris, R. V.

    2013-01-01

    Classification of meteorites is most effectively carried out by petrographic and mineralogic studies of thin sections, but a rapid and accurate classification technique for the many samples collected in dense collection areas (hot and cold deserts) is of great interest. Oil immersion techniques have been used to classify a large proportion of the US Antarctic meteorite collections since the mid-1980s [1]. This approach has allowed rapid characterization of thousands of samples over time, but nonetheless utilizes a piece of the sample that has been ground to grains or a powder. In order to compare a few non-destructive techniques with the standard approaches, we have characterized a group of chondrites from the Larkman Nunatak region using magnetic susceptibility and Moessbauer spectroscopy.

  4. "Trees and Things That Live in Trees": Three Children with Special Needs Experience the Project Approach

    ERIC Educational Resources Information Center

    Griebling, Susan; Elgas, Peg; Konerman, Rachel

    2015-01-01

    The authors report on research conducted during a project investigation undertaken with preschool children, ages 3-5. The report focuses on three children with special needs and the positive outcomes for each child as they engaged in the project Trees and Things That Live in Trees. Two of the children were diagnosed with developmental delays, and…

  5. Availability and Capacity of Substance Abuse Programs in Correctional Settings: A Classification and Regression Tree Analysis

    PubMed Central

    Kitsantas, Panagiota

    2009-01-01

    Objective to be addressed The purpose of this study was to investigate the structural and organizational factors that contribute to the availability and increased capacity for substance abuse treatment programs in correctional settings. We used Classification and Regression Tree statistical procedures to identify how multi-level data can explain the variability in availability and capacity of substance abuse treatment programs in jails and probation/parole offices. Methods The data for this study combined the National Criminal Justice Treatment Practices survey (NCJTP) and the 2000 Census. The NCJTP survey was a nationally representative sample of correctional administrators for jails and probation/parole agencies. The sample size included 295 substance abuse treatment programs that were classified according to the intensity of their services: high, medium, and low. The independent variables included jurisdictional-level structural variables, attributes of the correctional administrators, and program and service delivery characteristics of the correctional agency. Results The two most important variables in predicting the availability of all three types of services were stronger working relationships with other organizations and the adoption of a standardized substance abuse screening tool by correctional agencies. For high and medium intensive programs, the capacity increased when an organizational learning strategy was used by administrators and the organization used a substance abuse screening tool. Implications on advancing treatment practices in correctional settings are discussed, including further work to test theories on how to better understand access to intensive treatment services. This study presents the first phase of understanding capacity-related issues regarding treatment programs offered in correctional settings. PMID:19395204

  6. Interactive change detection based on dissimilarity image and decision tree classification

    NASA Astrophysics Data System (ADS)

    Wang, Yan; Crouzil, Alain; Puel, Jean-Baptiste

    2015-02-01

    Our study mainly focuses on detecting changed regions in two images of the same scene taken by digital cameras at different times. The images taken by digital cameras generally provide less information than multi-channel remote sensing images. Moreover, application-dependent insignificant changes, such as shadows or clouds, may cause classical methods based on image differences to fail. The machine learning approach seems promising, but the lack of a sufficient volume of training data for photographic landscape observatories rules out many methods. We therefore investigate an interactive learning approach and provide a discriminative model built on a 16-dimensional feature space comprising textural appearance and contextual information. Dissimilarity measures computed over different neighborhood sizes are used to detect differences within the neighborhood of an image pair. To detect changes between two images, the user designates change and non-change samples (pixel sets) in the images using a selection tool. These data are used to train a decision tree classifier, which is then applied to all the other pixels of the image pair. The experiments demonstrate the potential of the proposed approach.

  7. A hybrid ensemble learning approach to star-galaxy classification

    NASA Astrophysics Data System (ADS)

    Kim, Edward J.; Brunner, Robert J.; Carrasco Kind, Matias

    2015-10-01

    There exist a variety of star-galaxy classification techniques, each with their own strengths and weaknesses. In this paper, we present a novel meta-classification framework that combines and fully exploits different techniques to produce a more robust star-galaxy classification. To demonstrate this hybrid, ensemble approach, we combine a purely morphological classifier, a supervised machine learning method based on random forest, an unsupervised machine learning method based on self-organizing maps, and a hierarchical Bayesian template-fitting method. Using data from the CFHTLenS survey (Canada-France-Hawaii Telescope Lensing Survey), we consider different scenarios: when a high-quality training set is available with spectroscopic labels from DEEP2 (Deep Extragalactic Evolutionary Probe Phase 2), SDSS (Sloan Digital Sky Survey), VIPERS (VIMOS Public Extragalactic Redshift Survey), and VVDS (VIMOS VLT Deep Survey), and when the demographics of sources in a low-quality training set do not match the demographics of objects in the test data set. We demonstrate that our Bayesian combination technique improves the overall performance over any individual classification method in these scenarios. Thus, strategies that combine the predictions of different classifiers may prove to be optimal in currently ongoing and forthcoming photometric surveys, such as the Dark Energy Survey and the Large Synoptic Survey Telescope.

  8. Cluster Stability Estimation Based on a Minimal Spanning Trees Approach

    NASA Astrophysics Data System (ADS)

    Volkovich, Zeev (Vladimir); Barzily, Zeev; Weber, Gerhard-Wilhelm; Toledano-Kitai, Dvora

    2009-08-01

    Among the areas of data and text mining employed today in science, economy and technology, clustering theory serves as a preprocessing step in data analysis. However, many open questions still await theoretical and practical treatment; e.g., the problem of determining the true number of clusters has not been satisfactorily solved. In the current paper, this problem is addressed by the cluster stability approach. For several possible numbers of clusters we estimate the stability of partitions obtained from clustering of samples. Partitions are considered consistent if their clusters are stable. Cluster validity is measured as the total number of edges, in the clusters' minimal spanning trees, connecting points from different samples. In effect, we use the Friedman and Rafsky two-sample test statistic. The homogeneity hypothesis, of well-mingled samples within the clusters, leads to an asymptotically normal distribution of the considered statistic. Resting upon this fact, the standard score of the mentioned edge count is computed, and the partition quality is represented by the worst cluster, corresponding to the minimal standard score value. It is natural to expect that the true number of clusters can be characterized by the empirical distribution having the shortest left tail. The proposed methodology sequentially creates the described value distribution and estimates its left-asymmetry. Numerical experiments, presented in the paper, demonstrate the ability of the approach to detect the true number of clusters.
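
    The edge count at the heart of this measure can be sketched in a few lines: build a minimal spanning tree over the pooled points of one cluster and count the edges whose endpoints come from different samples. The sketch below uses Prim's algorithm with Euclidean distance, which is an assumption, and it stops at the raw count; the standardization to a score described in the abstract is not reproduced. All names and the toy points are hypothetical.

```python
from math import dist


def mst_edges(points):
    """Return the edges of a Euclidean minimal spanning tree (Prim's algorithm)."""
    n = len(points)
    in_tree = [False] * n
    best = [float("inf")] * n   # cheapest known connection cost per point
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the point outside the tree that is cheapest to attach.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges


def cross_sample_edges(points, sample_of):
    """Count MST edges whose endpoints belong to different samples."""
    return sum(sample_of[a] != sample_of[b] for a, b in mst_edges(points))


# Toy cluster: four points on a line, drawn from two samples (labels 0 and 1).
points = [(0, 0), (1, 0), (10, 0), (11, 0)]
separated = cross_sample_edges(points, [0, 0, 1, 1])  # samples occupy different regions
mingled = cross_sample_edges(points, [0, 1, 0, 1])    # samples interleave
```

    Well-mingled samples produce many cross-sample edges, while samples that occupy different regions of the cluster produce few, which is why a low standardized count signals an unstable cluster.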

  9. Use of classification trees to apportion single echo detections to species: Application to the pelagic fish community of Lake Superior

    USGS Publications Warehouse

    Yule, Daniel L.; Adams, Jean V.; Hrabik, Thomas R.; Vinson, Mark R.; Woiak, Zebadiah; Ahrenstroff, Tyler D.

    2013-01-01

    Acoustic methods are used to estimate the density of pelagic fish in large lakes with results of midwater trawling used to assign species composition. Apportionment in lakes having mixed species can be challenging because only a small fraction of the water sampled acoustically is sampled with trawl gear. Here we describe a new method where single echo detections (SEDs) are assigned to species based on classification tree models developed from catch data that separate species based on fish size and the spatial habitats they occupy. During the summer of 2011, we conducted a spatially-balanced lake-wide acoustic and midwater trawl survey of Lake Superior. A total of 51 sites in four bathymetric depth strata (0–30 m, 30–100 m, 100–200 m, and >200 m) were sampled. We developed classification tree models for each stratum and found fish length was the most important variable for separating species. To apply these trees to the acoustic data, we needed to identify a target strength to length (TS-to-L) relationship appropriate for all abundant Lake Superior pelagic species. We tested performance of 7 general (i.e., multi-species) relationships derived from three published studies. The best-performing relationship was identified by comparing predicted and observed catch compositions using a second independent Lake Superior data set. Once identified, the relationship was used to predict lengths of SEDs from the lake-wide survey, and the classification tree models were used to assign each SED to a species. Exotic rainbow smelt (Osmerus mordax) were the most common species at bathymetric depths 100 m (384 million; 6.0 kt). Cisco (Coregonus artedi) were widely distributed over all strata with their population estimated at 182 million (44 kt). The apportionment method we describe should be transferable to other large lakes provided fish are not tightly aggregated, and an appropriate TS-to-L relationship for abundant pelagic fish species can be determined.

  10. A Dynamic Tree Approach to Environmental Transport on Hillslopes

    NASA Astrophysics Data System (ADS)

    Passalacqua, P.; Zaliapin, I.; Foufoula-Georgiou, E.; Ghil, M.; Dietrich, W. E.

    2010-12-01

    The concept of dynamic tree was introduced in Zaliapin et al. (2010) as the basis of an extended conceptual framework to study the transport of spatially heterogeneous fluxes as they propagate down a network of a given topology. Here we are interested in extending this framework over the whole basin by incorporating the hillslope paths and their geometry, which are known to differ from those of the river network. Focusing on the fluxes that start at a source, propagate downstream and have constant velocity, we first capture the static structure of the hillslope network by representing it by a tree (static tree). We then describe the transport down the hillslope tree as a particular case of nearest-neighbor hierarchical aggregation and thus obtain the so-called dynamic tree. The properties of both the dynamic and static trees are analyzed by applying Horton-Strahler and Tokunaga taxonomies. The results obtained in three hillslope areas of different characteristics, two located in California and one in Oregon, show that both the static and the dynamic tree can be well approximated by Tokunaga self-similar trees (SSTs), in agreement with what was previously obtained for the channelized paths of the river network but with different parameters. The degree of side branching is larger for the static tree than for the dynamic. We also observed a phase transition in the dynamics of the three systems which reflects an abrupt emergence of a giant cluster of connected streams.
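
    For readers unfamiliar with the Horton-Strahler taxonomy mentioned here, a minimal sketch of the ordering rule follows: a source has order 1, and a junction takes the largest upstream order, incremented by one when that largest order occurs at least twice. The dictionary representation and the node names are hypothetical, and the Tokunaga side-branching analysis from the abstract is not shown.

```python
def strahler_order(upstream, node):
    """Horton-Strahler order of `node`, given a map from node to its upstream branches."""
    branches = upstream.get(node, [])
    if not branches:
        return 1  # a source (leaf) has order 1
    orders = sorted((strahler_order(upstream, b) for b in branches), reverse=True)
    # The order increases only where two branches of the highest order meet.
    if len(orders) > 1 and orders[0] == orders[1]:
        return orders[0] + 1
    return orders[0]


# Hypothetical hillslope tree: the outlet collects branch "a" (fed by two sources)
# and a single side branch "b".
tree = {"outlet": ["a", "b"], "a": ["c", "d"]}
outlet_order = strahler_order(tree, "outlet")
```

    In this example the side branch "b" has order 1 and joins a branch of order 2, so the order of the outlet stays at 2; such lower-order side branches are what the Tokunaga taxonomy counts.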

  11. A methodological approach to the classification of dermoscopy images

    PubMed Central

    Celebi, M. Emre; Kingravi, Hassan A.; Uddin, Bakhtiyar; Iyatomi, Hitoshi; Aslandogan, Y. Alp; Stoecker, William V.; Moss, Randy H.

    2011-01-01

    In this paper a methodological approach to the classification of pigmented skin lesions in dermoscopy images is presented. First, automatic border detection is performed to separate the lesion from the background skin. Shape features are then extracted from this border. For the extraction of color and texture related features, the image is divided into various clinically significant regions using the Euclidean distance transform. This feature data is fed into an optimization framework, which ranks the features using various feature selection algorithms and determines the optimal feature subset size according to the area under the ROC curve measure obtained from support vector machine classification. The issue of class imbalance is addressed using various sampling strategies, and the classifier generalization error is estimated using Monte Carlo cross validation. Experiments on a set of 564 images yielded a specificity of 92.34% and a sensitivity of 93.33%. PMID:17387001

  12. A Transform-Based Feature Extraction Approach for Motor Imagery Tasks Classification

    PubMed Central

    Khorshidtalab, Aida; Mesbah, Mostefa; Salami, Momoh J. E.

    2015-01-01

    In this paper, we present a new motor imagery classification method in the context of electroencephalography (EEG)-based brain–computer interface (BCI). This method uses a signal-dependent orthogonal transform, referred to as linear prediction singular value decomposition (LP-SVD), for feature extraction. The transform defines the mapping as the left singular vectors of the LP coefficient filter impulse response matrix. Using a logistic tree-based model classifier, the extracted features are classified into one of four motor imagery movements. The proposed approach was first benchmarked against two related state-of-the-art feature extraction approaches, namely, discrete cosine transform (DCT) and adaptive autoregressive (AAR)-based methods. By achieving an accuracy of 67.35%, the LP-SVD approach outperformed the other approaches by large margins (25% compared with DCT and 6% compared with AAR-based methods). To further improve the discriminatory capability of the extracted features and reduce the computational complexity, we enlarged the extracted feature subset by incorporating two extra features, namely, Q- and the Hotelling’s $T^{2}$ statistics of the transformed EEG and introduced a new EEG channel selection method. The performance of the EEG classification based on the expanded feature set and channel selection method was compared with that of a number of the state-of-the-art classification methods previously reported with the BCI IIIa competition data set. Our method came second with an average accuracy of 81.38%. PMID:27170898

  13. Classification tree for risk assessment in patients suffering from congestive heart failure via long-term heart rate variability.

    PubMed

    Melillo, Paolo; De Luca, Nicola; Bracale, Marcello; Pecchia, Leandro

    2013-05-01

    This study aims to develop an automatic classifier for risk assessment in patients suffering from congestive heart failure (CHF). The proposed classifier separates lower risk patients from higher risk ones, using standard long-term heart rate variability (HRV) measures. Patients are labeled as lower or higher risk according to the New York Heart Association classification (NYHA). A retrospective analysis on two public Holter databases was performed, analyzing the data of 12 patients suffering from mild CHF (NYHA I and II), labeled as lower risk, and 32 suffering from severe CHF (NYHA III and IV), labeled as higher risk. Only patients with a fraction of total heartbeat intervals (RR) classified as normal-to-normal (NN) intervals (NN/RR) higher than 80% were selected as eligible in order to have a satisfactory signal quality. Classification and regression tree (CART) was employed to develop the classifiers. A total of 30 higher risk and 11 lower risk patients were included in the analysis. The proposed classification trees achieved a sensitivity and a specificity rate of 93.3% and 63.6%, respectively, in identifying higher risk patients. Finally, the rules obtained by CART are comprehensible and consistent with the consensus shown by previous studies that depressed HRV is a useful tool for risk assessment in patients suffering from CHF. PMID:24592473
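
    The reported rates can be related to patient counts with a short calculation. The abstract gives only the group sizes (30 higher risk, 11 lower risk) and the two percentages; the counts of correctly classified patients used below (28 and 7) are an assumption, chosen because they are the integers that reproduce the reported figures.

```python
def rate_percent(correct: int, total: int) -> float:
    """Share of a group that was classified correctly, as a percentage."""
    return 100.0 * correct / total


# Assumed counts (not stated in the abstract): 28 of 30 higher risk patients
# detected, 7 of 11 lower risk patients correctly left unflagged.
sensitivity = rate_percent(28, 30)   # higher risk patients identified as higher risk
specificity = rate_percent(7, 11)    # lower risk patients identified as lower risk
```

    Under that assumption the classifier would miss 2 higher risk patients and flag 4 lower risk patients as higher risk, which shows how small the lower risk group is relative to the precision of the specificity figure.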

  14. A comprehensive but efficient framework of proposing and validating feature parameters from airborne LiDAR data for tree species classification

    NASA Astrophysics Data System (ADS)

    Lin, Yi; Hyyppä, Juha

    2016-04-01

    Tree species information is crucial for digital forestry, and efficient techniques for classifying tree species are extensively demanded. To this end, airborne light detection and ranging (LiDAR) has been introduced. However, the literature review suggests that most of the previous airborne LiDAR-based studies were only based on limited kinds of tree signatures. To address this gap, this study proposed developing a novel modular framework for LiDAR-based tree species classification, by deriving feature parameters in a systematic way. Specifically, feature parameters of point-distribution (PD), laser pulse intensity (IN), crown-internal (CI) and tree-external (TE) structures were proposed and derived. With a support-vector-machine (SVM) classifier used, the classifications were conducted in a leave-one-out-for-cross-validation (LOOCV) mode. Based on the samples of four typical boreal tree species, i.e., Picea abies, Pinus sylvestris, Populus tremula and Quercus robur, tests showed that the accuracies of the classifications based on the acquired PD-, IN-, CI- and TE-categorized feature parameters as well as the integration of their individual optimal parameters are 65.00%, 80.00%, 82.50%, 85.00% and 92.50%, respectively. These results indicate that the procedures proposed in this study can be used as a comprehensive but efficient framework of proposing and validating feature parameters from airborne LiDAR data for tree species classification.

  15. Full Hierarchic Versus Non-Hierarchic Classification Approaches for Mapping Sealed Surfaces at the Rural-Urban Fringe Using High-Resolution Satellite Data

    PubMed Central

    De Roeck, Tim; Van de Voorde, Tim; Canters, Frank

    2009-01-01

    Since 2008, more than half of the world population has been living in cities, and urban sprawl is continuing. Because of these developments, the mapping and monitoring of urban environments and their surroundings is becoming increasingly important. In this study two object-oriented approaches for high-resolution mapping of sealed surfaces are compared: a standard non-hierarchic approach and a full hierarchic approach using both multi-layer perceptrons and decision trees as learning algorithms. Both methods outperform the standard nearest neighbour classifier, which is used as a benchmark scenario. For the multi-layer perceptron approach, applying a hierarchic classification strategy substantially increases the accuracy of the classification. For the decision tree approach a one-against-all hierarchic classification strategy does not lead to an improvement of classification accuracy compared to the standard all-against-all approach. Best results are obtained with the hierarchic multi-layer perceptron classification strategy, producing a kappa value of 0.77. A simple shadow reclassification procedure based on characteristics of neighbouring objects further increases the kappa value to 0.84. PMID:22389586
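
    The kappa values quoted in this abstract are agreement statistics computed from a confusion matrix. As a minimal sketch, assuming the usual Cohen's kappa definition, the calculation looks like this. The matrix below is hypothetical and does not come from the study, which reports only the resulting kappa values.

```python
def cohens_kappa(confusion):
    """Cohen's kappa for a square confusion matrix (rows: reference, columns: predicted)."""
    total = sum(sum(row) for row in confusion)
    # Observed agreement: share of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(len(confusion))) / total
    # Chance agreement: product of matching row and column totals.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical two-class result (sealed vs. non-sealed), 100 reference objects.
matrix = [[40, 10],
          [5, 45]]
kappa = cohens_kappa(matrix)
```

    Here 85% of the objects are classified correctly, but because half of that agreement would be expected by chance, kappa is lower than the overall accuracy.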

  16. Colorectal Cancer Classification and Cell Heterogeneity: A Systems Oncology Approach

    PubMed Central

    Blanco-Calvo, Moisés; Concha, Ángel; Figueroa, Angélica; Garrido, Federico; Valladares-Ayerbes, Manuel

    2015-01-01

    Colorectal cancer is a heterogeneous disease that manifests through diverse clinical scenarios. For many years, our knowledge about the variability of colorectal tumors was limited to the histopathological analysis from which generic classifications associated with different clinical expectations are derived. However, we are now beginning to understand that beneath the intense pathological and clinical variability of these tumors lies strong genetic and biological heterogeneity. Thus, with the increasing availability of information on inter-tumor and intra-tumor heterogeneity, the classical pathological approach is being displaced in favor of novel molecular classifications. In the present article, we summarize the most relevant proposals of molecular classifications obtained from the analysis of colorectal tumors using powerful high-throughput techniques and devices. We also discuss the role that cancer systems biology may play in the integration and interpretation of the large amount of data generated and the challenges to be addressed in the future development of precision oncology. In addition, we review the current state of implementation of these novel tools in the pathological laboratory and in clinical practice. PMID:26084042

  17. Prediction of an Epidemic Curve: A Supervised Classification Approach

    PubMed Central

    Nsoesie, Elaine O.; Beckman, Richard; Marathe, Madhav; Lewis, Bryan

    2012-01-01

    Classification methods are widely used for identifying underlying groupings within datasets and predicting the class for new data objects given a trained classifier. This study introduces a project aimed at using a combination of simulations and classification techniques to predict epidemic curves and infer underlying disease parameters for an ongoing outbreak. Six supervised classification methods (random forest, support vector machines, nearest neighbor with three decision rules, linear and flexible discriminant analysis) were used in identifying partial epidemic curves from six agent-based stochastic simulations of influenza epidemics. The accuracy of the methods was compared using a performance metric based on the McNemar test. The findings showed that: (1) assumptions made by the methods regarding the structure of an epidemic curve influence their performance, i.e., methods with fewer assumptions perform best, (2) the performance of most methods is consistent across different individual-based networks for Seattle, Los Angeles and New York and (3) combining classifiers using a weighting approach does not guarantee better prediction. PMID:22997545

  18. A new classification approach for detecting severe weather patterns

    NASA Astrophysics Data System (ADS)

    Teixeira de Lima, Glauston R.; Stephany, Stephan

    2013-08-01

    Early detection of possible occurrences of severe convective events would be useful in order to avoid, or at least mitigate, the environmental and socio-economic damages caused by such events. However, the enormous volume of meteorological data currently available makes its analysis by meteorologists difficult, if not impossible. In addition, severe convective events may occur in very different spatial and temporal scales, precluding their early and accurate prediction. In this work, we propose an innovative approach for the classification of meteorological data based on the frequency of occurrence of the values of different variables provided by a weather forecast model. It is possible to identify patterns that may be associated with severe convective activity. In the considered classification problem, the information attributes are variables output by the weather forecast model Eta, while the decision attribute is given by the density of occurrence of cloud-to-ground atmospheric electrical discharges, assumed to be correlated with the level of convective activity. Results show good classification performance for some selected mini-regions of Brazil during the summer of 2007. We expect that the screening of the outputs of the meteorological model Eta by the proposed classifier could serve as a support tool for meteorologists in order to identify in advance patterns associated with severe convective events.

  19. AutoClass: A Bayesian Approach to Classification

    NASA Technical Reports Server (NTRS)

    Stutz, John; Cheeseman, Peter; Hanson, Robin; Taylor, Will; Lum, Henry, Jr. (Technical Monitor)

    1994-01-01

    We describe a Bayesian approach to the untutored discovery of classes in a set of cases, sometimes called finite mixture separation or clustering. The main difference between clustering and our approach is that we search for the "best" set of class descriptions rather than grouping the cases themselves. We describe our classes in terms of a probability distribution or density function, and the locally maximal posterior probability valued function parameters. We rate our classifications with an approximate joint probability of the data and functional form, marginalizing over the parameters. Approximation is necessitated by the computational complexity of the joint probability. Thus, we marginalize w.r.t. local maxima in the parameter space. We discuss the rationale behind our approach to classification. We give the mathematical development for the basic mixture model and describe the approximations needed for computational tractability. We instantiate the basic model with the discrete Dirichlet distribution and multivariant Gaussian density likelihoods. Then we show some results for both constructed and actual data.
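
    In symbols, the finite mixture model and the marginalization over parameters described above can be sketched as follows. The notation (x_i for a case, pi_j and theta_j for class weights and class parameters, J for the number of classes, T for the functional form, V for the full parameter vector) is assumed here for illustration and is not taken from the report.

```latex
% Likelihood of a single case x_i under a mixture of J classes
\[
  p(x_i \mid \pi, \theta) = \sum_{j=1}^{J} \pi_j \, p(x_i \mid \theta_j),
  \qquad \sum_{j=1}^{J} \pi_j = 1 .
\]
% Joint probability of the data X and the functional form T,
% obtained by marginalizing over the parameters V
\[
  p(X, T) = p(T) \int p(X \mid V, T) \, p(V \mid T) \, dV .
\]
```

    The integral is the quantity the abstract says must be approximated, since it is generally intractable in closed form.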

  20. Human and tree classification based on a model using 3D ladar in a GPS-denied environment

    NASA Astrophysics Data System (ADS)

    Cho, Kuk; Baeg, Seung-Ho; Park, Sangdeok

    2013-05-01

    This study explained a method to classify humans and trees by extracting their geometric and statistical features from data obtained from 3D LADAR. In a wooded GPS-denied environment, it is difficult to identify the location of unmanned ground vehicles and it is also difficult to properly recognize the environment in which these vehicles move. In this study, using the point cloud data obtained via 3D LADAR, a method to extract the features of humans, trees, and other objects within an environment was implemented and verified through the processes of segmentation, feature extraction, and classification. First, for the segmentation, the radially bounded nearest neighbor method was applied. Second, for the feature extraction, each segmented object was divided into three parts, and then their geometrical and statistical features were extracted. A human was divided into three parts: the head, trunk and legs. A tree was also divided into three parts: the top, middle, and bottom. The geometric features were the variance of the x-y data for the center of each part in an object, using the distance between the two central points for each part, using K-means clustering. The statistical features were the variance of each of the parts. In this study, three, six and six features of data were extracted, respectively, resulting in a total of 15 features. Finally, after training the extracted data via an artificial network, new data were classified. This study showed the results of an experiment that applied the proposed algorithm with a vehicle equipped with 3D LADAR in a thickly forested area, which is a GPS-denied environment. A total of 5,158 segments were obtained and the classification rates for humans and trees were 82.9% and 87.4%, respectively.

  1. Application of object-oriented method for classification of VHR satellite images using rule-based approach and texture measures

    NASA Astrophysics Data System (ADS)

    Lewinski, S.; Bochenek, Z.; Turlej, K.

    2010-01-01

    A new approach for classification of high-resolution satellite images is presented in the article. The approach has been developed at the Institute of Geodesy and Cartography, Warsaw, within the Geoland 2 project - SATChMo Core Mapping Service. The classification algorithm, aimed at recognition of generic land cover categories, has been elaborated using the object-oriented approach. Its functionality was tested on the basis of KOMPSAT-2 satellite images, recorded in four multispectral bands (4 m ground resolution) and in panchromatic mode (1 m ground resolution). The structure of the algorithm resembles a decision tree and consists of a sequence of processes. The main assumption of the presented approach is to divide image contents into objects characterized by high and low texture measures. The texture measures are generated on the basis of a panchromatic image transformed by Sigma filters. Objects belonging to the so-called high-texture group are classified in the first steps. In the following steps the classification of the remaining objects takes place. Applying parametric criteria of recognition to the first group of objects, four generic land cover classes are classified: forests, sparse woody vegetation, urban / artificial areas and bare ground. Non-classified areas are automatically assigned to the second group of objects, which contains water and agricultural land. In the course of the classification process a few segmentations are performed, which are dedicated to particular land cover categories. Classified objects smaller than 0.25 ha are removed in the process of generalization.

  2. Active optical sensors for tree stem detection and classification in nurseries.

    PubMed

    Garrido, Miguel; Perez-Ruiz, Manuel; Valero, Constantino; Gliever, Chris J; Hanson, Bradley D; Slaughter, David C

    2014-01-01

    Active optical sensing (LIDAR and light curtain transmission) devices mounted on a mobile platform can correctly detect, localize, and classify trees. To conduct an evaluation and comparison of the different sensors, an optical encoder wheel was used for vehicle odometry and provided a measurement of the linear displacement of the prototype vehicle along a row of tree seedlings as a reference for each recorded sensor measurement. The field trials were conducted in a juvenile tree nursery with one-year-old grafted almond trees at Sierra Gold Nurseries, Yuba City, CA, United States. Through these tests and subsequent data processing, each sensor was individually evaluated to characterize its reliability, as well as its advantages and disadvantages for the proposed task. Test results indicated that 95.7% and 99.48% of the trees were successfully detected with the LIDAR and light curtain sensors, respectively. LIDAR correctly classified trees as alive or dead at a 93.75% success rate, compared to 94.16% for the light curtain sensor. These results can help system designers select the most reliable sensor for the accurate detection and localization of each tree in a nursery, which might allow labor-intensive tasks, such as weeding, to be automated without damaging crops. PMID:24949638

  3. Active Optical Sensors for Tree Stem Detection and Classification in Nurseries

    PubMed Central

    Garrido, Miguel; Perez-Ruiz, Manuel; Valero, Constantino; Gliever, Chris J.; Hanson, Bradley D.; Slaughter, David C.

    2014-01-01

    Active optical sensing (LIDAR and light curtain transmission) devices mounted on a mobile platform can correctly detect, localize, and classify trees. To conduct an evaluation and comparison of the different sensors, an optical encoder wheel was used for vehicle odometry and provided a measurement of the linear displacement of the prototype vehicle along a row of tree seedlings as a reference for each recorded sensor measurement. The field trials were conducted in a juvenile tree nursery with one-year-old grafted almond trees at Sierra Gold Nurseries, Yuba City, CA, United States. Through these tests and subsequent data processing, each sensor was individually evaluated to characterize its reliability, as well as its advantages and disadvantages for the proposed task. Test results indicated that 95.7% and 99.48% of the trees were successfully detected with the LIDAR and light curtain sensors, respectively. LIDAR correctly classified trees as alive or dead at a 93.75% success rate, compared to 94.16% for the light curtain sensor. These results can help system designers select the most reliable sensor for the accurate detection and localization of each tree in a nursery, which might allow labor-intensive tasks, such as weeding, to be automated without damaging crops. PMID:24949638

  4. Reflectance properties of West African savanna trees from ground radiometer measurements. II - Classification of components

    NASA Technical Reports Server (NTRS)

    Hanan, N. P.; Prince, S. D.; Franklin, J.

    1993-01-01

    A pole-mounted radiometer was used to measure the reflectance properties in the red and near-IR of three Sahelian tree species. These properties are classified depending on their location over the canopy. A geometrical description of the patterns of shadow and sunlight on and beneath a model tree when viewed from above is given, and six components are defined. Tree canopies are found to be dark in the red waveband with respect to the soil, but have little or no effect on the near-IR.

  5. Classification of Bent-Double Galaxies: Experiences with Ensembles of Decision Trees

    SciTech Connect

    Kamath, C; Cantu-Paz, E

    2002-01-08

    In earlier work, we have described our experiences with the use of decision tree classifiers to identify radio-emitting galaxies with a bent-double morphology in the FIRST astronomical survey. We now extend this work to include ensembles of decision tree classifiers, including two algorithms developed by us. These algorithms randomize the decision at each node of the tree, and because they consider fewer candidate splitting points, are faster than other methods for creating ensembles. The experiments presented in this paper with our astronomy data show that our algorithms are competitive in accuracy, but faster than other ensemble techniques such as Boosting, Bagging, and Arcx4 with different split criteria.

  6. A triple stable isotope approach in tree rings for detecting the impact of nitrogen emissions on tree physiology

    NASA Astrophysics Data System (ADS)

    Guerrieri, M. R.; Siegwolf, R. T. W.; Saurer, M.; Jaeggi, M.; Cherubini, P.; Ripullone, F.; Borghetti, M.

    2009-04-01

    Over the last decades, human activities have contributed to increasing reactive nitrogen (N) in the atmosphere (such as NOx and NHx compounds) and their deposition on terrestrial ecosystems. The relevance of the current N deposition (Ndep) on carbon (C) sequestration has lately been questioned by both experimental and modelling approaches. Widely different estimates of C sensitivity to Ndep have been reported in recent investigations (Magnani et al., 2007; Högberg 2007; De Vries et al. 2008; Magnani et al. 2008; Sutton et al. 2008), which highlights the need for a thorough re-assessment of all the physiological mechanisms and processes involved. The impact of Ndep on forest ecosystems can be investigated near the pollution sources, where the effects are expected to be easily detectable. Therefore, tree rings represent a valuable archive for disturbances due to pollution events, which can be detected by combining d13C, d18O, d15N and dendrochronological approaches. The aim of this research was to investigate the impact of long term exposure to NOx emissions on two tree species, namely: a broadleaved species (Quercus cerris) that was located close to an oil refinery in Southern Italy, and a coniferous species (Picea abies) located close to a freeway in Switzerland. The analysis of d15N in tree rings allowed the input of N from anthropogenic emissions to be detected. Further, variations in the ratio of intercellular and ambient CO2 concentrations (ci/ca) and the distinction between stomatal (gs) and photosynthetic (A) responses to NOx emissions in trees were assessed using a conceptual model (Scheidegger et al., 2000), which combines d13C and d18O in tree rings. The strongest fingerprint of N emissions was detected for Q. cerris at the oil refinery site, as assessed by d15N. Long-term exposure to NOx emissions had a different impact on the ci/ca ratio in the two experimental sites: at the oil refinery (Quercus cerris), gs influenced ci/ca more, as assessed by d18O, while at

  7. GIS-based groundwater potential mapping using boosted regression tree, classification and regression tree, and random forest machine learning models in Iran.

    PubMed

    Naghibi, Seyed Amir; Pourghasemi, Hamid Reza; Dixon, Barnali

    2016-01-01

    Groundwater is considered one of the most valuable fresh water resources. The main objective of this study was to produce groundwater spring potential maps in the Koohrang Watershed, Chaharmahal-e-Bakhtiari Province, Iran, using three machine learning models: boosted regression tree (BRT), classification and regression tree (CART), and random forest (RF). Thirteen hydrological-geological-physiographical (HGP) factors that influence locations of springs were considered in this research. These factors include slope degree, slope aspect, altitude, topographic wetness index (TWI), slope length (LS), plan curvature, profile curvature, distance to rivers, distance to faults, lithology, land use, drainage density, and fault density. Subsequently, groundwater spring potential was modeled and mapped using CART, RF, and BRT algorithms. The predicted results from the three models were validated using the receiver operating characteristics curve (ROC). From 864 springs identified, 605 (≈70 %) locations were used for the spring potential mapping, while the remaining 259 (≈30 %) springs were used for the model validation. The area under the curve (AUC) for the BRT model was calculated as 0.8103 and for CART and RF the AUC were 0.7870 and 0.7119, respectively. Therefore, it was concluded that the BRT model produced the best prediction results while predicting locations of springs followed by CART and RF models, respectively. Geospatially integrated BRT, CART, and RF methods proved to be useful in generating the spring potential map (SPM) with reasonable accuracy. PMID:26687087
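
    For readers unfamiliar with the validation metric, here is a minimal sketch of how an AUC value can be computed from validation labels and model scores, using the rank-based (Mann-Whitney) formulation. It is not the authors' procedure; the function name `roc_auc` and the toy data are assumptions for illustration.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: iterable of 1 (spring present) or 0 (spring absent)
    scores: model outputs, higher meaning "more likely a spring"
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # Each positive/negative pair scores 1 if ranked correctly, 0.5 if tied
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example: three of the four positive/negative pairs are ranked correctly
print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))  # 0.75
```

    An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the three reported values are compared.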

  8. Hydrometeor classification from polarimetric radar measurements: a clustering approach

    NASA Astrophysics Data System (ADS)

    Grazioli, J.; Tuia, D.; Berne, A.

    2015-01-01

    A data-driven approach to the classification of hydrometeors from measurements collected with polarimetric weather radars is proposed. In a first step, the optimal number of hydrometeor classes (nopt) that can be reliably identified from a large set of polarimetric data is determined. This is done by means of an unsupervised clustering technique guided by criteria related both to data similarity and to spatial smoothness of the classified images. In a second step, the nopt clusters are assigned to the appropriate hydrometeor class by means of human interpretation and comparisons with the output of other classification techniques. The main innovation in the proposed method is the unsupervised part: the hydrometeor classes are not defined a priori, but they are learned from data. The approach is applied to data collected by an X-band polarimetric weather radar during two field campaigns (from which about 50 precipitation events are used in the present study). Seven hydrometeor classes (nopt = 7) have been found in the data set, and they have been identified as light rain (LR), rain (RN), heavy rain (HR), melting snow (MS), ice crystals/small aggregates (CR), aggregates (AG), and rimed-ice particles (RI).

  9. Hydrometeor classification from polarimetric radar measurements: a clustering approach

    NASA Astrophysics Data System (ADS)

    Grazioli, J.; Tuia, D.; Berne, A.

    2014-08-01

    A data-driven approach to the classification of hydrometeors from measurements collected with polarimetric weather radars is proposed. In a first step, the optimal number nopt of hydrometeor classes that can be reliably identified from a large set of polarimetric data is determined. This is done by means of an unsupervised clustering technique guided by criteria related both to data similarity and to spatial smoothness of the classified images. In a second step, the nopt clusters are assigned to the appropriate hydrometeor class by means of human interpretation and comparisons with the output of other classification techniques. The main innovation in the proposed method is the unsupervised part: the hydrometeor classes are not defined a-priori, but they are learned from data. The proposed approach is applied to data collected by an X-band polarimetric weather radar during two field campaigns (totalling about 3000 h of precipitation). Seven hydrometeor classes have been found in the data set and they have been associated to drizzle (DZ), light rain (LR), heavy rain (HR), melting snow (MS), ice crystals/small aggregates (CR), aggregates (AG), rimed particles (RI).

  10. ECOLOGICAL RESPONSE SURFACES FOR NORTH AMERICAN BOREAL TREE SPECIES AND THEIR USE IN FOREST CLASSIFICATION

    EPA Science Inventory

    Empirical ecological response surfaces were derived for eight dominant tree species in the boreal forest region of Canada. Stepwise logistic regression was used to model species dominance as a response to five climatic predictor variables. The predictor variables (annual snowfall, ...

  11. An improved classification tree analysis of high cost modules based upon an axiomatic definition of complexity

    NASA Technical Reports Server (NTRS)

    Tian, Jianhui; Porter, Adam; Zelkowitz, Marvin V.

    1992-01-01

    Identification of high cost modules has been viewed as one mechanism to improve overall system reliability, since such modules tend to produce more than their share of problems. A decision tree model was used to identify such modules. In this current paper, a previously developed axiomatic model of program complexity is merged with the previously developed decision tree process for an improvement in the ability to identify such modules. This improvement was tested using data from the NASA Software Engineering Laboratory.

  12. A comparison of feature selection methods for multitemporal tree species classification

    NASA Astrophysics Data System (ADS)

    Pipkins, Kyle; Förster, Michael; Clasen, Anne; Schmidt, Tobias; Kleinschmit, Birgit

    2014-10-01

    The problem of feature selection is a significant one in classification problems, where the addition of too many features to the classification fails to lead to significant increases in classification accuracy. This problem is especially significant within the context of multitemporal remote sensing classifications, where the costs and efforts associated with the acquisition of additional imagery can be extensive. It would thus be beneficial to identify the most important seasons for acquiring imagery for specific land cover types. This study uses a phenologically-adjusted 21 date RapidEye time-series in order to evaluate two methods of feature selection. The two methods compared in this study are a genetic algorithm (GA) and a semi-exhaustive method (EXH), both of which compare permutations of sequential date and band combinations. These methods are employed using a seven class support vector machine classification on a Normalized Difference Vegetation Index (NDVI)-transformed dataset. Overall accuracy (OAA) is used as the performance metric, and OAA significance is assessed using the McNemar test. The results from the feature selection methods are compared on the basis of phenological seasons selected across all iterations and the ideal number of combinations, based on the ratio of better performing classifications to all other classifications. The results suggest that the GA has a moderate but insignificant correlation when compared with the EXH for identifying ideal phenological seasons (overall Spearman's ρ= 0.60, p = 0.13), but is comparable when considering the number of seasons and image combinations.
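
    The NDVI transform applied to the imagery is the standard normalized difference of near-infrared and red reflectance. A minimal sketch follows; the function name `ndvi` and the choice to return 0.0 when both bands are zero are assumptions made here, not details from the study.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    denom = nir + red
    if denom == 0:
        return 0.0  # assumed convention for pixels with no signal in either band
    return (nir - red) / denom


def ndvi_series(nir_values, red_values):
    """NDVI for one pixel across a sequence of acquisition dates."""
    return [ndvi(n, r) for n, r in zip(nir_values, red_values)]
```

    In a multitemporal setting each acquisition date contributes one NDVI value per pixel, so choosing dates amounts to choosing which entries of this series are passed to the classifier.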

  13. A comparison of ARA and DNA data for microbial source tracking based on source-classification models developed using classification trees.

    PubMed

    Price, Bertram; Venso, Elichia; Frana, Mark; Greenberg, Joshua; Ware, Adam

    2007-08-01

    The literature on microbial source tracking (MST) suggests that DNA analysis of fecal samples leads to more reliable determinations of bacterial sources of surface water contamination than antibiotic resistance analysis (ARA). Our goal is to determine whether the increased reliability, if any, in library-based MST developed with DNA data is sufficient to justify its higher cost, where the bacteria source predictions are used in TMDL surface water management programs. We describe an application of classification trees for MST applied to ARA and DNA data from samples collected in the Potomac River Watershed in Maryland. Conclusions concerning the comparison of ARA and DNA data, although preliminary at the current time, suggest that the added cost of obtaining DNA data in comparison to the cost of ARA data may not be justified, where MST is applied in TMDL surface water management programs. PMID:17599384

  14. Rule based fuzzy logic approach for classification of fibromyalgia syndrome.

    PubMed

    Arslan, Evren; Yildiz, Sedat; Albayrak, Yalcin; Koklukaya, Etem

    2016-06-01

    Fibromyalgia syndrome (FMS) is a chronic muscle and skeletal system disease observed generally in women, manifesting itself with widespread pain and impairing the individual's quality of life. FMS diagnosis is made based on the American College of Rheumatology (ACR) criteria. However, recently the employability and sufficiency of ACR criteria are under debate. In this context, several evaluation methods, including clinical evaluation methods, were proposed by researchers. Accordingly, ACR had to update their criteria announced back in 1990, 2010 and 2011. The proposed rule-based fuzzy logic method aims to evaluate FMS from a different angle as well. This method contains a rule base derived from the 1990 ACR criteria and the individual experiences of specialists. The study was conducted using the data collected from 60 inpatients and 30 healthy volunteers. Several tests and physical examinations were administered to the participants. The fuzzy logic rule base was structured using the parameters of tender point count, chronic widespread pain period, pain severity, fatigue severity and sleep disturbance level, which were deemed important in FMS diagnosis. It has been observed that generally the fuzzy predictor was 95.56 % consistent with at least of the specialists, who are not a creator of the fuzzy rule base. Thus, in diagnosis classification where the severity of FMS was classified as well, consistent findings were obtained from the comparison of interpretations and experiences of specialists and the fuzzy logic approach. The study proposes a rule base, which could eliminate the shortcomings of 1990 ACR criteria during the FMS evaluation process. Furthermore, the proposed method presents a classification on the severity of the disease, which was not available with the ACR criteria. The study was not limited to only disease classification but at the same time the probability of occurrence and severity was classified. In addition, those who were not suffering from FMS were

  15. Target-classification approach applied to active UXO sites

    NASA Astrophysics Data System (ADS)

    Shubitidze, F.; Fernández, J. P.; Shamatava, Irma; Barrowes, B. E.; O'Neill, K.

    2013-06-01

    This study is designed to illustrate the discrimination performance at two UXO active sites (Oklahoma's Fort Sill and the Massachusetts Military Reservation) of a set of advanced electromagnetic induction (EMI) inversion/discrimination models which include the orthonormalized volume magnetic source (ONVMS), joint diagonalization (JD), and differential evolution (DE) approaches and whose power and flexibility greatly exceed those of the simple dipole model. The Fort Sill site is highly contaminated by a mix of the following types of munitions: 37-mm target practice tracers, 60-mm illumination mortars, 75-mm and 4.5'' projectiles, 3.5'', 2.36'', and LAAW rockets, antitank mine fuzes with and without hex nuts, practice MK2 and M67 grenades, 2.5'' ballistic windshields, M2A1-mines with/without bases, M19-14 time fuzes, and 40-mm practice grenades with/without cartridges. The MMR site contains targets of yet different sizes. In this work we apply our models to EMI data collected using the MetalMapper (MM) and 2 × 2 TEMTADS sensors. The data for each anomaly are inverted to extract estimates of the extrinsic and intrinsic parameters associated with each buried target. (The latter include the total volume magnetic source or NVMS, which relates to size, shape, and material properties; the former include location, depth, and orientation.) The estimated intrinsic parameters are then used for classification performed via library matching and the use of statistical classification algorithms; this process yielded prioritized dig-lists that were submitted to the Institute for Defense Analyses (IDA) for independent scoring. The models' classification performance is illustrated and assessed based on these independent evaluations.

  16. A regional classification scheme for estimating reference water quality in streams using land-use-adjusted spatial regression-tree analysis

    USGS Publications Warehouse

    Robertson, D.M.; Saad, D.A.; Heisey, D.M.

    2006-01-01

    Various approaches are used to subdivide large areas into regions containing streams that have similar reference or background water quality and that respond similarly to different factors. For many applications, such as establishing reference conditions, it is preferable to use physical characteristics that are not affected by human activities to delineate these regions. However, most approaches, such as ecoregion classifications, rely on land use to delineate regions or have difficulties compensating for the effects of land use. Land use not only directly affects water quality, but it is often correlated with the factors used to define the regions. In this article, we describe modifications to SPARTA (spatial regression-tree analysis), a relatively new approach applied to water-quality and environmental characteristic data to delineate zones with similar factors affecting water quality. In this modified approach, land-use-adjusted (residualized) water quality and environmental characteristics are computed for each site. Regression-tree analysis is applied to the residualized data to determine the most statistically important environmental characteristics describing the distribution of a specific water-quality constituent. Geographic information for small basins throughout the study area is then used to subdivide the area into relatively homogeneous environmental water-quality zones. For each zone, commonly used approaches are subsequently used to define its reference water quality and how its water quality responds to changes in land use. SPARTA is used to delineate zones of similar reference concentrations of total phosphorus and suspended sediment throughout the upper Midwestern part of the United States. © 2006 Springer Science+Business Media, Inc.

  17. Hierarchical Multinomial Processing Tree Models: A Latent-Class Approach

    ERIC Educational Resources Information Center

    Klauer, Karl Christoph

    2006-01-01

    Multinomial processing tree models are widely used in many areas of psychology. Their application relies on the assumption of parameter homogeneity, that is, on the assumption that participants do not differ in their parameter values. Tests for parameter homogeneity are proposed that can be routinely used as part of multinomial model analyses to…

  18. A Fault Tree Approach to Needs Assessment -- An Overview.

    ERIC Educational Resources Information Center

    Stephens, Kent G.

    A "failsafe" technology is presented based on a new unified theory of needs assessment. Basically the paper discusses fault tree analysis as a technique for enhancing the probability of success in any system by analyzing the most likely modes of failure that could occur and then suggesting high priority avoidance strategies for those failure…

  19. Hierarchical Multinomial Processing Tree Models: A Latent-Trait Approach

    ERIC Educational Resources Information Center

    Klauer, Karl Christoph

    2010-01-01

    Multinomial processing tree models are widely used in many areas of psychology. A hierarchical extension of the model class is proposed, using a multivariate normal distribution of person-level parameters with the mean and covariance matrix to be estimated from the data. The hierarchical model allows one to take variability between persons into…

  20. A Fault Tree Approach to Analysis of Organizational Communication Systems.

    ERIC Educational Resources Information Center

    Witkin, Belle Ruth; Stephens, Kent G.

    Fault Tree Analysis (FTA) is a method of examining communication in an organization by focusing on: (1) the complex interrelationships in human systems, particularly in communication systems; (2) interactions across subsystems and system boundaries; and (3) the need to select and "prioritize" channels which will eliminate noise in the system and…

  1. Comparison of four approaches to a rock facies classification problem

    USGS Publications Warehouse

    Dubois, M.K.; Bohling, G.C.; Chakrabarti, S.

    2007-01-01

    In this study, seven classifiers based on four different approaches were tested in a rock facies classification problem: classical parametric methods using Bayes' rule, and non-parametric methods using fuzzy logic, k-nearest neighbor, and feed forward-back propagating artificial neural network. Determining the most effective classifier for geologic facies prediction in wells without cores in the Panoma gas field, in Southwest Kansas, was the objective. Study data include 3600 samples with known rock facies class (from core) with each sample having either four or five measured properties (wire-line log curves), and two derived geologic properties (geologic constraining variables). The sample set was divided into two subsets, one for training and one for testing the ability of the trained classifier to correctly assign classes. Artificial neural networks clearly outperformed all other classifiers and are effective tools for this particular classification problem. Classical parametric models were inadequate due to the nature of the predictor variables (high dimensional and not linearly correlated), and feature space of the classes (overlapping). The other non-parametric methods tested, k-nearest neighbor and fuzzy logic, would need considerable improvement to match the neural network effectiveness, but further work, possibly combining certain aspects of the three non-parametric methods, may be justified. © 2006 Elsevier Ltd. All rights reserved.

  2. An Empirical Study of Different Approaches for Protein Classification

    PubMed Central

    Nanni, Loris

    2014-01-01

    Many domains would benefit from reliable and efficient systems for automatic protein classification. An area of particular interest in recent studies on automatic protein classification is the exploration of new methods for extracting features from a protein that work well for specific problems. These methods, however, are not generalizable and have proven useful in only a few domains. Our goal is to evaluate several feature extraction approaches for representing proteins by testing them across multiple datasets. Different types of protein representations are evaluated: those starting from the position specific scoring matrix of the proteins (PSSM), those derived from the amino-acid sequence, two matrix representations, and features taken from the 3D tertiary structure of the protein. We also test new variants of proteins descriptors. We develop our system experimentally by comparing and combining different descriptors taken from the protein representations. Each descriptor is used to train a separate support vector machine (SVM), and the results are combined by sum rule. Some stand-alone descriptors work well on some datasets but not on others. Through fusion, the different descriptors provide a performance that works well across all tested datasets, in some cases performing better than the state-of-the-art. PMID:25028675
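
    The sum-rule fusion step can be sketched as follows: each descriptor's classifier produces one score per class, the scores are added class by class, and the class with the largest total is chosen. This sketch assumes the per-classifier scores are already on a comparable scale (the abstract does not describe any normalization); the function name `sum_rule` and the example scores are hypothetical.

```python
def sum_rule(score_sets):
    """Fuse per-class scores from several classifiers by summation.

    score_sets: one list of per-class scores per classifier, all for the
    same sample and all with classes in the same order.
    Returns (index of the winning class, list of summed scores).
    """
    n_classes = len(score_sets[0])
    if any(len(s) != n_classes for s in score_sets):
        raise ValueError("every classifier must score the same set of classes")
    totals = [sum(s[k] for s in score_sets) for k in range(n_classes)]
    # Ties go to the lowest class index
    winner = max(range(n_classes), key=totals.__getitem__)
    return winner, totals


# Three descriptor-specific classifiers, two classes
winner, totals = sum_rule([[0.6, 0.4], [0.3, 0.7], [0.2, 0.8]])
print(winner)  # 1
```

    Here the first classifier alone would pick class 0, but the fused totals favor class 1, which is the sense in which fusion can compensate for a descriptor that works poorly on a given dataset.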

  3. Multinomial tree models for assessing the status of the reference in studies of the accuracy of tools for binary classification

    PubMed Central

    Botella, Juan; Huang, Huiling; Suero, Manuel

    2013-01-01

    Studies that evaluate the accuracy of binary classification tools are needed. Such studies provide 2 × 2 cross-classifications of test outcomes and the categories according to an unquestionable reference (or gold standard). However, sometimes a suboptimal reliability reference is employed. Several methods have been proposed to deal with studies where the observations are cross-classified with an imperfect reference. These methods require that the status of the reference, as a gold standard or as an imperfect reference, is known. In this paper a procedure for determining whether it is appropriate to maintain the assumption that the reference is a gold standard or an imperfect reference, is proposed. This procedure fits two nested multinomial tree models, and assesses and compares their absolute and incremental fit. Its implementation requires the availability of the results of several independent studies. These should be carried out using similar designs to provide frequencies of cross-classification between a test and the reference under investigation. The procedure is applied in two examples with real data. PMID:24106484

  4. Robust Orbit Determination and Classification: A Learning Theoretic Approach

    NASA Astrophysics Data System (ADS)

    Sharma, S.; Cutler, J. W.

    2015-11-01

    Orbit determination involves estimation of a non-linear mapping from feature vectors associated with the position of the spacecraft to its orbital parameters. The de facto standard in orbit determination in real-world scenarios for spacecraft has been linearized estimators such as the extended Kalman filter. Such an estimator, while very accurate and convergent over its linear region, is hard to generalize over arbitrary gravitational potentials and diverse sets of measurements. It is also challenging to perform exact mathematical characterizations of the Kalman filter performance over such general systems. Here we present a new approach to orbit determination as a learning problem involving distribution regression and, also, for the multiple-spacecraft scenario, a transfer learning system for classification of feature vectors associated with spacecraft, and provide some associated analysis of such systems.

  5. A Visual Analytics Approach for Correlation, Classification, and Regression Analysis

    SciTech Connect

    Steed, Chad A; SwanII, J. Edward; Fitzpatrick, Patrick J.; Jankun-Kelly, T.J.

    2012-02-01

    New approaches that combine the strengths of humans and machines are necessary to equip analysts with the proper tools for exploring today's increasingly complex, multivariate data sets. In this paper, a novel visual data mining framework, called the Multidimensional Data eXplorer (MDX), is described that addresses the challenges of today's data by combining automated statistical analytics with a highly interactive parallel coordinates based canvas. In addition to several intuitive interaction capabilities, this framework offers a rich set of graphical statistical indicators, interactive regression analysis, visual correlation mining, automated axis arrangements and filtering, and data classification techniques. The current work provides a detailed description of the system as well as a discussion of key design aspects and critical feedback from domain experts.

  6. An Approach for Automatic Classification of Radiology Reports in Spanish.

    PubMed

    Cotik, Viviana; Filippo, Darío; Castaño, José

    2015-01-01

    Automatic detection of relevant terms in medical reports is useful for educational purposes and for clinical research. Natural language processing (NLP) techniques can be applied in order to identify them. In this work we present an approach to classify radiology reports written in Spanish into two sets: the ones that indicate pathological findings and the ones that do not. In addition, the entities corresponding to pathological findings are identified in the reports. We use RadLex, a lexicon of English radiology terms, and NLP techniques to identify the occurrence of pathological findings. Reports are classified using a simple algorithm based on the presence of pathological findings, negation and hedge terms. The implemented algorithms were tested with a test set of 248 reports annotated by an expert, obtaining a best result of 0.72 F1 measure. The output of the classification task can be used to look for specific occurrences of pathological findings. PMID:26262128
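
    A rule of the kind described in this abstract (a report is flagged when a finding term occurs without a nearby negation) might be sketched as follows. The term lists, the three-token window, and the function names are illustrative assumptions; the paper's actual lexicon (RadLex), negation handling, and hedge-term handling are not reproduced here.

```python
import re

# Hypothetical toy lexicons; the paper draws finding terms from RadLex.
FINDING_TERMS = {"quiste", "litiasis", "masa"}
NEGATION_TERMS = {"sin", "no"}
WINDOW = 3  # assumed number of preceding tokens checked for a negation


def has_pathological_finding(report: str) -> bool:
    """Return True if a finding term appears with no negation just before it."""
    tokens = re.findall(r"\w+", report.lower())
    for i, token in enumerate(tokens):
        if token in FINDING_TERMS:
            preceding = tokens[max(0, i - WINDOW):i]
            if not any(t in NEGATION_TERMS for t in preceding):
                return True
    return False


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 measure from true positive, false positive and false negative counts."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


print(has_pathological_finding("Se observa quiste renal"))
print(has_pathological_finding("Sin evidencia de litiasis"))
```

    Hedge terms, which the abstract also mentions, are omitted from this sketch; they could be handled with a third term list checked in the same window.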

  7. A Visual Analytics Approach for Correlation, Classification, and Regression Analysis

    SciTech Connect

    Steed, Chad A; SwanII, J. Edward; Fitzpatrick, Patrick J.; Jankun-Kelly, T.J.

    2013-01-01

    New approaches that combine the strengths of humans and machines are necessary to equip analysts with the proper tools for exploring today's increasingly complex, multivariate data sets. In this paper, a visual data mining framework, called the Multidimensional Data eXplorer (MDX), is described that addresses the challenges of today's data by combining automated statistical analytics with a highly interactive parallel coordinates based canvas. In addition to several intuitive interaction capabilities, this framework offers a rich set of graphical statistical indicators, interactive regression analysis, visual correlation mining, automated axis arrangements and filtering, and data classification techniques. This chapter provides a detailed description of the system as well as a discussion of key design aspects and critical feedback from domain experts.

  8. A Novel Approach on Designing Augmented Fuzzy Cognitive Maps Using Fuzzified Decision Trees

    NASA Astrophysics Data System (ADS)

    Papageorgiou, Elpiniki I.

    This paper proposes a new methodology for designing Fuzzy Cognitive Maps using crisp decision trees that have been fuzzified. A fuzzy cognitive map is a knowledge-based technique that works as an artificial cognitive network inheriting the main aspects of cognitive maps and artificial neural networks. Decision trees, on the other hand, are well known intelligent techniques that extract rules from both symbolic and numeric data. Fuzzy theoretical techniques are used to fuzzify crisp decision trees in order to soften decision boundaries at decision nodes inherent in this type of tree. Comparisons between crisp decision trees and the fuzzified decision trees suggest that the latter fuzzy tree is significantly more robust and produces more balanced decision making. The approach proposed in this paper could incorporate any type of fuzzy decision tree. Through this methodology, new linguistic weights were determined in the FCM model, thus producing an augmented FCM tool. The framework consists of a new fuzzy algorithm to generate linguistic weights that describe the cause-effect relationships among the concepts of the FCM model, from induced fuzzy decision trees.

  9. Increased tree establishment in Lithuanian peat bogs--insights from field and remotely sensed approaches.

    PubMed

    Edvardsson, Johannes; Šimanauskienė, Rasa; Taminskas, Julius; Baužienė, Ieva; Stoffel, Markus

    2015-02-01

    Over the past century an ongoing establishment of Scots pine (Pinus sylvestris L.), sometimes at accelerating rates, is noted at three studied Lithuanian peat bogs, namely Kerėplis, Rėkyva and Aukštumala, all representing different degrees of tree coverage and geographic settings. Present establishment rates seem to depend on tree density on the bog surface and are most significant at sparsely covered sites where about three-fourths of the trees have established since the mid-1990s, whereas the initial establishment in general was during the early to mid-19th century. Three methods were used to detect, compare and describe tree establishment: (1) tree counts in small plots, (2) dendrochronological dating of bog pine trees, and (3) interpretation of aerial photographs and historical maps of the study areas. In combination, the different approaches provide complementary information but also weigh up each other's drawbacks. Tree counts in plots provided a reasonable overview of age class distributions and enabled capturing of the most recently established trees with ages less than 50 years. The dendrochronological analysis yielded accurate tree ages and a good temporal resolution of long-term changes. Tree establishment and spread interpreted from aerial photographs and historical maps provided a good overview of tree spread and total affected area. It also helped to verify the results obtained with the other methods and to upscale the findings to the entire peat bogs. The ongoing spread of trees in predominantly undisturbed peat bogs is related to warmer and/or drier climatic conditions, and to a minor degree to land-use changes. Our results therefore provide valuable insights into vegetation changes in peat bogs, also with respect to bog response to ongoing and future climatic changes. PMID:25310886

  10. Addition of wsp sequences to the Wolbachia phylogenetic tree and stability of the classification.

    PubMed

    Pintureau, B; Chaudier, S; Lassablière, F; Charles, H; Grenier, S

    2000-10-01

    Wolbachia are symbiotic bacteria altering reproductive characters of numerous arthropods. Their most recent phylogeny and classification are based on sequences of the wsp gene. We sequenced the wsp gene from six Wolbachia strains infecting six Trichogramma species that live as egg parasitoids on many insects. This allows us to test the effect of the addition of sequences on the Wolbachia phylogeny and to check the classification of Wolbachia infecting Trichogramma. The six Wolbachia studied are classified in the B supergroup. They confirm the monophyletic structure of the B Wolbachia in Trichogramma but introduce small differences in the Wolbachia classification. Modifications include the definition of a new group, Sem, for Wolbachia of T. semblidis and the merging of the two closely related groups, Sib and Kay. Specific primers were determined and tested for the Sem group. PMID:11040288

  11. Identification, classification and differential expression of oleosin genes in tung tree (Vernicia fordii)

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Triacylglycerols (TAG) are the major molecules of energy storage in eukaryotes. TAG are packed in subcellular structures called oil bodies or lipid droplets. Oleosins (OLE) are the major proteins in plant oil bodies. Multiple isoforms of OLE are present in plants such as tung tree (Vernicia fordii),...

  12. Spectral difference analysis and airborne imaging classification for citrus greening infected trees

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Citrus greening, also called Huanglongbing (HLB), became a devastating disease spread through citrus groves in Florida, since it was first found in 2005. Multispectral (MS) and hyperspectral (HS) airborne images of citrus groves in Florida were acquired to detect citrus greening infected trees in 20...

  13. Chemical classification of cattle. 2. Phylogenetic tree and specific status of the Zebu.

    PubMed

    Manwell, C; Baker, C M

    1980-01-01

    Phylogenetic trees for the ten major breed groups of cattle were constructed by Farris's (1972) maximum parsimony method, or Fitch & Margoliash's (1967) method, which averages out the deviation over the entire assemblage. Both techniques yield essentially identical trees. The phylogenetic tree for the ten major cattle breed groups can be superimposed on a map of Europe and western Asia, the root of the tree being close to the 'fertile crescent' in Asia Minor, believed to be a primary centre of bovine domestication. For some but not all protein variants there is a cline of gene frequencies as one proceeds from the British Isles and northwest Europe towards southeast Europe and Asia Minor, with the most extreme gene frequencies in the Zebu breeds of India. It is not clear to what extent the observed clines are primary or secondary, i.e., consequent to the initial migrations of cattle towards the end of the Pleistocene or consequent to the many migrations of man with his domesticated cattle. Such clines as exist are not in themselves sufficient to prove either selection versus genetic drift or to establish taxonomic ranking. Contrary to some suggestions in the literature, the biochemical evidence supports Linnaeus's original conclusions: Bos taurus and Bos indicus are distinct species. PMID:7458002

  14. Classification of tissue pathological state using optical multiparametric monitoring approach

    NASA Astrophysics Data System (ADS)

    Kutai-Asis, Hofit; Kanter, Ido; Barbiro-Michaely, Efrat; Mayevsky, Avraham

    2008-12-01

    In order to diagnose the development of pathophysiological events in the brain, the evaluation of multiparametric data in real time is highly important. The current work presents a new approach of using cluster analysis for the evaluation of the relationship between mitochondrial NADH, tissue blood flow and hemoglobin oxygenation under various pathophysiological conditions. The Time-Sharing Fluorometer Reflectometer (TSFR) was used for monitoring of mitochondrial NADH, oxyhemoglobin (HbO2), and microcirculatory blood flow simultaneously at the same location from the rat or gerbil cortex. This allows a more accurate assessment of brain functions in real time and a better understanding of the relationship between tissue oxygen supply and demand. Moreover, in some pathophysiological cases, monitoring of only one or two parameters in the cerebral cortex may be misleading. The classification was based on the data collected in experiments where different pathophysiological conditions, such as anoxia, ischemia, and SD were used. These three parameters were plotted in three dimensions. The clustering approach results showed similar patterns in each type of treatment. The distribution of data points in space was used to define the spatial behavior of each treatment in order to produce an index for identifying different treatments. In conclusion, our present study offers a new approach of data analysis that can serve as a reliable tool for tissue pathophysiology.

  15. Identification, Classification and Differential Expression of Oleosin Genes in Tung Tree (Vernicia fordii)

    PubMed Central

    Cao, Heping; Zhang, Lin; Tan, Xiaofeng; Long, Hongxu; Shockey, Jay M.

    2014-01-01

    Triacylglycerols (TAG) are the major molecules of energy storage in eukaryotes. TAG are packed in subcellular structures called oil bodies or lipid droplets. Oleosins (OLE) are the major proteins in plant oil bodies. Multiple isoforms of OLE are present in plants such as tung tree (Vernicia fordii), whose seeds are rich in novel TAG with a wide range of industrial applications. The objectives of this study were to identify OLE genes, classify OLE proteins and analyze OLE gene expression in tung trees. We identified five tung tree OLE genes coding for small hydrophobic proteins. Genome-wide phylogenetic analysis and multiple sequence alignment demonstrated that the five tung OLE genes represented the five OLE subfamilies and all contained the “proline knot” motif (PX5SPX3P) shared among 65 OLE from 19 tree species, including the sequenced genomes of Prunus persica (peach), Populus trichocarpa (poplar), Ricinus communis (castor bean), Theobroma cacao (cacao) and Vitis vinifera (grapevine). Tung OLE1, OLE2 and OLE3 belong to the S type and OLE4 and OLE5 belong to the SM type of Arabidopsis OLE. TaqMan and SYBR Green qPCR methods were used to study the differential expression of OLE genes in tung tree tissues. Expression results demonstrated that 1) All five OLE genes were expressed in developing tung seeds, leaves and flowers; 2) OLE mRNA levels were much higher in seeds than leaves or flowers; 3) OLE1, OLE2 and OLE3 genes were expressed in tung seeds at much higher levels than OLE4 and OLE5 genes; 4) OLE mRNA levels rapidly increased during seed development; and 5) OLE gene expression was well-coordinated with tung oil accumulation in the seeds. These results suggest that tung OLE genes 1–3 probably play major roles in tung oil accumulation and/or oil body development. Therefore, they might be preferred targets for tung oil engineering in transgenic plants. PMID:24516650

  16. Impacts of age-dependent tree sensitivity and dating approaches on dendrogeomorphic time series of landslides

    NASA Astrophysics Data System (ADS)

    Šilhán, Karel; Stoffel, Markus

    2015-05-01

    Different approaches and thresholds have been utilized in the past to date landslides with growth ring series of disturbed trees. Past work was mostly based on conifer species because of their well-defined ring boundaries and the easy identification of compression wood after stem tilting. More recently, work has been expanded to include broad-leaved trees, which are thought to produce less and less evident reactions after landsliding. This contribution reviews recent progress made in dendrogeomorphic landslide analysis and introduces a new approach in which landslides are dated via ring eccentricity formed after tilting. We compare results of this new and the more conventional approaches. In addition, the paper also addresses tree sensitivity to landslide disturbance as a function of tree age and trunk diameter using 119 common beech (Fagus sylvatica L.) and 39 Crimean pine (Pinus nigra ssp. pallasiana) trees growing on two landslide bodies. The landslide events reconstructed with the classical approach (reaction wood) also appear as events in the eccentricity analysis, but the inclusion of eccentricity clearly allowed for more (162%) landslides to be detected in the tree-ring series. With respect to tree sensitivity, conifers and broad-leaved trees show the strongest reactions to landslides at ages comprised between 40 and 60 years, with a second phase of increased sensitivity in P. nigra at ages of ca. 120-130 years. These phases of highest sensitivities correspond with trunk diameters at breast height of 6-8 and 18-22 cm, respectively (P. nigra). This study thus calls for the inclusion of eccentricity analyses in future landslide reconstructions as well as for the selection of trees belonging to different age and diameter classes to allow for a well-balanced and more complete reconstruction of past events.

  17. Narrowing historical uncertainty: probabilistic classification of ambiguously identified tree species in historical forest survey data

    USGS Publications Warehouse

    Mladenoff, D.J.; Dahir, S.E.; Nordheim, E.V.; Schulte, L.A.; Guntenspergen, G.R.

    2002-01-01

    Historical data have increasingly become appreciated for insight into the past conditions of ecosystems. Uses of such data include assessing the extent of ecosystem change; deriving ecological baselines for management, restoration, and modeling; and assessing the importance of past conditions on the composition and function of current systems. One historical data set of this type is the Public Land Survey (PLS) of the United States General Land Office, which contains data on multiple tree species, sizes, and distances recorded at each survey point, located at half-mile (0.8 km) intervals on a 1-mi (1.6 km) grid. This survey method was begun in the 1790s on US federal lands extending westward from Ohio. Thus, the data have the potential of providing a view of much of the US landscape from the mid-1800s, and they have been used extensively for this purpose. However, historical data sources, such as those describing the species composition of forests, can often be limited in the detail recorded and the reliability of the data, since the information was often not originally recorded for ecological purposes. Forest trees are sometimes recorded ambiguously, using generic or obscure common names. For the PLS data of northern Wisconsin, USA, we developed a method to classify ambiguously identified tree species using logistic regression analysis, using data on trees that were clearly identified to species and a set of independent predictor variables to build the models. The models were first created on partial data sets for each species and then tested for fit against the remaining data. Validations were conducted using repeated, random subsets of the data. Model prediction accuracy ranged from 81% to 96% in differentiating congeneric species among oak, pine, ash, maple, birch, and elm. Major predictor variables were tree size, associated species, landscape classes indicative of soil type, and spatial location within the study region. Results help to clarify ambiguities

  18. Hydrometeor classification from polarimetric radar measurements: a clustering approach

    NASA Astrophysics Data System (ADS)

    Grazioli, Jacopo; Tuia, Devis; Berne, Alexis

    2015-04-01

    Hydrometeor classification is the process that aims at identifying the dominant type of hydrometeor (e.g. rain, hail, snow aggregates, graupel, ice crystals) in a domain covered by a polarimetric weather radar during precipitation. The techniques documented in the literature are mostly based on numerical simulations and fuzzy logic. This involves the arbitrary selection of a set of hydrometeor classes and the numerical simulation of theoretical radar observations associated to each class. The information derived from the simulation is then applied to actual radar measurements by means of fuzzy logic input-output association. This approach has some limitations: the number and type of the hydrometeor categories undergoing identification is selected arbitrarily and the scattering simulations are based on constraining assumptions, especially in case of solid hydrometeors. Furthermore, in presence of noise and uncertainties, it is not guaranteed that the selected hydrometeor classes can be effectively identified in actual observations. In the present work we propose a different starting point for the classification task, which is based on observations instead of numerical simulations. We provide criteria for the selection of the number of hydrometeor classes that can be identified, by looking at how polarimetric observations collected over different precipitation events form clusters in the multi-dimensional space of the polarimetric variables. Two datasets, collected by an X-band weather radar, are employed in the study. The first dataset covers mountainous weather conditions (Swiss Alps), while the second includes Mediterranean orographic precipitation events collected during the special observation period (SOP) 2012 of the HyMeX campaign. We employ an unsupervised hierarchical clustering method to group the observations into clusters and we introduce a spatial smoothness constraint for the groups, assuming that the hydrometeor type changes smoothly in space

  19. Identification of sexually abused female adolescents at risk for suicidal ideations: a classification and regression tree analysis.

    PubMed

    Brabant, Marie-Eve; Hébert, Martine; Chagnon, François

    2013-01-01

    This study explored the clinical profiles of 77 female teenager survivors of sexual abuse and examined the association of abuse-related and personal variables with suicidal ideations. Analyses revealed that 64% of participants experienced suicidal ideations. Findings from classification and regression tree analysis indicated that depression, posttraumatic stress symptoms, and hopelessness discriminated profiles of suicidal and nonsuicidal survivors. The elevated prevalence of suicidal ideations among adolescent survivors of sexual abuse underscores the importance of investigating the presence of suicidal ideations in sexual abuse survivors. However, suicidal ideation is not the sole variable that needs to be investigated; depression, hopelessness and posttraumatic stress symptoms are also related to suicidal ideations in survivors and could therefore guide interventions. PMID:23428149

  20. Simple, novel approaches to investigating biophysical characteristics of individual mid-latitude deciduous trees

    NASA Astrophysics Data System (ADS)

    Kalibo, Humphrey Wafula

    Forests play a critical role in the functioning of the biosphere and support the livelihoods of millions of people. With increasing anthropogenic influences and looming effects associated with climatic variability, it is crucial that the research community and policy makers take advantage of the capabilities afforded by remote sensing technologies to generate reliable and timely data to support management decisions. Set in the species-rich woodland of Prairie Pines in Lincoln, Nebraska, this research addresses three distinct objectives that could contribute towards forest research and management. First, three supervised classification algorithms were applied to two hyperspectral AISA-Eagle images to evaluate their capability for spectrally identifying selected tree species. The findings show that each algorithm had low to moderate overall classification accuracies (46%-62%), probably due to mixed pixels resulting from pronounced heterogeneity in tree diversity; however, the algorithms could be a rapid means to assess species composition. The second objective is an investigation into how twelve individual morphologically different deciduous trees transmit incoming photosynthetically active radiation (PAR) over the course of the growing season. It was found that more diffuse light was transmitted than direct light, dictated by seasonality, vegetation fraction (VF), and leaf size. In the final objective, VF derived from upward-looking hemispherical photographs of twelve deciduous tree canopies and eight spectral vegetation indices (VIs) calculated from in situ single leaf-level reflectance data were used to investigate whether the VIs could mimic and estimate the temporal patterns of measured VF of each tree over the growing season. The findings show that all the indices accurately depicted the temporal patterns of the photo-derived VF. NDVI and SAVI had the highest correlations (R² > 0.7; RMSE 0.7; E > 0.8) and closely mirrored the temporal patterns of VF for nine

  1. The Iqmulus Urban Showcase: Automatic Tree Classification and Identification in Huge Mobile Mapping Point Clouds

    NASA Astrophysics Data System (ADS)

    Böhm, J.; Bredif, M.; Gierlinger, T.; Krämer, M.; Lindenberg, R.; Liu, K.; Michel, F.; Sirmacek, B.

    2016-06-01

    Current 3D data capturing as implemented on for example airborne or mobile laser scanning systems is able to efficiently sample the surface of a city by billions of unselective points during one working day. What is still difficult is to extract and visualize meaningful information hidden in these point clouds with the same efficiency. This is where the FP7 IQmulus project enters the scene. IQmulus is an interactive facility for processing and visualizing big spatial data. In this study the potential of IQmulus is demonstrated on a laser mobile mapping point cloud of 1 billion points sampling ~ 10 km of street environment in Toulouse, France. After the data is uploaded to the IQmulus Hadoop Distributed File System, a workflow is defined by the user consisting of retiling the data followed by a PCA driven local dimensionality analysis, which runs efficiently on the IQmulus cloud facility using a Spark implementation. Points scattering in 3 directions are clustered in the tree class, and are separated next into individual trees. Five hours of processing at the 12 node computing cluster results in the automatic identification of 4000+ urban trees. Visualization of the results in the IQmulus fat client helps users to appreciate the results, and developers to identify remaining flaws in the processing workflow.

  2. Tree carbon allocation dynamics determined using a carbon mass balance approach.

    PubMed

    Klein, Tamir; Hoch, Günter

    2015-01-01

    Tree internal carbon (C) fluxes between compound and compartment pools are difficult to measure directly. Here we used a C mass balance approach to decipher these fluxes and provide a full description of tree C allocation dynamics. We collected independent measurements of tree C sinks, source and pools in Pinus halepensis in a semi-arid forest, and converted all fluxes to g C per tree d(-1). Using this data set, a process flowchart was created to describe and quantify the tree C allocation on diurnal to annual time-scales. The annual C source of 24.5 kg C per tree yr(-1) was balanced by C sinks of 23.5 kg C per tree yr(-1), which partitioned into 70%, 17% and 13% between respiration, growth, and litter (plus export to soil), respectively. Large imbalances (up to 57 g C per tree d(-1)) were observed as C excess during the wet season, and as C deficit during the dry season. Concurrent changes in C reserves (starch) were sufficient to buffer these transient C imbalances. The C pool dynamics calculated using the flowchart were in general agreement with the observed pool sizes, providing confidence regarding our estimations of the timing, magnitude, and direction of the internal C fluxes. PMID:25157793
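
    The annual figures quoted in this abstract can be checked with a short arithmetic sketch. The numbers come from the abstract itself; the variable names are illustrative, and the calculation only restates the reported balance rather than reproducing the paper's flowchart model.

```python
# Annual carbon source and total sinks reported in the abstract (kg C per tree per year).
ANNUAL_SOURCE_KG = 24.5
ANNUAL_SINK_KG = 23.5

# Reported partitioning of the sinks.
SINK_SHARES = {
    "respiration": 0.70,
    "growth": 0.17,
    "litter_and_soil_export": 0.13,
}

# Absolute flux of each sink, derived from the reported shares.
sink_fluxes = {name: share * ANNUAL_SINK_KG for name, share in SINK_SHARES.items()}

# Residual between source and sinks on the annual scale.
annual_imbalance = ANNUAL_SOURCE_KG - ANNUAL_SINK_KG

for name, flux in sink_fluxes.items():
    print(f"{name}: {flux:.2f} kg C per tree per year")
print(f"source minus sinks: {annual_imbalance:.1f} kg C per tree per year")
```

    On these figures the annual residual is about 1 kg C per tree, roughly 4% of the source, which is small next to the seasonal imbalances the abstract describes.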

  3. An efficient approach to 3D single tree-crown delineation in LiDAR data

    NASA Astrophysics Data System (ADS)

    Mongus, Domen; Žalik, Borut

    2015-10-01

    This paper proposes a new method for 3D delineation of single tree-crowns in LiDAR data by exploiting the complementarities of treetop and tree trunk detections. A unified mathematical framework is provided based on graph theory, allowing for all the segmentations to be achieved using marker-controlled watersheds. Treetops are defined by detecting concave neighbourhoods within the canopy height model using locally fitted surfaces. These serve as markers for watershed segmentation of the canopy layer where possible oversegmentation is reduced by merging the regions based on their heights, areas, and shapes. Additional tree crowns are delineated from mid- and under-storey layers based on tree trunk detection. A new approach for estimating the verticalities of the points' distributions is proposed for this purpose. The watershed segmentation is then applied on a density function within the voxel space, while boundaries of delineated trees from the canopy layer are used to prevent the overspreading of regions. The experiments show an approximately 6% increase in the efficiency of the proposed treetop definition based on locally fitted surfaces in comparison with the traditionally used local maxima of the smoothed canopy height model. In addition, a 4% increase in the efficiency is achieved by the proposed tree trunk detection. Although the tree trunk detection alone is dependent on the data density, supplementing it with the treetop detection makes the proposed approach efficient even when dealing with low density point-clouds.
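
    For orientation, the baseline that this abstract compares against (treetops taken as local maxima of a canopy height model) can be sketched on a small grid. This is not the paper's method, which fits local surfaces; the grid values, the minimum-height threshold, and the function name are hypothetical, and smoothing of the height model is omitted.

```python
from typing import List, Tuple


def local_maxima(chm: List[List[float]], min_height: float = 2.0) -> List[Tuple[int, int]]:
    """Return (row, col) cells strictly higher than all of their 8-neighbours.

    chm is a canopy height model stored as a 2D grid of heights in metres.
    Cells below min_height (an assumed threshold) are ignored.
    """
    rows, cols = len(chm), len(chm[0])
    tops = []
    for r in range(rows):
        for c in range(cols):
            height = chm[r][c]
            if height < min_height:
                continue
            neighbours = [
                chm[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(height > n for n in neighbours):
                tops.append((r, c))
    return tops


# Hypothetical 4 x 4 canopy height model with two crowns.
chm = [
    [1.0, 2.0, 1.0, 0.5],
    [2.0, 9.0, 3.0, 1.0],
    [1.0, 3.0, 2.0, 6.0],
    [0.5, 1.0, 4.0, 5.0],
]
print(local_maxima(chm))
```

    Cells returned this way would serve as the markers for a marker-controlled watershed; the abstract reports that markers from locally fitted surfaces performed about 6% better than this kind of baseline.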

  4. Crop classification in the U.S. Corn Belt using MODIS imagery

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Land cover classification is essential in studies of land cover change, climate, hydrology, carbon sequestration and yield prediction. Land cover classification uses pattern recognition technique that includes supervised / unsupervised approaches and decision tree technique. Land cover maps for re...

  5. Assessing College Student Interest in Math and/or Computer Science in a Cross-National Sample Using Classification and Regression Trees

    ERIC Educational Resources Information Center

    Kitsantas, Anastasia; Kitsantas, Panagiota; Kitsantas, Thomas

    2012-01-01

    The purpose of this exploratory study was to assess the relative importance of a number of variables in predicting students' interest in math and/or computer science. Classification and regression trees (CART) were employed in the analysis of survey data collected from 276 college students enrolled in two U.S. and Greek universities. The…

  6. Assessment on the classification of landslide risk level using Genetic Algorithm of Operation Tree in central Taiwan

    NASA Astrophysics Data System (ADS)

    Wei, Chiang; Yeh, Hui-Chung; Chen, Yen-Chang

    2015-04-01

    This study assessed the classification of landslide areas by the Genetic Algorithm of Operation Tree (GAOT) in the upstream Chen-Yu-Lan River watershed of the National Taiwan University Experimental Forest (NTUEF) after Typhoon Morakot in 2009, using remotely sensed and geological data. Landslides covering 624.5 ha, accounting for 1.9% of the total area, were delineated with thresholds of slope (22°) and area size (1 hectare); 48 landslide sites were located in the upstream Chen-Yu-Lan watershed using FORMOSAT-II satellite imagery, aerial photos and GIS-related coverage. The five risk levels of these landslide areas were classified by area, elevation, slope order, aspect, erosion order and geological factor order using the Simplicity Method suggested in the Technical Regulations for Soil and Water Conservation of Taiwan. When all the landslide sites were considered, the classification accuracy of GAOT was 97.9%, superior to K-means, the Ward method, the Shared Nearest Neighbor method, the Maximum Likelihood Classifier and the Bayesian Classifier; when 36 sites were used as training samples and the remaining 12 sites were tested, the accuracy still reached 81.3%. More geological data, anthropogenic influences and hydrological factors may be necessary for clarifying the landslide areas, and the results benefit the authorities' assessment for future correction and management.

  7. Evaluation of Current Approaches to Stream Classification and a Heuristic Guide to Developing Classifications of Integrated Aquatic Networks

    NASA Astrophysics Data System (ADS)

    Melles, S. J.; Jones, N. E.; Schmidt, B. J.

    2014-03-01

    Conservation and management of fresh flowing waters involves evaluating and managing effects of cumulative impacts on the aquatic environment from disturbances such as: land use change, point and nonpoint source pollution, the creation of dams and reservoirs, mining, and fishing. To assess effects of these changes on associated biotic communities it is necessary to monitor and report on the status of lotic ecosystems. A variety of stream classification methods are available to assist with these tasks, and such methods attempt to provide a systematic approach to modeling and understanding complex aquatic systems at various spatial and temporal scales. Of the vast number of approaches that exist, it is useful to group them into three main types. The first involves modeling longitudinal species turnover patterns within large drainage basins and relating these patterns to environmental predictors collected at reach and upstream catchment scales; the second uses regionalized hierarchical classification to create multi-scale, spatially homogenous aquatic ecoregions by grouping adjacent catchments together based on environmental similarities; and the third approach groups sites together on the basis of similarities in their environmental conditions both within and between catchments, independent of their geographic location. We review the literature with a focus on more recent classifications to examine the strengths and weaknesses of the different approaches. We identify gaps or problems with the current approaches, and we propose an eight-step heuristic process that may assist with development of more flexible and integrated aquatic classifications based on the current understanding, network thinking, and theoretical underpinnings.

  8. One or Two Dimensions in Spontaneous Classification: A Simplicity Approach

    ERIC Educational Resources Information Center

    Pothos, Emmanuel M.; Close, James

    2008-01-01

    When participants are asked to spontaneously categorize a set of items, they typically produce unidimensional classifications, i.e., categorize the items on the basis of only one of their dimensions of variation. We examine whether it is possible to predict unidimensional vs. two-dimensional classification on the basis of the abstract stimulus…

  9. RAVEN. Dynamic Event Tree Approach Level III Milestone

    SciTech Connect

    Alfonsi, Andrea; Rabiti, Cristian; Mandelli, Diego; Cogliati, Joshua; Kinoshita, Robert

    2014-07-01

    Conventional Event-Tree (ET) based methodologies are extensively used as tools to perform reliability and safety assessment of complex and critical engineering systems. One of the disadvantages of these methods is that timing/sequencing of events and system dynamics are not explicitly accounted for in the analysis. In order to overcome these limitations several techniques, also known as Dynamic Probabilistic Risk Assessment (DPRA), have been developed. Monte-Carlo (MC) and Dynamic Event Tree (DET) are two of the most widely used DPRA methodologies to perform safety assessment of Nuclear Power Plants (NPP). In the past two years, the Idaho National Laboratory (INL) has developed its own tool to perform Dynamic PRA: RAVEN (Reactor Analysis and Virtual control ENvironment). RAVEN has been designed to perform two main tasks: 1) control logic driver for the new Thermo-Hydraulic code RELAP-7 and 2) post-processing tool. In the first task, RAVEN acts as a deterministic controller in which the set of control logic laws (user defined) monitors the RELAP-7 simulation and controls the activation of specific systems. Moreover, the control logic infrastructure is used to model stochastic events, such as component failures, and perform uncertainty propagation. Such stochastic modeling is deployed using both MC and DET algorithms. In the second task, RAVEN processes the large amount of data generated by RELAP-7 using data-mining based algorithms. This report focuses on the analysis of dynamic stochastic systems using the newly developed RAVEN DET capability. As an example, a DPRA analysis, using DET, of a simplified pressurized water reactor for a Station Black-Out (SBO) scenario is presented.

  10. RAVEN: Dynamic Event Tree Approach Level III Milestone

    SciTech Connect

    Andrea Alfonsi; Cristian Rabiti; Diego Mandelli; Joshua Cogliati; Robert Kinoshita

    2013-07-01

    Conventional Event-Tree (ET) based methodologies are extensively used as tools to perform reliability and safety assessment of complex and critical engineering systems. One of the disadvantages of these methods is that timing/sequencing of events and system dynamics are not explicitly accounted for in the analysis. In order to overcome these limitations several techniques, also known as Dynamic Probabilistic Risk Assessment (DPRA), have been developed. Monte-Carlo (MC) and Dynamic Event Tree (DET) are two of the most widely used DPRA methodologies to perform safety assessment of Nuclear Power Plants (NPP). In the past two years, the Idaho National Laboratory (INL) has developed its own tool to perform Dynamic PRA: RAVEN (Reactor Analysis and Virtual control ENvironment). RAVEN has been designed to perform two main tasks: 1) control logic driver for the new Thermo-Hydraulic code RELAP-7 and 2) post-processing tool. In the first task, RAVEN acts as a deterministic controller in which the set of control logic laws (user defined) monitors the RELAP-7 simulation and controls the activation of specific systems. Moreover, the control logic infrastructure is used to model stochastic events, such as component failures, and perform uncertainty propagation. Such stochastic modeling is deployed using both MC and DET algorithms. In the second task, RAVEN processes the large amount of data generated by RELAP-7 using data-mining based algorithms. This report focuses on the analysis of dynamic stochastic systems using the newly developed RAVEN DET capability. As an example, a DPRA analysis, using DET, of a simplified pressurized water reactor for a Station Black-Out (SBO) scenario is presented.

  11. Integrated Analysis of Tropical Trees Growth: A Multivariate Approach

    PubMed Central

    YÁÑEZ-ESPINOSA, LAURA; TERRAZAS, TERESA; LÓPEZ-MATA, LAURO

    2006-01-01

    • Background and Aims One of the problems in analysing cause–effect relationships of growth and environmental factors is that a single factor could be correlated with other ones directly influencing growth. One attempt to understand tropical trees' growth cause–effect relationships is integrating research about anatomical, physiological and environmental factors that influence growth in order to develop mathematical models. The relevance is to understand the nature of the process of growth and to model this as a function of the environment. • Methods The relationships of Aphananthe monoica, Pleuranthodendron lindenii and Psychotria costivenia radial growth and phenology with environmental factors (local climate, vertical strata microclimate and physical and chemical soil variables) were evaluated from April 2000 to September 2001. The association among these groups of variables was determined by generalized canonical correlation analysis (GCCA), which considers the probable associations of three or more data groups and the selection of the most important variables for each data group. • Key Results The GCCA allowed determination of a general model of relationships among tree phenology and radial growth with climate, microclimate and soil factors. A strong influence of climate on phenology and radial growth existed. Leaf initiation and cambial activity periods were associated with maximum temperature and day length, and vascular tissue differentiation with soil moisture and rainfall. The analyses of individual species detected different relationships for the three species. • Conclusions The analyses of the individual species suggest that each one takes advantage in a different way of the environment in which they are growing, allowing them to coexist. PMID:16822807

  12. Coal waste classification and approaches to utilization in China

    SciTech Connect

    Xu Zesheng; Yang Qiaowen; Wang Zuna

    1998-12-31

    The amounts of coal waste or coal refuse from mining and coal preparation are adding up rapidly in China because of the increased production of coal. The coal refuse disposed of in 1996 amounted to 610 million tons. The stockpiled coal refuse had reached 3 billion tons by the end of 1996, occupying an area of about 8,000 hectares. It is very important to classify coal refuse scientifically, including its chemical composition and physical chemistry, for proper treatment and comprehensive utilization. The goals of proper classification of coal waste are: first, to make full use of coal waste on the basis of its useful mineral content and grade; second, to advance utilization methods for coal waste that use no processing technology so as to save processing time and cost; third, to be able to determine relatively precisely coal waste quality and quantity so as to decrease manmade stockpiled coal waste mixtures and to be able to utilize all kinds of coal waste; and finally, to guide development of new refuse utilization approaches. According to the characteristics of coal waste resources in China, all coal wastes are classified into six main classes on the basis of their source and stockpiling situation. The six main classes are coal-heading coal waste, rock-heading coal waste, spontaneous combustion coal waste, mechanical separating coal waste, sorting coal waste and rock-stripping coal waste. Each is described.

  13. Single-cell approaches for molecular classification of endocrine tumors

    PubMed Central

    Koh, James; Allbritton, Nancy L.; Sosa, Julie A.

    2015-01-01

    Purpose of review In this review, we summarize recent developments in single-cell technologies that can be employed for the functional and molecular classification of endocrine cells in normal and neoplastic tissue. Recent findings The emergence of new platforms for the isolation, analysis, and dynamic assessment of individual cell identity and reactive behavior enables experimental deconstruction of intratumoral heterogeneity and other contexts, where variability in cell signaling and biochemical responsiveness inform biological function and clinical presentation. These tools are particularly appropriate for examining and classifying endocrine neoplasias, as the clinical sequelae of these tumors are often driven by disrupted hormonal responsiveness secondary to compromised cell signaling. Single-cell methods allow for multidimensional experimental designs incorporating both spatial and temporal parameters with the capacity to probe dynamic cell signaling behaviors and kinetic response patterns dependent upon sequential agonist challenge. Summary Intratumoral heterogeneity in the provenance, composition, and biological activity of different forms of endocrine neoplasia presents a significant challenge for prognostic assessment. Single-cell technologies provide an array of powerful new approaches uniquely well suited for dissecting complex endocrine tumors. Studies examining the relationship between clinical behavior and tumor compositional variations in cellular activity are now possible, providing new opportunities to deconstruct the underlying mechanisms of endocrine neoplasia. PMID:26632769

  14. Genome trees constructed using five different approaches suggest new major bacterial clades

    PubMed Central

    Wolf, Yuri I; Rogozin, Igor B; Grishin, Nick V; Tatusov, Roman L; Koonin, Eugene V

    2001-01-01

    Background The availability of multiple complete genome sequences from diverse taxa prompts the development of new phylogenetic approaches, which attempt to incorporate information derived from comparative analysis of complete gene sets or large subsets thereof. Such attempts are particularly relevant because of the major role of horizontal gene transfer and lineage-specific gene loss, at least in the evolution of prokaryotes. Results Five largely independent approaches were employed to construct trees for completely sequenced bacterial and archaeal genomes: i) presence-absence of genomes in clusters of orthologous genes; ii) conservation of local gene order (gene pairs) among prokaryotic genomes; iii) parameters of identity distribution for probable orthologs; iv) analysis of concatenated alignments of ribosomal proteins; v) comparison of trees constructed for multiple protein families. All constructed trees support the separation of the two primary prokaryotic domains, bacteria and archaea, as well as some terminal bifurcations within the bacterial and archaeal domains. Beyond these obvious groupings, the trees made with different methods appeared to differ substantially in terms of the relative contributions of phylogenetic relationships and similarities in gene repertoires caused by similar life styles and horizontal gene transfer to the tree topology. The trees based on presence-absence of genomes in orthologous clusters and the trees based on conserved gene pairs appear to be strongly affected by gene loss and horizontal gene transfer. The trees based on identity distributions for orthologs and particularly the tree made of concatenated ribosomal protein sequences seemed to carry a stronger phylogenetic signal. The latter tree supported three potential high-level bacterial clades: i) Chlamydia-Spirochetes, ii) Thermotogales-Aquificales (bacterial hyperthermophiles), and iii) Actinomycetes-Deinococcales-Cyanobacteria.
The latter group also appeared to join the

  15. Bayesian decision tree for the classification of the mode of motion in single-molecule trajectories.

    PubMed

    Türkcan, Silvan; Masson, Jean-Baptiste

    2013-01-01

    Membrane proteins move in heterogeneous environments with spatially (sometimes temporally) varying friction and with biochemical interactions with various partners. It is important to reliably distinguish different modes of motion to improve our knowledge of the membrane architecture and to understand the nature of interactions between membrane proteins and their environments. Here, we present an analysis technique for single molecule tracking (SMT) trajectories that can determine the preferred model of motion that best matches observed trajectories. The method is based on Bayesian inference to calculate the a posteriori probability of an observed trajectory according to a certain model. Information theory criteria, such as the Bayesian information criterion (BIC), the Akaike information criterion (AIC), and modified AIC (AICc), are used to select the preferred model. The considered group of models includes free Brownian motion, and confined motion in 2nd or 4th order potentials. We determine the best information criteria for classifying trajectories. We tested its limits through simulations matching large sets of experimental conditions and we built a decision tree. This decision tree first uses the BIC to distinguish between free Brownian motion and confined motion. In a second step, it classifies the confining potential further using the AIC. We apply the method to experimental Clostridium perfringens ε-toxin (CPεT) receptor trajectories to show that these receptors are confined by a spring-like potential. An adaptation of this technique was applied on a sliding window in the temporal dimension along the trajectory. We applied this adaptation to experimental CPεT trajectories that lose confinement due to disaggregation of confining domains. This new technique adds another dimension to the discussion of SMT data.
The mode of motion of a receptor might hold more biologically relevant information than the diffusion
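
    The two-step decision described in this record (BIC to separate free from confined motion, then AIC to pick the confining potential) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the model names "free", "harmonic" and "quartic", the parameter counts, and the log-likelihood values are hypothetical, and the standard textbook definitions of AIC, AICc and BIC are assumed.

```python
import math


def aic(log_likelihood, k):
    """Akaike information criterion for a model with k free parameters."""
    return 2 * k - 2 * log_likelihood


def aicc(log_likelihood, k, n):
    """AIC with the usual small-sample correction for n observations."""
    return aic(log_likelihood, k) + (2 * k * (k + 1)) / (n - k - 1)


def bic(log_likelihood, k, n):
    """Bayesian information criterion for n observations."""
    return k * math.log(n) - 2 * log_likelihood


def classify_motion(fits, n):
    """Two-step decision: BIC for free vs. confined, then AIC among confined.

    fits maps a model name to (maximised log-likelihood, parameter count).
    The key "free" (hypothetical name) denotes free Brownian motion; every
    other key is treated as a confined-motion model.
    """
    confined = {name: fit for name, fit in fits.items() if name != "free"}
    bic_free = bic(fits["free"][0], fits["free"][1], n)
    best_confined_bic = min(bic(ll, k, n) for ll, k in confined.values())
    if bic_free <= best_confined_bic:
        return "free"
    # Step 2: choose the confining potential with the lowest AIC.
    return min(confined, key=lambda name: aic(*confined[name]))


# Hypothetical fit results for one trajectory of 200 points.
example_fits = {
    "free": (-120.0, 1),
    "harmonic": (-100.0, 2),   # 2nd order potential
    "quartic": (-99.5, 2),     # 4th order potential
}
print(classify_motion(example_fits, n=200))  # -> quartic
```

    Lower values are better for all three criteria; BIC penalises extra parameters more strongly than AIC once the trajectory has more than about eight points.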

  16. A Method for Application of Classification Tree Models to Map Aquatic Vegetation Using Remotely Sensed Images from Different Sensors and Dates

    PubMed Central

    Jiang, Hao; Zhao, Dehua; Cai, Ying; An, Shuqing

    2012-01-01

    In previous attempts to identify aquatic vegetation from remotely-sensed images using classification trees (CT), the images used to apply CT models to different times or locations necessarily originated from the same satellite sensor as that from which the original images used in model development came, greatly limiting the application of CT. We have developed an effective normalization method to improve the robustness of CT models when applied to images originating from different sensors and dates. A total of 965 ground-truth samples of aquatic vegetation types were obtained in 2009 and 2010 in Taihu Lake, China. Using relevant spectral indices (SI) as classifiers, we manually developed a stable CT model structure and then applied a standard CT algorithm to obtain quantitative (optimal) thresholds from 2009 ground-truth data and images from Landsat7-ETM+, HJ-1B-CCD, Landsat5-TM and ALOS-AVNIR-2 sensors. Optimal CT thresholds produced average classification accuracies of 78.1%, 84.7% and 74.0% for emergent vegetation, floating-leaf vegetation and submerged vegetation, respectively. However, the optimal CT thresholds for different sensor images differed from each other, with an average relative variation (RV) of 6.40%. We developed and evaluated three new approaches to normalizing the images. The best-performing method (Method of 0.1% index scaling) normalized the SI images using tailored percentages of extreme pixel values. Using the images normalized by Method of 0.1% index scaling, CT models for a particular sensor in which thresholds were replaced by those from the models developed for images originating from other sensors provided average classification accuracies of 76.0%, 82.8% and 68.9% for emergent vegetation, floating-leaf vegetation and submerged vegetation, respectively. Applying the CT models developed for normalized 2009 images to 2010 images resulted in high classification (78.0%–93.3%) and overall (92.0%–93.1%) accuracies. Our results suggest
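
    The idea of normalizing a spectral index image by its extreme pixel values, so that one set of classification tree thresholds can be reused across sensors, can be sketched as below. This is an assumption-laden illustration: the record does not give the exact formula, so a linear rescale between the values that cut off the lowest and highest 0.1% of pixels is assumed, and the function name and the per-mille parameter are hypothetical.

```python
def percentile_scale(values, tail_per_mille=1):
    """Rescale index values to [0, 1] using extreme-pixel cut-offs.

    tail_per_mille=1 corresponds to 0.1% of the pixels at each end.
    Values beyond the cut-offs are clipped to 0 or 1.
    """
    ordered = sorted(values)
    n = len(ordered)
    cut = (n * tail_per_mille) // 1000  # number of pixels ignored at each end
    low = ordered[cut]
    high = ordered[n - 1 - cut]
    if high == low:
        raise ValueError("index image has no spread between the cut-offs")
    return [min(1.0, max(0.0, (v - low) / (high - low))) for v in values]


# Hypothetical index image flattened to a list: 998 ordinary pixels plus
# one extreme outlier at each end.
pixels = [-50.0] + [float(i) for i in range(998)] + [900000.0]
scaled = percentile_scale(pixels)
print(min(scaled), max(scaled))  # -> 0.0 1.0
```

    After such a rescale, a threshold learned on one sensor's normalized index is compared against values on the same 0 to 1 scale for another sensor, which is the property the record's cross-sensor experiments rely on.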

  17. Idiopathic interstitial pneumonias and emphysema: detection and classification using a texture-discriminative approach

    NASA Astrophysics Data System (ADS)

    Fetita, C.; Chang-Chien, K. C.; Brillet, P. Y.; Prêteux, F.; Chang, R. F.

    2012-03-01

    Our study aims at developing a computer-aided diagnosis (CAD) system for fully automatic detection and classification of pathological lung parenchyma patterns in idiopathic interstitial pneumonias (IIP) and emphysema using multi-detector computed tomography (MDCT). The proposed CAD system is based on three-dimensional (3-D) mathematical morphology, texture and fuzzy logic analysis, and can be divided into four stages: (1) a multi-resolution decomposition scheme based on a 3-D morphological filter was exploited to discriminate the lung region patterns at different analysis scales. (2) An additional spatial lung partitioning based on the lung tissue texture was introduced to reinforce the spatial separation between patterns extracted at the same resolution level in the decomposition pyramid. Then, (3) a hierarchic tree structure was exploited to describe the relationship between patterns at different resolution levels, and for each pattern, six fuzzy membership functions were established for assigning a probability of association with a normal tissue or a pathological target. Finally, (4) a decision step exploiting the fuzzy-logic assignments selects the target class of each lung pattern among the following categories: normal (N), emphysema (EM), fibrosis/honeycombing (FHC), and ground glass (GDG). According to a preliminary evaluation on an extended database, the proposed method can overcome the drawbacks of a previously developed approach and achieve higher sensitivity and specificity.

  18. New Approaches to Object Classification in Synoptic Sky Surveys

    SciTech Connect

    Donalek, C.; Mahabal, A.; Djorgovski, S. G.; Marney, S.; Drake, A.; Glikman, E.; Graham, M. J.; Williams, R.

    2008-12-05

    Digital synoptic sky surveys pose several new object classification challenges. In surveys where real-time detection and classification of transient events is a science driver, there is a need for an effective elimination of instrument-related artifacts which can masquerade as transient sources in the detection pipeline, e.g., unremoved large cosmic rays, saturation trails, reflections, crosstalk artifacts, etc. We have implemented such an Artifact Filter, using a supervised neural network, for the real-time processing pipeline in the Palomar-Quest (PQ) survey. After the training phase, for each object it takes as input a set of measured morphological parameters and returns the probability of it being a real object. Despite the relatively low number of training cases for many kinds of artifacts, the overall artifact classification rate is around 90%, with no genuine transients misclassified during our real-time scans. Another question is how to assign an optimal star-galaxy classification in a multi-pass survey, where seeing and other conditions change between different epochs, potentially producing inconsistent classifications for the same object. We have implemented a star/galaxy multipass classifier that makes use of external and a priori knowledge to find the optimal classification from the individually derived ones. Both these techniques can be applied to other, similar surveys and data sets.

  19. PoMo: An Allele Frequency-Based Approach for Species Tree Estimation

    PubMed Central

    De Maio, Nicola; Schrempf, Dominik; Kosiol, Carolin

    2015-01-01

    Incomplete lineage sorting can cause incongruencies of the overall species-level phylogenetic tree with the phylogenetic trees for individual genes or genomic segments. If these incongruencies are not accounted for, it is possible to incur several biases in species tree estimation. Here, we present a simple maximum likelihood approach that accounts for ancestral variation and incomplete lineage sorting. We use a POlymorphisms-aware phylogenetic MOdel (PoMo) that we have recently shown to efficiently estimate mutation rates and fixation biases from within and between-species variation data. We extend this model to perform efficient estimation of species trees. We test the performance of PoMo in several different scenarios of incomplete lineage sorting using simulations and compare it with existing methods both in accuracy and computational speed. In contrast to other approaches, our model does not use coalescent theory but is allele frequency based. We show that PoMo is well suited for genome-wide species tree estimation and that on such data it is more accurate than previous approaches. PMID:26209413

  20. Predictive mapping of soil organic carbon in wet cultivated lands using classification-tree based models: the case study of Denmark.

    PubMed

    Bou Kheir, Rania; Greve, Mogens H; Bøcher, Peder K; Greve, Mette B; Larsen, René; McCloy, Keith

    2010-05-01

    Soil organic carbon (SOC) is one of the most important carbon stocks globally and has large potential to affect global climate. Distribution patterns of SOC in Denmark constitute a nation-wide baseline for studies on soil carbon changes (with respect to Kyoto protocol). This paper predicts and maps the geographic distribution of SOC across Denmark using remote sensing (RS), geographic information systems (GISs) and decision-tree modeling (un-pruned and pruned classification trees). Seventeen parameters, i.e. parent material, soil type, landscape type, elevation, slope gradient, slope aspect, mean curvature, plan curvature, profile curvature, flow accumulation, specific catchment area, tangent slope, tangent curvature, steady-state wetness index, Normalized Difference Vegetation Index (NDVI), Normalized Difference Wetness Index (NDWI) and Soil Color Index (SCI) were generated to statistically explain SOC field measurements in the area of interest (Denmark). A large number of tree-based classification models (588) were developed using (i) all of the parameters, (ii) all Digital Elevation Model (DEM) parameters only, (iii) the primary DEM parameters only, (iv), the remote sensing (RS) indices only, (v) selected pairs of parameters, (vi) soil type, parent material and landscape type only, and (vii) the parameters having a high impact on SOC distribution in built pruned trees. The best constructed classification tree models (in the number of three) with the lowest misclassification error (ME) and the lowest number of nodes (N) as well are: (i) the tree (T1) combining all of the parameters (ME=29.5%; N=54); (ii) the tree (T2) based on the parent material, soil type and landscape type (ME=31.5%; N=14); and (iii) the tree (T3) constructed using parent material, soil type, landscape type, elevation, tangent slope and SCI (ME=30%; N=39). 
The produced SOC maps at 1:50,000 cartographic scale using these trees are highly matching with coincidence values equal to 90.5% (Map T1

  1. Mathematical Programming Approaches for the Classification Problem in Two-Group Discriminant Analysis.

    ERIC Educational Resources Information Center

    Joachimsthaler, Erich A.; Stam, Antonie

    1990-01-01

    Mathematical programming formulations are introduced as new approaches to solve the classification problem in discriminant analysis. The research literature is reviewed, and an illustration using a real-world classification problem is provided. Issues relevant to potential uses of these formulations are discussed. (TJH)

  2. Text Categorization Based on K-Nearest Neighbor Approach for Web Site Classification.

    ERIC Educational Resources Information Center

    Kwon, Oh-Woog; Lee, Jong-Hyeok

    2003-01-01

    Discusses text categorization and Web site classification and proposes a three-step classification system that includes the use of Web pages linked with the home page. Highlights include the k-nearest neighbor (k-NN) approach; improving performance with a feature selection method and a term weighting scheme using HTML tags; and similarity…

  3. Machine Learning Approaches for High-resolution Urban Land Cover Classification: A Comparative Study

    SciTech Connect

    Vatsavai, Raju; Chandola, Varun; Cheriyadat, Anil M; Bright, Eddie A; Bhaduri, Budhendra L; Graesser, Jordan B

    2011-01-01

    The proliferation of several machine learning approaches makes it difficult to identify a suitable classification technique for analyzing high-resolution remote sensing images. In this study, ten classification techniques were compared from five broad machine learning categories. Surprisingly, simple statistical classification schemes such as maximum likelihood and logistic regression perform very close to more complex and recent techniques. Given that these two classifiers require little input from the user, they should still be considered for most classification tasks. Multiple classifier systems are a good choice if resources permit.

  4. Neural network approaches versus statistical methods in classification of multisource remote sensing data

    NASA Technical Reports Server (NTRS)

    Benediktsson, Jon A.; Swain, Philip H.; Ersoy, Okan K.

    1990-01-01

    Neural network learning procedures and statistical classification methods are applied and compared empirically in classification of multisource remote sensing and geographic data. Statistical multisource classification by means of a method based on Bayesian classification theory is also investigated and modified. The modifications permit control of the influence of the data sources involved in the classification process. Reliability measures are introduced to rank the quality of the data sources. The data sources are then weighted according to these rankings in the statistical multisource classification. Four data sources are used in experiments: Landsat MSS data and three forms of topographic data (elevation, slope, and aspect). Experimental results show that the two different approaches have unique advantages and disadvantages in this classification application.
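
    Weighting data sources by reliability, as this record describes, can be illustrated with a weighted log-linear combination of per-source class probabilities. This is a sketch under stated assumptions rather than the authors' exact rule: the combination form (a logarithmic opinion pool), the class names, the probabilities and the weights are all hypothetical, and every probability is assumed to be strictly positive.

```python
import math


def combine_sources(posteriors, weights):
    """Combine per-source class probabilities with reliability weights.

    posteriors: list of dicts mapping class name -> probability, one dict
                per data source (all dicts share the same classes).
    weights:    one non-negative reliability weight per source; a weight
                of 0 removes that source's influence entirely.
    Returns a normalized dict of combined class probabilities.
    """
    classes = list(posteriors[0])
    scores = {
        c: sum(w * math.log(p[c]) for p, w in zip(posteriors, weights))
        for c in classes
    }
    top = max(scores.values())  # subtract the maximum for numerical stability
    unnormalized = {c: math.exp(s - top) for c, s in scores.items()}
    total = sum(unnormalized.values())
    return {c: v / total for c, v in unnormalized.items()}


# Hypothetical example: a spectral source and an elevation source disagree.
spectral = {"forest": 0.8, "water": 0.2}
elevation = {"forest": 0.3, "water": 0.7}

equal = combine_sources([spectral, elevation], [1.0, 1.0])
downweighted = combine_sources([spectral, elevation], [0.2, 1.0])
print(max(equal, key=equal.get))                # -> forest
print(max(downweighted, key=downweighted.get))  # -> water
```

    Lowering the weight of the spectral source flips the decision, which is the kind of control over source influence that the record attributes to the modified statistical method.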

  5. Biodiversity among Lactobacillus helveticus Strains Isolated from Different Natural Whey Starter Cultures as Revealed by Classification Trees

    PubMed Central

    Gatti, Monica; Trivisano, Carlo; Fabrizi, Enrico; Neviani, Erasmo; Gardini, Fausto

    2004-01-01

    Lactobacillus helveticus is a homofermentative thermophilic lactic acid bacterium used extensively for manufacturing Swiss type and aged Italian cheese. In this study, the phenotypic and genotypic diversity of strains isolated from different natural dairy starter cultures used for Grana Padano, Parmigiano Reggiano, and Provolone cheeses was investigated by a classification tree technique. A data set was used that consists of 119 L. helveticus strains, each of which was studied for its physiological characters, as well as surface protein profiles and hybridization with a species-specific DNA probe. The methodology employed in this work allowed the strains to be grouped into terminal nodes without difficult and subjective interpretation. In particular, good discrimination was obtained between L. helveticus strains isolated, respectively, from Grana Padano and from Provolone natural whey starter cultures. The method used in this work allowed identification of the main characteristics that permit discrimination of biotypes. In order to understand what kind of genes could code for phenotypes of technological relevance, evidence that specific DNA sequences are present only in particular biotypes may be of great interest. PMID:14711641

  6. Rapid Erosion Modeling in a Western Kenya Watershed using Visible Near Infrared Reflectance, Classification Tree Analysis and 137Cesium

    PubMed Central

    deGraffenried, Jeff B.; Shepherd, Keith D.

    2010-01-01

    Human induced soil erosion has severe economic and environmental impacts throughout the world. It is more severe in the tropics than elsewhere and results in diminished food production and security. Kenya has limited arable land and 30 percent of the country experiences severe to very severe human induced soil degradation. The purpose of this research was to test visible near infrared diffuse reflectance spectroscopy (VNIR) as a tool for rapid assessment and benchmarking of soil condition and erosion severity class. The study was conducted in the Saiwa River watershed in the northern Rift Valley Province of western Kenya, a tropical highland area. Soil 137Cs concentration was measured to validate spectrally derived erosion classes and establish the background levels for different land use types. Results indicate VNIR could be used to accurately evaluate a large and diverse soil data set and predict soil erosion characteristics. Soil condition was spectrally assessed and modeled. Analysis of mean raw spectra indicated significant reflectance differences between soil erosion classes. The largest differences occurred between 1,350 and 1,950 nm with the largest separation occurring at 1,920 nm. Classification and Regression Tree (CART) analysis indicated that the spectral model had practical predictive success (72%) with Receiver Operating Characteristic (ROC) of 0.74. The change in 137Cs concentrations supported the premise that VNIR is an effective tool for rapid screening of soil erosion condition. PMID:27397933

  7. Evaluating Two Approaches to Helping College Students Understand Evolutionary Trees through Diagramming Tasks

    ERIC Educational Resources Information Center

    Perry, Judy; Meir, Eli; Herron, Jon C.; Maruca, Susan; Stal, Derek

    2008-01-01

    To understand evolutionary theory, students must be able to understand and use evolutionary trees and their underlying concepts. Active, hands-on curricula relevant to macroevolution can be challenging to implement across large college-level classes where textbook learning is the norm. We evaluated two approaches to helping students learn…

  8. Total system performance assessment for waste disposal using a logic tree approach.

    PubMed

    Kessler, J H; McGuire, R K

    1999-10-01

    The Electric Power Research Institute (EPRI) has sponsored the development of a model to assess the long-term, overall "performance" of the candidate spent fuel and high-level radioactive waste (HLW) disposal facility at Yucca Mountain, Nevada. The model simulates the processes that lead to HLW container corrosion, HLW mobilization from the spent fuel, transport by groundwater, and contaminated groundwater usage by future hypothetical individuals, leading to radiation doses to those individuals. The model must incorporate a multitude of complex, coupled processes across a variety of technical disciplines. Furthermore, because of the very long time frames involved in the modeling effort (> 10^4 years), the relative lack of directly applicable data, and many uncertainties and variabilities in those data, a probabilistic approach to model development was necessary. The developers of the model judged a logic tree approach to be the most appropriate way to represent uncertainties in both conceptual models and model parameter values. This paper discusses the value and use of logic trees applied to assessing the uncertainties in HLW disposal, the components of the model, and a few of the results of that model. The paper concludes with a comparison of logic trees and Monte Carlo approaches. PMID:10765439
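
    As a rough illustration of the logic tree idea described in this abstract, the sketch below treats each uncertain model choice or parameter as a node with a few weighted alternatives, enumerates every end branch, and weights each branch outcome by the product of its branch weights. The node names, values, weights, and the `toy_dose` function are hypothetical stand-ins and are not taken from the EPRI model.

```python
from itertools import product

# Hypothetical logic tree: each node maps to (label, value, weight) alternatives.
# The weights at each node sum to 1. All names and numbers are illustrative.
LOGIC_TREE = {
    "infiltration_model": [("low", 1.0, 0.3), ("mid", 5.0, 0.5), ("high", 20.0, 0.2)],
    "container_lifetime": [("short", 2.0, 0.4), ("long", 0.5, 0.6)],
}


def toy_dose(values):
    """Stand-in for the performance model: multiply the branch factors."""
    result = 1.0
    for v in values:
        result *= v
    return result


def enumerate_branches(tree):
    """Yield (labels, outcome, probability) for every end branch of the tree."""
    for combo in product(*tree.values()):
        labels = tuple(label for label, _, _ in combo)
        values = [value for _, value, _ in combo]
        prob = 1.0
        for _, _, weight in combo:
            prob *= weight  # nodes are assumed independent in this sketch
        yield labels, toy_dose(values), prob


def expected_outcome(tree):
    """Probability-weighted mean outcome over all end branches."""
    return sum(outcome * prob for _, outcome, prob in enumerate_branches(tree))


if __name__ == "__main__":
    for labels, outcome, prob in enumerate_branches(LOGIC_TREE):
        print(labels, outcome, round(prob, 3))
    print("expected outcome:", expected_outcome(LOGIC_TREE))
```

    Unlike Monte Carlo sampling, this enumeration visits every discrete branch exactly once, which keeps the contribution of each conceptual model traceable; the number of branches, however, grows multiplicatively with the number of nodes.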

  9. Neural network approach to classification of infrasound signals

    NASA Astrophysics Data System (ADS)

    Lee, Dong-Chang

    As part of the International Monitoring System of the Preparatory Commission for the Comprehensive Nuclear-Test-Ban Treaty Organization, the Infrasound Group at the University of Alaska Fairbanks maintains and operates two infrasound stations to monitor global nuclear activity. In addition, the group specializes in detecting and classifying the man-made and naturally produced signals recorded at both stations by computing various characterization parameters (e.g. mean of the cross correlation maxima, trace velocity, direction of arrival, and planarity values) using the in-house developed weighted least-squares algorithm. Classifying commonly observed low-frequency (0.015–0.1 Hz) signals at our stations, namely mountain associated waves and high trace-velocity signals, using traditional approaches (e.g. analysis of power spectral density) presents a problem. Such signals can be separated statistically by setting a window on the trace-velocity estimate for each signal type, and the feasibility of this technique is demonstrated by displaying and comparing various summary plots (e.g. universal, seasonal and azimuthal variations) produced by analyzing infrasound data (2004–2007) from the Fairbanks and Antarctic arrays. Such plots, together with magnetic activity information (from the College International Geophysical Observatory located at Fairbanks, Alaska), point to possible physical sources of the two signal types. Throughout this thesis, a newly developed robust algorithm (sum of squares of variance ratios) with improved detection quality (under low signal to noise ratios) over two well-known detection algorithms (mean of the cross correlation maxima and Fisher Statistics) is investigated for its efficacy as a new detector. A neural network is examined for its ability to automatically classify the two signals described above against clutter (spurious signals with common characteristics). Four identical perceptron networks are trained and validated (with

  10. Using hydrogeomorphic criteria to classify wetlands on Mt. Desert Island, Maine - approach, classification system, and examples

    USGS Publications Warehouse

    Nielsen, Martha G.; Guntenspergen, Glenn R.; Neckles, Hilary A.

    2005-01-01

    A wetland classification system was designed for Mt. Desert Island, Maine, to help categorize the large number of wetlands (over 1,200 mapped units) as an aid to understanding their hydrologic functions. The classification system, developed by the U.S. Geological Survey (USGS), in cooperation with the National Park Service, uses a modified hydrogeomorphic (HGM) approach, and assigns categories based on position in the landscape, soils and surficial geologic setting, and source of water. A dichotomous key was developed to determine a preliminary HGM classification of wetlands on the island. This key is designed for use with USGS topographic maps and 1:24,000 geographic information system (GIS) coverages as an aid to the classification, but may also be used with field data. Hydrologic data collected from a wetland monitoring study were used to determine whether the preliminary classification of individual wetlands using the HGM approach yielded classes that were consistent with actual hydroperiod data. Preliminary HGM classifications of the 20 wetlands in the monitoring study were consistent with the field hydroperiod data. The modified HGM classification approach appears robust, although the method apparently works somewhat better with undisturbed wetlands than with disturbed wetlands. This wetland classification system could be applied to other hydrogeologically similar areas of northern New England.