Science.gov

Sample records for classification tree approach

  1. The decision tree approach to classification

    NASA Technical Reports Server (NTRS)

    Wu, C.; Landgrebe, D. A.; Swain, P. H.

    1975-01-01

    A class of multistage decision tree classifiers is proposed and studied relative to the classification of multispectral remotely sensed data. The decision tree classifiers are shown to have the potential for improving both the classification accuracy and the computational efficiency. Dimensionality in pattern recognition is discussed and two theorems on the lower bound of logic computation for multiclass classification are derived. The automatic or optimization approach is emphasized. Experimental results on real data are reported, which clearly demonstrate the usefulness of decision tree classifiers.

  2. Vessel-guided airway tree segmentation: A voxel classification approach.

    PubMed

    Lo, Pechin; Sporring, Jon; Ashraf, Haseem; Pedersen, Jesper J H; de Bruijne, Marleen

    2010-08-01

    This paper presents a method for airway tree segmentation that uses a combination of a trained airway appearance model, vessel and airway orientation information, and region growing. We propose a voxel classification approach for the appearance model, which uses a classifier that is trained to differentiate between airway and non-airway voxels. This is in contrast to previous works that use either intensity alone or hand-crafted models of airway appearance. We show that the appearance model can be trained with a set of easily acquired, incomplete airway tree segmentations. A vessel orientation similarity measure is introduced, which indicates how similar the orientation of an airway candidate is to the orientation of the neighboring vessel. We use this vessel orientation similarity measure to overcome regions in the airway tree that have a low response from the appearance model. The proposed method is evaluated on 250 low-dose computed tomography images from a lung cancer screening trial. Our experiments showed that applying the region growing algorithm on the airway appearance model produces more complete airway segmentations, leading to on average 20% longer trees, and 50% less leakage. When combining the airway appearance model with vessel orientation similarity, the improvement is even more significant (p<0.01) than only using the airway appearance model, with on average 7% increase in the total length of branches extracted correctly.

  3. Learning classification trees

    NASA Technical Reports Server (NTRS)

    Buntine, Wray

    1991-01-01

    Algorithms for learning classification trees have had successes in artificial intelligence and statistics over many years. How a tree learning algorithm can be derived from Bayesian decision theory is outlined. This introduces Bayesian techniques for splitting, smoothing, and tree averaging. The splitting rule turns out to be similar to Quinlan's information gain splitting rule, while smoothing and averaging replace pruning. Comparative experiments with reimplementations of a minimum encoding approach, Quinlan's C4, and Breiman et al.'s CART show that the full Bayesian algorithm is consistently as good as or more accurate than these other approaches, though at a computational price.
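
    The abstract above compares its Bayesian splitting rule to Quinlan's information gain. As a rough illustration of that reference point only (not of the Bayesian rule itself, which the abstract does not spell out), the sketch below computes information gain for a candidate split. The function names and the toy labels are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(labels, groups):
    """Entropy of `labels` minus the size-weighted entropy of the split `groups`."""
    total = len(labels)
    remainder = sum(len(g) / total * entropy(g) for g in groups)
    return entropy(labels) - remainder


# Hypothetical toy data: eight cases and two candidate binary splits.
parent = ["yes"] * 4 + ["no"] * 4
pure_split = [["yes"] * 4, ["no"] * 4]
useless_split = [["yes", "yes", "no", "no"], ["yes", "yes", "no", "no"]]

print(information_gain(parent, pure_split))
print(information_gain(parent, useless_split))
```

    A split that separates the classes perfectly recovers the full parent entropy, while a split that leaves the class mix unchanged gains nothing; a tree learner using this rule would choose the attribute with the largest gain.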

  4. Tree Classification Software

    NASA Technical Reports Server (NTRS)

    Buntine, Wray

    1993-01-01

    This paper introduces the IND Tree Package to prospective users. IND does supervised learning using classification trees. This learning task is a basic tool used in the development of diagnosis, monitoring and expert systems. The IND Tree Package was developed as part of a NASA project to semi-automate the development of data analysis and modelling algorithms using artificial intelligence techniques. The IND Tree Package integrates features from CART and C4 with newer Bayesian and minimum encoding methods for growing classification trees and graphs. The IND Tree Package also provides an experimental control suite on top. The newer features give improved probability estimates often required in diagnostic and screening tasks. The package comes with a manual, Unix 'man' entries, and a guide to tree methods and research. The IND Tree Package is implemented in C under Unix and was beta-tested at university and commercial research laboratories in the United States.

  5. A classification tree approach to the development of actuarial violence risk assessment tools.

    PubMed

    Steadman, H J; Silver, E; Monahan, J; Appelbaum, P S; Robbins, P C; Mulvey, E P; Grisso, T; Roth, L H; Banks, S

    2000-02-01

    Since the 1970s, a wide body of research has suggested that the accuracy of clinical risk assessments of violence might be increased if clinicians used actuarial tools. Despite considerable progress in recent years in the development of such tools for violence risk assessment, they remain primarily research instruments, largely ignored in daily clinical practice. We argue that because most existing actuarial tools are based on a main effects regression approach, they do not adequately reflect the contingent nature of the clinical assessment processes. To enhance the use of actuarial violence risk assessment tools, we propose a classification tree rather than a main effects regression approach. In addition, we suggest that by employing two decision thresholds for identifying high- and low-risk cases--instead of the standard single threshold--the use of actuarial tools to make dichotomous risk classification decisions may be further enhanced. These claims are supported with empirical data from the MacArthur Violence Risk Assessment Study.
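
    The two-threshold idea in the abstract above can be sketched as a small decision function. This is only an illustration under stated assumptions: the cutoff values, the label strings, and the choice to leave cases between the cutoffs unclassified are hypothetical and are not taken from the study.

```python
def classify_risk(probability, low_cutoff, high_cutoff):
    """Assign a risk class using two decision thresholds instead of one.

    Cases at or below `low_cutoff` are called low risk, cases at or above
    `high_cutoff` are called high risk, and everything in between is left
    unclassified (an assumption made for this sketch).
    """
    if not 0.0 <= low_cutoff < high_cutoff <= 1.0:
        raise ValueError("cutoffs must satisfy 0 <= low_cutoff < high_cutoff <= 1")
    if probability <= low_cutoff:
        return "low risk"
    if probability >= high_cutoff:
        return "high risk"
    return "unclassified"


# Hypothetical cutoffs and predicted probabilities, for illustration only.
for p in (0.05, 0.20, 0.45):
    print(p, classify_risk(p, low_cutoff=0.10, high_cutoff=0.40))
```

    With a single threshold every case is forced into one of two classes; with two thresholds, the ambiguous middle band is set aside rather than forced into either class.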

  6. A novel approach to internal crown characterization for coniferous tree species classification

    NASA Astrophysics Data System (ADS)

    Harikumar, A.; Bovolo, F.; Bruzzone, L.

    2016-10-01

    Knowledge about individual trees in a forest is highly beneficial in forest management. High-density, small-footprint, multi-return airborne Light Detection and Ranging (LiDAR) data can provide very accurate information about the structural properties of individual trees in forests. Every tree species has a unique set of crown structural characteristics that can be used for tree species classification. In this paper, we use both the internal and external crown structural information of a conifer tree crown, derived from a high-density, small-footprint, multi-return LiDAR data acquisition, for species classification. Considering the fact that branches are the major building blocks of a conifer tree crown, we obtain the internal crown structural information using a branch-level analysis. The structure of each conifer branch is represented using clusters in the LiDAR point cloud. We propose the joint use of k-means clustering and geometric shape fitting, on the LiDAR data projected onto a novel 3-dimensional space, to identify branch clusters. After mapping the identified clusters back to the original space, six internal geometric features are estimated using a branch-level analysis. The external crown characteristics are modeled by using six least-correlated features based on cone fitting and convex hull. Species classification is performed using a sparse Support Vector Machines (sparse SVM) classifier.

  7. Chronic subdural hematoma: Surgical management and outcome in 986 cases: A classification and regression tree approach

    PubMed Central

    Rovlias, Aristedis; Theodoropoulos, Spyridon; Papoutsakis, Dimitrios

    2015-01-01

    Background: Chronic subdural hematoma (CSDH) is one of the most common clinical entities in daily neurosurgical practice, and it carries a most favorable prognosis. However, because of the advanced age and medical problems of patients, surgical therapy is frequently associated with various complications. This study evaluated the clinical features, radiological findings, and neurological outcome in a large series of patients with CSDH. Methods: A classification and regression tree (CART) technique was employed in the analysis of data from 986 patients who underwent surgery at Asclepeion General Hospital of Athens from January 1986 to December 2011. Burr-hole evacuation with closed system drainage has been the operative technique of first choice at our institution for 29 consecutive years. A total of 27 prognostic factors were examined to predict the outcome at 3 months postoperatively. Results: Our results indicated that neurological status on admission was the best predictor of outcome. With regard to the other data, age, brain atrophy, thickness and density of hematoma, subdural accumulation of air, and antiplatelet and anticoagulant therapy were found to correlate significantly with prognosis. The overall cross-validated predictive accuracy of the CART model was 85.34%, with a cross-validated relative error of 0.326. Conclusions: Methodologically, the CART technique is quite different from the more commonly used methods, with the primary benefit of illustrating the important prognostic variables as related to outcome. Since the ideal therapy for the treatment of CSDH is still under debate, this technique may prove useful in developing new therapeutic strategies and approaches for patients with CSDH. PMID:26257985

  8. Quantification of chemical peptide reactivity for screening contact allergens: a classification tree model approach.

    PubMed

    Gerberick, G Frank; Vassallo, Jeffrey D; Foertsch, Leslie M; Price, Brad B; Chaney, Joel G; Lepoittevin, Jean-Pierre

    2007-06-01

    In the interest of reducing animal use, in vitro alternatives for skin sensitization testing are under development. One unifying characteristic of chemical allergens is the requirement that they react with proteins for the effective induction of skin sensitization. The majority of chemical allergens are electrophilic and react with nucleophilic amino acids. To determine whether and to what extent reactivity correlates with skin sensitization potential, 82 chemicals comprising allergens of different potencies and nonallergenic chemicals were evaluated for their ability to react with reduced glutathione (GSH) or with two synthetic peptides containing either a single cysteine or lysine. Following a 15-min reaction time with GSH, or a 24-h reaction time with the two synthetic peptides, the samples were analyzed by high-performance liquid chromatography. UV detection was used to monitor the depletion of GSH or the peptides. The peptide reactivity data were compared with existing local lymph node assay data using recursive partitioning methodology to build a classification tree that allowed a ranking of reactivity as minimal, low, moderate, and high. Generally, nonallergens and weak allergens demonstrated minimal to low peptide reactivity, whereas moderate to extremely potent allergens displayed moderate to high peptide reactivity. Classifying minimal reactivity as nonsensitizers and low, moderate, and high reactivity as sensitizers, it was determined that a model based on cysteine and lysine gave a prediction accuracy of 89%. The results of these investigations reveal that measurement of peptide reactivity has considerable potential utility as a screening approach for skin sensitization testing, and thereby for reducing reliance on animal-based test methods.

  9. Applying an Ensemble Classification Tree Approach to the Prediction of Completion of a 12-Step Facilitation Intervention with Stimulant Abusers

    PubMed Central

    Doyle, Suzanne R.; Donovan, Dennis M.

    2014-01-01

    Aims: The purpose of this study was to explore the selection of predictor variables in the evaluation of drug treatment completion using an ensemble approach with classification trees. The basic methodology is reviewed and the subagging procedure of random subsampling is applied. Methods: Among 234 individuals with stimulant use disorders randomized to a 12-Step facilitative intervention shown to increase stimulant use abstinence, 67.52% were classified as treatment completers. A total of 122 baseline variables were used to identify factors associated with completion. Findings: The number of types of self-help activity involvement prior to treatment was the predominant predictor. Other effective predictors included better coping self-efficacy for substance use in high-risk situations, more days of prior meeting attendance, greater acceptance of the Disease model, higher confidence for not resuming use following discharge, lower ASI Drug and Alcohol composite scores, negative urine screens for cocaine or marijuana, and fewer employment problems. Conclusions: The application of an ensemble subsampling regression tree method utilizes the fact that classification trees are unstable but, on average, produce an improved prediction of the completion of drug abuse treatment. The results support the notion that there are early indicators of treatment completion that may allow for modification of approaches more tailored to fitting the needs of individuals and potentially provide more successful treatment engagement and improved outcomes. PMID:25134038
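
    Subagging (subsample aggregating) fits each tree on a random subsample drawn without replacement and then combines the trees by voting. The sketch below shows the mechanics with one-split trees (stumps) on a single numeric predictor. It is a minimal illustration, not the study's procedure: the helper names, the subsample fraction, the number of trees, and the toy data are all hypothetical.

```python
import random
from collections import Counter


def fit_stump(rows):
    """Fit a one-split tree on (x, label) pairs by minimizing training errors."""
    best = None
    for cutoff in sorted({x for x, _ in rows}):
        left = [y for x, y in rows if x <= cutoff]
        right = [y for x, y in rows if x > cutoff]
        left_label = Counter(left).most_common(1)[0][0]
        # If nothing falls to the right, reuse the left label for that side.
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = sum(y != left_label for y in left) + sum(y != right_label for y in right)
        if best is None or errors < best[0]:
            best = (errors, cutoff, left_label, right_label)
    _, cutoff, left_label, right_label = best
    return lambda x: left_label if x <= cutoff else right_label


def subagging(rows, n_trees=25, fraction=0.5, seed=0):
    """Fit stumps on random subsamples (no replacement) and combine by majority vote."""
    rng = random.Random(seed)
    size = max(1, int(len(rows) * fraction))
    stumps = [fit_stump(rng.sample(rows, size)) for _ in range(n_trees)]

    def predict(x):
        votes = Counter(stump(x) for stump in stumps)
        return votes.most_common(1)[0][0]

    return predict


# Hypothetical toy data: one numeric predictor and a binary outcome.
rows = [(0, "neg"), (1, "neg"), (1, "neg"), (2, "neg"),
        (3, "pos"), (4, "pos"), (5, "pos"), (6, "pos")]
predict = subagging(rows)
print(predict(0), predict(6))
```

    Any single stump depends heavily on which cases land in its subsample, which is the instability the abstract refers to; the vote across many subsamples smooths that out.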

  10. Ecological Factors of Being Bullied Among Adolescents: a Classification and Regression Tree Approach

    PubMed Central

    Moon, Sung Seek; Kim, Heeyoung; Seay, Kristen; Small, Eusebius; Kim, Youn Kyoung

    2015-01-01

    Being bullied is a well-recognized trauma for adolescents. Bullying can best be understood through an ecological framework since bullying or being bullied involves risk factors at multiple contextual levels. The purpose of the study was to identify the risk and protective factors that best differentiate groups along with the outcome variable of interest (being bullied) using Classification and Regression Tree (CART) analysis. The study used the Health Behavior in School-Aged Children (HBSC) data collected from a nationally representative sample of students in grades six through ten during the 2005–2006 school year. This study found that for adolescents aged 12 and younger, lower parental support is a critical risk factor associated with being bullied, and that among those aged 13 to 14 with lower parental support, adolescents with higher academic pressure reported experiencing more bullying. For the older group of adolescents (aged 15 and older), school-related factors were found to increase the risk level of being bullied. There was a critical age (15 years old) for implementing victimization interventions to reduce the damage from being bullied. Service providers working with adolescents aged 14 and younger should focus more on family-oriented intervention, and those working with adolescents aged 15 and older should offer peer- or school-related interventions. PMID:27617043

  11. Risk Profiles for Weight Gain among Postmenopausal Women: A Classification and Regression Tree Analysis Approach

    PubMed Central

    Jung, Su Yon; Vitolins, Mara Z.; Fenton, Jenifer; Frazier-Wood, Alexis C.; Hursting, Stephen D.; Chang, Shine

    2015-01-01

    Purpose: Risk factors for obesity and weight gain are typically evaluated individually while “adjusting for” the influence of other confounding factors, and few studies, if any, have created risk profiles by clustering risk factors. We identified subgroups of postmenopausal women homogeneous in their clustered modifiable and non-modifiable risk factors for gaining ≥ 3% weight. Methods: This study included 612 postmenopausal women 50–79 years old, enrolled in an ancillary study of the Women's Health Initiative Observational Study between February 1995 and July 1998. Classification and regression tree and stepwise regression models were built and compared. Results: Of 27 selected variables, the factors significantly related to ≥ 3% weight gain were weight change in the past 2 years, age at menopause, dietary fiber, fat, alcohol intake, and smoking. In women younger than 65 years, less than 4 kg weight change in the past 2 years sufficiently reduced risk of ≥ 3% weight gain. Different combinations of risk factors related to weight gain were reported for subgroups of women: women 65 years or older (essential factor: < 9.8 g/day dietary factor), African Americans (essential factor: currently smoking), and white women (essential factor: ≥ 5 kg weight change for the past 2 years). Conclusions: Our findings suggest specific characteristics for particular subgroups of postmenopausal women that may be useful for identifying those at risk for weight gain. The study results may be useful for targeting efforts to promote strategies to reduce the risk of obesity and weight gain in subgroups of postmenopausal women and maximize the effect of weight control by decreasing obesity-relevant adverse health outcomes. PMID:25822239

  12. Analysis of effects of manhole covers on motorcycle driver maneuvers: a nonparametric classification tree approach.

    PubMed

    Chang, Li-Yen

    2014-01-01

    A manhole cover is a removable plate forming the lid over the opening of a manhole to allow traffic to pass over the manhole and to prevent people from falling in. Because most manhole covers are placed in roadway traffic lanes, if these manhole covers are not appropriately installed or maintained, they can represent unexpected hazards on the road, especially for motorcycle drivers. The objective of this study is to identify the effects of manhole cover characteristics as well as driver factors and traffic and roadway conditions on motorcycle driver maneuvers. A video camera was used to record motorcycle drivers' maneuvers when they encountered an inappropriately installed or maintained manhole cover. Information on 3059 drivers' maneuver decisions was recorded. Classification and regression tree (CART) models were applied to explore factors that can significantly affect motorcycle driver maneuvers when passing a manhole cover. Nearly 50 percent of the motorcycle drivers decelerated or changed their driving path to reduce the effects of the manhole cover. The manhole cover characteristics including the level difference between manhole cover and pavement, the pavement condition over the manhole cover, and the size of the manhole cover can significantly affect motorcycle driver maneuvers. Other factors, including traffic conditions, lane width, motorcycle speed, and loading conditions, also have significant effects on motorcycle driver maneuvers. To reduce the effects and potential risks from the manhole covers, highway authorities not only need to make sure that any newly installed manhole covers are as level as possible but also need to regularly maintain all the manhole covers to ensure that they are in good condition. In the long run, the size of manhole covers should be kept as small as possible so that the impact of manhole covers on motorcycle drivers can be effectively reduced. Supplemental materials are available for this article.

  13. Factors associated with caregiver stability in permanent placements: A Classification Tree approach

    PubMed Central

    Proctor, Laura J.; Van Dusen Randazzo, Katherine; Newton, Rae R.; Davis, Inger P.; Villodas, Miguel

    2013-01-01

    Objective: Identify individual and environmental variables associated with caregiver stability and instability for children in diverse permanent placement types (i.e., reunification, adoption, and long-term foster care/guardianship with relatives or non-relatives), following 5 or more months in out-of-home care prior to age 4 due to substantiated maltreatment. Methods: Participants were 285 children from the Southwestern site of Longitudinal Studies of Child Abuse and Neglect (LONGSCAN). Caregiver instability was defined as a change in primary caregiver between ages 6 and 8 years. Classification and regression tree (CART) analysis was used to identify the strongest predictors of instability from multiple variables assessed at age 6 with caregiver and child reports within the domains of neighborhood/community characteristics, caregiving environment, caregiver characteristics, and child characteristics. Results: One out of 7, or 14%, of the 285 children experienced caregiver instability in their permanent placement between ages 6 and 8. The strongest predictor of stability was whether the child had been placed in adoptive care. However, for children who were not adopted, a number of contextual factors (e.g., father involvement, expressiveness within the family) and child characteristics (e.g., intellectual functioning, externalizing problem behaviors) predicted stability and instability of permanent placements. Conclusions: Current findings suggest that a number of factors should be considered, in addition to placement type, if we are to understand what predicts caregiver stability and find stable permanent placements for children who have entered foster care. These factors include involvement of a father figure, family functioning, and child functioning. Practice Implications: Adoption was supported as a desired permanent placement in terms of stability, but results suggest that other placement types can also lead to stability. In fact, with attention to providing…

  14. Knowledge base image classification using P-trees

    NASA Astrophysics Data System (ADS)

    Seetha, M.; Ravi, G.

    2010-02-01

    Image classification is the process of assigning classes to the pixels in remotely sensed images and is important for GIS applications, since the classified image is much easier to incorporate than the original unclassified image. To resolve misclassification in traditional parametric classifiers such as the Maximum Likelihood Classifier, a neural network classifier is implemented using the back-propagation algorithm. Extra spectral and spatial knowledge acquired from ancillary information is required to improve the accuracy and remove spectral confusion. To build the knowledge base automatically, this paper explores a non-parametric decision tree classifier to extract knowledge from the spatial data in the form of classification rules. A new method is proposed using a data structure called the Peano Count Tree (P-tree) for decision tree classification. The Peano Count Tree is a spatial data organization that provides a lossless compressed representation of a spatial data set and facilitates more efficient classification than other data mining techniques. Accuracy is assessed using overall accuracy, user's accuracy, and producer's accuracy for the following image classification methods: Maximum Likelihood Classification, neural network classification using back propagation, Knowledge Base Classification, Post classification, and the P-tree Classifier. The results reveal that the knowledge extracted by the decision tree classifier and the P-tree data structure in the proposed approach removes the problem of spectral confusion to a greater extent. It is ascertained that the P-tree classifier surpasses the other classification techniques.

  15. IHC and the WHO classification of lymphomas: cost effective immunohistochemistry using a deductive reasoning "decision tree" approach.

    PubMed

    Taylor, Clive R

    2009-10-01

    The 2008 World Health Organization Classification of Tumors of the Hematopoietic and Lymphoid Tissues defines current standards of practice for the diagnosis and classification of malignant lymphomas and related entities. More than 50 different types of lymphomas are described, combining fine morphologic criteria with immunohistochemical (IHC), and sometimes molecular, findings. Faced with such a broad range of different lymphomas, some encountered only rarely, and a rapidly growing, ever changing, armamentarium of approximately 80 pertinent IHC "stains", the challenge to the pathologist is to employ IHC in an efficient manner, to arrive at an assured diagnosis as rapidly as possible. This review uses deductive reasoning, after a decision tree or dendrogram model that relies upon recognition of basic morphologic patterns for efficient selection, use and interpretation of IHC markers to classify node-based malignancies by the World Health Organization schema. The review is divided into 2 parts, the first addressing those lymphomas that produce a follicular or nodular pattern of lymph nodal involvement; the second addressing diffuse proliferations in lymph nodes. It is accepted that only specialized centers are able to apply all of the technical resources and experience necessary for definitive diagnosis of unusual cases. Emphasis therefore is given to the more common lymphomas and the more commonly available IHC "stains", for a pragmatic and practical approach that is both broadly feasible and cost effective. By this method an assured diagnosis may be reached in the majority of nodal lymphomas, at the same time developing a sufficiency of data to recognize those rare or atypical cases that require referral to a specialized center.

  16. Mapping trees outside forests using high-resolution aerial imagery: a comparison of pixel- and object-based classification approaches.

    PubMed

    Meneguzzo, Dacia M; Liknes, Greg C; Nelson, Mark D

    2013-08-01

    Discrete trees and small groups of trees in nonforest settings are considered an essential resource around the world and are collectively referred to as trees outside forests (ToF). ToF provide important functions across the landscape, such as protecting soil and water resources, providing wildlife habitat, and improving farmstead energy efficiency and aesthetics. Despite the significance of ToF, forest and other natural resource inventory programs and geospatial land cover datasets that are available at a national scale do not include comprehensive information regarding ToF in the United States. Additional ground-based data collection and acquisition of specialized imagery to inventory these resources are expensive alternatives. As a potential solution, we identified two remote sensing-based approaches that use free high-resolution aerial imagery from the National Agriculture Imagery Program (NAIP) to map all tree cover in an agriculturally dominant landscape. We compared the results obtained using an unsupervised per-pixel classifier (independent component analysis [ICA]) and an object-based image analysis (OBIA) procedure in Steele County, Minnesota, USA. Three types of accuracy assessments were used to evaluate how each method performed in terms of: (1) producing a county-level estimate of total tree-covered area, (2) correctly locating tree cover on the ground, and (3) how tree cover patch metrics computed from the classified outputs compared to those delineated by a human photo interpreter. Both approaches were found to be viable for mapping tree cover over a broad spatial extent and could serve to supplement ground-based inventory data. The ICA approach produced an estimate of total tree cover more similar to the photo-interpreted result, but the output from the OBIA method was more realistic in terms of describing the actual observed spatial pattern of tree cover.

  17. Individualized Prediction of Heat Stress in Firefighters: A Data-Driven Approach Using Classification and Regression Trees.

    PubMed

    Mani, Ashutosh; Rao, Marepalli; James, Kelley; Bhattacharya, Amit

    2015-01-01

    The purpose of this study was to explore data-driven models, based on decision trees, to develop practical and easy-to-use predictive models for early identification of firefighters who are likely to cross the threshold of hyperthermia during live-fire training. Predictive models were created for three consecutive live-fire training scenarios. The final predicted outcome was a categorical variable: will a firefighter cross the upper threshold of hyperthermia (yes/no). Two tiers of models were built, one with and one without taking into account the outcome (whether a firefighter crossed the hyperthermia threshold or not) from the previous training scenario. The first tier of models included age, baseline heart rate and core body temperature, body mass index, and duration of training scenario as predictors. The second tier of models included the outcome of the previous scenario in the prediction space, in addition to all the predictors from the first tier of models. Classification and regression trees were used independently for prediction. The response variable for the regression tree was the quantitative variable: core body temperature at the end of each scenario. The predicted quantitative variable from regression trees was compared to the upper threshold of hyperthermia (38°C) to predict whether a firefighter would enter hyperthermia. The performance of classification and regression tree models was satisfactory for the second (success rate = 79%) and third (success rate = 89%) training scenarios but not for the first (success rate = 43%). Data-driven models based on decision trees can be a useful tool for predicting physiological response without modeling the underlying physiological systems. Early prediction of heat stress coupled with proactive interventions, such as pre-cooling, can help reduce heat stress in firefighters.

  18. DIF Trees: Using Classification Trees to Detect Differential Item Functioning

    ERIC Educational Resources Information Center

    Vaughn, Brandon K.; Wang, Qiu

    2010-01-01

    A nonparametric tree classification procedure is used to detect differential item functioning for items that are dichotomously scored. Classification trees are shown to be an alternative procedure to detect differential item functioning other than the use of traditional Mantel-Haenszel and logistic regression analysis. A nonparametric…

  20. Phylogenetic classification and the universal tree.

    PubMed

    Doolittle, W F

    1999-06-25

    From comparative analyses of the nucleotide sequences of genes encoding ribosomal RNAs and several proteins, molecular phylogeneticists have constructed a "universal tree of life," taking it as the basis for a "natural" hierarchical classification of all living things. Although confidence in some of the tree's early branches has recently been shaken, new approaches could still resolve many methodological uncertainties. More challenging is evidence that most archaeal and bacterial genomes (and the inferred ancestral eukaryotic nuclear genome) contain genes from multiple sources. If "chimerism" or "lateral gene transfer" cannot be dismissed as trivial in extent or limited to special categories of genes, then no hierarchical universal classification can be taken as natural. Molecular phylogeneticists will have failed to find the "true tree," not because their methods are inadequate or because they have chosen the wrong genes, but because the history of life cannot properly be represented as a tree. However, taxonomies based on molecular sequences will remain indispensable, and understanding of the evolutionary process will ultimately be enriched, not impoverished.

  1. Type I Error Control for Tree Classification

    PubMed Central

    Jung, Sin-Ho; Chen, Yong; Ahn, Hongshik

    2014-01-01

    Binary tree classification has been useful for classifying a whole population based on the levels of an outcome variable that is associated with chosen predictors. Often we start a classification with a large number of candidate predictors, and each predictor takes a number of different cutoff values. Because of these types of multiplicity, the binary tree classification method is subject to a severely inflated type I error probability. Nonetheless, there have not been many publications that address this issue. In this paper, we propose a binary tree classification method that controls the probability of accepting a predictor below a certain level, say 5%. PMID:25452689
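
    To see why multiplicity inflates the type I error, consider the chance of at least one false positive when many candidate splits are each tested at the same level. The sketch below assumes the tests are independent, which candidate cutoffs on the same predictor generally are not, so it only gives a feel for the scale of the problem; it is not the method proposed in the paper. The function names and the figure of 200 candidate splits are hypothetical.

```python
def familywise_error(alpha, n_tests):
    """P(at least one false positive) across n independent tests at level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


def sidak_level(target, n_tests):
    """Per-test level that holds the family-wise error at `target` (Sidak correction)."""
    return 1.0 - (1.0 - target) ** (1.0 / n_tests)


# Hypothetical setting: 20 candidate predictors with 10 cutoffs each.
n_splits = 20 * 10
print(familywise_error(0.05, n_splits))
print(sidak_level(0.05, n_splits))
```

    Under this simplified model, testing every candidate split at 5% makes a spurious split almost certain, while the per-test level needed to keep the overall error at 5% is far smaller than 5%.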

  2. Detecting hospital-acquired infections: A document classification approach using support vector machines and gradient tree boosting.

    PubMed

    Ehrentraut, Claudia; Ekholm, Markus; Tanushi, Hideyuki; Tiedemann, Jörg; Dalianis, Hercules

    2016-08-04

    Hospital-acquired infections pose a significant risk to patient health, while their surveillance is an additional workload for hospital staff. Our overall aim is to build a surveillance system that reliably detects all patient records that potentially include hospital-acquired infections. This is to reduce the burden of having the hospital staff manually check patient records. This study focuses on the application of text classification using support vector machines and gradient tree boosting to the problem. Support vector machines and gradient tree boosting have never been applied to the problem of detecting hospital-acquired infections in Swedish patient records, and according to our experiments, they lead to encouraging results. The best result is yielded by gradient tree boosting, at 93.7 percent recall, 79.7 percent precision and 85.7 percent F1 score when using stemming. We can show that simple preprocessing techniques and parameter tuning can lead to high recall (which we aim for in screening patient records) with appropriate precision for this task.
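
    For readers unfamiliar with the metrics quoted in this abstract, the following minimal sketch shows how recall, precision, and F1 are computed from a confusion matrix. The counts are hypothetical and are not taken from the study.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Compute precision, recall, and F1 from true-positive, false-positive, and false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical screening outcome: 90 infected records flagged, 30 false alarms, 10 missed.
p, r, f1 = precision_recall_f1(tp=90, fp=30, fn=10)
print(f"precision={p:.3f} recall={r:.3f} F1={f1:.3f}")
```

    In a screening setting such as the one described, a missed record (false negative) is costlier than a false alarm, which is why recall is the metric the authors prioritize.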

  3. Foot and hip contributions to high frontal plane knee projection angle in athletes: a classification and regression tree approach.

    PubMed

    Bittencourt, Natalia F N; Ocarino, Juliana M; Mendonça, Luciana D M; Hewett, Timothy E; Fonseca, Sergio T

    2012-12-01

    Cross-sectional. To investigate predictors of increased frontal plane knee projection angle (FPKPA) in athletes. The underlying mechanisms that lead to increased FPKPA are likely multifactorial and depend on how the musculoskeletal system adapts to the possible interactions between its distal and proximal segments. Bivariate and linear analyses traditionally employed to analyze the occurrence of increased FPKPA are not sufficiently robust to capture complex relationships among predictors. The investigation of nonlinear interactions among biomechanical factors is necessary to further our understanding of the interdependence of lower-limb segments and resultant dynamic knee alignment. The FPKPA was assessed in 101 athletes during a single-leg squat and in 72 athletes at the moment of landing from a jump. The investigated predictors were sex, hip abductor isometric torque, passive range of motion (ROM) of hip internal rotation (IR), and shank-forefoot alignment. Classification and regression trees were used to investigate nonlinear interactions among predictors and their influence on the occurrence of increased FPKPA. During single-leg squatting, the occurrence of high FPKPA was predicted by the interaction between hip abductor isometric torque and passive hip IR ROM. At the moment of landing, the shank-forefoot alignment, abductor isometric torque, and passive hip IR ROM were predictors of high FPKPA. In addition, the classification and regression trees established cutoff points that could be used in clinical practice to identify athletes who are at potential risk for excessive FPKPA. The models captured nonlinear interactions between hip abductor isometric torque, passive hip IR ROM, and shank-forefoot alignment.

  4. A balanced neural tree for pattern classification.

    PubMed

    Micheloni, Christian; Rani, Asha; Kumar, Sanjeev; Foresti, Gian Luca

    2012-03-01

    This paper proposes a new neural tree (NT) architecture, balanced neural tree (BNT), to reduce tree size and improve classification with respect to classical NTs. To achieve this result, two main innovations have been introduced: (a) perceptron substitution and (b) pattern removal. The first innovation aims to balance the structure of the tree. If the last-trained perceptron largely misclassifies the given training set into a reduced number of classes, then this perceptron is substituted with a new perceptron. The second novelty consists of the introduction of a new criterion for the removal of tough training patterns that generate the problem of over-fitting. Finally, a new error function based on the depth of the tree is introduced to reduce perceptron training time. The proposed BNT has been tested on various synthetic and real datasets. The experimental results show that the proposed BNT leads to satisfactory results in terms of both tree depth reduction and classification accuracy.

  5. Automated Decision Tree Classification of Corneal Shape

    PubMed Central

    Twa, Michael D.; Parthasarathy, Srinivasan; Roberts, Cynthia; Mahmoud, Ashraf M.; Raasch, Thomas W.; Bullimore, Mark A.

    2011-01-01

    Purpose The volume and complexity of data produced during videokeratography examinations present a challenge of interpretation. As a consequence, results are often analyzed qualitatively by subjective pattern recognition or reduced to comparisons of summary indices. We describe the application of decision tree induction, an automated machine learning classification method, to discriminate between normal and keratoconic corneal shapes in an objective and quantitative way. We then compared this method with other known classification methods. Methods The corneal surface was modeled with a seventh-order Zernike polynomial for 132 normal eyes of 92 subjects and 112 eyes of 71 subjects diagnosed with keratoconus. A decision tree classifier was induced using the C4.5 algorithm, and its classification performance was compared with the modified Rabinowitz–McDonnell index, Schwiegerling’s Z3 index (Z3), Keratoconus Prediction Index (KPI), KISA%, and Cone Location and Magnitude Index using recommended classification thresholds for each method. We also evaluated the area under the receiver operator characteristic (ROC) curve for each classification method. Results Our decision tree classifier performed equal to or better than the other classifiers tested: accuracy was 92% and the area under the ROC curve was 0.97. Our decision tree classifier reduced the information needed to distinguish between normal and keratoconus eyes using four of 36 Zernike polynomial coefficients. The four surface features selected as classification attributes by the decision tree method were inferior elevation, greater sagittal depth, oblique toricity, and trefoil. Conclusions Automated decision tree classification of corneal shape through Zernike polynomials is an accurate quantitative method of classification that is interpretable and can be generated from any instrument platform capable of raw elevation data output. This method of pattern classification is extendable to other classification

  6. Selecting Relevant Descriptors for Classification by Bayesian Estimates: A Comparison with Decision Trees and Support Vector Machines Approaches for Disparate Data Sets.

    PubMed

    Carbon-Mangels, Miriam; Hutter, Michael C

    2011-10-01

    Classification algorithms suffer from the curse of dimensionality, which leads to overfitting, particularly if the problem is over-determined. Therefore it is of particular interest to identify the most relevant descriptors to reduce the complexity. We applied Bayesian estimates to model the probability distribution of descriptor values used for binary classification using n-fold cross-validation. As a measure for the discriminative power of the classifiers, the symmetric form of the Kullback-Leibler divergence of their probability distributions was computed. We found that the most relevant descriptors possess a Gaussian-like distribution of their values, show the largest divergences, and therefore appear most often in the cross-validation scenario. The results were compared to those of the LASSO feature selection method applied to multiple decision trees and support vector machine approaches for data sets of substrates and nonsubstrates of three Cytochrome P450 isoenzymes, which comprise strongly unbalanced compound distributions. In contrast to decision trees and support vector machines, the performance of Bayesian estimates is less affected by unbalanced data sets. This strategy reveals those descriptors that allow a simple linear separation of the classes, whereas the superior accuracy of decision trees and support vector machines can be attributed to nonlinear separation, which is in turn more prone to overfitting.
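
    The symmetric Kullback-Leibler divergence mentioned in this abstract can be sketched for two discrete (histogram) distributions. This is an illustration only: the abstract does not say which symmetric form was used, so the sum KL(p||q) + KL(q||p) is assumed here, and the two histograms are hypothetical.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats for discrete distributions over the same bins.

    Assumes q[i] > 0 wherever p[i] > 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def symmetric_kl(p, q):
    """Symmetric form, assumed here to be the sum of both directions."""
    return kl_divergence(p, q) + kl_divergence(q, p)


# Hypothetical histograms of one descriptor for the two classes.
substrates = [0.7, 0.2, 0.1]
nonsubstrates = [0.1, 0.2, 0.7]
print(f"symmetric KL = {symmetric_kl(substrates, nonsubstrates):.4f}")
```

    A descriptor whose class-conditional distributions barely overlap yields a large divergence, which matches the abstract's observation that the most relevant descriptors show the largest divergences.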

  7. Geometric tree kernels: classification of COPD from airway tree geometry.

    PubMed

    Feragen, Aasa; Petersen, Jens; Grimm, Dominik; Dirksen, Asger; Pedersen, Jesper Holst; Borgwardt, Karsten; de Bruijne, Marleen

    2013-01-01

    Methodological contributions: This paper introduces a family of kernels for analyzing (anatomical) trees endowed with vector valued measurements made along the tree. While state-of-the-art graph and tree kernels use combinatorial tree/graph structure with discrete node and edge labels, the kernels presented in this paper can include geometric information such as branch shape, branch radius or other vector valued properties. In addition to being flexible in their ability to model different types of attributes, the presented kernels are computationally efficient and some of them can easily be computed for large datasets (N ~ 10,000) of trees with 30-600 branches. Combining the kernels with standard machine learning tools enables us to analyze the relation between disease and anatomical tree structure and geometry. Experimental results: The kernels are used to compare airway trees segmented from low-dose CT, endowed with branch shape descriptors and airway wall area percentage measurements made along the tree. Using kernelized hypothesis testing we show that the geometric airway trees are significantly differently distributed in patients with Chronic Obstructive Pulmonary Disease (COPD) than in healthy individuals. The geometric tree kernels also give a significant increase in the classification accuracy of COPD from geometric tree structure endowed with airway wall thickness measurements in comparison with state-of-the-art methods, giving further insight into the relationship between airway wall thickness and COPD. Software: Software for computing kernels and statistical tests is available at http://image.diku.dk/aasa/software.php.

  8. Classification tree method for bacterial source tracking with antibiotic resistance analysis data.

    PubMed

    Price, Bertram; Venso, Elichia A; Frana, Mark F; Greenberg, Joshua; Ware, Adam; Currey, Lee

    2006-05-01

    Various statistical classification methods, including discriminant analysis, logistic regression, and cluster analysis, have been used with antibiotic resistance analysis (ARA) data to construct models for bacterial source tracking (BST). We applied the statistical method known as classification trees to build a model for BST for the Anacostia Watershed in Maryland. Classification trees have more flexibility than other statistical classification approaches based on standard statistical methods to accommodate complex interactions among ARA variables. This article describes the use of classification trees for BST and includes discussion of its principal parameters and features. Anacostia Watershed ARA data are used to illustrate the application of classification trees, and we report the BST results for the watershed.
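
    The core step of growing a classification tree is choosing a split. The sketch below searches one numeric feature for the threshold that minimizes weighted Gini impurity. The abstract does not name the splitting criterion used, so Gini (the usual CART choice) is an assumption, and the measurements and source labels are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means the node is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best split `value <= threshold`."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))  # no split unless it improves on the parent node
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # cannot separate identical values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = ((lo + hi) / 2, score)
    return best


# Hypothetical resistance measurements for six isolates from two sources.
measurements = [2, 3, 4, 10, 11, 12]
sources = ["human", "human", "human", "wildlife", "wildlife", "wildlife"]
threshold, impurity = best_split(measurements, sources)
print(threshold, impurity)
```

    A full tree repeats this search over all variables at each node and recurses on the two resulting subsets, which is how interactions among variables are picked up without being specified in advance.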

  9. The WHO classification of lymphomas: cost-effective immunohistochemistry using a deductive reasoning "decision tree" approach: part II: the decision tree approach: diffuse patterns of proliferation in lymph nodes.

    PubMed

    Taylor, Clive R

    2009-12-01

    The 2008 World Health Organization Classification of Tumors of the Haematopoietic and Lymphoid Tissues defines current standards of practice for the diagnosis and classification of malignant lymphomas and related entities. More than 50 different types of lymphomas are described. Faced with such a broad range of different lymphomas, some encountered only rarely, and a rapidly growing armamentarium of 80 or more pertinent immunohistochemical (IHC) "stains," the challenge to the pathologist is to use IHC in an efficient manner to arrive at an assured and timely diagnosis. This review uses deductive reasoning following a decision tree or dendrogram model, combining basic morphologic patterns and common IHC markers to classify node-based malignancies by the World Health Organization schema. The review is divided into 2 parts. The first, addressing those lymphomas that produce a follicular or nodular pattern of lymph nodal involvement, appeared in the previous issue of AIMM. The second part addresses diffuse proliferations in lymph nodes. Emphasis is given to the more common lymphomas and the more commonly available IHC "stains" for a pragmatic and practical approach that is both broadly feasible and cost-effective. By this method, an assured diagnosis may be reached in the majority of nodal lymphomas, at the same time developing a sufficiency of data to recognize those rare or atypical cases that require referral to a specialized center.

  10. Mapping trees outside forests using high-resolution aerial imagery: a comparison of pixel- and object based classification approaches

    Treesearch

    Dacia M. Meneguzzo; Greg C. Liknes; Mark D. Nelson

    2013-01-01

    Discrete trees and small groups of trees in nonforest settings are considered an essential resource around the world and are collectively referred to as trees outside forests (ToF). ToF provide important functions across the landscape, such as protecting soil and water resources, providing wildlife habitat, and improving farmstead energy efficiency and aesthetics....

  11. Fast Image Texture Classification Using Decision Trees

    NASA Technical Reports Server (NTRS)

    Thompson, David R.

    2011-01-01

    Texture analysis would permit improved autonomous, onboard science data interpretation for adaptive navigation, sampling, and downlink decisions. These analyses would assist with terrain analysis and instrument placement in both macroscopic and microscopic image data products. Unfortunately, most state-of-the-art texture analysis demands computationally expensive convolutions of filters involving many floating-point operations. This makes them infeasible for radiation-hardened computers and spaceflight hardware. A new method approximates traditional texture classification of each image pixel with a fast decision-tree classifier. The classifier uses image features derived from simple filtering operations involving integer arithmetic. The texture analysis method is therefore amenable to implementation on FPGA (field-programmable gate array) hardware. Image features based on the "integral image" transform produce descriptive and efficient texture descriptors. Training the decision tree on a set of training data yields a classification scheme that produces reasonable approximations of optimal "texton" analysis at a fraction of the computational cost. A decision-tree learning algorithm employing the traditional k-means criterion of inter-cluster variance is used to learn tree structure from training data. The result is an efficient and accurate summary of surface morphology in images. This work is an evolutionary advance that unites several previous algorithms (k-means clustering, integral images, decision trees) and applies them to a new problem domain (morphology analysis for autonomous science during remote exploration). Advantages include order-of-magnitude improvements in runtime, feasibility for FPGA hardware, and significant improvements in texture classification accuracy.
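
    The "integral image" transform named in this abstract is what makes box-filter features cheap in integer arithmetic. The following is a generic sketch of the technique, not the flight implementation; the 3x3 image is a hypothetical example.

```python
def integral_image(img):
    """Summed-area table with a zero border: ii[r][c] is the sum of img rows < r and columns < c."""
    rows, cols = len(img), len(img[0])
    ii = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0
        for c in range(cols):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def box_sum(ii, top, left, bottom, right):
    """Sum of pixels in rows [top, bottom) and columns [left, right), using four lookups."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
ii = integral_image(image)
print(box_sum(ii, 1, 1, 3, 3))  # lower-right 2x2 block: 5 + 6 + 8 + 9
```

    After one pass to build the table, the sum over any rectangle costs four lookups and three integer additions regardless of its size, which is why such features suit FPGA-class hardware.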

  12. Classification of posture and activities by using decision trees.

    PubMed

    Zhang, Ting; Tang, Wenlong; Sazonov, Edward S

    2012-01-01

    Obesity prevention and treatment, as well as healthy lifestyle recommendations, require the estimation of everyday physical activity. Monitoring posture allocations and activities with sensor systems is an effective method to achieve this goal. However, at present, most devices available rely on multiple sensors distributed on the body, which might be too obtrusive for everyday use. In this study, data was collected from a wearable shoe sensor system (SmartShoe) and a decision tree algorithm was applied for classification with high computational accuracy. The dataset was collected from 9 individual subjects performing 6 different activities--sitting, standing, walking, cycling, and stairs ascent/descent. Statistical features were calculated and classification was performed with a decision tree classifier, after which an advanced boosting algorithm was applied. The computational accuracy is as high as 98.85% without boosting, and 98.90% after boosting. Additionally, the simple tree structure provides a direct approach to simplify the feature set.

  13. Seasonal Effect on Tree Species Classification in an Urban Environment Using Hyperspectral Data, LiDAR, and an Object-Oriented Approach.

    PubMed

    Voss, Matthew; Sugumaran, Ramanathan

    2008-05-06

    The objective of the current study was to analyze the seasonal effect on differentiating tree species in an urban environment using multi-temporal hyperspectral data, Light Detection And Ranging (LiDAR) data, and a tree species database collected from the field. Two Airborne Imaging Spectrometer for Applications (AISA) hyperspectral images were collected, covering the Summer and Fall seasons. In order to make both datasets spatially and spectrally compatible, several preprocessing steps, including band reduction and a spatial degradation, were performed. An object-oriented classification was performed on both images using training data collected randomly from the tree species database. The seven dominant tree species (Gleditsia triacanthos, Acer saccharum, Tilia Americana, Quercus palustris, Pinus strobus and Picea glauca) were used in the classification. The results from this analysis did not show any major difference in overall accuracy between the two seasons. Overall accuracy was approximately 57% for the Summer dataset and 56% for the Fall dataset. However, the Fall dataset provided more consistent results for all tree species while the Summer dataset had a few higher individual class accuracies. Further, adding LiDAR into the classification improved the results by 19% for both fall and summer. This is mainly due to the removal of shadow effect and the addition of elevation data to separate low and high vegetation.

  14. Seasonal Effect on Tree Species Classification in an Urban Environment Using Hyperspectral Data, LiDAR, and an Object-Oriented Approach

    PubMed Central

    Voss, Matthew; Sugumaran, Ramanathan

    2008-01-01

    The objective of the current study was to analyze the seasonal effect on differentiating tree species in an urban environment using multi-temporal hyperspectral data, Light Detection And Ranging (LiDAR) data, and a tree species database collected from the field. Two Airborne Imaging Spectrometer for Applications (AISA) hyperspectral images were collected, covering the Summer and Fall seasons. In order to make both datasets spatially and spectrally compatible, several preprocessing steps, including band reduction and a spatial degradation, were performed. An object-oriented classification was performed on both images using training data collected randomly from the tree species database. The seven dominant tree species (Gleditsia triacanthos, Acer saccharum, Tilia Americana, Quercus palustris, Pinus strobus and Picea glauca) were used in the classification. The results from this analysis did not show any major difference in overall accuracy between the two seasons. Overall accuracy was approximately 57% for the Summer dataset and 56% for the Fall dataset. However, the Fall dataset provided more consistent results for all tree species while the Summer dataset had a few higher individual class accuracies. Further, adding LiDAR into the classification improved the results by 19% for both fall and summer. This is mainly due to the removal of shadow effect and the addition of elevation data to separate low and high vegetation. PMID:27879863

  15. Guide to the measurement of tree characteristics important to the quality classification for young hardwood trees

    Treesearch

    David L. Sonderman

    1979-01-01

    A procedure is shown for measuring external tree characteristics that are important in determining the current and future quality of young hardwood trees. This guide supplements a previous study which describes the quality classification system for young hardwood trees

  16. A new tree classification system for southern hardwoods

    Treesearch

    James S. Meadows; Daniel A. Jr. Skojac

    2008-01-01

    A new tree classification system for southern hardwoods is described. The new system is based on the Putnam tree classification system, originally developed by Putnam et al., 1960, Management and inventory of southern hardwoods, Agriculture Handbook 181, US For. Serv., Washington, DC, which consists of four tree classes: (1) preferred growing stock, (2) reserve growing...

  17. Voxel classification based airway tree segmentation

    NASA Astrophysics Data System (ADS)

    Lo, Pechin; de Bruijne, Marleen

    2008-03-01

    This paper presents a voxel classification based method for segmenting the human airway tree in volumetric computed tomography (CT) images. In contrast to standard methods that use only voxel intensities, our method uses a more complex appearance model based on a set of local image appearance features and Kth nearest neighbor (KNN) classification. The optimal set of features for classification is selected automatically from a large set of features describing the local image structure at several scales. The use of multiple features enables the appearance model to differentiate between airway tree voxels and other voxels of similar intensities in the lung, thus making the segmentation robust to pathologies such as emphysema. The classifier is trained on imperfect segmentations that can easily be obtained using region growing with a manual threshold selection. Experiments show that the proposed method results in a more robust segmentation that can grow into the smaller airway branches without leaking into emphysematous areas, and is able to segment many branches that are not present in the training set.
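
    The nearest-neighbor classification step can be sketched as a majority vote among the closest training samples in feature space. This is a generic illustration, not the authors' pipeline: the two features per voxel (intensity and a gradient measure), their values, and the choice k=3 are all hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Label a query feature vector by majority vote among its k nearest training samples."""
    neighbours = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]


# Hypothetical voxel features: (intensity, gradient magnitude).
train_features = [(-950, 5), (-940, 8), (-960, 6), (-700, 40), (-720, 35), (-690, 45)]
train_labels = ["airway", "airway", "airway", "other", "other", "other"]

print(knn_predict(train_features, train_labels, (-945, 7)))
print(knn_predict(train_features, train_labels, (-710, 38)))
```

    In practice the features would be normalized so that no single one dominates the distance, and the vote fraction could serve as a soft airway probability rather than a hard label.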

  18. Semi-supervised SVM for individual tree crown species classification

    NASA Astrophysics Data System (ADS)

    Dalponte, Michele; Ene, Liviu Theodor; Marconcini, Mattia; Gobakken, Terje; Næsset, Erik

    2015-12-01

    In this paper a novel semi-supervised SVM classifier is presented, specifically developed for tree species classification at individual tree crown (ITC) level. In ITC tree species classification, all the pixels belonging to an ITC should have the same label. This assumption is used in the learning of the proposed semi-supervised SVM classifier (ITC-S3VM). This method exploits the information contained in the unlabeled ITC samples in order to improve the classification accuracy of a standard SVM. The ITC-S3VM method can be easily implemented using freely available software libraries. The datasets used in this study include hyperspectral imagery and laser scanning data acquired over two boreal forest areas characterized by the presence of three information classes (Pine, Spruce, and Broadleaves). The experimental results quantify the effectiveness of the proposed approach, which provides classification accuracies significantly higher (from 2% to above 27%) than those obtained by the standard supervised SVM and by a state-of-the-art semi-supervised SVM (S3VM). Particularly, by reducing the number of training samples (i.e. from 100% to 25%, and from 100% to 5% for the two datasets, respectively) the proposed method still exhibits results comparable to the ones of a supervised SVM trained with the full available training set. This property of the method makes it particularly suitable for practical forest inventory applications in which collection of in situ information can be very expensive both in terms of cost and time.

  19. Prediction of healthy blood with data mining classification by using Decision Tree, Naive Baysian and SVM approaches

    NASA Astrophysics Data System (ADS)

    Khalilinezhad, Mahdieh; Minaei, Behrooz; Vernazza, Gianni; Dellepiane, Silvana

    2015-03-01

    Data mining (DM) is the process of discovering knowledge from large databases. Applications of data mining in Blood Transfusion Organizations could be useful for improving the performance of blood donation services. The aim of this research is to predict the healthiness of blood donors in a Blood Transfusion Organization (BTO). For this goal, three well-known algorithms, Decision Tree C4.5, the Naïve Bayesian classifier, and the Support Vector Machine, have been chosen and applied to a real database of 11006 donors. Seven fields, namely sex, age, job, education, marital status, type of donor, and results of blood tests (doctors' comments and lab results about healthy or unhealthy blood donors), have been selected as input to these algorithms. The results of the three algorithms have been compared and an error cost analysis has been performed. According to this research and the obtained results, the best algorithm, with low error cost and high accuracy, is SVM. This research helps the BTO build a model of blood donors in each area in order to predict whether donors' blood is healthy or unhealthy. This research could be useful if used in parallel with laboratory tests to better separate unhealthy blood.

  20. Classification Tree Method for Bacterial Source Tracking with Antibiotic Resistance Analysis Data

    PubMed Central

    Price, Bertram; Venso, Elichia A.; Frana, Mark F.; Greenberg, Joshua; Ware, Adam; Currey, Lee

    2006-01-01

    Various statistical classification methods, including discriminant analysis, logistic regression, and cluster analysis, have been used with antibiotic resistance analysis (ARA) data to construct models for bacterial source tracking (BST). We applied the statistical method known as classification trees to build a model for BST for the Anacostia Watershed in Maryland. Classification trees have more flexibility than other statistical classification approaches based on standard statistical methods to accommodate complex interactions among ARA variables. This article describes the use of classification trees for BST and includes discussion of its principal parameters and features. Anacostia Watershed ARA data are used to illustrate the application of classification trees, and we report the BST results for the watershed. PMID:16672492

  1. Evaluating multimedia chemical persistence: Classification and regression tree analysis

    SciTech Connect

    Bennett, D.H.; McKone, T.E.; Kastenberg, W.E.

    2000-04-01

    For the thousands of chemicals continuously released into the environment, it is desirable to make prospective assessments of those likely to be persistent. Widely distributed persistent chemicals are impossible to remove from the environment and remediation by natural processes may take decades, which is problematic if adverse health or ecological effects are discovered after prolonged release into the environment. A tiered approach using a classification scheme and a multimedia model for determining persistence is presented. Using specific criteria for persistence, a classification tree is developed to classify a chemical as persistent or nonpersistent based on the chemical properties. In this approach, the classification is derived from the results of a standardized unit world multimedia model. Thus, the classifications are more robust for multimedia pollutants than classifications using a single medium half-life. The method can be readily implemented and provides insight without requiring extensive and often unavailable data. This method can be used to classify chemicals when only a few properties are known and can be used to direct further data collection. Case studies are presented to demonstrate the advantages of the approach.

  2. Tree classification with fused mobile laser scanning and hyperspectral data.

    PubMed

    Puttonen, Eetu; Jaakkola, Anttoni; Litkey, Paula; Hyyppä, Juha

    2011-01-01

    Mobile Laser Scanning data were collected simultaneously with hyperspectral data using the Finnish Geodetic Institute Sensei system. The data were tested for tree species classification. The test area was an urban garden in the City of Espoo, Finland. Point clouds representing 168 individual tree specimens of 23 tree species were determined manually. The classification of the trees was done using first only the spatial data from point clouds, then with only the spectral data obtained with a spectrometer, and finally with the combined spatial and hyperspectral data from both sensors. Two classification tests were performed: the separation of coniferous and deciduous trees, and the identification of individual tree species. All determined tree specimens were used in distinguishing coniferous and deciduous trees. A subset of 133 trees and 10 tree species was used in the tree species classification. The best classification results for the fused data were 95.8% for the separation of the coniferous and deciduous classes. The best overall tree species classification succeeded with 83.5% accuracy for the best tested fused data feature combination. The respective results for paired structural features derived from the laser point cloud were 90.5% for the separation of the coniferous and deciduous classes and 65.4% for the species classification. Classification accuracies with paired hyperspectral reflectance value data were 90.5% for the separation of coniferous and deciduous classes and 62.4% for different species. The results are among the first of their kind and they show that mobile collected fused data outperformed single-sensor data in both classification tests and by a significant margin.

  3. Tree Classification with Fused Mobile Laser Scanning and Hyperspectral Data

    PubMed Central

    Puttonen, Eetu; Jaakkola, Anttoni; Litkey, Paula; Hyyppä, Juha

    2011-01-01

    Mobile Laser Scanning data were collected simultaneously with hyperspectral data using the Finnish Geodetic Institute Sensei system. The data were tested for tree species classification. The test area was an urban garden in the City of Espoo, Finland. Point clouds representing 168 individual tree specimens of 23 tree species were determined manually. The classification of the trees was done using first only the spatial data from point clouds, then with only the spectral data obtained with a spectrometer, and finally with the combined spatial and hyperspectral data from both sensors. Two classification tests were performed: the separation of coniferous and deciduous trees, and the identification of individual tree species. All determined tree specimens were used in distinguishing coniferous and deciduous trees. A subset of 133 trees and 10 tree species was used in the tree species classification. The best classification results for the fused data were 95.8% for the separation of the coniferous and deciduous classes. The best overall tree species classification succeeded with 83.5% accuracy for the best tested fused data feature combination. The respective results for paired structural features derived from the laser point cloud were 90.5% for the separation of the coniferous and deciduous classes and 65.4% for the species classification. Classification accuracies with paired hyperspectral reflectance value data were 90.5% for the separation of coniferous and deciduous classes and 62.4% for different species. The results are among the first of their kind and they show that mobile collected fused data outperformed single-sensor data in both classification tests and by a significant margin. PMID:22163894

  4. Sensitivity of missing values in classification tree for large sample

    NASA Astrophysics Data System (ADS)

    Hasan, Norsida; Adam, Mohd Bakri; Mustapha, Norwati; Abu Bakar, Mohd Rizam

    2012-05-01

    Missing values, whether in predictor or in response variables, are a very common problem in statistics and data mining. Cases with missing values are often ignored, which results in loss of information and possible bias. The objective of our research was to investigate the sensitivity of the classification tree model to missing data for large samples. Data were obtained from one of the high-level educational institutions in Malaysia. Students' background data were randomly eliminated and a classification tree was used to predict students' degree classification. The results showed that, for large samples, the structure of the classification tree was sensitive to missing values, especially for samples containing more than ten percent missing values.

  5. A Mixtures-of-Trees Framework for Multi-Label Classification

    PubMed Central

    Hong, Charmgil; Batal, Iyad; Hauskrecht, Milos

    2015-01-01

    We propose a new probabilistic approach for multi-label classification that aims to represent the class posterior distribution P(Y|X). Our approach uses a mixture of tree-structured Bayesian networks, which can leverage the computational advantages of conditional tree-structured models and the abilities of mixtures to compensate for tree-structured restrictions. We develop algorithms for learning the model from data and for performing multi-label predictions using the learned model. Experiments on multiple datasets demonstrate that our approach outperforms several state-of-the-art multi-label classification methods. PMID:25927011

  6. [Automatic classification method of star spectrum data based on classification pattern tree].

    PubMed

    Zhao, Xu-Jun; Cai, Jiang-Hui; Zhang, Ji-Fu; Yang, Hai-Feng; Ma, Yang

    2013-10-01

    Frequent patterns, which appear frequently in a data set, play an important role in data mining. For stellar spectrum classification tasks, a classification rule mining method based on a classification pattern tree is presented on the basis of frequent patterns. The procedure is as follows. First, a new tree structure, i.e., the classification pattern tree, is introduced based on the different frequencies of stellar spectral attributes in the database and their different importance for classification. The related concepts and the construction method of the classification pattern tree are also described in this paper. Then, the characteristics of the stellar spectrum are mapped to the classification pattern tree. Two traversal modes, top-down and bottom-up, are used to traverse the classification pattern tree and extract the classification rules. Meanwhile, the concept of pattern capability is introduced to adjust the number of classification rules and improve the construction efficiency of the classification pattern tree. Finally, the SDSS (Sloan Digital Sky Survey) stellar spectral data provided by the National Astronomical Observatory are used to verify the accuracy of the method. The results show that a higher classification accuracy has been achieved.

  7. Watershed Merge Tree Classification for Electron Microscopy Image Segmentation

    SciTech Connect

    Liu, Ting; Jurrus, Elizabeth R.; Seyedhosseini, Mojtaba; Ellisman, Mark; Tasdizen, Tolga

    2012-11-11

    Automated segmentation of electron microscopy (EM) images is a challenging problem. In this paper, we present a novel method that utilizes a hierarchical structure and boundary classification for 2D neuron segmentation. With a membrane detection probability map, a watershed merge tree is built for the representation of hierarchical region merging from the watershed algorithm. A boundary classifier is learned with non-local image features to predict each potential merge in the tree, upon which merge decisions are made with consistency constraints in the sense of optimization to acquire the final segmentation. Independent of classifiers and decision strategies, our approach proposes a general framework for efficient hierarchical segmentation with statistical learning. We demonstrate that our method leads to a substantial improvement in segmentation accuracy.

  8. Classification of Liss IV Imagery Using Decision Tree Methods

    NASA Astrophysics Data System (ADS)

    Verma, Amit Kumar; Garg, P. K.; Prasad, K. S. Hari; Dadhwal, V. K.

    2016-06-01

    Image classification is a compulsory step in any remote sensing research. Classification uses the spectral information represented by the digital numbers in one or more spectral bands and attempts to classify each individual pixel based on this spectral information. Crop classification is a main concern of remote sensing applications for developing sustainable agriculture systems. Vegetation indices computed from satellite images give a good indication of the presence of vegetation; such an index describes the greenness, density and health of vegetation. Texture is also an important characteristic that is used to identify objects or regions of interest in an image. This paper illustrates the use of the decision tree method to classify land into crop land and non-crop land and to classify different crops. We evaluate the possibility of crop classification using an integrated approach based on texture properties combined with different vegetation indices for single-date LISS IV sensor data with 5.8 m spatial resolution. In the first approach, eleven vegetation indices (NDVI, DVI, GEMI, GNDVI, MSAVI2, NDWI, NG, NR, NNIR, OSAVI and VI green) are generated using the green, red and NIR bands, and the image is then classified using the decision tree method. The second approach integrates texture features (mean, variance, kurtosis and skewness) with these vegetation indices. The two methods are compared. The results indicate that including textural features with vegetation indices can produce classified maps with 8.33% higher accuracy for Indian satellite IRS-P6, LISS IV sensor images.
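
    A few of the indices named above have simple closed forms: NDVI = (NIR - Red) / (NIR + Red), DVI = NIR - Red, and GNDVI = (NIR - Green) / (NIR + Green). The sketch below computes them for one pixel and applies a single-threshold rule as a stand-in for a decision tree node. The reflectance values, the 0.3 threshold, and the function names are illustrative assumptions; the abstract does not report the tree actually learned in the study.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    total = nir + red
    return (nir - red) / total if total != 0 else 0.0


def dvi(nir, red):
    """Difference Vegetation Index: NIR - Red."""
    return nir - red


def gndvi(nir, green):
    """Green NDVI: (NIR - Green) / (NIR + Green)."""
    total = nir + green
    return (nir - green) / total if total != 0 else 0.0


def classify_pixel(green, red, nir, ndvi_threshold=0.3):
    """Hypothetical one-node decision rule; the threshold is an assumption."""
    return "crop land" if ndvi(nir, red) > ndvi_threshold else "non-crop land"


# Illustrative reflectances for a vegetated pixel and a bare-soil pixel.
vegetated = {"green": 0.08, "red": 0.05, "nir": 0.45}
bare_soil = {"green": 0.25, "red": 0.30, "nir": 0.35}
print(classify_pixel(**vegetated), classify_pixel(**bare_soil))
```

    A real workflow would compute every index (and the texture features) per pixel and let the tree-induction algorithm choose the thresholds from training data.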

  9. Trees of trees: an approach to comparing multiple alternative phylogenies.

    PubMed

    Nye, Tom M W

    2008-10-01

    Phylogenetic analysis very commonly produces several alternative trees for a given fixed set of taxa. For example, different sets of orthologous genes may be analyzed, or the analysis may sample from a distribution of probable trees. This article describes an approach to comparing and visualizing multiple alternative phylogenies via the idea of a "tree of trees" or "meta-tree." A meta-tree clusters phylogenies with similar topologies together in the same way that a phylogeny clusters species with similar DNA sequences. Leaf nodes on a meta-tree correspond to the original set of phylogenies given by some analysis, whereas interior nodes correspond to certain consensus topologies. The construction of meta-trees is motivated by analogy with construction of a most parsimonious tree for DNA data, but instead of using DNA letters, in a meta-tree the characters are partitions or splits of the set of taxa. An efficient algorithm for meta-tree construction is described that makes use of a known relationship between the majority consensus and parsimony in terms of gain and loss of splits. To illustrate these ideas meta-trees are constructed for two datasets: a set of gene trees for species of yeast and trees from a bootstrap analysis of a set of gene trees in ray-finned fish. A software tool for constructing meta-trees and comparing alternative phylogenies is available online, and the source code can be obtained from the author.

  10. Binary tree of posterior probability support vector machines for hyperspectral image classification

    NASA Astrophysics Data System (ADS)

    Wang, Dongli; Zhou, Yan; Zheng, Jianguo

    2011-01-01

    The problem of hyperspectral remote sensing image classification is revisited using posterior probability support vector machines (PPSVMs). To address the multiclass classification problem, PPSVMs are extended using a binary tree structure and boosting, with the Fisher ratio as the class separability measure. The class pair with the larger Fisher ratio separability measure is separated at upper nodes of the binary tree to optimize the structure of the tree and improve the classification accuracy. Two approaches are proposed to select the class pair and construct the binary tree. One is the so-called some-against-rest binary tree of PPSVMs (SBT), in which some classes are separated from the remaining classes at each node considering the Fisher ratio separability measure. In the other approach, named one-against-rest binary tree of PPSVMs (OBT), only one class is separated from the remaining classes at each node. Both approaches need to train only n - 1 binary PPSVM classifiers (n is the number of classes), while the average convergence performance of SBT and OBT is O(log₂ n) and O[(n! - 1)/n], respectively. Experimental results show that both approaches obtain classification accuracy that is, if not higher, at least comparable to that of other multiclass approaches, while using significantly fewer support vectors and reduced testing time.
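
    The one-against-rest construction can be sketched for a single feature. Here the Fisher ratio of two samples is taken as the squared difference of means divided by the sum of variances, and at each node the class best separated from the remaining classes is split off, giving n - 1 nodes for n classes. The one-dimensional data, the greedy ordering, and the function names are assumptions for illustration; the abstract does not give the paper's multiband formulation, and no SVM is trained here.

```python
from statistics import mean, pvariance


def fisher_ratio(a, b):
    """Squared mean difference over summed variances for two 1-D samples."""
    denom = pvariance(a) + pvariance(b)
    if denom == 0:
        return float("inf")
    return (mean(a) - mean(b)) ** 2 / denom


def one_against_rest_order(samples):
    """Greedy node order: split off the class most separable from the rest.

    `samples` maps a class label to a list of feature values. The returned
    list has n - 1 entries, one per binary classifier in the tree.
    """
    remaining = dict(samples)
    order = []
    while len(remaining) > 1:
        def score(label):
            rest = [x for k, v in remaining.items() if k != label for x in v]
            return fisher_ratio(remaining[label], rest)

        best = max(remaining, key=score)
        order.append(best)
        del remaining[best]
    return order


# Illustrative single-band values for three classes.
samples = {
    "water": [0.10, 0.20, 0.15],
    "soil": [0.50, 0.55, 0.60],
    "crop": [0.62, 0.70, 0.66],
}
print(one_against_rest_order(samples))
```

    In this toy data "water" is far from the other two classes, so it is separated at the root, and the harder soil/crop decision is pushed to the lower node.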

  11. Classification Based on Tree-Structured Allocation Rules

    ERIC Educational Resources Information Center

    Vaughn, Brandon K.; Wang, Qui

    2008-01-01

    The authors consider the problem of classifying an unknown observation into 1 of several populations by using tree-structured allocation rules. Although many parametric classification procedures are robust to certain assumption violations, there is need for classification procedures that can be used regardless of the group-conditional…

  12. Classification Based on Tree-Structured Allocation Rules

    ERIC Educational Resources Information Center

    Vaughn, Brandon K.; Wang, Qui

    2008-01-01

    The authors consider the problem of classifying an unknown observation into 1 of several populations by using tree-structured allocation rules. Although many parametric classification procedures are robust to certain assumption violations, there is need for classification procedures that can be used regardless of the group-conditional…

  13. Urban Tree Classification Using Full-Waveform Airborne Laser Scanning

    NASA Astrophysics Data System (ADS)

    Koma, Zs.; Koenig, K.; Höfle, B.

    2016-06-01

    Vegetation mapping in urban environments plays an important role in biological research and urban management. Airborne laser scanning provides detailed 3D geodata, which allows single trees to be classified into different taxa. Until now, research dealing with tree classification focused on forest environments. This study investigates the object-based classification of urban trees at taxonomic family level, using full-waveform airborne laser scanning data captured in the city centre of Vienna (Austria). The data set is characterised by a variety of taxa, including deciduous trees (beeches, mallows, plane trees and soapberries) and the coniferous pine species. A workflow for tree object classification is presented using geometric and radiometric features. The derived features are related to point density, crown shape and radiometric characteristics. For the derivation of crown features, a prior detection of the crown base is performed. The effects of interfering objects (e.g. fences and cars which are typical in urban areas) on the feature characteristics and the subsequent classification accuracy are investigated. The applicability of the features is evaluated by Random Forest classification and exploratory analysis. The most reliable classification is achieved by using the combination of geometric and radiometric features, resulting in 87.5% overall accuracy. By using radiometric features only, a reliable classification with accuracy of 86.3% can be achieved. The influence of interfering objects on feature characteristics is identified, in particular for the radiometric features. The results indicate the potential of using radiometric features in urban tree classification and show its limitations due to anthropogenic influences at the same time.

  14. Decision tree methods: applications for classification and prediction

    PubMed Central

    SONG, Yan-yan; LU, Ying

    2015-01-01

    Summary: Decision tree methodology is a commonly used data mining method for establishing classification systems based on multiple covariates or for developing prediction algorithms for a target variable. This method classifies a population into branch-like segments that construct an inverted tree with a root node, internal nodes, and leaf nodes. The algorithm is non-parametric and can efficiently deal with large, complicated datasets without imposing a complicated parametric structure. When the sample size is large enough, study data can be divided into training and validation datasets: the training dataset is used to build a decision tree model, and the validation dataset is used to decide on the appropriate tree size needed to achieve the optimal final model. This paper introduces algorithms frequently used to develop decision trees (including CART, C4.5, CHAID, and QUEST) and describes the SPSS and SAS programs that can be used to visualize tree structure. PMID:26120265

  15. Boosted classification trees result in minor to modest improvement in the accuracy in classifying cardiovascular outcomes compared to conventional classification trees

    PubMed Central

    Austin, Peter C; Lee, Douglas S

    2011-01-01

    Purpose: Classification trees are increasingly being used to classify patients according to the presence or absence of a disease or health outcome. A limitation of classification trees is their limited predictive accuracy. In the data-mining and machine learning literature, boosting has been developed to improve classification. Boosting with classification trees iteratively grows classification trees in a sequence of reweighted datasets. In a given iteration, subjects that were misclassified in the previous iteration are weighted more highly than subjects that were correctly classified. Classifications from each of the classification trees in the sequence are combined through a weighted majority vote to produce a final classification. The authors' objective was to examine whether boosting improved the accuracy of classification trees for predicting outcomes in cardiovascular patients. Methods: We examined the utility of boosting classification trees for classifying 30-day mortality outcomes in patients hospitalized with either acute myocardial infarction or congestive heart failure. Results: Improvements in the misclassification rate using boosted classification trees were at best minor compared to when conventional classification trees were used. Minor to modest improvements to sensitivity were observed, with only a negligible reduction in specificity. For predicting cardiovascular mortality, boosted classification trees had high specificity, but low sensitivity. Conclusions: Gains in predictive accuracy for predicting cardiovascular outcomes were less impressive than gains in performance observed in the data mining literature. PMID:22254181
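
    The reweighting and weighted-vote steps described above can be sketched with an AdaBoost-style loop. This is a minimal illustration that assumes one-level trees (threshold stumps) on a single numeric predictor and labels coded as +1/-1; the toy data and function names are hypothetical, and the abstract does not state which boosting variant or tree depth the study used.

```python
import math


def stump_predict(x, threshold, polarity):
    """One-level tree: predict `polarity` when x >= threshold, else its opposite."""
    return polarity if x >= threshold else -polarity


def best_stump(xs, ys, weights):
    """Return (weighted_error, threshold, polarity) with the lowest weighted error."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            err = sum(
                w for x, y, w in zip(xs, ys, weights)
                if stump_predict(x, threshold, polarity) != y
            )
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best


def adaboost(xs, ys, rounds=3):
    """Grow stumps on successively reweighted data; return [(alpha, threshold, polarity)]."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, threshold, polarity = best_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)  # vote weight of this stump
        ensemble.append((alpha, threshold, polarity))
        # Misclassified subjects gain weight, correctly classified ones lose it.
        weights = [
            w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
            for x, y, w in zip(xs, ys, weights)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Weighted majority vote over all stumps."""
    score = sum(a * stump_predict(x, t, p) for a, t, p in ensemble)
    return 1 if score >= 0 else -1


# Toy data that no single threshold can separate.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
model = adaboost(xs, ys, rounds=3)
print([predict(model, x) for x in xs])
```

    On this toy data a single stump misclassifies three points, while the three-stump vote fits all ten. That is a training-set effect only and says nothing about the out-of-sample gains examined in the study.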

  16. Accelerating protein classification using suffix trees.

    PubMed

    Dorohonceanu, B; Nevill-Manning, C G

    2000-01-01

    Position-specific scoring matrices have been used extensively to recognize highly conserved protein regions. We present a method for accelerating these searches using a suffix tree data structure computed from the sequences to be searched. Building on earlier work that allows evaluation of a scoring matrix to be stopped early, the suffix tree-based method excludes many protein segments from consideration at once by pruning entire subtrees. Although suffix trees are usually expensive in space, the fact that scoring matrix evaluation requires an in-order traversal allows nodes to be stored more compactly without loss of speed, and our implementation requires only 17 bytes of primary memory per input symbol. Searches are accelerated by up to a factor of ten.

  17. Using classification trees to profile adolescent smoking behaviors.

    PubMed

    Kitsantas, Panagiota; Moore, Trent W; Sly, David F

    2007-01-01

    The purpose of this study was to explore the interactive nature of various predictor variables in profiling adolescent smoking behaviors characterized by intention to smoke, current, situational, and established smoking using classification trees. The data (n = 3610) were obtained from cross-sectional telephone surveys of the Florida Anti-Tobacco Media Evaluation Program. Three classification trees were constructed, namely, intention versus no intention to smoke among non-smokers, current smokers versus non-smokers, and established versus situational smokers. The tree model for the intention model revealed that social and health risks are important in the context of peer smoking. Certain variables such as peer smoking and alcohol consumption retained their relative importance across the tree classifiers demonstrating that smoking intention may be predictable using some of the same variables as in current or more dependent smoking.

  18. Comparing Hydrogeomorphic Approaches to Lake Classification

    NASA Astrophysics Data System (ADS)

    Martin, Sherry L.; Soranno, Patricia A.; Bremigan, Mary T.; Cheruvelil, Kendra S.

    2011-11-01

    A classification system is often used to reduce the number of different ecosystem types that governmental agencies are charged with monitoring and managing. We compare the ability of several different hydrogeomorphic (HGM)-based classifications to group lakes for water chemistry/clarity. We ask: (1) Which approach to lake classification is most successful at classifying lakes for similar water chemistry/clarity? (2) Which HGM features are most strongly related to the lake classes? and (3) Can a single classification successfully classify lakes for all of the water chemistry/clarity variables examined? We use univariate and multivariate classification and regression tree (CART and MvCART) analysis of HGM features to classify alkalinity, water color, Secchi, total nitrogen, total phosphorus, and chlorophyll a from 151 minimally disturbed lakes in Michigan, USA. We developed two MvCART models overall and two CART models for each water chemistry/clarity variable, in each case comparing local HGM characteristics alone with local HGM characteristics combined with regionalizations and landscape position. The combined CART models had the highest strength of evidence (ωi range 0.92-1.00) and maximized within class homogeneity (ICC range 36-66%) for all water chemistry/clarity variables except water color and chlorophyll a. Because the most successful single classification was on average 20% less successful in classifying other water chemistry/clarity variables, we found that no single classification captures variability for all lake responses tested. Therefore, we suggest that the most successful classification (1) is specific to individual response variables, and (2) incorporates information from multiple spatial scales (regionalization and local HGM variables).

  19. Market-based approaches to tree valuation

    Treesearch

    Geoffrey H. Donovan; David T. Butry

    2008-01-01

    A recent four-part series in Arborist News outlined different appraisal processes used to value urban trees. The final article in the series described the three generally accepted approaches to tree valuation: the sales comparison approach, the cost approach, and the income capitalization approach. The author, D. Logan Nelson, noted that the sales comparison approach...

  20. Automated Method for Identification and Artery-Venous Classification of Vessel Trees in Retinal Vessel Networks

    PubMed Central

    Joshi, Vinayak S.; Reinhardt, Joseph M.; Garvin, Mona K.; Abramoff, Michael D.

    2014-01-01

    The separation of the retinal vessel network into distinct arterial and venous vessel trees is of high interest. We propose an automated method for identification and separation of retinal vessel trees in a retinal color image by converting a vessel segmentation image into a vessel segment map and identifying the individual vessel trees by graph search. Orientation, width, and intensity of each vessel segment are utilized to find the optimal graph of vessel segments. The separated vessel trees are labeled as primary vessel or branches. We utilize the separated vessel trees for arterial-venous (AV) classification, based on the color properties of the vessels in each tree graph. We applied our approach to a dataset of 50 fundus images from 50 subjects. The proposed method resulted in an accuracy of 91.44% for vessel pixels correctly classified as either artery or vein. The accuracy of correctly classified major vessel segments was 96.42%. PMID:24533066

  1. Growth in Mathematics Achievement: Analysis with Classification and Regression Trees

    ERIC Educational Resources Information Center

    Ma, Xin

    2005-01-01

    A recently developed statistical technique, often referred to as classification and regression trees (CART), holds great potential for researchers to discover how student-level (and school-level) characteristics interactively affect growth in mathematics achievement. CART is a host of advanced statistical methods that statistically cluster…

  2. Combining QuickBird, LiDAR, and GIS topography indices to identify a single native tree species in a complex landscape using an object-based classification approach

    NASA Astrophysics Data System (ADS)

    Pham, Lien T. H.; Brabyn, Lars; Ashraf, Salman

    2016-08-01

    There is now a wide range of techniques that can be combined for image analysis. These include the use of object-based classifications rather than pixel-based classifiers, the use of LiDAR to determine vegetation height and vertical structure, as well as terrain variables such as topographic wetness index and slope that can be calculated using GIS. This research investigates the benefits of combining these techniques to identify individual tree species. A QuickBird image and low point density LiDAR data for a coastal region in New Zealand were used to examine the possibility of mapping Pohutukawa trees, which are regarded as an iconic tree in New Zealand. The study area included a mix of buildings and vegetation types. After image and LiDAR preparation, single tree objects were identified using a range of techniques including: a threshold of above ground height to eliminate ground based objects; Normalised Difference Vegetation Index and elevation difference between the first and last return of LiDAR data to distinguish vegetation from buildings; geometric information to separate clusters of trees from single trees; and treetop identification and region growing techniques to separate tree clusters into single tree crowns. Important feature variables were identified using Random Forest, and the Support Vector Machine provided the classification. The combined techniques using LiDAR and spectral data produced an overall accuracy of 85.4% (Kappa 80.6%). Classification using just the spectral data produced an overall accuracy of 75.8% (Kappa 67.8%). The research findings demonstrate how combining LiDAR and spectral data improves classification for Pohutukawa trees.

  3. A Section-based Method For Tree Species Classification Using Airborne LiDAR Discrete Points In Urban Areas

    NASA Astrophysics Data System (ADS)

    Chunjing, Y. C.; Hui, T.; Zhongjie, R.; Guikai, B.

    2015-12-01

    As a new approach to forest inventory, LiDAR remote sensing has become an important research topic. LiDAR research initially concentrated on mapping forests at the tree level and identifying important structural parameters, such as tree height, crown size, crown base height, individual tree species, and stem volume. For virtual city visualization and mapping, however, the traditional methods of tree classification cannot satisfy the more complex conditions. Recently, advances in LiDAR technology have produced new full-waveform scanners that provide a higher point density and additional information about the reflecting characteristics of trees. Subsequently, it was demonstrated that it is feasible to detect individual overstorey trees in forests and classify their species. However, important steps such as the calibration and the decomposition of full-waveform data with a series of Gaussian functions usually take a lot of work. What is more, vegetation detection and classification results rely heavily on the outcomes of these prior steps. Therefore, a section-based method for tree species classification using small-footprint, high-sampling-density LiDAR data is proposed in this paper, which can overcome the tree species classification issues in urban areas. More specific objectives are to: (1) use local maximum height decision and four-direction section certification methods to get the precise locations of the trees; (2) develop new LiDAR-derived feature processing techniques for characterizing the section structure of individual tree crowns; (3) investigate several techniques for filtering and analyzing vertical profiles of individual trees to classify the trees, using expert decision skills based on percentile analysis; (4) assess the accuracy of estimating tree species for each tree; and (5) investigate which type of LiDAR data, point frequency or intensity, provides the most accurate estimate of tree species.

  4. Multiple Spectral-Spatial Classification Approach for Hyperspectral Data

    NASA Technical Reports Server (NTRS)

    Tarabalka, Yuliya; Benediktsson, Jon Atli; Chanussot, Jocelyn; Tilton, James C.

    2010-01-01

    A new multiple classifier approach for spectral-spatial classification of hyperspectral images is proposed. Several classifiers are used independently to classify an image. For every pixel, if all the classifiers have assigned this pixel to the same class, the pixel is kept as a marker, i.e., a seed of the spatial region, with the corresponding class label. We propose to use spectral-spatial classifiers at the preliminary step of the marker selection procedure, each of them combining the results of a pixel-wise classification and a segmentation map. Different segmentation methods based on dissimilar principles lead to different classification results. Furthermore, a minimum spanning forest is built, where each tree is rooted on a classification-driven marker and forms a region in the spectral-spatial classification map. Experimental results are presented for two hyperspectral airborne images. The proposed method significantly improves classification accuracies when compared to previously proposed classification techniques.
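
    The marker selection rule described above (keep a pixel only when every classifier assigns it the same class) can be sketched as follows. The tiny 2 x 3 label maps and the function name are illustrative assumptions; building the minimum spanning forest from the markers is not shown.

```python
def select_markers(label_maps):
    """Keep a pixel's label only where all classifiers agree; otherwise None.

    `label_maps` is a list of equally sized 2-D lists of class labels,
    one map per classifier.
    """
    rows, cols = len(label_maps[0]), len(label_maps[0][0])
    markers = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            votes = {label_map[r][c] for label_map in label_maps}
            if len(votes) == 1:  # unanimous decision: pixel becomes a seed
                markers[r][c] = votes.pop()
    return markers


# Three hypothetical classification maps of the same 2 x 3 image.
map_a = [["grass", "road", "road"], ["grass", "grass", "roof"]]
map_b = [["grass", "road", "roof"], ["grass", "tree", "roof"]]
map_c = [["grass", "road", "road"], ["grass", "grass", "roof"]]

for row in select_markers([map_a, map_b, map_c]):
    print(row)
```

    Pixels left as `None` are the unlabeled ones that the region-growing step would later assign to the nearest marker's class.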

  5. Identifying fallers among ophthalmic patients using classification tree methodology

    PubMed Central

    Chirico, Franco; Pecchia, Leandro; Rossi, Settimio; Testa, Francesco; Simonelli, Francesca

    2017-01-01

    Purpose: To develop and validate a tool aiming to support ophthalmologists in identifying, during routine ophthalmologic visits, patients at higher risk of falling in the following year. Methods: A group of 141 subjects (age: 73.2 ± 11.4 years), recruited at our Eye Clinic, underwent a baseline ophthalmic examination and a standardized questionnaire, including lifestyles, general health, social engagement and eyesight problems. Moreover, visual disability was assessed by the Activity of Daily Vision Scale (ADVS). The subjects were followed up for 12 months in order to record prospective falls. A subject who reported at least one fall within one year from the baseline assessment was considered a faller, otherwise a non-faller. Different tree-based algorithms (i.e., C4.5, AdaBoost and Random Forests) were used to develop automatic classifiers, and their performances were evaluated by the cross-validation approach. Results: Over the follow-up, 25 falls were reported by 13 patients. The logistic regression analysis showed the following variables as significant predictors of prospective falls: pseudophakia and use of prescribed eyeglasses as protective factors, and recent worsening of visual acuity as a risk factor. Random Forest ranked best corrected visual acuity, number of sleeping hours and job type as the most important features. Finally, AdaBoost enabled the identification of subjects at higher risk of falling in the following 12 months with a sensitivity rate of 69.2% and a specificity rate of 76.6%. Conclusions: The current study proposes a novel method, based on classification trees applied to self-reported factors and health information assessed by a standardized questionnaire during ophthalmological visits, to identify ophthalmic patients at higher risk of falling in the following 12 months. The findings of the current study pave the way to the validation of the proposed novel tool for fall risk screening on a larger cohort of patients with visual impairment referred
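
    Sensitivity is the fraction of true fallers that the classifier flags, TP / (TP + FN), and specificity is the fraction of non-fallers it clears, TN / (TN + FP). The sketch below evaluates both for one set of counts that is consistent with the figures in the abstract (13 fallers among 141 subjects, 69.2% and 76.6%). The counts themselves are an assumption inferred from those percentages; the abstract does not report a confusion matrix.

```python
def sensitivity(true_pos, false_neg):
    """Share of actual positives (fallers) that were identified."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Share of actual negatives (non-fallers) that were identified."""
    return true_neg / (true_neg + false_pos)


# Assumed counts: 13 fallers and 128 non-fallers (141 subjects in total).
tp, fn = 9, 4
tn, fp = 98, 30

print(round(100 * sensitivity(tp, fn), 1))  # 69.2
print(round(100 * specificity(tn, fp), 1))  # 76.6
```

    With only 13 fallers, each additional correctly flagged subject moves the sensitivity by roughly 7.7 percentage points, which is one reason a larger validation cohort matters.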

  6. Object-based methods for individual tree identification and tree species classification from high-spatial resolution imagery

    NASA Astrophysics Data System (ADS)

    Wang, Le

    2003-10-01

    Modern forest management poses an increasing need for detailed knowledge of forest information at different spatial scales. At the forest level, information on tree species assemblage is desired, whereas at or below the stand level, individual tree related information is preferred. Remote Sensing provides an effective tool to extract the above information at multiple spatial scales in the continuous time domain. To date, the increasing volume and ready availability of high-spatial-resolution data have led to a much wider application of remotely sensed products. Nevertheless, to make effective use of the improving spatial resolution, conventional pixel-based classification methods are far from satisfactory. Correspondingly, developing object-based methods becomes a central challenge for researchers in the field of Remote Sensing. This thesis focuses on the development of methods for accurate individual tree identification and tree species classification. We develop a method in which individual tree crown boundaries and treetop locations are derived under a unified framework. We apply a two-stage approach with edge detection followed by marker-controlled watershed segmentation. Treetops are modeled from radiometry and geometry aspects. Specifically, treetops are assumed to be represented by local radiation maxima and to be located near the center of the tree crown. As a result, a marker image was created from the derived treetops to guide a watershed segmentation to further differentiate overlapping trees and to produce a segmented image comprised of individual tree crowns. The image segmentation method developed achieves a promising result for a 256 x 256 CASI image. Further effort is then made to extend our methods to multiple scales constructed from a wavelet decomposition. A scale consistency and geometric consistency are designed to examine the gradients along the scale-space for the purpose of separating true crown boundary from unwanted
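
    The assumption that treetops appear as local radiation maxima can be illustrated with a plain 8-neighbour peak search, whose output could serve as the marker image for a watershed step. This is a minimal sketch under that assumption only: the toy brightness grid and function name are hypothetical, and the edge detection, geometric constraint, and watershed segmentation described in the thesis are not reproduced.

```python
def local_maxima(image):
    """Return (row, col) of pixels strictly brighter than all of their neighbours."""
    rows, cols = len(image), len(image[0])
    peaks = []
    for r in range(rows):
        for c in range(cols):
            neighbours = [
                image[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(image[r][c] > value for value in neighbours):
                peaks.append((r, c))
    return peaks


# Toy brightness grid with two bright crowns.
image = [
    [1, 1, 1, 1, 1],
    [1, 5, 1, 1, 1],
    [1, 1, 1, 1, 1],
    [1, 1, 1, 7, 1],
    [1, 1, 1, 1, 1],
]
print(local_maxima(image))
```

    On real imagery the image would normally be smoothed first, since noise produces many spurious peaks that would over-segment the crowns.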

  7. Flotation classification of ultrafine particles -- A novel classification approach

    SciTech Connect

    Qiu Guanzhou; Luo Lin; Hu Yuehua; Xu Jin; Wang Dianzuo

    1995-12-31

    This paper introduces a novel classification approach, named the flotation classification approach, which works by controlling interactions between particles. It differs considerably from conventional classification processes that operate on mechanical forces. In the present test, micro-bubble flotation technology is grafted onto hydro-classification. Selective aggregation and dispersion of ultrafine particles are achieved through governing the interactions in the classification process. A series of laboratory classification tests for minus 44 μm kaolin have been conducted on a classification column. As a result, about 92% recovery of the minus 2 μm size fraction of kaolin in the final product is obtained. In addition, two criteria for the classification are set up. Finally, a principle of classifying and controlling the interactions between particles is discussed in terms of surface thermodynamics and hydrodynamics.

  8. A novel transferable individual tree crown delineation model based on Fishing Net Dragging and boundary classification

    NASA Astrophysics Data System (ADS)

    Liu, Tao; Im, Jungho; Quackenbush, Lindi J.

    2015-12-01

    This study provides a novel approach to individual tree crown delineation (ITCD) using airborne Light Detection and Ranging (LiDAR) data in dense natural forests using two main steps: crown boundary refinement based on a proposed Fishing Net Dragging (FiND) method, and segment merging based on boundary classification. FiND starts with approximate tree crown boundaries derived using a traditional watershed method with Gaussian filtering and refines these boundaries using an algorithm that mimics how a fisherman drags a fishing net. Random forest machine learning is then used to classify boundary segments into two classes: boundaries between trees and boundaries between branches that belong to a single tree. Three groups of LiDAR-derived features (two from the pseudo waveform generated along with crown boundaries and one from a canopy height model, or CHM) were used in the classification. The proposed ITCD approach was tested using LiDAR data collected over a mountainous region in the Adirondack Park, NY, USA. Overall accuracy of boundary classification was 82.4%. Features derived from the CHM were generally more important in the classification than the features extracted from the pseudo waveform. A comprehensive accuracy assessment scheme for ITCD was also introduced by considering both area of crown overlap and crown centroids. Accuracy assessment using this new scheme shows that the proposed ITCD achieved overall accuracies of 74% and 78% for deciduous and mixed forest, respectively.

  9. A representation and classification scheme for tree-like structures in medical images: an application on branching pattern analysis of ductal trees in x-ray galactograms

    NASA Astrophysics Data System (ADS)

    Megalooikonomou, Vasileios; Kontos, Despina; Danglemaier, Joseph; Javadi, Ailar; Bakic, Predrag R.; Maidment, Andrew D. A.

    2006-03-01

    We propose a multi-step approach for representing and classifying tree-like structures in medical images. Examples of such tree-like structures are encountered in the bronchial system, the vessel topology and the breast ductal network. We assume that the tree-like structures are already segmented. To avoid the tree isomorphism problem we obtain the breadth-first canonical form of a tree. Our approach is based on employing tree encoding techniques, such as the depth-first string encoding and the Prüfer encoding, to obtain a symbolic representation. Thus, the problem of classifying trees is reduced to string classification where node labels are the string terms. We employ the tf-idf text mining technique to assign a weight of significance to each string term (i.e., tree node label). We perform similarity searches and k-nearest neighbor classification of the trees using the tf-idf weight vectors and the cosine similarity metric. We applied our approach to the breast ductal network manually extracted from clinical x-ray galactograms. The goal was to characterize the ductal tree-like parenchymal structures in order to distinguish among different groups of women. Our best classification accuracy reached up to 90% for certain experimental settings (k=4), outperforming on the average by 10% that of a previous state-of-the-art method based on ramification matrices. These results illustrate the effectiveness of the proposed approach in analyzing tree-like patterns in breast images. Developing such automated tools for the analysis of tree-like structures in medical images can potentially provide insight to the relationship between the topology of branching and function or pathology.
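
    As an illustration of the tree-encoding step mentioned above, the following is a minimal sketch of the standard Prüfer encoding of a labeled tree. The function name, the edge-list input format, and the integer node labels are assumptions made for this example; the abstract does not say how the ductal-tree nodes were labeled.

```python
import heapq


def prufer_sequence(edges):
    """Return the Prüfer sequence of a labeled tree given as (u, v) edges.

    Standard construction: repeatedly remove the leaf with the smallest
    label and record the label of its only neighbor, until two nodes remain.
    """
    # Build an undirected adjacency map.
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # Min-heap of current leaves (nodes with exactly one neighbor).
    leaves = [node for node, nbrs in adj.items() if len(nbrs) == 1]
    heapq.heapify(leaves)

    sequence = []
    for _ in range(len(adj) - 2):
        leaf = heapq.heappop(leaves)
        (neighbor,) = adj[leaf]          # a leaf has exactly one neighbor
        sequence.append(neighbor)
        adj[neighbor].discard(leaf)      # detach the leaf from the tree
        adj[leaf].clear()
        if len(adj[neighbor]) == 1:      # the neighbor may become a new leaf
            heapq.heappush(leaves, neighbor)
    return sequence


# Hypothetical 6-node tree: nodes 1, 2, 3 hang off node 4; 4-5-6 is a path.
example_edges = [(1, 4), (2, 4), (3, 4), (4, 5), (5, 6)]
example_code = prufer_sequence(example_edges)
```

    A tree with n nodes yields a sequence of n - 2 labels, and each label in the sequence can then be treated as a string term for the subsequent text-mining step.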

  10. A Representation and Classification Scheme for Tree-Like Structures in Medical Images: Analyzing the Branching Pattern of Ductal Trees in X-ray Galactograms

    PubMed Central

    Megalooikonomou, Vasileios; Barnathan, Michael; Kontos, Despina; Bakic, Predrag R.; Maidment, Andrew D. A.

    2012-01-01

    We propose a multistep approach for representing and classifying tree-like structures in medical images. Tree-like structures are frequently encountered in biomedical contexts; examples are the bronchial system, the vascular topology, and the breast ductal network. We use tree encoding techniques, such as the depth-first string encoding and the Prüfer encoding, to obtain a symbolic string representation of the tree's branching topology; the problem of classifying trees is then reduced to string classification. We use the tf-idf text mining technique to assign a weight of significance to each string term (i.e., tree node label). Similarity searches and k-nearest neighbor classification of the trees are performed using the tf-idf weight vectors and the cosine similarity metric. We applied our approach to characterize the ductal tree-like parenchymal structure in X-ray galactograms, in order to distinguish among different radiological findings. Experimental results demonstrate the effectiveness of the proposed approach, with classification accuracy reaching up to 86%, and also indicate that our method can potentially aid in providing insight into the relationship between branching patterns and function or pathology. PMID:19272984
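
    The string-classification step described above (tf-idf weights, cosine similarity, k-nearest neighbors) can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation: the tf-idf variant (relative term frequency times log(N/df)), the function names, and the toy label strings are all assumptions.

```python
import math
from collections import Counter


def idf_weights(docs):
    """Inverse document frequency log(N / df) for every term in the corpus."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}


def tfidf(doc, idf):
    """Sparse tf-idf vector (dict) for one encoded tree, using relative tf."""
    counts = Counter(doc)
    return {t: (c / len(doc)) * idf.get(t, 0.0) for t, c in counts.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is zero."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_predict(query_vec, train_vecs, train_labels, k=3):
    """Majority vote among the k training vectors most similar to the query."""
    ranked = sorted(
        zip(train_vecs, train_labels),
        key=lambda pair: cosine(query_vec, pair[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical encoded trees (sequences of node labels) with group labels.
train_docs = [[4, 4, 4, 5], [4, 4, 5, 5], [2, 3, 3, 1], [2, 2, 3, 1]]
train_labels = ["A", "A", "B", "B"]
idf = idf_weights(train_docs)
train_vecs = [tfidf(doc, idf) for doc in train_docs]
predicted = knn_predict(tfidf([4, 5, 5, 5], idf), train_vecs, train_labels, k=3)
```

    Storing each vector as a dictionary keeps the representation sparse, which suits encodings where most node labels do not occur in any given tree.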

  11. Improved similarity trees and their application to visual data classification.

    PubMed

    Paiva, Jose Gustavo S; Florian-Cruz, Laura; Pedrini, Helio; Telles, Guilherme P; Minghim, Rosane

    2011-12-01

    An alternative form to multidimensional projections for the visual analysis of data represented in multidimensional spaces is the deployment of similarity trees, such as Neighbor Joining trees. They organize data objects on the visual plane emphasizing their levels of similarity with high capability of detecting and separating groups and subgroups of objects. Besides this similarity-based hierarchical data organization, some of their advantages include the ability to decrease point clutter; high precision; and a consistent view of the data set during focusing, offering a very intuitive way to view the general structure of the data set as well as to drill down to groups and subgroups of interest. Disadvantages of similarity trees based on neighbor joining strategies include their computational cost and the presence of virtual nodes that utilize too much of the visual space. This paper presents a highly improved version of the similarity tree technique. The improvements in the technique are given by two procedures. The first is a strategy that replaces virtual nodes by promoting real leaf nodes to their place, saving large portions of space in the display and maintaining the expressiveness and precision of the technique. The second improvement is an implementation that significantly accelerates the algorithm, impacting its use for larger data sets. We also illustrate the applicability of the technique in visual data mining, showing its advantages to support visual classification of data sets, with special attention to the case of image classification. We demonstrate the capabilities of the tree for analysis and iterative manipulation and employ those capabilities to support evolving to a satisfactory data organization and classification.

  12. Probabilistic lung nodule classification with belief decision trees.

    PubMed

    Zinovev, Dmitriy; Feigenbaum, Jonathan; Furst, Jacob; Raicu, Daniela

    2011-01-01

    In reading Computed Tomography (CT) scans with potentially malignant lung nodules, radiologists make use of high level information (semantic characteristics) in their analysis. Computer-Aided Diagnostic Characterization (CADc) systems can assist radiologists by offering a "second opinion"--predicting these semantic characteristics for lung nodules. In this work, we propose a way of predicting the distribution of radiologists' opinions using a multiple-label classification algorithm based on belief decision trees using the National Cancer Institute (NCI) Lung Image Database Consortium (LIDC) dataset, which includes semantic annotations by up to four human radiologists for each one of the 914 nodules. Furthermore, we evaluate our multiple-label results using a novel distance-threshold curve technique--and, measuring the area under this curve, obtain 69% performance on the validation subset. We conclude that multiple-label classification algorithms are an appropriate method of representing the diagnoses of multiple radiologists on lung CT scans when ground truth is unavailable.

  13. Data mining in psychological treatment research: a primer on classification and regression trees.

    PubMed

    King, Matthew W; Resick, Patricia A

    2014-10-01

    Data mining of treatment study results can reveal unforeseen but critical insights, such as who receives the most benefit from treatment and under what circumstances. The usefulness and legitimacy of exploratory data analysis have received relatively little recognition, however, and analytic methods well suited to the task are not widely known in psychology. With roots in computer science and statistics, statistical learning approaches offer a credible option: These methods take a more inductive approach to building a model than is done in traditional regression, allowing the data greater role in suggesting the correct relationships between variables rather than imposing them a priori. Classification and regression trees are presented as a powerful, flexible exemplar of statistical learning methods. Trees allow researchers to efficiently identify useful predictors of an outcome and discover interactions between predictors without the need to anticipate and specify these in advance, making them ideal for revealing patterns that inform hypotheses about treatment effects. Trees can also provide a predictive model for forecasting outcomes as an aid to clinical decision making. This primer describes how tree models are constructed, how the results are interpreted and evaluated, and how trees overcome some of the complexities of traditional regression. Examples are drawn from randomized clinical trial data and highlight some interpretations of particular interest to treatment researchers. The limitations of tree models are discussed, and suggestions for further reading and choices in software are offered.
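
    To make the tree-construction idea concrete, here is a minimal sketch of how a single split might be chosen for one numeric predictor by minimizing weighted Gini impurity, the criterion commonly associated with classification trees. The function names and the toy data are assumptions for illustration; the primer itself may present the procedure differently.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, weighted_gini) of the best binary split.

    Candidate thresholds are midpoints between consecutive distinct values.
    If no split improves on the unsplit node, threshold is None.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))
    for i in range(1, n):
        low, high = pairs[i - 1][0], pairs[i][0]
        if low == high:
            continue  # no threshold can separate identical values
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = ((low + high) / 2, score)
    return best


# Hypothetical baseline scores and treatment-response labels.
scores = [1, 2, 3, 10, 11, 12]
response = ["no", "no", "no", "yes", "yes", "yes"]
threshold, impurity = best_split(scores, response)
```

    A full tree repeats this search over every predictor at every node and recurses on the two resulting subsets, which is why interactions emerge without being specified in advance.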

  14. A modified classification tree method for personalized medicine decisions

    PubMed Central

    Tsai, Wan-Min; Zhang, Heping; Buta, Eugenia; O’Malley, Stephanie

    2015-01-01

    The tree-based methodology has been widely applied to identify predictors of health outcomes in medical studies. However, the classical tree-based approaches do not pay particular attention to treatment assignment and thus do not consider prediction in the context of treatment received. In recent years, attention has been shifting from average treatment effects to identifying moderators of treatment response, and tree-based approaches to identify subgroups of subjects with enhanced treatment responses are emerging. In this study, we extend and present modifications to one of these approaches (Zhang et al., 2010 [29]) to efficiently identify subgroups of subjects who respond more favorably to one treatment than another based on their baseline characteristics. We extend the algorithm by incorporating an automatic pruning step and propose a measure for assessment of the predictive performance of the constructed tree. We evaluate the proposed method through a simulation study and illustrate the approach using a data set from a clinical trial of treatments for alcohol dependence. This simple and efficient statistical tool can be used for developing algorithms for clinical decision making and personalized treatment for patients based on their characteristics. PMID:26770292

  15. Determinants of cesarean delivery: a classification tree analysis

    PubMed Central

    2014-01-01

    Background Cesarean delivery (CD) rates are rising in many parts of the world. To define strategies to reduce them, it is important to identify their clinical and organizational determinants. The objective of this cross-sectional study is to identify sub-types of women at higher risk of CD using demographic, clinical and organizational variables. Methods All hospital discharge records of women who delivered between 2005 and mid-2010 in the Emilia-Romagna Region of Italy were retrieved and linked with birth certificates. Sociodemographic and clinical information was retrieved from the two data sources. Organizational variables included activity volume (number of births per year), hospital type, and hour and day of delivery. A classification tree analysis was used to identify the variables and the combinations of variables that best discriminated cesarean from vaginal delivery. Results The classification tree analysis indicated that the most important variables discriminating the sub-groups of women at different risk of cesarean section were: previous cesarean, mal-position/mal-presentation, fetal distress, and abruptio placentae or placenta previa or ante-partum hemorrhage. These variables account for more than 60% of all cesarean deliveries. A sensitivity analysis identified multiparity and fetal weight as additional discriminatory variables. Conclusions Clinical variables are important predictors of CD. To reduce the CD rate, audit activities should examine in more detail the clinical conditions for which the need of CD is questionable or inappropriate. PMID:24973937

  16. Buried penis: classification surgical approach.

    PubMed

    Hadidi, Ahmed T

    2014-02-01

    The purpose of this study was to describe a morphological classification of congenital buried penis (BP) and present a versatile surgical approach for correction. Sixty-one patients referred with BP were classified into 3 grades according to morphological findings: Grade I, 29 patients with Longer Inner Prepuce (LIP) only; Grade II, 20 patients who presented with LIP associated with indrawn penis that required division of the fundiform and suspensory ligaments; and Grade III, 12 patients who had, in addition to the above, excess supra-pubic fat. A ventral midline penile incision extending from the tip of the prepuce down to the penoscrotal junction was used in all patients. The operation was tailored according to the BP grade. All patients underwent circumcision. Mean follow-up was 3 years (range 1 to 10). All 61 patients had an abnormally long inner prepuce (LIP). Forty-seven patients had a short penile shaft. Early improvement was noted in all cases. Satisfactory results were achieved in all 29 patients in Grade I and in 27 patients in Grades II and III. Five children (Grades II and III) required further surgery (9%). Congenital buried penis is a spectrum characterized by LIP and may include, in addition, short penile shaft, abnormal attachment of the fundiform and suspensory ligaments, and excess supra-pubic fat. Congenital Mega Prepuce (CMP) is a variant of Grade I BP, with LIP characterized by intermittent ballooning of the genital area. Copyright © 2014 Elsevier Inc. All rights reserved.

  17. Superiority of Classification Tree versus Cluster, Fuzzy and Discriminant Models in a Heartbeat Classification System

    PubMed Central

    Krasteva, Vessela; Jekova, Irena; Leber, Remo; Schmid, Ramun; Abächerli, Roger

    2015-01-01

    This study presents a 2-stage heartbeat classifier of supraventricular (SVB) and ventricular (VB) beats. Stage 1 makes a computationally efficient classification of SVB-beats, using a simple correlation threshold criterion for finding a close match with a predominant normal (reference) beat template. The non-matched beats are next subjected to measurement of 20 basic features, tracking the beat and reference template morphology and RR-variability, for subsequent refined classification into the SVB or VB-class by Stage 2. Four linear classifiers are compared: cluster, fuzzy, linear discriminant analysis (LDA) and classification tree (CT), all subjected to iterative training for selection of the optimal feature space among an extended 210-sized set, embodying interactive second-order effects between 20 independent features. The optimization process minimizes at equal weight the false positives in the SVB-class and the false negatives in the VB-class. Training with the European ST-T, AHA, and MIT-BIH Supraventricular Arrhythmia databases found the best performance settings of all classification models: Cluster (30 features), Fuzzy (72 features), LDA (142 coefficients), CT (221 decision nodes), with the top-3 best scored features being normalized current RR-interval, higher/lower frequency content ratio, and beat-to-template correlation. Unbiased test-validation with the MIT-BIH Arrhythmia database rates the classifiers in descending order of their specificity for the SVB-class: CT (99.9%), LDA (99.6%), Cluster (99.5%), Fuzzy (99.4%); sensitivity for ventricular ectopic beats as part of the VB-class (commonly reported in published beat-classification studies): CT (96.7%), Fuzzy (94.4%), LDA (94.2%), Cluster (92.4%); positive predictivity: CT (99.2%), Cluster (93.6%), LDA (93.0%), Fuzzy (92.4%). CT has superior accuracy by 0.3–6.8% points, with the advantage of easy model complexity configuration by pruning the tree, which consists of easily interpretable ‘if-then’ rules. PMID:26461492
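
    For readers comparing the figures above, the three reported measures follow the usual confusion-matrix definitions. The sketch below assumes the ventricular beat is treated as the positive class; the function name and the counts are hypothetical and are not taken from the study.

```python
def beat_metrics(tp, fp, tn, fn):
    """Standard measures from confusion-matrix counts.

    tp/fn: positive-class beats detected/missed (assumed here: ventricular).
    tn/fp: negative-class beats correctly kept/wrongly flagged.
    """
    return {
        "sensitivity": tp / (tp + fn),             # share of positives found
        "specificity": tn / (tn + fp),             # share of negatives kept
        "positive_predictivity": tp / (tp + fp),   # share of alarms that are right
    }


# Hypothetical counts, for illustration only.
metrics = beat_metrics(tp=90, fp=5, tn=95, fn=10)
```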

  18. Support vector machine classification trees based on fuzzy entropy of classification.

    PubMed

    de Boves Harrington, Peter

    2017-02-15

    The support vector machine (SVM) is a powerful classifier that has recently been implemented in a classification tree (SVMTreeG). This classifier partitioned the data by finding gaps in the data space. For large and complex datasets, there may be no gaps in the data space confounding this type of classifier. A novel algorithm was devised that uses fuzzy entropy to find optimal partitions for situations when clusters of data are overlapped in the data space. Also, a kernel version of the fuzzy entropy algorithm was devised. A fast support vector machine implementation is used that has no cost C or slack variables to optimize. Statistical comparisons using bootstrapped Latin partitions among the tree classifiers were made using a synthetic XOR data set and validated with ten prediction sets comprised of 50,000 objects and a data set of NMR spectra obtained from 12 tea sample extracts.

  19. Graduates employment classification using data mining approach

    NASA Astrophysics Data System (ADS)

    Aziz, Mohd Tajul Rizal Ab; Yusof, Yuhanis

    2016-08-01

    Data mining is a means of extracting hidden knowledge from a collection of data. This study investigates a suitable classification model for classifying graduate employment at one of the MARA Professional Colleges (KPM) in Malaysia. The aim is to classify the graduates as employed, unemployed, or pursuing further study. Five data mining algorithms offered in WEKA were used: Naïve Bayes, logistic regression, multilayer perceptron, k-nearest neighbor, and the J48 decision tree. Based on the results, logistic regression produces the highest classification accuracy, at 92.5%. This result was obtained using 80% of the data for training and 20% for testing. The resulting classification model will benefit the management of the college, as it provides insight into the quality of the graduates it produces and how its curriculum can be improved to meet the needs of industry.

  20. Decision Tree Classifier for Classification of Plant and Animal Micro RNA's

    NASA Astrophysics Data System (ADS)

    Pant, Bhasker; Pant, Kumud; Pardasani, K. R.

    Gene expression is regulated by miRNAs, or micro RNAs, which can be 21-23 nucleotides in length. They are non-coding RNAs that control gene expression either by translation repression or by mRNA degradation. Plants and animals both contain miRNAs, which have been classified by wet lab techniques. These techniques are highly expensive, labour intensive and time consuming. Hence, faster and more economical computational approaches are needed. In view of the above, a machine learning model has been developed for classification of plant and animal miRNAs using a decision tree classifier. The model has been tested on available data and gives results with 91% accuracy.

  1. Use of MMPI-2 to predict cognitive effort: a hierarchically optimal classification tree analysis.

    PubMed

    Smart, Colette M; Nelson, Nathaniel W; Sweet, Jerry J; Bryant, Fred B; Berry, David T R; Granacher, Robert P; Heilbronner, Robert L

    2008-09-01

    Neuropsychologists routinely rely on response validity measures to evaluate the authenticity of test performances. However, the relationship between cognitive and psychological response validity measures is not clearly understood. It remains to be seen whether psychological test results can predict the outcome of response validity testing in clinical and civil forensic samples. The present analysis applied a unique statistical approach, classification tree methodology (Optimal Data Analysis: ODA), in a sample of 307 individuals who had completed the MMPI-2 and a variety of cognitive effort measures. One hundred ninety-eight participants were evaluated in a secondary gain context, and 109 had no identifiable secondary gain. Through recurrent dichotomous discriminations, ODA provided optimized linear decision trees to classify either sufficient effort (SE) or insufficient effort (IE) according to various MMPI-2 scale cutoffs. After an initial, complex classification tree, the Response Bias Scale (RBS) took precedence in classifying cognitive effort. After removing RBS from the model, Hy took precedence in classifying IE. The present findings provide MMPI-2 scores that may be associated with SE and IE among civil litigants and claimants, in addition to illustrating the complexity with which MMPI-2 scores and effort test results are associated in the litigation context.

  2. Integrating classification and regression tree (CART) with GIS for assessment of heavy metals pollution.

    PubMed

    Cheng, Wei; Zhang, Xiuying; Wang, Ke; Dai, Xuelong

    2009-11-01

    A heavy metal pollution assessment system that integrates the classification and regression tree (CART) model with geographical information systems was developed to assess heavy metal pollution in Fuyang, Zhejiang, China. The integration of the decision tree model with ArcGIS Engine 9 using a COM implementation in Microsoft Visual Basic 6.0 provided an approach for assessing the spatial distribution of soil Zn content with high predictive accuracy. CART assigned the correct Zn concentration classes with an accuracy of nearly 90%. This is a great improvement over the ordinary Kriging method, because the spatial autocorrelation of the study area has been severely disrupted by human activities. The model can also be used to investigate the inter-relationships between heavy metal pollution and environmental and anthropogenic variables. Moreover, the research presents model predictions over space for further applications and investigations.

  3. Support-vector-machine tree-based domain knowledge learning toward automated sports video classification

    NASA Astrophysics Data System (ADS)

    Xiao, Guoqiang; Jiang, Yang; Song, Gang; Jiang, Jianmin

    2010-12-01

    We propose a support-vector-machine (SVM) tree to hierarchically learn from domain knowledge represented by low-level features toward automatic classification of sports videos. The proposed SVM tree adopts a binary tree structure to exploit the nature of SVM's binary classification, where each internal node is a single SVM learning unit, and each external node represents the classified output type. Such a SVM tree presents a number of advantages, which include: 1. low computing cost; 2. integrated learning and classification while preserving individual SVM's learning strength; and 3. flexibility in both structure and learning modules, where different numbers of nodes and features can be added to address specific learning requirements, and various learning models can be added as individual nodes, such as neural networks, AdaBoost, hidden Markov models, dynamic Bayesian networks, etc. Experiments support that the proposed SVM tree achieves good performances in sports video classifications.

  4. A novel decision-tree method for structured continuous-label classification.

    PubMed

    Hu, Hsiao-Wei; Chen, Yen-Liang; Tang, Kwei

    2013-12-01

    Structured continuous-label classification is a variety of classification in which the label is continuous in the data, but the goal is to classify data into classes that are a set of predefined ranges and can be organized in a hierarchy. In the hierarchy, the ranges at the lower levels are more specific and inherently more difficult to predict, whereas the ranges at the upper levels are less specific and inherently easier to predict. Therefore, both prediction specificity and prediction accuracy must be considered when building a decision tree (DT) from this kind of data. This paper proposes a novel classification algorithm for learning DT classifiers from data with structured continuous labels. This approach considers the distribution of labels throughout the hierarchical structure during the construction of trees without requiring discretization in the preprocessing stage. We compared the results of the proposed method with those of the C4.5 algorithm using eight real data sets. The empirical results indicate that the proposed method outperforms the C4.5 algorithm with regard to prediction accuracy, prediction specificity, and computational complexity.

  5. Proteomic mass spectra classification using decision tree based ensemble methods.

    PubMed

    Geurts, Pierre; Fillet, Marianne; de Seny, Dominique; Meuwis, Marie-Alice; Malaise, Michel; Merville, Marie-Paule; Wehenkel, Louis

    2005-07-15

    Modern mass spectrometry allows the determination of proteomic fingerprints of body fluids like serum, saliva or urine. These measurements can be used in many medical applications in order to diagnose the current state or predict the evolution of a disease. Recent developments in machine learning allow one to exploit such datasets, characterized by small numbers of very high-dimensional samples. We propose a systematic approach based on decision tree ensemble methods, which is used to automatically determine proteomic biomarkers and predictive models. The approach is validated on two datasets of surface-enhanced laser desorption/ionization time of flight measurements, for the diagnosis of rheumatoid arthritis and inflammatory bowel diseases. The results suggest that the methodology can handle a broad class of similar problems.

  6. The process and utility of classification and regression tree methodology in nursing research

    PubMed Central

    Kuhn, Lisa; Page, Karen; Ward, John; Worrall-Carter, Linda

    2014-01-01

    Aim This paper presents a discussion of classification and regression tree analysis and its utility in nursing research. Background Classification and regression tree analysis is an exploratory research method used to illustrate associations between variables not suited to traditional regression analysis. Complex interactions are demonstrated between covariates and variables of interest in inverted tree diagrams. Design Discussion paper. Data sources English language literature was sourced from eBooks, Medline Complete and CINAHL Plus databases, Google and Google Scholar, hard copy research texts and retrieved reference lists for terms including classification and regression tree* and derivatives and recursive partitioning from 1984–2013. Discussion Classification and regression tree analysis is an important method used to identify previously unknown patterns amongst data. Whilst there are several reasons to embrace this method as a means of exploratory quantitative research, issues regarding quality of data as well as the usefulness and validity of the findings should be considered. Implications for Nursing Research Classification and regression tree analysis is a valuable tool to guide nurses to reduce gaps in the application of evidence to practice. With the ever-expanding availability of data, it is important that nurses understand the utility and limitations of the research method. Conclusion Classification and regression tree analysis is an easily interpreted method for modelling interactions between health-related variables that would otherwise remain obscured. Knowledge is presented graphically, providing insightful understanding of complex and hierarchical relationships in an accessible and useful way to nursing and other health professions. PMID:24237048

  7. Instability in a Tree Approach to Regression. Program Statistics Research.

    ERIC Educational Resources Information Center

    Kim, Sung-Ho

    One of the major problems that a tree-approach to data analysis often encounters is the instability of tree-structures. The instability issue must be dealt with before data can be interpreted by this method. Examining instability at a node of a tree provides insight into the instability of the whole tree, because the same theory of instability…

  8. Real-time classification of humans versus animals using profiling sensors and hidden Markov tree model

    NASA Astrophysics Data System (ADS)

    Hossen, Jakir; Jacobs, Eddie L.; Chari, Srikant

    2015-07-01

    Linear pyroelectric array sensors have enabled useful classifications of objects such as humans and animals to be performed with relatively low-cost hardware in border and perimeter security applications. Ongoing research has sought to improve the performance of these sensors through signal processing algorithms. In the research presented here, we introduce the use of hidden Markov tree (HMT) models for object recognition in images generated by linear pyroelectric sensors. HMTs are trained to statistically model the wavelet features of individual objects through an expectation-maximization learning process. Human versus animal classification for a test object is made by evaluating its wavelet features against the trained HMTs using the maximum-likelihood criterion. The classification performance of this approach is compared to two other techniques: a texture, shape, and spectral component features (TSSF) based classifier and a speeded-up robust feature (SURF) classifier. The evaluation indicates that among the three techniques, the wavelet-based HMT model works well, is robust, and has improved classification performance compared to a SURF-based algorithm in equivalent computation time. When compared to the TSSF-based classifier, the HMT model has slightly degraded performance but almost an order of magnitude improvement in computation time, enabling real-time implementation.

  9. Comparison of chi-squared automatic interaction detection classification trees vs TNM classification for patients with head and neck squamous cell carcinoma.

    PubMed

    Avilés-Jurado, F Xavier; Terra, Ximena; Figuerola, Enric; Quer, Miquel; León, Xavier

    2012-03-01

    To compare chi-squared automatic interaction detection (CHAID) classification trees vs the seventh edition of the TNM classification for patients with head and neck squamous cell carcinoma and to assess whether CHAID classification trees might improve results obtained with the TNM classification. Patient disease was classified according to CHAID classification trees and the TNM classification, and the results were compared. Academic research. A total of 3373 patients with carcinoma of the oral cavity, oropharynx, hypopharynx, or larynx. The 2 classification methods were evaluated objectively, measuring intrastage homogeneity (hazard consistency), interstage heterogeneity (hazard discrimination), and disease stage distribution among patients (balance). In addition, to assess agreement between CHAID classification trees and the TNM classification, we calculated the κ statistic, weighted linearly and quadratically. Objective evaluation of the quality of the classification methods indicated that CHAID classification trees performed better than the TNM classification in terms of hazard consistency (2.51 for CHAID and 3.01 for TNM) and hazard discrimination (70.9% for CHAID and 52.7% for TNM) but not balance (-31.7% for CHAID and -15.5% for TNM). Analysis of concordance between the classification methods showed that the quadratic κ statistic was 0.77 (95% CI, 0.76-0.78) and the linear κ statistic was 0.59 (95% CI, 0.57-0.60) (P < .001 for both). CHAID classification trees performed better than the TNM classification and offer potential inclusion of new prognostic factors.
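
    The agreement measure used above, the κ statistic weighted linearly and quadratically, can be sketched as follows for two ordinal stagings of the same patients. This is a generic textbook formulation with disagreement weights (|i - j| / (k - 1)) raised to the power 1 or 2; the function name and the toy stage assignments are assumptions, not data from the study.

```python
def weighted_kappa(ratings_a, ratings_b, n_categories, power=1):
    """Weighted Cohen's kappa for two ordinal ratings coded 0..n_categories-1.

    power=1 gives linear weights, power=2 gives quadratic weights.
    """
    k = n_categories
    n = len(ratings_a)

    # Observed joint proportions of (rating A, rating B).
    observed = [[0.0] * k for _ in range(k)]
    for i, j in zip(ratings_a, ratings_b):
        observed[i][j] += 1.0 / n

    # Marginal proportions give the agreement expected by chance.
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = (abs(i - j) / (k - 1)) ** power  # disagreement weight
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * row[i] * col[j]

    if weighted_expected == 0.0:
        raise ValueError("kappa is undefined when no disagreement is possible")
    return 1.0 - weighted_observed / weighted_expected


# Hypothetical stage assignments from two classification systems.
system_a = [0, 0, 1, 1, 2, 2]
system_b = [0, 1, 1, 2, 2, 2]
kappa_linear = weighted_kappa(system_a, system_b, n_categories=3, power=1)
kappa_quadratic = weighted_kappa(system_a, system_b, n_categories=3, power=2)
```

    Quadratic weights penalize one-stage disagreements much less than multi-stage ones, so when most discrepancies are between adjacent stages the quadratic κ tends to exceed the linear κ.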

  10. Real-Time Speech/Music Classification With a Hierarchical Oblique Decision Tree

    DTIC Science & Technology

    2008-04-01

    REAL-TIME SPEECH/ MUSIC CLASSIFICATION WITH A HIERARCHICAL OBLIQUE DECISION TREE Jun Wang, Qiong Wu, Haojiang Deng, Qin Yan Institute of Acoustics...time speech/ music classification with a hierarchical oblique decision tree. A set of discrimination features in frequency domain are selected...handle signals without discrimination and can not work properly in the existence of multimedia signals. This paper proposes a real-time speech/ music

  11. Multivariate Approaches to Classification in Extragalactic Astronomy

    NASA Astrophysics Data System (ADS)

    Fraix-Burnet, Didier; Thuillard, Marc; Chattopadhyay, Asis Kumar

    2015-08-01

    Clustering objects into synthetic groups is a natural activity of any science. Astrophysics is not an exception and is now facing a deluge of data. For galaxies, the one-century old Hubble classification and the Hubble tuning fork are still largely in use, together with numerous mono- or bivariate classifications most often made by eye. However, a classification must be driven by the data, and sophisticated multivariate statistical tools are used more and more often. In this paper we review these different approaches in order to situate them in the general context of unsupervised and supervised learning. We insist on the astrophysical outcomes of these studies to show that multivariate analyses provide an obvious path toward a renewal of our classification of galaxies and are invaluable tools to investigate the physics and evolution of galaxies.

  12. A Rough Set Decision Tree Based MLP-CNN for Very High Resolution Remotely Sensed Image Classification

    NASA Astrophysics Data System (ADS)

    Zhang, C.; Pan, X.; Zhang, S. Q.; Li, H. P.; Atkinson, P. M.

    2017-09-01

    Recent advances in remote sensing have produced a great amount of very high resolution (VHR) images acquired at sub-metre spatial resolution. These VHR remotely sensed data have posed enormous challenges for effective processing, analysis and classification due to their high spatial complexity and heterogeneity. Although many computer-aided classification methods based on machine learning approaches have been developed over the past decades, most of them are designed for pixel-level spectral differentiation, e.g. the Multi-Layer Perceptron (MLP), and are unable to exploit the abundant spatial details within VHR images. This paper introduces a rough set model as a general framework to objectively characterize the uncertainty in CNN classification results, and to further partition them into correct and incorrect regions on the map. The correct classification regions of the CNN were trusted and maintained, whereas the misclassified areas were reclassified using a decision tree with both CNN and MLP. The effectiveness of the proposed rough set decision tree based MLP-CNN was tested using an urban area at Bournemouth, United Kingdom. The MLP-CNN, which captures the complementarity between CNN and MLP through the rough set based decision tree, achieved the best classification performance both visually and numerically. Therefore, this research paves the way to achieve fully automatic and effective VHR image classification.

  13. Binary Classification and the Subtractive Approach.

    DTIC Science & Technology

    1983-09-01

    AFAMRL-TR-83-050 BINARY CLASSIFICATION AND THE SUBTRACTIVE APPROACH Gregory M. Corso, Georgia Institute of Technology; Suzanne Kelly; Danny E...GRANT NUMBER(S) F49620-82-C-0035; Gregory M. Corso, Suzanne Kelly, Danny E. Bridges, TSgt, USAF; PERFORMING ORGANIZATION NAME AND ADDRESS; PROGRAM...BINARY CLASSIFICATION AND THE SUBTRACTIVE APPROACH, EDUCATION INC, ST CLOUD FL, G M CORSO ET AL., SEP 83, UNCLASSIFIED, AFAMRL-TR-83-050

  14. Using Clinical Classification Trees to Identify Individuals At Risk of STDs During Pregnancy

    PubMed Central

    Kershaw, Trace S.; Lewis, Jessica; Westdahl, Claire; Wang, Yun F.; Rising, Sharon Schindler; Massey, Zohar; Ickovics, Jeannette

    2008-01-01

    CONTEXT Few studies have used classification tree analysis to produce empirically driven decision tools that identify subgroups of women at risk of STDs during pregnancy. Such tools can guide care, treatment and prevention efforts in clinical settings. METHODS A sample of 647 women aged 14−25 attending two urban obstetrics and gynecology clinics in 2001−2004 were surveyed in their second and third trimesters. Baseline predictors at the individual, dyad, and family and community levels were used to develop a classification tree that differentiated subgroups of women by STD incidence at 35 weeks' gestation. Logistic regression analyses were conducted to assess whether the classification tree groups or commonly used risk factors better predicted STD incidence. RESULTS Nineteen percent of women had an incident STD during pregnancy. Classification tree analysis identified three subgroups with a high STD incidence (33−61%), one with a moderate incidence (16%) and three with a low incidence (6−11%). Women in subgroups with high STD incidence included those not living with the partner with whom they conceived and those who had a moderate or a high level of depression, a history of STDs and a low level of social support. A logistic regression model using groups defined by the classification tree analysis had better predictive ability than one using common demographic and sexual risk predictors. CONCLUSION This classification tree identified risk factors not captured by traditional risk screenings, and could be used to guide STD treatment, care and prevention within the prenatal care setting. PMID:17845525

  15. New Tree-Classification System Used by the Southern Forest Inventory and Analysis Unit

    Treesearch

    Dennis M. May; John S. Vissage; D. Vince Few

    1990-01-01

    Trees at USDA Forest Service, Southern Forest Inventory and Analysis, sample locations are classified as growing stock or cull based on their ability to produce sawlogs. The old and new classification systems are compared, and the impacts of the new system on the reporting of tree volumes are illustrated with inventory data from north Alabama.

  16. A classification and regression tree model of controls on dissolved inorganic nitrogen leaching from European forests.

    PubMed

    Rothwell, James J; Futter, Martyn N; Dise, Nancy B

    2008-11-01

    Often, there is a non-linear relationship between atmospheric dissolved inorganic nitrogen (DIN) input and DIN leaching that is poorly captured by existing models. We present the first application of the non-parametric classification and regression tree approach to evaluate the key environmental drivers controlling DIN leaching from European forests. DIN leaching was classified as low (<3), medium (3-15) or high (>15 kg N ha⁻¹ year⁻¹) at 215 sites across Europe. The analysis identified throughfall NO₃⁻ deposition, acid deposition, hydrology, soil type, the carbon content of the soil, and the legacy of historic N deposition as the dominant drivers of DIN leaching for these forests. Ninety-four percent of sites were successfully classified into the appropriate leaching category. This approach shows promise for understanding complex ecosystem responses to a wide range of anthropogenic stressors as well as an improved method for identifying risk and targeting pollution mitigation strategies in forest ecosystems.
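
    The three leaching categories and the reported share of correctly classified sites can be sketched as follows. This is only an illustration of the class thresholds stated in the abstract; the function names are hypothetical, and assigning the boundary values 3 and 15 to the medium class is an assumption, since the abstract does not say how ties are handled.

```python
def din_leaching_class(din):
    """Map DIN leaching (kg N per ha per year) to the low/medium/high classes.

    Assumption: the boundary values 3 and 15 belong to the medium class.
    """
    if din < 3:
        return "low"
    if din <= 15:
        return "medium"
    return "high"


def classification_accuracy(predicted, observed):
    """Fraction of sites whose predicted class equals the observed class."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    hits = sum(1 for p, o in zip(predicted, observed) if p == o)
    return hits / len(observed)


# Hypothetical leaching values for four sites (not data from the study).
observed_classes = [din_leaching_class(v) for v in (0.8, 4.2, 15.0, 22.5)]
```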

  17. Biosensor approach to psychopathology classification.

    PubMed

    Koshelev, Misha; Lohrenz, Terry; Vannucci, Marina; Montague, P Read

    2010-10-21

    We used a multi-round, two-party exchange game in which a healthy subject played a subject diagnosed with a DSM-IV (Diagnostic and Statistical Manual-IV) disorder, and applied a Bayesian clustering approach to the behavior exhibited by the healthy subject. The goal was to characterize quantitatively the style of play elicited in the healthy subject (the proposer) by their DSM-diagnosed partner (the responder). The approach exploits the dynamics of the behavior elicited in the healthy proposer as a biosensor for cognitive features that characterize the psychopathology group at the other side of the interaction. Using a large cohort of subjects (n = 574), we found statistically significant clustering of proposers' behavior overlapping with a range of DSM-IV disorders including autism spectrum disorder, borderline personality disorder, attention deficit hyperactivity disorder, and major depressive disorder. To further validate these results, we developed a computer agent to replace the human subject in the proposer role (the biosensor) and show that it can also detect these same four DSM-defined disorders. These results suggest that the highly developed social sensitivities that humans bring to a two-party social exchange can be exploited and automated to detect important psychopathologies, using an interpersonal behavioral probe not directly related to the defining diagnostic criteria.

  18. Tree Species Classification Using A Fusion of LiDAR and Hyperspectral Datasets

    NASA Astrophysics Data System (ADS)

    Xu, Z.

    2016-12-01

    The accurate mapping of tree species would be beneficial to the management of forests. Remote sensing data from multiple sources including airborne LiDAR and hyperspectral sensors are widely available and have been used for tree species classification, although with often limited results. Species mapping at the individual tree level is particularly challenging in temperate forests due to high intraspecific spectral variability, irregular canopy shapes, and multiple vegetation strata. By combining LiDAR and hyperspectral datasets, we performed an individual tree level classification of tree species found in Allerton Park in central Illinois. LiDAR analysis was used to perform individual tree crown extraction, and these crowns were fused with hyperspectral imagery to provide spectral information for each crown in the upper canopy. We used per-tree 2- and 3-D morphological features as well as the spectral information as predictors into a machine learning classifier to produce the per-tree species classification. Finally, we used field data registered to individual tree crowns to validate our results.

  19. Tree species classification in subtropical forests using small-footprint full-waveform LiDAR data

    NASA Astrophysics Data System (ADS)

    Cao, Lin; Coops, Nicholas C.; Innes, John L.; Dai, Jinsong; Ruan, Honghua; She, Guanghui

    2016-07-01

    The accurate classification of tree species is critical for the management of forest ecosystems, particularly subtropical forests, which are highly diverse and complex ecosystems. While airborne Light Detection and Ranging (LiDAR) technology offers significant potential to estimate forest structural attributes, the capacity of this new tool to classify species is less well known. In this research, full-waveform metrics were extracted by a voxel-based composite waveform approach and examined with a Random Forests classifier to discriminate six subtropical tree species (i.e., Masson pine (Pinus massoniana Lamb.), Chinese fir (Cunninghamia lanceolata (Lamb.) Hook.), Slash pine (Pinus elliottii Engelm.), Sawtooth oak (Quercus acutissima Carruth.) and Chinese holly (Ilex chinensis Sims.)) at three levels of discrimination. As part of the analysis, the optimal voxel size for modelling the composite waveforms was investigated, the most important predictor metrics for species classification assessed and the effect of scan angle on species discrimination examined. Results demonstrate that all tree species were classified with relatively high accuracy (68.6% for six classes, 75.8% for four main species and 86.2% for conifers and broadleaved trees). Full-waveform metrics (based on height of median energy, waveform distance and number of waveform peaks) demonstrated high classification importance and were stable among various voxel sizes. The results also suggest that the voxel-based approach can alleviate some of the issues associated with large scan angles. In summary, the results indicate that full-waveform LiDAR data have significant potential for tree species classification in subtropical forests.

  20. VLSI implementation of flexible architecture for decision tree classification in data mining

    NASA Astrophysics Data System (ADS)

    Sharma, K. Venkatesh; Shewandagn, Behailu; Bhukya, Shankar Nayak

    2017-07-01

    Data mining algorithms have become vital to researchers in science, engineering, medicine, business, search and security domains. In recent years, there has been a tremendous rise in the size of the data being collected and analyzed. Classification is the main difficulty faced in data mining. Among the solutions developed for this problem, the most widely accepted is Decision Tree Classification (DTC), which gives high precision while handling very large amounts of data. This paper presents a VLSI implementation of a flexible architecture for decision tree classification in data mining using the C4.5 algorithm.
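
    As background for the C4.5 algorithm named above: C4.5 ranks candidate splits by the gain ratio, i.e. information gain divided by the split information of the attribute. The following is a minimal software sketch of that criterion for a categorical attribute. It is not the hardware architecture described in the paper, and the function names and toy data are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(items):
    """Shannon entropy (in bits) of a sequence of hashable values."""
    n = len(items)
    return -sum((c / n) * log2(c / n) for c in Counter(items).values())


def gain_ratio(values, labels):
    """Gain ratio of splitting `labels` by the categorical attribute `values`."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Expected entropy of the class labels after the split.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    split_info = entropy(values)  # entropy of the attribute itself
    return gain / split_info if split_info > 0 else 0.0


# Toy data: the first attribute separates the classes perfectly,
# the second carries no information about them.
labels = [0, 0, 1, 1]
informative = gain_ratio(["a", "a", "b", "b"], labels)
uninformative = gain_ratio(["a", "b", "a", "b"], labels)
```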

  1. Multiclass Cancer Classification by Using Fuzzy Support Vector Machine and Binary Decision Tree With Gene Selection

    PubMed Central

    2005-01-01

    We investigate the problems of multiclass cancer classification with gene selection from gene expression data. Two different constructed multiclass classifiers with gene selection are proposed, which are fuzzy support vector machine (FSVM) with gene selection and binary classification tree based on SVM with gene selection. Using F test and recursive feature elimination based on SVM as gene selection methods, binary classification tree based on SVM with F test, binary classification tree based on SVM with recursive feature elimination based on SVM, and FSVM with recursive feature elimination based on SVM are tested in our experiments. To accelerate computation, preselecting the strongest genes is also used. The proposed techniques are applied to analyze breast cancer data, small round blue-cell tumors, and acute leukemia data. Compared to existing multiclass cancer classifiers and binary classification tree based on SVM with F test or binary classification tree based on SVM with recursive feature elimination based on SVM mentioned in this paper, FSVM based on recursive feature elimination based on SVM can find most important genes that affect certain types of cancer with high recognition accuracy. PMID:16046822

  2. An information-based network approach for protein classification

    PubMed Central

    Wan, Xiaogeng; Zhao, Xin; Yau, Stephen S. T.

    2017-01-01

    Protein classification is one of the critical problems in bioinformatics. Early studies used geometric distances and phylogenetic trees to classify proteins. These methods use binary trees to represent protein classification. In this paper, we propose a new protein classification method, whereby theories of information and networks are used to classify the multivariate relationships of proteins. In this study, the protein universe is modeled as an undirected network, where proteins are classified according to their connections. Our method is unsupervised, multivariate, and alignment-free. It can be applied to the classification of both protein sequences and structures. Nine examples are used to demonstrate the efficiency of our new method. PMID:28350835

  3. Using spectrotemporal indices to improve the fruit-tree crop classification accuracy

    NASA Astrophysics Data System (ADS)

    Peña, M. A.; Liao, R.; Brenning, A.

    2017-06-01

    This study assesses the potential of spectrotemporal indices derived from satellite image time series (SITS) to improve the classification accuracy of fruit-tree crops. Six major fruit-tree crop types in the Aconcagua Valley, Chile, were classified by applying various linear discriminant analysis (LDA) techniques on a Landsat-8 time series of nine images corresponding to the 2014-15 growing season. As features we not only used the complete spectral resolution of the SITS, but also all possible normalized difference indices (NDIs) that can be constructed from any two bands of the time series, a novel approach to derive features from SITS. Due to the high dimensionality of this “enhanced” feature set we used the lasso and ridge penalized variants of LDA (PLDA). Although classification accuracies yielded by the standard LDA applied on the full-band SITS were good (misclassification error rate, MER = 0.13), they were further improved by 23% (MER = 0.10) with ridge PLDA using the enhanced feature set. The most important bands to discriminate the crops of interest were mainly concentrated on the first two image dates of the time series, corresponding to the crops' greenup stage. Despite the high predictor weights provided by the red and near infrared bands, typically used to construct greenness spectral indices, other spectral regions were also found important for the discrimination, such as the shortwave infrared band at 2.11-2.19 μm, sensitive to foliar water changes. These findings support the usefulness of spectrotemporal indices in the context of SITS-based crop type classifications, which until now have been mainly constructed by the arithmetic combination of two bands of the same image date in order to derive greenness temporal profiles like those from the normalized difference vegetation index.
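
    The construction of normalized difference indices from every pair of bands in a time series can be sketched as below, assuming the usual form NDI = (a - b)/(a + b). The band/date keys and reflectance values are hypothetical placeholders, not data from the study. The second helper shows the arithmetic behind the reported 23% improvement, read here as a relative reduction in MER from 0.13 to 0.10.

```python
from itertools import combinations


def normalized_difference_indices(bands):
    """Return {(name_a, name_b): (a - b) / (a + b)} for every pair of bands.

    `bands` maps a band/date label to the reflectance of one pixel.
    """
    ndis = {}
    for (name_a, a), (name_b, b) in combinations(bands.items(), 2):
        denom = a + b
        ndis[(name_a, name_b)] = (a - b) / denom if denom != 0 else 0.0
    return ndis


def relative_reduction(before, after):
    """Relative decrease of an error rate, as a fraction of the initial value."""
    return (before - after) / before


# Hypothetical pixel: two bands on date 1 and one band on date 2.
pixel = {"nir_date1": 0.5, "red_date1": 0.1, "swir_date2": 0.2}
features = normalized_difference_indices(pixel)
mer_gain = relative_reduction(0.13, 0.10)  # about 0.23
```

    With n band/date layers the number of indices grows as n(n - 1)/2, which is consistent with the abstract's remark that the enhanced feature set is high dimensional and motivates penalized LDA.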

  4. A statistical approach to root system classification

    PubMed Central

    Bodner, Gernot; Leitner, Daniel; Nakhforoosh, Alireza; Sobotik, Monika; Moder, Karl; Kaul, Hans-Peter

    2013-01-01

    Plant root systems have a key role in ecology and agronomy. Despite the rapid increase in root studies, there is still no classification that allows distinguishing among distinctive characteristics within the diversity of rooting strategies. Our hypothesis is that a multivariate approach for “plant functional type” identification in ecology can be applied to the classification of root systems. The classification method presented is based on a data-defined statistical procedure without a priori decision on the classifiers. The study demonstrates that principal component based rooting types provide efficient and meaningful multi-trait classifiers. The classification method is exemplified with simulated root architectures and morphological field data. Simulated root architectures showed that morphological attributes with spatial distribution parameters capture most distinctive features within root system diversity. While developmental type (tap vs. shoot-borne systems) is a strong but coarse classifier, topological traits provide the most detailed differentiation among distinctive groups. Adequacy of commonly available morphologic traits for classification is supported by field data. Rooting types emerging from measured data were mainly distinguished as diameter/weight-dominated and density-dominated types. Similarity of root systems within distinctive groups was the joint result of phylogenetic relation and environmental as well as human selection pressure. We concluded that the data-defined classification is appropriate for integration of knowledge obtained with different root measurement methods and at various scales. Currently root morphology is the most promising basis for classification due to widely used common measurement protocols. To capture details of root diversity, efforts in architectural measurement techniques are essential. PMID:23914200

  5. A neural network approach to cloud classification

    NASA Technical Reports Server (NTRS)

    Lee, Jonathan; Weger, Ronald C.; Sengupta, Sailes K.; Welch, Ronald M.

    1990-01-01

    It is shown that, using high-spatial-resolution data, very high cloud classification accuracies can be obtained with a neural network approach. A texture-based neural network classifier using only single-channel visible Landsat MSS imagery achieves an overall cloud identification accuracy of 93 percent. Cirrus can be distinguished from boundary layer cloudiness with an accuracy of 96 percent, without the use of an infrared channel. Stratocumulus is retrieved with an accuracy of 92 percent, cumulus at 90 percent. The use of the neural network does not improve cirrus classification accuracy. Rather, its main effect is in the improved separation between stratocumulus and cumulus cloudiness. While most cloud classification algorithms rely on linear parametric schemes, the present study is based on a nonlinear, nonparametric four-layer neural network approach. A three-layer neural network architecture, the nonparametric K-nearest neighbor approach, and the linear stepwise discriminant analysis procedure are compared. A significant finding is that significantly higher accuracies are attained with the nonparametric approaches using only 20 percent of the database as training data, compared to 67 percent of the database in the linear approach.

  6. The use of airborne hyperspectral data for tree species classification in a species-rich Central European forest area

    NASA Astrophysics Data System (ADS)

    Richter, Ronny; Reu, Björn; Wirth, Christian; Doktor, Daniel; Vohland, Michael

    2016-10-01

    The success of remote sensing approaches to assess tree species diversity in a heterogeneously mixed forest stand depends on the availability of both appropriate data and suitable classification algorithms. To separate the high number of in total ten broadleaf tree species in a small structured floodplain forest, the Leipzig Riverside Forest, we introduce a majority based classification approach for Discriminant Analysis based on Partial Least Squares (PLS-DA), which was tested against Random Forest (RF) and Support Vector Machines (SVM). The classifier performance was tested on different sets of airborne hyperspectral image data (AISA DUAL) that were acquired on single dates in August and September and also stacked to a composite product. Shadowed gaps and shadowed crown parts were eliminated via spectral mixture analysis (SMA) prior to the pixel-based classification. Training and validation sets were defined spectrally with the conditioned Latin hypercube method as a stratified random sampling procedure. In the validation, PLS-DA consistently outperformed the RF and SVM approaches on all datasets. The additional use of spectral variable selection (CARS, "competitive adaptive reweighted sampling") combined with PLS-DA further improved classification accuracies. Up to 78.4% overall accuracy was achieved for the stacked dataset. The image recorded in August provided slightly higher accuracies than the September image, regardless of the applied classifier.

  7. Parallel K-dimensional tree classification based on semi-matroid structure for remote sensing applications

    NASA Astrophysics Data System (ADS)

    Chang, Yang-Lang; Chen, Zhi-Ming; Liu, Jin-Nan; Chang, Lena; Fang, Jyh Perng

    2010-08-01

    Satellite remote sensing images can be interpreted to provide important information on large-scale natural resources, such as lands, oceans, mountains, rivers, forests and minerals, for Earth observations. Recent advances in remote sensing technologies have improved the availability of satellite imagery in a wide range of applications including high dimensional remote sensing data sets (e.g. high spectral and high spatial resolution images). The information in high dimensional remote sensing images obtained by state-of-the-art sensor technologies can be identified more accurately than in images acquired by conventional remote sensing techniques. However, due to the large volume of image data, these images require a huge amount of storage and computing time. As a result, the computational complexity of data processing for high dimensional remote sensing data analysis increases. Consequently, this paper proposes a novel classification algorithm based on semi-matroid structure, known as the parallel k-dimensional tree semi-matroid (PKTSM) classification, which adopts a new hybrid parallel approach to deal with high dimensional data sets. It is implemented by combining the message passing interface (MPI) library, the open multi-processing (OpenMP) application programming interface and the compute unified device architecture (CUDA) of graphics processing units (GPU) in a hybrid mode. The effectiveness of the proposed PKTSM is evaluated by using MODIS/ASTER airborne simulator (MASTER) images and airborne synthetic aperture radar (AIRSAR) images for land cover classification during the Pacrim II campaign. The experimental results demonstrated that the proposed hybrid PKTSM can significantly improve the performance in terms of both computational speed-up and classification accuracy.

  8. Tree Species Classification By Multiseasonal High Resolution Satellite Data

    NASA Astrophysics Data System (ADS)

    Elatawneh, Alata; Wallner, Adelheid; Straub, Christoph; Schneider, Thomas; Knoke, Thomas

    2013-12-01

    Accurate forest tree species mapping is a fundamental issue for sustainable forest management and planning. Forest tree species mapping by means of remote sensing data is still a topic to be investigated. The Bavarian State Institute of Forestry is investigating the potential of using digital aerial images for forest management purposes. However, using aerial images is still cost- and time-consuming, in addition to their acquisition restrictions. New spaceborne sensor generations such as RapidEye, with a very high temporal resolution and offering multiseasonal data, have the potential to improve forest tree species mapping. In this study, we investigated the potential of multiseasonal RapidEye data for mapping tree species in a Mid European forest in Southern Germany. The RapidEye data of level A3 were collected on ten different dates in the years 2009, 2010 and 2011. For data analysis, a model was developed which combines the Spectral Angle Mapper technique with a 10-fold cross-validation. The analysis succeeded in differentiating four tree species: Norway spruce (Picea abies L.), Silver fir (Abies alba Mill.), European beech (Fagus sylvatica) and Maple (Acer pseudoplatanus). The model success was evaluated using digital aerial images acquired in the year 2009 and inventory point records from the 2008/09 inventory. Model results of the multiseasonal RapidEye data analysis achieved an overall accuracy of 76%. However, the success of the model was evaluated only for all the identified species together and not for the individual species.
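
    The Spectral Angle Mapper technique named above assigns a pixel to the reference spectrum with which its spectrum forms the smallest angle. A minimal sketch follows; the reference spectra and their values are hypothetical placeholders (only the species names come from the abstract), and the cross-validation part of the study's model is not reproduced.

```python
from math import acos, sqrt


def spectral_angle(pixel, reference):
    """Angle in radians between two spectra treated as vectors."""
    dot = sum(p * r for p, r in zip(pixel, reference))
    norm = sqrt(sum(p * p for p in pixel)) * sqrt(sum(r * r for r in reference))
    # Clamp to [-1, 1] to guard against floating-point round-off.
    return acos(max(-1.0, min(1.0, dot / norm)))


def classify_sam(pixel, references):
    """Return the name of the reference spectrum with the smallest angle."""
    return min(references, key=lambda name: spectral_angle(pixel, references[name]))


# Hypothetical three-band reference spectra (not measured values).
references = {
    "Norway spruce": [0.10, 0.20, 0.60],
    "European beech": [0.30, 0.30, 0.30],
}
# A brighter pixel with the same spectral shape as the spruce reference.
label = classify_sam([0.20, 0.40, 1.20], references)
```

    Because the angle ignores vector length, the measure is insensitive to a uniform brightness scaling of the spectrum, which is why the scaled pixel above still matches the spruce reference.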

  9. Stochastic gradient boosting classification trees for forest fuel types mapping through airborne laser scanning and IRS LISS-III imagery

    NASA Astrophysics Data System (ADS)

    Chirici, G.; Scotti, R.; Montaghi, A.; Barbati, A.; Cartisano, R.; Lopez, G.; Marchetti, M.; McRoberts, R. E.; Olsson, H.; Corona, P.

    2013-12-01

    This paper presents an application of Airborne Laser Scanning (ALS) data in conjunction with an IRS LISS-III image for mapping forest fuel types. For two study areas of 165 km² and 487 km² in Sicily (Italy), 16,761 plots of size 30-m × 30-m were distributed using a tessellation-based stratified sampling scheme. ALS metrics and spectral signatures from IRS extracted for each plot were used as predictors to classify forest fuel types observed and identified by photointerpretation and fieldwork. Following use of traditional parametric methods that produced unsatisfactory results, three non-parametric classification approaches were tested: (i) classification and regression tree (CART), (ii) the CART bagging method called Random Forests, and (iii) the CART bagging/boosting stochastic gradient boosting (SGB) approach. This contribution summarizes previous experiences using ALS data for estimating forest variables useful for fire management in general and for fuel type mapping, in particular. It summarizes characteristics of classification and regression trees, presents the pre-processing operation, the classification algorithms, and the achieved results. The results demonstrated superiority of the SGB method with overall accuracy of 84%. The most relevant ALS metric was canopy cover, defined as the percent of non-ground returns. Other relevant metrics included the spectral information from IRS and several other ALS metrics such as percentiles of the height distribution, the mean height of all returns, and the number of returns.
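
    The canopy cover metric defined above (percent of non-ground returns) and the mean height of all returns can be computed per plot as in the following sketch. The input layout, a list of (height, is_ground) tuples, and the function names are assumptions made for illustration; the study's actual pre-processing is not described in the abstract.

```python
def canopy_cover_percent(returns):
    """Percent of ALS returns in a plot that are not classified as ground.

    `returns` is a list of (height_m, is_ground) tuples (hypothetical layout).
    """
    if not returns:
        raise ValueError("plot contains no returns")
    non_ground = sum(1 for _, is_ground in returns if not is_ground)
    return 100.0 * non_ground / len(returns)


def mean_return_height(returns):
    """Mean height of all returns in the plot."""
    if not returns:
        raise ValueError("plot contains no returns")
    return sum(height for height, _ in returns) / len(returns)


# Hypothetical plot with two ground and two vegetation returns.
plot = [(0.0, True), (0.0, True), (8.0, False), (12.0, False)]
cover = canopy_cover_percent(plot)
mean_height = mean_return_height(plot)
```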

  10. A modified decision tree algorithm based on genetic algorithm for mobile user classification problem.

    PubMed

    Liu, Dong-sheng; Fan, Shu-jiang

    2014-01-01

    To offer mobile customers better service, mobile users should first be classified. To address the limitations of previous classification methods, this paper puts forward a modified decision tree algorithm for mobile user classification, which introduces a genetic algorithm to optimize the results of the decision tree algorithm. We also take context information as a classification attribute for the mobile user, and we classify the context into public and private context classes. We then analyze the processes and operators of the algorithm. Finally, in an experiment applying the algorithm to mobile users, we classify them into Basic service user, E-service user, Plus service user, and Total service user classes and also derive some rules about the mobile user. Compared to the C4.5 decision tree algorithm and the SVM algorithm, the proposed algorithm has higher accuracy and greater simplicity.

  11. A Modified Decision Tree Algorithm Based on Genetic Algorithm for Mobile User Classification Problem

    PubMed Central

    Liu, Dong-sheng; Fan, Shu-jiang

    2014-01-01

    To offer mobile customers better service, mobile users should first be classified. To address the limitations of previous classification methods, this paper puts forward a modified decision tree algorithm for mobile user classification, which introduces a genetic algorithm to optimize the results of the decision tree algorithm. We also take context information as a classification attribute for the mobile user, and we classify the context into public and private context classes. We then analyze the processes and operators of the algorithm. Finally, in an experiment applying the algorithm to mobile users, we classify them into Basic service user, E-service user, Plus service user, and Total service user classes and also derive some rules about the mobile user. Compared to the C4.5 decision tree algorithm and the SVM algorithm, the proposed algorithm has higher accuracy and greater simplicity. PMID:24688389

  12. Using classification tree modelling to investigate drug prescription practices at health facilities in rural Tanzania.

    PubMed

    Kajungu, Dan K; Selemani, Majige; Masanja, Irene; Baraka, Amuri; Njozi, Mustafa; Khatib, Rashid; Dodoo, Alexander N; Binka, Fred; Macq, Jean; D'Alessandro, Umberto; Speybroeck, Niko

    2012-09-05

    Drug prescription practices depend on several factors related to the patient, health worker and health facilities. A better understanding of the factors influencing prescription patterns is essential to develop strategies to mitigate the negative consequences associated with poor practices in both the public and private sectors. A cross-sectional study was conducted in rural Tanzania among patients attending health facilities, and health workers. Patients, health workers and health facilities-related factors with the potential to influence drug prescription patterns were used to build a model of key predictors. Standard data mining methodology of classification tree analysis was used to define the importance of the different factors on prescription patterns. This analysis included 1,470 patients and 71 health workers practicing in 30 health facilities. Patients were mostly treated in dispensaries. Twenty two variables were used to construct two classification tree models: one for polypharmacy (prescription of ≥3 drugs) on a single clinic visit and one for co-prescription of artemether-lumefantrine (AL) with antibiotics. The most important predictor of polypharmacy was the diagnosis of several illnesses. Polypharmacy was also associated with little or no supervision of the health workers, administration of AL and private facilities. Co-prescription of AL with antibiotics was more frequent in children under five years of age and the other important predictors were transmission season, mode of diagnosis and the location of the health facility. Standard data mining methodology is an easy-to-implement analytical approach that can be useful for decision-making. Polypharmacy is mainly due to the diagnosis of multiple illnesses.

  13. A Systematic Approach to Subgroup Classification in Intellectual Disability

    ERIC Educational Resources Information Center

    Schalock, Robert L.; Luckasson, Ruth

    2015-01-01

    This article describes a systematic approach to subgroup classification based on a classification framework and sequential steps involved in the subgrouping process. The sequential steps are stating the purpose of the classification, identifying the classification elements, using relevant information, and using clearly stated and purposeful…

  15. Automatic Approach to VHR Satellite Image Classification

    NASA Astrophysics Data System (ADS)

    Kupidura, P.; Osińska-Skotak, K.; Pluto-Kossakowska, J.

    2016-06-01

    In this paper, we propose a fully automatic classification of VHR satellite images. Unlike the most widespread approaches (supervised classification, which requires prior definition of class signatures, and unsupervised classification, which must be followed by an interpretation of its results), the proposed method requires no human intervention except for the setting of the initial parameters. The approach is based on both spectral and textural analysis of the image and consists of three steps. The first step, the analysis of spectral data, relies on NDVI values. Its purpose is to distinguish between basic classes, such as water, vegetation and non-vegetation, which differ significantly spectrally and can therefore be easily extracted by spectral analysis. The second step relies on granulometric maps. These are the product of local granulometric analysis of an image and present information on the texture of each pixel neighbourhood, depending on the texture grain. The purpose of texture analysis is to distinguish between classes that are spectrally similar but differ in texture, e.g. bare soil from a built-up area, or low vegetation from a wooded area. Because granulometric analysis is based on the mathematical morphology operations of opening and closing, the results are resistant to the border effect (qualifying borders of objects in an image as spaces of high texture), which affects other methods of texture analysis such as GLCM statistics or fractal analysis. Therefore, the effectiveness of the analysis is relatively high. Several indices based on values of different granulometric maps have been developed to simplify the extraction of classes of different texture. The third and final step of the process relies on a vegetation index based on the near-infrared and blue bands. Its purpose is to correct partially misclassified pixels. All the indices used in the developed classification model relate to reflectance values, so the preliminary step…
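
    The first step described above can be sketched as a simple NDVI thresholding rule. NDVI itself is the standard (NIR - red) / (NIR + red) ratio; the two thresholds and the reflectance values below are illustrative assumptions, since the abstract does not give the parameters the authors used.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    total = nir + red
    if total == 0:
        return 0.0  # avoid division by zero for pixels with no signal
    return (nir - red) / total

def basic_class(nir, red, water_max=0.0, vegetation_min=0.3):
    """Assign a coarse class from NDVI; both thresholds are assumed, not from the paper."""
    value = ndvi(nir, red)
    if value < water_max:
        return "water"
    if value >= vegetation_min:
        return "vegetation"
    return "non-vegetation"

# Hypothetical (NIR, red) reflectance pairs for three pixels.
pixels = [(0.02, 0.05), (0.45, 0.08), (0.25, 0.20)]
classes = [basic_class(nir, red) for nir, red in pixels]
print(classes)
```

    In the method as described, the texture-based second step would then subdivide these coarse classes; that part is not sketched here.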

  16. IQPC 2015 Track: Tree Separation and Classification in Mobile Mapping LIDAR Data

    NASA Astrophysics Data System (ADS)

    Gorte, B.; Oude Elberink, S.; Sirmacek, B.; Wang, J.

    2015-08-01

    The European FP7 project IQmulus organizes several processing contests each year, in which submissions are requested for novel algorithms for point cloud and other big geodata processing. This paper describes the set-up and execution of a contest whose purpose is to evaluate state-of-the-art algorithms for detecting and identifying individual trees in Mobile Mapping System (MMS) point clouds. By the nature of MMS these are trees in the vicinity of the road network (rather than in forests). Therefore, part of the challenge is distinguishing between trees and other objects, such as buildings, street furniture, cars etc. Three submitted segmentation and classification algorithms are thus evaluated.

  17. Accuracy and efficiency of area classifications based on tree tally

    Treesearch

    Michael S. Williams; Hans T. Schreuder; Raymond L. Czaplewski

    2001-01-01

    Inventory data are often used to estimate the area of the land base that is classified as a specific condition class. Examples include areas classified as old-growth forest, private ownership, or suitable habitat for a given species. Many inventory programs rely on classification algorithms of varying complexity to determine condition class. These algorithms can be...

  18. Automated morphological analysis of bone marrow cells in microscopic images for diagnosis of leukemia: nucleus-plasma separation and cell classification using a hierarchical tree model of hematopoesis

    NASA Astrophysics Data System (ADS)

    Krappe, Sebastian; Wittenberg, Thomas; Haferlach, Torsten; Münzenmayer, Christian

    2016-03-01

    The morphological differentiation of bone marrow is fundamental for the diagnosis of leukemia. Currently, the counting and classification of the different types of bone marrow cells is done manually using bright field microscopy. This is a time-consuming, subjective, tedious and error-prone process. Furthermore, repeated examinations of a slide may yield intra- and inter-observer variances. For that reason a computer assisted diagnosis system for bone marrow differentiation is pursued. In this work we focus (a) on a new method for the separation of nucleus and plasma parts and (b) on a knowledge-based hierarchical tree classifier for the differentiation of bone marrow cells into 16 different classes. Classification trees are easily interpretable and understandable and provide a classification together with an explanation. Using classification trees, expert knowledge (i.e. knowledge about similar classes and cell lines in the tree model of hematopoiesis) is integrated in the structure of the tree. The proposed segmentation method is evaluated with more than 10,000 manually segmented cells. For the evaluation of the proposed hierarchical classifier more than 140,000 automatically segmented bone marrow cells are used. Future automated solutions for the morphological analysis of bone marrow smears could potentially apply such an approach for the pre-classification of bone marrow cells and thereby shorten the examination time.

  19. Hierarchical description and extensive classification of protein structural changes by Motion Tree.

    PubMed

    Koike, Ryotaro; Ota, Motonori; Kidera, Akinori

    2014-02-06

    The structures of the same protein, determined under different conditions, provide clues toward understanding the role of structural changes in the protein's function. Structural changes are usually identified as rigid-body motions, which are defined using a particular threshold of rigidity, such as domain motions. However, each protein actually undergoes motions with various size and magnitude ranges. In this study, to describe protein structural changes more comprehensively, we propose a method based on hierarchical clustering. This method enables the illustration of a wide range of protein motions in a single tree diagram, named the "Motion Tree". We applied the method to 432 proteins exhibiting large structural changes and classified their Motion Trees in terms of the characteristic indices of the trees. This classification of the Motion Trees revealed clear relationships to their protein functions. Especially, complex structural changes are significantly correlated with multi-step protein functions.

  20. Development of prognostic indicators using Classification And Regression Trees (CART) for survival

    PubMed Central

    Nunn, Martha E.; Fan, Juanjuan; Su, Xiaogang; McGuire, Michael K.

    2014-01-01

    The development of an accurate prognosis is an integral component of treatment planning in the practice of periodontics. Prior work has evaluated the validity of using various clinical measured parameters for assigning periodontal prognosis as well as for predicting tooth survival and change in clinical conditions over time. We critically review the application of multivariate Classification And Regression Trees (CART) for survival in developing evidence-based periodontal prognostic indicators. We focus attention on two distinct methods of multivariate CART for survival: the marginal goodness-of-fit approach, and the multivariate exponential approach. A number of common clinical measures have been found to be significantly associated with tooth loss from periodontal disease, including furcation involvement, probing depth, mobility, crown-to-root ratio, and oral hygiene. However, the inter-relationships among these measures, as well as the relevance of other clinical measures to tooth loss from periodontal disease (such as bruxism, family history of periodontal disease, and overall bone loss), remain less clear. While inferences drawn from any single current study are necessarily limited, the application of new approaches in epidemiologic analyses to periodontal prognosis, such as CART for survival, should yield important insights into our understanding, and treatment, of periodontal diseases. PMID:22133372

  1. Mixed Neural Network Approach for Temporal Sleep Stage Classification.

    PubMed

    Dong, Hao; Supratak, Akara; Pan, Wei; Wu, Chao; Matthews, Paul M; Guo, Yike

    2017-07-28

    This paper proposes a practical approach to addressing limitations posed by the use of single-channel electroencephalography (EEG) for sleep stage classification. EEG-based characterizations of sleep stage progression contribute to the diagnosis and monitoring of the many pathologies of sleep. Several prior reports explored ways of automating the analysis of sleep EEG and of reducing the complexity of the data needed for reliable discrimination of sleep stages at lower cost in the home. However, these reports have involved recordings from electrodes placed on the cranial vertex or occiput, which are both uncomfortable and difficult to position. Previous studies of sleep stage scoring that used only frontal electrodes with a hierarchical decision tree motivated this paper, in which we have taken advantage of a rectifier neural network for detecting hierarchical features and a long short-term memory (LSTM) network for sequential data learning to optimize classification performance with single-channel recordings. After exploring alternative electrode placements, we found a comfortable configuration of a single-channel EEG on the forehead and have shown that it can be integrated with additional electrodes for simultaneous recording of the electrooculogram (EOG). Evaluation of data from 62 people (with 494 hours of sleep) demonstrated better performance of our analytical algorithm than is available from existing approaches with vertex or occipital electrode placements. Use of this recording configuration with neural network deconvolution promises to make clinically indicated home sleep studies practical.

  2. A Quality Classification System for Young Hardwood Trees - The First Step in Predicting Future Products

    Treesearch

    David L. Sonderman; Robert L. Brisbin

    1978-01-01

    Forest managers have no objective way to determine the relative value of culturally treated forest stands in terms of product potential. This paper describes the first step in the development of a quality classification system based on the measurement of individual tree characteristics for young hardwood stands.

  3. Using the PDD Behavior Inventory as a Level 2 Screener: A Classification and Regression Trees Analysis

    ERIC Educational Resources Information Center

    Cohen, Ira L.; Liu, Xudong; Hudson, Melissa; Gillis, Jennifer; Cavalari, Rachel N. S.; Romanczyk, Raymond G.; Karmel, Bernard Z.; Gardner, Judith M.

    2016-01-01

    In order to improve discrimination accuracy between Autism Spectrum Disorder (ASD) and similar neurodevelopmental disorders, a data mining procedure, Classification and Regression Trees (CART), was used on a large multi-site sample of PDD Behavior Inventory (PDDBI) forms on children with and without ASD. Discrimination accuracy exceeded 80%,…

  4. Effects of sample survey design on the accuracy of classification tree models in species distribution models

    Treesearch

    Thomas C. Edwards; D. Richard Cutler; Niklaus E. Zimmermann; Linda Geiser; Gretchen G. Moisen

    2006-01-01

    We evaluated the effects of probabilistic (hereafter DESIGN) and non-probabilistic (PURPOSIVE) sample surveys on resultant classification tree models for predicting the presence of four lichen species in the Pacific Northwest, USA. Models derived from both survey forms were assessed using an independent data set (EVALUATION). Measures of accuracy as gauged by...

  5. Using classification tree analysis to predict oak wilt distribution in Minnesota and Texas

    Treesearch

    Marla C. Downing; Vernon L. Thomas; Jennifer Juzwik; David N. Appel; Robin M. Reich; Kim Camilli

    2008-01-01

    We developed a methodology and compared results for predicting the potential distribution of Ceratocystis fagacearum (causal agent of oak wilt), in both Anoka County, MN, and Fort Hood, TX. The Potential Distribution of Oak Wilt (PDOW) utilizes a binary classification tree statistical technique that incorporates: geographical information systems (GIS...

  7. Quality-based Multimodal Classification Using Tree-Structured Sparsity

    DTIC Science & Technology

    2014-03-08

    Fuzzy Mathematical Techniques with Applications. Addison-Wesley, 1986. 2 [11] S. Kim and E. Xing. Tree-guided group lasso for multi-task regression...arxivId:1205.6544, 2012. 4 [14] B. Krishnapuram, L. Carin, M. A. Figueiredo, and A. J. Hartemink. Sparse multinomial logistic regression: Fast...TFS, 1(2):98–110, 1993. 2, 4 [16] A. Kumar and H. Daume III. Learning task grouping and overlap in multi-task learning. arXiv:1206.6417, 2012. 2 [17

  8. A Classification of Recent Widespread Tree Mortality in the Western US

    NASA Astrophysics Data System (ADS)

    Hicke, J. A.; Anderegg, W.; Allen, C. D.; Stephenson, N.

    2015-12-01

    Widespread tree mortality has been documented across the western United States in recent decades. Climate change has been implicated in these events, in particular warming and associated effects on tree stress and biotic disturbance agents. Given projected future warming, the capability of accurately predicting future tree mortality is critical. However, sufficient ecological understanding is needed to do so. Here we describe differences in various mortality types associated with spatial characteristics and climate drivers. We loosely classify mortality types into four categories: 1) widespread but low severity background mortality that has been increasing mainly because of greater stress associated with rising climatic water deficit; 2) tree die-offs that are driven by severe, hotter drought in which biotic agents play minor roles, such as sudden aspen decline; 3) tree die-offs in which hotter droughts combined with outbreaks of biotic agents, often less aggressive bark beetles, to cause mortality, such as piñon pine mortality in the Southwest; and 4) tree die-offs that were initiated or facilitated by droughts but which were associated with aggressive biotic agents that can kill healthy trees at high populations, such as mountain pine beetle outbreaks. An important use of this classification is the different pathways by which climate change can cause tree mortality. For some classes (background and primarily drought-driven mortality), predictions may be sufficiently accurate based on climate (drought) metrics. For classes in which biotic agents play a role, the direct warming effect on insects may occur through mechanisms not related to drought, and therefore predictions may need to include mechanisms other than drought. We note that this is a simplistic classification designed to facilitate understanding of tree mortality, and that overlap occurs among categories.

  9. Classification and Progression Based on CFS-GA and C5.0 Boost Decision Tree of TCM Zheng in Chronic Hepatitis B.

    PubMed

    Chen, Xiao Yu; Ma, Li Zhuang; Chu, Na; Zhou, Min; Hu, Yiyang

    2013-01-01

    Chronic hepatitis B (CHB) is a serious public health problem, and Traditional Chinese Medicine (TCM) plays an important role in the control and treatment for CHB. In the treatment of TCM, zheng discrimination is the most important step. In this paper, an approach based on CFS-GA (Correlation based Feature Selection and Genetic Algorithm) and C5.0 boost decision tree is used for zheng classification and progression in the TCM treatment of CHB. The CFS-GA performs better than the typical method of CFS. By CFS-GA, the acquired attribute subset is classified by C5.0 boost decision tree for TCM zheng classification of CHB, and C5.0 decision tree outperforms two typical decision trees of NBTree and REPTree on CFS-GA, CFS, and nonselection in comparison. Based on the critical indicators from C5.0 decision tree, important lab indicators in zheng progression are obtained by the method of stepwise discriminant analysis for expressing TCM zhengs in CHB, and alterations of the important indicators are also analyzed in zheng progression. In conclusion, all the three decision trees perform better on CFS-GA than on CFS and nonselection, and C5.0 decision tree outperforms the two typical decision trees both on attribute selection and nonselection.
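
    For readers unfamiliar with CFS, the sketch below computes the usual CFS merit of a feature subset, k * r_cf / sqrt(k + k(k-1) * r_ff), where r_cf is the mean feature-class correlation and r_ff the mean feature-feature correlation. A genetic algorithm such as the one named in the abstract would search over subsets to maximise a score of this kind; that search is not shown, and the correlation values below are invented for illustration.

```python
import math

def cfs_merit(feature_class_corr, feature_feature_corr):
    """CFS merit of a subset.

    feature_class_corr: one correlation per feature in the subset.
    feature_feature_corr: one correlation per unordered feature pair (empty if k == 1).
    """
    k = len(feature_class_corr)
    r_cf = sum(feature_class_corr) / k
    r_ff = sum(feature_feature_corr) / len(feature_feature_corr) if feature_feature_corr else 0.0
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)

# Hypothetical two-feature subsets with the same relevance but different redundancy.
complementary = cfs_merit([0.6, 0.5], [0.2])
redundant = cfs_merit([0.6, 0.5], [0.9])
print(round(complementary, 3), round(redundant, 3))
```

    The merit rewards features that correlate with the class and penalises features that correlate with each other, which is why the redundant pair scores lower.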

  10. Classification tree analysis of second neoplasms in survivors of childhood cancer

    PubMed Central

    Jazbec, Janez; Todorovski, Ljupčo; Jereb, Berta

    2007-01-01

    Background: Reports on childhood cancer survivors estimate that the cumulative probability of developing secondary neoplasms varies from 3.3% to 25% at 25 years from diagnosis, and that the risk of developing another cancer is several times greater than in the general population. Methods: In our retrospective study, we used the classification tree multivariate method on a group of 849 first cancer survivors to identify childhood cancer patients with the greatest risk of developing secondary neoplasms. Results: In the observed group of patients, 34 developed a secondary neoplasm after treatment of the primary cancer. Analysis of parameters present at the treatment of the first cancer exposed two groups of patients at special risk for a secondary neoplasm. The first group is female patients treated for Hodgkin's disease between 10 and 15 years of age, whose treatment included radiotherapy. The second group at special risk is male patients with acute lymphoblastic leukemia who were treated between 4.6 and 6.6 years of age. Conclusion: The risk groups identified in our study are similar to the results of studies that used more conventional approaches. The usefulness of our approach in the study of the occurrence of second neoplasms should be confirmed in a larger sample study, but the user-friendly presentation of results makes it attractive for further studies. PMID:17270060

  11. Decision tree approach for soil liquefaction assessment.

    PubMed

    Gandomi, Amir H; Fridline, Mark M; Roke, David A

    2013-01-01

    In the current study, the performances of some decision tree (DT) techniques are evaluated for postearthquake soil liquefaction assessment. A database containing 620 records of seismic parameters and soil properties is used in this study. Three decision tree techniques are used here in two different ways, considering statistical and engineering points of view, to develop decision rules. The DT results are compared to the logistic regression (LR) model. The results of this study indicate that the DTs not only successfully predict liquefaction but they can also outperform the LR model. The best DT models are interpreted and evaluated based on an engineering point of view.

  12. Decision Tree Approach for Soil Liquefaction Assessment

    PubMed Central

    Gandomi, Amir H.; Fridline, Mark M.; Roke, David A.

    2013-01-01

    In the current study, the performances of some decision tree (DT) techniques are evaluated for postearthquake soil liquefaction assessment. A database containing 620 records of seismic parameters and soil properties is used in this study. Three decision tree techniques are used here in two different ways, considering statistical and engineering points of view, to develop decision rules. The DT results are compared to the logistic regression (LR) model. The results of this study indicate that the DTs not only successfully predict liquefaction but they can also outperform the LR model. The best DT models are interpreted and evaluated based on an engineering point of view. PMID:24489498

  13. Understanding tree growth responses after partial cuttings: A new approach

    PubMed Central

    Rossi, Sergio; Lussier, Jean-Martin; Walsh, Denis; Morin, Hubert

    2017-01-01

    Forest ecosystem management heads towards the use of partial cuttings. However, the wide variation in growth response of residual trees remains unexplained, preventing a suitable prediction of forest productivity. The aim of the study was to assess individual growth and identify the driving factors involved in the responses of residual trees. Six study blocks in even-aged black spruce [Picea mariana (Mill.) B.S.P.] stands of the eastern Canadian boreal forest were submitted to experimental shelterwood and seed-tree treatments. Individual-tree models were applied to 1039 trees to analyze their patterns of radial growth during the 10 years after partial cutting by using the nonlinear Schnute function on tree-ring series. The trees exhibited different growth patterns. A sigmoid growth was detected in 32% of trees, mainly in control plots of older stands. Forty-seven percent of trees located in the interior of residual strips showed an S-shape, which was influenced by stand mortality, harvested intensity and dominant height. Individuals showing an exponential pattern produced the greatest radial growth after cutting and were edge trees of younger stands with higher dominant height. A steady growth decline was observed in 4% of trees, represented by the individuals suppressed and insensitive to the treatment. The analyses demonstrated that individual nonlinear models are able to assess the variability in growth within the stand and the factors involved in the occurrence of the different growth patterns, thus improving understanding of the tree responses to partial cutting. This new approach can sustain forest management strategies by defining the best conditions to optimize the growth yield of residual trees. PMID:28222200

  14. Understanding tree growth responses after partial cuttings: A new approach.

    PubMed

    Montoro Girona, Miguel; Rossi, Sergio; Lussier, Jean-Martin; Walsh, Denis; Morin, Hubert

    2017-01-01

    Forest ecosystem management heads towards the use of partial cuttings. However, the wide variation in growth response of residual trees remains unexplained, preventing a suitable prediction of forest productivity. The aim of the study was to assess individual growth and identify the driving factors involved in the responses of residual trees. Six study blocks in even-aged black spruce [Picea mariana (Mill.) B.S.P.] stands of the eastern Canadian boreal forest were submitted to experimental shelterwood and seed-tree treatments. Individual-tree models were applied to 1039 trees to analyze their patterns of radial growth during the 10 years after partial cutting by using the nonlinear Schnute function on tree-ring series. The trees exhibited different growth patterns. A sigmoid growth was detected in 32% of trees, mainly in control plots of older stands. Forty-seven percent of trees located in the interior of residual strips showed an S-shape, which was influenced by stand mortality, harvested intensity and dominant height. Individuals showing an exponential pattern produced the greatest radial growth after cutting and were edge trees of younger stands with higher dominant height. A steady growth decline was observed in 4% of trees, represented by the individuals suppressed and insensitive to the treatment. The analyses demonstrated that individual nonlinear models are able to assess the variability in growth within the stand and the factors involved in the occurrence of the different growth patterns, thus improving understanding of the tree responses to partial cutting. This new approach can sustain forest management strategies by defining the best conditions to optimize the growth yield of residual trees.

  15. Multi-test decision tree and its application to microarray data classification.

    PubMed

    Czajkowski, Marcin; Grześ, Marek; Kretowski, Marek

    2014-05-01

    A desirable property of tools used to investigate biological data is easy-to-understand models and predictive decisions. Decision trees are particularly promising in this regard due to their comprehensible nature that resembles the hierarchical process of human decision making. However, existing algorithms for learning decision trees have a tendency to underfit gene expression data. The main aim of this work is to improve the performance and stability of decision trees with only a small increase in their complexity. We propose a multi-test decision tree (MTDT); our main contribution is the application of several univariate tests in each non-terminal node of the decision tree. We also search for alternative, lower-ranked features in order to obtain more stable and reliable predictions. Experimental validation was performed on several real-life gene expression datasets. Comparison results with eight classifiers show that MTDT has a statistically significantly higher accuracy than popular decision tree classifiers, and it was highly competitive with ensemble learning algorithms. The proposed solution managed to outperform its baseline algorithm on 14 datasets by an average of 6%. A study performed on one of the datasets showed that the discovered genes used in the MTDT classification model are supported by biological evidence in the literature. This paper introduces a new type of decision tree which is more suitable for solving biological problems. MTDTs are relatively easy to analyze and much more powerful in modeling high dimensional microarray data than their popular counterparts. Copyright © 2014 Elsevier B.V. All rights reserved.
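
    One way to picture a node that applies several univariate tests is shown below. The abstract does not say how the tests in a node are combined, so the majority vote used here is an assumption, and the gene names, thresholds and sample values are hypothetical.

```python
def multi_test_goes_left(sample, tests):
    """Route a sample by majority vote over univariate threshold tests.

    Each test is a (feature_name, threshold) pair that votes "left"
    when sample[feature_name] <= threshold.
    """
    votes_left = sum(sample[feature] <= threshold for feature, threshold in tests)
    return votes_left * 2 > len(tests)

# Hypothetical node with three tests on made-up gene names.
node_tests = [("gene_a", 0.5), ("gene_b", 1.2), ("gene_c", 0.1)]
sample = {"gene_a": 0.4, "gene_b": 1.0, "gene_c": 0.3}
go_left = multi_test_goes_left(sample, node_tests)
print(go_left)
```

    Here the sample fails the `gene_c` test but is still routed left because the other two tests agree, which illustrates how combining tests could make a node less sensitive to noise in any single gene.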

  16. A Nonparametric Approach to Estimate Classification Accuracy and Consistency

    ERIC Educational Resources Information Center

    Lathrop, Quinn N.; Cheng, Ying

    2014-01-01

    When cut scores for classifications occur on the total score scale, popular methods for estimating classification accuracy (CA) and classification consistency (CC) require assumptions about a parametric form of the test scores or about a parametric response model, such as item response theory (IRT). This article develops an approach to estimate CA…

  18. Effects of sample survey design on the accuracy of classification tree models in species distribution models

    USGS Publications Warehouse

    Edwards, T.C.; Cutler, D.R.; Zimmermann, N.E.; Geiser, L.; Moisen, G.G.

    2006-01-01

    We evaluated the effects of probabilistic (hereafter DESIGN) and non-probabilistic (PURPOSIVE) sample surveys on resultant classification tree models for predicting the presence of four lichen species in the Pacific Northwest, USA. Models derived from both survey forms were assessed using an independent data set (EVALUATION). Measures of accuracy as gauged by resubstitution rates were similar for each lichen species irrespective of the underlying sample survey form. Cross-validation estimates of prediction accuracies were lower than resubstitution accuracies for all species and both design types, and in all cases were closer to the true prediction accuracies based on the EVALUATION data set. We argue that greater emphasis should be placed on calculating and reporting cross-validation accuracy rates rather than simple resubstitution accuracy rates. Evaluation of the DESIGN and PURPOSIVE tree models on the EVALUATION data set shows significantly lower prediction accuracy for the PURPOSIVE tree models relative to the DESIGN models, indicating that non-probabilistic sample surveys may generate models with limited predictive capability. These differences were consistent across all four lichen species, with 11 of the 12 possible species and sample survey type comparisons having significantly lower accuracy rates. Some differences in accuracy were as large as 50%. The classification tree structures also differed considerably both among and within the modelled species, depending on the sample survey form. Overlap in the predictor variables selected by the DESIGN and PURPOSIVE tree models ranged from only 20% to 38%, indicating the classification trees fit the two evaluated survey forms on different sets of predictor variables. The magnitude of these differences in predictor variables throws doubt on ecological interpretation derived from prediction models based on non-probabilistic sample surveys. © 2006 Elsevier B.V. All rights reserved.
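
    The gap between resubstitution and cross-validation accuracy can be reproduced on a toy problem. The sketch below uses a 1-nearest-neighbour classifier rather than a classification tree, purely to keep the example short; the data and the fold assignment (index modulo k) are made up and unrelated to the lichen data.

```python
def predict_1nn(train, x):
    """Label of the training point closest to x on a single feature."""
    return min(train, key=lambda item: abs(item[0] - x))[1]

def accuracy(train, test):
    """Fraction of test points whose predicted label matches the true label."""
    hits = sum(predict_1nn(train, x) == y for x, y in test)
    return hits / len(test)

def cross_validated_accuracy(data, k):
    """k-fold cross-validation with folds assigned by index modulo k."""
    hits = 0
    for fold in range(k):
        test = [d for i, d in enumerate(data) if i % k == fold]
        train = [d for i, d in enumerate(data) if i % k != fold]
        hits += sum(predict_1nn(train, x) == y for x, y in test)
    return hits / len(data)

# Hypothetical (feature, presence) pairs with overlapping classes.
data = [(0.1, 0), (0.2, 0), (0.35, 1), (0.4, 0), (0.6, 1), (0.65, 0), (0.8, 1), (0.9, 1)]
resub = accuracy(data, data)            # model is tested on its own training data
cv = cross_validated_accuracy(data, k=4)
print(resub, cv)
```

    Resubstitution is perfect here because every point is its own nearest neighbour, while the cross-validated estimate is much lower; an unpruned tree behaves similarly, which is the optimism the authors warn about.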

  19. The minimum distance approach to classification

    NASA Technical Reports Server (NTRS)

    Wacker, A. G.; Landgrebe, D. A.

    1971-01-01

    Work to advance the state of the art of minimum distance classification is reported. This is accomplished through a combination of theoretical and comprehensive experimental investigations based on multispectral scanner data. A survey of the literature for suitable distance measures was conducted and the results of this survey are presented. It is shown that minimum distance classification, using density estimators and Kullback-Leibler numbers as the distance measure, is equivalent to a form of maximum likelihood sample classification. It is also shown that for the parametric case, minimum distance classification is equivalent to nearest neighbor classification in the parameter space.
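
    A minimal sketch of minimum distance sample classification is given below, assuming univariate Gaussian densities and the Kullback-Leibler divergence from the estimated sample density to each class density. The report covers more general density estimators and multispectral data; the Gaussian form, the direction of the divergence, the class names and all numbers here are assumptions for illustration.

```python
import math
from statistics import mean, pstdev

def fit_gaussian(samples):
    """Maximum-likelihood mean and standard deviation of a 1-D sample set."""
    return mean(samples), pstdev(samples)

def kl_gaussian(p, q):
    """KL(p || q) for univariate Gaussians given as (mean, std) pairs."""
    (mu_p, sd_p), (mu_q, sd_q) = p, q
    return math.log(sd_q / sd_p) + (sd_p ** 2 + (mu_p - mu_q) ** 2) / (2 * sd_q ** 2) - 0.5

def classify_sample(samples, class_models):
    """Assign the whole sample set to the class whose density is nearest in KL divergence."""
    estimate = fit_gaussian(samples)
    return min(class_models, key=lambda name: kl_gaussian(estimate, class_models[name]))

# Hypothetical class densities (mean, std) for one spectral band.
class_models = {"wheat": (0.30, 0.05), "soil": (0.55, 0.08)}
field = [0.28, 0.33, 0.31, 0.26, 0.35]  # hypothetical pixels from one field
label = classify_sample(field, class_models)
print(label)
```

    The unit being classified is the set of pixels as a whole, not each pixel separately, which is what distinguishes sample classification from per-point classification.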

  20. Stroke damage detection using classification trees on electrical bioimpedance cerebral spectroscopy measurements.

    PubMed

    Atefi, Seyed Reza; Seoane, Fernando; Thorlin, Thorleif; Lindecrantz, Kaj

    2013-08-07

    After cancer and cardio-vascular disease, stroke is the third greatest cause of death worldwide. Given the limitations of the current imaging technologies used for stroke diagnosis, the need for portable non-invasive and less expensive diagnostic tools is crucial. Previous studies have suggested that electrical bioimpedance (EBI) measurements from the head might contain useful clinical information related to changes produced in the cerebral tissue after the onset of stroke. In this study, we recorded 720 EBI Spectroscopy (EBIS) measurements from two different head regions of 18 hemispheres of nine subjects. Three of these subjects had suffered a unilateral haemorrhagic stroke. A number of features based on structural and intrinsic frequency-dependent properties of the cerebral tissue were extracted. These features were then fed into a classification tree. The results show that a full classification of damaged and undamaged cerebral tissue was achieved after three hierarchical classification steps. Lastly, the performance of the classification tree was assessed using Leave-One-Out Cross Validation (LOO-CV). Despite the fact that the results of this study are limited to a small database, and the observations obtained must be verified further with a larger cohort of patients, these findings confirm that EBI measurements contain useful information for assessing the health of brain tissue after stroke and support the hypothesis that classification features based on Cole parameters, spectral information and the geometry of EBIS measurements are useful to differentiate between healthy and stroke-damaged brain tissue.
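
    Leave-one-out cross validation of a tree can be sketched with a one-level tree (a decision stump) on a single feature. The feature values below are invented, and the stump is a stand-in for the three-step classification tree in the study, whose actual features and splits are not given in the abstract.

```python
def fit_stump(data):
    """Find the threshold and orientation on a 1-D feature with the fewest training errors."""
    xs = sorted({x for x, _ in data})
    candidates = [(a + b) / 2 for a, b in zip(xs, xs[1:])] or [xs[0]]
    best = None
    for t in candidates:
        for high_label in (0, 1):
            errors = sum((high_label if x > t else 1 - high_label) != y for x, y in data)
            if best is None or errors < best[0]:
                best = (errors, t, high_label)
    return best[1], best[2]

def predict(stump, x):
    """Apply a fitted stump: values above the threshold get `high_label`."""
    t, high_label = stump
    return high_label if x > t else 1 - high_label

def loo_accuracy(data):
    """Train on n-1 points, test on the held-out point, and repeat for every point."""
    hits = 0
    for i, (x, y) in enumerate(data):
        stump = fit_stump(data[:i] + data[i + 1:])
        hits += predict(stump, x) == y
    return hits / len(data)

# Hypothetical feature values with 0 = healthy and 1 = damaged tissue.
measurements = [(0.2, 0), (0.3, 0), (0.4, 0), (0.7, 1), (0.8, 1), (0.9, 1)]
score = loo_accuracy(measurements)
print(score)
```

    When several measurements come from the same subject, leaving out one subject at a time rather than one measurement avoids testing on data correlated with the training set; which variant the study used is not stated in the abstract.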

  1. Computer-aided diagnosis of Alzheimer's disease using support vector machines and classification trees

    NASA Astrophysics Data System (ADS)

    Salas-Gonzalez, D.; Górriz, J. M.; Ramírez, J.; López, M.; Álvarez, I.; Segovia, F.; Chaves, R.; Puntonet, C. G.

    2010-05-01

    This paper presents a computer-aided diagnosis technique for improving the accuracy of early diagnosis of Alzheimer-type dementia. The proposed methodology is based on the selection of voxels which present Welch's t-test between both classes, normal and Alzheimer images, greater than a given threshold. The mean and standard deviation of intensity values are calculated for selected voxels. They are chosen as feature vectors for two different classifiers: support vector machines with linear kernel and classification trees. The proposed methodology reaches greater than 95% accuracy in the classification task.
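    The voxel-selection step described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: images are flat lists of voxel intensities, the absolute value of the t statistic is compared with the threshold, and the names `welch_t`, `select_voxels` and `features` are hypothetical.

```python
import math
from statistics import mean, pstdev, variance


def welch_t(a, b):
    """Welch's t statistic for two samples with possibly unequal variances."""
    diff = mean(a) - mean(b)
    # Standard error built from the two sample variances (n - 1 denominator).
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0:
        # Degenerate case: no spread in either group.
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / se


def select_voxels(normal, alzheimer, threshold):
    """Indices of voxels whose |t| between the two classes exceeds threshold.

    Using |t| rather than signed t is an assumption of this sketch.
    """
    selected = []
    for v in range(len(normal[0])):
        a = [img[v] for img in normal]
        b = [img[v] for img in alzheimer]
        if abs(welch_t(a, b)) > threshold:
            selected.append(v)
    return selected


def features(image, selected):
    """Feature vector: mean and standard deviation over the selected voxels."""
    vals = [image[v] for v in selected]
    return (mean(vals), pstdev(vals))
```

    The two numbers returned by `features` would then be passed to whichever classifier is used (a linear-kernel SVM or a classification tree in the abstract).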

  2. The PhyloFacts FAT-CAT web server: ortholog identification and function prediction using fast approximate tree classification.

    PubMed

    Afrasiabi, Cyrus; Samad, Bushra; Dineen, David; Meacham, Christopher; Sjölander, Kimmen

    2013-07-01

    The PhyloFacts 'Fast Approximate Tree Classification' (FAT-CAT) web server provides a novel approach to ortholog identification using subtree hidden Markov model-based placement of protein sequences to phylogenomic orthology groups in the PhyloFacts database. Results on a data set of microbial, plant and animal proteins demonstrate FAT-CAT's high precision at separating orthologs and paralogs and robustness to promiscuous domains. We also present results documenting the precision of ortholog identification based on subtree hidden Markov model scoring. The FAT-CAT phylogenetic placement is used to derive a functional annotation for the query, including confidence scores and drill-down capabilities. PhyloFacts' broad taxonomic and functional coverage, with >7.3 M proteins from across the Tree of Life, enables FAT-CAT to predict orthologs and assign function for most sequence inputs. Four pipeline parameter presets are provided to handle different sequence types, including partial sequences and proteins containing promiscuous domains; users can also modify individual parameters. PhyloFacts trees matching the query can be viewed interactively online using the PhyloScope Javascript tree viewer and are hyperlinked to various external databases. The FAT-CAT web server is available at http://phylogenomics.berkeley.edu/phylofacts/fatcat/.

  3. Automatic lung nodule classification with radiomics approach

    NASA Astrophysics Data System (ADS)

    Ma, Jingchen; Wang, Qian; Ren, Yacheng; Hu, Haibo; Zhao, Jun

    2016-03-01

    Lung cancer is the leading cause of cancer death. Malignant lung nodules have extremely high mortality, while some benign nodules do not need any treatment. Thus, accurate differentiation between benign and malignant nodules is necessary. Although an additional invasive biopsy or a second CT scan 3 months later may help radiologists make judgments, easier diagnostic approaches are urgently needed. In this paper, we propose a novel CAD method to distinguish benign from malignant lung nodules directly from CT images, which can not only improve the efficiency of tumor diagnosis but also greatly decrease the pain and risk to patients in the biopsy collection process. Briefly, following the state-of-the-art radiomics approach, 583 features were used in the first step to measure nodules' intensity, shape, heterogeneity and information in multi-frequencies. Further, with the Random Forest method, we distinguish benign nodules from malignant nodules by analyzing all these features. Our proposed scheme was tested on all 79 CT scans with diagnosis data available in The Cancer Imaging Archive (TCIA), which contain 127 nodules, each annotated by at least one of four radiologists participating in the project. This method achieved 82.7% accuracy in the classification of malignant primary lung nodules and benign nodules. We believe it would bring much value for routine lung cancer diagnosis in CT imaging and provide improvement in decision support at much lower cost.

  4. Classification and regression trees for epidemiologic research: an air pollution example

    PubMed Central

    2014-01-01

    Background Identifying and characterizing how mixtures of exposures are associated with health endpoints is challenging. We demonstrate how classification and regression trees can be used to generate hypotheses regarding joint effects from exposure mixtures. Methods We illustrate the approach by investigating the joint effects of CO, NO2, O3, and PM2.5 on emergency department visits for pediatric asthma in Atlanta, Georgia. Pollutant concentrations were categorized as quartiles. Days when all pollutants were in the lowest quartile were held out as the referent group (n = 131) and the remaining 3,879 days were used to estimate the regression tree. Pollutants were parameterized as dichotomous variables representing each ordinal split of the quartiles (e.g. comparing CO quartile 1 vs. CO quartiles 2–4) and considered one at a time in a Poisson case-crossover model with control for confounding. The pollutant-split resulting in the smallest P-value was selected as the first split and the dataset was partitioned accordingly. This process repeated for each subset of the data until the P-values for the remaining splits were not below a given alpha, resulting in the formation of a “terminal node”. We used the case-crossover model to estimate the adjusted risk ratio for each terminal node compared to the referent group, as well as the likelihood ratio test for the inclusion of the terminal nodes in the final model. Results The largest risk ratio corresponded to days when PM2.5 was in the highest quartile and NO2 was in the lowest two quartiles (RR: 1.10, 95% CI: 1.05, 1.16). A simultaneous Wald test for the inclusion of all terminal nodes in the model was significant, with a chi-square statistic of 34.3 (p = 0.001, with 13 degrees of freedom). Conclusions Regression trees can be used to hypothesize about joint effects of exposure mixtures and may be particularly useful in the field of air pollution epidemiology for gaining a better understanding of complex

  5. Identification and classification of dynamic event tree scenarios via possibilistic clustering: application to a steam generator tube rupture event.

    PubMed

    Mercurio, D; Podofillini, L; Zio, E; Dang, V N

    2009-11-01

    This paper illustrates a method to identify and classify scenarios generated in a dynamic event tree (DET) analysis. Identification and classification are carried out by means of an evolutionary possibilistic fuzzy C-means clustering algorithm which takes into account not only the final system states but also the timing of the events and the process evolution. An application is considered with regards to the scenarios generated following a steam generator tube rupture in a nuclear power plant. The scenarios are generated by the accident dynamic simulator (ADS), coupled to a RELAP code that simulates the thermo-hydraulic behavior of the plant and to an operators' crew model, which simulates their cognitive and procedures-guided responses. A set of 60 scenarios has been generated by the ADS DET tool. The classification approach has grouped the 60 scenarios into 4 classes of dominant scenarios, one of which was not anticipated a priori but was "discovered" by the classifier. The proposed approach may be considered as a first effort towards the application of identification and classification approaches to scenarios post-processing for real-scale dynamic safety assessments.

  6. A comparison of non-symmetric entropy-based classification trees and support vector machine for cardiovascular risk stratification.

    PubMed

    Singh, Anima; Guttag, John V

    2011-01-01

    Classification tree-based risk stratification models generate easily interpretable classification rules. This feature makes classification tree-based models appealing for use in a clinical setting, provided that they have comparable accuracy to other methods. In this paper, we present and evaluate the performance of a non-symmetric entropy-based classification tree algorithm. The algorithm is designed to accommodate class imbalance found in many medical datasets. We evaluate the performance of this algorithm, and compare it to that of SVM-based classifiers, when applied to 4219 non-ST elevation acute coronary syndrome patients. We generated SVM-based classifiers using three different strategies for handling class imbalance: cost-sensitive SVM learning, synthetic minority oversampling (SMOTE), and random majority undersampling. We used both linear and radial basis kernel-based SVMs. Our classification tree models outperformed SVM-based classifiers generated using each of the three techniques. On average, the classification tree models yielded a 14% improvement in G-score and a 21% improvement in F-score relative to the linear SVM classifiers with the best performance. Similarly, our classification tree models yielded a 12% improvement in G-score and a 21% improvement in the F-score over the best RBF kernel-based SVM classifiers.

  7. Classification tree analysis of the factors influencing injury-related disability caused by the Wenchuan earthquake.

    PubMed

    Liu, Xiang; Liu, Yuan-Yuan; Liu, Si-Huan; Zhang, Xiang-Rong; Du, Lei; Huang, Wen-Xia

    2014-04-01

    To identify the factors that influenced the risk of injury-related disability caused by the Wenchuan earthquake. A chi-squared automatic interaction detection (CHAID) classification tree analysis was used to retrospectively analyse clinical data from patients who underwent surgical treatment for earthquake-related injuries in the first 5 days after the earthquake. The CHAID classification tree explored the relationships between the development of disability and potential influencing factors including sex, age, time interval between injury and treatment, wound type, preoperative and postoperative haemoglobin levels, and operation time. A total of 334 patients underwent surgery; of these, 113 (33.8%) were discharged with varying degrees of permanent disability. The CHAID classification tree showed that children (≤ 17 years old), a long time interval between injury and treatment, an open wound and a low preoperative haemoglobin level were significant risk factors for disability. The results of this study can help to stratify patients according to their medical needs and to help allocate the available resources efficiently to ensure the best outcomes for injured patients during future earthquakes.

  8. Classification of the cycle of the seminiferous epithelium in the common tree shrew (Tupaia glis).

    PubMed

    Maeda, S; Endo, H; Kimura, J; Rerkamnuaychoke, W; Chungsamarnyart, N; Yamada, J; Kurohmaru, M; Hayashi, Y; Nishida, T

    1996-05-01

    The classification of the cycle of the seminiferous epithelium was carried out in the common tree shrew (Tupaia glis). The tree shrews captured in Thailand were fixed with Bouin's fixative, embedded in paraffin wax, and stained with PAS-hematoxylin. The cycle was classified into twelve stages on the basis of the acrosomal changes of spermatids. Relative frequencies of stages from I to XII were 11.9, 7.2, 8.9, 22.5, 12.9, 9.7, 8.0, 5.9, 4.0, 3.2, 2.9, and 3.6%, respectively. Different stages did not appear in a cross-sectioned tubule as they do in primates. The head of the matured spermatid was discoidal in shape and different from that of primates and rodents. Spermatogenesis of the common tree shrew is different from that of primates and rodents according to its morphological features.

  9. A rough sets approach of hyperspectral image classification

    NASA Astrophysics Data System (ADS)

    Wu, Zhaocong; Li, Deren

    2005-10-01

    Rough set theory has a powerful capability for attribute reduction and classification rule extraction, while the artificial neural network (ANN) performs well in classification problems with satisfactory accuracy. In this paper we focus on integrating rough set theory and the multilayer perceptron (MLP) in a soft computing paradigm for classification and rule generation in hyperspectral remote sensing image classification. The novelty of this method lies in applying rough set theory to extract classification rules and compute fuzzy membership values directly from the decision table after attribute reduction on a real-valued attribute table consisting of classification features. The successful application of this approach to mineral classification in hyperspectral remote sensing images illustrates the flexibility and practicality of this new approach.

  10. A systematic approach to the classification of diseases.

    PubMed

    Murthy, A R

    1993-01-01

    Ayurvedic texts have adopted multiple approaches to the classification of diseases. Caraka while choosing a binary classification in Vimana sthana declares that the classifications may be numerable and innumerable basing on the criteria chosen for such classification. He gives full liberty to the individual to go in for the newer and newer classification, provided the criteria are different. Taking cue from this statement an attempt has been made at categorizing the diseases mentioned in Ayurvedic texts under different systems in keeping with the current practice in the Western Medical Sciences.

  11. A SYSTEMATIC APPROACH TO THE CLASSIFICATION OF DISEASES

    PubMed Central

    Murthy, A.R.V.

    1993-01-01

    Ayurvedic texts have adopted multiple approaches to the classification of diseases. Caraka while choosing a binary classification in Vimana sthana declares that the classifications may be numerable and innumerable basing on the criteria chosen for such classification. He gives full liberty to the individual to go in for the newer and newer classification, provided the criteria are different. Taking cue from this statement an attempt has been made at categorizing the diseases mentioned in Ayurvedic texts under different systems in keeping with the current practice in the Western Medical Sciences. PMID:22556612

  12. Classification tree and minimum-volume ellipsoid analyses of the distribution of ponderosa pine in the western USA

    USGS Publications Warehouse

    Norris, Jodi R.; Jackson, Stephen T.; Betancourt, Julio L.

    2006-01-01

    Aim: Ponderosa pine (Pinus ponderosa Douglas ex Lawson & C. Lawson) is an economically and ecologically important conifer that has a wide geographic range in the western USA, but is mostly absent from the geographic centre of its distribution - the Great Basin and adjoining mountain ranges. Much of its modern range was achieved by migration of geographically distinct Sierra Nevada (P. ponderosa var. ponderosa) and Rocky Mountain (P. ponderosa var. scopulorum) varieties in the last 10,000 years. Previous research has confirmed genetic differences between the two varieties, and measurable genetic exchange occurs where their ranges now overlap in western Montana. A variety of approaches in bioclimatic modelling is required to explore the ecological differences between these varieties and their implications for historical biogeography and impending changes in western landscapes. Location: Western USA. Methods: We used a classification tree analysis and a minimum-volume ellipsoid as models to explain the broad patterns of distribution of ponderosa pine in modern environments using climatic and edaphic variables. Most biogeographical modelling assumes that the target group represents a single, ecologically uniform taxonomic population. Classification tree analysis does not require this assumption because it allows the creation of pathways that predict multiple positive and negative outcomes. Thus, classification tree analysis can be used to test the ecological uniformity of the species. In addition, a multidimensional ellipsoid was constructed to describe the niche of each variety of ponderosa pine, and distances from the niche were calculated and mapped on a 4-km grid for each ecological variable. Results: The resulting classification tree identified three dominant pathways predicting ponderosa pine presence. Two of these three pathways correspond roughly to the distribution of var. ponderosa, and the third pathway generally corresponds to the distribution of var

  13. Pesticides in urban multiunit dwellings: hazard identification using classification and regression tree (CART) analysis.

    PubMed

    Julien, Rhona; Levy, Jonathan I; Adamkiewicz, Gary; Hauser, Russ; Spengler, John D; Canales, Robert A; Hynes, H Patricia

    2008-10-01

    Many units in public housing or other low-income urban dwellings may have elevated pesticide residues, given recurring infestation, but it would be logistically and economically infeasible to sample a large number of units to identify highly exposed households to design interventions. Within this study, our aim was to devise a low-cost approach to identify homes in public housing with high levels of pesticide residues, using information that would allow the housing authority and residents to determine optimal strategies to reduce household exposures. As part of the Healthy Public Housing Initiative, we collected environmental samples from 42 public housing apartments in Boston, MA, in 2002 and 2003 and gathered housing characteristics; for example, household demographics and self-reported pesticide use information, considering information available with and without a home visit. Focusing on five organophosphate and pyrethroid pesticides, we used classification and regression tree analysis (CART) to disaggregate the pesticide concentration data into homogenous subsamples according to housing characteristics, which allowed us to identify households and associated networks impacted by the mismanagement of pesticides. The CART analysis demonstrated reasonable sensitivity and specificity given more extensive household information but generally poor performance using only information available without a home visit. Apartments with high concentrations of cyfluthrin, a pyrethroid of interest given that it is a restricted use pesticide, were more likely to be associated with Hispanic residents who resided in their current apartment for more than 5 yr, consistent with documented pesticide usage patterns. We conclude that using CART as an exploratory technique to better understand the home characteristics associated with elevated pesticide levels may be a viable approach for risk management in large multiunit housing developments.

  14. Classification of the PALMS single particle mass spectral data from Atlanta by regression tree analysis

    NASA Astrophysics Data System (ADS)

    Middlebrook, A. M.; Murphy, D. M.; Lee, S.; Thomson, D. S.

    2001-12-01

    During the Atlanta Supersites project in August 1999, the PALMS (Particle Analysis by Laser Mass Spectrometry) instrument collected over 500,000 individual particle spectra. The Atlanta data were originally analyzed by examining combinations of peaks and relative peak areas [Lee et al., 2001a,b], and a wide range of particle components such as sulfate, nitrate, mineral species, metals, organic species, and elemental carbon were detected. To further study the dataset, a classification program using regression tree analysis was developed and applied. Spectral data were compressed into a lower resolution spectrum (every 0.25 mass units) of the raw data and a list of peak areas (every mass unit). Each spectrum started as a normalized classification vector by itself. If the dot product of two classification vectors was within a certain threshold, they were combined into a new classification. The new classification vector was a normalized running average of the classifications being combined. In subsequent steps, the threshold for combining classifications was continuously lowered until a reasonable number of classifications remained. After the final iteration, each spectrum was compared individually with the entire set of classification vectors. Classifications were also combined manually. The classification results from the Atlanta data are generally consistent with those determined by peak identification. However, the classification program identified specific patterns in the mass spectra that were not found by peak identification and generated new particle types. Furthermore, rare particle types that may affect human health were studied in more detail. A description of the classification program as well as the results for the Atlanta data will be presented. Lee, S.-H., D. M. Murphy, D. S. Thomson, and A. M. Middlebrook, Chemical components of single particles measured with particle analysis by laser mass spectrometry (PALMS) during the Atlanta Supersites Project
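    The merging procedure described in this abstract (unit-length classification vectors, dot-product comparison, running averages, a progressively lowered threshold) can be sketched as follows. This is an assumption-laden illustration, not the PALMS program: the greedy first-match merge order, the starting threshold, the step size, the stopping rule and all function names are hypothetical, and spectra are assumed to be non-zero.

```python
import math


def normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def merge_pass(clusters, threshold):
    """Fold each cluster into the first earlier cluster it resembles.

    clusters is a list of (unit_vector, member_count) pairs.
    """
    merged = []
    for vec, n in clusters:
        for i, (mvec, mn) in enumerate(merged):
            if dot(vec, mvec) >= threshold:
                # Count-weighted running average, renormalized to unit length.
                avg = [(mn * a + n * b) / (mn + n) for a, b in zip(mvec, vec)]
                merged[i] = (normalize(avg), mn + n)
                break
        else:
            merged.append((vec, n))
    return merged


def classify(spectra, start=0.95, step=0.05, max_classes=2):
    """Lower the threshold until at most max_classes classifications remain."""
    clusters = [(normalize(s), 1) for s in spectra]
    threshold = start
    while len(clusters) > max_classes and threshold > 0:
        clusters = merge_pass(clusters, threshold)
        threshold -= step
    return clusters


def assign(spectrum, clusters):
    """Index of the classification vector closest to one spectrum."""
    unit = normalize(spectrum)
    return max(range(len(clusters)), key=lambda i: dot(unit, clusters[i][0]))
```

    The `assign` step corresponds to the final comparison of each spectrum with the entire set of classification vectors; manual merging of classifications is outside the scope of the sketch.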

  15. Classification of Tree Species in Overstorey Canopy of Subtropical Forest Using QuickBird Images

    PubMed Central

    Lin, Chinsu; Popescu, Sorin C.; Thomson, Gavin; Tsogt, Khongor; Chang, Chein-I

    2015-01-01

    This paper proposes a supervised classification scheme to identify 40 tree species (2 coniferous, 38 broadleaf) belonging to 22 families and 36 genera in high spatial resolution QuickBird multispectral images (HMS). Overall kappa coefficient (OKC) and species conditional kappa coefficients (SCKC) were used to evaluate classification performance in training samples and estimate accuracy and uncertainty in test samples. Baseline classification performance using HMS images and vegetation index (VI) images were evaluated with an OKC value of 0.58 and 0.48 respectively, but performance improved significantly (up to 0.99) when used in combination with an HMS spectral-spatial texture image (SpecTex). One of the 40 species had very high conditional kappa coefficient performance (SCKC ≥ 0.95) using 4-band HMS and 5-band VIs images, but, only five species had lower performance (0.68 ≤ SCKC ≤ 0.94) using the SpecTex images. When SpecTex images were combined with a Visible Atmospherically Resistant Index (VARI), there was a significant improvement in performance in the training samples. The same level of improvement could not be replicated in the test samples indicating that a high degree of uncertainty exists in species classification accuracy which may be due to individual tree crown density, leaf greenness (inter-canopy gaps), and noise in the background environment (intra-canopy gaps). These factors increase uncertainty in the spectral texture features and therefore represent potential problems when using pixel-based classification techniques for multi-species classification. PMID:25978466
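    For readers unfamiliar with the kappa statistics quoted above, the sketch below computes an overall kappa and a per-class conditional kappa from a confusion matrix. It assumes the standard Cohen's kappa and the usual row-based conditional kappa formula; the abstract does not state which variant the authors used, and the function names are hypothetical.

```python
def overall_kappa(confusion):
    """Cohen's kappa for a square confusion matrix (rows and columns = classes)."""
    n = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(len(confusion))) / n
    row_tot = [sum(row) for row in confusion]
    col_tot = [sum(col) for col in zip(*confusion)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_tot, col_tot)) / (n * n)
    return (observed - expected) / (1 - expected)


def conditional_kappa(confusion, i):
    """Conditional kappa for class i, based on that class's row total."""
    n = sum(sum(row) for row in confusion)
    row_i = sum(confusion[i])
    col_i = sum(row[i] for row in confusion)
    return (n * confusion[i][i] - row_i * col_i) / (n * row_i - row_i * col_i)
```

    A kappa of 1 indicates perfect agreement and 0 indicates agreement no better than chance, which is how values such as 0.58 and 0.99 in the abstract can be read.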

  16. Parameter optimization of image classification techniques to delineate crowns of coppice trees on UltraCam-D aerial imagery in woodlands

    NASA Astrophysics Data System (ADS)

    Erfanifard, Yousef; Stereńczak, Krzysztof; Behnia, Negin

    2014-01-01

    Estimating the optimal parameters of some classification techniques is a drawback of those techniques, because the parameter choice affects their performance on a given dataset and can reduce classification accuracy. This study aimed to optimize the combination of effective parameters of support vector machine (SVM), artificial neural network (ANN), and object-based image analysis (OBIA) classification techniques by the Taguchi method. The optimized techniques were applied to delineate crowns of Persian oak coppice trees on UltraCam-D very high spatial resolution aerial imagery in Zagros semiarid woodlands, Iran. The imagery was classified and the maps were assessed by receiver operating characteristic curve and other performance metrics. The results showed that Taguchi is a robust approach to optimize the combination of effective parameters in these image classification techniques. The area under curve (AUC) showed that the optimized OBIA could well discriminate tree crowns on the imagery (AUC=0.897), while SVM and ANN yielded slightly lower AUC values of 0.819 and 0.850, respectively. The indices of accuracy (0.999) and precision (0.999) and performance metrics of specificity (0.999) and sensitivity (0.999) in the optimized OBIA were higher than with other techniques. The optimization of effective parameters of image classification techniques by the Taguchi method, thus, provided encouraging results to discriminate the crowns of Persian oak coppice trees on UltraCam-D aerial imagery in Zagros semiarid woodlands.
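    As a reminder of what the AUC values above measure, the area under the ROC curve equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The sketch below computes it by direct pairwise comparison; the function name and the 0/1 label encoding are assumptions for illustration, and it is not the evaluation code used in the study.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count half).

    labels are assumed to be 1 for the positive class and 0 for the negative.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))
```

    The pairwise form is quadratic in the number of samples, which is fine for a small check but would normally be replaced by a rank-based computation for full images.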

  17. A practicable approach for periodontal classification

    PubMed Central

    Mittal, Vishnu; Bhullar, Raman Preet K.; Bansal, Rachita; Singh, Karanprakash; Bhalodi, Anand; Khinda, Paramjit K.

    2013-01-01

    The diagnosis and classification of periodontal diseases has long remained a dilemma. Two distinct concepts have been used to define diseases: essentialism and nominalism. The essentialistic concept implies the real existence of disease, whereas the nominalistic concept states that the names of diseases are a convenient way of stating concisely the endpoint of a diagnostic process. It generally advances from assessment of symptoms and signs toward knowledge of causation and gives a feasible option to name a disease for which the etiology is either unknown or too complex to assess in routine clinical practice. Various classifications were proposed by the American Academy of Periodontology (AAP) in 1986, 1989 and 1999. The AAP 1999 classification is among the most widely used classifications, but it also has demerits that impede its use in day-to-day practice. Hence a classification and diagnostic system is required which can help the clinician to assess the patient's need and provide a suitable treatment which is in harmony with the diagnosis for that particular case. Here is an attempt to propose a practicable classification and diagnostic system of periodontal diseases for better treatment outcome. PMID:24379855

  18. Using classification trees to detect induced sow lameness with a transient model.

    PubMed

    Abell, C E; Johnson, A K; Karriker, L A; Rothschild, M F; Hoff, S J; Sun, G; Fitzgerald, R F; Stalder, K J

    2014-06-01

    Foot and leg issues are among the main causes of sow removal in the US swine industry. More timely lameness detection among breeding herd females will allow better treatment decisions and outcomes. Producers will be able to treat lame females before the problem becomes too severe and cull females while they still have salvage value. The objective of this study was to compare the predictive abilities and accuracies of weight distribution and gait measures relative to each other and to a visual lameness detection method when detecting induced lameness among multiparous sows. Developing an objective lameness diagnosis algorithm will benefit animals, producers and scientists in timely and effective identification of lame individuals as well as aid producers in their efforts to decrease herd lameness by selecting animals that are less prone to become lame. In the early stages of lameness, weight distribution and gait are impacted. Lameness was chemically induced for a short time period in 24 multiparous sows and their weight distribution and walking gait were measured in the days following lameness induction. A linear mixed model was used to determine differences between measurements collected from day to day. Using a classification tree analysis, it was determined that the mean weight being placed on each leg was the most predictive measurement when determining whether the leg was sound or lame. The classification tree's predictive ability decreased as the number of days post-lameness induction increased. The weight distribution measurements had a greater predictive ability compared with the gait measurements. The error rates associated with the weight distribution trees were 29.2% and 31.3% at 6 days post-lameness induction for front and rear injected feet, respectively. For the gait classification trees, the error rates were 60.9% and 29.8% at 6 days post-lameness induction for front and rear injected feet, respectively. More timely lameness detection can improve

  19. Stratification of the severity of critically ill patients with classification trees

    PubMed Central

    2009-01-01

    Background Development of three classification trees (CT) based on the CART (Classification and Regression Trees), CHAID (Chi-Square Automatic Interaction Detection) and C4.5 methodologies for the calculation of probability of hospital mortality; the comparison of the results with the APACHE II, SAPS II and MPM II-24 scores, and with a model based on multiple logistic regression (LR). Methods Retrospective study of 2864 patients. Random partition (70:30) into a Development Set (DS) n = 1808 and Validation Set (VS) n = 808. Their properties of discrimination are compared with the ROC curve (AUC CI 95%), Percent of correct classification (PCC CI 95%); and the calibration with the Calibration Curve and the Standardized Mortality Ratio (SMR CI 95%). Results CTs are produced with a different selection of variables and decision rules: CART (5 variables and 8 decision rules), CHAID (7 variables and 15 rules) and C4.5 (6 variables and 10 rules). The common variables were: inotropic therapy, Glasgow, age, (A-a)O2 gradient and antecedent of chronic illness. In VS: all the models achieved acceptable discrimination with AUC above 0.7. CT: CART (0.75(0.71-0.81)), CHAID (0.76(0.72-0.79)) and C4.5 (0.76(0.73-0.80)). PCC: CART (72(69-75)), CHAID (72(69-75)) and C4.5 (76(73-79)). Calibration (SMR) better in the CT: CART (1.04(0.95-1.31)), CHAID (1.06(0.97-1.15)) and C4.5 (1.08(0.98-1.16)). Conclusion With different methodologies of CTs, trees are generated with different selection of variables and decision rules. The CTs are easy to interpret, and they stratify the risk of hospital mortality. The CTs should be taken into account for the classification of the prognosis of critically ill patients. PMID:20003229
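    Two of the summary measures named in this abstract are simple enough to state in code. The sketch below assumes that the SMR is the ratio of observed deaths to the sum of model-predicted death probabilities, and that the PCC is computed by thresholding predicted probabilities at 0.5; the cutoff and the function names are assumptions, since the abstract does not give them.

```python
def smr(died, predicted_prob):
    """Standardized mortality ratio: observed deaths / expected deaths.

    died holds 1 (died) or 0 (survived) per patient; expected deaths are
    taken as the sum of the model's predicted probabilities.
    """
    return sum(died) / sum(predicted_prob)


def pcc(died, predicted_prob, cutoff=0.5):
    """Percent of patients whose outcome matches the thresholded prediction."""
    correct = sum((p >= cutoff) == bool(d) for d, p in zip(died, predicted_prob))
    return 100.0 * correct / len(died)
```

    An SMR close to 1 indicates that the model predicts about as many deaths as were observed, which is the sense in which values such as 1.04 to 1.08 indicate good calibration.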

  20. A Two-Step Classification Approach to Distinguishing Similar Objects in Mobile LIDAR Point Clouds

    NASA Astrophysics Data System (ADS)

    He, H.; Khoshelham, K.; Fraser, C.

    2017-09-01

    Nowadays, lidar is widely used in cultural heritage documentation, urban modeling, and driverless car technology for its fast and accurate 3D scanning ability. However, full exploitation of the potential of point cloud data for efficient and automatic object recognition remains elusive. Recently, feature-based methods have become very popular in object recognition on account of their good performance in capturing object details. Compared with global features describing the whole shape of the object, local features recording the fractional details are more discriminative and are applicable for object classes with considerable similarity. In this paper, we propose a two-step classification approach based on point feature histograms and the bag-of-features method for automatic recognition of similar objects in mobile lidar point clouds. Lamp post, street light and traffic sign are grouped as one category in the first-step classification for their inter similarity compared with tree and vehicle. A finer classification of the lamp post, street light and traffic sign based on the result of the first-step classification is implemented in the second step. The proposed two-step classification approach is shown to yield a considerable improvement over the conventional one-step classification approach.

  1. Quantum Ensemble Classification: A Sampling-Based Learning Control Approach.

    PubMed

    Chen, Chunlin; Dong, Daoyi; Qi, Bo; Petersen, Ian R; Rabitz, Herschel

    2017-06-01

    Quantum ensemble classification (QEC) has significant applications in discrimination of atoms (or molecules), separation of isotopes, and quantum information extraction. However, quantum mechanics forbids deterministic discrimination among nonorthogonal states. The classification of inhomogeneous quantum ensembles is very challenging, since there exist variations in the parameters characterizing the members within different classes. In this paper, we recast QEC as a supervised quantum learning problem. A systematic classification methodology is presented by using a sampling-based learning control (SLC) approach for quantum discrimination. The classification task is accomplished via simultaneously steering members belonging to different classes to their corresponding target states (e.g., mutually orthogonal states). First, a new discrimination method is proposed for two similar quantum systems. Then, an SLC method is presented for QEC. Numerical results demonstrate the effectiveness of the proposed approach for the binary classification of two-level quantum ensembles and the multiclass classification of multilevel quantum ensembles.

  2. Comparing Methodologies for Developing an Early Warning System: Classification and Regression Tree Model versus Logistic Regression. REL 2015-077

    ERIC Educational Resources Information Center

    Koon, Sharon; Petscher, Yaacov

    2015-01-01

    The purpose of this report was to explicate the use of logistic regression and classification and regression tree (CART) analysis in the development of early warning systems. It was motivated by state education leaders' interest in maintaining high classification accuracy while simultaneously improving practitioner understanding of the rules by…

  3. Mineral recognition mapping using measured spectra based on classification and regression tree

    NASA Astrophysics Data System (ADS)

    Zhan, Yunjun; Su, Yubin; Huang, Jiejun; Ye, Fawang; Zhang, Chuan

    2016-10-01

    The alteration of surrounding rock is an important prospecting indicator in mineral exploration, but some important minerals are unclassified or misclassified when using hyperspectral remote sensing mineral recognition. A method for mineral recognition mapping was proposed. In this method, a decision tree discrimination rule was established based on the classification and regression tree data-mining algorithm and the absorption characteristics of field-measured spectra. Compared with spectral angle mapping and mixture-tuned matched filtering (MTMF), this method is shown to be efficient for mineral recognition mapping using hyperspectral images; its accuracy is 85.06%, which is greater than that of the MTMF method (83.91%). The advantages of the proposed method comprise the reduction of errors caused by the setting of the artificial threshold for mineral mapping and the lesser degree of difficulty in its training. Furthermore, the hierarchy structure of the decision tree in this method reflects the recognition process clearly, and the rule nodes are closely related to the spectra of the minerals; therefore, the advantage of this method is the interpretability of the results and the process. This method could be used for mineral recognition and classification using hyperspectral images.

  4. Development of structure-taste relationships for monosubstituted phenylsulfamate sweeteners using classification and regression tree (CART) analysis.

    PubMed

    Kelly, Damien P; Spillane, William J; Newell, John

    2005-08-24

    Twenty monosubstituted phenylsulfamates (cyclamates) have been synthesized and have had their taste portfolios determined. These have been combined with 63 compounds already in the literature to give a database of 83 ortho, meta, and para compounds. A training set of 75 compounds was randomly selected leaving eight compounds as a test set. A series of nine predictors determined with Corey-Pauling-Koltun models, calculated from the PC SPARTAN PRO program and Hammett sigma values taken mainly from the literature, have been used to establish structure-taste relationships for these types of sweeteners. The taste panel data for all compounds were categorized into three classes, namely, sweet (S), nonsweet (N), and sweet/nonsweet (N/S), and a novel "sweetness value" or weighting was also calculated for each compound. Linear and quadratic discriminant analysis were first used with the S, N, and N/S data, but the results were somewhat disappointing. Classification and regression tree analysis using the sweetness values for all 75 compounds was more successful, and only 14 were misclassified and six of the eight test set compounds were correctly classified. For the 29 meta compounds, one subset using just two parameters classified 83% of these compounds. Finally, using various methods, predictions were made on the likely tastes of a number of meta compounds and a striking agreement was found between the tree prediction and those given by earlier models. This appears to offer a strong vindication of the tree approach.

  5. Automated Diagnosis of Heart Sounds Using Rule-Based Classification Tree.

    PubMed

    Karar, Mohamed Esmail; El-Khafif, Sahar H; El-Brawany, Mohamed A

    2017-04-01

    In order to assist the diagnosis procedure of heart sound signals, this paper presents a new automated method for classifying the heart status using a rule-based classification tree into normal and three abnormal cases, namely aortic valve stenosis, aortic insufficiency, and ventricular septum defect. The developed method includes three main steps as follows. First, one cycle of the heart sound signals is automatically detected and segmented based on time properties of the heart signals. Second, the segmented cycle is preprocessed with the discrete wavelet transform and then the largest Lyapunov exponents are calculated to generate the dynamical features of the heart sound time series. Finally, these Lyapunov exponents are fed to a rule-based classification tree to give the final decision on the heart health status. The developed method has been tested successfully on twenty-two datasets of normal heart sounds and murmurs with a success rate of 95.5%. The resulting error can be easily corrected by modifying the classification rules; consequently, the accuracy of automated heart sounds diagnosis is further improved.

  6. A discrete element modelling approach for block impacts on trees

    NASA Astrophysics Data System (ADS)

    Toe, David; Bourrier, Franck; Olmedo, Ignatio; Berger, Frederic

    2015-04-01

    In the past few years, rockfall models explicitly accounting for block shape, especially those using the Discrete Element Method (DEM), have shown a good ability to predict rockfall trajectories. Integrating forest effects into those models still remains challenging. This study aims at using a DEM approach to model impacts of blocks on trees and identify the key parameters controlling the block kinematics after the impact on a tree. A DEM impact model of a block on a tree was developed and validated using laboratory experiments. Then, key parameters were assessed using a global sensitivity analysis. Modelling the impact of a block on a tree using DEM allows taking into account large displacements, material non-linearities and contacts between the block and the tree. Tree stems are represented by flexible cylinders modelled as plastic beams sustaining normal, shearing, bending, and twisting loading. Root-soil interactions are modelled using a rotation stiffness acting on the bending moment at the bottom of the tree and a limit bending moment to account for tree overturning. The crown is taken into account using an additional mass distributed uniformly on the upper part of the tree. The block is represented by a sphere. The contact model between the block and the stem consists of an elastic frictional model. The DEM model was validated using laboratory impact tests carried out on 41 fresh beech (Fagus sylvatica) stems. Each stem was 1.3 m long with a diameter between 3 and 7 cm. Wood stems were clamped on a rigid structure and impacted by a 149 kg Charpy pendulum. Finally, an intensive simulation campaign of blocks impacting trees was done to identify the input parameters controlling the block kinematics after the impact on a tree. Twenty input parameters were considered in the DEM simulation model: 12 parameters were related to the tree and 8 parameters to the block. The results highlight that the impact velocity, the stem diameter, and the block volume are the three input

  7. The Reliability of Classification of Terminal Nodes in GUIDE Decision Tree to Predict the Nonalcoholic Fatty Liver Disease.

    PubMed

    Birjandi, Mehdi; Ayatollahi, Seyyed Mohammad Taghi; Pourahmad, Saeedeh

    2016-01-01

    Tree structured modeling is a data mining technique used to recursively partition a dataset into relatively homogeneous subgroups in order to make more accurate predictions on generated classes. One of the classification tree induction algorithms, GUIDE, is a nonparametric method with suitable accuracy and low bias selection, which is used for predicting binary classes based on many predictors. In this tree, evaluating the accuracy of predicted classes (terminal nodes) is clinically of special importance. For this purpose, we used GUIDE classification tree in two statuses of equal and unequal misclassification cost in order to predict nonalcoholic fatty liver disease (NAFLD), considering 30 predictors. Then, to evaluate the accuracy of predicted classes by using bootstrap method, first the classification reliability in which individuals are assigned to a unique class and next the prediction probability reliability as support for that are considered.

  8. The Reliability of Classification of Terminal Nodes in GUIDE Decision Tree to Predict the Nonalcoholic Fatty Liver Disease

    PubMed Central

    Pourahmad, Saeedeh

    2016-01-01

    Tree structured modeling is a data mining technique used to recursively partition a dataset into relatively homogeneous subgroups in order to make more accurate predictions on generated classes. One of the classification tree induction algorithms, GUIDE, is a nonparametric method with suitable accuracy and low bias selection, which is used for predicting binary classes based on many predictors. In this tree, evaluating the accuracy of predicted classes (terminal nodes) is clinically of special importance. For this purpose, we used GUIDE classification tree in two statuses of equal and unequal misclassification cost in order to predict nonalcoholic fatty liver disease (NAFLD), considering 30 predictors. Then, to evaluate the accuracy of predicted classes by using bootstrap method, first the classification reliability in which individuals are assigned to a unique class and next the prediction probability reliability as support for that are considered. PMID:28053651

  9. Classification

    ERIC Educational Resources Information Center

    Clary, Renee; Wandersee, James

    2013-01-01

    In this article, Renee Clary and James Wandersee describe the beginnings of "Classification," which lies at the very heart of science and depends upon pattern recognition. Clary and Wandersee approach patterns by first telling the story of the "Linnaean classification system," introduced by Carl Linnaeus (1707-1778), who is…

  11. Prediction of severe acute pancreatitis using classification and regression tree analysis.

    PubMed

    Hong, Wandong; Dong, Lemei; Huang, Qingke; Wu, Wenzhi; Wu, Jiansheng; Wang, Yumin

    2011-12-01

    The available prognostic scoring systems for acute pancreatitis have limitations that restrict their clinical value. This study aimed to develop a decision model based on classification and regression tree (CART) analysis for the prediction of severe acute pancreatitis (SAP). A total of 420 patients with acute pancreatitis were enrolled. Study participants were randomly assigned to the training sample and test sample in a 2:1 ratio. First, univariate analysis and logistic regression analysis were used to identify predictors associated with SAP in the training sample. Then, CART analysis was carried out to develop a simple tree model for the prediction of SAP. A receiver operating characteristic (ROC) curve was constructed in order to assess the performance of the model. The prediction model was then applied to the test sample. Four variables (systemic inflammatory response syndrome [SIRS], pleural effusion, serum calcium, and blood urea nitrogen [BUN]) were identified as important predictors of SAP by logistic regression analysis. A tree model (which consisted of pleural effusion, serum calcium, and BUN) that was developed by CART analysis was able to distinguish early between cohorts at high (79.03%) and low (7.80%) risk of developing SAP. The area under the ROC curve of the tree model was higher than that of the APACHE II score (0.84 vs. 0.68; P < 0.001). The predicted accuracy of the tree model was validated in the test sample with an area under the ROC curve of 0.86. A decision tree model that consists of pleural effusion, serum calcium, and BUN may be useful for the prediction of SAP.
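
    CART builds a tree by repeatedly choosing the split that most reduces node impurity, commonly measured with the Gini index. The sketch below illustrates that single step for one continuous predictor and a binary outcome. It is a generic illustration, not the authors' model; the function names, the predictor values, and the labels are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def best_threshold(values, labels):
    """Return (threshold, weighted_gini) for the best split 'value <= threshold'."""
    best = (None, float("inf"))
    distinct = sorted(set(values))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(distinct, distinct[1:]):
        t = (lo + hi) / 2.0
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (t, score)
    return best


# Hypothetical toy values for one continuous predictor; 1 = severe, 0 = not severe.
toy_values = [1.6, 1.8, 1.9, 2.0, 2.2, 2.3, 2.4, 2.5]
toy_labels = [1, 1, 1, 0, 0, 0, 0, 0]
threshold, impurity = best_threshold(toy_values, toy_labels)
print(f"split at <= {threshold:.2f}, weighted Gini = {impurity:.3f}")
```

    A full CART implementation would apply this search to every predictor, recurse into the two child nodes, and then prune the tree; those steps are omitted here.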

  12. A phylogenomic approach to bacterial subspecies classification: proof of concept in Mycobacterium abscessus

    PubMed Central

    2013-01-01

    Background Mycobacterium abscessus is a rapidly growing mycobacterium that is often associated with human infections. The taxonomy of this species has undergone several revisions and is still being debated. In this study, we sequenced the genomes of 12 M. abscessus strains and used phylogenomic analysis to perform subspecies classification. Results A data mining approach was used to rank and select informative genes based on the relative entropy metric for the construction of a phylogenetic tree. The resulting tree topology was similar to that generated using the concatenation of five classical housekeeping genes: rpoB, hsp65, secA, recA and sodA. Additional support for the reliability of the subspecies classification came from the analysis of erm41 and ITS gene sequences, single nucleotide polymorphisms (SNPs)-based classification and strain clustering demonstrated by a variable number tandem repeat (VNTR) assay and a multilocus sequence analysis (MLSA). We subsequently found that the concatenation of a minimal set of three median-ranked genes: DNA polymerase III subunit alpha (polC), 4-hydroxy-2-ketovalerate aldolase (Hoa) and cell division protein FtsZ (ftsZ), is sufficient to recover the same tree topology. PCR assays designed specifically for these genes showed that all three genes could be amplified in the reference strain of M. abscessus ATCC 19977T. Conclusion This study provides proof of concept that whole-genome sequence-based data mining approach can provide confirmatory evidence of the phylogenetic informativeness of existing markers, as well as lead to the discovery of a more economical and informative set of markers that produces similar subspecies classification in M. abscessus. The systematic procedure used in this study to choose the informative minimal set of gene markers can potentially be applied to species or subspecies classification of other bacteria. PMID:24330254

  13. [Identification of the main risk factors for non infectious diseases: method of classification trees].

    PubMed

    Konstantinova, E D; Varaksin, A N; Zhovner, I V

    2013-01-01

    This paper presents the rationale for applying the method of classification trees, one of the methods for assessing the combined influence of multiple risk factors on population health. The method of classification trees is a hierarchical procedure for constructing a decision rule that divides the population into groups with higher and lower morbidity "in the coordinates of" risk factors. The main advantage of the method is the possibility of finding the complex of risk factors that has the greatest impact on the health of the population (in contrast to common methods, which analyze only single-factor effects). Two possible applications of classification trees are presented: 1) finding the complex of environmental risk factors (RF) that has the maximum impact on the prevalence of non-infectious diseases in preschool children in Yekaterinburg (environmental risk factors include pollution of air and drinking water, the presence of a gas stove in the child's flat, etc.). It is shown that, together with socio-economic risk factors, environmental risk factors increase the prevalence of respiratory diseases in preschool children in Yekaterinburg by 2.5-4 times (depending on the list and the number of environmental RF); 2) finding the complex of non-environmental factors that most effectively compensates for the negative effect of environmental pollution on human health. This formulation of the problem stems from the fact that environmental pollution factors usually cannot be modified, while family, behavioral or social factors can be partially or completely eliminated. Implementation of the recommendations presented in the paper can reduce the incidence of circulatory diseases in preschool children in Yekaterinburg by more than a factor of 2.

  14. Internal Carbon Recycling in Trees - New Approach, Findings, and Implications

    NASA Astrophysics Data System (ADS)

    Angert, A.; Hilman, B.

    2012-12-01

    The CO2 emitted by respiration in a tree's woody tissue (stem, branch, or root) is usually assumed to diffuse directly out to the atmosphere. Given that the internal concentrations of CO2 are one to two orders of magnitude higher than the atmospheric concentration, a reuse of this respired carbon can be beneficial to plants. We have developed a new method to track the fraction of respired CO2 not emitted from stems and branches, from the ratio of the CO2 efflux to the O2 influx. This ratio, which we defined as the apparent respiratory quotient (ARQ), is expected to equal 1.0 if carbohydrates are the substrate for respiration, and all respired CO2 is directly emitted. Using this approach we have recently shown that ~30% of the CO2 respired by Amazon forest tree stems was not directly emitted. In the current study we have applied this approach to 5 tree species living in a Mediterranean climate, and have performed seasonal and diurnal ARQ measurements, at different heights along the stem and branches. We found different seasonal variations in the ARQ of riparian versus drought-resilient trees. In addition, the ARQ diurnal cycle, together with the measurements at different heights, indicates that a considerable fraction of the CO2 not emitted is recycled within the tree.
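
    The ARQ definition in this record can be turned into a short calculation. The sketch below assumes, as the abstract does, that carbohydrate respiration produces one CO2 per O2 consumed, so that the fraction of respired CO2 not emitted is 1 - ARQ. The function names, the units, and the flux values are hypothetical.

```python
def apparent_respiratory_quotient(co2_efflux, o2_influx):
    """ARQ = CO2 efflux / O2 influx (both in the same units, e.g. umol m-2 s-1)."""
    if o2_influx <= 0:
        raise ValueError("o2_influx must be positive")
    return co2_efflux / o2_influx


def fraction_not_emitted(arq, expected_rq=1.0):
    """Fraction of respired CO2 not emitted, assuming the true quotient is expected_rq."""
    return 1.0 - arq / expected_rq


# Hypothetical flux values chosen so that ARQ = 0.7.
arq = apparent_respiratory_quotient(co2_efflux=0.7, o2_influx=1.0)
retained = fraction_not_emitted(arq)
print(f"ARQ = {arq:.2f}, fraction not emitted = {retained:.0%}")
```

    With an ARQ of 0.7, the calculation gives a non-emitted fraction of about 30%, the order of magnitude reported above for Amazon forest tree stems.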

  15. A novel modulation classification approach using Gabor filter network.

    PubMed

    Ghauri, Sajjad Ahmed; Qureshi, Ijaz Mansoor; Cheema, Tanveer Ahmed; Malik, Aqdas Naveed

    2014-01-01

    A Gabor filter network based approach is used for feature extraction and classification of digital modulated signals by adaptively tuning the parameters of Gabor filter network. Modulation classification of digitally modulated signals is done under the influence of additive white Gaussian noise (AWGN). The modulations considered for the classification purpose are PSK 2 to 64, FSK 2 to 64, and QAM 4 to 64. The Gabor filter network uses the network structure of two layers; the first layer which is input layer constitutes the adaptive feature extraction part and the second layer constitutes the signal classification part. The Gabor atom parameters are tuned using Delta rule and updating of weights of Gabor filter using least mean square (LMS) algorithm. The simulation results show that proposed novel modulation classification algorithm has high classification accuracy at low signal to noise ratio (SNR) on AWGN channel.

  16. A Novel Modulation Classification Approach Using Gabor Filter Network

    PubMed Central

    Ghauri, Sajjad Ahmed; Qureshi, Ijaz Mansoor; Cheema, Tanveer Ahmed; Malik, Aqdas Naveed

    2014-01-01

    A Gabor filter network based approach is used for feature extraction and classification of digital modulated signals by adaptively tuning the parameters of Gabor filter network. Modulation classification of digitally modulated signals is done under the influence of additive white Gaussian noise (AWGN). The modulations considered for the classification purpose are PSK 2 to 64, FSK 2 to 64, and QAM 4 to 64. The Gabor filter network uses the network structure of two layers; the first layer which is input layer constitutes the adaptive feature extraction part and the second layer constitutes the signal classification part. The Gabor atom parameters are tuned using Delta rule and updating of weights of Gabor filter using least mean square (LMS) algorithm. The simulation results show that proposed novel modulation classification algorithm has high classification accuracy at low signal to noise ratio (SNR) on AWGN channel. PMID:25126603

  17. Flow cytometry data analysis: comparing large multivariate data sets using classification trees

    SciTech Connect

    Norman, J.

    1994-12-31

    This paper describes a method to compare flow cytometry data sets, which typically contain 50,000 six-parameter measurements each. By this method, the data points in two such data sets are divided into subpopulations using a binary classification tree generated from the data. The χ² test is then used to establish the homogeneity of the two data sets based on how their data are distributed across these subpopulations. Preliminary results indicate that this comparison method is sufficiently sensitive to detect differences between flow cytometry data sets that are too subtle for human investigators to notice.
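
    The homogeneity test described in this record can be sketched once each data set has been reduced to counts per subpopulation (tree leaf). The code below computes the Pearson chi-square statistic for the resulting 2 x k table. It is a generic illustration rather than the author's implementation; the function name and the leaf counts are hypothetical.

```python
def chi_square_homogeneity(counts_a, counts_b):
    """Pearson chi-square statistic and degrees of freedom for a 2 x k count table."""
    if len(counts_a) != len(counts_b):
        raise ValueError("both data sets must use the same subpopulations")
    total_a, total_b = sum(counts_a), sum(counts_b)
    grand = total_a + total_b
    stat = 0.0
    for a, b in zip(counts_a, counts_b):
        column = a + b
        if column == 0:
            raise ValueError("every subpopulation must contain at least one point")
        for observed, row_total in ((a, total_a), (b, total_b)):
            expected = row_total * column / grand  # count expected under homogeneity
            stat += (observed - expected) ** 2 / expected
    return stat, len(counts_a) - 1


# Hypothetical counts of data points falling into two tree leaves.
stat, dof = chi_square_homogeneity([40, 60], [60, 40])
print(f"chi-square = {stat:.1f} with {dof} degree(s) of freedom")
```

    The statistic would then be compared with a chi-square distribution with k - 1 degrees of freedom to judge whether the two data sets are distributed alike across the leaves.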

  18. The Learning Tree Montessori Child Care: An Approach to Diversity

    ERIC Educational Resources Information Center

    Wick, Laurie

    2006-01-01

    In this article the author describes how she and her partners started The Learning Tree Montessori Child Care, a Montessori program with a different approach in Seattle in 1979. The author also relates that the other area Montessori schools then offered half-day programs, and as a result the children who attended were, for the most part,…

  19. Eating Disorder Diagnoses: Empirical Approaches to Classification

    ERIC Educational Resources Information Center

    Wonderlich, Stephen A.; Joiner, Thomas E., Jr.; Keel, Pamela K.; Williamson, Donald A.; Crosby, Ross D.

    2007-01-01

    Decisions about the classification of eating disorders have significant scientific and clinical implications. The eating disorder diagnoses in the Diagnostic and Statistical Manual of Mental Disorders (4th ed.; DSM-IV; American Psychiatric Association, 1994) reflect the collective wisdom of experts in the field but are frequently not supported in…

  1. A cross-cultural investigation of college student alcohol consumption: a classification tree analysis.

    PubMed

    Kitsantas, Panagiota; Kitsantas, Anastasia; Anagnostopoulou, Tanya

    2008-01-01

    In this cross-cultural study, the authors attempted to identify high-risk subgroups for alcohol consumption among college students. American and Greek students (N = 132) answered questions about alcohol consumption, religious beliefs, attitudes toward drinking, advertisement influences, parental monitoring, and drinking consequences. Heavy drinkers in the American group were younger and less religious than were infrequent drinkers. In the Greek group, heavy drinkers tended to deny the negative results of drinking alcohol and use a permissive attitude to justify it, whereas infrequent drinkers were more likely to be monitored by their parents. These results suggest that parental monitoring and an emphasis on informing students about the negative effects of alcohol on their health and social and academic lives may be effective methods of reducing alcohol consumption. Classification tree analysis revealed that student attitudes toward drinking were important in the classification of American and Greek drinkers, indicating that this is a powerful predictor of alcohol consumption regardless of ethnic background.

  2. Prediction of protein phosphorylation sites using classification trees and SVM classifier

    NASA Astrophysics Data System (ADS)

    Betkier, Piotr; Szymański, Zbigniew

    2011-10-01

    The paper presents a method for recognizing protein phosphorylation sites. Six classifiers were created to predict whether specified amino acid sequences, represented as 9-character strings, react with given types of kinase enzymes. The method consists of three steps. In the first step, positions in the amino acid sequences that are significant for classification are found with the use of classification trees. Afterwards, the symbols composing the sequences are mapped to the real numbers domain using the Gini index method. The last step consists of creating the SVM classifiers as the final prediction models. The paper contains an evaluation of the obtained results and a description of the methods applied to evaluate the quality of the classifiers.

  3. Predicting Chemically Induced Duodenal Ulcer and Adrenal Necrosis with Classification Trees

    NASA Astrophysics Data System (ADS)

    Giampaolo, Casimiro; Gray, Andrew T.; Olshen, Richard A.; Szabo, Sandor

    1991-07-01

    Binary tree-structured statistical classification algorithms and properties of 56 model alkyl nucleophiles were brought to bear on two problems of experimental pharmacology and toxicology. Each rat of a learning sample of 745 was administered one compound and autopsied to determine the presence of duodenal ulcer or adrenal hemorrhagic necrosis. The cited statistical classification schemes were then applied to these outcomes and 67 features of the compounds to ascertain those characteristics that are associated with biologic activity. For predicting duodenal ulceration, dipole moment, melting point, and solubility in octanol are particularly important, while for predicting adrenal necrosis, important features include the number of sulfhydryl groups and double bonds. These methods may constitute inexpensive but powerful ways to screen untested compounds for possible organ-specific toxicity. Mechanisms for the etiology and pathogenesis of the duodenal and adrenal lesions are suggested, as are additional avenues for drug design.

  4. Phylogenetic approaches to microbial community classification.

    PubMed

    Ning, Jie; Beiko, Robert G

    2015-10-05

    The microbiota from different body sites are dominated by different major groups of microbes, but the variations within a body site such as the mouth can be more subtle. Accurate predictive models can serve as useful tools for distinguishing sub-sites and understanding key organisms and their roles and can highlight deviations from expected distributions of microbes. Good classification depends on choosing the right combination of classifier, feature representation, and learning model. Machine-learning procedures have been used in the past for supervised classification, but increased attention to feature representation and selection may produce better models and predictions. We focused our attention on the classification of nine oral sites and dental plaque in particular, using data collected from the Human Microbiome Project. A key focus of our representations was the use of phylogenetic information, both as the basis for custom kernels and as a way to represent sets of microbes to the classifier. We also used the PICRUSt software, which draws on phylogenetic relationships to predict molecular functions and to generate additional features for the classifier. Custom kernels based on the UniFrac measure of community dissimilarity did not improve performance. However, feature representation was vital to classification accuracy, with microbial clade and function representations providing useful information to the classifier; combining the two types of features did not yield increased prediction accuracy. Many of the best-performing clades and functions had clear associations with oral microflora. The classification of oral microbiota remains a challenging problem; our best accuracy on the plaque dataset was approximately 81%. Perfect accuracy may be unattainable due to the close proximity of the sites and intra-individual variation. However, further exploration of the space of both classifiers and feature representations is likely to increase the accuracy of

  5. Non-Destructive Classification Approaches for Equilibrated Ordinary Chondrites

    NASA Astrophysics Data System (ADS)

    Righter, K.; Harrington, R.; Schroeder, C.; Morris, R. V.

    2013-09-01

    In order to compare a few non-destructive classification techniques with the standard approaches, we have characterized a group of chondrites from the Larkman Nunatak region using magnetic susceptibility and Mössbauer spectroscopy.

  6. A statistical approach to set classification by feature selection with applications to classification of histopathology images.

    PubMed

    Jung, Sungkyu; Qiao, Xingye

    2014-09-01

    Set classification problems arise when classification tasks are based on sets of observations as opposed to individual observations. In set classification, a classification rule is trained with N sets of observations, where each set is labeled with class information, and the prediction of a class label is performed also with a set of observations. Data sets for set classification appear, for example, in diagnostics of disease based on multiple cell nucleus images from a single tissue. Relevant statistical models for set classification are introduced, which motivate a set classification framework based on context-free feature extraction. By understanding a set of observations as an empirical distribution, we employ a data-driven method to choose those features which contain information on location and major variation. In particular, the method of principal component analysis is used to extract the features of major variation. Multidimensional scaling is used to represent features as vector-valued points on which conventional classifiers can be applied. The proposed set classification approaches achieve better classification results than competing methods in a number of simulated data examples. The benefits of our method are demonstrated in an analysis of histopathology images of cell nuclei related to liver cancer.

  7. Classification of savanna tree species, in the Greater Kruger National Park region, by integrating hyperspectral and LiDAR data in a Random Forest data mining environment

    NASA Astrophysics Data System (ADS)

    Naidoo, L.; Cho, M. A.; Mathieu, R.; Asner, G.

    2012-04-01

    The accurate classification and mapping of individual trees at species level in the savanna ecosystem can provide numerous benefits for the managerial authorities. Such benefits include the mapping of economically useful tree species, which are a key source of food production and fuel wood for the local communities, and of problematic alien invasive and bush encroaching species, which can threaten the integrity of the environment and livelihoods of the local communities. Species level mapping is particularly challenging in African savannas which are complex, heterogeneous, and open environments with high intra-species spectral variability due to differences in geology, topography, rainfall, herbivory and human impacts within relatively short distances. Savanna vegetation is also highly irregular in canopy and crown shape, height and other structural dimensions with a combination of open grassland patches and dense woody thicket - a stark contrast to the more homogeneous forest vegetation. This study classified eight common savanna tree species in the Greater Kruger National Park region, South Africa, using a combination of hyperspectral and Light Detection and Ranging (LiDAR)-derived structural parameters, in the form of seven predictor datasets, in an automated Random Forest modelling approach. The most important predictors, which played an important role in the different classification models and contributed to the success of the hybrid dataset model when combined, were species tree height; NDVI; the chlorophyll b wavelength (466 nm) and a selection of raw, continuum removed and Spectral Angle Mapper (SAM) bands. It was also concluded that the hybrid predictor dataset Random Forest model yielded the highest classification accuracy and prediction success for the eight savanna tree species with an overall classification accuracy of 87.68% and KHAT value of 0.843.
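    The abstract reports an overall accuracy together with a KHAT (estimated kappa) value. As a minimal sketch of how both figures are usually derived from a confusion matrix, the following computes them for a small matrix; the function name and the three-class counts are hypothetical and are not taken from the study.

```python
def overall_accuracy_and_kappa(matrix):
    """Return (overall accuracy, kappa) for a square confusion matrix of counts."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    # Observed agreement: share of samples on the diagonal.
    observed = sum(matrix[i][i] for i in range(n)) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    # Agreement expected by chance from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical 3-class confusion matrix (illustration only, not study data).
toy_matrix = [
    [45, 3, 2],
    [4, 40, 6],
    [1, 5, 44],
]
accuracy, kappa = overall_accuracy_and_kappa(toy_matrix)
print(f"overall accuracy = {accuracy:.3f}, kappa = {kappa:.3f}")
```

    Kappa discounts the agreement that the marginal totals alone would produce, which is why it is lower than the overall accuracy for the same matrix.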

  8. The Tree of Life and a New Classification of Bony Fishes

    PubMed Central

    Betancur-R., Ricardo; Broughton, Richard E.; Wiley, Edward O.; Carpenter, Kent; López, J. Andrés; Li, Chenhong; Holcroft, Nancy I.; Arcila, Dahiana; Sanciangco, Millicent; Cureton II, James C; Zhang, Feifei; Buser, Thaddaeus; Campbell, Matthew A.; Ballesteros, Jesus A; Roa-Varon, Adela; Willis, Stuart; Borden, W. Calvin; Rowley, Thaine; Reneau, Paulette C.; Hough, Daniel J.; Lu, Guoqing; Grande, Terry; Arratia, Gloria; Ortí, Guillermo

    2013-01-01

    The tree of life of fishes is in a state of flux because we still lack a comprehensive phylogeny that includes all major groups. The situation is most critical for a large clade of spiny-finned fishes, traditionally referred to as percomorphs, whose uncertain relationships have plagued ichthyologists for over a century. Most of what we know about the higher-level relationships among fish lineages has been based on morphology, but rapid influx of molecular studies is changing many established systematic concepts. We report a comprehensive molecular phylogeny for bony fishes that includes representatives of all major lineages. DNA sequence data for 21 molecular markers (one mitochondrial and 20 nuclear genes) were collected for 1410 bony fish taxa, plus four tetrapod species and two chondrichthyan outgroups (total 1416 terminals). Bony fish diversity is represented by 1093 genera, 369 families, and all traditionally recognized orders. The maximum likelihood tree provides unprecedented resolution and high bootstrap support for most backbone nodes, defining for the first time a global phylogeny of fishes. The general structure of the tree is in agreement with expectations from previous morphological and molecular studies, but significant new clades arise. Most interestingly, the high degree of uncertainty among percomorphs is now resolved into nine well-supported supraordinal groups. The order Perciformes, considered by many a polyphyletic taxonomic waste basket, is defined for the first time as a monophyletic group in the global phylogeny. A new classification that reflects our phylogenetic hypothesis is proposed to facilitate communication about the newly found structure of the tree of life of fishes. Finally, the molecular phylogeny is calibrated using 60 fossil constraints to produce a comprehensive time tree. The new time-calibrated phylogeny will provide the basis for and stimulate new comparative studies to better understand the evolution of the amazing

  9. Exploiting machine learning algorithms for tree species classification in a semiarid woodland using RapidEye image

    NASA Astrophysics Data System (ADS)

    Adelabu, Samuel; Mutanga, Onisimo; Adam, Elhadi; Cho, Moses Azong

    2013-01-01

    Classification of different tree species in semiarid areas can be challenging as a result of the change in leaf structure and orientation due to soil moisture constraints. Tree species mapping is, however, a key parameter for forest management in semiarid environments. In this study, we examined the suitability of 5-band RapidEye satellite data for the classification of five tree species in mopane woodland of Botswana using machine learning algorithms with limited training samples. We performed classification using random forest (RF) and support vector machines (SVM) based on EnMap box. The overall accuracies for classifying the five tree species were 88.75% and 85% for SVM and RF, respectively. We also demonstrated that the new red-edge band in the RapidEye sensor has the potential for classifying tree species in semiarid environments when integrated with other standard bands. Similarly, we observed that where there are limited training samples, SVM is preferred over RF. Finally, we demonstrated that the two accuracy measures of quantity and allocation disagreement are simpler and more helpful for the vast majority of remote sensing classification processes than the kappa coefficient. Overall, high species classification accuracy can be achieved using strategically located RapidEye bands integrated with advanced processing algorithms.
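    The quantity and allocation disagreement measures mentioned above can be sketched as follows. This is an illustrative computation on a hypothetical confusion matrix (rows taken as mapped classes, columns as reference classes), assuming the sample proportions stand in for population proportions; it does not reproduce the study's data or software.

```python
def quantity_and_allocation(matrix):
    """Return (quantity disagreement, allocation disagreement) as proportions."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    p = [[count / total for count in row] for row in matrix]
    row_sums = [sum(p[g]) for g in range(n)]                        # mapped share per class
    col_sums = [sum(p[i][g] for i in range(n)) for g in range(n)]   # reference share per class
    # Quantity: mismatch in how much of each class is mapped vs. present.
    quantity = sum(abs(col_sums[g] - row_sums[g]) for g in range(n)) / 2
    # Allocation: errors that could be fixed by swapping locations.
    allocation = sum(
        2 * min(row_sums[g] - p[g][g], col_sums[g] - p[g][g]) for g in range(n)
    ) / 2
    return quantity, allocation


# Hypothetical 3-class confusion matrix (illustration only).
toy_matrix = [
    [45, 3, 2],
    [4, 40, 6],
    [1, 5, 44],
]
quantity, allocation = quantity_and_allocation(toy_matrix)
print(f"quantity = {quantity:.4f}, allocation = {allocation:.4f}, "
      f"total disagreement = {quantity + allocation:.4f}")
```

    The two components add up to the total disagreement, that is, one minus the overall accuracy, so they split the error into "wrong amount" and "wrong place" parts.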

  10. Transition to college: A classification and regression tree (CART) analysis of natural reduction of binge drinking.

    PubMed

    Vik, Peter W; Cellucci, Tony; Hedt, Jill; Jorgensen, Melinda

    2006-01-01

    Approximately one in five teens who drank heavily in high school reduces or discontinues consumption while in college. Multiple paths might lead to the common outcome of natural reduction in heavy drinking. Statistical modeling of this complex process of natural reduction is a challenge with standard linear statistics. The purpose of this paper is to use a new statistical procedure, Classification and Regression Tree (CART), to model the equifinality of reduction in drinking by college students who drank heavily as adolescents. An appealing aspect of CART is that the resulting tree model can easily be interpreted and applied by those who work with adolescents during the important transition from high school to college. Of 201 college students who first binged on alcohol while in high school, 71 (35.3%) denied heavy or binge drinking within the previous three months (Natural Reducers). The final model accurately classified 84.6% of the students as either continued heavy drinkers or natural reducers. Sensitivity was modest (accurate identification of 67.6% of the reducers); however, specificity was strong (correct classification of 93.8% of the continued heavy drinkers). The model revealed four pathways to natural reduction in drinking. Predominant in each path was the influence of social factors that maintain continued drinking (e.g., social facilitation outcome expectancies, perception of friends' drinking) or facilitate natural reduction (e.g., regular church attendance). The results support the application of CART to model health behaviors across the transition from adolescence to young adulthood.
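    The reported accuracy, sensitivity and specificity can be tied together with a short worked calculation. The cell counts below (48 correctly identified reducers, 122 correctly identified continued drinkers) are an assumption: they are back-calculated from the reported percentages and are not stated in the abstract.

```python
total_students = 201
natural_reducers = 71                                   # reported in the abstract
continued_drinkers = total_students - natural_reducers  # 130

# Assumed counts, reconstructed from the reported percentages.
reducers_correct = 48
continued_correct = 122

sensitivity = reducers_correct / natural_reducers
specificity = continued_correct / continued_drinkers
accuracy = (reducers_correct + continued_correct) / total_students

print(f"sensitivity = {sensitivity:.1%}")
print(f"specificity = {specificity:.1%}")
print(f"accuracy    = {accuracy:.1%}")
```

    Under these assumed counts the three rates round to the published 67.6%, 93.8% and 84.6%, which shows how overall accuracy is a prevalence-weighted mix of sensitivity and specificity.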

  11. Application of classification-tree methods to identify nitrate sources in ground water

    USGS Publications Warehouse

    Spruill, T.B.; Showers, W.J.; Howe, S.S.

    2002-01-01

    A study was conducted to determine if nitrate sources in ground water (fertilizer on crops, fertilizer on golf courses, irrigation spray from hog (Sus scrofa) wastes, and leachate from poultry litter and septic systems) could be classified with 80% or greater success. Two statistical classification-tree models were devised from 48 water samples containing nitrate from five source categories. Model 1 was constructed by evaluating 32 variables and selecting four primary predictor variables (δ15N, nitrate to ammonia ratio, sodium to potassium ratio, and zinc) to identify nitrate sources. A δ15N value of nitrate plus potassium 18.2 indicated inorganic or soil organic N. A nitrate to ammonia ratio 575 indicated nitrate from golf courses. A sodium to potassium ratio 3.2 indicated spray or poultry wastes. A value for zinc 2.8 indicated poultry wastes. Model 2 was devised by using all variables except δ15N. This model also included four variables (sodium plus potassium, nitrate to ammonia ratio, calcium to magnesium ratio, and sodium to potassium ratio) to distinguish categories. Both models were able to distinguish all five source categories with better than 80% overall success and with 71 to 100% success in individual categories using the learning samples. Seventeen water samples that were not used in model development were tested using Model 2 for three categories, and all were correctly classified. Classification-tree models show great potential in identifying sources of contamination and variables important in the source-identification process.

  12. [Classification of dengue hemorrhagic fever using decision trees in the early phase of the disease].

    PubMed

    Vega Riverón, Beatriz; Sánchez Valdés, C Lizet; Cortiñas Abrahantes, C José; Castro Peraza, Osvaldo; González Rubio, C Daniel; Castro Peraza, Marta

    2012-01-01

    Dengue is a viral disease with endemic behavior. At the beginning of the illness it is not possible to know which patients will have an unfavorable evolution and develop a severe form of dengue. However, some warning symptoms and signs may be present. The objective was to apply decision tree techniques to the exploration of signs of severity in the early phase of the illness. The study sample was made up of 230 patients admitted with dengue to the "Pedro Kouri" Institute of Tropical Medicine in 2001. The variables considered for the classification were the signs, symptoms and laboratory exams on the third day of evolution of the illness. The classification and regression tree (CART) algorithm using the Gini index was applied. Different loss matrices were considered to improve the sensitivity. The CART algorithm corresponding to the best loss matrix had a sensitivity of 98.68% and a global error of 0.36. Without considering loss, its sensitivity reached 74% with an error of 0.25. In both cases, the most important variables were platelets and hemoglobin. The study yielded decision rules with high sensitivity and negative predictive value that are useful in clinical practice. The laboratory variables were more informative than the clinical ones for discriminating clinical forms of dengue.
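    The roles of the Gini index and of a loss matrix can be illustrated with a small sketch. It shows the Gini impurity used to score a split and how unequal misclassification costs can change the label assigned to a leaf; the class names, counts and costs are hypothetical, since the abstract does not give the loss matrices that were tried.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_gain(parent, left, right):
    """Decrease in Gini impurity obtained by splitting parent into left/right."""
    n = len(parent)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(parent) - weighted


def leaf_label(labels, loss):
    """Pick the prediction with the lowest total loss; loss[true][predicted]."""
    counts = Counter(labels)
    classes = sorted(loss)
    return min(classes, key=lambda pred: sum(counts[t] * loss[t][pred] for t in classes))


# Hypothetical node: 3 severe and 7 mild cases.
node = ["severe"] * 3 + ["mild"] * 7
gain = split_gain(node, ["severe"] * 3 + ["mild"], ["mild"] * 6)

equal_loss = {"severe": {"severe": 0, "mild": 1}, "mild": {"severe": 1, "mild": 0}}
# Assumed cost: missing a severe case is five times worse than a false alarm.
costly_miss = {"severe": {"severe": 0, "mild": 5}, "mild": {"severe": 1, "mild": 0}}

print(gini(node), gain)
print(leaf_label(node, equal_loss), leaf_label(node, costly_miss))
```

    With equal costs the node is labelled with its majority class, whereas the heavier penalty for missed severe cases flips the label; this is the general mechanism by which a loss matrix trades a higher global error for higher sensitivity.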

  13. Collaborative evaluation and management of students' health-related physical fitness: applications of cluster analysis and the classification tree.

    PubMed

    Chen, Jou-An; Shih, Chi-Chuan; Lin, Pay-Fan; Chen, Jin-Jong; Lin, Kuan-Chia

    2012-01-01

    Health-related physical fitness has decreased with age; this is of immense concern to adolescents. School-based health intervention programs can be classified as either population-wide or high-risk approaches. Although the population-wide and risk-based approaches adopt different healthcare angles, they all need to focus resources on risk evaluation. In this paper, we describe an exploratory application of cluster analysis and the tree model to collaborative evaluation of students' health-related physical fitness from a high school sample in Taiwan (n=742). Cluster analysis shows that physical fitness can be divided into relatively good, moderate and poor subgroups. There are significant differences in biochemical measurements among these three groups. For the tree model, we used 2004 school-year students as an experimental group and 2005 school-year students as a validation group. The results indicate that if sit-and-reach is shorter than 33 cm, BMI is >25.46 kg/m2, and 1600 m run/walk is >534 s, the predicted probability for the number of metabolic risk factors ≥2 is 100% and the population is 41; both results are the highest. From the risk-based healthcare viewpoint, the cluster analysis can sort out students' physical fitness data in a short time and then narrow down the scope to recognize the subgroups. A classification tree model specifically shows the discrimination paths between the measurements of physical fitness for metabolic risk and would be helpful for self-management or proper healthcare education targeting different groups. Applying both methods to specific adolescents' health issues could provide different angles in planning health promotion projects.

  14. Generation of 2D Land Cover Maps for Urban Areas Using Decision Tree Classification

    NASA Astrophysics Data System (ADS)

    Höhle, J.

    2014-09-01

    A 2D land cover map can automatically and efficiently be generated from high-resolution multispectral aerial images. First, a digital surface model is produced and each cell of the elevation model is then supplemented with attributes. A decision tree classification is applied to extract map objects like buildings, roads, grassland, trees, hedges, and walls from such an "intelligent" point cloud. The decision tree is derived from training areas whose borders are digitized on top of a false-colour orthoimage. The produced 2D land cover map with six classes is then refined by using image analysis techniques. The proposed methodology is described step by step. The classification, assessment, and refinement are carried out by the open source software "R"; the generation of the dense and accurate digital surface model by the "Match-T DSM" program of the Trimble Company. A practical example of a 2D land cover map generation is carried out. Images of a multispectral medium-format aerial camera covering an urban area in Switzerland are used. The assessment of the produced land cover map is based on class-wise stratified sampling where reference values of samples are determined by means of stereo-observations of false-colour stereopairs. The stratified statistical assessment of the produced land cover map with six classes and based on 91 points per class reveals a high thematic accuracy for classes "building" (99 %, 95 % CI: 95 %-100 %) and "road and parking lot" (90 %, 95 % CI: 83 %-95 %). Some other accuracy measures (overall accuracy, kappa value) and their 95 % confidence intervals are derived as well. The proposed methodology has a high potential for automation and fast processing and may be applied to other scenes and sensors.
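    Class-wise accuracies with 95 % confidence intervals, as reported above for 91 points per class, can be approximated with a binomial interval. The sketch below uses the Wilson score interval as one common choice; the abstract does not say which interval the author used, and the counts of 90 and 82 correct points are assumptions back-calculated from the rounded 99 % and 90 % figures.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, n, confidence=0.95):
    """Wilson score interval for a binomial proportion."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


points_per_class = 91  # reported in the abstract
# Assumed numbers of correctly classified points (not stated in the abstract).
assumed_correct = {"building": 90, "road and parking lot": 82}

intervals = {}
for name, correct in assumed_correct.items():
    low, high = wilson_interval(correct, points_per_class)
    intervals[name] = (low, high)
    print(f"{name}: {correct / points_per_class:.1%} "
          f"(95% CI {low:.1%} to {high:.1%})")
```

    The intervals obtained this way are close to, but need not coincide with, the published ones, since a different interval method or different underlying counts may have been used.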

  15. Decision tree structure based classification of EEG signals recorded during two dimensional cursor movement imagery.

    PubMed

    Aydemir, Onder; Kayikcioglu, Temel

    2014-05-30

    Input signals of an EEG based brain computer interface (BCI) system are naturally non-stationary, have poor signal to noise ratio, depend on physical or mental tasks and are contaminated with various artifacts such as external electromagnetic waves, electromyogram and electrooculogram. All these disadvantages have motivated researchers to substantially improve speed and accuracy of all components of the communication system between brain and a BCI output device. In this study, a fast and accurate decision tree structure (DTS) based classification method was proposed for classifying EEG data recorded during up/down/right/left computer cursor movement imagery. The data sets were acquired from three healthy human subjects aged between 24 and 29 years in two sessions on different days. The proposed decision tree structure based method was successfully applied to the present data sets and achieved 55.92%, 57.90% and 82.24% classification accuracy on the test data of the three subjects. The results indicated that the proposed method provided 12.25% improvement over the best results of the most closely related studies although the EEG signals were collected in two different sessions about 1 week apart. The proposed method required only a training set of the subject and automatically generated a specific DTS for each new subject by determining the most appropriate feature set and classifier for each node. Additionally, with further developments of feature extraction and/or classification algorithms, any existing node can be easily replaced with a new one without breaking the whole DTS. This attribute makes the proposed method flexible. Copyright © 2014 Elsevier B.V. All rights reserved.

  16. Classification of oxide glasses: A polarizability approach

    SciTech Connect

    Dimitrov, Vesselin; Komatsu, Takayuki . E-mail: komatsu@chem.nagaokaut.ac.jp

    2005-03-15

    A classification of binary oxide glasses has been proposed taking into account the values obtained on their refractive index-based oxide ion polarizability $\alpha_{O^{2-}}(n_0)$, optical basicity $\lambda(n_0)$, metallization criterion $M(n_0)$, interaction parameter $A(n_0)$, and ion's effective charges as well as O1s and metal binding energies determined by XPS. Four groups of oxide glasses have been established: glasses formed by two glass-forming acidic oxides; glasses formed by glass-forming acidic oxide and modifier's basic oxide; glasses formed by glass-forming acidic and conditional glass-forming basic oxide; glasses formed by two basic oxides. The role of electronic ion polarizability in chemical bonding of oxide glasses has been also estimated. Good agreement has been found with the previous results concerning classification of simple oxides. The results obtained probably provide good basis for prediction of type of bonding in oxide glasses on the basis of refractive index as well as for prediction of new nonlinear optical materials.

  17. A novel underwater dam crack detection and classification approach based on sonar images.

    PubMed

    Shi, Pengfei; Fan, Xinnan; Ni, Jianjun; Khan, Zubair; Li, Min

    2017-01-01

    Underwater dam crack detection and classification based on sonar images is a challenging task because underwater environments are complex and because cracks are quite random and diverse in nature. Furthermore, obtainable sonar images are of low resolution. To address these problems, a novel underwater dam crack detection and classification approach based on sonar imagery is proposed. First, the sonar images are divided into image blocks. Second, a clustering analysis of a 3-D feature space is used to obtain the crack fragments. Third, the crack fragments are connected using an improved tensor voting method. Fourth, a minimum spanning tree is used to obtain the crack curve. Finally, an improved evidence theory combined with fuzzy rule reasoning is proposed to classify the cracks. Experimental results show that the proposed approach is able to detect underwater dam cracks and classify them accurately and effectively under complex underwater environments.

  18. A novel underwater dam crack detection and classification approach based on sonar images

    PubMed Central

    Shi, Pengfei; Fan, Xinnan; Ni, Jianjun; Khan, Zubair; Li, Min

    2017-01-01

    Underwater dam crack detection and classification based on sonar images is a challenging task because underwater environments are complex and because cracks are quite random and diverse in nature. Furthermore, obtainable sonar images are of low resolution. To address these problems, a novel underwater dam crack detection and classification approach based on sonar imagery is proposed. First, the sonar images are divided into image blocks. Second, a clustering analysis of a 3-D feature space is used to obtain the crack fragments. Third, the crack fragments are connected using an improved tensor voting method. Fourth, a minimum spanning tree is used to obtain the crack curve. Finally, an improved evidence theory combined with fuzzy rule reasoning is proposed to classify the cracks. Experimental results show that the proposed approach is able to detect underwater dam cracks and classify them accurately and effectively under complex underwater environments. PMID:28640925

  19. The use of classification and regression trees to predict the likelihood of seasonal influenza.

    PubMed

    Afonso, Anna M; Ebell, Mark H; Gonzales, Ralph; Stein, John; Genton, Blaise; Senn, Nicolas

    2012-12-01

    Individual signs and symptoms are of limited value for the diagnosis of influenza. The objective was to develop a decision tree for the diagnosis of influenza based on a classification and regression tree (CART) analysis. Data from two previous similar cohort studies were assembled into a single dataset. The data were randomly divided into a development set (70%) and a validation set (30%). We used CART analysis to develop three models that maximize the number of patients who do not require diagnostic testing prior to treatment decisions. The validation set was used to evaluate overfitting of the model to the training set. Model 1 had seven terminal nodes based on temperature, the onset of symptoms and the presence of chills, cough and myalgia. Model 2 was a simpler tree with only two splits based on temperature and the presence of chills. Model 3 was developed with temperature as a dichotomous variable (≥38°C) and had only two splits based on the presence of fever and myalgia. The areas under the receiver operating characteristic curve (AUROCC) for the development and validation sets, respectively, were 0.82 and 0.80 for Model 1, 0.75 and 0.76 for Model 2 and 0.76 and 0.77 for Model 3. Model 2 classified 67% of patients in the validation group into a high- or low-risk group compared with only 38% for Model 1 and 54% for Model 3. A simple decision tree (Model 2) classified two-thirds of patients as low or high risk and had an AUROCC of 0.76. After further validation in an independent population, this CART model could support clinical decision making regarding influenza, with low-risk patients requiring no further evaluation for influenza and high-risk patients being candidates for empiric symptomatic or drug therapy.
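    The evaluation design described above (a random 70%/30% split and the area under the ROC curve) can be sketched in a few lines. The helper names, the seed and the toy scores are hypothetical; the area is computed with the pairwise-ranking formulation, in which a positive case scoring above a negative case counts as one and a tie counts as one half.

```python
import random


def split_70_30(records, seed=0):
    """Randomly split records into a development set (70%) and a validation set (30%)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def auroc(scores, labels):
    """Area under the ROC curve from predicted scores and 0/1 labels."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical patient identifiers and predicted risks (illustration only).
development, validation = split_70_30(range(10))
toy_scores = [0.9, 0.8, 0.8, 0.3, 0.2]
toy_labels = [1, 1, 0, 0, 1]
toy_auc = auroc(toy_scores, toy_labels)
print(len(development), len(validation), round(toy_auc, 3))
```

    Comparing the area obtained on the development set with the one obtained on the held-out validation set is what allows overfitting to be judged, as in the paired values reported for each model.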

  20. Prediction of radiation levels in residences: A methodological comparison of CART (Classification and Regression Tree Analysis) and conventional regression

    SciTech Connect

    Janssen, I.; Stebbings, J.H.

    1990-01-01

    In environmental epidemiology, trace and toxic substance concentrations frequently have very highly skewed distributions ranging over one or more orders of magnitude, and prediction by conventional regression is often poor. Classification and Regression Tree Analysis (CART) is an alternative in such contexts. To compare the techniques, two Pennsylvania data sets and three independent variables are used: house radon progeny (RnD) and gamma levels as predicted by construction characteristics in 1330 houses; and approximately 200 house radon (Rn) measurements as predicted by topographic parameters. CART may identify structural variables of interest not identified by conventional regression, and vice versa, but in general the regression models are similar. CART has major advantages in dealing with other common characteristics of environmental data sets, such as missing values, continuous variables requiring transformations, and large sets of potential independent variables. CART is most useful in the identification and screening of independent variables, greatly reducing the need for cross-tabulations and nested breakdown analyses. There is no need to discard cases with missing values for the independent variables because surrogate variables are intrinsic to CART. The tree-structured approach is also independent of the scale on which the independent variables are measured, so that transformations are unnecessary. CART identifies important interactions as well as main effects. The major advantages of CART appear to be in exploring data. Once the important variables are identified, conventional regressions seem to lead to results similar but more interpretable by most audiences. 12 refs., 8 figs., 10 tabs.

  1. The creation of a digital soil map for Cyprus using decision-tree classification techniques

    NASA Astrophysics Data System (ADS)

    Camera, Corrado; Zomeni, Zomenia; Bruggeman, Adriana; Noller, Joy; Zissimos, Andreas

    2014-05-01

    Considering the increasing threats soils are experiencing, especially in semi-arid, Mediterranean environments like Cyprus (erosion, contamination, sealing and salinisation), producing a high resolution, reliable soil map is essential for further soil conservation studies. This study aims to create a 1:50,000 soil map covering the area under the direct control of the Republic of Cyprus (5,760 km2). The study consists of two major steps. The first is the creation of a raster database of predictive variables selected according to the scorpan formula (McBratney et al., 2003). Of particular interest is the possibility of using, as soil properties, data coming from three older island-wide soil maps and the recently published geochemical atlas of Cyprus (Cohen et al., 2011). Ten highly characterizing elements were selected and used as predictors in the present study. For the other factors the usual variables were used: temperature and aridity index for climate; total loss on ignition, vegetation and forestry type maps for organic matter; the DEM and related relief derivatives (slope, aspect, curvature, landscape units); bedrock, surficial geology and geomorphology (Noller, 2009) for parent material and age; and a sub-watershed map to better bound location related to parent material sources. In the second step, the digital soil map is created using the Random Forests package in R. Random Forests is a decision tree classification technique where many trees, instead of a single one, are developed and compared to increase the stability and the reliability of the prediction. The model is trained and verified on areas where a published 1:25,000 soil map obtained from field work is available and then it is applied for predictive mapping to the other areas. Preliminary results obtained in a small area in the plain around the city of Lefkosia, where eight different soil classes are present, show very good capacities of the method. The Random Forest approach leads to reproduce soil

  2. Morphological and molecular characteristics do not confirm popular classification of the Brazil nut tree in Acre, Brazil.

    PubMed

    Sujii, P S; Fernandes, E T M B; Azevedo, V C R; Ciampi, A Y; Martins, K; de O Wadt, L H

    2013-09-27

    In the State of Acre, the Brazil nut tree, Bertholletia excelsa (Lecythidaceae), is classified by the local population into two types according to morphological characteristics, including color and quality of wood, shape of the trunk and crown, and fruit production. We examined the reliability of this classification by comparing morphological and molecular data of four populations of Brazil nut trees from Vale do Rio Acre in the Brazilian Amazon. For the morphological analysis, we evaluated qualitative and quantitative information of the trees, fruits, and seeds. The molecular analysis was performed using RAPD and ISSR markers, with cluster analysis. Significant differences were found between the two types of Brazil nut trees for the characters diameter at breast height, fruit yield, fruit size, and number of seeds per fruit. Despite the significant correlation between the morphological characteristics and the popular classification, we observed all possible combinations of morphological characteristics in both types of Brazil nut trees. In some individuals, the classification did not correspond to any of the characteristics. The results obtained with molecular markers showed that the two locally classified types of Brazil nut trees did not differ genetically, indicating that there is no consistent separation between them.

  3. Tree-Level Hydrodynamic Approach for Improved Stomatal Conductance Parameterization

    NASA Astrophysics Data System (ADS)

    Mirfenderesgi, G.; Bohrer, G.; Matheny, A. M.; Ivanov, V. Y.

    2014-12-01

    Land-surface models do not mechanistically resolve hydrodynamic processes within the tree. The Finite-Elements Tree-Crown Hydrodynamics model version 2 (FETCH2) is based on the previous FETCH model approach, but with finite difference numerics and a simplified single-beam conduit system. FETCH2 simulates water flow through the tree as a simplified system of porous media conduits. It explicitly resolves spatiotemporal hydraulic stresses throughout the tree's vertical extent that cannot be easily represented using other stomatal-conductance models. Empirical equations relate water potential at the stem to stomatal conductance at leaves connected to the stem (through unresolved branches) at that height. While highly simplified, this approach brings some realism to the simulation of stomatal conductance because the stomata can respond to stem water potential, rather than an assumed direct relationship with soil moisture, as is currently the case in almost all models. By enabling mechanistic simulation of hydrological traits, such as xylem conductivity, conductive area per DBH, vertical distribution of leaf area and maximal and minimal water content in the xylem, and their effect on the dynamics of water flow in the tree system, the FETCH2 modeling system enhanced our understanding of the role of hydraulic limitations on an experimental forest plot short-term water stresses that lead to tradeoffs between water and light availability for transpiring leaves in forest ecosystems. FETCH2 is particularly suitable to resolve the effects of structural differences between tree and species and size groups, and the consequences of differences in hydraulic strategies of different species. We leverage a large dataset of sap flow from 60 trees of 4 species at our experimental plot at the University of Michigan Biological Station. Comparison of the sap flow and transpiration patterns in this site and an undisturbed control site shows significant difference in hydraulic strategies

  4. A novel dendrochronological approach reveals drivers of carbon sequestration in tree species of riparian forests across spatiotemporal scales.

    PubMed

    Rieger, Isaak; Kowarik, Ingo; Cherubini, Paolo; Cierjacks, Arne

    2017-01-01

    Aboveground carbon (C) sequestration in trees is important in global C dynamics, but reliable techniques for its modeling in highly productive and heterogeneous ecosystems are limited. We applied an extended dendrochronological approach to disentangle the functioning of drivers from the atmosphere (temperature, precipitation), the lithosphere (sedimentation rate), the hydrosphere (groundwater table, river water level fluctuation), the biosphere (tree characteristics), and the anthroposphere (dike construction). Carbon sequestration in aboveground biomass of riparian Quercus robur L. and Fraxinus excelsior L. was modeled (1) over time using boosted regression tree analysis (BRT) on cross-datable trees characterized by equal annual growth ring patterns and (2) across space using a subsequent classification and regression tree analysis (CART) on cross-datable and not cross-datable trees. While C sequestration of cross-datable Q. robur responded to precipitation and temperature, cross-datable F. excelsior also responded to a low Danube river water level. However, CART revealed that C sequestration over time is governed by tree height and parameters that vary over space (magnitude of fluctuation in the groundwater table, vertical distance to mean river water level, and longitudinal distance to upstream end of the study area). Thus, a uniform response to climatic drivers of aboveground C sequestration in Q. robur was only detectable in trees of an intermediate height class and in taller trees (>21.8m) on sites where the groundwater table fluctuated little (≤0.9m). The detection of climatic drivers and the river water level in F. excelsior depended on sites at lower altitudes above the mean river water level (≤2.7m) and along a less dynamic downstream section of the study area. Our approach indicates unexploited opportunities of understanding the interplay of different environmental drivers in aboveground C sequestration. Results may support species-specific and

  5. An Ensemble Rule Learning Approach for Automated Morphological Classification of Erythrocytes.

    PubMed

    Maity, Maitreya; Mungle, Tushar; Dhane, Dhiraj; Maiti, A K; Chakraborty, Chandan

    2017-04-01

    The analysis of pathophysiological change to erythrocytes is important for early diagnosis of anaemia. The manual assessment of pathology slides is time-consuming and complicated, given the various cell types to be identified. This paper proposes an ensemble rule-based decision-making approach for morphological classification of erythrocytes. Firstly, the digital microscopic blood smear images are pre-processed for removal of spurious regions followed by colour normalisation and thresholding. The erythrocytes are segmented from the background image using the watershed algorithm. The shape features are then extracted from the segmented image to detect shape abnormality present in microscopic blood smear images. The decision about the abnormality is taken using the proposed multiple rule-based expert systems. The deciding factor is majority ensemble voting for abnormally shaped erythrocytes. Here, shape-based features are considered for nine different types of abnormal erythrocytes including normal erythrocytes. Further, the adaptive boosting algorithm is used to generate multiple decision tree models where each model tree generates an individual rule set. The supervised classification method is followed to generate rules using a C4.5 decision tree. The proposed ensemble approach is precise in detecting eight types of abnormal erythrocytes with an overall accuracy of 97.81% and weighted sensitivity of 97.33%, weighted specificity of 99.7%, and weighted precision of 98%. This approach shows the robustness of the proposed strategy for classifying erythrocytes into abnormal and normal classes. The article also discusses its potential for incorporation into point-of-care technology solutions aimed at rapid clinical assistance.
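
    The majority ensemble voting step described in this abstract can be illustrated with a minimal sketch. The rule sets, the feature names (`circularity`, `elongation`) and the thresholds are hypothetical placeholders, not the rules derived in the paper; each function stands in for the rule set produced by one boosted decision tree.

```python
from collections import Counter

# Hypothetical rule sets: each maps a dict of shape features to a class label.
def rules_a(cell):
    return "normal" if cell["circularity"] > 0.85 else "abnormal"

def rules_b(cell):
    return "abnormal" if cell["elongation"] > 1.5 else "normal"

def rules_c(cell):
    ok = cell["circularity"] > 0.80 and cell["elongation"] < 1.6
    return "normal" if ok else "abnormal"

def majority_vote(cell, rule_sets):
    """Return the label predicted by most rule sets."""
    votes = Counter(rules(cell) for rules in rule_sets)
    return votes.most_common(1)[0][0]

RULE_SETS = [rules_a, rules_b, rules_c]
round_cell = {"circularity": 0.92, "elongation": 1.1}   # all three vote normal
oval_cell = {"circularity": 0.82, "elongation": 1.7}    # all three vote abnormal
mixed_cell = {"circularity": 0.83, "elongation": 1.2}   # two of three vote normal
```

    With an odd number of rule sets and two classes, a tie cannot occur; a multi-class version would need an explicit tie-breaking rule.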

  6. A right whale pootree: classification trees of faecal hormones identify reproductive states in North Atlantic right whales (Eubalaena glacialis).

    PubMed

    Corkeron, Peter; Rolland, Rosalind M; Hunt, Kathleen E; Kraus, Scott D

    2017-01-01

    Immunoassay of hormone metabolites extracted from faecal samples of free-ranging large whales can provide biologically relevant information on reproductive state and stress responses. North Atlantic right whales (Eubalaena glacialis Müller 1776) are an ideal model for testing the conservation value of faecal metabolites. Almost all North Atlantic right whales are individually identified, most of the population is sighted each year, and systematic survey effort extends back to 1986. North Atlantic right whales number <500 individuals and are subject to anthropogenic mortality, morbidity and other stressors, and scientific data to inform conservation planning are recognized as important. Here, we describe the use of classification trees as an alternative method of analysing multiple-hormone data sets, building on univariate models that have previously been used to describe hormone profiles of individual North Atlantic right whales of known reproductive state. Our tree correctly classified the age class, sex and reproductive state of 83% of 112 faecal samples from known individual whales. Pregnant females, lactating females and both mature and immature males were classified reliably using our model. Non-reproductive [i.e. 'resting' (not pregnant and not lactating) and immature] females proved the most unreliable to distinguish. There were three individual males that, given their age, would traditionally be considered immature but that our tree classed as mature males, possibly calling for a re-evaluation of their reproductive status. Our analysis reiterates the importance of considering the reproductive state of whales when assessing the relationship between cortisol concentrations and stress. Overall, these results confirm findings from previous univariate statistical analyses, but with a more robust multivariate approach that may prove useful for the multiple-analyte data sets that are increasingly used by conservation physiologists.

  7. A neuro-fuzzy approach in the classification of students' academic performance.

    PubMed

    Do, Quang Hung; Chen, Jeng-Fung

    2013-01-01

    Classifying the student academic performance with high accuracy facilitates admission decisions and enhances educational services at educational institutions. The purpose of this paper is to present a neuro-fuzzy approach for classifying students into different groups. The neuro-fuzzy classifier used previous exam results and other related factors as input variables and labeled students based on their expected academic performance. The results showed that the proposed approach achieved a high accuracy. The results were also compared with those obtained from other well-known classification approaches, including support vector machine, Naive Bayes, neural network, and decision tree approaches. The comparative analysis indicated that the neuro-fuzzy approach performed better than the others. It is expected that this work may be used to support student admission procedures and to strengthen the services of educational institutions.

  8. A Neuro-Fuzzy Approach in the Classification of Students' Academic Performance

    PubMed Central

    2013-01-01

    Classifying the student academic performance with high accuracy facilitates admission decisions and enhances educational services at educational institutions. The purpose of this paper is to present a neuro-fuzzy approach for classifying students into different groups. The neuro-fuzzy classifier used previous exam results and other related factors as input variables and labeled students based on their expected academic performance. The results showed that the proposed approach achieved a high accuracy. The results were also compared with those obtained from other well-known classification approaches, including support vector machine, Naive Bayes, neural network, and decision tree approaches. The comparative analysis indicated that the neuro-fuzzy approach performed better than the others. It is expected that this work may be used to support student admission procedures and to strengthen the services of educational institutions. PMID:24302928

  9. Classification and regression tree analysis of acute-on-chronic hepatitis B liver failure: Seeing the forest for the trees.

    PubMed

    Shi, K-Q; Zhou, Y-Y; Yan, H-D; Li, H; Wu, F-L; Xie, Y-Y; Braddock, M; Lin, X-Y; Zheng, M-H

    2017-02-01

    At present, there is no ideal model for predicting the short-term outcome of patients with acute-on-chronic hepatitis B liver failure (ACHBLF). This study aimed to establish and validate a prognostic model by using the classification and regression tree (CART) analysis. A total of 1047 patients from two separate medical centres with suspected ACHBLF were screened in the study and were assigned to a derivation cohort and a validation cohort, respectively. CART analysis was applied to predict the 3-month mortality of patients with ACHBLF. The accuracy of the CART model was tested using the area under the receiver operating characteristic curve, which was compared with the model for end-stage liver disease (MELD) score and a new logistic regression model. CART analysis identified four variables as prognostic factors of ACHBLF: total bilirubin, age, serum sodium and INR, and three distinct risk groups: low risk (4.2%), intermediate risk (30.2%-53.2%) and high risk (81.4%-96.9%). The new logistic regression model was constructed with four independent factors, including age, total bilirubin, serum sodium and prothrombin activity by multivariate logistic regression analysis. The performance of the CART model (0.896), similar to that of the logistic regression model (0.914, P=.382), exceeded that of the MELD score (0.667, P<.001). The results were confirmed in the validation cohort. We have developed and validated a novel CART model superior to MELD for predicting three-month mortality of patients with ACHBLF. Thus, the CART model could facilitate medical decision-making and provide clinicians with a validated practical bedside tool for ACHBLF risk stratification.
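
    The area under the receiver operating characteristic curve used here to compare models equals the probability that a randomly chosen patient who died receives a higher risk score than a randomly chosen survivor. The sketch below computes it by pair counting; the scores and outcomes are invented for illustration and are not data from the study.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied pairs count as half a correct ranking.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Invented risk scores (for example, leaf-level mortality rates from a tree)
# and 3-month outcomes (1 = died, 0 = survived).
risk = [0.04, 0.30, 0.53, 0.81, 0.97, 0.30]
died = [0, 0, 1, 1, 1, 1]
auc = roc_auc(risk, died)
```

    Because a tree assigns the same score to every case in a leaf, ties are common, which is why the half-credit rule for ties matters when scoring CART models this way.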

  10. Inguinal hernia recurrence: Classification and approach

    PubMed Central

    Campanelli, Giampiero; Pettinari, Diego; Cavalli, Marta; Avesani, Ettore Contessini

    2006-01-01

    The authors reviewed the records of 2,468 operations of groin hernia in 2,350 patients, including 277 recurrent hernias updated to January 2005. The data obtained - evaluating technique, results and complications - were used to propose a simple anatomo-clinical classification into three types which could be used to plan the surgical strategy. Type R1: first recurrence ‘high,’ oblique external, reducible hernia with small (<2 cm) defect in non-obese patients, after pure tissue or mesh repair. Type R2: first recurrence ‘low,’ direct, reducible hernia with small (<2 cm) defect in non-obese patients, after pure tissue or mesh repair. Type R3: all the other recurrences - including femoral recurrences; recurrent groin hernia with big defect (inguinal eventration); multirecurrent hernias; nonreducible, linked with a contralateral primitive or recurrent hernia; and situations compromised from aggravating factors (for example obesity) or anyway not easily included in R1 or R2, after pure tissue or mesh repair. PMID:21187986

  11. Genetic Algorithms and Classification Trees in Feature Discovery: Diabetes and the NHANES database

    SciTech Connect

    Heredia-Langner, Alejandro; Jarman, Kristin H.; Amidan, Brett G.; Pounds, Joel G.

    2013-09-01

    This paper presents a feature selection methodology that can be applied to datasets containing a mixture of continuous and categorical variables. Using a Genetic Algorithm (GA), this method explores a dataset and selects a small set of features relevant for the prediction of a binary (1/0) response. Binary classification trees and an objective function based on conditional probabilities are used to measure the fitness of a given subset of features. The method is applied to health data in order to find factors useful for the prediction of diabetes. Results show that our algorithm is capable of narrowing down the set of predictors to around 8 factors that can be validated using reputable medical and public health resources.

  12. Using Classification and Regression Trees (CART) and random forests to analyze attrition: Results from two simulations.

    PubMed

    Hayes, Timothy; Usami, Satoshi; Jacobucci, Ross; McArdle, John J

    2015-12-01

    In this article, we describe a recent development in the analysis of attrition: using classification and regression trees (CART) and random forest methods to generate inverse sampling weights. These flexible machine learning techniques have the potential to capture complex nonlinear, interactive selection models, yet to our knowledge, their performance in the missing data analysis context has never been evaluated. To assess the potential benefits of these methods, we compare their performance with commonly employed multiple imputation and complete case techniques in 2 simulations. These initial results suggest that weights computed from pruned CART analyses performed well in terms of both bias and efficiency when compared with other methods. We discuss the implications of these findings for applied researchers.
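
    One common way to turn a fitted tree into inverse sampling weights is to estimate the retention probability in each terminal node and weight every retained case by its reciprocal. The sketch below assumes the leaf assignments are already available from some fitted tree; the leaf labels and data are invented, and this is not necessarily the exact procedure evaluated in the article.

```python
from collections import defaultdict

def inverse_probability_weights(leaf_ids, stayed):
    """Weight each retained case by 1 / (retention rate of its tree leaf).

    leaf_ids: terminal-node id assigned to each participant by a fitted tree
    stayed:   1 if the participant remained in the study, 0 if they dropped out
    Returns a list holding a weight for retained cases and None for dropouts.
    """
    totals = defaultdict(int)
    retained = defaultdict(int)
    for leaf, s in zip(leaf_ids, stayed):
        totals[leaf] += 1
        retained[leaf] += s
    weights = []
    for leaf, s in zip(leaf_ids, stayed):
        if s == 1:
            weights.append(totals[leaf] / retained[leaf])
        else:
            weights.append(None)
    return weights

# Invented example: leaf A retains everyone, leaf B retains one case in four.
leaves = ["A", "A", "A", "A", "B", "B", "B", "B"]
stayed = [1, 1, 1, 1, 1, 0, 0, 0]
weights = inverse_probability_weights(leaves, stayed)
```

    The single retained case in leaf B receives a weight of 4, so the weighted completers again represent all eight original participants.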

  13. Induction of decision trees and Bayesian classification applied to diagnosis of sport injuries.

    PubMed

    Zelic, I; Kononenko, I; Lavrac, N; Vuga, V

    1997-12-01

    Machine learning techniques can be used to extract knowledge from data stored in medical databases. In our application, various machine learning algorithms were used to extract diagnostic knowledge which may be used to support the diagnosis of sport injuries. The applied methods include variants of the Assistant algorithm for top-down induction of decision trees, and variants of the Bayesian classifier. The available dataset was insufficient for reliable diagnosis of all sport injuries considered by the system. Consequently, expert-defined diagnostic rules were added and used as pre-classifiers or as generators of additional training instances for diagnoses for which only few training examples were available. Experimental results show that the classification accuracy and the explanation capability of the naive Bayesian classifier with the fuzzy discretization of numerical attributes were superior to other methods and estimated as the most appropriate for practical use.

  14. Identifying population groups with low palliative care program enrolment using classification and regression tree analysis.

    PubMed

    Gao, Jun; Johnston, Grace M; Lavergne, M Ruth; McIntyre, Paul

    2011-01-01

    Classification and regression tree (CART) analysis was used to identify subpopulations with lower palliative care program (PCP) enrolment rates. CART analysis uses recursive partitioning to group predictors. The PCP enrolment rate was 72 percent for the 6,892 adults who died of cancer from 2000 to 2005 in two counties in Nova Scotia, Canada. The lowest PCP enrolment rates were for nursing home residents over 82 years (27 percent), a group residing more than 43 kilometres from the PCP (31 percent), and another group living less than two weeks after their cancer diagnosis (37 percent). The highest rate (86 percent) was for the 2,118 persons who received palliative radiation. Findings from multiple logistic regression (MLR) were provided for comparison. CART findings identified low PCP enrolment subpopulations that were defined by interactions among demographic, social, medical, and health system predictors.
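
    Recursive partitioning repeatedly searches for the split that makes the resulting groups as homogeneous as possible, then applies the same search inside each group. The sketch below shows a single such search on one numeric predictor using Gini impurity; the distances and enrolment flags are invented and only loosely echo the kind of cut-point reported in the abstract.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_split(values, labels):
    """Return (threshold, weighted impurity) of the best 'value <= threshold' split."""
    n = len(values)
    best = (None, gini(labels))  # baseline: no split at all
    # Skip the maximum value, since splitting there leaves the right side empty.
    for t in sorted(set(values))[:-1]:
        left = [y for v, y in zip(values, labels) if v <= t]
        right = [y for v, y in zip(values, labels) if v > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (t, score)
    return best

# Invented data: distance to the program (km) and enrolment (1 = enrolled).
distance = [5, 12, 20, 38, 43, 51, 60, 75]
enrolled = [1, 1, 1, 1, 1, 0, 0, 0]
threshold, impurity = best_split(distance, enrolled)
```

    A full tree would call `best_split` for every predictor, keep the best one, and recurse into the two child groups until a stopping rule is met.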

  15. Classification tree models for predicting distributions of Michigan stream fish from landscape variables

    USGS Publications Warehouse

    Steen, P.J.; Zorn, T.G.; Seelbach, P.W.; Schaeffer, J.S.

    2008-01-01

    Traditionally, fish habitat requirements have been described from local-scale environmental variables. However, recent studies have shown that studying landscape-scale processes improves our understanding of what drives species assemblages and distribution patterns across the landscape. Our goal was to learn more about constraints on the distribution of Michigan stream fish by examining landscape-scale habitat variables. We used classification trees and landscape-scale habitat variables to create and validate presence-absence models and relative abundance models for Michigan stream fishes. We developed 93 presence-absence models that on average were 72% correct in making predictions for an independent data set, and we developed 46 relative abundance models that were 76% correct in making predictions for independent data. The models were used to create statewide predictive distribution and abundance maps that have the potential to be used for a variety of conservation and scientific purposes. © Copyright by the American Fisheries Society 2008.

  16. rpartOrdinal: An R Package for Deriving a Classification Tree for Predicting an Ordinal Response.

    PubMed

    Archer, Kellie J

    2010-04-01

    This paper describes an R package, rpartOrdinal, that implements alternative splitting functions for fitting a classification tree when interest lies in predicting an ordinal response. This includes the generalized Gini impurity function, which was introduced as a method for predicting an ordinal response by including costs of misclassification into the impurity function, as well as an alternative ordinal impurity function due to Piccarreta (2008) that does not require the assignment of misclassification costs. The ordered twoing splitting method, which is not defined as a decrease in node impurity, is also included in the package. Since, in the ordinal response setting, misclassifying observations to adjacent categories is a less egregious error than misclassifying observations to distant categories, this package also includes a function for estimating an ordinal measure of association, the gamma statistic.
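
    The generalized Gini impurity mentioned here weights each pair of classes in a node by a misclassification cost, i(t) = sum over k != l of C(k|l) p(k|t) p(l|t), and reduces to the ordinary Gini index when all costs are 1. The sketch below is written in Python purely for illustration and does not reproduce the rpartOrdinal API; the absolute-difference cost is an assumed choice for ordinal categories.

```python
from collections import Counter

def generalized_gini(labels, cost):
    """Sum over class pairs k != l of cost(k, l) * p(k) * p(l) within a node."""
    n = len(labels)
    props = {k: c / n for k, c in Counter(labels).items()}
    return sum(cost(k, l) * pk * pl
               for k, pk in props.items()
               for l, pl in props.items()
               if k != l)

def unit_cost(k, l):
    return 1.0          # every misclassification equally bad: ordinary Gini

def absolute_cost(k, l):
    return abs(k - l)   # assumed ordinal cost: distance between category scores

adjacent_mix = [1, 1, 2, 2]   # node mixing neighbouring categories
distant_mix = [1, 1, 3, 3]    # node mixing distant categories
```

    Under unit costs both nodes have impurity 0.5, whereas the absolute-difference cost scores the distant mix at 1.0 and the adjacent mix at 0.5, which reflects the idea that confusing neighbouring categories is the less serious error.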

  17. Using Classification and Regression Trees (CART) and Random Forests to Analyze Attrition: Results From Two Simulations

    PubMed Central

    Hayes, Timothy; Usami, Satoshi; Jacobucci, Ross; McArdle, John J.

    2016-01-01

    In this article, we describe a recent development in the analysis of attrition: using classification and regression trees (CART) and random forest methods to generate inverse sampling weights. These flexible machine learning techniques have the potential to capture complex nonlinear, interactive selection models, yet to our knowledge, their performance in the missing data analysis context has never been evaluated. To assess the potential benefits of these methods, we compare their performance with commonly employed multiple imputation and complete case techniques in 2 simulations. These initial results suggest that weights computed from pruned CART analyses performed well in terms of both bias and efficiency when compared with other methods. We discuss the implications of these findings for applied researchers. PMID:26389526

  18. A hybrid clustering and classification approach for predicting crash injury severity on rural roads.

    PubMed

    Hasheminejad, Seyed Hessam-Allah; Zahedi, Mohsen; Hasheminejad, Seyed Mohammad Hossein

    2017-07-10

    As a threat to transportation systems, traffic crashes have a wide range of social consequences for governments. Traffic crashes are increasing in developing countries, and Iran, as a developing country, is not immune from this risk. Several studies in the literature predict traffic crash severity based on artificial neural networks (ANNs), support vector machines and decision trees. This paper investigates the crash injury severity of rural roads by using a hybrid clustering and classification approach to compare the performance of classification algorithms before and after applying the clustering. In this paper, a novel rule-based genetic algorithm (GA) is proposed to predict crash injury severity, which is evaluated by performance criteria in comparison with classification algorithms like ANN. The results obtained from analysis of 13,673 crashes (5600 property damage, 778 fatal crashes, 4690 slight injuries and 2605 severe injuries) on rural roads in Tehran Province of Iran during 2011-2013 revealed that the proposed GA method outperforms other classification algorithms based on classification metrics like precision (86%), recall (88%) and accuracy (87%). Moreover, the proposed GA method has the highest level of interpretability, is easy to understand and provides feedback to analysts.
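
    The three metrics quoted in this abstract follow directly from confusion-matrix counts. The sketch below shows the binary case, with one severity class treated as positive against all others; the counts are invented and are not taken from the study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Precision, recall and accuracy from binary confusion-matrix counts."""
    precision = tp / (tp + fp)               # share of predicted positives that are right
    recall = tp / (tp + fn)                  # share of actual positives that are found
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, accuracy

# Invented counts for one severity class treated as "positive" against the rest.
precision, recall, accuracy = classification_metrics(tp=80, fp=20, fn=10, tn=90)
```

    For a multi-class problem such as the four severity levels here, these values are typically computed per class and then averaged; the abstract does not say which averaging scheme was used.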

  19. A hybrid sensing approach for pure and adulterated honey classification.

    PubMed

    Subari, Norazian; Mohamad Saleh, Junita; Md Shakaff, Ali Yeon; Zakaria, Ammar

    2012-10-17

    This paper presents a comparison between data from single modality and fusion methods to classify Tualang honey as pure or adulterated using Linear Discriminant Analysis (LDA) and Principal Component Analysis (PCA) statistical classification approaches. Ten different brands of certified pure Tualang honey were obtained throughout peninsular Malaysia and Sumatera, Indonesia. Various concentrations of two types of sugar solution (beet and cane sugar) were used in this investigation to create honey samples of 20%, 40%, 60% and 80% adulteration concentrations. Honey data extracted from an electronic nose (e-nose) and Fourier Transform Infrared Spectroscopy (FTIR) were gathered, analyzed and compared based on fusion methods. Visual observation of classification plots revealed that the PCA approach was able to distinguish pure and adulterated honey samples better than the LDA technique. Overall, the validated classification results based on FTIR data (88.0%) gave higher classification accuracy than e-nose data (76.5%) using the LDA technique. Honey classification based on normalized low-level and intermediate-level FTIR and e-nose fusion data scored classification accuracies of 92.2% and 88.7%, respectively, using the Stepwise LDA method. The results suggested that pure and adulterated honey samples were better classified using FTIR and e-nose fusion data than single modality data.

  20. A Hybrid Sensing Approach for Pure and Adulterated Honey Classification

    PubMed Central

    Subari, Norazian; Saleh, Junita Mohamad; Shakaff, Ali Yeon Md; Zakaria, Ammar

    2012-01-01

    This paper presents a comparison between data from single modality and fusion methods to classify Tualang honey as pure or adulterated using Linear Discriminant Analysis (LDA) and Principal Component Analysis (PCA) statistical classification approaches. Ten different brands of certified pure Tualang honey were obtained throughout peninsular Malaysia and Sumatera, Indonesia. Various concentrations of two types of sugar solution (beet and cane sugar) were used in this investigation to create honey samples of 20%, 40%, 60% and 80% adulteration concentrations. Honey data extracted from an electronic nose (e-nose) and Fourier Transform Infrared Spectroscopy (FTIR) were gathered, analyzed and compared based on fusion methods. Visual observation of classification plots revealed that the PCA approach was able to distinguish pure and adulterated honey samples better than the LDA technique. Overall, the validated classification results based on FTIR data (88.0%) gave higher classification accuracy than e-nose data (76.5%) using the LDA technique. Honey classification based on normalized low-level and intermediate-level FTIR and e-nose fusion data scored classification accuracies of 92.2% and 88.7%, respectively, using the Stepwise LDA method. The results suggested that pure and adulterated honey samples were better classified using FTIR and e-nose fusion data than single modality data. PMID:23202033

  1. New approaches to psychiatric diagnostic classification.

    PubMed

    Owen, Michael J

    2014-11-05

    Recent findings in psychiatric genetics have crystallized concerns that diagnostic categories used in the clinic map poorly onto the underlying biology. If we are to harness developments in genetics and neuroscience to understand disease mechanisms and develop new treatments, we need new approaches to patient stratification that recognize the complexity and continuous nature of psychiatric traits and that are not constrained by current categorical approaches. Recognizing this, the National Institute of Mental Health (NIMH) has developed a novel framework to encourage more research of this kind. The implications of these recent findings and funding policy developments for neuroscience research are considerable. Copyright © 2014 Elsevier Inc. All rights reserved.

  2. Classification of Simple Oxides: A Polarizability Approach

    NASA Astrophysics Data System (ADS)

    Dimitrov, Vesselin; Komatsu, Takayuki

    2002-01-01

    A simple oxide classification has been proposed on the basis of correlation between electronic polarizabilities of the ions and their binding energies determined by XPS. Three groups of oxides have been considered taking into account the values obtained on refractive-index- or energy-gap-based oxide ion polarizability, cation polarizability, optical basicity, O 1s binding energy, metal (or nonmetal) binding energy, and Yamashita-Kurosawa's interaction parameter of the oxides. The group of semicovalent predominantly acidic oxides includes BeO, B2O3, P2O5, SiO2, Al2O3, GeO2, and Ga2O3 with low oxide ion polarizability, high O 1s binding energy, low cation polarizability, high metal (or nonmetal) outermost binding energy, comparatively low optical basicity, and strong interionic interaction, leading to the formation of strong covalent bonds. Some main group oxides so-called ionic or basic such as CaO, In2O3, SnO2, and TeO2 and most transition metal oxides show relatively high oxide ion polarizability, O 1s binding energy in a very narrow medium range, high cation polarizability, and low metal (or nonmetal) binding energy. Their optical basicity varies in a narrow range and it is close to that of CaO. The group of very ionic or very basic oxides includes CdO, SrO, and BaO as well as PbO, Sb2O3, and Bi2O3, which possess very high oxide ion polarizability, low O 1s binding energy, very high cation polarizability, and very low metal (or nonmetal) binding energy. Their optical basicity is higher than that of CaO and the interionic interaction is very weak, giving rise to the formation of very ionic chemical bonds.

  3. Classification and regression tree model for predicting tracheostomy in patients with traumatic cervical spinal cord injury.

    PubMed

    Lee, Dae-Sang; Park, Chi-Min; Carriere, Keumhee Chough; Ahn, Joonghyun

    2017-04-26

    In patients with cervical spinal cord injury (CSCI), respiratory compromise and the need for tracheostomy are common. The purpose of this study was to identify common risk factors for tracheostomy following traumatic CSCI and develop a decision tree for tracheostomy in traumatic CSCI patients without pulmonary function testing. Data of 105 trauma patients with CSCI admitted to our institution from April 2008 to February 2014 were retrospectively analyzed. Patients who underwent tracheostomy were compared to those who did not. Stepwise logistic regression analysis and a classification and regression tree model were used to predict the risk factors for tracheostomy. Tracheostomy was performed in 20% of patients with traumatic CSCI on median hospital day 4. Patients who underwent tracheostomy tended to be more severely injured (higher Injury Severity Score, lower Glasgow Coma Score, and lower systolic blood pressure on admission), required more frequent intubation in the emergency room (ER), and had a higher rate of complete CSCI than those who did not. On multiple logistic regression analysis, age ≥ 55 years (OR: 6.86, p = 0.037), car accident (OR: 5.8, p = 0.049), injury above C5 (OR: 28.95, p = 0.009), ISS ≥ 16 (OR: 12.6, p = 0.004), intubation in the ER (OR: 23.87, p = 0.001), and complete CSCI (OR: 62.14, p < 0.001) were significant predictors for the need of tracheostomy after CSCI. These factors can predict whether a new patient needs future tracheostomy with 91.4% accuracy. Age ≥ 55 years, injury above C5, ISS ≥ 16, car accident, intubation in the ER, and complete CSCI were independently associated with tracheostomy after CSCI. CART analysis may provide an intuitive decision tree for tracheostomy.

  4. Identification of Sexually Abused Female Adolescents at Risk for Suicidal Ideations: A Classification and Regression Tree Analysis

    ERIC Educational Resources Information Center

    Brabant, Marie-Eve; Hebert, Martine; Chagnon, Francois

    2013-01-01

    This study explored the clinical profiles of 77 female teenager survivors of sexual abuse and examined the association of abuse-related and personal variables with suicidal ideations. Analyses revealed that 64% of participants experienced suicidal ideations. Findings from classification and regression tree analysis indicated that depression,…

  5. Identification of Sexually Abused Female Adolescents at Risk for Suicidal Ideations: A Classification and Regression Tree Analysis

    ERIC Educational Resources Information Center

    Brabant, Marie-Eve; Hebert, Martine; Chagnon, Francois

    2013-01-01

    This study explored the clinical profiles of 77 female teenager survivors of sexual abuse and examined the association of abuse-related and personal variables with suicidal ideations. Analyses revealed that 64% of participants experienced suicidal ideations. Findings from classification and regression tree analysis indicated that depression,…

  6. Hong Kong CIE sky classification and prediction by accessible weather data and trees-based methods

    NASA Astrophysics Data System (ADS)

    Lou, S.; Li, D. H. W.; Lam, J. C.

    2016-08-01

    Solar irradiance and daylight illuminance are important for solar energy and daylighting designs. Recently, the International Commission on Illumination (CIE) adopted a range of sky conditions to represent the possible sky distributions which are crucial to the estimation of solar irradiance and daylight illuminance on vertical building facades. The important issue would be whether the sky conditions are correctly identified by the accessible variables. Previously, a number of climatic parameters including sky luminance distributions, vertical solar irradiance and sky illuminance were proposed for the CIE sky classification. However, such data are not always available. This paper proposes an approach based on readily accessible data that have been systematically recorded by the local meteorological station for many years. The performance was evaluated using measured vertical solar irradiance and illuminance. The results show that the proposed approach is reliable for sky classification.

  7. Ecological land classification: A survey approach

    NASA Astrophysics Data System (ADS)

    Rowe, J. Stan; Sheard, John W.

    1981-09-01

    A landscape approach to ecological land mapping, as illustrated in this article, proceeds by pattern recognition based on ecological theory. The unit areas delineated are hypotheses that arise from a knowledge of what is ecologically important in the land. Units formed by the mapper are likely to be inefficient or irrelevant for ecological purposes unless he possesses a sound rationale as to the interactions and controlling influences of the structural components of ecosystems. Here is the central problem with what have been called “objective” multivariate approaches to mapping based on grid units and the sometimes arbitrary attributes thereof; they tend to conceal the importance of ecological theory and the necessity for theory-based supervision of pattern recognition. Multivariate techniques are best used iteratively to verify and refine map units initially recognized and delineated by theoretical considerations. These ideas are illustrated by an example of a reconnaissance survey in the Northwest Territories of Canada.

  8. Tree species classification in the Southern Sierra Nevada Mountains based on MASTER and LIDAR imagery

    NASA Astrophysics Data System (ADS)

    Gibbons, S.; Grigsby, S.; Ustin, S.

    2013-12-01

    NASA recently collected MASTER (MODIS/ASTER) imagery over the Southern Sierra Nevada Mountains as part of the HyspIRI (Hyperspectral Infrared Imager) preparatory campaign, a location that was chosen for its distinct changes in vegetative species with elevation. Differentiation between functional types based on spectral data has been successful; however, classification between individual species is more difficult to accomplish with only the visible and near infrared portions of the spectrum. I used MASTER imagery in combination with Critical Zone Observatory LIDAR data to map species across both a low and high elevation site in the San Joaquin Experimental Range. While the visible and thermal bands of MASTER images provided an improved classification over shortwave bands, the physical characteristics from the LIDAR data showed the most contrast between the land covers, including tree species. The National Ecological Observatory Network (NEON) plans to use LIDAR and spectral data to monitor 20 domains, including the San Joaquin Experimental Range, for the next thirty years. Understanding the current species distributions not only provides insight on the available resources of the area but will also act as a baseline to determine the effects of environmental changes on vegetation using future NEON data.

  9. Effect of training characteristics on object classification: An application using Boosted Decision Trees

    NASA Astrophysics Data System (ADS)

    Sevilla-Noarbe, I.; Etayo-Sotos, P.

    2015-06-01

    We present an application of a particular machine-learning method (Boosted Decision Trees, BDTs using AdaBoost) to separate stars and galaxies in photometric images using their catalog characteristics. BDTs are a well established machine learning technique used for classification purposes. They have been widely used, especially in the field of particle and astroparticle physics, and we use them here in an optical astronomy application. This algorithm is able to improve on simple thresholding cuts on standard separation variables that may be affected by local effects such as blending or badly calculated background levels, or which do not include information in other bands. The improvements are shown using the Sloan Digital Sky Survey Data Release 9, with respect to the type photometric classifier. We obtain an improvement in the impurity of the galaxy sample of a factor of 2-4 for this particular dataset, adjusting for the same efficiency of the selection. Another main goal of this study is to verify the effects that different input vectors and training sets have on the classification performance, the results being of wider use to other machine learning techniques.
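
    To make the contrast between a single thresholding cut and a boosted ensemble concrete, the following is a minimal sketch of discrete AdaBoost over one-feature threshold cuts (decision stumps), written in plain Python. It is not the authors' implementation: the single feature, the toy values and the labels (+1 for "galaxy", -1 for "star") are invented purely for illustration, and all function names are hypothetical.

```python
import math

def stump_predict(stump, x):
    """A stump is one threshold cut: (feature index, threshold, sign)."""
    j, threshold, sign = stump
    return sign if x[j] > threshold else -sign

def train_stump(xs, ys, weights):
    """Return (weighted error, stump) for the best single threshold cut."""
    best = None
    for j in range(len(xs[0])):
        for threshold in sorted({x[j] for x in xs}):
            for sign in (1, -1):
                stump = (j, threshold, sign)
                err = sum(w for x, y, w in zip(xs, ys, weights)
                          if stump_predict(stump, x) != y)
                if best is None or err < best[0]:
                    best = (err, stump)
    return best

def adaboost(xs, ys, rounds):
    """Discrete AdaBoost: reweight the samples so that later stumps
    concentrate on the objects earlier stumps misclassified."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, stump = train_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        weights = [w * math.exp(-alpha * y * stump_predict(stump, x))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1

# Toy data (invented): one separation variable; no single cut separates
# the two classes, but a weighted vote of three cuts does.
toy_x = [[0], [1], [2], [3], [4], [5]]
toy_y = [1, 1, -1, -1, 1, 1]
toy_model = adaboost(toy_x, toy_y, rounds=3)
```

    On this toy set the best single cut still misclassifies two of the six objects, while the three-stump ensemble classifies all six correctly, which is the qualitative behaviour the abstract describes for real catalog variables.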

  10. Identifying tree crown delineation shapes and need for remediation on high resolution imagery using an evidence based approach

    NASA Astrophysics Data System (ADS)

    Leckie, Donald G.; Walsworth, Nicholas; Gougeon, François A.

    2016-04-01

    In order to fully realize the benefits of automated individual tree mapping for tree species, health, forest inventory attribution and forest management decision making, the tree delineations should be as good as possible. The concept of identifying poorly delineated tree crowns and suggesting likely types of remediation was investigated. Delineations (isolations or isols) were classified into shape types reflecting whether they were realistic tree shapes and the likely kind of remediation needed. Shape type was classified by an evidence based rules approach using primitives based on isol size, shape indices, morphology, the presence of local maxima, and matches with template models representing trees of different sizes. A test set containing 50,000 isols based on an automated tree delineation of 40 cm multispectral airborne imagery of a diverse temperate-boreal forest site was used. Isolations representing single trees or several trees were the focus, as opposed to cases where a tree is split into several isols. For eight shape classes from regular through to convolute, shape classification accuracy was in the order of 62%; simplifying to six classes accuracy was 83%. Shape type did give an indication of the type of remediation and there were 6% false alarms (i.e., isols classed as needing remediation that did not need it). Conversely, there were 5% omissions (i.e., isols of regular shape and not earmarked for remediation that did need remediation). The usefulness of the concept of identifying poor delineations in need of remediation was demonstrated and one suite of methods developed and shown to be effective.

  11. A Transform-Based Feature Extraction Approach for Motor Imagery Tasks Classification.

    PubMed

    Baali, Hamza; Khorshidtalab, Aida; Mesbah, Mostefa; Salami, Momoh J E

    2015-01-01

    In this paper, we present a new motor imagery classification method in the context of electroencephalography (EEG)-based brain-computer interface (BCI). This method uses a signal-dependent orthogonal transform, referred to as linear prediction singular value decomposition (LP-SVD), for feature extraction. The transform defines the mapping as the left singular vectors of the LP coefficient filter impulse response matrix. Using a logistic tree-based model classifier, the extracted features are classified into one of four motor imagery movements. The proposed approach was first benchmarked against two related state-of-the-art feature extraction approaches, namely, discrete cosine transform (DCT) and adaptive autoregressive (AAR)-based methods. By achieving an accuracy of 67.35%, the LP-SVD approach outperformed the other approaches by large margins (25% compared with DCT and 6% compared with AAR-based methods). To further improve the discriminatory capability of the extracted features and reduce the computational complexity, we enlarged the extracted feature subset by incorporating two extra features, namely, Q- and the Hotelling's [Formula: see text] statistics of the transformed EEG and introduced a new EEG channel selection method. The performance of the EEG classification based on the expanded feature set and channel selection method was compared with that of a number of the state-of-the-art classification methods previously reported with the BCI IIIa competition data set. Our method came second with an average accuracy of 81.38%.

  12. An approach for quantifying the efficacy of ecological classification schemes as management tools

    NASA Astrophysics Data System (ADS)

    Flanagan, A. M.; Cerrato, R. M.

    2015-10-01

    Rigorous assessments of ecological classification schemes being applied to submerged environments are needed to evaluate their utility as management tools. Verification that a scheme can quantitatively capture habitat and community variation would be of considerable value to individuals responsible for making difficult management decisions relevant to widespread environmental challenges including those in fisheries, preservation or restoration of critical habitats, and climate change. In this paper, an assessment approach that evaluates a scheme by treating it like a quantitative statistical model is presented. It couples two direct gradient, multivariate statistical techniques, multivariate regression trees (MRT) and redundancy analysis (RDA), with a modelling protocol involving model formulation, model selection, parameter estimation, and measurement of precision to produce a very flexible strategy for analyzing structure in ecological data. To illustrate the proposed approach, the assessment focused on benthic infauna and evaluating the Folk grain size classification scheme, along with some alternative grain size models. Analysis of data sets revealed that while it was fairly easy to uncover biotic-environmental relationships that were over-fitted, the community structure inherent in the data tended to be robustly discernible and preserved across all grain size models, but rigidly parameterized models (i.e., a one size fits all approach for grain size characterization with fixed boundaries) were generally ineffective. The proposed approach provided a clear, detailed, and rigorous assessment of Folk and several alternative models and can be used for the quantitative evaluation of existing ecological classification schemes and/or in the development of new schemes.

  13. Annual Crop Type Classification of the U.S. Great Plains for 2000 - 2011: An Application of Classification Tree Modeling using Remote Sensing and Ancillary Environmental Data (Invited)

    NASA Astrophysics Data System (ADS)

    Howard, D. M.; Wylie, B. K.

    2013-12-01

    The purpose of this study was to increase spatial and temporal availability of crop classification data using reliable source data that have the potential of being applied on local, regional, national, and global levels. This study implemented classification tree modeling to map annual crop types throughout the U.S. Great Plains from 2000 - 2011. Classification tree modeling has been shown in numerous studies to be an effective tool for developing classification models. In this study, nearly 18 million crop observation points, derived from annual U.S. Department of Agriculture (USDA) National Agricultural Statistics Service (NASS) Cropland Data Layers (CDLs), were used in the training, development, and validation of a classification tree crop type model (CTM). Each observation point was further defined by weekly Normalized Difference Vegetation Index (NDVI) readings, annual climatic conditions, soil conditions, and a number of other biogeophysical environmental characteristics. The CTM accounted for the most prevalent crop types in the area, including corn, soybeans, winter wheat, spring wheat, cotton, sorghum, and alfalfa. Other crops that did not fit into any of these classes were identified and grouped into a miscellaneous class. An 87% success rate was achieved on the classification of 1.8 million observation points (10% of total observation points) that were withheld from training. The CTM was applied to create annual crop maps of the U.S. Great Plains for 2000 - 2011 at a spatial resolution of 250 meters. Product validation was performed by comparing county acreage derived from the modeled crop maps and county acreage data from the USDA NASS Survey Program for each crop type and each year. More than 15,000 county records from 2001 - 2010 were compared with a Pearson's correlation coefficient of r = 0.87.
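
    The two validation steps mentioned in this abstract, withholding 10% of the observation points from training and comparing modeled with surveyed county acreage via Pearson's r, can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the function names, the random seed and the way points are shuffled are hypothetical, and the abstract does not describe how the authors actually drew their withheld sample.

```python
import math
import random

def holdout_split(records, holdout_fraction=0.1, seed=0):
    """Shuffle a copy of the records and withhold a fraction for validation.
    Returns (training records, withheld records)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_holdout = round(len(shuffled) * holdout_fraction)
    return shuffled[n_holdout:], shuffled[:n_holdout]

def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)
```

    In this sketch, `pearson_r` would be called with one list of modeled county acreages and one list of the corresponding survey acreages, one pair per county, crop type and year.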

  14. Simulating California Reservoir Operation Using the Classification and Regression Tree Algorithm Combined with a Shuffled Cross-Validation Scheme

    NASA Astrophysics Data System (ADS)

    Yang, T.; Gao, X.; Sorooshian, S.; Li, X.

    2015-12-01

    The controlled outflows from a reservoir or dam are highly dependent on the decisions made by the reservoir operators, instead of a natural hydrological process. Differences exist between the natural upstream inflows to reservoirs and the controlled outflows from reservoirs that supply the downstream users. With the decision maker's awareness of changing climate, reservoir management requires adaptable means to incorporate more information into decision making, such as the consideration of policy and regulation, environmental constraints, dry/wet conditions, etc. In this paper, a reservoir outflow simulation model is presented, which incorporates one of the well-developed data-mining models (Classification and Regression Tree) to predict the complicated human-controlled reservoir outflows and extract the reservoir operation patterns. A shuffled cross-validation approach is further implemented to improve the model's predictive performance. An application study of 9 major reservoirs in California is carried out and the simulated results from different decision tree approaches are compared with observation, including original CART and Random Forest. The statistical measurements show that CART combined with the shuffled cross-validation scheme gives a better predictive performance over the other two methods, especially in simulating the peak flows. The results for simulated controlled outflow, storage changes and storage trajectories also show that the proposed model is able to consistently and reasonably predict the human's reservoir operation decisions. In addition, we found that the operations of Trinity Lake, Oroville Lake and Shasta Lake are greatly influenced by policy and regulation, while low elevation reservoirs are more sensitive to inflow amount than others.
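
    The abstract does not spell out the shuffled cross-validation scheme, so the following is only a generic sketch of shuffled k-fold splitting in plain Python, given to show what "shuffling" adds over contiguous folds: record indices are permuted before being dealt into folds, so each fold mixes records from across the whole series. The function name, the number of folds and the seed are assumptions, not details taken from the paper.

```python
import random

def shuffled_kfold(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) pairs for k shuffled folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices round-robin into k folds.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test
```

    A tree model would then be fitted on each `train` subset and scored on the matching `test` subset, with the scores averaged over the k folds.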

  15. A classification tree for the prediction of benign versus malignant disease in patients with small renal masses.

    PubMed

    Rendon, Ricardo A; Mason, Ross J; Kirkland, Susan; Lawen, Joseph G; Abdolell, Mohamed

    2014-08-01

    To develop a classification tree for the preoperative prediction of benign versus malignant disease in patients with small renal masses. This is a retrospective study including 395 consecutive patients who underwent surgical treatment for a renal mass < 5 cm in maximum diameter between July 1st 2001 and June 30th 2010. A classification tree to predict the risk of having a benign renal mass preoperatively was developed using recursive partitioning analysis for repeated measures outcomes. Age, sex, volume on preoperative imaging, tumor location (central/peripheral), degree of endophytic component (1%-100%), and tumor axis position were used as potential predictors to develop the model. Forty-five patients (11.4%) were found to have a benign mass postoperatively. A classification tree has been developed which can predict the risk of benign disease with an accuracy of 88.9% (95% CI: 85.3 to 91.8). The significant prognostic factors in the classification tree are tumor volume, degree of endophytic component and symptoms at diagnosis. As an example of its utilization, a renal mass with a volume of < 5.67 cm3 that is < 45% endophytic has a 52.6% chance of having benign pathology. Conversely, a renal mass with a volume ≥ 5.67 cm3 that is ≥ 35% endophytic has only a 5.3% possibility of being benign. A classification tree to predict the risk of benign disease in small renal masses has been developed to aid the clinician when deciding on treatment strategies for small renal masses.

  16. Transition from a botanical to a molecular classification in tree pollen allergy: implications for diagnosis and therapy.

    PubMed

    Mothes, Nadine; Horak, Friedrich; Valenta, Rudolf

    2004-12-01

    Tree pollens are among the most important allergen sources. Allergic cross-reactivity to pollens of trees from various plant orders has so far been classified according to botanical relationships. In this context, cross-reactivities to pollens of trees of the Fagales order (birch, alder, hazel, hornbeam, oak, chestnut), fruits and vegetables, between pollens of the Scrophulariales (olive, ash, plantain, privet, lilac) and pollens of the Coniferales (cedar, cypress, pine) are well established. The application of molecular biology methods for allergen characterization has revealed the molecular nature of many important tree pollen allergens. We review the spectrum of tree pollen allergens and propose a classification of tree pollen and related allergies based on major allergen molecules instead of botanical relationships among the allergenic sources. This molecular classification suggests the major birch pollen allergen, Bet v 1, as a marker for Fagales pollen and related plant food allergies, the major olive pollen allergen, Ole e 1, as a possible marker for Scrophulariales pollen allergy and the cedar allergens, Cry j 1 and Cry j 2, as potential markers for allergy to Coniferales pollens. We exemplify for Fagales pollen allergy and Bet v 1 that major marker allergens are diagnostic tools to determine the disease-eliciting allergen source. Information obtained by diagnostic testing with marker allergens will be important for the appropriate selection of patients for allergen-specific forms of therapy.

  17. Pattern classification approach to rocket engine diagnostics

    SciTech Connect

    Tulpule, S.

    1989-01-01

    This paper presents a systems level approach to integrate state-of-the-art rocket engine technology with advanced computational techniques to develop an integrated diagnostic system (IDS) for future rocket propulsion systems. The key feature of this IDS is the use of advanced diagnostic algorithms for failure detection as opposed to the current practice of redline-based failure detection methods. The paper presents a top-down analysis of rocket engine diagnostic requirements, rocket engine operation, applicable diagnostic algorithms, and algorithm design techniques, which serve as a basis for the IDS. The concepts of hierarchical, model-based information processing are described, together with the use of signal processing, pattern recognition, and artificial intelligence techniques which are an integral part of this diagnostic system. 27 refs.

  18. Using Clustering and Classification Approaches in Interactive Retrieval.

    ERIC Educational Resources Information Center

    Wu, Mingfang; Fuller, Michael; Wilkinson, Ross

    2001-01-01

    Presents an ongoing series of experiments with the TREC (Text Retrieval Conference) Interactive Track that test the feasibility and effectiveness of using clustering and classification as an aid to retrieval and answer construction. Results indicate that the success of the approach depends on assessing the quality of the final answers generated by…

  19. Rational approaches to improving the isolation of endophytic actinobacteria from Australian native trees.

    PubMed

    Kaewkla, Onuma; Franco, Christopher M M

    2013-02-01

    In recent years, new actinobacterial species have been isolated as endophytes of plants and shrubs and are sought after both for their role as potential producers of new drug candidates for the pharmaceutical industry and as biocontrol inoculants for sustainable agriculture. Molecular-based approaches to the study of microbial ecology generally reveal a broader microbial diversity than can be obtained by cultivation methods. This study aimed to improve the success of isolating individual members of the actinobacterial population as pure cultures as well as improving the ability to characterise the large numbers obtained in pure culture. To achieve this objective, our study successfully employed rational and holistic approaches including the use of isolation media with low concentrations of nutrients normally available to the microorganism in the plant, plating larger quantities of plant sample, incubating isolation plates for up to 16 weeks, excising colonies when they are visible and choosing Australian endemic trees as the source of the actinobacteria. A hierarchy of polyphasic methods based on culture morphology, amplified 16S rRNA gene restriction analysis and limited sequencing was used to classify all 576 actinobacterial isolates from leaf, stem and root samples of two eucalypts: a Grey Box and Red Gum, a native apricot tree and a native pine tree. The classification revealed that, in addition to 413 Streptomyces spp., isolates belonged to 16 other actinobacterial genera: Actinomadura (two strains), Actinomycetospora (six), Actinopolymorpha (two), Amycolatopsis (six), Gordonia (one), Kribbella (25), Micromonospora (six), Nocardia (ten), Nocardioides (11), Nocardiopsis (one), Nonomuraea (one), Polymorphospora (two), Promicromonospora (51), Pseudonocardia (36), Williamsia (two) and a novel genus Flindersiella (one). In order to prove novelty, 12 strains were characterised fully to the species level based on polyphasic taxonomy. One strain represented a novel

  20. Classification Algorithms for Big Data Analysis, a Map Reduce Approach

    NASA Astrophysics Data System (ADS)

    Ayma, V. A.; Ferreira, R. S.; Happ, P.; Oliveira, D.; Feitosa, R.; Costa, G.; Plaza, A.; Gamba, P.

    2015-03-01

    For many years, the scientific community has been concerned with how to increase the accuracy of different classification methods, and major achievements have been made so far. Besides this issue, the increasing amount of data that is being generated every day by remote sensors raises more challenges to be overcome. In this work, a tool within the scope of InterIMAGE Cloud Platform (ICP), which is an open-source, distributed framework for automatic image interpretation, is presented. The tool, named ICP: Data Mining Package, is able to perform supervised classification procedures on huge amounts of data, usually referred to as big data, on a distributed infrastructure using Hadoop MapReduce. The tool has four classification algorithms implemented, taken from WEKA's machine learning library, namely: Decision Trees, Naïve Bayes, Random Forest and Support Vector Machines (SVM). The results of an experimental analysis using an SVM classifier on data sets of different sizes for different cluster configurations demonstrate the potential of the tool, as well as aspects that affect its performance.

  1. An overview of the phase-modular fault tree approach to phased mission system analysis

    NASA Technical Reports Server (NTRS)

    Meshkat, L.; Xing, L.; Donohue, S. K.; Ou, Y.

    2003-01-01

    In this paper, we look at how fault tree analysis (FTA), a primary means of performing reliability analysis of phased mission systems (PMS), can meet this challenge, by presenting an overview of the modular approach to solving fault trees that represent PMS.

  3. An efficient tree classifier ensemble-based approach for pedestrian detection.

    PubMed

    Xu, Yanwu; Cao, Xianbin; Qiao, Hong

    2011-02-01

    Classification-based pedestrian detection systems (PDSs) are currently a hot research topic in the field of intelligent transportation. A PDS detects pedestrians in real time on moving vehicles. A practical PDS demands not only high detection accuracy but also high detection speed. However, most of the existing classification-based approaches mainly seek high detection accuracy, while the detection speed is not purposely optimized for practical application. At the same time, the performance, particularly the speed, is primarily tuned based on experiments without theoretical foundations, leading to a long training procedure. This paper starts with measuring and optimizing detection speed, and then a practical classification-based pedestrian detection solution with high detection speed and training speed is described. First, an extended classification/detection speed metric, named feature-per-object (fpo), is proposed to measure the detection speed independently from execution. Then, an fpo minimization model with accuracy constraints is formulated based on a tree classifier ensemble, where the minimum fpo can guarantee the highest detection speed. Finally, the minimization problem is solved efficiently by using nonlinear fitting based on radial basis function neural networks. In addition, the optimal solution is directly used to instruct classifier training; thus, the training speed could be accelerated greatly. Therefore, a rapid and accurate classification-based detection technique is proposed for the PDS. Experimental results on urban traffic videos show that the proposed method has a high detection speed with an acceptable detection rate and a false-alarm rate for onboard detection; moreover, the training procedure is also very fast.

  4. Impact of atmospheric correction and image filtering on hyperspectral classification of tree species using support vector machine

    NASA Astrophysics Data System (ADS)

    Shahriari Nia, Morteza; Wang, Daisy Zhe; Bohlman, Stephanie Ann; Gader, Paul; Graves, Sarah J.; Petrovic, Milenko

    2015-01-01

    Hyperspectral images can be used to identify savannah tree species at the landscape scale, which is a key step in measuring biomass and carbon, and tracking changes in species distributions, including invasive species, in these ecosystems. Before automated species mapping can be performed, image processing and atmospheric correction are often performed, which can potentially affect the performance of classification algorithms. We determine how three processing and correction techniques (atmospheric correction, Gaussian filters, and shade/green vegetation filters) affect the prediction accuracy of classification of tree species at pixel level from airborne visible/infrared imaging spectrometer imagery of longleaf pine savanna in Central Florida, United States. Species classification using fast line-of-sight atmospheric analysis of spectral hypercubes (FLAASH) atmospheric correction outperformed ATCOR in the majority of cases. Green vegetation (normalized difference vegetation index) and shade (near-infrared) filters did not increase classification accuracy when applied to large and continuous patches of specific species. Finally, applying a Gaussian filter reduces interband noise and increases species classification accuracy. Using the optimal preprocessing steps, our classification accuracy of six species classes is about 75%.

  5. About decomposition approach for solving the classification problem

    NASA Astrophysics Data System (ADS)

    Andrianova, A. A.

    2016-11-01

    This article describes the application of an algorithm that uses decomposition methods to solve the binary classification problem of constructing a linear classifier based on the Support Vector Machine method. Decomposition reduces the volume of calculations, in particular because it makes parallel versions of the algorithm possible, which is a very important advantage for solving problems with big data. The results of computational experiments conducted using the decomposition approach are analyzed. The experiments use a known data set for the binary classification problem.

  6. Urban tree mortality: a primer on demographic approaches

    Treesearch

    Lara A. Roman; John J. Battles; Joe R. McBride

    2016-01-01

    Realizing the benefits of tree planting programs depends on tree survival. Projections of urban forest ecosystem services and cost-benefit analyses are sensitive to assumptions about tree mortality rates. Long-term mortality data are needed to improve the accuracy of these models and optimize the public investment in tree planting. With more accurate population...

  7. Knowledge-based approach to video content classification

    NASA Astrophysics Data System (ADS)

    Chen, Yu; Wong, Edward K.

    2000-12-01

    A framework for video content classification using a knowledge-based approach is herein proposed. This approach is motivated by the fact that videos are rich in semantic contents, which can best be interpreted and analyzed by human experts. We demonstrate the concept by implementing a prototype video classification system using the rule-based programming language CLIPS 6.05. Knowledge for video classification is encoded as a set of rules in the rule base. The left-hand-sides of rules contain high level and low level features, while the right-hand-sides of rules contain intermediate results or conclusions. Our current implementation includes features computed from motion, color, and text extracted from video frames. Our current rule set allows us to classify input video into one of five classes: news, weather reporting, commercial, basketball and football. We use MYCIN's inexact reasoning method for combining evidence, and to handle the uncertainties in the features and in the classification results. We obtained good results in a preliminary experiment, and it demonstrated the validity of the proposed approach.
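
    As an illustration of the kind of evidence combination this abstract refers to, the sketch below implements the certainty-factor combination rule in the form commonly attributed to MYCIN and its successor EMYCIN, in plain Python rather than CLIPS. The abstract does not say which variant the authors used, and the function names and the example certainty values are hypothetical.

```python
from functools import reduce

def combine_cf(cf1, cf2):
    """Combine two certainty factors in [-1, 1] that bear on the same
    conclusion (EMYCIN-style rule). The result is undefined when one
    factor is +1 and the other is -1 (total contradiction)."""
    if cf1 >= 0 and cf2 >= 0:
        # Two supporting pieces of evidence reinforce each other.
        return cf1 + cf2 * (1 - cf1)
    if cf1 <= 0 and cf2 <= 0:
        # Two opposing pieces of evidence reinforce the disbelief.
        return cf1 + cf2 * (1 + cf1)
    # Conflicting evidence partially cancels.
    return (cf1 + cf2) / (1 - min(abs(cf1), abs(cf2)))

def combine_all(cfs):
    """Fold a sequence of certainty factors into one combined value."""
    return reduce(combine_cf, cfs)
```

    For instance, if hypothetical motion, color and text rules each assigned a certainty factor to the conclusion "basketball", `combine_all` would merge the three values into a single belief for that class.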

  8. Knowledge-based approach to video content classification

    NASA Astrophysics Data System (ADS)

    Chen, Yu; Wong, Edward K.

    2001-01-01

    A framework for video content classification using a knowledge-based approach is herein proposed. This approach is motivated by the fact that videos are rich in semantic contents, which can best be interpreted and analyzed by human experts. We demonstrate the concept by implementing a prototype video classification system using the rule-based programming language CLIPS 6.05. Knowledge for video classification is encoded as a set of rules in the rule base. The left-hand-sides of rules contain high level and low level features, while the right-hand-sides of rules contain intermediate results or conclusions. Our current implementation includes features computed from motion, color, and text extracted from video frames. Our current rule set allows us to classify input video into one of five classes: news, weather reporting, commercial, basketball and football. We use MYCIN's inexact reasoning method for combining evidence, and to handle the uncertainties in the features and in the classification results. We obtained good results in a preliminary experiment, and it demonstrated the validity of the proposed approach.

  9. Surgical approach to impacted mandibular third molars--operative classification.

    PubMed

    Abu-El Naaj, Imad; Braun, Refael; Leiser, Yoav; Peled, Micha

    2010-03-01

    The aim of the present study is to suggest a convenient way to classify the position of the impacted third mandibular molar relative to the mandibular canal and to suggest indications for the use of each surgical approach for mandibular third molar extraction. The presented new typing system, Third Molar Classification (TMC), is a simple and easy-to-apply method for the surgical management of mandibular third molars and can be extended for any ectopic or impacted mandibular tooth. There are 3 major types of third molar positions. The second type is subdivided further into 2 subtypes. In the present study, 9 patients with high-risk mandibular third molars were treated according to the present classification and are presented and discussed. Patients typed as TMC IIb were treated with a sagittal split osteotomy approach and patients typed as TMC III were treated with an extraoral approach. The operative classification was successfully implemented in very rare cases of deeply impacted mandibular third molars. In 3 of 9 cases (33%) minor complications included some degree of hypoesthesia using the extraoral approach; these complications resolved spontaneously without the need for any intervention. The present study describes the use of a new surgical classification system for treatment planning in all types of mandibular third molar extractions. We believe that the present classification could help the oral and maxillofacial surgeon in decision-making and limit the possible risks that are present when attempting to extract impacted mandibular third molars. Copyright (c) 2010 American Association of Oral and Maxillofacial Surgeons. Published by Elsevier Inc. All rights reserved.

  10. Weighing risk factors associated with bee colony collapse disorder by classification and regression tree analysis.

    PubMed

    VanEngelsdorp, Dennis; Speybroeck, Niko; Evans, Jay D; Nguyen, Bach Kim; Mullin, Chris; Frazier, Maryann; Frazier, Jim; Cox-Foster, Diana; Chen, Yanping; Tarpy, David R; Haubruge, Eric; Pettis, Jeffrey S; Saegerman, Claude

    2010-10-01

    Colony collapse disorder (CCD), a syndrome whose defining trait is the rapid loss of adult worker honey bees, Apis mellifera L., is thought to be responsible for a minority of the large overwintering losses experienced by U.S. beekeepers since the winter 2006-2007. Using the same data set developed to perform a monofactorial analysis (PloS ONE 4: e6481, 2009), we conducted a classification and regression tree (CART) analysis in an attempt to better understand the relative importance and interrelations among different risk variables in explaining CCD. Fifty-five exploratory variables were used to construct two CART models: one model with and one model without a cost of misclassifying a CCD-diagnosed colony as a non-CCD colony. The resulting model tree that permitted misclassification had a sensitivity and specificity of 85 and 74%, respectively. Although factors measuring colony stress (e.g., adult bee physiological measures, such as fluctuating asymmetry or mass of head) were important discriminating values, six of the 19 variables having the greatest discriminatory value were pesticide levels in different hive matrices. Notably, coumaphos levels in brood (a miticide commonly used by beekeepers) had the highest discriminatory value and were highest in control (healthy) colonies. Our CART analysis provides evidence that CCD is probably the result of several factors acting in concert, making afflicted colonies more susceptible to disease. This analysis highlights several areas that warrant further attention, including the effect of sublethal pesticide exposure on pathogen prevalence and the role of variability in bee tolerance to pesticides on colony survivorship.
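
    The sensitivity and specificity figures quoted in this abstract are standard confusion-matrix ratios. The plain-Python sketch below shows how they are computed from paired lists of true and predicted labels; the label strings and the function name are hypothetical, and no data from the study are reproduced here.

```python
def sensitivity_specificity(actual, predicted, positive="CCD"):
    """Return (sensitivity, specificity) for a two-class problem.
    Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = fn = tn = fp = 0
    for truth, guess in zip(actual, predicted):
        if truth == positive:
            if guess == positive:
                tp += 1   # CCD colony correctly flagged
            else:
                fn += 1   # CCD colony missed
        else:
            if guess == positive:
                fp += 1   # healthy colony wrongly flagged
            else:
                tn += 1   # healthy colony correctly cleared
    return tp / (tp + fn), tn / (tn + fp)
```

    Attaching a cost to misclassifying a CCD colony as non-CCD, as one of the two CART models does, generally trades specificity for sensitivity, which is why both figures are reported together.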

  11. Aerial Images from an UAV System: 3D Modeling and Tree Species Classification in a Park Area

    NASA Astrophysics Data System (ADS)

    Gini, R.; Passoni, D.; Pinto, L.; Sona, G.

    2012-07-01

    The use of aerial imagery acquired by Unmanned Aerial Vehicles (UAVs) is scheduled within the FoGLIE project (Fruition of Goods Landscape in Interactive Environment): it starts from the need to enhance the natural, artistic and cultural heritage, to produce a better usability of it by employing audiovisual movable systems of 3D reconstruction and to improve monitoring procedures, by using new media for integrating the fruition phase with the preservation ones. The pilot project focuses on a test area, Parco Adda Nord, which encloses various goods' types (small buildings, agricultural fields and different tree species and bushes). Multispectral high resolution images were taken by two digital compact cameras: a Pentax Optio A40 for RGB photos and a Sigma DP1 modified to acquire the NIR band. Then, some tests were performed in order to analyze the UAV images' quality for both photogrammetric and photo-interpretation purposes, to validate the vector-sensor system and the image block geometry, and to study the feasibility of tree species classification. Many pre-signalized Control Points were surveyed through GPS to allow accuracy analysis. Aerial Triangulations (ATs) were carried out with photogrammetric commercial software, Leica Photogrammetry Suite (LPS) and PhotoModeler, with manual or automatic selection of Tie Points, to pick out pros and cons of each package in managing non-conventional aerial imagery as well as the differences in the modeling approach. Further analyses were done on the differences between the EO parameters and the corresponding data coming from the on-board UAV navigation system.

  12. A conceptual approach to approximate tree root architecture in infinite slope models

    NASA Astrophysics Data System (ADS)

    Schmaltz, Elmar; Glade, Thomas

    2016-04-01

    paraboloids represent a cordate-root-system with radius r, height h and a constant, species-independent curvature. This procedure simplifies the classification of tree species into the three defined geometric solids. In this study we introduce a conceptual approach to estimate the 2- and 3-dimensional distribution of different tree root systems, and to implement it in a raster environment, as it is used in infinite slope models. To this end we used the PCRaster extension in a Python framework. The results show that root distribution and root growth are spatially reproducible in a simple raster framework. The outputs exhibit significant effects for a synthetically generated slope at the local scale for equal time-steps. The preliminary results represent an initial step towards a vegetation module that can be coupled with hydro-mechanical slope stability models. This approach is expected to yield a valuable contribution to the implementation of vegetation-related properties, in particular effects of root-reinforcement, into physically-based approaches using infinite slope models.

  13. Control of tree water networks: A geometric programming approach

    NASA Astrophysics Data System (ADS)

    Sela Perelman, L.; Amin, S.

    2015-10-01

    This paper presents a modeling and operation approach for tree water supply systems. The network control problem is approximated as a geometric programming (GP) problem. The original nonlinear nonconvex network control problem is transformed into a convex optimization problem. The optimization model can be efficiently solved to optimality using state-of-the-art solvers. Two control schemes are presented: (1) operation of network actuators (pumps and valves) and (2) controlled demand shedding allocation between network consumers with limited resources. The dual of the network control problem is formulated and is used to perform sensitivity analysis with respect to hydraulic constraints. The approach is demonstrated on a small branched-topology network and later extended to a medium-size irrigation network. The results demonstrate an intrinsic trade-off between energy costs and demand shedding policy, providing an efficient decision support tool for active management of water systems.
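
    For readers unfamiliar with geometric programming, the convexification step mentioned above can be written out in its generic textbook form. The symbols below are standard GP notation and are assumptions made for illustration; they are not the network variables or constraints used in the paper.

```latex
% Standard-form geometric program in positive variables x_1, ..., x_n:
% f_i are posynomials, g_j are monomials.
\begin{align*}
\min_{x > 0}\quad & f_0(x) \\
\text{s.t.}\quad  & f_i(x) \le 1, \quad i = 1, \dots, m, \\
                  & g_j(x) = 1, \quad j = 1, \dots, p, \\
\text{where}\quad & f_i(x) = \sum_{k=1}^{K_i} c_{ik} \prod_{l=1}^{n} x_l^{a_{ikl}}, \quad c_{ik} > 0, \\
                  & g_j(x) = d_j \prod_{l=1}^{n} x_l^{e_{jl}}, \quad d_j > 0.
\end{align*}

% Substituting y_l = \log x_l and taking logarithms gives a convex problem:
\begin{align*}
\min_{y}\quad     & \log \sum_{k=1}^{K_0} \exp\!\left(a_{0k}^{\top} y + \log c_{0k}\right) \\
\text{s.t.}\quad  & \log \sum_{k=1}^{K_i} \exp\!\left(a_{ik}^{\top} y + \log c_{ik}\right) \le 0, \quad i = 1, \dots, m, \\
                  & e_j^{\top} y + \log d_j = 0, \quad j = 1, \dots, p.
\end{align*}
```

    The log-sum-exp function is convex and the transformed equality constraints are affine in y, which is why a problem that is nonconvex in x becomes convex in y.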

  14. An improved methodology for land-cover classification using artificial neural networks and a decision tree classifier

    NASA Astrophysics Data System (ADS)

    Arellano-Neri, Olimpia

    Mapping is essential for the analysis of the land and land-cover dynamics, which influence many environmental processes and properties. When creating land-cover maps it is important to minimize error, since error will propagate into later analyses based upon these land cover maps. The reliability of land cover maps derived from remotely sensed data depends upon an accurate classification. For decades, traditional statistical methods have been applied in land-cover classification with varying degrees of accuracy. One of the most significant developments in the field of land-cover classification using remotely sensed data has been the introduction of Artificial Neural Network (ANN) procedures. In this research, Artificial Neural Networks were applied to remotely sensed data of the southwestern Ohio region for land-cover classification. Three variants on traditional ANN-based classifiers were explored here: (1) the use of a customized architecture of the neural network in terms of the input layer for each land-cover class, (2) the use of texture analysis to combine spectral information and spatial information, which is essential for urban classes, and (3) the use of decision tree (DT) classification to refine the ANN classification and ultimately to achieve a more reliable land-cover thematic map. The objective of this research was to prove that a classification based on Artificial Neural Networks (ANN) and decision tree (DT) would far outperform the National Land Cover Data (NLCD). The NLCD is a land-cover classification produced by a cooperative effort between the United States Geological Survey (USGS) and the United States Environmental Protection Agency (USEPA). In order to achieve this objective, an accuracy assessment was conducted for both NLCD classification and ANN/DT classification. Error matrices resulting from the accuracy assessments provided overall accuracy, accuracy of each class, omission errors, and commission errors for each classification.

  15. Classification.

    PubMed

    Tuxhorn, Ingrid; Kotagal, Prakash

    2008-07-01

    In this article, we review the practical approach and diagnostic relevance of current seizure and epilepsy classification concepts and principles as a basic framework for good management of patients with epileptic seizures and epilepsy. Inaccurate generalizations about terminology, diagnosis, and treatment may be the single most important factor, next to an inadequately obtained history, that determines the misdiagnosis and mismanagement of patients with epilepsy. A stepwise signs and symptoms approach for diagnosis, evaluation, and management along the guidelines of the International League Against Epilepsy and definitions of epileptic seizures and epilepsy syndromes offers a state-of-the-art clinical approach to managing patients with epilepsy.

  16. Comparing ANNs, EAs, and Trees: a basic machine-learning approach to predictive environmental models.

    NASA Astrophysics Data System (ADS)

    Williams, J.; Poff, N.

    2005-05-01

    Machine learning techniques for ecological applications or "eco-informatics" are becoming increasingly useful and accessible for ecologists. We evaluated the predictive ability of three commercially available (i.e. user-friendly) software packages for artificial neural networks (ANNs), evolutionary algorithms (EAs), and classification/regression trees (Trees). We analyzed fish and habitat data for streams in the mid-Atlantic region of the U.S., which was collected by the U.S. Environmental Protection Agency (EPA). The data includes over 200 environmental descriptors summarizing watershed, stream, and water chemistry characteristics in addition to derived fish community metrics (i.e. richness, IBI scores, % exotics). In our analysis we predicted individual species presence/absence and fish community metrics as a function of these local and regional scale habitat variables. Predictive ability is evaluated with independent validation data. These approaches could prove especially useful for conservation or management applications where ecologists seek to utilize the most comprehensive data to make predictions at various scales. By employing "user-friendly" software we hope to show that ecologists, without extensive knowledge of computational science, can benefit from these techniques by extracting more information about complex ecosystems. Relative strengths and weaknesses of these three approaches are compared and recommendations for their use in conservation applications are presented.

  17. ADHD classification using bag of words approach on network features

    NASA Astrophysics Data System (ADS)

    Solmaz, Berkan; Dey, Soumyabrata; Rao, A. Ravishankar; Shah, Mubarak

    2012-02-01

    Attention Deficit Hyperactivity Disorder (ADHD) is receiving a lot of attention nowadays, mainly because it is one of the common brain disorders among children and little is known about its cause. In this study, we propose a novel approach for automatic classification of ADHD-conditioned subjects and control subjects using functional Magnetic Resonance Imaging (fMRI) data of resting state brains. For this purpose, we compute the correlation between every possible voxel pair within a subject and over the time frame of the experimental protocol. A network of voxels is constructed by representing a high correlation value between any two voxels as an edge. A Bag-of-Words (BoW) approach is used to represent each subject as a histogram of network features, such as the degree of each voxel. The classification is done using a Support Vector Machine (SVM). We also investigate the use of raw intensity values in the time series for each voxel. Here, every subject is represented as a combined histogram of network and raw intensity features. Experimental results verified that the classification accuracy improves when the combined histogram is used. We tested our approach on a highly challenging dataset released by NITRC for the ADHD-200 competition and obtained promising results. The dataset not only is large but also includes subjects from different demographic and age groups. To the best of our knowledge, this is the first paper to propose a BoW approach in any functional brain disorder classification, and we believe that this approach will be useful in the analysis of many brain-related conditions.
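
    The network-feature step described above can be sketched in a few lines. This is a toy illustration only, assuming a Pearson correlation, a fixed threshold and a plain degree count per voxel; the function names, the threshold values and the four short time series are hypothetical and do not reproduce the authors' pipeline.

```python
from collections import Counter
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def degree_histogram(series, threshold):
    """Link voxel pairs whose correlation exceeds `threshold`, then
    return a histogram mapping node degree -> number of voxels."""
    n = len(series)
    degree = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if pearson(series[i], series[j]) > threshold:
                degree[i] += 1
                degree[j] += 1
    return Counter(degree)


# Four hypothetical voxel time series (one list per voxel).
voxels = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1], [1, 3, 2, 4]]
print(degree_histogram(voxels, threshold=0.9))
print(degree_histogram(voxels, threshold=0.5))
```

    A histogram of this kind has a fixed length regardless of how many edges a subject's network contains, which is what makes it usable as a feature vector for a classifier such as an SVM.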

  18. Improving Crop Classification Techniques Using Optical Remote Sensing Imagery, High-Resolution Agriculture Resource Inventory Shapefiles and Decision Trees

    NASA Astrophysics Data System (ADS)

    Melnychuk, A. L.; Berg, A. A.; Sweeney, S.

    2010-12-01

    Recognition of anthropogenic effects of land use management practices on bodies of water is important for remediating and preventing eutrophication. In the case of Lake Simcoe, Ontario, the main surrounding land use is agriculture. To better manage the nutrient flow into the lake, knowledge of the management of the agricultural land is important. For this basin, a comprehensive agricultural resource inventory is required for assessment of policy and for input into water quality management and assessment tools. Supervised decision tree classification schemes, used in many previous applications, have yielded reliable classifications in agricultural land-use systems. However, when using these classification techniques the user is confronted with numerous data sources. In this study we use a large inventory of optical satellite image products (Landsat, AWiFS, SPOT and MODIS) and ancillary data sources (temporal MODIS-NDVI product signatures, digital elevation models and soil maps) at various spatial and temporal resolutions in a decision tree classification scheme. The sensitivity of the classification accuracy to various products is assessed to identify optimal data sources for classifying crop systems.

  19. Simulating California reservoir operation using the classification and regression-tree algorithm combined with a shuffled cross-validation scheme

    NASA Astrophysics Data System (ADS)

    Yang, Tiantian; Gao, Xiaogang; Sorooshian, Soroosh; Li, Xin

    2016-03-01

    The controlled outflows from a reservoir or dam are highly dependent on the decisions made by the reservoir operators, rather than on a natural hydrological process. A difference exists between the natural upstream inflows to reservoirs and the controlled outflows from reservoirs that supply the downstream users. With the decision maker's awareness of changing climate, reservoir management requires adaptable means to incorporate more information into decision making, such as water delivery requirement, environmental constraints, dry/wet conditions, etc. In this paper, a robust reservoir outflow simulation model is presented, which incorporates one of the well-developed data-mining models (Classification and Regression Tree) to predict the complicated human-controlled reservoir outflows and extract the reservoir operation patterns. A shuffled cross-validation approach is further implemented to improve CART's predictive performance. An application study of nine major reservoirs in California is carried out. Results produced by the enhanced CART, original CART, and random forest are compared with observation. The statistical measurements show that the enhanced CART and random forest outperform the CART control run in general, and the enhanced CART algorithm gives a better predictive performance than random forest in simulating the peak flows. The results also show that the proposed model is able to consistently and reasonably predict the expert release decisions. Experiments indicate that the release operation in the Oroville Lake is significantly dominated by SWP allocation amount and that reservoirs with low elevation are more sensitive to inflow amount than others.
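
    The abstract does not spell out the shuffled cross-validation scheme, so the sketch below shows only the generic idea of shuffling sample indices before splitting them into folds. The function name, the fold count and the seed are assumptions made for illustration and should not be read as the authors' procedure.

```python
import random


def shuffled_kfold(n_samples, k, seed=0):
    """Yield (train, test) index lists for k folds after one shuffle."""
    indices = list(range(n_samples))
    # Shuffle once so that each fold mixes samples from the whole record
    # instead of covering one contiguous block of time.
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


for train, test in shuffled_kfold(n_samples=10, k=5):
    print("train:", sorted(train), "test:", sorted(test))
```

    For strongly autocorrelated series such as daily reservoir records, shuffling places neighbouring days in both the training and the test folds, so validation scores obtained this way are best interpreted with that in mind.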

  20. Twin SVM-Based Classification of Alzheimer's Disease Using Complex Dual-Tree Wavelet Principal Coefficients and LDA

    PubMed Central

    Alam, Saruar; Kim, Ji-In; Park, Chun-Su

    2017-01-01

    Alzheimer's disease (AD) is a leading cause of dementia, which causes serious health and socioeconomic problems. A progressive neurodegenerative disorder, Alzheimer's causes the structural change in the brain, thereby affecting behavior, cognition, emotions, and memory. Numerous multivariate analysis algorithms have been used for classifying AD, distinguishing it from healthy controls (HC). Efficient early classification of AD and mild cognitive impairment (MCI) from HC is imperative as early preventive care could help to mitigate risk factors. Magnetic resonance imaging (MRI), a noninvasive biomarker, displays morphometric differences and cerebral structural changes. A novel approach for distinguishing AD from HC using dual-tree complex wavelet transforms (DTCWT), principal coefficients from the transaxial slices of MRI images, linear discriminant analysis, and twin support vector machine is proposed here. The prediction accuracy of the proposed method yielded up to 92.65 ± 1.18 over the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset, with a specificity of 92.19 ± 1.56 and sensitivity of 93.11 ± 1.29, and 96.68 ± 1.44 over the Open Access Series of Imaging Studies (OASIS) dataset, with a sensitivity of 97.72 ± 2.34 and specificity of 95.61 ± 1.67. The accuracy, sensitivity, and specificity achieved using the proposed method are comparable or superior to those obtained by various conventional AD prediction methods.

  1. Multicenter study on caries risk assessment in adults using survival Classification and Regression Trees

    PubMed Central

    Arino, Masumi; Ito, Ataru; Fujiki, Shozo; Sugiyama, Seiichi; Hayashi, Mikako

    2016-01-01

    Dental caries is an important public health problem worldwide. This study aims to prove how preventive therapies reduce the onset of caries in adult patients, and to identify patients with high or low risk of caries by using Classification and Regression Trees based survival analysis (survival CART). A clinical data set of 732 patients aged 20 to 64 years in nine Japanese general practices was analyzed with the following parameters: age, DMFT, number of mutans streptococci (SM) and Lactobacilli (LB), secretion rate and buffer capacity of saliva, and compliance with a preventive program. Results showed the incidence of primary carious lesion was affected by SM, LB and compliance with a preventive program; secondary carious lesion was affected by DMFT, SM and LB. Survival CART identified high-risk patients for primary carious lesion according to their poor compliance with a preventive program and SM (≥10^6 CFU/ml) with a hazard ratio of 3.66 (p = 0.0002). In the case of secondary caries, patients with LB (≥10^5 CFU/ml) and DMFT (>15) were identified as high risk with a hazard ratio of 3.50 (p < 0.0001). We conclude that preventive programs can be effective in limiting the incidence of primary carious lesion. PMID:27381750

  2. Study and Ranking of Determinants of Taenia solium Infections by Classification Tree Models

    PubMed Central

    Mwape, Kabemba E.; Phiri, Isaac K.; Praet, Nicolas; Dorny, Pierre; Muma, John B.; Zulu, Gideon; Speybroeck, Niko; Gabriël, Sarah

    2015-01-01

    Taenia solium taeniasis/cysticercosis is an important public health problem occurring mainly in developing countries. This work aimed to study the determinants of human T. solium infections in the Eastern province of Zambia and rank them in order of importance. A household (HH)-level questionnaire was administered to 680 HHs from 53 villages in two rural districts and the taeniasis and cysticercosis status determined. A classification tree model (CART) was used to define the relative importance and interactions between different predictor variables in their effect on taeniasis and cysticercosis. The Katete study area had a significantly higher taeniasis and cysticercosis prevalence than the Petauke area. The CART analysis for Katete showed that the most important determinant for cysticercosis infections was the number of HH inhabitants (6 to 10) and for taeniasis was the number of HH inhabitants > 6. The most important determinant in Petauke for cysticercosis was the age of head of household > 32 years and for taeniasis it was age < 55 years. The CART analysis showed that the most important determinant for both taeniasis and cysticercosis infections was the number of HH inhabitants (6 to 10) in Katete district and age in Petauke. The results suggest that control measures should target HHs with a high number of inhabitants and older individuals. PMID:25404073

  3. Differential Diagnosis of Erythmato-Squamous Diseases Using Classification and Regression Tree

    PubMed Central

    Maghooli, Keivan; Langarizadeh, Mostafa; Shahmoradi, Leila; Habibi-koolaee, Mahdi; Jebraeily, Mohamad; Bouraghi, Hamid

    2016-01-01

    Introduction: Differential diagnosis of Erythmato-Squamous Diseases (ESD) is a major challenge in the field of dermatology. ESD are grouped into six classes. Data mining is the process of detecting hidden patterns, and in the case of ESD it can help predict the disease. Different algorithms have been developed for this purpose. Objective: We aimed to use the Classification and Regression Tree (CART) to predict the differential diagnosis of ESD. Methods: We used the Cross Industry Standard Process for Data Mining (CRISP-DM) methodology. For this purpose, the dermatology data set was obtained from the UCI machine learning repository. The Clementine 12.0 software from IBM was used for modelling. To evaluate the model, we calculated its accuracy, sensitivity and specificity. Results: The proposed model had an accuracy of 94.84% (standard deviation: 24.42) in correctly predicting ESD. Conclusions: The results indicate that this classifier could be useful. However, a combination of machine learning methods could be more useful for predicting ESD and is strongly recommended. PMID:28077889

  4. Prediction of cadmium enrichment in reclaimed coastal soils by classification and regression tree

    NASA Astrophysics Data System (ADS)

    Ru, Feng; Yin, Aijing; Jin, Jiaxin; Zhang, Xiuying; Yang, Xiaohui; Zhang, Ming; Gao, Chao

    2016-08-01

    Reclamation of coastal land is one of the most common ways to obtain land resources in China. However, it has long been acknowledged that the artificial interference with coastal land has disadvantageous effects, such as heavy metal contamination. This study aimed to develop a prediction model for cadmium enrichment levels and assess the importance of affecting factors in typical reclaimed land in Eastern China (DFCL: Dafeng Coastal Land). Two hundred and twenty seven surficial soil/sediment samples were collected and analyzed to identify the enrichment levels of cadmium and the possible affecting factors in soils and sediments. The classification and regression tree (CART) model was applied in this study to predict cadmium enrichment levels. The prediction results showed that cadmium enrichment levels assessed by the CART model had an accuracy of 78.0%. The CART model could extract more information on factors affecting the environmental behavior of cadmium than correlation analysis. The integration of correlation analysis and the CART model showed that fertilizer application and organic carbon accumulation were the most important factors affecting soil/sediment cadmium enrichment levels, followed by particle size effects (Al2O3, TFe2O3 and SiO2), contents of Cl and S, surrounding construction areas and reclamation history.

  5. Predictors of sentinel lymph node status in cutaneous melanoma: a classification and regression tree analysis.

    PubMed

    Tejera-Vaquerizo, A; Martín-Cuevas, P; Gallego, E; Herrera-Acosta, E; Traves, V; Herrera-Ceballos, E; Nagore, E

    2015-04-01

    The main aim of this study was to identify predictors of sentinel lymph node (SN) metastasis in cutaneous melanoma. This was a retrospective cohort study of 818 patients in 2 tertiary-level hospitals. The primary outcome variable was SN involvement. Independent predictors were identified using multiple logistic regression and a classification and regression tree (CART) analysis. Ulceration, tumor thickness, and a high mitotic rate (≥6 mitoses/mm²) were independently associated with SN metastasis in the multiple regression analysis. The most important predictor in the CART analysis was Breslow thickness. Absence of an inflammatory infiltrate, patient age, and tumor location were predictive of SN metastasis in patients with tumors thicker than 2 mm. In the case of thinner melanomas, the predictors were mitotic rate (>6 mitoses/mm²), presence of ulceration, and tumor thickness. Patient age, mitotic rate, and tumor thickness and location were predictive of survival. A high mitotic rate predicts a higher risk of SN involvement and worse survival. CART analysis improves the prediction of regional metastasis, resulting in better clinical management of melanoma patients. It may also help select suitable candidates for inclusion in clinical trials. Copyright © 2014 Elsevier España, S.L.U. and AEDV. All rights reserved.

  6. Classification tree analysis for the discrimination of pleural exudates and transudates.

    PubMed

    Esquerda, Aureli; Trujillano, Javier; López de Ullibarri, Ignacio; Bielsa, Silvia; Madroñero, Ana B; Porcel, José M

    2007-01-01

    Classification and regression tree (CART) analysis is a non-parametric technique suitable for the generation of clinical decision rules. We have studied the performance of CART analysis in the separation of pleural exudates and transudates. Basic demographic, radiologic and laboratory data were retrospectively evaluated in 1257 pleural effusions (204 transudates and 1053 exudates, according to standard clinical criteria) and submitted for CART analysis. The model's discriminative ability was compared with that of Light's criteria, in both the original formulation and an abbreviated version, i.e., deleting the pleural fluid (PF)/serum lactate dehydrogenase (LDH) ratio from the triad. A first CART model built starting from all available data identified PF/serum protein ratio and PF LDH ratios as the two best discriminatory parameters. This algorithm achieved a sensitivity of 96.8%, slightly lower than that of classical Light's criteria (98.5%) and comparable to that of the abbreviated Light's criteria (97.0%), and significantly better specificity (85.3%) compared to both classical (74.0%) and abbreviated (79.4%) Light's criteria. A second CART model developed after excluding serum measurements selected PF protein and PF LDH as the most discriminatory variables, and correctly classified 97.2% of exudates and 77.0% of transudates. CART-based algorithms can efficiently discriminate between pleural exudates and transudates.
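
    The sensitivity and specificity figures quoted above follow from a confusion matrix. The sketch below shows the arithmetic, treating exudates as the positive class; the function name and the counts are hypothetical, chosen only for illustration, and are not the study's data.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # positives (exudates) correctly identified
    specificity = tn / (tn + fp)  # negatives (transudates) correctly identified
    return sensitivity, specificity


# Hypothetical counts, for illustration only.
sens, spec = sensitivity_specificity(tp=90, fn=10, tn=40, fp=10)
print(f"sensitivity={sens:.1%}, specificity={spec:.1%}")
```

    Because the two measures are computed on different subsets of the sample, a rule can gain specificity at little cost in sensitivity, which is the kind of trade-off the comparison with Light's criteria reports.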

  7. Study and ranking of determinants of Taenia solium infections by classification tree models.

    PubMed

    Mwape, Kabemba E; Phiri, Isaac K; Praet, Nicolas; Dorny, Pierre; Muma, John B; Zulu, Gideon; Speybroeck, Niko; Gabriël, Sarah

    2015-01-01

    Taenia solium taeniasis/cysticercosis is an important public health problem occurring mainly in developing countries. This work aimed to study the determinants of human T. solium infections in the Eastern province of Zambia and rank them in order of importance. A household (HH)-level questionnaire was administered to 680 HHs from 53 villages in two rural districts and the taeniasis and cysticercosis status determined. A classification tree model (CART) was used to define the relative importance and interactions between different predictor variables in their effect on taeniasis and cysticercosis. The Katete study area had a significantly higher taeniasis and cysticercosis prevalence than the Petauke area. The CART analysis for Katete showed that the most important determinant for cysticercosis infections was the number of HH inhabitants (6 to 10) and for taeniasis was the number of HH inhabitants > 6. The most important determinant in Petauke for cysticercosis was the age of head of household > 32 years and for taeniasis it was age < 55 years. The CART analysis showed that the most important determinant for both taeniasis and cysticercosis infections was the number of HH inhabitants (6 to 10) in Katete district and age in Petauke. The results suggest that control measures should target HHs with a high number of inhabitants and older individuals.

  8. Predicting smear negative pulmonary tuberculosis with classification trees and logistic regression: a cross-sectional study

    PubMed Central

    Mello, Fernanda Carvalho de Queiroz; Bastos, Luiz Gustavo do Valle; Soares, Sérgio Luiz Machado; Rezende, Valéria MC; Conde, Marcus Barreto; Chaisson, Richard E; Kritski, Afrânio Lineu; Ruffino-Netto, Antonio; Werneck, Guilherme Loureiro

    2006-01-01

    Background: Smear negative pulmonary tuberculosis (SNPT) accounts for 30% of pulmonary tuberculosis cases reported yearly in Brazil. This study aimed to develop a prediction model for SNPT for outpatients in areas with scarce resources. Methods: The study enrolled 551 patients with clinical-radiological suspicion of SNPT, in Rio de Janeiro, Brazil. The original data was divided into two equivalent samples for generation and validation of the prediction models. Symptoms, physical signs and chest X-rays were used for constructing logistic regression and classification and regression tree models. From the logistic regression, we generated a clinical and radiological prediction score. The area under the receiver operator characteristic curve, sensitivity, and specificity were used to evaluate the model's performance in both generation and validation samples. Results: It was possible to generate predictive models for SNPT with sensitivity ranging from 64% to 71% and specificity ranging from 58% to 76%. Conclusion: The results suggest that those models might be useful as screening tools for estimating the risk of SNPT, optimizing the utilization of more expensive tests, and avoiding costs of unnecessary anti-tuberculosis treatment. Those models might be cost-effective tools in a health care network with hierarchical distribution of scarce resources. PMID:16504086

  9. Subthreshold linear modeling of dendritic trees: a computational approach.

    PubMed

    Khodaei, Alireza; Pierobon, Massimiliano

    2016-08-01

    The design of communication systems based on the transmission of information through neurons is envisioned as a key technology for the pervasive interconnection of future wearable and implantable devices. While previous literature has mainly focused on modeling propagation of electrochemical spikes carrying natural information through the nervous system, in recent work the authors of this paper proposed the so-called subthreshold electrical stimulation as a viable technique to propagate artificial information through neurons. This technique promises to limit the interference with natural communication processes, and it can be successfully approximated with linear models. In this paper, a novel model is proposed to account for the subthreshold stimuli propagation from the dendritic tree to the soma of a neuron. A computational approach is detailed to obtain this model for a given realistic 3D dendritic tree with an arbitrary morphology. Numerical results from the model are obtained over a stimulation signal bandwidth of 1 kHz, and compared with the results of a simulation through the NEURON software.

  10. Use of Binary Partition Tree and energy minimization for object-based classification of urban land cover

    NASA Astrophysics Data System (ADS)

    Li, Mengmeng; Bijker, Wietske; Stein, Alfred

    2015-04-01

    Two main challenges are faced when classifying urban land cover from very high resolution satellite images: obtaining an optimal image segmentation and distinguishing buildings from other man-made objects. For optimal segmentation, this work proposes a hierarchical representation of an image by means of a Binary Partition Tree (BPT) and an unsupervised evaluation of image segmentations by energy minimization. For building extraction, we apply fuzzy sets to create a fuzzy landscape of shadows, which in turn involves a two-step procedure. The first step is a preliminary image classification at a fine segmentation level to generate vegetation and shadow information. The second step models the directional relationship between building and shadow objects to extract building information at the optimal segmentation level. We conducted the experiments on two datasets of Pléiades images from Wuhan City, China. To demonstrate its performance, the proposed classification is compared at the optimal segmentation level with Maximum Likelihood Classification and Support Vector Machine classification. The results show that the proposed classification produced the highest overall accuracies and kappa coefficients, and the smallest over-classification and under-classification geometric errors. We conclude first that integrating BPT with energy minimization offers an effective means for image segmentation. Second, we conclude that the directional relationship between building and shadow objects represented by a fuzzy landscape is important for building extraction.

  11. Non-Destructive Classification Approaches for Equilibrated Ordinary Chondrites

    NASA Technical Reports Server (NTRS)

    Righter, K.; Harrington, R.; Schroeder, C.; Morris, R. V.

    2013-01-01

    Classification of meteorites is most effectively carried out by petrographic and mineralogic studies of thin sections, but a rapid and accurate classification technique for the many samples collected in dense collection areas (hot and cold deserts) is of great interest. Oil immersion techniques have been used to classify a large proportion of the US Antarctic meteorite collections since the mid-1980s [1]. This approach has allowed rapid characterization of thousands of samples over time, but nonetheless utilizes a piece of the sample that has been ground to grains or a powder. In order to compare a few non-destructive techniques with the standard approaches, we have characterized a group of chondrites from the Larkman Nunatak region using magnetic susceptibility and Moessbauer spectroscopy.

  12. A Three-Step Approach To Model Tree Mortality in the State of Georgia

    Treesearch

    Qingmin Meng; Chris J. Cieszewski; Roger C. Lowe; Michal Zasada

    2005-01-01

    Tree mortality is one of the most complex phenomena of forest growth and yield. Many types of factors affect tree mortality, which is considered difficult to predict. This study presents a new systematic approach to simulate tree mortality based on the integration of statistical models and geographical information systems. This method begins with variable preselection...

  13. A probability approach to sawtimber tree-value projections

    Treesearch

    Roger E. McCay; Paul S. DeBald

    1973-01-01

    The authors present a method for projecting hardwood sawtimber tree values, using tree-development probabilities based on continuous forest inventory (CFI) data and describe some ways to use the resulting value projections to assemble management-planning information.

  14. Genomic and physiological approaches to advancing forest tree improvement

    Treesearch

    C. Dana Nelson; Kurt H. Johnsen

    2008-01-01

    Summary The recent completion of a draft sequence of the poplar (Populus trichocarpa Torr. & Gray ex Brayshaw) genome has advanced forest tree genetics to an unprecedented level. A "parts list" for a forest tree has been produced, opening up new opportunities for dissecting the interworkings of tree growth and development. In the relatively near future we...

  15. A nearest neighbour approach by genetic distance to the assignment of individual trees to geographic origin.

    PubMed

    Degen, Bernd; Blanc-Jolivet, Céline; Stierand, Katrin; Gillet, Elizabeth

    2017-03-01

    During the past decade, the use of DNA for forensic applications has been extensively implemented for plant and animal species, as well as in humans. Tracing back the geographical origin of an individual usually requires genetic assignment analysis. These approaches are based on reference samples that are grouped into populations or other aggregates and intend to identify the most likely group of origin. Often this grouping does not have a biological but rather a historical or political justification, such as "country of origin". In this paper, we present a new nearest neighbour approach to individual assignment or classification within a given but potentially imperfect grouping of reference samples. This method, which is based on the genetic distance between individuals, functions better in many cases than commonly used methods. We demonstrate the operation of our assignment method using two data sets. One set is simulated for a large number of trees distributed in a 120km by 120km landscape with individual genotypes at 150 SNPs, and the other set comprises experimental data of 1221 individuals of the African tropical tree species Entandrophragma cylindricum (Sapelli) genotyped at 61 SNPs. Judging by the level of correct self-assignment, our approach outperformed the commonly used frequency and Bayesian approaches by 15% for the simulated data set and by 5-7% for the Sapelli data set. Our new approach is less sensitive to overlapping sources of genetic differentiation, such as genetic differences among closely-related species, phylogeographic lineages and isolation by distance, and thus operates better even for suboptimal grouping of individuals.
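
    The nearest neighbour assignment described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state which genetic distance is used, so the per-SNP allele-count distance, the function names, and the toy genotypes below are all assumptions made for the example.

```python
from collections import Counter


def genetic_distance(g1, g2):
    """Assumed distance: mean absolute difference in alternate-allele
    counts (0, 1 or 2 per SNP), scaled to the range 0..1."""
    if len(g1) != len(g2):
        raise ValueError("genotypes must cover the same SNPs")
    return sum(abs(a - b) for a, b in zip(g1, g2)) / (2 * len(g1))


def assign_group(query, references, k=1):
    """Assign `query` to the most common group among its k nearest
    reference individuals. `references` is a list of (genotype, group)."""
    ranked = sorted(references, key=lambda ref: genetic_distance(query, ref[0]))
    votes = Counter(group for _, group in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical reference samples: four individuals, four SNPs, two groups.
references = [
    ([0, 0, 1, 2], "A"),
    ([0, 1, 1, 2], "A"),
    ([2, 2, 0, 0], "B"),
    ([2, 1, 0, 0], "B"),
]
group_of_query = assign_group([0, 0, 2, 2], references)
```

    Because the vote is taken over individual reference samples rather than over group-level allele frequencies, a sketch of this kind does not require the reference grouping itself to be biologically clean, which is the property the abstract emphasizes.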

  16. A wrapper-based approach to image segmentation and classification.

    PubMed

    Farmer, Michael E; Jain, Anil K

    2005-12-01

    The traditional processing flow of segmentation followed by classification in computer vision assumes that the segmentation is able to successfully extract the object of interest from the background image. It is extremely difficult to obtain a reliable segmentation without any prior knowledge about the object that is being extracted from the scene. This is further complicated by the lack of any clearly defined metrics for evaluating the quality of segmentation or for comparing segmentation algorithms. We propose a method of segmentation that addresses both of these issues, by using the object classification subsystem as an integral part of the segmentation. This will provide contextual information regarding the objects to be segmented, as well as allow us to use the probability of correct classification as a metric to determine the quality of the segmentation. We view traditional segmentation as a filter operating on the image that is independent of the classifier, much like the filter methods for feature selection. We propose a new paradigm for segmentation and classification that follows the wrapper methods of feature selection. Our method wraps the segmentation and classification together, and uses the classification accuracy as the metric to determine the best segmentation. By using shape as the classification feature, we are able to develop a segmentation algorithm that relaxes the requirement that the object of interest to be segmented must be homogeneous in some low-level image parameter, such as texture, color, or grayscale. This represents an improvement over other segmentation methods that have used classification information only to modify the segmenter parameters, since these algorithms still require an underlying homogeneity in some parameter space. 
Rather than considering our method as yet another segmentation algorithm, we propose that our wrapper method can be considered as an image segmentation framework, within which existing image segmentation

  17. New Approach for Segmentation and Extraction of Single Tree from Point Clouds Data and Aerial Images

    NASA Astrophysics Data System (ADS)

    Homainejad, A. S.

    2016-06-01

    This paper presents a new approach for reconstructing 3D models of single trees from Airborne Laser Scanner (ALS) data and aerial images. The approach detects and extracts single trees from ALS data and aerial images. Existing approaches can provide bulk segmentation of a group of trees, although some methods have focused on detecting and extracting a particular tree from ALS data and images. Segmenting a single tree within a group of trees is usually impractical, because detecting the boundary lines between trees is tedious and often not feasible. In this approach, an experimental formula based on tree height was developed and applied to define the boundary lines between trees. As a result, each single tree was segmented and extracted, and a 3D model was later created. Trees extracted with this approach each have a unique identification and attributes. The output has applications in various fields of science and engineering, such as forestry, urban planning, and agriculture. In forestry, for example, the results can be used in studies of ecological diversity, biodiversity, and ecosystems.

  18. Trees

    ERIC Educational Resources Information Center

    Al-Khaja, Nawal

    2007-01-01

    This is a thematic lesson plan for young learners about palm trees and the importance of taking care of them. The two part lesson teaches listening, reading and speaking skills. The lesson includes parts of a tree; the modal auxiliary, can; dialogues and a role play activity.

  19. Nearest feature line embedding approach to hyperspectral image classification

    NASA Astrophysics Data System (ADS)

    Chang, Yang-Lang; Liu, Jin-Nan; Han, Chin-Chuan; Chen, Ying-Nong; Hsieh, Tung-Ju; Huang, Bormin

    2012-10-01

    In this paper, a nearest feature line (NFL) embedding transformation is proposed for dimension reduction of hyperspectral image (HSI). Eigenspace projection approaches are generally used for feature extraction of HSI in remote sensing image classification. In order to improve the classification accuracy, the feature vectors of high dimensions are reduced to the low dimensionalities by the effective projection transformation. Similarly, the proposed NFL measurement is embedded into the transformation during the discriminant analysis stage instead of the matching stage. The class separability, neighborhood structure preservation, and NFL measurement are also simultaneously considered to find the effective and discriminating transformation in eigenspaces for image classification. The nearest neighbor classifier is used to show the discriminative performance. The proposed NFL embedding transformation is compared with several conventional state-of-the-art algorithms. It was evaluated by the AVIRIS data sets of Northwest Tippecanoe County. Experimental results have demonstrated that NFL embedding method is an effective transformation for dimension reduction in land cover classification of earth remote sensing.

  20. A hybrid ensemble learning approach to star-galaxy classification

    NASA Astrophysics Data System (ADS)

    Kim, Edward J.; Brunner, Robert J.; Carrasco Kind, Matias

    2015-10-01

    There exist a variety of star-galaxy classification techniques, each with their own strengths and weaknesses. In this paper, we present a novel meta-classification framework that combines and fully exploits different techniques to produce a more robust star-galaxy classification. To demonstrate this hybrid, ensemble approach, we combine a purely morphological classifier, a supervised machine learning method based on random forest, an unsupervised machine learning method based on self-organizing maps, and a hierarchical Bayesian template-fitting method. Using data from the CFHTLenS survey (Canada-France-Hawaii Telescope Lensing Survey), we consider different scenarios: when a high-quality training set is available with spectroscopic labels from DEEP2 (Deep Extragalactic Evolutionary Probe Phase 2), SDSS (Sloan Digital Sky Survey), VIPERS (VIMOS Public Extragalactic Redshift Survey), and VVDS (VIMOS VLT Deep Survey), and when the demographics of sources in a low-quality training set do not match the demographics of objects in the test data set. We demonstrate that our Bayesian combination technique improves the overall performance over any individual classification method in these scenarios. Thus, strategies that combine the predictions of different classifiers may prove to be optimal in currently ongoing and forthcoming photometric surveys, such as the Dark Energy Survey and the Large Synoptic Survey Telescope.

  1. Improved wetland remote sensing in Yellowstone National Park using classification trees to combine TM imagery and ancillary environmental data

    USGS Publications Warehouse

    Wright, C.; Gallant, A.

    2007-01-01

    The U.S. Fish and Wildlife Service uses the term palustrine wetland to describe vegetated wetlands traditionally identified as marsh, bog, fen, swamp, or wet meadow. Landsat TM imagery was combined with image texture and ancillary environmental data to model probabilities of palustrine wetland occurrence in Yellowstone National Park using classification trees. Model training and test locations were identified from National Wetlands Inventory maps, and classification trees were built for seven years spanning a range of annual precipitation. At a coarse level, palustrine wetland was separated from upland. At a finer level, five palustrine wetland types were discriminated: aquatic bed (PAB), emergent (PEM), forested (PFO), scrub–shrub (PSS), and unconsolidated shore (PUS). TM-derived variables alone were relatively accurate at separating wetland from upland, but model error rates dropped incrementally as image texture, DEM-derived terrain variables, and other ancillary GIS layers were added. For classification trees making use of all available predictors, average overall test error rates were 7.8% for palustrine wetland/upland models and 17.0% for palustrine wetland type models, with consistent accuracies across years. However, models were prone to wetland over-prediction. While the predominant PEM class was classified with omission and commission error rates less than 14%, we had difficulty identifying the PAB and PSS classes. Ancillary vegetation information greatly improved PSS classification and moderately improved PFO discrimination. Association with geothermal areas distinguished PUS wetlands. Wetland over-prediction was exacerbated by class imbalance in likely combination with spatial and spectral limitations of the TM sensor. Wetland probability surfaces may be more informative than hard classification, and appear to respond to climate-driven wetland variability. The developed method is portable, relatively easy to implement, and should be applicable in

  2. Newer classification and regression tree techniques: Bagging and Random Forests for ecological prediction

    Treesearch

    Anantha M. Prasad; Louis R. Iverson; Andy Liaw

    2006-01-01

    We evaluated four statistical models - Regression Tree Analysis (RTA), Bagging Trees (BT), Random Forests (RF), and Multivariate Adaptive Regression Splines (MARS) - for predictive vegetation mapping under current and future climate scenarios according to the Canadian Climate Centre global circulation model.
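
    As a rough illustration of the bagging idea named in the record above, the sketch below averages one-split regression trees (stumps) fitted to bootstrap resamples of a single predictor. It is a teaching sketch under stated assumptions, not the models evaluated in the study: the stump learner, the function names, and the toy data are all hypothetical.

```python
import random


def fit_stump(xs, ys):
    """Fit a one-split regression tree on one predictor by choosing the
    threshold that minimizes the summed squared error of the two leaves."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no valid threshold between identical x values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        if best is None or sse < best[0]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (sse, threshold, left_mean, right_mean)
    if best is None:  # all x values identical: always predict the mean
        mean = sum(ys) / len(ys)
        return (float("inf"), mean, mean)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def bagged_stumps(xs, ys, n_trees=25, seed=0):
    """Fit one stump per bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return stumps


def predict_bagged(stumps, x):
    """Bagged prediction: the average of the individual tree predictions."""
    return sum(predict_stump(s, x) for s in stumps) / len(stumps)


# Made-up data with a step between x = 4 and x = 5.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1]
stumps = bagged_stumps(xs, ys)
```

    Random Forests extend this scheme by also restricting each split to a random subset of the predictors, which a single-predictor sketch cannot show.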

  3. Classification of Parkinsonian Syndromes from FDG-PET Brain Data Using Decision Trees with SSM/PCA Features

    PubMed Central

    Mudali, D.; Teune, L. K.; Renken, R. J.; Leenders, K. L.; Roerdink, J. B. T. M.

    2015-01-01

    Medical imaging techniques like fluorodeoxyglucose positron emission tomography (FDG-PET) have been used to aid in the differential diagnosis of neurodegenerative brain diseases. In this study, the objective is to classify FDG-PET brain scans of subjects with Parkinsonian syndromes (Parkinson's disease, multiple system atrophy, and progressive supranuclear palsy) compared to healthy controls. The scaled subprofile model/principal component analysis (SSM/PCA) method was applied to FDG-PET brain image data to obtain covariance patterns and corresponding subject scores. The latter were used as features for supervised classification by the C4.5 decision tree method. Leave-one-out cross validation was applied to determine classifier performance. We carried out a comparison with other types of classifiers. The big advantage of decision tree classification is that the results are easy to understand by humans. A visual representation of decision trees strongly supports the interpretation process, which is very important in the context of medical diagnosis. Further improvements are suggested based on enlarging the number of the training data, enhancing the decision tree method by bagging, and adding additional features based on (f)MRI data. PMID:25918550

  4. Land cover and forest formation distributions for St. Kitts, Nevis, St. Eustatius, Grenada and Barbados from decision tree classification of cloud-cleared satellite imagery

    USGS Publications Warehouse

    Helmer, E.H.; Kennaway, T.A.; Pedreros, D.H.; Clark, M.L.; Marcano-Vega, H.; Tieszen, L.L.; Ruzycki, T.R.; Schill, S.R.; Carrington, C.M.S.

    2008-01-01

    Satellite image-based mapping of tropical forests is vital to conservation planning. Standard methods for automated image classification, however, limit classification detail in complex tropical landscapes. In this study, we test an approach to Landsat image interpretation on four islands of the Lesser Antilles, including Grenada and St. Kitts, Nevis and St. Eustatius, testing a more detailed classification than earlier work in the latter three islands. Secondly, we estimate the extents of land cover and protected forest by formation for five islands and ask how land cover has changed over the second half of the 20th century. The image interpretation approach combines image mosaics and ancillary geographic data, classifying the resulting set of raster data with decision tree software. Cloud-free image mosaics for one or two seasons were created by applying regression tree normalization to scene dates that could fill cloudy areas in a base scene. Such mosaics are also known as cloud-filled, cloud-minimized or cloud-cleared imagery, mosaics, or composites. The approach accurately distinguished several classes that more standard methods would confuse; the seamless mosaics aided reference data collection; and the multiseason imagery allowed us to separate drought deciduous forests and woodlands from semi-deciduous ones. Cultivated land areas declined 60 to 100 percent from about 1945 to 2000 on several islands. Meanwhile, forest cover has increased 50 to 950%. This trend will likely continue where sugar cane cultivation has dominated. Like the island of Puerto Rico, most higher-elevation forest formations are protected in formal or informal reserves. Also similarly, lowland forests, which are drier forest types on these islands, are not well represented in reserves. Former cultivated lands in lowland areas could provide lands for new reserves of drier forest types. 
The land-use history of these islands may provide insight for planners in countries currently considering

  5. Emerald ash borer (Agrilus planipennis): Towards a classification of tree health and early detection

    Treesearch

    Matthew P. Peters; Louis R. Iverson; T. Davis Sydnor

    2009-01-01

    Forty-five green ash (Fraxinus pennsylvanica) street trees in Toledo, Ohio were photographed, measured, and visually rated for conditions related to emerald ash borer (Agrilus planipennis) (EAB) attacks. These trees were later removed, and sections were examined from each tree to determine the length of time that growth rates had...

  6. Using PPI network autocorrelation in hierarchical multi-label classification trees for gene function prediction

    PubMed Central

    2013-01-01

    Background: Ontologies and catalogs of gene functions, such as the Gene Ontology (GO) and MIPS-FUN, assume that functional classes are organized hierarchically, that is, general functions include more specific ones. This has recently motivated the development of several machine learning algorithms for gene function prediction that leverage this hierarchical organization where instances may belong to multiple classes. In addition, it is possible to exploit relationships among examples, since it is plausible that related genes tend to share functional annotations. Although these relationships have been identified and extensively studied in the area of protein-protein interaction (PPI) networks, they have not received much attention in hierarchical and multi-class gene function prediction. Relations between genes introduce autocorrelation in functional annotations and violate the assumption that instances are independently and identically distributed (i.i.d.), which underlies most machine learning algorithms. Although the explicit consideration of these relations brings additional complexity to the learning process, we expect substantial benefits in predictive accuracy of learned classifiers. Results: This article demonstrates the benefits (in terms of predictive accuracy) of considering autocorrelation in multi-class gene function prediction. We develop a tree-based algorithm for considering network autocorrelation in the setting of Hierarchical Multi-label Classification (HMC). We empirically evaluate the proposed algorithm, called NHMC (Network Hierarchical Multi-label Classification), on 12 yeast datasets using each of the MIPS-FUN and GO annotation schemes and exploiting 2 different PPI networks. The results clearly show that taking autocorrelation into account improves the predictive performance of the learned models for predicting gene function. 
Conclusions: Our newly developed method for HMC takes into account network information in the learning phase: When

  7. A lazy data mining approach for protein classification.

    PubMed

    Merschmann, Luiz; Plastino, Alexandre

    2007-03-01

    In this work, we propose a new computational technique to solve the protein classification problem. The goal is to predict the functional family of novel protein sequences based on their motif composition. In order to improve the results obtained with other known approaches, we propose a new data mining technique for protein classification based on Bayes' theorem, called highest subset probability (HiSP). To evaluate our proposal, datasets extracted from Prosite, a curated protein family database, are used as experimental datasets. The computational results have shown that the proposed method outperforms other known methods for all tested datasets and looks very promising for problems with characteristics similar to the problem addressed here. In addition, our experiments suggest that HiSP performs well on highly imbalanced datasets.

  8. A Distributed Artificial Intelligence Approach To Object Identification And Classification

    NASA Astrophysics Data System (ADS)

    Sikka, Digvijay I.; Varshney, Pramod K.; Vannicola, Vincent C.

    1989-09-01

    This paper presents an application of Distributed Artificial Intelligence (DAI) tools to the data fusion and classification problem. Our approach is to use a blackboard for information management and hypothesis formulation. The blackboard is used by the knowledge sources (KSs) for sharing information and posting their hypotheses, just as experts sitting around a round table would do. The present simulation performs classification of an aircraft (AC), after identifying it by its features, into disjoint sets (object classes) comprising five commercial ACs: Boeing 747, Boeing 707, DC10, Concorde and Boeing 727. A situation data base is characterized by experimental data available from the three levels of expert reasoning. Ohio State University ElectroScience Laboratory provided this experimental data. To validate the architecture presented, we employ two KSs for modeling the sensors, aspect angle polarization feature and the ellipticity data. The system has been implemented on Symbolics 3645, under Genera 7.1, in Common LISP.

  9. A methodological approach to the classification of dermoscopy images

    PubMed Central

    Celebi, M. Emre; Kingravi, Hassan A.; Uddin, Bakhtiyar; Iyatomi, Hitoshi; Aslandogan, Y. Alp; Stoecker, William V.; Moss, Randy H.

    2011-01-01

    In this paper a methodological approach to the classification of pigmented skin lesions in dermoscopy images is presented. First, automatic border detection is performed to separate the lesion from the background skin. Shape features are then extracted from this border. For the extraction of color and texture related features, the image is divided into various clinically significant regions using the Euclidean distance transform. This feature data is fed into an optimization framework, which ranks the features using various feature selection algorithms and determines the optimal feature subset size according to the area under the ROC curve measure obtained from support vector machine classification. The issue of class imbalance is addressed using various sampling strategies, and the classifier generalization error is estimated using Monte Carlo cross validation. Experiments on a set of 564 images yielded a specificity of 92.34% and a sensitivity of 93.33%. PMID:17387001
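
    The abstract above reports its results as specificity and sensitivity. For readers less familiar with these measures, the sketch below shows how they are computed from true and predicted labels. The function name and the labels are made up for illustration and are unrelated to the study's 564 images.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary labels.

    sensitivity = TP / (TP + FN): share of true positives that are detected
    specificity = TN / (TN + FP): share of true negatives that are cleared
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical labels: 1 = positive class, 0 = negative class.
y_true = [1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
```

    Reporting both measures matters under class imbalance, which the abstract addresses explicitly, because overall accuracy alone can look high when the minority class is mostly missed.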

  10. Investigating the Utility of Oblique Tree-Based Ensembles for the Classification of Hyperspectral Data

    PubMed Central

    Poona, Nitesh; van Niekerk, Adriaan; Ismail, Riyad

    2016-01-01

    Ensemble classifiers are being widely used for the classification of spectroscopic data. In this regard, the random forest (RF) ensemble has been successfully applied in an array of applications, and has proven to be robust in handling high dimensional data. More recently, several variants of the traditional RF algorithm including rotation forest (rotF) and oblique random forest (oRF) have been applied to classifying high dimensional data. In this study we compare the traditional RF, rotF, and oRF (using three different splitting rules, i.e., ridge regression, partial least squares, and support vector machine) for the classification of healthy and infected Pinus radiata seedlings using high dimensional spectroscopic data. We further test the robustness of these five ensemble classifiers to reduced spectral resolution by spectral resampling (binning) of the original spectral bands. The results showed that the three oblique random forest ensembles outperformed both the traditional RF and rotF ensembles. Additionally, the rotF ensemble proved to be the least robust of the five ensembles tested. Spectral resampling of the original bands provided mixed results. Nevertheless, the results demonstrate that using spectral resampled bands is a promising approach to classifying asymptomatic stress in Pinus radiata seedlings. PMID:27854290

  11. Schistosoma mansoni reinfection: Analysis of risk factors by classification and regression tree (CART) modeling

    PubMed Central

    Oliveira-Prado, Roberta; Matoso, Leonardo Ferreira; Veloso, Bráulio M.; Andrade, Gisele; Kloos, Helmut; Bethony, Jeffrey M.; Assunção, Renato M.; Correa-Oliveira, Rodrigo

    2017-01-01

    Praziquantel (PZQ) is an effective chemotherapy for schistosomiasis mansoni and a mainstay for its control and potential elimination. However, it does not protect against reinfection, which can occur rapidly in areas with active transmission. A guide to ranking the risk factors for Schistosoma mansoni reinfection would greatly contribute to prioritizing resources and focusing prevention and control measures to prevent rapid reinfection. The objective of the current study was to explore the relationship among the socioeconomic, demographic, and epidemiological factors that can influence reinfection by S. mansoni one year after successful treatment with PZQ in school-aged children in Northeastern Minas Gerais state, Brazil. Parasitological, socioeconomic, demographic, and water contact information were surveyed in 506 S. mansoni-infected individuals, aged 6 to 15 years, resident in these endemic areas. Eligible individuals were treated with PZQ until they were determined to be negative by the absence of S. mansoni eggs in the feces on two consecutive days of Kato-Katz fecal thick smear. These individuals were surveyed again 12 months from the date of successful treatment with PZQ. Classification and regression tree (CART) modeling was then used to explore the relationship between socioeconomic, demographic, and epidemiological variables and their reinfection status. The most important risk factor identified for S. mansoni reinfection was their “heavy” infection at baseline. Additional analyses, excluding heavy infection status, showed that lower socioeconomic status and a lower level of education of the household head were also important risk factors for S. mansoni reinfection. Our results provide an important contribution toward the control and possible elimination of schistosomiasis by identifying three major risk factors that can be used for targeted treatment and monitoring of reinfection. We suggest that control measures that target heavily infected

  12. Risk assessment of dental caries by using Classification and Regression Trees.

    PubMed

    Ito, Ataru; Hayashi, Mikako; Hamasaki, Toshimitsu; Ebisu, Shigeyuki

    2011-06-01

    Being able to predict an individual's risks of dental caries would offer a potentially huge natural step forward toward better oral health. As things stand, preventive treatment against caries is mostly carried out without risk assessment because there is no proven way to analyse an individual's risk factors. The purpose of this study was to try to identify those patients with high and low risk of caries by using Classification and Regression Trees (CART). In this historical cohort study, data from 442 patients in a general practice who met the inclusion criteria were analysed. CART was applied to the data to seek a model for predicting caries by using the following parameters according to each patient: age, number of carious teeth, numbers of cariogenic bacteria, the secretion rate and buffer capacity of saliva, and compliance with a prevention programme. The risks of caries were presented by odds ratios. Multiple logistic regression analysis was performed to confirm the results obtained by CART. CART identified high and low risk patients for primary caries with relative odds ratios of 0.41 (95%CI: 0.22-0.77, p = 0.0055) and 2.88 (95%CI: 1.49-5.59, p = 0.0018) according to the numbers of cariogenic bacteria. High and low risk patients for secondary caries were also identified with the odds ratios of 0.07 (95%CI: 0.01-0.55, p = 0.00109) and 7.00 (95%CI: 3.50-13.98, p < 0.0001) according to the numbers of bacteria and existing caries. Cariogenic bacteria play a leading role in the incidence of caries. CART proved effective in identifying an individual patient's risk of caries. Copyright © 2011 Elsevier Ltd. All rights reserved.
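
    The abstract above presents risk as odds ratios with 95% confidence intervals. The sketch below shows one common way such figures are obtained from a 2x2 table, using the log-odds (Woolf) interval. Whether the study used this interval is not stated in the abstract, so the method, the function name, and the counts here are assumptions for illustration only.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% CI from a 2x2 table.

    a: in subgroup, with outcome      b: in subgroup, without outcome
    c: outside subgroup, with outcome d: outside subgroup, without outcome
    All four counts must be positive for the log-based interval.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, not taken from the study.
odds_ratio, lower, upper = odds_ratio_ci(10, 20, 5, 40)
```

    An interval that excludes 1, as all four intervals quoted in the abstract do, indicates that the subgroup's odds differ from the comparison group's at the 5% level.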

  13. Risk stratification for 1-year mortality in acute heart failure: classification and regression tree analysis.

    PubMed

    Arenja, Nisha; Breidthardt, Tobias; Socrates, Thenral; Schindler, Christian; Heinisch, Corinna; Tschung, Christopher; Potocki, Mihael; Gualandro, Danielle; Mueller, Christian

    2011-10-09

    Simple tools for risk stratification of patients with acute heart failure (AHF) are an unmet clinical need, particularly regarding long-term mortality. We prospectively enrolled 610 consecutive patients presenting to the emergency department with AHF. The diagnosis of AHF was adjudicated by two independent cardiologists. The classification and regression tree (CART) analysis was used to develop a simple risk algorithm. This was internally validated by cross-validation. One-year follow-up was complete in all patients (100%). A total of 201 patients (33%) died within 360 days. The CART analysis identified blood urea nitrogen (BUN) and age as the best single predictors of 1-year mortality and patients were categorised to three risk groups: high risk group (BUN >27.5 mg/dl and age >86 years), intermediate risk group (BUN >27.5 mg/dl and age ≤ 86 years) and low risk group (BUN ≤ 27.5 mg/dl). The Kaplan-Meier curves showed a significant increase in mortality in the high risk group compared with the lower risk groups (log-rank test p <0.001). The hazard ratio regarding 1-year mortality between patients identified as low and high risk was 2.0 (95% confidence interval, 1.7-2.4), with statistically significant differences between all risk groups (p <0.001). The likelihood-based 95%-confidence set for the age- and the urea-threshold is contained in the rectangular set defined by 25 mg/dl ≤ urea threshold ≤30.6 mg/dl and 76 years ≤ age threshold ≤96 years. These results suggest that AHF patients at low, intermediate and high risk for death within 360 days can be easily identified using patient's demographics and laboratory data obtained at presentation. Application of this simple risk stratification algorithm may help to improve the management of these patients.
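
    The three risk groups reported in the abstract above amount to a two-split decision rule, which can be written out directly. The thresholds come from the abstract; the function name and return labels are illustrative, and the sketch is not a clinical tool.

```python
def risk_group(bun_mg_dl: float, age_years: float) -> str:
    """Classify a patient into the abstract's three 1-year mortality risk
    groups using blood urea nitrogen (BUN, mg/dl) and age (years)."""
    if bun_mg_dl <= 27.5:
        return "low"            # BUN <= 27.5 mg/dl, regardless of age
    if age_years > 86:
        return "high"           # BUN > 27.5 mg/dl and age > 86 years
    return "intermediate"       # BUN > 27.5 mg/dl and age <= 86 years


groups = [risk_group(20.0, 90), risk_group(30.0, 70), risk_group(30.0, 90)]
```

    The first split uses BUN alone, so age only matters for patients above the BUN threshold. This ordering is what makes the rule a tree rather than a two-factor score.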

  14. Classification tree analysis of postal questionnaire data to identify risk of excessive gestational weight gain.

    PubMed

    Fuller-Tyszkiewicz, Matthew; Skouteris, Helen; Hill, Briony; Teede, Helena; McPhie, Skye

    2016-01-01

    Overweight/obese weight status during pregnancy increases risk of a range of adverse health outcomes for mother and child. Whereas identification of those who are overweight/obese pre-pregnancy and in early pregnancy is straightforward, prediction of who will experience excessive gestational weight gain (EGWG), and thus be at greater risk of becoming overweight or obese during pregnancy, is more challenging. The present study sought to better identify those at risk of EGWG by exploring pre-pregnancy BMI as well as a range of psychosocial variables identified as risk factors in prior research. A total of 225 pregnant women completed self-report measures of height, weight, and psychosocial variables via postal survey at 16-18 weeks gestation, and reported their weight again at 32-34 weeks to calculate GWG. Classification and regression tree analysis (CART) was used to find subgroups in the data with increased risk of EGWG based on their pre-pregnancy BMI and psychosocial risk factor scores at Time 1. CART confirmed that self-reported BMI status was a strong predictor of EGWG risk for women who were overweight/obese pre-pregnancy. Normal weight women with low motivation to maintain a healthy diet and who reported lower levels of partner support were also at considerable risk of EGWG. The present findings offer support for inclusion of psychosocial measures (in addition to BMI) in early antenatal visits to detect risk of EGWG. However, these findings also underscore the need for further consideration of effect modifiers that place women at increased or decreased risk of EGWG. Proposed additional constructs are discussed to direct further theory-driven research. Crown Copyright © 2015. Published by Elsevier Ltd. All rights reserved.

  15. Availability and Capacity of Substance Abuse Programs in Correctional Settings: A Classification and Regression Tree Analysis

    PubMed Central

    Kitsantas, Panagiota

    2009-01-01

    Objective: The purpose of this study was to investigate the structural and organizational factors that contribute to the availability and increased capacity for substance abuse treatment programs in correctional settings. We used Classification and Regression Tree statistical procedures to identify how multi-level data can explain the variability in availability and capacity of substance abuse treatment programs in jails and probation/parole offices. Methods: The data for this study combined the National Criminal Justice Treatment Practices survey (NCJTP) and the 2000 Census. The NCJTP survey was a nationally representative sample of correctional administrators for jails and probation/parole agencies. The sample included 295 substance abuse treatment programs that were classified according to the intensity of their services: high, medium, and low. The independent variables included jurisdictional-level structural variables, attributes of the correctional administrators, and program and service delivery characteristics of the correctional agency. Results: The two most important variables in predicting the availability of all three types of services were stronger working relationships with other organizations and the adoption of a standardized substance abuse screening tool by correctional agencies. For high and medium intensity programs, the capacity increased when an organizational learning strategy was used by administrators and the organization used a substance abuse screening tool. Implications for advancing treatment practices in correctional settings are discussed, including further work to test theories on how to better understand access to intensive treatment services. This study presents the first phase of understanding capacity-related issues regarding treatment programs offered in correctional settings. PMID:19395204

  16. Using Classification and Regression Trees (CART) to Identify Prescribing Thresholds for Cardiovascular Disease.

    PubMed

    Schilling, Chris; Mortimer, Duncan; Dalziel, Kim; Heeley, Emma; Chalmers, John; Clarke, Philip

    2016-02-01

    Many guidelines for clinical decisions are hierarchical and nonlinear. Evaluating whether these guidelines are used in practice requires methods that can identify such structures and thresholds. Classification and regression trees (CART) were used to analyse prescribing patterns of Australian general practitioners (GPs) for the primary prevention of cardiovascular disease (CVD). Our aim was to identify whether GPs use absolute risk (AR) guidelines in favour of individual risk factors to inform their prescribing decisions of lipid-lowering medications. We employed administrative prescribing information that is linked to patient-level data from a clinical assessment and patient survey (the AusHeart Study), and assessed prescribing of lipid-lowering medications over a 12-month period for patients (n = 1903) who were not using such medications prior to recruitment. CART models were developed to explain prescribing practice. Out-of-sample performance was evaluated using receiver operating characteristic (ROC) curves, and optimised via pruning. We found that individual risk factors (low-density lipoprotein, diabetes, triglycerides and a history of CVD), GP-estimated rather than Framingham AR, and sociodemographic factors (household income, education) were the predominant drivers of GP prescribing. However, sociodemographic factors and some individual risk factors (triglycerides and CVD history) only become relevant for patients with a particular profile of other risk factors. The ROC area under the curve was 0.63 (95% confidence interval [CI] 0.60-0.64). There is little evidence that AR guidelines recommended by the National Heart Foundation and National Vascular Disease Prevention Alliance, or conditional individual risk eligibility guidelines from the Pharmaceutical Benefits Scheme, are adopted in prescribing practice. The hierarchy of conditional relationships between risk factors and socioeconomic factors identified by CART provides new insights into prescribing decisions.
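
    The ROC area under the curve reported here can be read as the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case, with ties counted as one half. A minimal sketch of that pairwise computation follows; the function name `roc_auc` and the example scores are assumptions for illustration and are unrelated to the AusHeart data.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores.

    `labels` holds 1 for positive cases and 0 for negative cases.
    Runs in O(P * N) time, which is fine for a small illustration.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Made-up scores: three of the four positive/negative pairs are ordered correctly.
    print(roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))
```

    A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation.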

  17. "Trees and Things That Live in Trees": Three Children with Special Needs Experience the Project Approach

    ERIC Educational Resources Information Center

    Griebling, Susan; Elgas, Peg; Konerman, Rachel

    2015-01-01

    The authors report on research conducted during a project investigation undertaken with preschool children, ages 3-5. The report focuses on three children with special needs and the positive outcomes for each child as they engaged in the project Trees and Things That Live in Trees. Two of the children were diagnosed with developmental delays, and…

  18. Cluster Stability Estimation Based on a Minimal Spanning Trees Approach

    NASA Astrophysics Data System (ADS)

    Volkovich, Zeev (Vladimir); Barzily, Zeev; Weber, Gerhard-Wilhelm; Toledano-Kitai, Dvora

    2009-08-01

    Among the areas of data and text mining which are employed today in science, economy and technology, clustering theory serves as a preprocessing step in data analysis. However, there are many open questions still waiting for a theoretical and practical treatment, e.g., the problem of determining the true number of clusters has not been satisfactorily solved. In the current paper, this problem is addressed by the cluster stability approach. For several possible numbers of clusters we estimate the stability of partitions obtained from clustering of samples. Partitions are considered consistent if their clusters are stable. Cluster validity is measured as the total number of edges, in the clusters' minimal spanning trees, connecting points from different samples. That is, we use the Friedman and Rafsky two-sample test statistic. The homogeneity hypothesis, of well mingled samples within the clusters, leads to an asymptotically normal distribution of the considered statistic. Resting upon this fact, the standard score of the mentioned edges quantity is set, and the partition quality is represented by the worst cluster corresponding to the minimal standard score value. It is natural to expect that the true number of clusters can be characterized by the empirical distribution having the shortest left tail. The proposed methodology sequentially creates the described value distribution and estimates its left-asymmetry. Numerical experiments, presented in the paper, demonstrate the ability of the approach to detect the true number of clusters.
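
    The quantity at the core of this abstract, the number of minimal spanning tree edges that join points drawn from different samples, can be sketched in a few lines. The code below builds a Euclidean minimal spanning tree with Prim's algorithm and counts the cross-sample edges. The names `mst_edges` and `cross_sample_edges` are hypothetical, and the sketch covers only the raw edge count, not the standard score or the stability procedure described in the paper.

```python
import math


def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph over `points`.

    Returns the tree as a list of (i, j) index pairs.
    """
    n = len(points)
    if n < 2:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n   # cheapest known link from the tree to each vertex
    best_from = [-1] * n         # tree vertex providing that link
    in_tree[0] = True
    for j in range(1, n):
        best_dist[j] = math.dist(points[0], points[j])
        best_from[j] = 0

    edges = []
    for _ in range(n - 1):
        # Attach the outside vertex that is closest to the current tree.
        nxt = min((j for j in range(n) if not in_tree[j]), key=lambda j: best_dist[j])
        edges.append((best_from[nxt], nxt))
        in_tree[nxt] = True
        for j in range(n):
            if not in_tree[j]:
                d = math.dist(points[nxt], points[j])
                if d < best_dist[j]:
                    best_dist[j] = d
                    best_from[j] = nxt
    return edges


def cross_sample_edges(points, sample_ids):
    """Count MST edges whose endpoints come from different samples."""
    return sum(1 for i, j in mst_edges(points) if sample_ids[i] != sample_ids[j])


if __name__ == "__main__":
    # Two well-separated samples: only one edge has to bridge them.
    separated = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
    print(cross_sample_edges(separated, [0, 0, 1, 1]))
    # Two interleaved samples on a line: every edge bridges them.
    mingled = [(0.0,), (1.0,), (2.0,), (3.0,)]
    print(cross_sample_edges(mingled, [0, 1, 0, 1]))
```

    Well mingled samples produce many cross-sample edges and separated samples produce few, which is why the count can serve as an indicator of how homogeneous a cluster is.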

  19. A Transform-Based Feature Extraction Approach for Motor Imagery Tasks Classification

    PubMed Central

    Khorshidtalab, Aida; Mesbah, Mostefa; Salami, Momoh J. E.

    2015-01-01

    In this paper, we present a new motor imagery classification method in the context of electroencephalography (EEG)-based brain–computer interface (BCI). This method uses a signal-dependent orthogonal transform, referred to as linear prediction singular value decomposition (LP-SVD), for feature extraction. The transform defines the mapping as the left singular vectors of the LP coefficient filter impulse response matrix. Using a logistic tree-based model classifier, the extracted features are classified into one of four motor imagery movements. The proposed approach was first benchmarked against two related state-of-the-art feature extraction approaches, namely, discrete cosine transform (DCT) and adaptive autoregressive (AAR)-based methods. By achieving an accuracy of 67.35%, the LP-SVD approach outperformed the other approaches by large margins (25% compared with DCT and 6% compared with AAR-based methods). To further improve the discriminatory capability of the extracted features and reduce the computational complexity, we enlarged the extracted feature subset by incorporating two extra features, namely, the Q- and Hotelling’s $T^{2}$ statistics of the transformed EEG, and introduced a new EEG channel selection method. The performance of the EEG classification based on the expanded feature set and channel selection method was compared with that of a number of the state-of-the-art classification methods previously reported with the BCI IIIa competition data set. Our method came second with an average accuracy of 81.38%. PMID:27170898

  20. Use of classification trees to apportion single echo detections to species: Application to the pelagic fish community of Lake Superior

    USGS Publications Warehouse

    Yule, Daniel L.; Adams, Jean V.; Hrabik, Thomas R.; Vinson, Mark R.; Woiak, Zebadiah; Ahrenstroff, Tyler D.

    2013-01-01

    Acoustic methods are used to estimate the density of pelagic fish in large lakes with results of midwater trawling used to assign species composition. Apportionment in lakes having mixed species can be challenging because only a small fraction of the water sampled acoustically is sampled with trawl gear. Here we describe a new method where single echo detections (SEDs) are assigned to species based on classification tree models developed from catch data that separate species based on fish size and the spatial habitats they occupy. During the summer of 2011, we conducted a spatially-balanced lake-wide acoustic and midwater trawl survey of Lake Superior. A total of 51 sites in four bathymetric depth strata (0–30 m, 30–100 m, 100–200 m, and >200 m) were sampled. We developed classification tree models for each stratum and found fish length was the most important variable for separating species. To apply these trees to the acoustic data, we needed to identify a target strength to length (TS-to-L) relationship appropriate for all abundant Lake Superior pelagic species. We tested performance of 7 general (i.e., multi-species) relationships derived from three published studies. The best-performing relationship was identified by comparing predicted and observed catch compositions using a second independent Lake Superior data set. Once identified, the relationship was used to predict lengths of SEDs from the lake-wide survey, and the classification tree models were used to assign each SED to a species. Exotic rainbow smelt (Osmerus mordax) were the most common species at bathymetric depths 100 m (384 million; 6.0 kt). Cisco (Coregonus artedi) were widely distributed over all strata with their population estimated at 182 million (44 kt). The apportionment method we describe should be transferable to other large lakes provided fish are not tightly aggregated, and an appropriate TS-to-L relationship for abundant pelagic fish species can be determined.

  1. Quad-polarized synthetic aperture radar and multispectral data classification using classification and regression tree and support vector machine-based data fusion system

    NASA Astrophysics Data System (ADS)

    Bigdeli, Behnaz; Pahlavani, Parham

    2017-01-01

    Interpretation of synthetic aperture radar (SAR) data processing is difficult because the geometry and spectral range of SAR are different from optical imagery. Consequently, SAR imaging can provide data complementary to multispectral (MS) optical remote sensing techniques because it does not depend on solar illumination and weather conditions. This study presents a multisensor fusion of SAR and MS data based on the use of classification and regression tree (CART) and support vector machine (SVM) through a decision fusion system. First, different feature extraction strategies were applied on SAR and MS data to produce more spectral and textural information. To overcome the redundancy and correlation between features, an intrinsic dimension estimation method based on noise-whitened Harsanyi, Farrand, and Chang determines the proper dimension of the features. Then, principal component analysis and independent component analysis were utilized on the stacked feature space of the two data sets. Afterward, SVM and CART classified each reduced feature space. Finally, a fusion strategy was utilized to fuse the classification results. To show the effectiveness of the proposed methodology, classification of each data set alone was compared with the obtained results. A coregistered Radarsat-2 and WorldView-2 data set from San Francisco, USA, was available to examine the effectiveness of the proposed method. The results show that combinations of SAR data with optical sensor data based on the proposed methodology improve the classification results for most of the classes. The proposed fusion method provided approximately 93.24% and 95.44% for two different areas of the data.

  2. A scalable approach for tree segmentation within small-footprint airborne LiDAR data

    NASA Astrophysics Data System (ADS)

    Hamraz, Hamid; Contreras, Marco A.; Zhang, Jun

    2017-05-01

    This paper presents a distributed approach that scales up to segment tree crowns within a LiDAR point cloud representing an arbitrarily large forested area. The approach uses a single-processor tree segmentation algorithm as a building block in order to process the data delivered in the shape of tiles in parallel. The distributed processing is performed in a master-slave manner, in which the master maintains the global map of the tiles and coordinates the slaves that segment tree crowns within and across the boundaries of the tiles. A minimal bias was introduced to the number of detected trees because of trees lying across the tile boundaries, which was quantified and adjusted for. Theoretical and experimental analyses of the runtime of the approach revealed a near linear speedup. The estimated number of trees categorized by crown class and the associated error margins as well as the height distribution of the detected trees aligned well with field estimations, verifying that the distributed approach works correctly. The approach enables providing information of individual tree locations and point cloud segments for a forest-level area in a timely manner, which can be used to create detailed remotely sensed forest inventories. Although the approach was presented for tree segmentation within LiDAR point clouds, the idea can also be generalized to scale up processing other big spatial datasets.

  3. Differences in forest area classification based on tree tally from variable- and fixed-radius plots

    Treesearch

    David Azuma; Vicente J. Monleon

    2011-01-01

    In forest inventory, it is not enough to formulate a definition; it is also necessary to define the "measurement procedure." In the classification of forestland by dominant cover type, the measurement design (the plot) can affect the outcome of the classification. We present results of a simulation study comparing classification of the dominant cover type...

  4. A new classification approach for detecting severe weather patterns

    NASA Astrophysics Data System (ADS)

    Teixeira de Lima, Glauston R.; Stephany, Stephan

    2013-08-01

    Early detection of possible occurrences of severe convective events would be useful in order to avoid, or at least mitigate, the environmental and socio-economic damages caused by such events. However, the enormous volume of meteorological data currently available makes its analysis by meteorologists difficult, if not impossible. In addition, severe convective events may occur in very different spatial and temporal scales, precluding their early and accurate prediction. In this work, we propose an innovative approach for the classification of meteorological data based on the frequency of occurrence of the values of different variables provided by a weather forecast model. It is possible to identify patterns that may be associated with severe convective activity. In the considered classification problem, the information attributes are variables output by the weather forecast model Eta, while the decision attribute is given by the density of occurrence of cloud-to-ground atmospheric electrical discharges, assumed to be correlated with the level of convective activity. Results show good classification performance for some selected mini-regions of Brazil during the summer of 2007. We expect that the screening of the outputs of the meteorological model Eta by the proposed classifier could serve as a support tool for meteorologists in order to identify in advance patterns associated with severe convective events.

  5. Colorectal Cancer Classification and Cell Heterogeneity: A Systems Oncology Approach.

    PubMed

    Blanco-Calvo, Moisés; Concha, Ángel; Figueroa, Angélica; Garrido, Federico; Valladares-Ayerbes, Manuel

    2015-06-15

    Colorectal cancer is a heterogeneous disease that manifests through diverse clinical scenarios. For many years, our knowledge about the variability of colorectal tumors was limited to the histopathological analysis from which generic classifications associated with different clinical expectations are derived. However, we are now beginning to understand that beneath the intense pathological and clinical variability of these tumors lies strong genetic and biological heterogeneity. Thus, with the increasing available information on inter-tumor and intra-tumor heterogeneity, the classical pathological approach is being displaced in favor of novel molecular classifications. In the present article, we summarize the most relevant proposals of molecular classifications obtained from the analysis of colorectal tumors using powerful high throughput techniques and devices. We also discuss the role that cancer systems biology may play in the integration and interpretation of the large amount of data generated and the challenges to be addressed in the future development of precision oncology. In addition, we review the current state of implementation of these novel tools in the pathological laboratory and in clinical practice.

  6. Colorectal Cancer Classification and Cell Heterogeneity: A Systems Oncology Approach

    PubMed Central

    Blanco-Calvo, Moisés; Concha, Ángel; Figueroa, Angélica; Garrido, Federico; Valladares-Ayerbes, Manuel

    2015-01-01

    Colorectal cancer is a heterogeneous disease that manifests through diverse clinical scenarios. For many years, our knowledge about the variability of colorectal tumors was limited to the histopathological analysis from which generic classifications associated with different clinical expectations are derived. However, we are now beginning to understand that beneath the intense pathological and clinical variability of these tumors lies strong genetic and biological heterogeneity. Thus, with the increasing available information on inter-tumor and intra-tumor heterogeneity, the classical pathological approach is being displaced in favor of novel molecular classifications. In the present article, we summarize the most relevant proposals of molecular classifications obtained from the analysis of colorectal tumors using powerful high throughput techniques and devices. We also discuss the role that cancer systems biology may play in the integration and interpretation of the large amount of data generated and the challenges to be addressed in the future development of precision oncology. In addition, we review the current state of implementation of these novel tools in the pathological laboratory and in clinical practice. PMID:26084042

  7. AutoClass: A Bayesian Approach to Classification

    NASA Technical Reports Server (NTRS)

    Stutz, John; Cheeseman, Peter; Hanson, Robin; Taylor, Will; Lum, Henry, Jr. (Technical Monitor)

    1994-01-01

    We describe a Bayesian approach to the untutored discovery of classes in a set of cases, sometimes called finite mixture separation or clustering. The main difference between clustering and our approach is that we search for the "best" set of class descriptions rather than grouping the cases themselves. We describe our classes in terms of a probability distribution or density function, and the locally maximal posterior probability valued function parameters. We rate our classifications with an approximate joint probability of the data and functional form, marginalizing over the parameters. Approximation is necessitated by the computational complexity of the joint probability. Thus, we marginalize w.r.t. local maxima in the parameter space. We discuss the rationale behind our approach to classification. We give the mathematical development for the basic mixture model and describe the approximations needed for computational tractability. We instantiate the basic model with the discrete Dirichlet distribution and multivariant Gaussian density likelihoods. Then we show some results for both constructed and actual data.

  8. Prognostic transcriptional association networks: a new supervised approach based on regression trees

    PubMed Central

    Nepomuceno-Chamorro, Isabel; Azuaje, Francisco; Devaux, Yvan; Nazarov, Petr V.; Muller, Arnaud; Aguilar-Ruiz, Jesús S.; Wagner, Daniel R.

    2011-01-01

    Motivation: The application of information encoded in molecular networks for prognostic purposes is a crucial objective of systems biomedicine. This approach has not been widely investigated in the cardiovascular research area. Within this area, the prediction of clinical outcomes after suffering a heart attack would represent a significant step forward. We developed a new quantitative prediction-based method for this prognostic problem based on the discovery of clinically relevant transcriptional association networks. This method integrates regression trees and clinical class-specific networks, and can be applied to other clinical domains. Results: Before analyzing our cardiovascular disease dataset, we tested the usefulness of our approach on a benchmark dataset with control and disease patients. We also compared it to several algorithms to infer transcriptional association networks and classification models. Comparative results provided evidence of the prediction power of our approach. Next, we discovered new models for predicting good and bad outcomes after myocardial infarction. Using blood-derived gene expression data, our models reported areas under the receiver operating characteristic curve above 0.70. Our model could also outperform different techniques based on co-expressed gene modules. We also predicted processes that may represent novel therapeutic targets for heart disease, such as the synthesis of leucine and isoleucine. Availability: The SATuRNo software is freely available at http://www.lsi.us.es/isanepo/toolsSaturno/. Contact: inepomuceno@us.es Supplementary information: Supplementary data are available at Bioinformatics online. PMID:21098433

  9. Identification of pests and diseases of Dalbergia hainanensis based on EVI time series and classification of decision tree

    NASA Astrophysics Data System (ADS)

    Luo, Qiu; Xin, Wu; Qiming, Xiong

    2017-06-01

    Existing approaches to extracting vegetation information from remote sensing data do not consider phenological features or the low performance of remote sensing analysis algorithms. To address this problem, a method for extracting vegetation information based on EVI time series and decision-tree classification with multi-source branch similarity is proposed. First, to improve the time-series stability of recognition accuracy, the seasonal features of vegetation are extracted based on the fitting span range of the time series. Second, decision-tree similarity is determined by the adaptively selected path or the probability parameter of component prediction. This similarity serves as an index to evaluate the degree of task association, to decide whether to perform migration of the multi-source decision tree, and to ensure the speed of migration. Finally, the accuracy of classification and recognition of pests and diseases reaches 87%-98% for commercial Dalbergia hainanensis forest, which is significantly better than the MODIS coverage accuracy of 80%-96% in this area. These results support the validity of the proposed method.

  10. Prediction of Severe Acute Pancreatitis Using a Decision Tree Model Based on the Revised Atlanta Classification of Acute Pancreatitis

    PubMed Central

    Zhang, Yushun; Yang, Chong; Gou, Shanmiao; Li, Yongfeng; Xiong, Jiongxin; Wu, Heshui; Wang, Chunyou

    2015-01-01

    Objective: To develop a model for the early prediction of severe acute pancreatitis based on the revised Atlanta classification of acute pancreatitis. Methods: Clinical data of 1308 patients with acute pancreatitis (AP) were included in the retrospective study. A total of 603 patients who were admitted to the hospital within 36 hours of the onset of the disease were ultimately included according to the inclusion criteria. The clinical data were collected within 12 hours after admission. All the patients were classified as having mild acute pancreatitis (MAP), moderately severe acute pancreatitis (MSAP) or severe acute pancreatitis (SAP) based on the revised Atlanta classification of acute pancreatitis. All the 603 patients were randomly divided into a training group (402 cases) and a test group (201 cases). Univariate and multiple regression analyses were used to identify the independent risk factors for the development of SAP in the training group. Then the prediction model was constructed using the decision tree method, and this model was applied to the test group to evaluate its validity. Results: The decision tree model was developed using creatinine, lactate dehydrogenase, and oxygenation index to predict SAP. The diagnostic sensitivity and specificity of SAP in the training group were 80.9% and 90.0%, respectively, and the sensitivity and specificity in the test group were 88.6% and 90.4%, respectively. Conclusions: The decision tree model based on creatinine, lactate dehydrogenase, and oxygenation index is more likely to predict the occurrence of SAP. PMID:26580397
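
    Sensitivity and specificity, the two performance measures reported in this abstract, are simple ratios over the confusion table of predicted against actual SAP status. A minimal sketch follows; the function name and the counts in the usage example are hypothetical and are not the study's data.

```python
def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int):
    """Return (sensitivity, specificity) from confusion-table counts.

    tp / fn: actual positives predicted positive / negative.
    tn / fp: actual negatives predicted negative / positive.
    """
    sensitivity = tp / (tp + fn)  # share of true cases the model detects
    specificity = tn / (tn + fp)  # share of non-cases the model clears
    return sensitivity, specificity


if __name__ == "__main__":
    # Hypothetical counts chosen only to illustrate the arithmetic.
    sens, spec = sensitivity_specificity(tp=8, fn=2, tn=45, fp=5)
    print(f"sensitivity={sens:.1%} specificity={spec:.1%}")
```

    Reporting both values for the training and the test group, as the abstract does, shows whether performance holds up on cases that were not used to build the tree.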

  11. Classification and regression tree analysis vs. multivariable linear and logistic regression methods as statistical tools for studying haemophilia.

    PubMed

    Henrard, S; Speybroeck, N; Hermans, C

    2015-11-01

    Haemophilia is a rare genetic haemorrhagic disease characterized by partial or complete deficiency of coagulation factor VIII, for haemophilia A, or IX, for haemophilia B. As in any other medical research domain, the field of haemophilia research is increasingly concerned with finding factors associated with binary or continuous outcomes through multivariable models. Traditional models include multiple logistic regressions, for binary outcomes, and multiple linear regressions for continuous outcomes. Yet these regression models are at times difficult to implement, especially for non-statisticians, and can be difficult to interpret. The present paper sought to didactically explain how, why, and when to use classification and regression tree (CART) analysis for haemophilia research. The CART method is non-parametric and non-linear, based on the repeated partitioning of a sample into subgroups based on a certain criterion. Breiman developed this method in 1984. Classification trees (CTs) are used to analyse categorical outcomes and regression trees (RTs) to analyse continuous ones. The CART methodology has become increasingly popular in the medical field, yet only a few examples of studies using this methodology specifically in haemophilia have been published to date. Two examples using CART analysis and previously published in this field are didactically explained in detail. There is increasing interest in using CART analysis in the health domain, primarily due to its ease of implementation, use, and interpretation, thus facilitating medical decision-making. This method should be promoted for analysing continuous or categorical outcomes in haemophilia, when applicable. © 2015 John Wiley & Sons Ltd.
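
    The "repeated partitioning of a sample into subgroups based on a certain criterion" can be illustrated by a single partitioning step. The sketch below searches one numeric predictor for the threshold that minimises the weighted Gini impurity of the two resulting subgroups, which is the usual criterion for classification trees; a regression tree would minimise within-group variance instead. The names `gini` and `best_split` and the example data are assumptions for illustration, not taken from the paper.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 for a pure group)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split(values, labels):
    """Find the threshold on one predictor that minimises weighted Gini impurity.

    Returns (threshold, impurity); threshold is None when no split
    improves on the unsplit sample. Cases with value <= threshold go left.
    """
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    n = len(pairs)
    best_threshold, best_impurity = None, gini(labels)
    for k in range(1, n):
        if pairs[k - 1][0] == pairs[k][0]:
            continue  # cannot place a threshold between identical values
        left = [y for _, y in pairs[:k]]
        right = [y for _, y in pairs[k:]]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
        if weighted < best_impurity:
            # Candidate threshold: midpoint between neighbouring values.
            best_threshold = (pairs[k - 1][0] + pairs[k][0]) / 2
            best_impurity = weighted
    return best_threshold, best_impurity


if __name__ == "__main__":
    # Made-up predictor values and binary outcomes.
    print(best_split([1, 2, 3, 10, 11, 12], [0, 0, 0, 1, 1, 1]))
```

    A full tree applies this search to every predictor, keeps the best split, and then repeats the procedure within each subgroup until a stopping rule or pruning step ends the growth.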

  12. Prediction of Severe Acute Pancreatitis Using a Decision Tree Model Based on the Revised Atlanta Classification of Acute Pancreatitis.

    PubMed

    Yang, Zhiyong; Dong, Liming; Zhang, Yushun; Yang, Chong; Gou, Shanmiao; Li, Yongfeng; Xiong, Jiongxin; Wu, Heshui; Wang, Chunyou

    2015-01-01

    To develop a model for the early prediction of severe acute pancreatitis based on the revised Atlanta classification of acute pancreatitis. Clinical data of 1308 patients with acute pancreatitis (AP) were included in the retrospective study. A total of 603 patients who were admitted to the hospital within 36 hours of the onset of the disease were ultimately included according to the inclusion criteria. The clinical data were collected within 12 hours after admission. All the patients were classified as having mild acute pancreatitis (MAP), moderately severe acute pancreatitis (MSAP) or severe acute pancreatitis (SAP) based on the revised Atlanta classification of acute pancreatitis. All the 603 patients were randomly divided into a training group (402 cases) and a test group (201 cases). Univariate and multiple regression analyses were used to identify the independent risk factors for the development of SAP in the training group. Then the prediction model was constructed using the decision tree method, and this model was applied to the test group to evaluate its validity. The decision tree model was developed using creatinine, lactate dehydrogenase, and oxygenation index to predict SAP. The diagnostic sensitivity and specificity of SAP in the training group were 80.9% and 90.0%, respectively, and the sensitivity and specificity in the test group were 88.6% and 90.4%, respectively. The decision tree model based on creatinine, lactate dehydrogenase, and oxygenation index is more likely to predict the occurrence of SAP.

  13. Full Hierarchic Versus Non-Hierarchic Classification Approaches for Mapping Sealed Surfaces at the Rural-Urban Fringe Using High-Resolution Satellite Data

    PubMed Central

    De Roeck, Tim; Van de Voorde, Tim; Canters, Frank

    2009-01-01

    Since 2008 more than half of the world population is living in cities and urban sprawl is continuing. Because of these developments, the mapping and monitoring of urban environments and their surroundings is becoming increasingly important. In this study two object-oriented approaches for high-resolution mapping of sealed surfaces are compared: a standard non-hierarchic approach and a full hierarchic approach using both multi-layer perceptrons and decision trees as learning algorithms. Both methods outperform the standard nearest neighbour classifier, which is used as a benchmark scenario. For the multi-layer perceptron approach, applying a hierarchic classification strategy substantially increases the accuracy of the classification. For the decision tree approach a one-against-all hierarchic classification strategy does not lead to an improvement of classification accuracy compared to the standard all-against-all approach. Best results are obtained with the hierarchic multi-layer perceptron classification strategy, producing a kappa value of 0.77. A simple shadow reclassification procedure based on characteristics of neighbouring objects further increases the kappa value to 0.84. PMID:22389586

  14. Full hierarchic versus non-hierarchic classification approaches for mapping sealed surfaces at the rural-urban fringe using high-resolution satellite data.

    PubMed

    De Roeck, Tim; Van de Voorde, Tim; Canters, Frank

    2009-01-01

    Since 2008 more than half of the world population is living in cities and urban sprawl is continuing. Because of these developments, the mapping and monitoring of urban environments and their surroundings is becoming increasingly important. In this study two object-oriented approaches for high-resolution mapping of sealed surfaces are compared: a standard non-hierarchic approach and a full hierarchic approach using both multi-layer perceptrons and decision trees as learning algorithms. Both methods outperform the standard nearest neighbour classifier, which is used as a benchmark scenario. For the multi-layer perceptron approach, applying a hierarchic classification strategy substantially increases the accuracy of the classification. For the decision tree approach a one-against-all hierarchic classification strategy does not lead to an improvement of classification accuracy compared to the standard all-against-all approach. Best results are obtained with the hierarchic multi-layer perceptron classification strategy, producing a kappa value of 0.77. A simple shadow reclassification procedure based on characteristics of neighbouring objects further increases the kappa value to 0.84.

  15. A three-way approach for protein function classification

    PubMed Central

    2017-01-01

    The knowledge of protein functions plays an essential role in understanding biological cells and has a significant impact on human life in areas such as personalized medicine, better crops and improved therapeutic interventions. Due to the expense and inherent difficulty of biological experiments, intelligent methods are generally relied upon for automatic assignment of functions to proteins. The technological advancements in the field of biology are improving our understanding of biological processes and are regularly resulting in new features and characteristics that better describe the role of proteins. It is inevitable to neglect and overlook these anticipated features in designing more effective classification techniques. A key issue in this context, one that is not being sufficiently addressed, is how to build effective classification models and approaches for protein function prediction by incorporating and taking advantage of the ever-evolving biological information. In this article, we propose a three-way decision making approach which provides provisions for seeking and incorporating future information. We considered probabilistic rough sets based models such as Game-Theoretic Rough Sets (GTRS) and Information-Theoretic Rough Sets (ITRS) for inducing three-way decisions. An architecture of protein function classification with probabilistic rough sets based three-way decisions is proposed and explained. Experiments are carried out on a Saccharomyces cerevisiae species dataset obtained from the Uniprot database with the corresponding functional classes extracted from the Gene Ontology (GO) database. The results indicate that as the level of biological information increases, the number of deferred cases is reduced while a similar level of accuracy is maintained. PMID:28234929

  16. Multifaceted approach to the diagnosis and classification of acute leukemias.

    PubMed

    McKenna, R W

    2000-08-01

    Until recently, the diagnosis and classification of acute myeloid (AML) and acute lymphoblastic (ALL) leukemias was based almost exclusively on well-defined morphologic criteria and cytochemical stains. Although most cases can be diagnosed by these methods, there is only modest correlation between morphologic categories and treatment responsiveness and prognosis. The expansion of therapeutic options and improvement in remission induction and disease-free survival for both AML and ALL have stimulated emphasis on defining good and poor treatment response groups. This is most effectively accomplished by a multifaceted approach to diagnosis and classification using immunophenotyping, cytogenetics, and molecular analysis in addition to the traditional methods. Immunophenotyping is important in characterizing morphologically poorly differentiated acute leukemias and in defining prognostic categories of ALL. Cytogenetic and molecular studies provide important prognostic information and are becoming vitally important in determining the appropriate treatment protocol. With optimal application of these techniques in the diagnosis of acute leukemias, treatment strategies can be more specifically directed and new therapeutic approaches can be evaluated more effectively.

  17. A comprehensive but efficient framework of proposing and validating feature parameters from airborne LiDAR data for tree species classification

    NASA Astrophysics Data System (ADS)

    Lin, Yi; Hyyppä, Juha

    2016-04-01

    Tree species information is crucial for digital forestry, and efficient techniques for classifying tree species are extensively demanded. To this end, airborne light detection and ranging (LiDAR) has been introduced. However, the literature review suggests that most of the previous airborne LiDAR-based studies were only based on limited kinds of tree signatures. To address this gap, this study proposed developing a novel modular framework for LiDAR-based tree species classification, by deriving feature parameters in a systematic way. Specifically, feature parameters of point-distribution (PD), laser pulse intensity (IN), crown-internal (CI) and tree-external (TE) structures were proposed and derived. With a support-vector-machine (SVM) classifier used, the classifications were conducted in a leave-one-out-for-cross-validation (LOOCV) mode. Based on the samples of four typical boreal tree species, i.e., Picea abies, Pinus sylvestris, Populus tremula and Quercus robur, tests showed that the accuracies of the classifications based on the acquired PD-, IN-, CI- and TE-categorized feature parameters as well as the integration of their individual optimal parameters are 65.00%, 80.00%, 82.50%, 85.00% and 92.50%, respectively. These results indicate that the procedures proposed in this study can be used as a comprehensive but efficient framework of proposing and validating feature parameters from airborne LiDAR data for tree species classification.
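
    The leave-one-out cross-validation (LOOCV) mode mentioned above can be sketched as follows. This is a minimal illustration, not the study's pipeline: a nearest-centroid rule stands in for the SVM classifier to keep the example dependency-free, and the feature vectors and class names are hypothetical.

```python
def nearest_centroid_predict(train_X, train_y, x):
    """Predict the class whose mean feature vector is closest to x.

    Stand-in classifier for this sketch; the study itself used an SVM.
    """
    sums, counts = {}, {}
    for features, label in zip(train_X, train_y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for j, value in enumerate(features):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1

    def squared_distance(label):
        return sum((s / counts[label] - v) ** 2 for s, v in zip(sums[label], x))

    return min(sums, key=squared_distance)


def loocv_accuracy(X, y, predict):
    """Hold out each sample once, train on the rest, score the held-out sample."""
    hits = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if predict(train_X, train_y, X[i]) == y[i]:
            hits += 1
    return hits / len(X)


# Hypothetical two-feature samples for two made-up classes "A" and "B".
X = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [5.0, 5.0], [5.1, 4.8], [4.9, 5.2]]
y = ["A", "A", "A", "B", "B", "B"]
accuracy = loocv_accuracy(X, y, nearest_centroid_predict)
```

    Because every sample serves as the test case exactly once, LOOCV suits small sample sets, at the cost of refitting the classifier once per sample.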

  18. Incremental Transductive Learning Approaches to Schistosomiasis Vector Classification

    NASA Astrophysics Data System (ADS)

    Fusco, Terence; Bi, Yaxin; Wang, Haiying; Browne, Fiona

    2016-08-01

    The key issues pertaining to collection of epidemic disease data for our analysis purposes are that it is a labour intensive, time consuming and expensive process resulting in availability of sparse sample data which we use to develop prediction models. To address this sparse data issue, we present the novel Incremental Transductive methods to circumvent the data collection process by applying previously acquired data to provide consistent, confidence-based labelling alternatives to field survey research. We investigated various reasoning approaches for semi-supervised machine learning including Bayesian models for labelling data. The results show that using the proposed methods, we can label instances of data with a class of vector density at a high level of confidence. By applying the Liberal and Strict Training Approaches, we provide a labelling and classification alternative to standalone algorithms. The methods in this paper are components in the process of reducing the proliferation of the Schistosomiasis disease and its effects.

  19. A tree classification for the selection forests of the Sierra Nevada

    Treesearch

    Duncan Dunning

    1928-01-01

    Individuality in man is accepted without question. In domestic animals, also, good and bad individuals are generally recognized. Even in some cultivated plants —orange trees and rubber trees— the poor producers are searched out and eliminated. Indeed, individual variability is a normal condition in all groups of organisms. Yet forest trees are...

  20. Relating FIA data to habitat classifications via tree-based models of canopy cover

    Treesearch

    Mark D. Nelson; Brian G. Tavernia; Chris Toney; Brian F. Walters

    2012-01-01

    Wildlife species-habitat matrices are used to relate lists of species with abundance of their habitats. The Forest Inventory and Analysis Program provides data on forest composition and structure, but these attributes may not correspond directly with definitions of wildlife habitats. We used FIA tree data and tree crown diameter models to estimate canopy cover, from...

  1. A new approach to modeling tree rainfall interception

    NASA Astrophysics Data System (ADS)

    Xiao, Qingfu; McPherson, E. Gregory; Ustin, Susan L.; Grismer, Mark E.

    2000-12-01

    A three-dimensional physically based stochastic model was developed to describe canopy rainfall interception processes at desired spatial and temporal resolutions. Such model development is important to understand these processes because forest canopy interception may exceed 59% of annual precipitation in old growth trees. The model describes the interception process from a single leaf, to a branch segment, and then up to the individual tree level. It takes into account rainfall, meteorology, and canopy architecture factors as explicit variables. Leaf and stem surface roughness, architecture, and geometric shape control both leaf drip and stemflow. Model predictions were evaluated using actual interception data collected for two mature open grown trees, a 9-year-old broadleaf deciduous pear tree (Pyrus calleryana "Bradford" or Callery pear) and an 8-year-old broadleaf evergreen oak tree (Quercus suber or cork oak). When simulating 18 rainfall events for the oak tree and 16 rainfall events for the pear tree, the model overestimated interception loss by 4.5% and 3.0%, respectively, while stemflow was underestimated by 0.8% and 3.3%, and throughfall was underestimated by 3.7% for the oak tree and overestimated by 0.3% for the pear tree. A model sensitivity analysis indicates that canopy surface storage capacity had the greatest influence on interception, and interception losses were sensitive to leaf and stem surface area indices. Among rainfall factors, interception losses relative to gross precipitation were most sensitive to rainfall amount. Rainfall incident angle had a significant effect on total precipitation intercepting the projected surface area. Stemflow was sensitive to stem segment and leaf zenith angle distributions. Enhanced understanding of interception loss dynamics should lead to improved urban forest ecosystem management.

  2. A maximum pseudo-likelihood approach for estimating species trees under the coalescent model

    PubMed Central

    2010-01-01

    Background Several phylogenetic approaches have been developed to estimate species trees from collections of gene trees. However, maximum likelihood approaches for estimating species trees under the coalescent model are limited. Although the likelihood of a species tree under the multispecies coalescent model has already been derived by Rannala and Yang, it can be shown that the maximum likelihood estimate (MLE) of the species tree (topology, branch lengths, and population sizes) from gene trees under this formula does not exist. In this paper, we develop a pseudo-likelihood function of the species tree to obtain maximum pseudo-likelihood estimates (MPE) of species trees, with branch lengths of the species tree in coalescent units. Results We show that the MPE of the species tree is statistically consistent as the number M of genes goes to infinity. In addition, the probability that the MPE of the species tree matches the true species tree converges to 1 at rate O(M^{-1}). The simulation results confirm that the maximum pseudo-likelihood approach is statistically consistent even when the species tree is in the anomaly zone. We applied our method, Maximum Pseudo-likelihood for Estimating Species Trees (MP-EST) to a mammal dataset. The four major clades found in the MP-EST tree are consistent with those in the Bayesian concatenation tree. The bootstrap supports for the species tree estimated by the MP-EST method are more reasonable than the posterior probability supports given by the Bayesian concatenation method in reflecting the level of uncertainty in gene trees and controversies over the relationship of four major groups of placental mammals. Conclusions MP-EST can consistently estimate the topology and branch lengths (in coalescent units) of the species tree. Although the pseudo-likelihood is derived from coalescent theory, and assumes no gene flow or horizontal gene transfer (HGT), the MP-EST method is robust to a small amount of HGT in the dataset. In addition...

  3. Human and tree classification based on a model using 3D ladar in a GPS-denied environment

    NASA Astrophysics Data System (ADS)

    Cho, Kuk; Baeg, Seung-Ho; Park, Sangdeok

    2013-05-01

    This study explains a method to classify humans and trees by extracting their geometric and statistical features from data obtained with 3D LADAR. In a wooded GPS-denied environment, it is difficult to identify the location of unmanned ground vehicles and it is also difficult to properly recognize the environment in which these vehicles move. In this study, using the point cloud data obtained via 3D LADAR, a method to extract the features of humans, trees, and other objects within an environment was implemented and verified through the processes of segmentation, feature extraction, and classification. First, for the segmentation, the radially bounded nearest neighbor method was applied. Second, for the feature extraction, each segmented object was divided into three parts, and then their geometrical and statistical features were extracted. A human was divided into three parts: the head, trunk and legs. A tree was also divided into three parts: the top, middle, and bottom. The geometric features were the variance of the x-y data for the center of each part in an object, using the distance between the two central points for each part, using K-means clustering. The statistical features were the variance of each of the parts. In this study, three, six and six features of data were extracted, respectively, resulting in a total of 15 features. Finally, after an artificial neural network was trained on the extracted data, new data were classified. This study reports the results of an experiment that applied the proposed algorithm using a vehicle equipped with 3D LADAR in a thickly forested area, which is a GPS-denied environment. A total of 5,158 segments were obtained and the classification rates for humans and trees were 82.9% and 87.4%, respectively.

  4. A practical approach to assessing structure, function, and value of street tree populations in small communities

    Treesearch

    S.E. Maco; E.G. McPherson

    2003-01-01

    This study demonstrates an approach to quantify the structure, benefits, and costs of street tree populations in resource-limited communities without tree inventories. Using the city of Davis, California, U.S., as a model, existing data on the benefits and costs of municipal trees were applied to the results of a sample inventory of the city’s public and private street...

  5. Active optical sensors for tree stem detection and classification in nurseries.

    PubMed

    Garrido, Miguel; Perez-Ruiz, Manuel; Valero, Constantino; Gliever, Chris J; Hanson, Bradley D; Slaughter, David C

    2014-06-19

    Active optical sensing (LIDAR and light curtain transmission) devices mounted on a mobile platform can correctly detect, localize, and classify trees. To conduct an evaluation and comparison of the different sensors, an optical encoder wheel was used for vehicle odometry and provided a measurement of the linear displacement of the prototype vehicle along a row of tree seedlings as a reference for each recorded sensor measurement. The field trials were conducted in a juvenile tree nursery with one-year-old grafted almond trees at Sierra Gold Nurseries, Yuba City, CA, United States. Through these tests and subsequent data processing, each sensor was individually evaluated to characterize its reliability, as well as its advantages and disadvantages for the proposed task. Test results indicated that 95.7% and 99.48% of the trees were successfully detected with the LIDAR and light curtain sensors, respectively. LIDAR correctly classified trees as alive or dead at a 93.75% success rate, compared to 94.16% for the light curtain sensor. These results can help system designers select the most reliable sensor for the accurate detection and localization of each tree in a nursery, which might allow labor-intensive tasks, such as weeding, to be automated without damaging crops.

  6. Active Optical Sensors for Tree Stem Detection and Classification in Nurseries

    PubMed Central

    Garrido, Miguel; Perez-Ruiz, Manuel; Valero, Constantino; Gliever, Chris J.; Hanson, Bradley D.; Slaughter, David C.

    2014-01-01

    Active optical sensing (LIDAR and light curtain transmission) devices mounted on a mobile platform can correctly detect, localize, and classify trees. To conduct an evaluation and comparison of the different sensors, an optical encoder wheel was used for vehicle odometry and provided a measurement of the linear displacement of the prototype vehicle along a row of tree seedlings as a reference for each recorded sensor measurement. The field trials were conducted in a juvenile tree nursery with one-year-old grafted almond trees at Sierra Gold Nurseries, Yuba City, CA, United States. Through these tests and subsequent data processing, each sensor was individually evaluated to characterize its reliability, as well as its advantages and disadvantages for the proposed task. Test results indicated that 95.7% and 99.48% of the trees were successfully detected with the LIDAR and light curtain sensors, respectively. LIDAR correctly classified trees as alive or dead at a 93.75% success rate, compared to 94.16% for the light curtain sensor. These results can help system designers select the most reliable sensor for the accurate detection and localization of each tree in a nursery, which might allow labor-intensive tasks, such as weeding, to be automated without damaging crops. PMID:24949638

  7. Target-classification approach applied to active UXO sites

    NASA Astrophysics Data System (ADS)

    Shubitidze, F.; Fernández, J. P.; Shamatava, Irma; Barrowes, B. E.; O'Neill, K.

    2013-06-01

    This study is designed to illustrate the discrimination performance at two UXO active sites (Oklahoma's Fort Sill and the Massachusetts Military Reservation) of a set of advanced electromagnetic induction (EMI) inversion/discrimination models which include the orthonormalized volume magnetic source (ONVMS), joint diagonalization (JD), and differential evolution (DE) approaches and whose power and flexibility greatly exceed those of the simple dipole model. The Fort Sill site is highly contaminated by a mix of the following types of munitions: 37-mm target practice tracers, 60-mm illumination mortars, 75-mm and 4.5'' projectiles, 3.5'', 2.36'', and LAAW rockets, antitank mine fuzes with and without hex nuts, practice MK2 and M67 grenades, 2.5'' ballistic windshields, M2A1-mines with/without bases, M19-14 time fuzes, and 40-mm practice grenades with/without cartridges. The MMR site contains targets of yet different sizes. In this work we apply our models to EMI data collected using the MetalMapper (MM) and 2 × 2 TEMTADS sensors. The data for each anomaly are inverted to extract estimates of the extrinsic and intrinsic parameters associated with each buried target. (The latter include the total volume magnetic source or NVMS, which relates to size, shape, and material properties; the former include location, depth, and orientation.) The estimated intrinsic parameters are then used for classification performed via library matching and the use of statistical classification algorithms; this process yielded prioritized dig-lists that were submitted to the Institute for Defense Analyses (IDA) for independent scoring. The models' classification performance is illustrated and assessed based on these independent evaluations.

  8. Rule based fuzzy logic approach for classification of fibromyalgia syndrome.

    PubMed

    Arslan, Evren; Yildiz, Sedat; Albayrak, Yalcin; Koklukaya, Etem

    2016-06-01

    Fibromyalgia syndrome (FMS) is a chronic muscle and skeletal system disease observed generally in women, manifesting itself with widespread pain and impairing the individual's quality of life. FMS diagnosis is made based on the American College of Rheumatology (ACR) criteria. However, the employability and sufficiency of the ACR criteria have recently come under debate. In this context, several evaluation methods, including clinical evaluation methods, were proposed by researchers. Accordingly, ACR had to update their criteria announced back in 1990, 2010 and 2011. The proposed rule-based fuzzy logic method also aims to evaluate FMS from a different angle. This method contains a rule base derived from the 1990 ACR criteria and the individual experiences of specialists. The study was conducted using the data collected from 60 inpatients and 30 healthy volunteers. Several tests and physical examinations were administered to the participants. The fuzzy logic rule base was structured using the parameters of tender point count, chronic widespread pain period, pain severity, fatigue severity and sleep disturbance level, which were deemed important in FMS diagnosis. It has been observed that generally fuzzy predictor was 95.56 % consistent with at least of the specialists, who are not a creator of the fuzzy rule base. Thus, in diagnosis classification where the severity of FMS was classified as well, consistent findings were obtained from the comparison of interpretations and experiences of specialists and the fuzzy logic approach. The study proposes a rule base, which could eliminate the shortcomings of the 1990 ACR criteria during the FMS evaluation process. Furthermore, the proposed method presents a classification of the severity of the disease, which was not available with the ACR criteria. The study was not limited to disease classification; the probability of occurrence and the severity were classified as well. In addition, those who were not suffering from FMS were

  9. The use of decision trees in the classification of beach forms/patterns on IKONOS-2 data

    NASA Astrophysics Data System (ADS)

    Teodoro, A. C.; Ferreira, D.; Gonçalves, H.

    2013-10-01

    Evaluation of beach hydromorphological behaviour and its classification is highly complex. The available beach morphologic and classification models are mainly based on wave, tidal and sediment parameters. Since these parameters are usually unavailable for some regions - such as in the Portuguese coastal zone - a morphologic analysis using remotely sensed data seems to be a valid alternative. Data mining for spatial pattern recognition is the process of discovering useful information, such as patterns/forms, changes and significant structures from large amounts of data. This study focuses on the application of data mining techniques, particularly Decision Trees (DT), to an IKONOS-2 image in order to classify beach features/patterns, in a stretch of the northwest coast of Portugal. Based on the knowledge of the coastal features, five classes were defined: Sea, Suspended-Sediments, Breaking-Zone, Beachface and Beach. The dataset was randomly divided into training and validation subsets. Based on the analysis of several DT algorithms, the CART algorithm was found to be the most adequate and was thus applied. The performance of the DT algorithm was evaluated by the confusion matrix, overall accuracy, and Kappa coefficient. In the classification of beach features/patterns, the algorithm presented an overall accuracy of 98.2% and a kappa coefficient of 0.97. The DTs were compared with a neural network algorithm, and the results were in agreement. The methodology presented in this paper provides promising results and should be considered in further applications of beach forms/patterns classification.
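
    The overall accuracy and kappa coefficient used to evaluate the classifier above are both derived from the confusion matrix. The sketch below shows the standard calculation (Cohen's kappa); the function name and the 2x2 matrix are hypothetical and do not reproduce the study's five-class results.

```python
def overall_accuracy_and_kappa(matrix):
    """Compute overall accuracy and Cohen's kappa from a confusion matrix.

    matrix[i][j] is the number of samples of true class i assigned to class j.
    """
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(matrix[i][i] for i in range(k)) / n
    # Chance agreement: product of matching row and column marginals.
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical 2x2 confusion matrix for illustration only.
confusion = [[45, 5], [10, 40]]
accuracy, kappa = overall_accuracy_and_kappa(confusion)
```

    Kappa discounts the agreement expected by chance, which is why it is lower than the overall accuracy for the same matrix.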

  10. Reflectance properties of West African savanna trees from ground radiometer measurements. II - Classification of components

    NASA Technical Reports Server (NTRS)

    Hanan, N. P.; Prince, S. D.; Franklin, J.

    1993-01-01

    A pole-mounted radiometer was used to measure the reflectance properties in the red and near-IR of three Sahelian tree species. These properties are classified depending on their location over the canopy. A geometrical description of the patterns of shadow and sunlight on and beneath a model tree when viewed from above is given, and six components are defined. Tree canopies are found to be dark in the red waveband with respect to the soil, but have little or no effect on the near-IR.

  11. Trees

    NASA Astrophysics Data System (ADS)

    Epstein, Henri

    2016-11-01

    An algebraic formalism, developed with V. Glaser and R. Stora for the study of the generalized retarded functions of quantum field theory, is used to prove a factorization theorem which provides a complete description of the generalized retarded functions associated with any tree graph. Integrating over the variables associated to internal vertices to obtain the perturbative generalized retarded functions for interacting fields arising from such graphs is shown to be possible for a large category of space-times.

  12. Comparison of four approaches to a rock facies classification problem

    NASA Astrophysics Data System (ADS)

    Dubois, Martin K.; Bohling, Geoffrey C.; Chakrabarti, Swapan

    2007-05-01

    In this study, seven classifiers based on four different approaches were tested in a rock facies classification problem: classical parametric methods using Bayes' rule, and non-parametric methods using fuzzy logic, k-nearest neighbor, and feed forward-back propagating artificial neural network. Determining the most effective classifier for geologic facies prediction in wells without cores in the Panoma gas field, in Southwest Kansas, was the objective. Study data include 3600 samples with known rock facies class (from core) with each sample having either four or five measured properties (wire-line log curves), and two derived geologic properties (geologic constraining variables). The sample set was divided into two subsets, one for training and one for testing the ability of the trained classifier to correctly assign classes. Artificial neural networks clearly outperformed all other classifiers and are effective tools for this particular classification problem. Classical parametric models were inadequate due to the nature of the predictor variables (high dimensional and not linearly correlated), and feature space of the classes (overlapping). The other non-parametric methods tested, k-nearest neighbor and fuzzy logic, would need considerable improvement to match the neural network effectiveness, but further work, possibly combining certain aspects of the three non-parametric methods, may be justified.

  13. Comparison of four approaches to a rock facies classification problem

    USGS Publications Warehouse

    Dubois, M.K.; Bohling, G.C.; Chakrabarti, S.

    2007-01-01

    In this study, seven classifiers based on four different approaches were tested in a rock facies classification problem: classical parametric methods using Bayes' rule, and non-parametric methods using fuzzy logic, k-nearest neighbor, and feed forward-back propagating artificial neural network. Determining the most effective classifier for geologic facies prediction in wells without cores in the Panoma gas field, in Southwest Kansas, was the objective. Study data include 3600 samples with known rock facies class (from core) with each sample having either four or five measured properties (wire-line log curves), and two derived geologic properties (geologic constraining variables). The sample set was divided into two subsets, one for training and one for testing the ability of the trained classifier to correctly assign classes. Artificial neural networks clearly outperformed all other classifiers and are effective tools for this particular classification problem. Classical parametric models were inadequate due to the nature of the predictor variables (high dimensional and not linearly correlated), and feature space of the classes (overlapping). The other non-parametric methods tested, k-nearest neighbor and fuzzy logic, would need considerable improvement to match the neural network effectiveness, but further work, possibly combining certain aspects of the three non-parametric methods, may be justified. © 2006 Elsevier Ltd. All rights reserved.

  14. A new multi criteria classification approach in a multi agent system applied to SEEG analysis.

    PubMed

    Kinié, A; Ndiaye, M; Montois, J J; Jacquelet, Y

    2007-01-01

    This work is focused on the study of the organization of SEEG signals during epileptic seizures with a multi-agent system approach. This approach is based on cooperative mechanisms of auto-organization at the micro level and of emergence of a global function at the macro level. In order to evaluate this approach we propose a distributed collaborative approach for the classification of the interesting signals. This new multi-criteria classification method is able to provide a relevant brain area structures organisation and to bring out epileptogenic networks elements. The method is compared to another classification approach, a fuzzy classification, and gives better results when applied to SEEG signals.

  15. An improved classification tree analysis of high cost modules based upon an axiomatic definition of complexity

    NASA Technical Reports Server (NTRS)

    Tian, Jianhui; Porter, Adam; Zelkowitz, Marvin V.

    1992-01-01

    Identification of high cost modules has been viewed as one mechanism to improve overall system reliability, since such modules tend to produce more than their share of problems. A decision tree model was used to identify such modules. In this current paper, a previously developed axiomatic model of program complexity is merged with the previously developed decision tree process for an improvement in the ability to identify such modules. This improvement was tested using data from the NASA Software Engineering Laboratory.

  16. GIS-based groundwater potential mapping using boosted regression tree, classification and regression tree, and random forest machine learning models in Iran.

    PubMed

    Naghibi, Seyed Amir; Pourghasemi, Hamid Reza; Dixon, Barnali

    2016-01-01

    Groundwater is considered one of the most valuable fresh water resources. The main objective of this study was to produce groundwater spring potential maps in the Koohrang Watershed, Chaharmahal-e-Bakhtiari Province, Iran, using three machine learning models: boosted regression tree (BRT), classification and regression tree (CART), and random forest (RF). Thirteen hydrological-geological-physiographical (HGP) factors that influence locations of springs were considered in this research. These factors include slope degree, slope aspect, altitude, topographic wetness index (TWI), slope length (LS), plan curvature, profile curvature, distance to rivers, distance to faults, lithology, land use, drainage density, and fault density. Subsequently, groundwater spring potential was modeled and mapped using CART, RF, and BRT algorithms. The predicted results from the three models were validated using the receiver operating characteristics curve (ROC). From 864 springs identified, 605 (≈70 %) locations were used for the spring potential mapping, while the remaining 259 (≈30 %) springs were used for the model validation. The area under the curve (AUC) for the BRT model was calculated as 0.8103 and for CART and RF the AUC were 0.7870 and 0.7119, respectively. Therefore, it was concluded that the BRT model produced the best prediction results while predicting locations of springs followed by CART and RF models, respectively. Geospatially integrated BRT, CART, and RF methods proved to be useful in generating the spring potential map (SPM) with reasonable accuracy.
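
    The AUC values reported above summarize how well each model ranks validation locations. As a minimal sketch (not the authors' code), the AUC can be computed as the fraction of positive/negative pairs in which the positive location receives the higher score. The function name `roc_auc` and the labels and scores below are hypothetical illustration data, not values from the study.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical validation set: 1 = spring present, 0 = spring absent.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 4))
```

    The pairwise form is quadratic in the number of samples, which is acceptable for a validation set of a few hundred points such as the 259 springs held out here.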

  17. Industrial and occupational ergonomics in the petrochemical process industry: a regression trees approach.

    PubMed

    Bevilacqua, M; Ciarapica, F E; Giacchetta, G

    2008-07-01

    This work is an attempt to apply classification tree methods to data regarding accidents in a medium-sized refinery, so as to identify the important relationships between the variables, which can be considered as decision-making rules when adopting any measures for improvement. The results obtained using the CART (Classification And Regression Trees) method proved to be the most precise and, in general, they are encouraging concerning the use of tree diagrams as preliminary explorative techniques for the assessment of the ergonomic, management and operational parameters which influence high accident risk situations. The Occupational Injury analysis carried out in this paper was planned as a dynamic process and can be repeated systematically. The CART technique, which considers a very wide set of objective and predictive variables, shows new cause-effect correlations in occupational safety which had never been previously described, highlighting possible injury risk groups and supporting decision-making in these areas. The use of classification trees must not, however, be seen as an attempt to supplant other techniques, but as a complementary method which can be integrated into traditional types of analysis.

  18. A Visual Analytics Approach for Correlation, Classification, and Regression Analysis

    SciTech Connect

    Steed, Chad A; SwanII, J. Edward; Fitzpatrick, Patrick J.; Jankun-Kelly, T.J.

    2012-02-01

    New approaches that combine the strengths of humans and machines are necessary to equip analysts with the proper tools for exploring today's increasingly complex, multivariate data sets. In this paper, a novel visual data mining framework, called the Multidimensional Data eXplorer (MDX), is described that addresses the challenges of today's data by combining automated statistical analytics with a highly interactive parallel coordinates based canvas. In addition to several intuitive interaction capabilities, this framework offers a rich set of graphical statistical indicators, interactive regression analysis, visual correlation mining, automated axis arrangements and filtering, and data classification techniques. The current work provides a detailed description of the system as well as a discussion of key design aspects and critical feedback from domain experts.

  19. An Approach for Automatic Classification of Radiology Reports in Spanish.

    PubMed

    Cotik, Viviana; Filippo, Darío; Castaño, José

    2015-01-01

    Automatic detection of relevant terms in medical reports is useful for educational purposes and for clinical research. Natural language processing (NLP) techniques can be applied in order to identify them. In this work we present an approach to classify radiology reports written in Spanish into two sets: the ones that indicate pathological findings and the ones that do not. In addition, the entities corresponding to pathological findings are identified in the reports. We use RadLex, a lexicon of English radiology terms, and NLP techniques to identify the occurrence of pathological findings. Reports are classified using a simple algorithm based on the presence of pathological findings, negation and hedge terms. The implemented algorithms were tested with a test set of 248 reports annotated by an expert, obtaining a best result of 0.72 F1 measure. The output of the classification task can be used to look for specific occurrences of pathological findings.

  20. A hierarchical approach for speech-instrumental-song classification.

    PubMed

    Ghosal, Arijit; Chakraborty, Rudrasis; Dhara, Bibhas Chandra; Saha, Sanjoy Kumar

    2013-01-01

    Audio classification acts as the fundamental step for lots of applications like content based audio retrieval and audio indexing. In this work, we have presented a novel scheme for classifying audio signal into three categories namely, speech, music without voice (instrumental) and music with voice (song). A hierarchical approach has been adopted to classify the signals. At the first stage, signals are categorized as speech and music using audio texture derived from simple features like ZCR and STE. Proposed audio texture captures contextual information and summarizes the frame level features. At the second stage, music is further classified as instrumental/song based on Mel frequency cepstral co-efficient (MFCC). A classifier based on Random Sample and Consensus (RANSAC), capable of handling wide variety of data has been utilized. Experimental result indicates the effectiveness of the proposed scheme.
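
    The first stage described above relies on two simple frame-level features, zero-crossing rate (ZCR) and short-time energy (STE). The sketch below shows one common way to compute them for a single frame of samples; it does not reproduce the paper's audio texture descriptor, and the function names and toy frames are assumptions made for illustration.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        raise ValueError("frame needs at least two samples")
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def short_time_energy(frame):
    """Mean squared amplitude of the frame."""
    if not frame:
        raise ValueError("frame must not be empty")
    return sum(x * x for x in frame) / len(frame)


# Hypothetical frames: a rapidly alternating signal and a constant offset.
tone = [1.0, -1.0, 1.0, -1.0]
flat = [0.5, 0.5, 0.5, 0.5]
print(zero_crossing_rate(tone), short_time_energy(tone))
print(zero_crossing_rate(flat), short_time_energy(flat))
```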

  1. A Visual Analytics Approach for Correlation, Classification, and Regression Analysis

    SciTech Connect

    Steed, Chad A; SwanII, J. Edward; Fitzpatrick, Patrick J.; Jankun-Kelly, T.J.

    2013-01-01

    New approaches that combine the strengths of humans and machines are necessary to equip analysts with the proper tools for exploring today's increasingly complex, multivariate data sets. In this paper, a visual data mining framework, called the Multidimensional Data eXplorer (MDX), is described that addresses the challenges of today's data by combining automated statistical analytics with a highly interactive parallel coordinates based canvas. In addition to several intuitive interaction capabilities, this framework offers a rich set of graphical statistical indicators, interactive regression analysis, visual correlation mining, automated axis arrangements and filtering, and data classification techniques. This chapter provides a detailed description of the system as well as a discussion of key design aspects and critical feedback from domain experts.

  2. A comparison of feature selection methods for multitemporal tree species classification

    NASA Astrophysics Data System (ADS)

    Pipkins, Kyle; Förster, Michael; Clasen, Anne; Schmidt, Tobias; Kleinschmit, Birgit

    2014-10-01

    The problem of feature selection is a significant one in classification problems, where the addition of too many features to the classification fails to lead to significant increases in classification accuracy. This problem is especially significant within the context of multitemporal remote sensing classifications, where the costs and efforts associated with the acquisition of additional imagery can be extensive. It would thus be beneficial to identify the most important seasons for acquiring imagery for specific land cover types. This study uses a phenologically-adjusted 21 date RapidEye time-series in order to evaluate two methods of feature selection. The two methods compared in this study are a genetic algorithm (GA) and a semi-exhaustive method (EXH), both of which compare permutations of sequential date and band combinations. These methods are employed using a seven class support vector machine classification on a Normalized Difference Vegetation Index (NDVI)-transformed dataset. Overall accuracy (OAA) is used as the performance metric, and OAA significance is assessed using the McNemar test. The results from the feature selection methods are compared on the basis of phenological seasons selected across all iterations and the ideal number of combinations, based on the ratio of better performing classifications to all other classifications. The results suggest that the GA has a moderate but insignificant correlation when compared with the EXH for identifying ideal phenological seasons (overall Spearman's ρ= 0.60, p = 0.13), but is comparable when considering the number of seasons and image combinations.
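
    The McNemar test mentioned above compares two classifications of the same samples using only the cases on which they disagree. The following is a minimal sketch of the exact (binomial) two-sided form; the abstract does not say which variant the authors used, and the helper names and counts are hypothetical.

```python
from math import comb


def discordant_counts(truth, pred_a, pred_b):
    """Count samples that only classifier A (b) or only classifier B (c) got right."""
    b = sum(1 for t, a, p in zip(truth, pred_a, pred_b) if a == t and p != t)
    c = sum(1 for t, a, p in zip(truth, pred_a, pred_b) if a != t and p == t)
    return b, c


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the discordant counts b and c."""
    n = b + c
    if n == 0:
        return 1.0  # the classifiers never disagree
    k = min(b, c)
    # Under the null hypothesis each discordant case favors A or B with p = 0.5.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical example: A alone is right once, B alone is right nine times.
p_value = mcnemar_exact(1, 9)
print(p_value)
```

    With many discordant cases a chi-squared approximation is often used instead; the exact form is shown here because it needs no lookup tables.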

  3. The importance of chemosensory clues in Aguaruna tree classification and identification

    PubMed Central

    Jernigan, Kevin A

    2008-01-01

    Background The ethnobotanical literature still contains few detailed descriptions of the sensory criteria people use for judging membership in taxonomic categories. Olfactory criteria in particular have been explored very little. This paper will describe the importance of odor for woody plant taxonomy and identification among the Aguaruna Jívaro of the northern Peruvian Amazon, focusing on the Aguaruna category númi (trees excluding palms). Aguaruna informants almost always place trees that they consider to have a similar odor together as kumpají – 'companions,' a metaphor they use to describe trees that they consider to be related. Methods The research took place in several Aguaruna communities in the upper Marañón region of the Peruvian Amazon. Structured interview data focus on informant criteria for membership in various folk taxa of trees. Informants were also asked to explain what members of each group of related companions had in common. This paper focuses on odor and taste criteria that came to light during these structured interviews. Botanical voucher specimens were collected, wherever possible. Results Of the 182 tree folk genera recorded in this study, 51 (28%) were widely considered to possess a distinctive odor. Thirty nine of those (76%) were said to have odors similar to some other tree, while the other 24% had unique odors. Aguaruna informants very rarely described tree odors in non-botanical terms. Taste was used mostly to describe trees with edible fruits. Trees judged to be related were nearly always in the same botanical family. Conclusion The results of this study illustrate that odor of bark, sap, flowers, fruit and leaves are important clues that help the Aguaruna to judge the relatedness of trees found in their local environment. In contrast, taste appears to play a more limited role. The results suggest a more general ethnobotanical hypothesis that could be tested in other cultural settings: people tend to consider plants with

  4. Coronary vessel trees from 3D imagery: A topological approach

    PubMed Central

    Szymczak, Andrzej; Stillman, Arthur; Tannenbaum, Allen; Mischaikow, Konstantin

    2013-01-01

    We propose a simple method for reconstructing vascular trees from 3D images. Our algorithm extracts persistent maxima of the intensity on all axis-aligned 2D slices of the input image. The maxima concentrate along 1D intensity ridges, in particular along blood vessels. We build a forest connecting the persistent maxima with short edges. The forest tends to approximate the blood vessels present in the image, but also contains numerous spurious features and often fails to connect segments belonging to one vessel in low contrast areas. We improve the forest by applying simple geometric filters that trim short branches, fill gaps in blood vessels and remove spurious branches from the vascular tree to be extracted. Experiments show that our technique can be applied to extract coronary trees from heart CT scans. PMID:16798058

  5. A comparison of ARA and DNA data for microbial source tracking based on source-classification models developed using classification trees.

    PubMed

    Price, Bertram; Venso, Elichia; Frana, Mark; Greenberg, Joshua; Ware, Adam

    2007-08-01

    The literature on microbial source tracking (MST) suggests that DNA analysis of fecal samples leads to more reliable determinations of bacterial sources of surface water contamination than antibiotic resistance analysis (ARA). Our goal is to determine whether the increased reliability, if any, in library-based MST developed with DNA data is sufficient to justify its higher cost, where the bacteria source predictions are used in TMDL surface water management programs. We describe an application of classification trees for MST applied to ARA and DNA data from samples collected in the Potomac River Watershed in Maryland. Conclusions concerning the comparison of ARA and DNA data, although preliminary at the current time, suggest that the added cost of obtaining DNA data in comparison to the cost of ARA data may not be justified, where MST is applied in TMDL surface water management programs.

  6. TrExML: a maximum-likelihood approach for extensive tree-space exploration.

    PubMed

    Wolf, M J; Easteal, S; Kahn, M; McKay, B D; Jermiin, L S

    2000-04-01

    Maximum-likelihood analysis of nucleotide and amino acid sequences is a powerful approach for inferring phylogenetic relationships and for comparing evolutionary hypotheses. Because it is a computationally demanding and time-consuming process, most algorithms explore only a minute portion of tree-space, with the emphasis on finding the most likely tree while ignoring the less likely, but not significantly worse, trees. However, when such trees exist, it is equally important to identify them to give due consideration to the phylogenetic uncertainty. Consequently, it is necessary to change the focus of these algorithms such that near optimal trees are also identified. This paper presents the Advanced Stepwise Addition Algorithm for exploring tree-space and two algorithms for generating all binary trees on a set of sequences. The Advanced Stepwise Addition Algorithm has been implemented in TrExML, a phylogenetic program for maximum-likelihood analysis of nucleotide sequences. TrExML is shown to be more effective at finding near optimal trees than a similar program, fastDNAml, implying that TrExML offers a better approach to account for phylogenetic uncertainty than has previously been possible. A program, TreeGen, is also described; it generates binary trees on a set of sequences allowing for extensive exploration of tree-space using other programs. TreeGen, TrExML, and the sequence data used to test the programs are available from the following two WWW sites: http://whitetail.bemidji.msus.edu/trexml/ and http://jcsmr.anu.edu.au/dmm/humgen.html.

  7. Bacillary dysentery and meteorological factors in northeastern China: a historical review based on classification and regression trees.

    PubMed

    Guan, Peng; Huang, Desheng; Guo, Junqiao; Wang, Ping; Zhou, Baosen

    2008-09-01

    The relationship between the incidence of bacillary dysentery and meteorological factors was investigated. Data on bacillary dysentery incidence in Shenyang from 1990 to 1996 were obtained from Liaoning Provincial Center for Disease Control and Prevention, and meteorological data such as atmospheric pressure, air temperature, precipitation, evaporation, wind speed, and the amount of solar radiation were obtained from Shenyang Meteorological Bureau. Kendall and Spearman correlations were used to analyze the relationship between bacillary dysentery and meteorological factors. The incidence of bacillary dysentery was treated as the response variable, and meteorological factors were treated as predictor variables. R 2.3.1 was used to fit the classification and regression trees (CART). The model improved the accuracy of the fitting results. The residual sum of squared errors of the regression tree model was 53.9, while that of the multivariate linear regression model was 107.2. Among all the meteorological indexes, relative humidity, minimum temperature, and pressure one month prior were statistically influential factors in the multivariate regression tree model. CART may be a useful tool for dealing with heterogeneous data, as it can serve as a decision support tool and is notable for its simplicity and ease.
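
    The comparison above rests on the residual sum of squares, which a regression tree reduces by splitting the data at thresholds and predicting the mean within each side. The sketch below finds the single best split on one predictor; it is an illustration of the CART splitting criterion only, not the study's model, and the function names and data are hypothetical.

```python
def rss(values):
    """Residual sum of squares around the mean of the values."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(x, y):
    """Return (rss_after_split, threshold) for the best single split on x.

    The threshold is None when no split improves on predicting the overall mean.
    """
    pairs = sorted(zip(x, y))
    best_rss, best_thr = rss(list(y)), None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical predictor values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [v for _, v in pairs[:i]]
        right = [v for _, v in pairs[i:]]
        total = rss(left) + rss(right)
        if total < best_rss:
            best_rss, best_thr = total, threshold
    return best_rss, best_thr


# Hypothetical data: the response jumps when the predictor exceeds 2.5.
x = [1, 2, 3, 4]
y = [1.0, 1.0, 5.0, 5.0]
split_rss, split_thr = best_split(x, y)
print(rss(y), split_rss, split_thr)
```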

  8. Hybrid Classification of Pulmonary Nodules

    NASA Astrophysics Data System (ADS)

    Lee, S. L. A.; Kouzani, A. Z.; Hu, E. J.

    Automated classification of lung nodules is challenging because of the variation in shape and size of lung nodules, as well as the associated differences in their images. Ensemble based learners have demonstrated the potential of good performance. Random forests are employed for pulmonary nodule classification where each tree in the forest produces a classification decision, and an integrated output is calculated. A clustering-aided classification approach is proposed to improve the lung nodule classification performance. Three experiments are performed using the LIDC lung image database of 32 cases. The classification performance and execution times are presented and discussed.

  9. Deep water X-mas tree standardization -- Interchangeability approach

    SciTech Connect

    Paula, M.T.R.; Paulo, C.A.S.; Moreira, C.C.

    1995-12-31

    Standardization of subsea equipment interfaces can play a very important role in rationalizing subsea operations and making the production of oil and gas more economical and reliable. Continuing the program initiated some years ago, Petrobras is now harvesting the results from the first efforts. Diverless guidelineless subsea Christmas trees from four different suppliers have already been manufactured in accordance with the standardized specification. Tests performed this year in Macae (Campos Basin onshore base), in Brazil, confirmed the interchangeability among subsea Christmas trees, tubing hangers, adapter bases and flowline hubs of different manufacturers. This interchangeability, associated with the use of proven techniques, results in operational flexibility, savings in rig time and reduction in production losses during workovers. By now, 33 complete sets of subsea Christmas trees have already been delivered and successfully tested. Another 28 sets are still being manufactured by the four local suppliers. For the next five years, more than a hundred of these trees will be required for the exploration of the new discoveries. This paper describes the standardized equipment, the role of the operator in an integrated way of working with the manufacturers on the standardization activities, the importance of a frank information flow among the involved companies and how a simple manufacturing philosophy, with the use of construction jigs, has proved to work satisfactorily.

  10. Hierarchical Multinomial Processing Tree Models: A Latent-Trait Approach

    ERIC Educational Resources Information Center

    Klauer, Karl Christoph

    2010-01-01

    Multinomial processing tree models are widely used in many areas of psychology. A hierarchical extension of the model class is proposed, using a multivariate normal distribution of person-level parameters with the mean and covariance matrix to be estimated from the data. The hierarchical model allows one to take variability between persons into…

  11. Oak Wilt: People and Trees, A Community Approach to Management

    Treesearch

    J. Juzwik; S. Cook; L. Haugen; J. Elwell

    2004-01-01

    Version 1.3. This self-paced short course on CD-ROM was designed as a learning tool for urban and community foresters, city administrators, tree inspectors, parks and recreation staff, and others involved in oak wilt management. Click the "View or print this publication" link below to request your Oak Wilt: People and...

  12. An Economic Approach to Planting Trees for Carbon Storage

    Treesearch

    Peter J. Parks; David O. Hall; Bengt Kristrom; Omar R. Masera; Robert J. Multon; Andrew J. Plantinga; Joel N. Swisher; Jack K. Winjum

    1997-01-01

    Abstract: Methods are described for evaluating economic and carbon storage aspects of tree planting projects (e.g., plantations for restoration, roundwood, bioenergy, and nonwood products). Total carbon (C) stock is dynamic and comprises C in vegetation, decomposing matter, soil, products, and fuel substituted. An alternative (reference) case is...

  13. A Fault Tree Approach to Analysis of Organizational Communication Systems.

    ERIC Educational Resources Information Center

    Witkin, Belle Ruth; Stephens, Kent G.

    Fault Tree Analysis (FTA) is a method of examining communication in an organization by focusing on: (1) the complex interrelationships in human systems, particularly in communication systems; (2) interactions across subsystems and system boundaries; and (3) the need to select and "prioritize" channels which will eliminate noise in the…

  14. A basic approach to fire injury of tree stems

    Treesearch

    R. E. Martin

    1963-01-01

    Fire has come to be widely used as a tool in wildland management, particularly in the South. Its usefulness in fire hazard reduction, removal of undesirable trees, and changing of cover types has been demonstrated. We are continually trying to improve fire use, however, by learning more of the specific effects of fire on different species of plants.

  15. Hydrometeor classification from polarimetric radar measurements: a clustering approach

    NASA Astrophysics Data System (ADS)

    Grazioli, Jacopo; Tuia, Devis; Berne, Alexis

    2015-04-01

    Hydrometeor classification is the process that aims at identifying the dominant type of hydrometeor (e.g. rain, hail, snow aggregates, graupel, ice crystals) in a domain covered by a polarimetric weather radar during precipitation. The techniques documented in the literature are mostly based on numerical simulations and fuzzy logic. This involves the arbitrary selection of a set of hydrometeor classes and the numerical simulation of theoretical radar observations associated to each class. The information derived from the simulation is then applied to actual radar measurements by means of fuzzy logic input-output association. This approach has some limitations: the number and type of the hydrometeor categories undergoing identification is selected arbitrarily and the scattering simulations are based on constraining assumptions, especially in case of solid hydrometeors. Furthermore, in presence of noise and uncertainties, it is not guaranteed that the selected hydrometeor classes can be effectively identified in actual observations. In the present work we propose a different starting point for the classification task, which is based on observations instead of numerical simulations. We provide criteria for the selection of the number of hydrometeor classes that can be identified, by looking at how polarimetric observations collected over different precipitation events form clusters in the multi-dimensional space of the polarimetric variables. Two datasets, collected by an X-band weather radar, are employed in the study. The first dataset covers mountainous weather conditions (Swiss Alps), while the second includes Mediterranean orographic precipitation events collected during the special observation period (SOP) 2012 of the HyMeX campaign. We employ an unsupervised hierarchical clustering method to group the observations into clusters and we introduce a spatial smoothness constraint for the groups, assuming that the hydrometeor type changes smoothly in space
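
    To illustrate the kind of unsupervised hierarchical clustering mentioned above, here is a minimal sketch of plain agglomerative clustering with average linkage. It omits the spatial smoothness constraint described in the abstract, the choice of average linkage is an assumption, and the function name and the two-dimensional points are hypothetical stand-ins for vectors of polarimetric variables.

```python
def agglomerative(points, n_clusters):
    """Average-linkage agglomerative clustering.

    Starts with one cluster per point and repeatedly merges the two closest
    clusters until n_clusters remain. Returns lists of point indices.
    """
    clusters = [[i] for i in range(len(points))]

    def dist(a, b):
        # Euclidean distance between points with indices a and b.
        return sum((p - q) ** 2 for p, q in zip(points[a], points[b])) ** 0.5

    def linkage(c1, c2):
        # Mean pairwise distance between the members of two clusters.
        return sum(dist(a, b) for a in c1 for b in c2) / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = linkage(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical observations forming two well separated groups.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
result = agglomerative(points, 2)
print(result)
```

    This naive version recomputes every linkage at each merge, so it is only suitable for small samples; it is meant to show the merging logic rather than to scale to radar volumes.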

  16. Remote sensing of aquatic vegetation distribution in Taihu Lake using an improved classification tree with modified thresholds.

    PubMed

    Zhao, Dehua; Jiang, Hao; Yang, Tangwu; Cai, Ying; Xu, Delin; An, Shuqing

    2012-03-01

    Classification trees (CT) have been used successfully in the past to classify aquatic vegetation from spectral indices (SI) obtained from remotely-sensed images. However, applying CT models developed for certain image dates to other time periods within the same year or among different years can reduce the classification accuracy. In this study, we developed CT models with modified thresholds using extreme SI values (CT(m)) to improve the stability of the models when applying them to different time periods. A total of 903 ground-truth samples were obtained in September of 2009 and 2010 and classified as emergent, floating-leaf, or submerged vegetation or other cover types. Classification trees were developed for 2009 (Model-09) and 2010 (Model-10) using field samples and a combination of two images from winter and summer. Overall accuracies of these models were 92.8% and 94.9%, respectively, which confirmed the ability of CT analysis to map aquatic vegetation in Taihu Lake. However, Model-10 had only 58.9-71.6% classification accuracy and 31.1-58.3% agreement (i.e., pixels classified the same in the two maps) for aquatic vegetation when it was applied to image pairs from both a different time period in 2010 and a similar time period in 2009. We developed a method to estimate the effects of extrinsic (EF) and intrinsic (IF) factors on model uncertainty using MODIS images. Results indicated that 71.1% of the instability in classification between time periods was due to EF, which might include changes in atmospheric conditions, sun-view angle and water quality. The remainder was due to IF, such as phenological and growth status differences between time periods. The modified version of Model-10 (i.e. CT(m)) performed better than traditional CT with different image dates. When applied to 2009 images, the CT(m) version of Model-10 had very similar thresholds and performance as Model-09, with overall accuracies of 92.8% and 90.5% for Model-09 and the CT(m) version of Model

  17. Nonlinear feature extraction for MMW image classification: a supervised approach

    NASA Astrophysics Data System (ADS)

    Maskall, Guy T.; Webb, Andrew R.

    2002-07-01

    The specular nature of Radar imagery causes problems for ATR as small changes to the configuration of targets can result in significant changes to the resulting target signature. This adds to the challenge of constructing a classifier that is both robust to changes in target configuration and capable of generalizing to previously unseen targets. Here, we describe the application of a nonlinear Radial Basis Function (RBF) transformation to perform feature extraction on millimeter-wave (MMW) imagery of target vehicles. The features extracted were used as inputs to a nearest-neighbor classifier to obtain measures of classification performance. The training of the feature extraction stage was by way of a loss function that quantified the amount of data structure preserved in the transformation to feature space. In this paper we describe a supervised extension to the loss function and explore the value of using the supervised training process over the unsupervised approach and compare with results obtained using a supervised linear technique (Linear Discriminant Analysis --- LDA). The data used were Inverse Synthetic Aperture Radar (ISAR) images of armored vehicles gathered at 94GHz and were categorized as Armored Personnel Carrier, Main Battle Tank or Air Defense Unit. We find that the form of supervision used in this work is an advantage when the number of features used for classification is low, with the conclusion that the supervision allows information useful for discrimination between classes to be distilled into fewer features. When only one example of each class is used for training purposes, the LDA results are comparable to the RBF results. However, when an additional example is added per class, the RBF results are significantly better than those from LDA. Thus, the RBF technique seems better able to make use of the extra knowledge available to the system about variability between different examples of the same class.

  18. Multinomial tree models for assessing the status of the reference in studies of the accuracy of tools for binary classification

    PubMed Central

    Botella, Juan; Huang, Huiling; Suero, Manuel

    2013-01-01

    Studies that evaluate the accuracy of binary classification tools are needed. Such studies provide 2 × 2 cross-classifications of test outcomes and the categories according to an unquestionable reference (or gold standard). However, sometimes a suboptimal reliability reference is employed. Several methods have been proposed to deal with studies where the observations are cross-classified with an imperfect reference. These methods require that the status of the reference, as a gold standard or as an imperfect reference, is known. In this paper a procedure for determining whether it is appropriate to maintain the assumption that the reference is a gold standard or an imperfect reference, is proposed. This procedure fits two nested multinomial tree models, and assesses and compares their absolute and incremental fit. Its implementation requires the availability of the results of several independent studies. These should be carried out using similar designs to provide frequencies of cross-classification between a test and the reference under investigation. The procedure is applied in two examples with real data. PMID:24106484

  19. A Novel Approach on Designing Augmented Fuzzy Cognitive Maps Using Fuzzified Decision Trees

    NASA Astrophysics Data System (ADS)

    Papageorgiou, Elpiniki I.

    This paper proposes a new methodology for designing Fuzzy Cognitive Maps using crisp decision trees that have been fuzzified. A fuzzy cognitive map is a knowledge-based technique that works as an artificial cognitive network inheriting the main aspects of cognitive maps and artificial neural networks. Decision trees, on the other hand, are well known intelligent techniques that extract rules from both symbolic and numeric data. Fuzzy theoretical techniques are used to fuzzify crisp decision trees in order to soften decision boundaries at decision nodes inherent in this type of tree. Comparisons between crisp decision trees and the fuzzified decision trees suggest that the latter are significantly more robust and produce more balanced decision making. The approach proposed in this paper could incorporate any type of fuzzy decision tree. Through this methodology, new linguistic weights were determined in the FCM model, thus producing an augmented FCM tool. The framework consists of a new fuzzy algorithm to generate linguistic weights that describe the cause-effect relationships among the concepts of the FCM model, from induced fuzzy decision trees.

  20. Increased tree establishment in Lithuanian peat bogs--insights from field and remotely sensed approaches.

    PubMed

    Edvardsson, Johannes; Šimanauskienė, Rasa; Taminskas, Julius; Baužienė, Ieva; Stoffel, Markus

    2015-02-01

    Over the past century an ongoing establishment of Scots pine (Pinus sylvestris L.), sometimes at accelerating rates, is noted at three studied Lithuanian peat bogs, namely Kerėplis, Rėkyva and Aukštumala, all representing different degrees of tree coverage and geographic settings. Present establishment rates seem to depend on tree density on the bog surface and are most significant at sparsely covered sites where about three-fourths of the trees have established since the mid-1990s, whereas the initial establishment in general was during the early to mid-19th century. Three methods were used to detect, compare and describe tree establishment: (1) tree counts in small plots, (2) dendrochronological dating of bog pine trees, and (3) interpretation of aerial photographs and historical maps of the study areas. In combination, the different approaches provide complementary information but also weigh up each other's drawbacks. Tree counts in plots provided a reasonable overview of age class distributions and enabled capturing of the most recently established trees with ages less than 50 years. The dendrochronological analysis yielded accurate tree ages and a good temporal resolution of long-term changes. Tree establishment and spread interpreted from aerial photographs and historical maps provided a good overview of tree spread and total affected area. It also helped to verify the results obtained with the other methods and an upscaling of findings to the entire peat bogs. The ongoing spread of trees in predominantly undisturbed peat bogs is related to warmer and/or drier climatic conditions, and to a minor degree to land-use changes. Our results therefore provide valuable insights into vegetation changes in peat bogs, also with respect to bog response to ongoing and future climatic changes.

  1. Self-organizing tree-growing network for the classification of protein sequences.

    PubMed Central

    Wang, H. C.; Dopazo, J.; de la Fraga, L. G.; Zhu, Y. P.; Carazo, J. M.

    1998-01-01

    The self-organizing tree algorithm (SOTA) was recently introduced to construct phylogenetic trees from biological sequences, based on the principles of Kohonen's self-organizing maps and on Fritzke's growing cell structures. SOTA is designed in such a way that the generation of new nodes can be stopped when the sequences assigned to a node are already above a certain similarity threshold. In this way a phylogenetic tree resolved at a high taxonomic level can be obtained. This capability is especially useful to classify sets of diversified sequences. SOTA was originally designed to analyze pre-aligned sequences. It is now adapted to be able to analyze patterns associated with the frequency of residues along a sequence, such as protein dipeptide composition and other n-gram compositions. In this work we show that the algorithm applied to these data is able to not only successfully construct phylogenetic trees of protein families, such as cytochrome c, triosephosphate isomerase, and hemoglobin alpha chains, but also classify very diversified sequence data sets, such as a mixture of interleukins and their receptors. PMID:9865956

  2. Spectral difference analysis and airborne imaging classification for citrus greening infected trees

    USDA-ARS's Scientific Manuscript database

    Citrus greening, also called Huanglongbing (HLB), became a devastating disease spread through citrus groves in Florida, since it was first found in 2005. Multispectral (MS) and hyperspectral (HS) airborne images of citrus groves in Florida were acquired to detect citrus greening infected trees in 20...

  3. Chemical classification of cattle. 2. Phylogenetic tree and specific status of the Zebu.

    PubMed

    Manwell, C; Baker, C M

    1980-01-01

    Phylogenetic trees for the ten major breed groups of cattle were constructed by Farris's (1972) maximum parsimony method, or Fitch & Margoliash's (1967) method, which averages out the deviation over the entire assemblage. Both techniques yield essentially identical trees. The phylogenetic tree for the ten major cattle breed groups can be superimposed on a map of Europe and western Asia, the root of the tree being close to the 'fertile crescent' in Asia Minor, believed to be a primary centre of bovine domestication. For some but not all protein variants there is a cline of gene frequencies as one proceeds from the British Isles and northwest Europe towards southeast Europe and Asia Minor, with the most extreme gene frequencies in the Zebu breeds of India. It is not clear to what extent the observed clines are primary or secondary, i.e., consequent to the initial migrations of cattle towards the end of the Pleistocene or consequent to the many migrations of man with his domesticated cattle. Such clines as exist are not in themselves sufficient to prove either selection versus genetic drift or to establish taxonomic ranking. Contrary to some suggestions in the literature, the biochemical evidence supports Linnaeus's original conclusions: Bos taurus and Bos indicus are distinct species.

  4. Identification, classification and differential expression of oleosin genes in tung tree (Vernicia fordii)

    USDA-ARS's Scientific Manuscript database

    Triacylglycerols (TAG) are the major molecules of energy storage in eukaryotes. TAG are packed in subcellular structures called oil bodies or lipid droplets. Oleosins (OLE) are the major proteins in plant oil bodies. Multiple isoforms of OLE are present in plants such as tung tree (Vernicia fordii),...

  5. Developing a methodology to predict oak wilt distribution using classification tree analysis

    Treesearch

    Marla C. Downing; Vernon L. Thomas; Robin M. Reich

    2006-01-01

    Oak wilt (Ceratocystis fagacearum), a fungal disease that causes some species of oak trees to wilt and die rapidly, is a threat to oak forested resources in 22 states in the United States. We developed a methodology for predicting the Potential Distribution of Oak Wilt (PDOW) using Anoka County, Minnesota as our study area. The PDOW utilizes GIS; the...

  6. Frequency domain approach for activity classification using accelerometer.

    PubMed

    Chung, Wan-Young; Purwar, Amit; Sharma, Annapurna

    2008-01-01

    Activity classification was performed using a MEMS accelerometer and a wireless sensor node in a wireless sensor network environment. A three-axis MEMS accelerometer measures the body's acceleration and transmits the measured data via the sensor node to a base station attached to a PC. On the PC, real-time accelerometer data are processed for movement classification. In this paper, rest, walking, and running are the classified activities. Both time- and frequency-domain analyses were performed to distinguish running from walking. Rest and movement are separated using the signal magnitude area (SMA), with a classification accuracy of 100%. To distinguish walking from running, two parameters were used: SMA and median frequency. The classification accuracy for walking versus running was 81.25% in experiments performed with test subjects.
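    The two features named in this abstract can be illustrated with a minimal sketch. It assumes a window of three-axis samples in units of g, removes gravity crudely by subtracting the per-axis window mean, and uses hypothetical thresholds (`sma_rest`, `freq_run`) that do not come from the paper; the decision rule itself is also an assumption, not the authors' classifier.

```python
import numpy as np

def signal_magnitude_area(acc):
    """Mean of |x| + |y| + |z| over a window of shape (n_samples, 3).

    Gravity is removed by subtracting the per-axis window mean (an assumption).
    """
    acc = np.asarray(acc, dtype=float)
    body = acc - acc.mean(axis=0)
    return np.abs(body).sum(axis=1).mean()

def median_frequency(acc, fs):
    """Frequency (Hz) at which the cumulative spectral power reaches half the total.

    The power spectra of the three axes are summed before accumulating.
    """
    acc = np.asarray(acc, dtype=float)
    body = acc - acc.mean(axis=0)
    power = (np.abs(np.fft.rfft(body, axis=0)) ** 2).sum(axis=1)
    freqs = np.fft.rfftfreq(body.shape[0], d=1.0 / fs)
    cumulative = np.cumsum(power)
    idx = np.searchsorted(cumulative, 0.5 * cumulative[-1])
    return freqs[idx]

def classify_window(acc, fs, sma_rest=0.1, freq_run=2.5):
    """Hypothetical two-stage rule: SMA separates rest from movement,
    then median frequency separates walking from running."""
    if signal_magnitude_area(acc) < sma_rest:
        return "rest"
    if median_frequency(acc, fs) >= freq_run:
        return "running"
    return "walking"
```

    In practice the thresholds would be tuned on labelled recordings, and SMA could also enter the walking/running decision, as the abstract indicates.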

  7. Identification, Classification and Differential Expression of Oleosin Genes in Tung Tree (Vernicia fordii)

    PubMed Central

    Cao, Heping; Zhang, Lin; Tan, Xiaofeng; Long, Hongxu; Shockey, Jay M.

    2014-01-01

    Triacylglycerols (TAG) are the major molecules of energy storage in eukaryotes. TAG are packed in subcellular structures called oil bodies or lipid droplets. Oleosins (OLE) are the major proteins in plant oil bodies. Multiple isoforms of OLE are present in plants such as tung tree (Vernicia fordii), whose seeds are rich in novel TAG with a wide range of industrial applications. The objectives of this study were to identify OLE genes, classify OLE proteins and analyze OLE gene expression in tung trees. We identified five tung tree OLE genes coding for small hydrophobic proteins. Genome-wide phylogenetic analysis and multiple sequence alignment demonstrated that the five tung OLE genes represented the five OLE subfamilies and all contained the “proline knot” motif (PX5SPX3P) shared among 65 OLE from 19 tree species, including the sequenced genomes of Prunus persica (peach), Populus trichocarpa (poplar), Ricinus communis (castor bean), Theobroma cacao (cacao) and Vitis vinifera (grapevine). Tung OLE1, OLE2 and OLE3 belong to the S type and OLE4 and OLE5 belong to the SM type of Arabidopsis OLE. TaqMan and SYBR Green qPCR methods were used to study the differential expression of OLE genes in tung tree tissues. Expression results demonstrated that 1) All five OLE genes were expressed in developing tung seeds, leaves and flowers; 2) OLE mRNA levels were much higher in seeds than leaves or flowers; 3) OLE1, OLE2 and OLE3 genes were expressed in tung seeds at much higher levels than OLE4 and OLE5 genes; 4) OLE mRNA levels rapidly increased during seed development; and 5) OLE gene expression was well-coordinated with tung oil accumulation in the seeds. These results suggest that tung OLE genes 1–3 probably play major roles in tung oil accumulation and/or oil body development. Therefore, they might be preferred targets for tung oil engineering in transgenic plants. PMID:24516650

  8. A Bayesian Approach for Fast and Accurate Gene Tree Reconstruction

    PubMed Central

    Rasmussen, Matthew D.; Kellis, Manolis

    2011-01-01

    Recent sequencing and computing advances have enabled phylogenetic analyses to expand to both entire genomes and large clades, thus requiring more efficient and accurate methods designed specifically for the phylogenomic context. Here, we present SPIMAP, an efficient Bayesian method for reconstructing gene trees in the presence of a known species tree. We observe many improvements in reconstruction accuracy, achieved by modeling multiple aspects of evolution, including gene duplication and loss (DL) rates, speciation times, and correlated substitution rate variation across both species and loci. We have implemented and applied this method on two clades of fully sequenced species, 12 Drosophila and 16 fungal genomes as well as simulated phylogenies and find dramatic improvements in reconstruction accuracy as compared with the most popular existing methods, including those that take the species tree into account. We find that reconstruction inaccuracies of traditional phylogenetic methods overestimate the number of DL events by as much as 2–3-fold, whereas our method achieves significantly higher accuracy. We feel that the results and methods presented here will have many important implications for future investigations of gene evolution. PMID:20660489

  9. Impacts of age-dependent tree sensitivity and dating approaches on dendrogeomorphic time series of landslides

    NASA Astrophysics Data System (ADS)

    Šilhán, Karel; Stoffel, Markus

    2015-05-01

    Different approaches and thresholds have been utilized in the past to date landslides with growth ring series of disturbed trees. Past work was mostly based on conifer species because of their well-defined ring boundaries and the easy identification of compression wood after stem tilting. More recently, work has been expanded to include broad-leaved trees, which are thought to produce fewer and less evident reactions after landsliding. This contribution reviews recent progress made in dendrogeomorphic landslide analysis and introduces a new approach in which landslides are dated via ring eccentricity formed after tilting. We compare results of this new and the more conventional approaches. In addition, the paper also addresses tree sensitivity to landslide disturbance as a function of tree age and trunk diameter using 119 common beech (Fagus sylvatica L.) and 39 Crimean pine (Pinus nigra ssp. pallasiana) trees growing on two landslide bodies. The landslide events reconstructed with the classical approach (reaction wood) also appear as events in the eccentricity analysis, but the inclusion of eccentricity clearly allowed for more (162%) landslides to be detected in the tree-ring series. With respect to tree sensitivity, conifers and broad-leaved trees show the strongest reactions to landslides at ages between 40 and 60 years, with a second phase of increased sensitivity in P. nigra at ages of ca. 120-130 years. These phases of highest sensitivities correspond with trunk diameters at breast height of 6-8 and 18-22 cm, respectively (P. nigra). This study thus calls for the inclusion of eccentricity analyses in future landslide reconstructions as well as for the selection of trees belonging to different age and diameter classes to allow for a well-balanced and more complete reconstruction of past events.

  10. Corpus Callosum MR Image Classification

    NASA Astrophysics Data System (ADS)

    Elsayed, A.; Coenen, F.; Jiang, C.; García-Fiñana, M.; Sluming, V.

    An approach to classifying Magnetic Resonance (MR) image data is described. The specific application is the classification of MRI scan data according to the nature of the corpus callosum; however, the approach has more general applicability. A variation of the “spectral segmentation with multi-scale graph decomposition” mechanism is introduced. The result of the segmentation is stored in a quad-tree data structure to which a weighted variation (also developed by the authors) of the gSpan algorithm is applied to identify frequent sub-trees. As a result the images are expressed as a set of frequent sub-trees. There may be a great many of these and thus a decision tree based feature reduction technique is applied before classification takes place. The results show that the proposed approach performs both efficiently and effectively, obtaining a classification accuracy of over 95% in the case of the given application.

  11. Pattern Recognition Approaches for Breast Cancer DCE-MRI Classification: A Systematic Review.

    PubMed

    Fusco, Roberta; Sansone, Mario; Filice, Salvatore; Carone, Guglielmo; Amato, Daniela Maria; Sansone, Carlo; Petrillo, Antonella

    2016-01-01

    We performed a systematic review of several pattern analysis approaches for classifying breast lesions using dynamic, morphological, and textural features in dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI). Several machine learning approaches, namely artificial neural networks (ANN), support vector machines (SVM), linear discriminant analysis (LDA), tree-based classifiers (TC), and Bayesian classifiers (BC), and features used for classification are described. The findings of a systematic review of 26 studies are presented. The sensitivity and specificity are respectively 91 and 83 % for ANN, 85 and 82 % for SVM, 96 and 85 % for LDA, 92 and 87 % for TC, and 82 and 85 % for BC. The sensitivity and specificity are respectively 82 and 74 % for dynamic features, 93 and 60 % for morphological features, 88 and 81 % for textural features, 95 and 86 % for a combination of dynamic and morphological features, and 88 and 84 % for a combination of dynamic, morphological, and other features. LDA and TC have the best performance. A combination of dynamic and morphological features gives the best performance.

  12. A Cladistic Approach for the Classification of Oligotrichid Ciliates (Ciliophora: Spirotricha)

    PubMed Central

    AGATHA, Sabine

    2010-01-01

    Summary Currently, gene sequence genealogies of the Oligotrichea Bütschli, 1889 comprise only few species. Therefore, a cladistic approach, especially to the Oligotrichida, was made, applying Hennig's method and computer programs. Twenty-three characters were selected and discussed, i.e., the morphology of the oral apparatus (five characters), the somatic ciliature (eight characters), special organelles (four characters), and ontogenetic particulars (six characters). Nine of these characters developed convergently twice. Although several new features were included into the analyses, the cladograms match other morphological trees in the monophyly of the Oligotrichea, Halteriia, Oligotrichia, Oligotrichida, and Choreotrichida. The main synapomorphies of the Oligotrichea are the enantiotropic division mode and the de novo-origin of the undulating membranes. Although the sister group relationship of the Halteriia and the Oligotrichia contradicts results obtained by gene sequence analyses, no morphologic, ontogenetic or ultrastructural features were found, which support a branching of Halteria grandinella within the Stichotrichida. The cladistic approaches suggest paraphyly of the family Strombidiidae probably due to the scarce knowledge. A revised classification of the Oligotrichea is suggested, including all sufficiently known families and genera. PMID:20396404

  13. Narrowing historical uncertainty: probabilistic classification of ambiguously identified tree species in historical forest survey data

    USGS Publications Warehouse

    Mladenoff, D.J.; Dahir, S.E.; Nordheim, E.V.; Schulte, L.A.; Guntenspergen, G.R.

    2002-01-01

    Historical data have increasingly become appreciated for insight into the past conditions of ecosystems. Uses of such data include assessing the extent of ecosystem change; deriving ecological baselines for management, restoration, and modeling; and assessing the importance of past conditions on the composition and function of current systems. One historical data set of this type is the Public Land Survey (PLS) of the United States General Land Office, which contains data on multiple tree species, sizes, and distances recorded at each survey point, located at half-mile (0.8 km) intervals on a 1-mi (1.6 km) grid. This survey method was begun in the 1790s on US federal lands extending westward from Ohio. Thus, the data have the potential of providing a view of much of the US landscape from the mid-1800s, and they have been used extensively for this purpose. However, historical data sources, such as those describing the species composition of forests, can often be limited in the detail recorded and the reliability of the data, since the information was often not originally recorded for ecological purposes. Forest trees are sometimes recorded ambiguously, using generic or obscure common names. For the PLS data of northern Wisconsin, USA, we developed a method to classify ambiguously identified tree species using logistic regression analysis, using data on trees that were clearly identified to species and a set of independent predictor variables to build the models. The models were first created on partial data sets for each species and then tested for fit against the remaining data. Validations were conducted using repeated, random subsets of the data. Model prediction accuracy ranged from 81% to 96% in differentiating congeneric species among oak, pine, ash, maple, birch, and elm. Major predictor variables were tree size, associated species, landscape classes indicative of soil type, and spatial location within the study region. Results help to clarify ambiguities

  14. Tree mortality based fire severity classification for forest inventories: A Pacific Northwest national forests example

    Treesearch

    Thomas R. Whittier; Andrew N. Gray

    2016-01-01

    Determining how the frequency, severity, and extent of forest fires are changing in response to changes in management and climate is a key concern in many regions where fire is an important natural disturbance. In the USA the only national-scale fire severity classification uses satellite image change detection to produce maps for large (>400 ha) fires, and is...

  15. A discriminant function approach to ecological site classification in northern New England

    Treesearch

    James M. Fincher; Marie-Louise Smith

    1994-01-01

    Describes one approach to ecologically based classification of upland forest community types of the White and Green Mountain physiographic regions. The classification approach is based on an intensive statistical analysis of the relationship between the communities and soil-site factors. Discriminant functions useful in distinguishing between types based on soil-site...

  16. Potential of Full Waveform Airborne Laser Scanning Data for Urban Area Classification - Transfer of Classification Approaches Between Missions

    NASA Astrophysics Data System (ADS)

    Tran, G.; Nguyen, D.; Milenkovic, M.; Pfeifer, N.

    2015-04-01

    Full-waveform (FWF) LiDAR (Light Detection and Ranging) systems have the advantage of recording the entire backscattered signal of each emitted laser pulse, compared to conventional airborne discrete-return laser scanner systems. FWF systems can provide point clouds which contain extra attributes such as amplitude and echo width. In this study, an FWF dataset collected in 2010 for Eisenstadt, a city in the eastern part of Austria, was used to classify four main classes: buildings, trees, waterbody and ground, by employing a decision tree. Point density, echo ratio, echo width, normalised digital surface model and point cloud roughness are the main inputs for classification. The accuracy of the final results, expressed as correctness and completeness measures, was assessed by comparing the classified output to a knowledge-based labelling of the points. Completeness and correctness between 90% and 97% were reached, depending on the class. While such results and methods were presented before, we additionally investigate the transferability of the classification method (features, thresholds ...) to another urban FWF lidar point cloud. Our conclusions are that, of the features used, only echo width requires new thresholds. A data-driven adaptation of thresholds is suggested.

  17. Single-cell approaches for molecular classification of endocrine tumors

    PubMed Central

    Koh, James; Allbritton, Nancy L.; Sosa, Julie A.

    2015-01-01

    Purpose of review In this review, we summarize recent developments in single-cell technologies that can be employed for the functional and molecular classification of endocrine cells in normal and neoplastic tissue. Recent findings The emergence of new platforms for the isolation, analysis, and dynamic assessment of individual cell identity and reactive behavior enables experimental deconstruction of intratumoral heterogeneity and other contexts, where variability in cell signaling and biochemical responsiveness inform biological function and clinical presentation. These tools are particularly appropriate for examining and classifying endocrine neoplasias, as the clinical sequelae of these tumors are often driven by disrupted hormonal responsiveness secondary to compromised cell signaling. Single-cell methods allow for multidimensional experimental designs incorporating both spatial and temporal parameters with the capacity to probe dynamic cell signaling behaviors and kinetic response patterns dependent upon sequential agonist challenge. Summary Intratumoral heterogeneity in the provenance, composition, and biological activity of different forms of endocrine neoplasia presents a significant challenge for prognostic assessment. Single-cell technologies provide an array of powerful new approaches uniquely well suited for dissecting complex endocrine tumors. Studies examining the relationship between clinical behavior and tumor compositional variations in cellular activity are now possible, providing new opportunities to deconstruct the underlying mechanisms of endocrine neoplasia. PMID:26632769

  18. One or Two Dimensions in Spontaneous Classification: A Simplicity Approach

    ERIC Educational Resources Information Center

    Pothos, Emmanuel M.; Close, James

    2008-01-01

    When participants are asked to spontaneously categorize a set of items, they typically produce unidimensional classifications, i.e., categorize the items on the basis of only one of their dimensions of variation. We examine whether it is possible to predict unidimensional vs. two-dimensional classification on the basis of the abstract stimulus…

  20. Evaluation of current approaches to stream classification and a heuristic guide to developing classifications of integrated aquatic networks.

    PubMed

    Melles, S J; Jones, N E; Schmidt, B J

    2014-03-01

    Conservation and management of fresh flowing waters involves evaluating and managing effects of cumulative impacts on the aquatic environment from disturbances such as: land use change, point and nonpoint source pollution, the creation of dams and reservoirs, mining, and fishing. To assess effects of these changes on associated biotic communities it is necessary to monitor and report on the status of lotic ecosystems. A variety of stream classification methods are available to assist with these tasks, and such methods attempt to provide a systematic approach to modeling and understanding complex aquatic systems at various spatial and temporal scales. Of the vast number of approaches that exist, it is useful to group them into three main types. The first involves modeling longitudinal species turnover patterns within large drainage basins and relating these patterns to environmental predictors collected at reach and upstream catchment scales; the second uses regionalized hierarchical classification to create multi-scale, spatially homogenous aquatic ecoregions by grouping adjacent catchments together based on environmental similarities; and the third approach groups sites together on the basis of similarities in their environmental conditions both within and between catchments, independent of their geographic location. We review the literature with a focus on more recent classifications to examine the strengths and weaknesses of the different approaches. We identify gaps or problems with the current approaches, and we propose an eight-step heuristic process that may assist with development of more flexible and integrated aquatic classifications based on the current understanding, network thinking, and theoretical underpinnings.

  1. Evaluation of Current Approaches to Stream Classification and a Heuristic Guide to Developing Classifications of Integrated Aquatic Networks

    NASA Astrophysics Data System (ADS)

    Melles, S. J.; Jones, N. E.; Schmidt, B. J.

    2014-03-01

    Conservation and management of fresh flowing waters involves evaluating and managing effects of cumulative impacts on the aquatic environment from disturbances such as: land use change, point and nonpoint source pollution, the creation of dams and reservoirs, mining, and fishing. To assess effects of these changes on associated biotic communities it is necessary to monitor and report on the status of lotic ecosystems. A variety of stream classification methods are available to assist with these tasks, and such methods attempt to provide a systematic approach to modeling and understanding complex aquatic systems at various spatial and temporal scales. Of the vast number of approaches that exist, it is useful to group them into three main types. The first involves modeling longitudinal species turnover patterns within large drainage basins and relating these patterns to environmental predictors collected at reach and upstream catchment scales; the second uses regionalized hierarchical classification to create multi-scale, spatially homogenous aquatic ecoregions by grouping adjacent catchments together based on environmental similarities; and the third approach groups sites together on the basis of similarities in their environmental conditions both within and between catchments, independent of their geographic location. We review the literature with a focus on more recent classifications to examine the strengths and weaknesses of the different approaches. We identify gaps or problems with the current approaches, and we propose an eight-step heuristic process that may assist with development of more flexible and integrated aquatic classifications based on the current understanding, network thinking, and theoretical underpinnings.

  2. Terrain Classification and Identification of Tree Stems Using Ground-Based Lidar

    DTIC Science & Technology

    2012-12-01

    enhance UGV localization (Vandapel et al., 2006; Carle et al., 2010). In the forestry domain, tree stem modeling can be used for estimating biomass ...Surface lidar remote sensing of basal area and biomass in deciduous forests of eastern Maryland, USA. Remote Sensing of Environment, 67, 83-98...Microsoft Research. Sharma, M., Parton, J. (2009). Modeling Stand Density Effects on Taper for Jack Pine and Black Spruce Plantations Using

  3. Automated detection and classification of lunar craters using multiple approaches

    NASA Astrophysics Data System (ADS)

    Sawabe, Y.; Matsunaga, T.; Rokugawa, S.

    Many missions such as Clementine and SELENE (SELenological and Engineering Explorer) take lunar images for examination. A large volume of imagery data has already been archived and much more is on the way. Extracting the necessary information from the already large and ever growing volume of data is the crucial problem that needs to be overcome. Craters are studied extensively since they provide us with the relative age of the surface unit and more information on the lunar surface geology. Manually extracting craters from lunar images is a difficult task because it requires a great deal of man power as well as specific knowledge and skills of extraction. Several automated crater detection algorithms have been developed but none is yet practical or sufficiently tested to be reliable. Our previous algorithm (Sawabe, Y., Matsunaga, T., Rokugawa, S. Automatic crater detection algorithm for the lunar surface using multiple approaches. J. Remote Sens. Soc. Jpn. 25 (2), 157-168, 2005.) was improved to enhance detection of craters in lunar images and automate crater classification. This algorithm was tested using various images for a wide range of applicability. Four approaches were used with the crater detecting algorithm to find (1) “shady and sunny” patterns in images with low sun angle, (2) circular features in edge images, (3) curves and circles in thinned and connected edge lines, and (4) discrete or broken circular edge lines using fuzzy Hough transform. The algorithm was applied to mare and highland images of the moon captured by Clementine and Apollo under different solar angles and spatial resolution. The new algorithm was able to detect 80% more without parameter tuning. In addition, the detected craters were classified by spectral characteristics derived from Clementine UV-Vis multi-spectral images. Finally, the lunar surface GIS was formulated which has the geological and spectral attributes automatically generated by our algorithm. It could be helpful

  4. The Iqmulus Urban Showcase: Automatic Tree Classification and Identification in Huge Mobile Mapping Point Clouds

    NASA Astrophysics Data System (ADS)

    Böhm, J.; Bredif, M.; Gierlinger, T.; Krämer, M.; Lindenberg, R.; Liu, K.; Michel, F.; Sirmacek, B.

    2016-06-01

    Current 3D data capturing as implemented on, for example, airborne or mobile laser scanning systems is able to efficiently sample the surface of a city by billions of unselective points during one working day. What is still difficult is to extract and visualize meaningful information hidden in these point clouds with the same efficiency. This is where the FP7 IQmulus project enters the scene. IQmulus is an interactive facility for processing and visualizing big spatial data. In this study the potential of IQmulus is demonstrated on a laser mobile mapping point cloud of 1 billion points sampling ~ 10 km of street environment in Toulouse, France. After the data is uploaded to the IQmulus Hadoop Distributed File System, a workflow is defined by the user consisting of retiling the data followed by a PCA driven local dimensionality analysis, which runs efficiently on the IQmulus cloud facility using a Spark implementation. Points scattering in 3 directions are clustered in the tree class, and are separated next into individual trees. Five hours of processing on the 12-node computing cluster results in the automatic identification of 4000+ urban trees. Visualization of the results in the IQmulus fat client helps users to appreciate the results, and developers to identify remaining flaws in the processing workflow.
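    A PCA-driven local dimensionality analysis of the kind mentioned in this abstract might look like the sketch below. It follows one common formulation (eigenvalues of the neighbourhood covariance turned into linear, planar and scatter scores); whether IQmulus uses exactly these scores is an assumption, and the function names are hypothetical. Neighbourhood search and the Spark parallelisation are left out.

```python
import numpy as np

def dimensionality_features(neighborhood):
    """Return (linear, planar, scatter) scores for an (n_points, 3) neighbourhood.

    With sigma1 >= sigma2 >= sigma3 the square roots of the covariance
    eigenvalues, the scores are (s1 - s2)/s1, (s2 - s3)/s1 and s3/s1,
    which sum to 1.
    """
    pts = np.asarray(neighborhood, dtype=float)
    cov = np.cov(pts, rowvar=False)
    eigvals = np.linalg.eigvalsh(cov)[::-1]          # descending order
    s1, s2, s3 = np.sqrt(np.clip(eigvals, 0.0, None))
    if s1 == 0.0:                                    # all points coincide
        return np.zeros(3)
    return np.array([(s1 - s2) / s1, (s2 - s3) / s1, s3 / s1])

def dimensionality_label(neighborhood):
    """Label the neighbourhood by its largest score."""
    labels = ("linear", "planar", "scatter")
    return labels[int(np.argmax(dimensionality_features(neighborhood)))]
```

    Under this scheme, points labelled "scatter" (spread in all three directions) would be the candidates for the tree class, while "planar" points typically belong to facades or ground and "linear" points to poles or wires.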

  5. Risk Factors Predicting Infectious Lactational Mastitis: Decision Tree Approach versus Logistic Regression Analysis.

    PubMed

    Fernández, Leónides; Mediano, Pilar; García, Ricardo; Rodríguez, Juan M; Marín, María

    2016-09-01

    Objectives Lactational mastitis frequently leads to a premature abandonment of breastfeeding; its development has been associated with several risk factors. This study aims to use a decision tree (DT) approach to establish the main risk factors involved in mastitis and to compare its performance for predicting this condition with a stepwise logistic regression (LR) model. Methods Data from 368 cases (breastfeeding women with mastitis) and 148 controls were collected by a questionnaire about risk factors related to medical history of mother and infant, pregnancy, delivery, postpartum, and breastfeeding practices. The performance of the DT and LR analyses was compared using the area under the receiver operating characteristic (ROC) curve. Sensitivity, specificity and accuracy of both models were calculated. Results Cracked nipples, antibiotics and antifungal drugs during breastfeeding, infant age, breast pumps, familial history of mastitis and throat infection were significant risk factors associated with mastitis in both analyses. Bottle-feeding and milk supply were related to mastitis for certain subgroups in the DT model. The areas under the ROC curves were similar for LR and DT models (0.870 and 0.835, respectively). The LR model had better classification accuracy and sensitivity than the DT model, but the latter showed better specificity at the optimal threshold of each curve. Conclusions The DT and LR models constitute useful and complementary analytical tools to assess the risk of lactational infectious mastitis. The DT approach identifies high-risk subpopulations that need specific mastitis prevention programs and, therefore, it could be used to make the most of public health resources.
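    The comparison metrics named in this abstract can be computed as in the following sketch. It assumes binary labels (1 = case, 0 = control) and uses the rank-based (Mann-Whitney) form of the area under the ROC curve; the function names are hypothetical, and no data from the study are reproduced.

```python
def confusion_metrics(y_true, y_pred):
    """Sensitivity, specificity and accuracy from binary labels and predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),   # share of cases correctly flagged
        "specificity": tn / (tn + fp),   # share of controls correctly cleared
        "accuracy": (tp + tn) / len(pairs),
    }

def roc_auc(y_true, scores):
    """AUC as the probability that a random case outscores a random control.

    Ties between a case and a control count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

    Applying both functions to the predicted probabilities of a decision tree and of a logistic regression on the same held-out data would give a side-by-side comparison of the kind the abstract reports.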

  6. Simple, novel approaches to investigating biophysical characteristics of individual mid-latitude deciduous trees

    NASA Astrophysics Data System (ADS)

    Kalibo, Humphrey Wafula

    Forests play a critical role in the functioning of the biosphere and support the livelihoods of millions of people. With increasing anthropogenic influences and looming effects associated with climatic variability, it is crucial that the research community and policy makers take advantage of the capabilities afforded by remote sensing technologies to generate reliable and timely data to support management decisions. Set in the species-rich woodland of Prairie Pines in Lincoln, Nebraska, this research addresses three distinct objectives that could contribute towards forest research and management. First, three supervised classification algorithms were applied to two hyperspectral AISA-Eagle images to evaluate their capability for spectrally identifying selected tree species. The findings show that each algorithm had low to moderate overall classification accuracies (46%-62%), probably due to mixed pixels resulting from pronounced heterogeneity in tree diversity; however, the algorithms could be a rapid means to assess species composition. The second objective is an investigation into how twelve individual morphologically different deciduous trees transmit incoming photosynthetically active radiation (PAR) over the course of the growing season. It was found that more diffuse light was transmitted than direct light, dictated by seasonality, vegetation fraction (VF), and leaf size. In the final objective, VF derived from upward-looking hemispherical photographs of twelve deciduous tree canopies and eight spectral vegetation indices (VIs) calculated from in situ single leaf-level reflectance data were used to investigate whether the VIs could mimic and estimate the temporal patterns of measured VF of each tree over the growing season. The findings show that all the indices accurately depicted the temporal patterns of the photo-derived VF. 
NDVI and SAVI had the highest correlations (R² > 0.7; RMSE 0.7; E > 0.8) and closely mirrored the temporal patterns of VF for nine

  7. [Analysis of dietary pattern and diabetes mellitus influencing factors identified by classification tree model in adults of Fujian].

    PubMed

    Yu, F L; Ye, Y; Yan, Y S

    2017-05-10

    Objective: To identify dietary patterns and explore the relationship between environmental factors (especially dietary patterns) and diabetes mellitus in adults in Fujian. Methods: A multi-stage sampling method was used to survey residents aged ≥18 years by questionnaire, physical examination and laboratory testing at 10 disease surveillance points in Fujian. Factor analysis was used to identify the dietary patterns, a logistic regression model was applied to analyze the relationship between dietary patterns and diabetes mellitus, and a classification tree model was adopted to identify the influencing factors for diabetes mellitus. Results: There were four dietary patterns in the population: meat, plant, high-quality protein, and fried food and beverages. Logistic analysis showed that the plant pattern, which has higher factor loadings for fresh fruit-vegetables and cereal-tubers, was a protective factor against diabetes mellitus. The risk of diabetes mellitus in the population at the T2 and T3 levels of factor score was 0.727 (95%CI: 0.561-0.943) times and 0.736 (95%CI: 0.573-0.944) times, respectively, that of those whose factor score was in the lowest quartile. Thirteen influencing factors and eleven groups at high risk for diabetes mellitus were identified by the classification tree model. The influencing factors were dyslipidemia, age, family history of diabetes, hypertension, physical activity, career, sex, sedentary time, abdominal adiposity, BMI, marital status, sleep time and high-quality protein pattern. Conclusion: There is a close association between dietary patterns and diabetes mellitus. It is necessary to promote a healthy and reasonable diet, strengthen the monitoring and control of blood lipids, blood pressure and body weight, and maintain a good lifestyle for the prevention and control of diabetes mellitus.

  8. An efficient approach to 3D single tree-crown delineation in LiDAR data

    NASA Astrophysics Data System (ADS)

    Mongus, Domen; Žalik, Borut

    2015-10-01

    This paper proposes a new method for 3D delineation of single tree-crowns in LiDAR data by exploiting the complementarities of treetop and tree trunk detections. A unified mathematical framework is provided based on graph theory, allowing for all the segmentations to be achieved using marker-controlled watersheds. Treetops are defined by detecting concave neighbourhoods within the canopy height model using locally fitted surfaces. These serve as markers for watershed segmentation of the canopy layer, where possible oversegmentation is reduced by merging the regions based on their heights, areas, and shapes. Additional tree crowns are delineated from mid- and under-storey layers based on tree trunk detection. A new approach for estimating the verticalities of the points' distributions is proposed for this purpose. The watershed segmentation is then applied on a density function within the voxel space, while boundaries of delineated trees from the canopy layer are used to prevent the overspreading of regions. The experiments show an approximately 6% increase in the efficiency of the proposed treetop definition based on locally fitted surfaces in comparison with the traditionally used local maxima of the smoothed canopy height model. In addition, a 4% increase in efficiency is achieved by the proposed tree trunk detection. Although the tree trunk detection alone is dependent on the data density, when it is supplemented with the treetop detection the proposed approach is efficient even when dealing with low-density point clouds.
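
    For orientation, the baseline that the abstract compares against (local maxima of a canopy height model) can be sketched as below. This is not the proposed method based on locally fitted surfaces; the function name, the 3x3 window and the 2 m height cut-off are assumptions made for illustration, and the grid is taken to be already smoothed.

```python
def local_maxima(chm, min_height=2.0):
    """Return (row, col) cells that are strictly higher than all 8 neighbours.

    chm is a canopy height model given as a list of equal-length rows.
    Cells below min_height (an assumed cut-off) are ignored.
    """
    rows, cols = len(chm), len(chm[0])
    tops = []
    for r in range(rows):
        for c in range(cols):
            h = chm[r][c]
            if h < min_height:
                continue
            neighbours = [
                chm[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(h > n for n in neighbours):
                tops.append((r, c))
    return tops

# Toy 5x5 height grid in metres (hypothetical values).
toy_chm = [
    [0, 0, 0, 0, 0],
    [0, 5, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 8, 0],
    [0, 0, 0, 0, 1],
]
treetops = local_maxima(toy_chm)
```

    The detected cells would then serve as markers for a watershed segmentation of the canopy layer.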

  9. Subgrouping patients with low back pain: evolution of a classification approach to physical therapy.

    PubMed

    Fritz, Julie M; Cleland, Joshua A; Childs, John D

    2007-06-01

    The development of valid classification methods to assist the physical therapy management of patients with low back pain has been recognized as a research priority. There is also growing evidence that the use of a classification approach to physical therapy results in better clinical outcomes than the use of alternative management approaches. In 1995 Delitto and colleagues proposed a classification system intended to inform and direct the physical therapy management of patients with low back pain. The system described 4 classifications of patients with low back pain (manipulation, stabilization, specific exercise, and traction). Each classification could be identified by a unique set of examination criteria, and was associated with an intervention strategy believed to result in the best outcomes for the patient. The system was based on expert opinion and research evidence available at the time. A substantial amount of research has emerged in the years since the introduction of this classification system, including the development of clinical prediction rules, providing new evidence for the examination criteria used to place a patient into a classification and for the optimal intervention strategies for each classification. New evidence should continually be incorporated into existing classification systems. The purpose of this clinical commentary is to review this classification system, its evolution and current status, and to discuss its implications for the classification of patients with low back pain.

  10. Bridging process-based and empirical approaches to modeling tree growth

    Treesearch

    Harry T. Valentine; Annikki Makela

    2005-01-01

    The gulf between process-based and empirical approaches to modeling tree growth may be bridged, in part, by the use of a common model. To this end, we have formulated a process-based model of tree growth that can be fitted and applied in an empirical mode. The growth model is grounded in pipe model theory and an optimal control model of crown development. Together, the...

  11. Tree-Level Hydrodynamic Approach for Modeling Aboveground Water Storage and Stomatal Conductance Highlights the Effects of Tree Hydraulic Strategy

    NASA Astrophysics Data System (ADS)

    Mirfenderesgi, G.; Bohrer, G.; Matheny, A. M.; Fatichi, S.; Frasson, R. P. M.; Schafer, K. V.

    2016-12-01

    The Finite-difference Ecosystem-scale Tree-Crown Hydrodynamics model version 2 (FETCH2) is a novel tree-scale hydrodynamic model of transpiration. The FETCH2 model employs a finite difference numerical methodology and a simplified single-beam conduit system and simulates water flow through the tree as a continuum of porous media conduits. It explicitly resolves xylem water potential throughout the tree's vertical extent. Empirical equations relate water potential within the stem to stomatal conductance of the leaves at each height throughout the crown. While highly simplified, this approach brings additional realism to the simulation of transpiration by linking stomatal responses to stem water potential rather than directly to soil moisture, as is currently the case in the majority of land-surface models. FETCH2 accounts for plant hydraulic traits, such as the degree of anisohydric/isohydric response of stomata, maximal xylem conductivity, vertical distribution of leaf area, and maximal and minimal stem water content. We used FETCH2 along with sap flow and eddy covariance data sets collected from a mixed plot of two genera (oak/pine) in Silas Little Experimental Forest, NJ, USA, to conduct an analysis of the inter-genera variation of hydraulic strategies and their effects on diurnal and seasonal transpiration dynamics. We define these strategies through the parameters that describe the genus-level transpiration and xylem conductivity responses to changes in stem water potential. A virtual experiment showed that the model was able to capture the effect of hydraulic strategies such as isohydric/anisohydric behavior on stomatal conductance under different soil-water availability conditions. Our evaluation revealed that FETCH2 considerably improved the simulation of ecosystem transpiration and latent heat flux compared with more conventional models.

  12. Assessing College Student Interest in Math and/or Computer Science in a Cross-National Sample Using Classification and Regression Trees

    ERIC Educational Resources Information Center

    Kitsantas, Anastasia; Kitsantas, Panagiota; Kitsantas, Thomas

    2012-01-01

    The purpose of this exploratory study was to assess the relative importance of a number of variables in predicting students' interest in math and/or computer science. Classification and regression trees (CART) were employed in the analysis of survey data collected from 276 college students enrolled in two U.S. and Greek universities. The results…

  13. Assessment on the classification of landslide risk level using Genetic Algorithm of Operation Tree in central Taiwan

    NASA Astrophysics Data System (ADS)

    Wei, Chiang; Yeh, Hui-Chung; Chen, Yen-Chang

    2015-04-01

    This study assessed the classification of landslide areas by the Genetic Algorithm of Operation Tree (GAOT) in the upstream Chen-Yu-Lan River watershed of the National Taiwan University Experimental Forest (NTUEF) after Typhoon Morakot in 2009, using remotely sensed and geological data. Landslides covering 624.5 ha, which account for 1.9% of the total area, were delineated using thresholds of slope (22°) and area (1 hectare); 48 landslide sites were located in the upstream Chen-Yu-Lan watershed using FORMOSAT-II satellite imagery, aerial photos and GIS-related coverage. The five risk levels of these landslide areas were classified by area, elevation, slope order, aspect, erosion order and geological factor order using the Simplicity Method suggested in the Technical Regulations for Soil and Water Conservation of Taiwan. When all the landslide sites were considered, the classification accuracy of GAOT was 97.9%, superior to that of K-means, the Ward method, the Shared Nearest Neighbor method, the Maximum Likelihood Classifier and the Bayesian Classifier; when 36 sites were used as training samples and the remaining 12 sites were tested, the accuracy still reached 81.3%. More geological data, anthropogenic influences and hydrological factors may be necessary for clarifying the landslide areas, and the results may benefit the authorities' assessment for future correction and management.

  14. Multi-temporal remote sensing image classification - a multi-view approach

    SciTech Connect

    Chandola, Varun; Vatsavai, Raju

    2010-01-01

    Multispectral remote sensing images have been widely used for automated land use and land cover classification tasks. Thematic classification is often done using a single-date image; however, in many instances a single-date image is not informative enough to distinguish between different land cover types. In this paper we show how one can use multiple images, collected at different times of year (for example, during the crop growing season), to learn a better classifier. We propose two approaches, an ensemble of classifiers approach and a co-training based approach, and show how both of these methods outperform a straightforward stacked vector approach often used in multi-temporal image classification. Additionally, the co-training based method addresses the challenge of limited labeled training data in supervised classification, as this classification scheme utilizes a large number of unlabeled samples (which come for free) in conjunction with a small set of labeled training data.

  15. Fusion of LiDAR and aerial imagery for the estimation of downed tree volume using Support Vector Machines classification and region based object fitting

    NASA Astrophysics Data System (ADS)

    Selvarajan, Sowmya

    The study classifies 3D small-footprint, full-waveform digitized LiDAR data fused with aerial imagery to identify downed trees using the Support Vector Machines (SVM) algorithm. Using small-footprint waveform LiDAR, airborne LiDAR systems can provide better canopy penetration and very high spatial resolution. The small-footprint waveform scanner system Riegl LMS-Q680, together with an UltraCamX aerial camera, is used to measure and map downed trees in a forest. The various data preprocessing steps helped to identify ground points in the dense LiDAR dataset and to segment the LiDAR data in order to reduce the complexity of the algorithm. The haze filtering process helped to differentiate the spectral signatures of the various classes within the aerial image. Such processes helped to better select the features from both sensor data. Six features are utilized: LiDAR height, LiDAR intensity, LiDAR echo, and three image intensities. To do so, LiDAR derived, aerial image derived and fused LiDAR-aerial image derived features are used to organize the data for the SVM hypothesis formulation. Several variations of the SVM algorithm with different kernels and soft margin parameter C are tested. The algorithm is implemented to classify downed trees over a pine tree zone. The LiDAR derived features provided an overall accuracy of 98% of downed trees but with no classification error of 86%. The image derived features provided an overall accuracy of 65% and fusion derived features resulted in an overall accuracy of 88%. The results are observed to be stable and robust.
The SVM accuracies were accompanied by high false alarm rates, with the LiDAR classification producing 58.45%, image classification producing 95.74% and finally the fused classification producing 93% false alarm rates. The Canny edge correction filter helped control the LiDAR false alarm to 35.99%, image false alarm to 48.56% and fused false alarm to 37.69%. The implemented classifiers provided a powerful tool for

  16. Integrated Analysis of Tropical Trees Growth: A Multivariate Approach

    PubMed Central

    YÁÑEZ-ESPINOSA, LAURA; TERRAZAS, TERESA; LÓPEZ-MATA, LAURO

    2006-01-01

    • Background and Aims One of the problems analysing cause–effect relationships of growth and environmental factors is that a single factor could be correlated with other ones directly influencing growth. One attempt to understand tropical trees' growth cause–effect relationships is integrating research about anatomical, physiological and environmental factors that influence growth in order to develop mathematical models. The relevance is to understand the nature of the process of growth and to model this as a function of the environment. • Methods The relationships of Aphananthe monoica, Pleuranthodendron lindenii and Psychotria costivenia radial growth and phenology with environmental factors (local climate, vertical strata microclimate and physical and chemical soil variables) were evaluated from April 2000 to September 2001. The association among these groups of variables was determined by generalized canonical correlation analysis (GCCA), which considers the probable associations of three or more data groups and the selection of the most important variables for each data group. • Key Results The GCCA allowed determination of a general model of relationships among tree phenology and radial growth with climate, microclimate and soil factors. A strong influence of climate in phenology and radial growth existed. Leaf initiation and cambial activity periods were associated with maximum temperature and day length, and vascular tissue differentiation with soil moisture and rainfall. The analyses of individual species detected different relationships for the three species. • Conclusions The analyses of the individual species suggest that each one takes advantage in a different way of the environment in which they are growing, allowing them to coexist. PMID:16822807

  17. RAVEN: Dynamic Event Tree Approach Level III Milestone

    SciTech Connect

    Andrea Alfonsi; Cristian Rabiti; Diego Mandelli; Joshua Cogliati; Robert Kinoshita

    2013-07-01

    Conventional Event-Tree (ET) based methodologies are extensively used as tools to perform reliability and safety assessment of complex and critical engineering systems. One of the disadvantages of these methods is that timing/sequencing of events and system dynamics are not explicitly accounted for in the analysis. In order to overcome these limitations several techniques, also known as Dynamic Probabilistic Risk Assessment (DPRA), have been developed. Monte-Carlo (MC) and Dynamic Event Tree (DET) are two of the most widely used DPRA methodologies to perform safety assessment of Nuclear Power Plants (NPP). In the past two years, the Idaho National Laboratory (INL) has developed its own tool to perform Dynamic PRA: RAVEN (Reactor Analysis and Virtual control ENvironment). RAVEN has been designed to perform two main tasks: 1) control logic driver for the new Thermo-Hydraulic code RELAP-7 and 2) post-processing tool. In the first task, RAVEN acts as a deterministic controller in which the set of control logic laws (user defined) monitors the RELAP-7 simulation and controls the activation of specific systems. Moreover, the control logic infrastructure is used to model stochastic events, such as component failures, and perform uncertainty propagation. Such stochastic modeling is deployed using both MC and DET algorithms. In the second task, RAVEN processes the large amount of data generated by RELAP-7 using data-mining based algorithms. This report focuses on the analysis of dynamic stochastic systems using the newly developed RAVEN DET capability. As an example, a DPRA analysis, using DET, of a simplified pressurized water reactor for a Station Black-Out (SBO) scenario is presented.
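
    The idea behind a dynamic event tree, where branching is triggered by the evolving system state rather than fixed in advance, can be illustrated with a toy model. The sketch below is not RAVEN or RELAP-7 code and does not use their APIs; the temperature model, the `SafetySystem` class, the setpoints and the probabilities are all hypothetical.

```python
from dataclasses import dataclass

HEATING = 10.0  # hypothetical temperature rise per time step with no cooling

@dataclass(frozen=True)
class SafetySystem:
    name: str
    setpoint: float    # temperature at which the system is demanded
    p_success: float   # probability that it starts on demand
    cooling: float     # temperature removed per step once running

def explore(temp, step, n_steps, pending, cooling=0.0, prob=1.0, events=()):
    """Depth-first walk of the tree; yields (events, probability, final_temp)."""
    while step < n_steps:
        demanded = [s for s in pending if temp >= s.setpoint]
        if demanded:
            system = demanded[0]
            rest = tuple(s for s in pending if s is not system)
            # Branch 1: the system starts and adds its cooling capacity.
            yield from explore(temp, step, n_steps, rest,
                               cooling + system.cooling,
                               prob * system.p_success,
                               events + ((step, system.name, "ok"),))
            # Branch 2: the system fails on demand.
            yield from explore(temp, step, n_steps, rest, cooling,
                               prob * (1.0 - system.p_success),
                               events + ((step, system.name, "fail"),))
            return
        temp += HEATING - cooling   # advance the deterministic dynamics
        step += 1
    yield events, prob, temp

system_a = SafetySystem("A", setpoint=30.0, p_success=0.9, cooling=10.0)
system_b = SafetySystem("B", setpoint=60.0, p_success=0.8, cooling=10.0)
leaves = list(explore(0.0, 0, 10, (system_a, system_b)))
```

    Each leaf carries the time-stamped sequence of events, its probability and the final state, so the timing of a demand follows from the simulated dynamics. A Monte Carlo approach would instead sample one branch at random per run.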

  18. RAVEN. Dynamic Event Tree Approach Level III Milestone

    SciTech Connect

    Alfonsi, Andrea; Rabiti, Cristian; Mandelli, Diego; Cogliati, Joshua; Kinoshita, Robert

    2014-07-01

    Conventional Event-Tree (ET) based methodologies are extensively used as tools to perform reliability and safety assessment of complex and critical engineering systems. One of the disadvantages of these methods is that timing/sequencing of events and system dynamics are not explicitly accounted for in the analysis. In order to overcome these limitations several techniques, also known as Dynamic Probabilistic Risk Assessment (DPRA), have been developed. Monte-Carlo (MC) and Dynamic Event Tree (DET) are two of the most widely used DPRA methodologies to perform safety assessment of Nuclear Power Plants (NPP). In the past two years, the Idaho National Laboratory (INL) has developed its own tool to perform Dynamic PRA: RAVEN (Reactor Analysis and Virtual control ENvironment). RAVEN has been designed to perform two main tasks: 1) control logic driver for the new Thermo-Hydraulic code RELAP-7 and 2) post-processing tool. In the first task, RAVEN acts as a deterministic controller in which the set of control logic laws (user defined) monitors the RELAP-7 simulation and controls the activation of specific systems. Moreover, the control logic infrastructure is used to model stochastic events, such as component failures, and perform uncertainty propagation. Such stochastic modeling is deployed using both MC and DET algorithms. In the second task, RAVEN processes the large amount of data generated by RELAP-7 using data-mining based algorithms. This report focuses on the analysis of dynamic stochastic systems using the newly developed RAVEN DET capability. As an example, a DPRA analysis, using DET, of a simplified pressurized water reactor for a Station Black-Out (SBO) scenario is presented.

  19. New Approaches to Object Classification in Synoptic Sky Surveys

    SciTech Connect

    Donalek, C.; Mahabal, A.; Djorgovski, S. G.; Marney, S.; Drake, A.; Glikman, E.; Graham, M. J.; Williams, R.

    2008-12-05

    Digital synoptic sky surveys pose several new object classification challenges. In surveys where real-time detection and classification of transient events is a science driver, there is a need for an effective elimination of instrument-related artifacts which can masquerade as transient sources in the detection pipeline, e.g., unremoved large cosmic rays, saturation trails, reflections, crosstalk artifacts, etc. We have implemented such an Artifact Filter, using a supervised neural network, for the real-time processing pipeline in the Palomar-Quest (PQ) survey. After the training phase, for each object it takes as input a set of measured morphological parameters and returns the probability of it being a real object. Despite the relatively low number of training cases for many kinds of artifacts, the overall artifact classification rate is around 90%, with no genuine transients misclassified during our real-time scans. Another question is how to assign an optimal star-galaxy classification in a multi-pass survey, where seeing and other conditions change between different epochs, potentially producing inconsistent classifications for the same object. We have implemented a star/galaxy multipass classifier that makes use of external and a priori knowledge to find the optimal classification from the individually derived ones. Both these techniques can be applied to other, similar surveys and data sets.

  20. Idiopathic interstitial pneumonias and emphysema: detection and classification using a texture-discriminative approach

    NASA Astrophysics Data System (ADS)

    Fetita, C.; Chang-Chien, K. C.; Brillet, P. Y.; Prêteux, F.; Chang, R. F.

    2012-03-01

    Our study aims at developing a computer-aided diagnosis (CAD) system for fully automatic detection and classification of pathological lung parenchyma patterns in idiopathic interstitial pneumonias (IIP) and emphysema using multi-detector computed tomography (MDCT). The proposed CAD system is based on three-dimensional (3-D) mathematical morphology, texture and fuzzy logic analysis, and can be divided into four stages: (1) a multi-resolution decomposition scheme based on a 3-D morphological filter was exploited to discriminate the lung region patterns at different analysis scales. (2) An additional spatial lung partitioning based on the lung tissue texture was introduced to reinforce the spatial separation between patterns extracted at the same resolution level in the decomposition pyramid. Then, (3) a hierarchic tree structure was exploited to describe the relationship between patterns at different resolution levels, and for each pattern, six fuzzy membership functions were established for assigning a probability of association with a normal tissue or a pathological target. Finally, (4) a decision step exploiting the fuzzy-logic assignments selects the target class of each lung pattern among the following categories: normal (N), emphysema (EM), fibrosis/honeycombing (FHC), and ground glass (GDG). According to a preliminary evaluation on an extended database, the proposed method can overcome the drawbacks of a previously developed approach and achieve higher sensitivity and specificity.

  1. Bayesian decision tree for the classification of the mode of motion in single-molecule trajectories.

    PubMed

    Türkcan, Silvan; Masson, Jean-Baptiste

    2013-01-01

    Membrane proteins move in heterogeneous environments with spatially (sometimes temporally) varying friction and with biochemical interactions with various partners. It is important to reliably distinguish different modes of motion to improve our knowledge of the membrane architecture and to understand the nature of interactions between membrane proteins and their environments. Here, we present an analysis technique for single molecule tracking (SMT) trajectories that can determine the preferred model of motion that best matches observed trajectories. The method is based on Bayesian inference to calculate the a posteriori probability of an observed trajectory according to a certain model. Information theory criteria, such as the Bayesian information criterion (BIC), the Akaike information criterion (AIC), and modified AIC (AICc), are used to select the preferred model. The considered group of models includes free Brownian motion, and confined motion in 2nd or 4th order potentials. We determine the best information criteria for classifying trajectories. We tested its limits through simulations matching large sets of experimental conditions and we built a decision tree. This decision tree first uses the BIC to distinguish between free Brownian motion and confined motion. In a second step, it classifies the confining potential further using the AIC. We apply the method to experimental Clostridium perfringens ε-toxin (CPεT) receptor trajectories to show that these receptors are confined by a spring-like potential. An adaptation of this technique was applied on a sliding window in the temporal dimension along the trajectory. We applied this adaptation to experimental CPεT trajectories that lose confinement due to disaggregation of confining domains. This new technique adds another dimension to the discussion of SMT data.
The mode of motion of a receptor might hold more biologically relevant information than the diffusion

  2. Bayesian Decision Tree for the Classification of the Mode of Motion in Single-Molecule Trajectories

    PubMed Central

    Türkcan, Silvan; Masson, Jean-Baptiste

    2013-01-01

    Membrane proteins move in heterogeneous environments with spatially (sometimes temporally) varying friction and with biochemical interactions with various partners. It is important to reliably distinguish different modes of motion to improve our knowledge of the membrane architecture and to understand the nature of interactions between membrane proteins and their environments. Here, we present an analysis technique for single molecule tracking (SMT) trajectories that can determine the preferred model of motion that best matches observed trajectories. The method is based on Bayesian inference to calculate the a posteriori probability of an observed trajectory according to a certain model. Information theory criteria, such as the Bayesian information criterion (BIC), the Akaike information criterion (AIC), and modified AIC (AICc), are used to select the preferred model. The considered group of models includes free Brownian motion, and confined motion in 2nd or 4th order potentials. We determine the best information criteria for classifying trajectories. We tested its limits through simulations matching large sets of experimental conditions and we built a decision tree. This decision tree first uses the BIC to distinguish between free Brownian motion and confined motion. In a second step, it classifies the confining potential further using the AIC. We apply the method to experimental Clostridium perfringens ε-toxin (CPεT) receptor trajectories to show that these receptors are confined by a spring-like potential. An adaptation of this technique was applied on a sliding window in the temporal dimension along the trajectory. We applied this adaptation to experimental CPεT trajectories that lose confinement due to disaggregation of confining domains. This new technique adds another dimension to the discussion of SMT data.
The mode of motion of a receptor might hold more biologically relevant information than the diffusion coefficient or domain size and may be a better tool to
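
    The two-step decision tree described in this abstract (BIC first, then AIC) can be sketched with the standard definitions of the criteria. The model names, parameter counts and log-likelihood values below are hypothetical placeholders, and comparing the Brownian model against the best confined model in step 1 is an assumption about how the comparison is set up.

```python
import math

def aic(log_lik, k):
    """Akaike information criterion for k fitted parameters."""
    return 2 * k - 2 * log_lik

def aicc(log_lik, k, n):
    """Small-sample corrected AIC for n observations."""
    return aic(log_lik, k) + 2 * k * (k + 1) / (n - k - 1)

def bic(log_lik, k, n):
    """Bayesian information criterion for n observations."""
    return k * math.log(n) - 2 * log_lik

def classify_trajectory(fits, n):
    """fits maps a model name to (maximised log-likelihood, parameter count)."""
    confined = ("harmonic", "quartic")   # 2nd- and 4th-order potentials
    # Step 1: BIC separates free Brownian motion from confined motion.
    best_confined_bic = min(bic(*fits[m], n) for m in confined)
    if bic(*fits["brownian"], n) <= best_confined_bic:
        return "brownian"
    # Step 2: AIC chooses between the two confining potentials.
    return min(confined, key=lambda m: aic(*fits[m]))

# Hypothetical fits for a 500-point trajectory.
example_fits = {
    "brownian": (-1200.0, 1),
    "harmonic": (-1100.0, 3),
    "quartic": (-1098.0, 3),
}
mode = classify_trajectory(example_fits, n=500)
```

    Lower criterion values indicate the preferred model. Applying the same function to successive windows of a trajectory would give a rough analogue of the sliding-window adaptation mentioned in the abstract.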

  3. A machine learning approach for viral genome classification.

    PubMed

    Remita, Mohamed Amine; Halioui, Ahmed; Malick Diouara, Abou Abdallah; Daigle, Bruno; Kiani, Golrokh; Diallo, Abdoulaye Baniré

    2017-04-11

    Advances in cloning and sequencing technology are yielding a massive number of viral genomes. The classification and annotation of these genomes constitute important assets in the discovery of genomic variability, taxonomic characteristics and disease mechanisms. Existing classification methods are often designed for specific well-studied families of viruses. Thus, viral comparative genomic studies could benefit from more generic, fast and accurate tools for classifying and typing newly sequenced strains of diverse virus families. Here, we introduce a virus classification platform, CASTOR, based on machine learning methods. CASTOR is inspired by a well-known technique in molecular biology: restriction fragment length polymorphism (RFLP). It simulates, in silico, the restriction digestion of genomic material by different enzymes into fragments. It uses two metrics to construct feature vectors for machine learning algorithms in the classification step. We benchmark CASTOR for the classification of distinct datasets of human papillomaviruses (HPV), hepatitis B viruses (HBV) and human immunodeficiency viruses type 1 (HIV-1). Results reveal true positive rates of 99%, 99% and 98% for HPV Alpha species, HBV genotyping and HIV-1 M subtyping, respectively. Furthermore, CASTOR shows a competitive performance compared to well-known HIV-1 specific classifiers (REGA and COMET) on whole genomes and pol fragments. The performance of CASTOR, its genericity and robustness could enable novel and accurate large-scale virus studies. The CASTOR web platform provides open-access, collaborative and reproducible machine learning classifiers. CASTOR can be accessed at http://castor.bioinfo.uqam.ca.
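
    An in silico restriction digestion of the kind this abstract describes can be sketched as follows. This is not CASTOR code: the abstract does not name its two metrics, so the cut count and mean fragment length used here are placeholder features, the helper names are hypothetical, and the sequence is treated as linear.

```python
# Recognition site and cut offset within the site (G^AATTC, G^GATCC).
ENZYMES = {
    "EcoRI": ("GAATTC", 1),
    "BamHI": ("GGATCC", 1),
}

def cut_positions(seq, site, offset):
    """Return the positions at which the enzyme cuts a linear sequence."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + offset)
        start = seq.find(site, start + 1)
    return positions

def fragment_lengths(seq, site, offset):
    """Return the lengths of the fragments left after digestion."""
    bounds = [0] + cut_positions(seq, site, offset) + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]

def feature_vector(seq, enzymes=ENZYMES):
    """Placeholder features: cut count and mean fragment length per enzyme."""
    features = []
    for site, offset in enzymes.values():
        lengths = fragment_lengths(seq, site, offset)
        features.extend([len(lengths) - 1, sum(lengths) / len(lengths)])
    return features

toy_sequence = "AAGAATTCTTGGATCCAAGAATTCAA"  # hypothetical 26-base sequence
ecori_fragments = fragment_lengths(toy_sequence, *ENZYMES["EcoRI"])
bamhi_fragments = fragment_lengths(toy_sequence, *ENZYMES["BamHI"])
```

    Feature vectors built this way for many genomes could then be passed to any standard classifier.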

  4. Staged Approach for Rehabilitation Classification: Shoulder Disorders (STAR-Shoulder).

    PubMed

    McClure, Philip W; Michener, Lori A

    2015-05-01

    Shoulder disorders are a common musculoskeletal problem causing pain and functional loss. Traditionally, diagnostic categories are based on a pathoanatomic medical model aimed at identifying the pathologic tissues. However, the pathoanatomic model may not provide diagnostic categories that effectively guide treatment decision making in rehabilitation. An expanded classification system is proposed that includes the pathoanatomic diagnosis and a rehabilitation classification based on tissue irritability and identified impairments. For the rehabilitation classification, 3 levels of irritability are proposed and defined, with corresponding strategies guiding intensity of treatment based on the physical stress theory. Common impairments are identified and are used to guide specific intervention tactics with varying levels of intensity. The proposed system is conceptual and needs to be tested for reliability and validity. This classification system may be useful clinically for guiding rehabilitation intervention and provides a potential method of identifying relevant subgroups in future research studies. Although the system was developed for and applied to shoulder disorders, it may be applicable to classification and rehabilitation of musculoskeletal disorders in other body regions. © 2015 American Physical Therapy Association.

  5. Metabarcoding of marine nematodes - evaluation of reference datasets used in tree-based taxonomy assignment approach.

    PubMed

    Holovachov, Oleksandr

    2016-01-01

    Metabarcoding is becoming a common tool used to assess and compare the diversity of organisms in environmental samples. Identification of OTUs is one of the critical steps in the process, and several taxonomy assignment methods have been proposed to accomplish this task. This publication evaluates the quality of reference datasets, along with several alignment and phylogeny inference methods used in one of the taxonomy assignment methods, called the tree-based approach. This approach assigns anonymous OTUs to taxonomic categories based on the relative placements of OTUs and reference sequences on the cladogram and the support that these placements receive. In the tree-based taxonomy assignment approach, reliable identification of anonymous OTUs is based on their placement in monophyletic and highly supported clades together with identified reference taxa. It therefore requires a high-quality reference dataset. Resolution of phylogenetic trees is strongly affected by the presence of erroneous sequences as well as by the alignment and phylogeny inference methods used in the process. Two preparation steps are essential for the successful application of the tree-based taxonomy assignment approach. First, curated collections of genetic information do include erroneous sequences. These sequences have a detrimental effect on the resolution of cladograms used in the tree-based approach, so they must be identified and excluded from the reference dataset beforehand. Second, various combinations of multiple sequence alignment and phylogeny inference methods provide cladograms with different topology and bootstrap support. These combinations of methods need to be tested in order to determine the one that gives the highest resolution for the particular reference dataset. Completing the above-mentioned preparation steps is expected to decrease the number of unassigned OTUs and thus improve the results of the tree-based taxonomy assignment approach.

  6. A new multi criteria classification approach in a multi agent system applied to SEEG analysis

    PubMed Central

    Kinie, Abel; Ndiaye, Mamadou Lamine L.; Montois, Jean-Jacques; Jacquelet, Yann

    2007-01-01

    This work focuses on the study of the organization of SEEG signals during epileptic seizures using a multi-agent system approach. This approach is based on cooperative mechanisms of auto-organization at the micro level and on the emergence of a global function at the macro level. To evaluate this approach, we propose a distributed collaborative approach for the classification of the signals of interest. This new multi-criteria classification method is able to provide a relevant organisation of brain area structures and to bring out elements of epileptogenic networks. The method is compared to another classification approach, fuzzy classification, and gives better results when applied to SEEG signals. PMID:18002381

  7. Text Categorization Based on K-Nearest Neighbor Approach for Web Site Classification.

    ERIC Educational Resources Information Center

    Kwon, Oh-Woog; Lee, Jong-Hyeok

    2003-01-01

    Discusses text categorization and Web site classification and proposes a three-step classification system that includes the use of Web pages linked with the home page. Highlights include the k-nearest neighbor (k-NN) approach; improving performance with a feature selection method and a term weighting scheme using HTML tags; and similarity…

  9. The Comprehensive AOCMF Classification System: Radiological Issues and Systematic Approach

    PubMed Central

    Buitrago-Téllez, Carlos H.; Cornelius, Carl-Peter; Prein, Joachim; Kunz, Christoph; Ieva, Antonio di; Audigé, Laurent

    2014-01-01

    The AOCMF Classification Group developed a hierarchical three-level craniomaxillofacial (CMF) classification system with increasing levels of complexity and detail. The basic level 1 system differentiates fracture location in the mandible (code 91), midface (code 92), skull base (code 93), and cranial vault (code 94); levels 2 and 3 focus on defining fracture location and morphology within more detailed regions and subregions. Correct imaging acquisition, systematic analysis, and interpretation according to the anatomically and surgically relevant structures in the CMF regions are essential for an accurate, reproducible, and comprehensive diagnosis of CMF fractures using that system. Basic principles for radiographic diagnosis are based on conventional plain films, multidetector computed tomography, and magnetic resonance imaging. In this tutorial, the radiological issues according to each level of the classification are described. PMID:25489396

  10. Neural network approach to classification of infrasound signals

    NASA Astrophysics Data System (ADS)

    Lee, Dong-Chang

    As part of the International Monitoring Systems of the Preparatory Commissions for the Comprehensive Nuclear Test-Ban Treaty Organization, the Infrasound Group at the University of Alaska Fairbanks maintains and operates two infrasound stations to monitor global nuclear activity. In addition, the group specializes in detecting and classifying the man-made and naturally produced signals recorded at both stations by computing various characterization parameters (e.g. mean of the cross correlation maxima, trace velocity, direction of arrival, and planarity values) using the in-house developed weighted least-squares algorithm. Classifying commonly observed low-frequency (0.015--0.1 Hz) signals at our stations, namely mountain associated waves and high trace-velocity signals, using a traditional approach (e.g. analysis of power spectral density) presents a problem. Such signals can be separated statistically by setting a window on the trace-velocity estimate for each signal type, and the feasibility of this technique is demonstrated by displaying and comparing various summary plots (e.g. universal, seasonal and azimuthal variations) produced by analyzing infrasound data (2004--2007) from the Fairbanks and Antarctic arrays. Such plots, together with the availability of magnetic activity information (from the College International Geophysical Observatory located at Fairbanks, Alaska), lead to possible physical sources of the two signal types. Throughout this thesis, a newly developed robust algorithm (sum of squares of variance ratios) with improved detection quality (under low signal-to-noise ratios) over two well-known detection algorithms (mean of the cross correlation maxima and Fisher statistics) is investigated for its efficacy as a new detector. A neural network is examined for its ability to automatically classify the two signals described above against clutter (spurious signals with common characteristics). Four identical perceptron networks are trained and validated (with

  11. A Method for Application of Classification Tree Models to Map Aquatic Vegetation Using Remotely Sensed Images from Different Sensors and Dates

    PubMed Central

    Jiang, Hao; Zhao, Dehua; Cai, Ying; An, Shuqing

    2012-01-01

    In previous attempts to identify aquatic vegetation from remotely-sensed images using classification trees (CT), the images used to apply CT models to different times or locations necessarily originated from the same satellite sensor as that from which the original images used in model development came, greatly limiting the application of CT. We have developed an effective normalization method to improve the robustness of CT models when applied to images originating from different sensors and dates. A total of 965 ground-truth samples of aquatic vegetation types were obtained in 2009 and 2010 in Taihu Lake, China. Using relevant spectral indices (SI) as classifiers, we manually developed a stable CT model structure and then applied a standard CT algorithm to obtain quantitative (optimal) thresholds from 2009 ground-truth data and images from Landsat7-ETM+, HJ-1B-CCD, Landsat5-TM and ALOS-AVNIR-2 sensors. Optimal CT thresholds produced average classification accuracies of 78.1%, 84.7% and 74.0% for emergent vegetation, floating-leaf vegetation and submerged vegetation, respectively. However, the optimal CT thresholds for different sensor images differed from each other, with an average relative variation (RV) of 6.40%. We developed and evaluated three new approaches to normalizing the images. The best-performing method (Method of 0.1% index scaling) normalized the SI images using tailored percentages of extreme pixel values. Using the images normalized by Method of 0.1% index scaling, CT models for a particular sensor in which thresholds were replaced by those from the models developed for images originating from other sensors provided average classification accuracies of 76.0%, 82.8% and 68.9% for emergent vegetation, floating-leaf vegetation and submerged vegetation, respectively. Applying the CT models developed for normalized 2009 images to 2010 images resulted in high classification (78.0%–93.3%) and overall (92.0%–93.1%) accuracies. Our results suggest

  12. Machine Learning Approaches for High-resolution Urban Land Cover Classification: A Comparative Study

    SciTech Connect

    Vatsavai, Raju; Chandola, Varun; Cheriyadat, Anil M; Bright, Eddie A; Bhaduri, Budhendra L; Graesser, Jordan B

    2011-01-01

    The proliferation of machine learning approaches makes it difficult to identify a suitable classification technique for analyzing high-resolution remote sensing images. In this study, ten classification techniques from five broad machine learning categories were compared. Surprisingly, simple statistical classification schemes such as maximum likelihood and logistic regression perform very close to more complex and recent techniques. Given that these two classifiers require little input from the user, they should still be considered for most classification tasks. Multiple classifier systems are a good choice if resources permit.

  13. Tree level hydrodynamic approach for resolving aboveground water storage and stomatal conductance and modeling the effects of tree hydraulic strategy

    NASA Astrophysics Data System (ADS)

    Mirfenderesgi, Golnazalsadat; Bohrer, Gil; Matheny, Ashley M.; Fatichi, Simone; Moraes Frasson, Renato Prata; Schäfer, Karina V. R.

    2016-07-01

    The finite difference ecosystem-scale tree crown hydrodynamics model version 2 (FETCH2) is a tree-scale hydrodynamic model of transpiration. The FETCH2 model employs a finite difference numerical methodology and a simplified single-beam conduit system to explicitly resolve xylem water potentials throughout the vertical extent of a tree. Empirical equations relate water potential within the stem to stomatal conductance of the leaves at each height throughout the crown. While highly simplified, this approach brings additional realism to the simulation of transpiration by linking stomatal responses to stem water potential rather than directly to soil moisture, as is currently the case in the majority of land surface models. FETCH2 accounts for plant hydraulic traits, such as the degree of anisohydric/isohydric response of stomata, maximal xylem conductivity, vertical distribution of leaf area, and maximal and minimal xylem water content. We used FETCH2 along with sap flow and eddy covariance data sets collected from a mixed plot of two genera (oak/pine) in Silas Little Experimental Forest, NJ, USA, to conduct an analysis of the intergeneric variation of hydraulic strategies and their effects on diurnal and seasonal transpiration dynamics. We define these strategies through the parameters that describe the genus level transpiration and xylem conductivity responses to changes in stem water potential. Our evaluation revealed that FETCH2 considerably improved the simulation of ecosystem transpiration and latent heat flux in comparison to more conventional models. A virtual experiment showed that the model was able to capture the effect of hydraulic strategies such as isohydric/anisohydric behavior on stomatal conductance under different soil-water availability conditions.

  14. PoMo: An Allele Frequency-Based Approach for Species Tree Estimation

    PubMed Central

    De Maio, Nicola; Schrempf, Dominik; Kosiol, Carolin

    2015-01-01

    Incomplete lineage sorting can cause incongruencies of the overall species-level phylogenetic tree with the phylogenetic trees for individual genes or genomic segments. If these incongruencies are not accounted for, it is possible to incur several biases in species tree estimation. Here, we present a simple maximum likelihood approach that accounts for ancestral variation and incomplete lineage sorting. We use a POlymorphisms-aware phylogenetic MOdel (PoMo) that we have recently shown to efficiently estimate mutation rates and fixation biases from within and between-species variation data. We extend this model to perform efficient estimation of species trees. We test the performance of PoMo in several different scenarios of incomplete lineage sorting using simulations and compare it with existing methods both in accuracy and computational speed. In contrast to other approaches, our model does not use coalescent theory but is allele frequency based. We show that PoMo is well suited for genome-wide species tree estimation and that on such data it is more accurate than previous approaches. PMID:26209413

  15. Neural network approaches versus statistical methods in classification of multisource remote sensing data

    NASA Technical Reports Server (NTRS)

    Benediktsson, Jon A.; Swain, Philip H.; Ersoy, Okan K.

    1990-01-01

    Neural network learning procedures and statistical classification methods are applied and compared empirically in classification of multisource remote sensing and geographic data. Statistical multisource classification by means of a method based on Bayesian classification theory is also investigated and modified. The modifications permit control of the influence of the data sources involved in the classification process. Reliability measures are introduced to rank the quality of the data sources. The data sources are then weighted according to these rankings in the statistical multisource classification. Four data sources are used in experiments: Landsat MSS data and three forms of topographic data (elevation, slope, and aspect). Experimental results show that the two different approaches have unique advantages and disadvantages in this classification application.

  16. A heuristic multi-criteria classification approach incorporating data quality information for choropleth mapping

    PubMed Central

    Sun, Min; Wong, David; Kronenfeld, Barry

    2016-01-01

    Despite conceptual and technological advancements in cartography over the decades, choropleth map design and classification fail to address a fundamental issue: estimates that are statistically indistinguishable may be assigned to different classes on maps, or vice versa. Recently, the class separability concept was introduced as a map classification criterion to evaluate the likelihood that estimates in two classes are statistically different. Unfortunately, choropleth maps created according to the separability criterion usually have highly unbalanced classes. To produce reasonably separable but more balanced classes, we propose a heuristic classification approach that considers not just the class separability criterion but also other classification criteria such as evenness and intra-class variability. A geovisual-analytic package was developed to support the heuristic mapping process, to evaluate the trade-off between relevant criteria and to select the most preferable classification. Class break values can be adjusted to improve the performance of a classification. PMID:28286426

  17. Gregarine site-heterogeneous 18S rDNA trees, revision of gregarine higher classification, and the evolutionary diversification of Sporozoa.

    PubMed

    Cavalier-Smith, Thomas

    2014-10-01

    Gregarine 18S ribosomal DNA trees are hard to resolve because they exhibit the most disparate rates of rDNA evolution of any eukaryote group. As site-heterogeneous tree-reconstruction algorithms can give more accurate trees, especially for technically unusually challenging groups, I present the first site-heterogeneous rDNA trees for 122 gregarines and an extensive set of 452 appropriate outgroups. While some features remain poorly resolved, these trees fit morphological diversity better than most previous, evolutionarily less realistic, maximum likelihood trees. Gregarines are probably polyphyletic, with some 'eugregarines' and all 'neogregarines' (both abandoned as taxa) being more closely related to Cryptosporidium and Rhytidocystidae than to archigregarines. I establish a new subclass Orthogregarinia (new orders Vermigregarida, Arthrogregarida) for gregarines most closely related to Cryptosporidium and group Orthogregarinia, Cryptosporidiidae, and Rhytidocystidae as revised class Gregarinomorphea. Archigregarines are excluded from Gregarinomorphea and grouped with new orders Velocida (Urosporoidea superfam. n. and Veloxidium) and Stenophorida as a new sporozoan class Paragregarea. Platyproteum and Filipodium never group with Orthogregarinia or Paragregarea and are sufficiently different morphologically to merit a new order Squirmida. I revise gregarine higher-level classification generally in the light of site-heterogeneous-model trees, discuss their evolution, and also sporozoan cell structure and life-history evolution, correcting widespread misinterpretations.

  18. Predictive mapping of soil organic carbon in wet cultivated lands using classification-tree based models: the case study of Denmark.

    PubMed

    Bou Kheir, Rania; Greve, Mogens H; Bøcher, Peder K; Greve, Mette B; Larsen, René; McCloy, Keith

    2010-05-01

    Soil organic carbon (SOC) is one of the most important carbon stocks globally and has large potential to affect global climate. Distribution patterns of SOC in Denmark constitute a nation-wide baseline for studies on soil carbon changes (with respect to the Kyoto protocol). This paper predicts and maps the geographic distribution of SOC across Denmark using remote sensing (RS), geographic information systems (GISs) and decision-tree modeling (un-pruned and pruned classification trees). Seventeen parameters, i.e. parent material, soil type, landscape type, elevation, slope gradient, slope aspect, mean curvature, plan curvature, profile curvature, flow accumulation, specific catchment area, tangent slope, tangent curvature, steady-state wetness index, Normalized Difference Vegetation Index (NDVI), Normalized Difference Wetness Index (NDWI) and Soil Color Index (SCI) were generated to statistically explain SOC field measurements in the area of interest (Denmark). A large number of tree-based classification models (588) were developed using (i) all of the parameters, (ii) all Digital Elevation Model (DEM) parameters only, (iii) the primary DEM parameters only, (iv) the remote sensing (RS) indices only, (v) selected pairs of parameters, (vi) soil type, parent material and landscape type only, and (vii) the parameters having a high impact on SOC distribution in built pruned trees. The three best classification tree models, with the lowest misclassification error (ME) and the lowest number of nodes (N), are: (i) the tree (T1) combining all of the parameters (ME=29.5%; N=54); (ii) the tree (T2) based on the parent material, soil type and landscape type (ME=31.5%; N=14); and (iii) the tree (T3) constructed using parent material, soil type, landscape type, elevation, tangent slope and SCI (ME=30%; N=39). The SOC maps produced at 1:50,000 cartographic scale using these trees match closely, with coincidence values equal to 90.5% (Map T1

  19. Oregon Hydrologic Landscapes: An Approach for Broadscale Hydrologic Classification

    EPA Science Inventory

    Gaged streams represent only a small percentage of watershed hydrologic conditions throughout the United States and globe, but there is a growing need for hydrologic classification systems that can serve as the foundation for broad-scale assessments of the hydrologic functions of...

  1. Ecological type classification for California: the Forest Service approach

    Treesearch

    Barbara H. Allen

    1987-01-01

    National legislation has mandated the development and use of an ecological data base to improve resource decision making, while State and Federal agencies have agreed to cooperate in standardizing resource classification and inventory data. In the Pacific Southwest Region, which includes nearly 20 million acres (8.3 million ha) in California, the Forest Service, U.S....

  2. Histopathological image analysis for centroblasts classification through dimensionality reduction approaches.

    PubMed

    Kornaropoulos, Evgenios N; Niazi, M Khalid Khan; Lozanski, Gerard; Gurcan, Metin N

    2014-03-01

    We present two novel automated image analysis methods to differentiate centroblast (CB) cells from noncentroblast (non-CB) cells in digital images of H&E-stained tissues of follicular lymphoma. CB cells are often confused with similar-looking cells within the tissue; therefore, a system to help their classification is necessary. Our methods extract the discriminatory features of cells by approximating the intrinsic dimensionality from the subspace spanned by CB and non-CB cells. In the first method, discriminatory features are approximated with the help of singular value decomposition (SVD), whereas in the second method they are extracted using Laplacian Eigenmaps. Five hundred high-power field images were extracted from 17 slides and then used to compose a database of 213 CB and 234 non-CB region of interest images. The recall, precision, and overall accuracy rates of the developed methods were measured and compared with existing classification methods. Moreover, the reproducibility of both classification methods was also examined. The average values of the overall accuracy were 99.22% ± 0.75% and 99.07% ± 1.53% for COB and CLEM, respectively. The experimental results demonstrate that both proposed methods provide better classification accuracy of CB/non-CB in comparison with the state of the art methods.

  3. Image classification approach for automatic identification of grassland weeds

    NASA Astrophysics Data System (ADS)

    Gebhardt, Steffen; Kühbauch, Walter

    2006-08-01

    The potential of digital image processing for weed mapping in arable crops has been widely investigated in recent decades. In grassland farming these techniques have rarely been applied so far. The project presented here focuses on the automatic identification of one of the most invasive and persistent grassland weed species, the broad-leaved dock (Rumex obtusifolius L.), in complex mixtures of grass and herbs. A total of 108 RGB images were acquired at close range from a field experiment under constant illumination conditions using a commercial digital camera. The objects of interest were separated from the background by transforming the 24 bit RGB images into 8 bit intensities and then calculating the local homogeneity images. These images were binarised by applying a dynamic grey value threshold. Finally, morphological opening was applied to the binary images. The remaining contiguous regions were considered to be objects. In order to classify these objects into 3 different weed species, a soil and a residue class, a total of 17 object features related to shape, color and texture of the weeds were extracted. Using MANOVA, 12 of them were identified as contributing to classification. Maximum-likelihood classification was conducted to discriminate the weed species. The total classification rate across all classes ranged from 76 % to 83 %. The classification of Rumex obtusifolius achieved detection rates between 85 % and 93 % with misclassification rates below 10 %. Further, Rumex obtusifolius distribution and density maps were generated based on the classification results and the transformation of image coordinates into the Gauss-Krueger system. These promising results show the high potential of image analysis for weed mapping in grassland and for the implementation of site-specific herbicide spraying.
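
    The binarisation and morphological opening steps named in the abstract above can be sketched in plain Python. This is only an illustration under stated assumptions: a fixed global threshold stands in for the dynamic grey value threshold, the structuring element is assumed to be a 3x3 square, pixels outside the image are treated as background, and all function names and the toy image are hypothetical rather than taken from the study.

```python
def binarise(image, threshold):
    """Map grey values to 1 (foreground) or 0 using a fixed threshold (assumption)."""
    return [[1 if px >= threshold else 0 for px in row] for row in image]


def _neighbourhood(img, r, c):
    """Values in the 3x3 window around (r, c); outside the image counts as 0."""
    h, w = len(img), len(img[0])
    return [
        img[rr][cc] if 0 <= rr < h and 0 <= cc < w else 0
        for rr in (r - 1, r, r + 1)
        for cc in (c - 1, c, c + 1)
    ]


def erode(img):
    """Keep a pixel only if its whole 3x3 window is foreground."""
    return [[int(all(_neighbourhood(img, r, c))) for c in range(len(img[0]))]
            for r in range(len(img))]


def dilate(img):
    """Set a pixel if any pixel in its 3x3 window is foreground."""
    return [[int(any(_neighbourhood(img, r, c))) for c in range(len(img[0]))]
            for r in range(len(img))]


def opening(img):
    """Morphological opening: erosion followed by dilation."""
    return dilate(erode(img))


if __name__ == "__main__":
    # Toy 6x6 grey image: a 3x3 bright object plus one isolated bright pixel.
    grey = [[10] * 6 for _ in range(6)]
    for r in range(1, 4):
        for c in range(1, 4):
            grey[r][c] = 200
    grey[5][5] = 200
    for row in opening(binarise(grey, 128)):
        print(row)
```

    In this toy case the opening removes the isolated pixel while restoring the 3x3 object, which is the reason such a step is useful before treating contiguous regions as candidate objects.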

  4. Using hydrogeomorphic criteria to classify wetlands on Mt. Desert Island, Maine - approach, classification system, and examples

    USGS Publications Warehouse

    Nielsen, Martha G.; Guntenspergen, Glenn R.; Neckles, Hilary A.

    2005-01-01

    A wetland classification system was designed for Mt. Desert Island, Maine, to help categorize the large number of wetlands (over 1,200 mapped units) as an aid to understanding their hydrologic functions. The classification system, developed by the U.S. Geological Survey (USGS), in cooperation with the National Park Service, uses a modified hydrogeomorphic (HGM) approach, and assigns categories based on position in the landscape, soils and surficial geologic setting, and source of water. A dichotomous key was developed to determine a preliminary HGM classification of wetlands on the island. This key is designed for use with USGS topographic maps and 1:24,000 geographic information system (GIS) coverages as an aid to the classification, but may also be used with field data. Hydrologic data collected from a wetland monitoring study were used to determine whether the preliminary classification of individual wetlands using the HGM approach yielded classes that were consistent with actual hydroperiod data. Preliminary HGM classifications of the 20 wetlands in the monitoring study were consistent with the field hydroperiod data. The modified HGM classification approach appears robust, although the method apparently works somewhat better with undisturbed wetlands than with disturbed wetlands. This wetland classification system could be applied to other hydrogeologically similar areas of northern New England.

  5. Comparison of Sub-pixel Classification Approaches for Crop-specific Mapping

    EPA Science Inventory

    The Moderate Resolution Imaging Spectroradiometer (MODIS) data has been increasingly used for crop mapping and other agricultural applications. Phenology-based classification approaches using the NDVI (Normalized Difference Vegetation Index) 16-day composite (250 m) data product...

  6. Data-Driven Multimodal Sleep Apnea Events Detection : Synchrosquezing Transform Processing and Riemannian Geometry Classification Approaches.

    PubMed

    Rutkowski, Tomasz M

    2016-07-01

    A novel multimodal and bio-inspired approach to biomedical signal processing and classification is presented in the paper. This approach allows for automatic semantic labeling (interpretation) of sleep apnea events based on the proposed data-driven biomedical signal processing and classification. The presented signal processing and classification methods have already been successfully applied to real-time unimodal brainwave (EEG only) decoding in brain-computer interfaces developed by the author. In the current project, very encouraging results are obtained using multimodal biomedical (brainwave and peripheral physiological) signals in a unified processing approach allowing for automatic semantic data description. The results thus support the hypothesis that the data-driven and bio-inspired signal processing approach is valid for semantic interpretation of medical data, based on machine-learning classification of sleep apnea events.

  8. Factors Influencing Drug Injection History among Prisoners: A Comparison between Classification and Regression Trees and Logistic Regression Analysis

    PubMed Central

    Rastegari, Azam; Haghdoost, Ali Akbar; Baneshi, Mohammad Reza

    2013-01-01

    Background: Due to the importance of medical studies, researchers in this field should be familiar with various types of statistical analyses in order to select the most appropriate method based on the characteristics of their data sets. Classification and regression trees (CARTs) can be complementary to regression models. We compared the performance of a logistic regression model and a CART in predicting drug injection among prisoners. Methods: Data from 2720 Iranian prisoners were studied to determine the factors influencing drug injection. The collected data were divided into training and testing groups. A logistic regression model and a CART were applied to the training data. The performance of the two models was then evaluated on the testing data. Findings: The regression model and the CART had 8 and 4 significant variables, respectively. Overall, heroin use, history of imprisonment, age at first drug use, and marital status were important factors in determining the history of drug injection. Subjects without a history of heroin use and heroin users with short-term imprisonment were at lower risk of drug injection. Among heroin addicts with long-term imprisonment, individuals with a higher age at first drug use and married subjects were at lower risk of drug injection. Although the logistic regression model was more sensitive than the CART, the two models had the same levels of specificity and classification accuracy. Conclusion: In this study, both sensitivity and specificity were important. While the logistic regression model had better performance, the graphical presentation of the CART simplifies the interpretation of the results. In general, a combination of different analytical methods is recommended to explore the effects of variables. PMID:24494152
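
    The comparison criteria used in the abstract above (sensitivity, specificity and classification accuracy on a held-out test set) can be computed as in the following Python sketch. The labels and predictions are invented toy values, not data from the study, and the function names are hypothetical; the formulas themselves are the standard definitions.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def summarise(y_true, y_pred):
    """Return sensitivity, specificity and accuracy for one model's predictions."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return {
        "sensitivity": tp / (tp + fn),   # share of true positives detected
        "specificity": tn / (tn + fp),   # share of true negatives detected
        "accuracy": (tp + tn) / len(y_true),
    }


if __name__ == "__main__":
    # Toy test set (hypothetical): 4 positives and 6 negatives.
    y_test = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    model_a = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]  # stand-in for one model
    model_b = [1, 1, 0, 0, 0, 0, 0, 0, 1, 1]  # stand-in for the other
    print("model A:", summarise(y_test, model_a))
    print("model B:", summarise(y_test, model_b))
```

    Reporting all three figures side by side, rather than accuracy alone, makes differences in sensitivity visible even when overall accuracy is similar.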

  9. Multistage classification of multispectral Earth observational data: The design approach

    NASA Technical Reports Server (NTRS)

    Bauer, M. E. (Principal Investigator); Muasher, M. J.; Landgrebe, D. A.

    1981-01-01

    An algorithm is proposed which predicts the optimal features at every node in a binary tree procedure. The algorithm estimates the probability of error by approximating the area under the likelihood ratio function for two classes and taking into account the number of training samples used in estimating each of these two classes. Some results on feature selection techniques, particularly in the presence of a very limited set of training samples, are presented. Results comparing probabilities of error predicted by the proposed algorithm as a function of dimensionality as compared to experimental observations are shown for aircraft and LANDSAT data. Results are obtained for both real and simulated data. Finally, two binary tree examples which use the algorithm are presented to illustrate the usefulness of the procedure.

  10. Neuropsychological test selection for cognitive impairment classification: A machine learning approach.

    PubMed

    Weakley, Alyssa; Williams, Jennifer A; Schmitter-Edgecombe, Maureen; Cook, Diane J

    2015-01-01

    Reducing the amount of testing required to accurately detect cognitive impairment is clinically relevant. The aim of this research was to determine the fewest number of clinical measures required to accurately classify participants as healthy older adult, mild cognitive impairment (MCI), or dementia using a suite of classification techniques. Two variable selection machine learning models (i.e., naive Bayes, decision tree), a logistic regression, and two participant datasets (i.e., clinical diagnosis; Clinical Dementia Rating, CDR) were explored. Participants classified using clinical diagnosis criteria included 52 individuals with dementia, 97 with MCI, and 161 cognitively healthy older adults. Participants classified using CDR included 154 individuals with CDR = 0, 93 individuals with CDR = 0.5, and 25 individuals with CDR = 1.0+. A total of 27 demographic, psychological, and neuropsychological variables were available for variable selection. No significant difference was observed between naive Bayes, decision tree, and logistic regression models for classification of both clinical diagnosis and CDR datasets. Participant classification (70.0-99.1%), geometric mean (60.9-98.1%), sensitivity (44.2-100%), and specificity (52.7-100%) were generally satisfactory. Unsurprisingly, the MCI/CDR = 0.5 participant group was the most challenging to classify. Through variable selection only 2-9 variables were required for classification and varied between datasets in a clinically meaningful way. The current study results reveal that machine learning techniques can accurately classify cognitive impairment and reduce the number of measures required for diagnosis.
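
    The performance measures named in this abstract (classification accuracy, sensitivity, specificity, geometric mean) can be illustrated for the two-class case. The sketch below is not the authors' code; it assumes the geometric mean is the square root of sensitivity times specificity, a common definition that the abstract does not spell out, and the function name and counts are hypothetical.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Compute common two-class performance measures from confusion counts.

    tp/fn: positives classified correctly/incorrectly.
    tn/fp: negatives classified correctly/incorrectly.
    """
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    # Assumed definition: geometric mean of sensitivity and specificity.
    g_mean = math.sqrt(sensitivity * specificity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "g_mean": g_mean,
    }


# Hypothetical counts for illustration only (not data from the study).
metrics = binary_metrics(tp=40, fn=10, tn=90, fp=10)
```

    Unlike accuracy, the geometric mean drops sharply when either class is classified poorly, which makes it useful for unbalanced groups such as the small dementia group.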

  11. Alpine Holocene Tree Ring Isotope Records - A Synthesis of a Multi-Proxy Approach in Dendroclimatology

    NASA Astrophysics Data System (ADS)

    Ziehmer, Malin Michelle; Nicolussi, Kurt; Schlüchter, Christian; Leuenberger, Markus

    2017-04-01

    High-resolution climate reconstructions based on tree-ring proxies are often limited by the individual segment length of living trees selected at the defined sampling sites, which mostly results in relatively short multi-centennial proxy series. A potential extension of living wood records comprises the addition of subfossil and archeological wood remains resulting in chronologies and associated climate reconstructions which are able to cover a few millennia in central Europe (e.g. Büntgen et al., 2011). However, existing multi-millennial tree-ring width chronologies in central Europe rank among the longest continuous chronologies world-wide and span the entire Holocene (Becker et al., 1993; Nicolussi et al., 2009). So far, these chronologies have mainly been used for dating subfossil wood samples, floating chronologies and archeological artifacts, but only in parts for reconstructing climate. Finds of Holocene wood remains in glacier forefields, peat bogs and small lakes allow us not only to establish such long-term tree-ring width records; they further offer the possibility to establish multi-millennial proxy records for the entire Holocene by using a multi-proxy approach which includes both tree-ring width and triple stable isotope ratios. As temperature limits tree growth at the Alpine upper tree line, the existing tree-ring width records are currently limited to reconstructing a single environmental variable. In the framework of the project Alpine Holocene Tree Ring Isotope Records, we combine tree-ring width, cellulose content as well as carbon, oxygen and hydrogen isotope series in a multi-proxy approach which allows the reconstruction of past environments by combining both Holocene wood remains and recent tree samples from two Alpine tree-line species. For this purpose, α-cellulose is prepared from 5-year tree ring blocks following the procedure after Boettger et al. (2007) and subsequently crushed by ultrasonic homogenization (Laumer et al., 2009). The

  12. Automatic tree stem detection - a geometric feature based approach for MLS point clouds

    NASA Astrophysics Data System (ADS)

    Hetti Arachchige, N.

    2013-10-01

    Recognition of tree stems is a fundamental task for obtaining various geometric attributes of trees, such as diameter, height and stem position, for diverse urban applications. We propose a novel tree stem segmentation approach using geometric features corresponding to trees for high-density MLS point data covering urban environments. The principal direction and shape of point subsets are used as geometric features. The principal direction, along which a point neighbourhood exhibits the most variance, helps to measure similarity, while shape provides the dimensional information of a group of points. Points residing on a stem can be isolated by defining various rules based on these geometric features. The shape characterization step is accomplished by estimating the structure tensor with principal component analysis. These features are assigned to different steps of our segmentation algorithm. Wrong segmentations mainly occur in areas where our rules have failed, such as vertical-type objects, road poles and light posts. To overcome these problems, global shape is further checked. An experiment was performed to evaluate the method; it shows that more than 90% of tree stems are detected. The overall accuracy of the proposed method is 90.6%. The results show that principal direction and shape analysis are sufficient for tree stem recognition from MLS point clouds in a relatively complex urban area.

  13. Carbon footprint of forest and tree utilization technologies in life cycle approach

    NASA Astrophysics Data System (ADS)

    Polgár, András; Pécsinger, Judit

    2017-04-01

    In our research project, a suitable method has been developed for the technological aspect of the environmental assessment of land use changes caused by climate change. We have prepared an eco-balance (environmental inventory) for the classification of environmental effects in a life-cycle approach, in connection with the typical agricultural / forest and tree utilization technologies. The use of balances and environmental classification makes it possible to compare land-use technologies and their environmental effects per common functional unit. In order to test our environmental analysis model, we carried out surveys in a sample of forest stands. We set up an eco-balance of the working systems of intermediate cutting and final harvest in the stands of beech, oak, spruce, acacia, poplar and short rotation energy plantations (willow, poplar). We set up the life-cycle plan of the surveyed working systems using the GaBi 6.0 Professional software and carried out midpoint and endpoint impact assessment. From the results, we applied the values of CML 2001 - Global Warming Potential (GWP 100 years) [kg CO2-Equiv.] and Eco-Indicator 99 - Human health, Climate Change [DALY]. On the basis of these values we set up a ranking of the technologies. In this way, we obtained the environmental impact classification of the technologies based on carbon footprint. The working systems had the greatest impact on global warming (GWP 100 years) throughout their whole life cycle. This is explained by the amount of carbon dioxide released to the atmosphere from the fuel used by the technologies. Abiotic depletion (ADP foss) and marine aquatic ecotoxicity (MAETP) also emerged as significant impact categories. These impact categories can be explained by the share of fuel and lube inputs. On the basis of the most significant environmental impact category (carbon footprint), we determine the relative life cycle contribution and ranking of each technology.
The technological life cycle stages examined

  14. Genetic and genomic approaches to assess adaptive genetic variation in plants: forest trees as a model.

    PubMed

    Gailing, Oliver; Vornam, Barbara; Leinemann, Ludger; Finkeldey, Reiner

    2009-12-01

    With the increasing availability of sequence information at putatively important genes or regulatory regions, the characterization of adaptive genetic diversity and their association with phenotypic trait variation becomes feasible for many non-model organisms such as forest trees. Especially in predominantly outcrossing forest tree populations with large effective size, a high genetic variation in relevant genes is maintained, that is the raw material for the adaptation to changing and variable environments, and likewise for plant breeding. Oaks (Quercus spp.) are excellent model species to study the adaptation of forest trees to changing environments. They show a wide geographic distribution in Europe as dominant tree species in many forests and grow under a wide range of climatic and edaphic conditions. With the availability of a growing amount of functional and expressional candidate genes, we are now able to test the functional importance of single nucleotide polymorphisms (SNPs) by associating nucleotide variation in these genes with phenotypic variation in adaptive traits in segregating or natural populations. Here, we report on quantitative trait locus (QTL), candidate gene and association mapping approaches that are applicable to characterize gene markers and SNPs associated with variation in adaptive traits, such as bud burst, drought resistance and other traits showing selective responses to environmental change and stress. Because genome-wide association mapping studies are not feasible because of the enormous amount of SNP markers required in outcrossing trees with high recombination rates, the success of such an approach depends largely on the reasonable selection of candidate genes.

  15. Evaluating Two Approaches to Helping College Students Understand Evolutionary Trees through Diagramming Tasks

    ERIC Educational Resources Information Center

    Perry, Judy; Meir, Eli; Herron, Jon C.; Maruca, Susan; Stal, Derek

    2008-01-01

    To understand evolutionary theory, students must be able to understand and use evolutionary trees and their underlying concepts. Active, hands-on curricula relevant to macroevolution can be challenging to implement across large college-level classes where textbook learning is the norm. We evaluated two approaches to helping students learn…

  16. A decision tree approach using silvics to guide planning for forest restoration

    Treesearch

    Sharon M. Hermann; John S. Kush; John C. Gilbert

    2013-01-01

    We created a decision tree based on silvics of longleaf pine (Pinus palustris) and historical descriptions to develop approaches for restoration management at Horseshoe Bend National Military Park located in central Alabama. A National Park Service goal is to promote structure and composition of a forest that likely surrounded the 1814 battlefield....

  17. An efficient semi-supervised classification approach for hyperspectral imagery

    NASA Astrophysics Data System (ADS)

    Tan, Kun; Li, Erzhu; Du, Qian; Du, Peijun

    2014-11-01

    In this paper, an efficient semi-supervised support vector machine (SVM) with segmentation-based ensemble (S2SVMSE) algorithm is proposed for hyperspectral image classification. The algorithm utilizes spatial information extracted by a segmentation algorithm for unlabeled sample selection. The unlabeled samples that are the most similar to the labeled ones are found and the candidate set of unlabeled samples to be chosen is enlarged to the corresponding image segments. To ensure the finally selected unlabeled samples be spatially widely distributed and less correlated, random selection is conducted with the flexibility of the number of unlabeled samples actually participating in semi-supervised learning. Classification is also refined through a spectral-spatial feature ensemble technique. The proposed method with very limited labeled training samples is evaluated via experiments with two real hyperspectral images, where it outperforms the fully supervised SVM and the semi-supervised version without spectral-spatial ensemble.

  18. Toward noncooperative iris recognition: a classification approach using multiple signatures.

    PubMed

    Proença, Hugo; Alexandre, Luís A

    2007-04-01

    This paper focuses on noncooperative iris recognition, i.e., the capture of iris images at large distances, under less controlled lighting conditions, and without active participation of the subjects. This increases the probability of capturing very heterogeneous images (regarding focus, contrast, or brightness) and with several noise factors (iris obstructions and reflections). Current iris recognition systems are unable to deal with noisy data and substantially increase their error rates, especially the false rejections, in these conditions. We propose an iris classification method that divides the segmented and normalized iris image into six regions, makes an independent feature extraction and comparison for each region, and combines each of the dissimilarity values through a classification rule. Experiments show a substantial decrease, higher than 40 percent, of the false rejection rates in the recognition of noisy iris images.

  19. Combined application of information theory on laboratory results with classification and regression tree analysis: analysis of unnecessary biopsy for prostate cancer.

    PubMed

    Hwang, Sang-Hyun; Pyo, Tina; Oh, Heung-Bum; Park, Hyun Jun; Lee, Kwan-Jeh

    2013-01-16

    The probability of a prostate cancer-positive biopsy result varies with PSA concentration. Thus, we applied information theory on classification and regression tree (CART) analysis for decision making predicting the probability of a biopsy result at various PSA concentrations. From 2007 to 2009, prostate biopsies were performed in 664 referred patients in a tertiary hospital. We created 2 CART models based on the information theory: one for moderate uncertainty (PSA concentration: 2.5-10 ng/ml) and the other for high uncertainty (PSA concentration: 10-25 ng/ml). The CART model for moderate uncertainty (n=321) had 3 splits based on PSA density (PSAD), hypoechoic nodules, and age and the other CART for high uncertainty (n=160) had 2 splits based on prostate volume and percent-free PSA. In this validation set, the patients (14.3% and 14.0% for moderate and high uncertainty groups, respectively) could avoid unnecessary biopsies without false-negative results. Using these CART models based on uncertainty information of PSA, the overall reduction in unnecessary prostate biopsies was 14.0-14.3% and CART models were simplified. Using uncertainty of laboratory results from information theoretic approach can provide additional information for decision analysis such as CART. Copyright © 2012 Elsevier B.V. All rights reserved.
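
    As a rough illustration of how information theory can express the uncertainty of a two-outcome test result, the sketch below computes Shannon entropy for a binary outcome. The abstract does not state which uncertainty measure the authors used, so the choice of binary entropy, the function name, and the example probabilities are all assumptions.

```python
import math


def binary_entropy(p):
    """Shannon entropy (in bits) of a binary outcome with probability p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if p in (0.0, 1.0):
        return 0.0  # a certain outcome carries no uncertainty
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


# Hypothetical positive-biopsy probabilities, for illustration only.
low_uncertainty = binary_entropy(0.1)
high_uncertainty = binary_entropy(0.5)
```

    Entropy peaks at 1 bit when both outcomes are equally likely and falls toward 0 as the outcome becomes predictable, which is one way to rank ranges of a laboratory value by how uncertain the result is.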

  20. Two Approaches to Estimation of Classification Accuracy Rate under Item Response Theory

    ERIC Educational Resources Information Center

    Lathrop, Quinn N.; Cheng, Ying

    2013-01-01

    Within the framework of item response theory (IRT), there are two recent lines of work on the estimation of classification accuracy (CA) rate. One approach estimates CA when decisions are made based on total sum scores, the other based on latent trait estimates. The former is referred to as the Lee approach, and the latter, the Rudner approach,…

  1. Response-Time Approach to Contrasting Models of Perceptual Classification

    DTIC Science & Technology

    2013-02-01

    For example, in Experiment 1 of Nosofsky et al. (2011), the stimuli were a set of 27 Munsell colors varying along dimensions of hue, saturation, and...develop and test models that explain the time course of classification and recognition decision making. The first specific goal involved the...Several empirical studies demonstrated successful applications of the new theory in this domain. The second goal involved the development and testing

  2. Rapid Erosion Modeling in a Western Kenya Watershed using Visible Near Infrared Reflectance, Classification Tree Analysis and 137Cesium

    PubMed Central

    deGraffenried, Jeff B.; Shepherd, Keith D.

    2010-01-01

    Human induced soil erosion has severe economic and environmental impacts throughout the world. It is more severe in the tropics than elsewhere and results in diminished food production and security. Kenya has limited arable land and 30 percent of the country experiences severe to very severe human induced soil degradation. The purpose of this research was to test visible near infrared diffuse reflectance spectroscopy (VNIR) as a tool for rapid assessment and benchmarking of soil condition and erosion severity class. The study was conducted in the Saiwa River watershed in the northern Rift Valley Province of western Kenya, a tropical highland area. Soil 137Cs concentration was measured to validate spectrally derived erosion classes and establish the background levels for different land use types. Results indicate VNIR could be used to accurately evaluate a large and diverse soil data set and predict soil erosion characteristics. Soil condition was spectrally assessed and modeled. Analysis of mean raw spectra indicated significant reflectance differences between soil erosion classes. The largest differences occurred between 1,350 and 1,950 nm with the largest separation occurring at 1,920 nm. Classification and Regression Tree (CART) analysis indicated that the spectral model had practical predictive success (72%) with Receiver Operating Characteristic (ROC) of 0.74. The change in 137Cs concentrations supported the premise that VNIR is an effective tool for rapid screening of soil erosion condition. PMID:27397933

  3. Biodiversity among Lactobacillus helveticus Strains Isolated from Different Natural Whey Starter Cultures as Revealed by Classification Trees

    PubMed Central

    Gatti, Monica; Trivisano, Carlo; Fabrizi, Enrico; Neviani, Erasmo; Gardini, Fausto

    2004-01-01

    Lactobacillus helveticus is a homofermentative thermophilic lactic acid bacterium used extensively for manufacturing Swiss type and aged Italian cheese. In this study, the phenotypic and genotypic diversity of strains isolated from different natural dairy starter cultures used for Grana Padano, Parmigiano Reggiano, and Provolone cheeses was investigated by a classification tree technique. A data set was used that consists of 119 L. helveticus strains, each of which was studied for its physiological characters, as well as surface protein profiles and hybridization with a species-specific DNA probe. The methodology employed in this work allowed the strains to be grouped into terminal nodes without difficult and subjective interpretation. In particular, good discrimination was obtained between L. helveticus strains isolated, respectively, from Grana Padano and from Provolone natural whey starter cultures. The method used in this work allowed identification of the main characteristics that permit discrimination of biotypes. In order to understand what kind of genes could code for phenotypes of technological relevance, evidence that specific DNA sequences are present only in particular biotypes may be of great interest. PMID:14711641

  4. Feature selection using Decision Tree and classification through Proximal Support Vector Machine for fault diagnostics of roller bearing

    NASA Astrophysics Data System (ADS)

    Sugumaran, V.; Muralidharan, V.; Ramachandran, K. I.

    2007-02-01

    The roller bearing is one of the most widely used rotary elements in a rotary machine. The nature of a roller bearing's vibration reveals its condition, and the features that show this nature have to be extracted through some indirect means. Statistical parameters like kurtosis, standard deviation, maximum value, etc. form a set of features, which are widely used in fault diagnostics. Often the problem is finding good features that discriminate the different fault conditions of the bearing. Selection of good features is an important phase in pattern recognition and requires detailed domain knowledge. This paper illustrates the use of a Decision Tree that identifies the best features from a given set of samples for the purpose of classification. It uses Proximal Support Vector Machine (PSVM), which has the capability to efficiently classify the faults using statistical features. The vibration signal from a piezoelectric transducer is captured for the following conditions: good bearing, bearing with inner race fault, bearing with outer race fault, and inner and outer race fault. The statistical features are extracted therefrom and classified successfully using PSVM and SVM. The results of PSVM and SVM are compared.

  5. Unimodal transform of variables selected by interval segmentation purity for classification tree modeling of high-dimensional microarray data.

    PubMed

    Du, Wen; Gu, Ting; Tang, Li-Juan; Jiang, Jian-Hui; Wu, Hai-Long; Shen, Guo-Li; Yu, Ru-Qin

    2011-09-15

    As a greedy search algorithm, classification and regression tree (CART) is prone to overfitting when modeling microarray gene expression data. A straightforward solution is to filter out irrelevant genes by identifying significant ones. Because some significant genes with multi-modal expression patterns exhibiting systematic differences in within-class samples are difficult to identify with existing methods, a strategy of unimodal transform of variables selected by interval segmentation purity (UTISP) for CART modeling is proposed. First, significant genes exhibiting varied expression patterns can be properly identified by a variable selection method based on interval segmentation purity. Then, a unimodal transform is implemented to offer unimodal featured variables for CART modeling via feature extraction. Because significant genes with complex expression patterns can be properly identified and unimodal features extracted in advance, this developed strategy potentially improves the performance of CART in combating overfitting or underfitting while modeling microarray data. The developed strategy is demonstrated using two microarray data sets. The results reveal that UTISP-based CART provides superior performance to k-nearest neighbors or CARTs coupled with other gene identifying strategies, indicating UTISP-based CART holds great promise for microarray data analysis.

  6. A novel decision tree approach based on transcranial Doppler sonography to screen for blunt cervical vascular injuries.

    PubMed

    Purvis, Dianna; Aldaghlas, Tayseer; Trickey, Amber W; Rizzo, Anne; Sikdar, Siddhartha

    2013-06-01

    Early detection and treatment of blunt cervical vascular injuries prevent adverse neurologic sequelae. Current screening criteria can miss up to 22% of these injuries. The study objective was to investigate bedside transcranial Doppler sonography for detecting blunt cervical vascular injuries in trauma patients using a novel decision tree approach. This prospective pilot study was conducted at a level I trauma center. Patients undergoing computed tomographic angiography for suspected blunt cervical vascular injuries were studied with transcranial Doppler sonography. Extracranial and intracranial vasculatures were examined with a portable power M-mode transcranial Doppler unit. The middle cerebral artery mean flow velocity, pulsatility index, and their asymmetries were used to quantify flow patterns and develop an injury decision tree screening protocol. Student t tests validated associations between injuries and transcranial Doppler predictive measures. We evaluated 27 trauma patients with 13 injuries. Single vertebral artery injuries were most common (38.5%), followed by single internal carotid artery injuries (30%). Compared to patients without injuries, mean flow velocity asymmetry was higher for single internal carotid artery (P = .003) and single vertebral artery (P = .004) injuries. Similarly, pulsatility index asymmetry was higher in single internal carotid artery (P = .015) and single vertebral artery (P = .042) injuries, whereas the lowest pulsatility index was elevated for bilateral vertebral artery injuries (P = .006). The decision tree yielded 92% specificity, 93% sensitivity, and 93% correct classifications. In this pilot feasibility study, transcranial Doppler measures were significantly associated with the blunt cervical vascular injury status, suggesting that transcranial Doppler sonography might be a viable bedside screening tool for trauma. Patient-specific hemodynamic information from transcranial Doppler assessment has the potential to alter

  7. A science based approach to topical drug classification system (TCS).

    PubMed

    Shah, Vinod P; Yacobi, Avraham; Rădulescu, Flavian Ştefan; Miron, Dalia Simona; Lane, Majella E

    2015-08-01

    The Biopharmaceutics Classification System (BCS) for oral immediate release solid drug products has been very successful; its implementation in drug industry and regulatory approval has shown significant progress. This has been the case primarily because BCS was developed using sound scientific judgment. Following the success of BCS, we have considered the topical drug products for similar classification system based on sound scientific principles. In USA, most of the generic topical drug products have qualitatively (Q1) and quantitatively (Q2) same excipients as the reference listed drug (RLD). The applications of in vitro release (IVR) and in vitro characterization are considered for a range of dosage forms (suspensions, creams, ointments and gels) of differing strengths. We advance a Topical Drug Classification System (TCS) based on a consideration of Q1, Q2 as well as the arrangement of matter and microstructure of topical formulations (Q3). Four distinct classes are presented for the various scenarios that may arise and depending on whether biowaiver can be granted or not.

  8. A Novel Anti-classification Approach for Knowledge Protection.

    PubMed

    Lin, Chen-Yi; Chen, Tung-Shou; Tsai, Hui-Fang; Lee, Wei-Bin; Hsu, Tien-Yu; Kao, Yuan-Hung

    2015-10-01

    Classification is the problem of identifying the category to which new data belong, on the basis of a set of training data whose category membership is known. Its applications are widespread, for example in the medical science domain. The protection of classification knowledge has received increasing attention in recent years because of the popularity of cloud environments. In this paper, we propose a Shaking Sorted-Sampling (triple-S) algorithm for protecting the classification knowledge of a dataset. The triple-S algorithm sorts the data of an original dataset according to the projection results of the principal components analysis so that the features of the adjacent data are similar. Then, we generate noise data with incorrect classes and add those data to the original dataset. In addition, we develop an effective positioning strategy, determining the added positions of noise data in the original dataset, to ensure the restoration of the original dataset after removing those noise data. The experimental results show that the disturbance effect of the triple-S algorithm on the CLC, MySVM, and LibSVM classifiers increases when the noise data ratio increases. In addition, compared with existing methods, the disturbance effect of the triple-S algorithm is more significant on MySVM and LibSVM once a certain amount of noise data has been added to the original dataset.

  9. Classification as clustering: a Pareto cooperative-competitive GP approach.

    PubMed

    McIntyre, Andrew R; Heywood, Malcolm I

    2011-01-01

    Intuitively population based algorithms such as genetic programming provide a natural environment for supporting solutions that learn to decompose the overall task between multiple individuals, or a team. This work presents a framework for evolving teams without recourse to prespecifying the number of cooperating individuals. To do so, each individual evolves a mapping to a distribution of outcomes that, following clustering, establishes the parameterization of a (Gaussian) local membership function. This gives individuals the opportunity to represent subsets of tasks, where the overall task is that of classification under the supervised learning domain. Thus, rather than each team member representing an entire class, individuals are free to identify unique subsets of the overall classification task. The framework is supported by techniques from evolutionary multiobjective optimization (EMO) and Pareto competitive coevolution. EMO establishes the basis for encouraging individuals to provide accurate yet nonoverlaping behaviors; whereas competitive coevolution provides the mechanism for scaling to potentially large unbalanced datasets. Benchmarking is performed against recent examples of nonlinear SVM classifiers over 12 UCI datasets with between 150 and 200,000 training instances. Solutions from the proposed coevolutionary multiobjective GP framework appear to provide a good balance between classification performance and model complexity, especially as the dataset instance count increases.

  10. Classification of Cardiopulmonary Resuscitation Chest Compression Patterns: Manual Versus Automated Approaches

    PubMed Central

    Wang, Henry E.; Schmicker, Robert H.; Herren, Heather; Brown, Siobhan; Donnelly, John P.; Gray, Randal; Ragsdale, Sally; Gleeson, Andrew; Byers, Adam; Jasti, Jamie; Aguirre, Christina; Owens, Pam; Condle, Joe; Leroux, Brian

    2015-01-01

    observations support the consistency of manual CPR pattern classification as well as the use of automated approaches to chest compression pattern analysis. PMID:25639554

  11. Dynamic species classification of microorganisms across time, abiotic and biotic environments—A sliding window approach

    PubMed Central

    Griffiths, Jason I.; Fronhofer, Emanuel A.; Garnier, Aurélie; Seymour, Mathew; Altermatt, Florian; Petchey, Owen L.

    2017-01-01

    The development of video-based monitoring methods allows for rapid, dynamic and accurate monitoring of individuals or communities, compared to slower traditional methods, with far reaching ecological and evolutionary applications. Large amounts of data are generated using video-based methods, which can be effectively processed using machine learning (ML) algorithms into meaningful ecological information. ML uses user defined classes (e.g. species), derived from a subset (i.e. training data) of video-observed quantitative features (e.g. phenotypic variation), to infer classes in subsequent observations. However, phenotypic variation often changes due to environmental conditions, which may lead to poor classification if environmentally induced variation in phenotypes is not accounted for. Here we describe a framework for classifying species under changing environmental conditions based on random forest classification. A sliding window approach was developed that restricts temporal and environmental conditions to improve the classification. We tested our approach by applying the classification framework to experimental data. The experiment used a set of six ciliate species to monitor changes in community structure and behavior over hundreds of generations, in dozens of species combinations and across a temperature gradient. Differences in biotic and abiotic conditions caused simplistic classification approaches to be unsuccessful. In contrast, the sliding window approach allowed classification to be highly successful, as phenotypic differences driven by environmental change could be captured by the classifier. Importantly, classification using the random forest algorithm showed comparable success when validated against traditional, slower, manual identification. Our framework allows for reliable classification in dynamic environments, and may help to improve strategies for long-term monitoring of species in changing environments. Our classification pipeline

  12. Derivation of Tree Canopy Cover by Multiscale Remote Sensing Approach

    NASA Astrophysics Data System (ADS)

    Wu, W.

    2011-08-01

    In forestry, tree canopy cover (CC) is an important biophysical indicator for characterizing terrestrial ecosystems and modeling global biogeochemical cycles, e.g., woody biomass estimation and carbon balance analysis (sink/emission). However, currently available CC products cannot fully meet the needs of woody biomass estimation in tropical savannas. It is thus necessary to develop an approach to estimate more reliable CC. Based on the acquisition of a multisensor and multiresolution dataset, this study introduces an innovative multiscale method for this purpose, taking Sudan, a country with multiple savannas, as an example. The procedure includes: (1) Measurement of CC using Google Earth Pro, in which very high resolution images such as QuickBird and GeoEye images are available; the measured CC was then coupled with 16 frames of atmospherically corrected, reflectance-based Landsat ETM+ vegetation indices (EVI, SARVI and NDVI) dated Nov 1999-2002 to establish the CC-VIs models; it was noted that among these indices NDVI shows the best correlation with CC (CC = 153.09 NDVI - 10.12, R2 = 0.91); (2) The NDVI of Landsat ETM+ was calibrated against MODIS NDVI of the same time period (Nov 2002) to make sure that the model developed from Landsat ETM+ data can be applied to MODIS data for upscaling to regional scale study; (3) Time-series MODIS NDVI data of the period Jan 2002-Dec 2009 (MODIS13Q1, 250m, 186 acquisitions) were acquired and used to decompose the woody component (NDVI) from seasonal change and the herbaceous component by time-series analysis; (4) The equation obtained in step 1 was applied to the decomposed MODIS woody NDVI images to derive country scale CC data. The produced CC was checked against the 287 ground-measured CC values obtained in step 1 and a good agreement (R2 = 0.53-0.71) was found. It is hence concluded that the proposed multiscale approach is effective, operational and can be applied for reliable estimation of CC data at regional and even continental scales.
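
    As a rough illustration of step 4, the linear model reported in the abstract can be applied to a woody-component NDVI value as sketched below. The function name, the sample NDVI values, the reading of CC as a percentage, and the clamping to the 0-100 range are assumptions made for this sketch and are not taken from the study.

```python
# Linear model reported in the abstract: CC = 153.09 * NDVI - 10.12 (R2 = 0.91).
SLOPE = 153.09
INTERCEPT = -10.12


def canopy_cover_percent(ndvi: float) -> float:
    """Estimate tree canopy cover from a woody-component NDVI value.

    Assumption: CC is a percentage, so the result is clamped to [0, 100].
    The abstract does not describe any clamping.
    """
    cc = SLOPE * ndvi + INTERCEPT
    return max(0.0, min(100.0, cc))


if __name__ == "__main__":
    # Hypothetical NDVI values, chosen only to exercise the function.
    for ndvi in (0.05, 0.20, 0.40):
        print(f"NDVI={ndvi:.2f} -> CC={canopy_cover_percent(ndvi):.1f}")
```

    In a real workflow the same expression would be applied per pixel to the decomposed MODIS woody NDVI images rather than to single values.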

  13. A statistical approach to material classification using image patch exemplars.

    PubMed

    Varma, Manik; Zisserman, Andrew

    2009-11-01

    In this paper, we investigate material classification from single images obtained under unknown viewpoint and illumination. It is demonstrated that materials can be classified using the joint distribution of intensity values over extremely compact neighborhoods (starting from as small as 3 × 3 pixels square) and that this can outperform classification using filter banks with large support. It is also shown that the performance of filter banks is inferior to that of image patches with equivalent neighborhoods. We develop novel texton-based representations which are suited to modeling this joint neighborhood distribution for Markov random fields. The representations are learned from training images and then used to classify novel images (with unknown viewpoint and lighting) into texture classes. Three such representations are proposed and their performance is assessed and compared to that of filter banks. The power of the method is demonstrated by classifying 2,806 images of all 61 materials present in the Columbia-Utrecht database. The classification performance surpasses that of recent state-of-the-art filter bank-based classifiers such as Leung and Malik (IJCV 01), Cula and Dana (IJCV 04), and Varma and Zisserman (IJCV 05). We also benchmark performance by classifying all of the textures present in the UIUC, Microsoft Textile, and San Francisco outdoor data sets. We conclude with discussions on why features based on compact neighborhoods can correctly discriminate between textures with large global structure and why the performance of filter banks is not superior to that of the source image patches from which they were derived.

  14. A Phenotypic Approach for IUIS PID Classification and Diagnosis: Guidelines for Clinicians at the Bedside

    PubMed Central

    Jeddane, Leïla; Ailal, Fatima; Al Herz, Waleed; Conley, Mary Ellen; Cunningham-Rundles, Charlotte; Etzioni, Amos; Fischer, Alain; Franco, Jose Luis; Geha, Raif S.; Hammarström, Lennart; Nonoyama, Shigeaki; Ochs, Hans D.; Roifman, Chaim M.; Seger, Reinhard; Tang, Mimi L. K.; Puck, Jennifer M.; Chapel, Helen; Notarangelo, Luigi D.; Casanova, Jean-Laurent

    2014-01-01

    The number of genetically defined Primary Immunodeficiency Diseases (PID) has increased exponentially, especially in the past decade. The biennial classification published by the IUIS PID expert committee is therefore quickly expanding, providing valuable information regarding the disease-causing genotypes, the immunological anomalies, and the associated clinical features of PIDs. These are grouped in eight, somewhat overlapping, categories of immune dysfunction. However, based on this immunological classification, the diagnosis of a specific PID from the clinician’s observation of an individual clinical and/or immunological phenotype remains difficult, especially for non-PID specialists. The purpose of this work is to suggest a phenotypic classification that forms the basis for diagnostic trees, leading the physician to particular groups of PIDs, starting from clinical features and combining routine immunological investigations along the way. We present 8 colored diagnostic figures that correspond to the 8 PID groups in the IUIS Classification, including all the PIDs cited in the 2011 update of the IUIS classification and most of those reported since. PMID:23657403

  15. Bayesian Evidence Framework for Decision Tree Learning

    NASA Astrophysics Data System (ADS)

    Chatpatanasiri, Ratthachat; Kijsirikul, Boonserm

    2005-11-01

    This work is primarily interested in the problem of selecting a single decision (or classification) tree given the observed data. Although a single decision tree has a high risk of overfitting, the induced tree is easily interpreted. Researchers have invented various methods, such as tree pruning or tree averaging, for preventing the induced tree from overfitting (and from underfitting) the data. In this paper, instead of using those conventional approaches, we apply the Bayesian evidence framework of Gull, Skilling and Mackay to the process of selecting a decision tree. We derive a formal function to measure `the fitness' of each decision tree given a set of observed data. Our method, in fact, is analogous to a well-known Bayesian model selection method for interpolating noisy continuous-value data. As in regression problems, given reasonable assumptions, this derived score function automatically quantifies the principle of Ockham's razor, and hence reasonably deals with the underfitting-overfitting tradeoff.

  16. Classification

    NASA Technical Reports Server (NTRS)

    Oza, Nikunj C.

    2011-01-01

    A supervised learning task involves constructing a mapping from input data (normally described by several features) to the appropriate outputs. Within supervised learning, one type of task is a classification learning task, in which each output is one or more classes to which the input belongs. In supervised learning, a set of training examples---examples with known output values---is used by a learning algorithm to generate a model. This model is intended to approximate the mapping between the inputs and outputs. This model can be used to generate predicted outputs for inputs that have not been seen before. For example, we may have data consisting of observations of sunspots. In a classification learning task, our goal may be to learn to classify sunspots into one of several types. Each example may correspond to one candidate sunspot with various measurements or just an image. A learning algorithm would use the supplied examples to generate a model that approximates the mapping between each supplied set of measurements and the type of sunspot. This model can then be used to classify previously unseen sunspots based on the candidate's measurements. This chapter discusses methods to perform machine learning, with examples involving astronomy.

  17. A bayesian approach to classification criteria for spectacled eiders

    USGS Publications Warehouse

    Taylor, B.L.; Wade, P.R.; Stehn, R.A.; Cochrane, J.F.

    1996-01-01

    To facilitate decisions to classify species according to risk of extinction, we used Bayesian methods to analyze trend data for the Spectacled Eider, an arctic sea duck. Trend data from three independent surveys of the Yukon-Kuskokwim Delta were analyzed individually and in combination to yield posterior distributions for population growth rates. We used classification criteria developed by the recovery team for Spectacled Eiders that seek to equalize errors of under- or overprotecting the species. We conducted both a Bayesian decision analysis and a frequentist (classical statistical inference) decision analysis. Bayesian decision analyses are computationally easier, yield basically the same results, and yield results that are easier to explain to nonscientists. With the exception of the aerial survey analysis of the 10 most recent years, both Bayesian and frequentist methods indicated that an endangered classification is warranted. The discrepancy between surveys warrants further research. Although the trend data are abundance indices, we used a preliminary estimate of absolute abundance to demonstrate how to calculate extinction distributions using the joint probability distributions for population growth rate and variance in growth rate generated by the Bayesian analysis. Recent apparent increases in abundance highlight the need for models that apply to declining and then recovering species.

  18. A Hybrid Approach to Sentiment Sentence Classification in Suicide Notes

    PubMed Central

    Sohn, Sunghwan; Torii, Manabu; Li, Dingcheng; Wagholikar, Kavishwar; Wu, Stephen; Liu, Hongfang

    2012-01-01

    This paper describes the sentiment classification system developed by the Mayo Clinic team for the 2011 I2B2/VA/Cincinnati Natural Language Processing (NLP) Challenge. The sentiment classification task is to assign any pertinent emotion to each sentence in suicide notes. We have implemented three systems that have been trained on suicide notes provided by the I2B2 challenge organizer—a machine learning system, a rule-based system, and a system consisting of a combination of both. Our machine learning system was trained on re-annotated data in which apparently inconsistent emotion assignment was adjusted. Then, the machine learning methods by RIPPER and multinomial Naïve Bayes classifiers, manual pattern matching rules, and the combination of the two systems were tested to determine the emotions within sentences. The combination of the machine learning and rule-based system performed best and produced a micro-average F-score of 0.5640. PMID:22879759

  19. Comparative Study on the Different Testing Techniques in Tree Classification for Detecting the Learning Motivation

    NASA Astrophysics Data System (ADS)

    Juliane, C.; Arman, A. A.; Sastramihardja, H. S.; Supriana, I.

    2017-03-01

    Motivation to learn is a requirement for a successful learning process, and it needs to be maintained properly. This study aims to measure learning motivation, especially in the process of electronic learning (e-learning). A data mining approach was chosen as the research method. For the testing process, a comparative study of the accuracy of different testing techniques was conducted, involving Cross Validation and Percentage Split. The best accuracy was obtained by the J48 algorithm with the percentage split technique, reaching 92.19%. This study provides an overview of how to detect the presence of learning motivation in the context of e-learning. It is expected to be a good contribution to education, and to alert teachers to the students for whom they have to provide motivation.

  20. Mode of Action (MOA) Assignment Classifications for Ecotoxicology: An Evaluation of Approaches.

    PubMed

    Kienzler, A; Barron, M G; Belanger, S E; Beasley, A; Embry, M R

    2017-09-05

    The mode of toxic action (MOA) is recognized as a key determinant of chemical toxicity and as an alternative to chemical class-based predictive toxicity modeling. However, MOA classification has never been standardized in ecotoxicology, and a comprehensive comparison of classification tools and approaches has never been reported. Here we critically evaluate three MOA classification methodologies using an aquatic toxicity data set of 3448 chemicals, compare the approaches, and assess utility and limitations in screening and early tier assessments. The comparisons focused on three commonly used tools: Verhaar prediction of toxicity MOA, the U.S. Environmental Protection Agency (EPA) ASsessment Tool for Evaluating Risk (ASTER) QSAR (quantitative structure activity relationship) application, and the EPA Mode of Action and Toxicity (MOAtox) database. Of the 3448 MOAs predicted using the Verhaar scheme, 1165 were classified by ASTER, and 802 were available in MOAtox. Of the subset of 432 chemicals with MOA assignments for each of the three schemes, 42% had complete concordance in MOA classification, and there was no agreement for 7% of the chemicals. The research shows the potential for large differences in MOA classification between the five broad groups of the Verhaar scheme and the more mechanism-based assignments of ASTER and MOAtox. Harmonization of classification schemes is needed to use MOA classification in chemical hazard and risk assessment more broadly.

  1. Automatic Pulmonary Artery-Vein Separation and Classification in Computed Tomography Using Tree Partitioning and Peripheral Vessel Matching.

    PubMed

    Charbonnier, Jean-Paul; Brink, Monique; Ciompi, Francesco; Scholten, Ernst T; Schaefer-Prokop, Cornelia M; van Rikxoort, Eva M

    2016-03-01

    We present a method for automatic separation and classification of pulmonary arteries and veins in computed tomography. Our method takes advantage of local information to separate segmented vessels, and global information to perform the artery-vein classification. Given a vessel segmentation, a geometric graph is constructed that represents both the topology and the spatial distribution of the vessels. All nodes in the geometric graph where arteries and veins are potentially merged are identified based on graph pruning and individual branching patterns. At the identified nodes, the graph is split into subgraphs that each contain only arteries or veins. Based on the anatomical information that arteries and veins approach a common alveolar sac, an arterial subgraph is expected to be intertwined with a venous subgraph in the periphery of the lung. This relationship is quantified using periphery matching and is used to group subgraphs of the same artery-vein class. Artery-vein classification is performed on these grouped subgraphs based on the volumetric difference between arteries and veins. A quantitative evaluation was performed on 55 publicly available non-contrast CT scans. In all scans, two observers manually annotated randomly selected vessels as artery or vein. Our method was able to separate and classify arteries and veins with a median accuracy of 89%, closely approximating the inter-observer agreement. All CT scans used in this study, including all results of our system and all manual annotations, are publicly available at http://arteryvein.grand-challenge.org.

  2. Hierarchical Object-based Image Analysis approach for classification of sub-meter multispectral imagery in Tanzania

    NASA Astrophysics Data System (ADS)

    Chung, C.; Nagol, J. R.; Tao, X.; Anand, A.; Dempewolf, J.

    2015-12-01

    Increasing agricultural production while at the same time preserving the environment has become a challenging task. There is a need for new approaches for use of multi-scale and multi-source remote sensing data as well as ground based measurements for mapping and monitoring crop and ecosystem state to support decision making by governmental and non-governmental organizations for sustainable agricultural development. High resolution sub-meter imagery plays an important role in such an integrative framework of landscape monitoring. It helps link the ground based data to more easily available coarser resolution data, facilitating calibration and validation of derived remote sensing products. Here we present a hierarchical Object Based Image Analysis (OBIA) approach to classify sub-meter imagery. The primary reason for choosing OBIA is to accommodate pixel sizes smaller than the object or class of interest. Especially in non-homogeneous savannah regions of Tanzania, this is an important concern and the traditional pixel based spectral signature approach often fails. Ortho-rectified, calibrated, pan sharpened 0.5 meter resolution data acquired from DigitalGlobe's WorldView-2 satellite sensor was used for this purpose. Multi-scale hierarchical segmentation was performed using a multi-resolution segmentation approach to facilitate the use of texture, neighborhood context, and the relationship between super and sub objects for training and classification. eCognition, a commonly used OBIA software program, was used for this purpose. Both decision tree and random forest approaches for classification were tested. The Kappa index of agreement for both algorithms exceeded 85%. The results demonstrate that hierarchical OBIA can effectively and accurately discriminate classes even at the LCCS-3 legend level.
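
    For readers unfamiliar with the Kappa index of agreement mentioned above, the sketch below shows how Cohen's kappa is commonly computed from a confusion matrix. The function name and the example matrix are hypothetical; the abstract does not report the study's confusion matrices or say how eCognition computed the index.

```python
def cohens_kappa(confusion):
    """Cohen's kappa for a square confusion matrix.

    Rows are reference classes and columns are predicted classes.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)

    # Observed agreement: share of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(n_classes)) / total

    # Agreement expected by chance, from the row and column marginals.
    row_totals = [sum(row) for row in confusion]
    col_totals = [
        sum(confusion[i][j] for i in range(n_classes)) for j in range(n_classes)
    ]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)

    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Hypothetical two-class confusion matrix, not data from the study.
    example = [[45, 5], [5, 45]]
    print(f"kappa = {cohens_kappa(example):.2f}")
```

    Kappa discounts the agreement that would occur by chance, which is why it is usually lower than overall accuracy for the same confusion matrix.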

  3. A novel approach to malignant-benign classification of pulmonary nodules by using ensemble learning classifiers.

    PubMed

    Tartar, A; Akan, A; Kilic, N

    2014-01-01

    Computer-aided detection systems can help radiologists to detect pulmonary nodules at an early stage. In this paper, a novel Computer-Aided Diagnosis (CAD) system is proposed for the classification of pulmonary nodules as malignant or benign. The proposed CAD system, which uses ensemble learning classifiers, provides important support to radiologists in the diagnosis process and achieves high classification performance. The proposed approach with a bagging classifier yields classification sensitivities of 94.7%, 90.0% and 77.8% for the benign, malignant and undetermined classes, respectively (89.5% accuracy).

  4. Gene selection approach based on improved swarm intelligent optimisation algorithm for tumour classification.

    PubMed

    Jin, Cong; Jin, Shu-Wei

    2016-06-01

    A number of different gene selection approaches based on gene expression profiles (GEP) have been developed for tumour classification. A gene selection approach selects the most informative genes from the whole gene space, which is an important process for tumour classification using GEP. This study presents an improved swarm intelligent optimisation algorithm to select genes while maintaining the diversity of the population. The most essential characteristic of the proposed approach is that it can automatically determine the number of selected genes. On the basis of the gene selection, the authors construct a variety of tumour classifiers, including ensemble classifiers. Four gene datasets are used to evaluate the performance of the proposed approach. The experimental results confirm that the proposed classifiers for tumour classification are indeed effective.

  5. A support vector machine using the lazy learning approach for multi-class classification.

    PubMed

    Comak, E; Arslan, A

    2006-01-01

    Support vector machines can be used in a new machine learning technique based on statistical learning. In this paper, we develop least squares support vector machines (LS-SVMs) using the lazy learning approach to classify data in unclassifiable regions in the case of multi-class classification. LS-SVMs use a set of linear equations while SVMs use a quadratic programming problem. The lazy learning approach is a local and memory-based technique. Therefore, it is an alternative technique to fuzzy inference systems. Our studies show that LS-SVMs with the lazy learning approach can give comparable results to fuzzy LS-SVMs for multi-class classification.
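
    The remark that LS-SVMs use a set of linear equations can be illustrated with the sketch below, which trains a binary LS-SVM classifier by solving the standard block system [[0, y^T], [y, Omega + I/gamma]] [b; alpha] = [0; 1], with Omega_ij = y_i y_j K(x_i, x_j). This is only an illustration of that linear system under assumptions: the RBF kernel, the parameter values, the toy data, and all function names are chosen here, and the lazy learning step and the multi-class handling described in the abstract are not reproduced.

```python
import math


def rbf_kernel(a, b, sigma=1.0):
    """Gaussian (RBF) kernel between two points given as sequences."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        partial = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - partial) / aug[i][i]
    return x


def train_ls_svm(points, labels, gamma=10.0, sigma=1.0):
    """Return (bias, alphas) for labels in {-1, +1} by solving one linear system."""
    n = len(points)
    system = [[0.0] + [float(y) for y in labels]]  # first row: [0, y^T]
    for i in range(n):
        row = [float(labels[i])]
        for j in range(n):
            omega = labels[i] * labels[j] * rbf_kernel(points[i], points[j], sigma)
            row.append(omega + (1.0 / gamma if i == j else 0.0))
        system.append(row)
    solution = solve_linear_system(system, [0.0] + [1.0] * n)
    return solution[0], solution[1:]


def predict(x, points, labels, bias, alphas, sigma=1.0):
    """Classify x by the sign of the LS-SVM decision function."""
    score = bias + sum(
        a * y * rbf_kernel(x, p, sigma) for a, y, p in zip(alphas, labels, points)
    )
    return 1 if score >= 0 else -1


if __name__ == "__main__":
    # Hypothetical toy data: two well-separated clusters.
    pts = [(0.0, 0.0), (0.0, 1.0), (3.0, 3.0), (3.0, 4.0)]
    lbl = [-1, -1, 1, 1]
    b, alphas = train_ls_svm(pts, lbl)
    print(predict((0.0, 0.5), pts, lbl, b, alphas))
    print(predict((3.0, 3.5), pts, lbl, b, alphas))
```

    A standard SVM would instead require a quadratic programming solver for the same training step, which is the contrast the abstract draws.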

  6. A data driven approach for condition monitoring of wind turbine blade using vibration signals through best-first tree algorithm and functional trees algorithm: A comparative study.

    PubMed

    Joshuva, A; Sugumaran, V

    2017-03-01

    Wind energy is one of the important renewable energy resources available in nature. It is one of the major resources for production of energy because of its dependability due to the development of the technology and relatively low cost. Wind energy is converted into electrical energy using rotating blades. Due to environmental conditions and large structure, the blades are subjected to various vibration forces that may cause damage to the blades. This leads to a liability in energy production and turbine shutdown. The downtime can be reduced when the blades are diagnosed continuously using structural health condition monitoring. This is treated as a pattern recognition problem consisting of three phases, namely feature extraction, feature selection, and feature classification. In this study, statistical features were extracted from vibration signals, feature selection was carried out using a J48 decision tree algorithm, and feature classification was performed using the best-first tree algorithm and the functional trees algorithm. The better algorithm is suggested for fault diagnosis of wind turbine blades.

  7. Characterizing Vocal Repertoires—Hard vs. Soft Classification Approaches

    PubMed Central

    Wadewitz, Philip; Hammerschmidt, Kurt; Battaglia, Demian; Witt, Annette; Wolf, Fred; Fischer, Julia

    2015-01-01

    To understand the proximate and ultimate causes that shape acoustic communication in animals, objective characterizations of the vocal repertoire of a given species are critical, as they provide the foundation for comparative analyses among individuals, populations and taxa. Progress in this field has been hampered by a lack of standard methodology, however. One problem is that researchers may settle on different variables to characterize the calls, which may affect the classification of calls. More importantly, there is no agreement on how best to characterize the overall structure of the repertoire in terms of the amount of gradation within and between call types. Here, we address these challenges by examining 912 calls recorded from wild chacma baboons (Papio ursinus). We extracted 118 acoustic variables from spectrograms, from which we constructed different sets of acoustic features, containing 9, 38, and 118 variables, as well as 19 factors derived from principal component analysis. We compared and validated the resulting classifications of k-means and hierarchical clustering. Datasets with a higher number of acoustic features lead to better clustering results than datasets with only a few features. The use of factors in the cluster analysis resulted in an extremely poor resolution of emerging call types. Another important finding is that none of the applied clustering methods gave strong support to a specific cluster solution. Instead, the cluster analysis revealed that within distinct call types, subtypes may exist. Because hard clustering methods are not well suited to capture such gradation within call types, we applied a fuzzy clustering algorithm. We found that this algorithm provides a detailed and quantitative description of the gradation within and between chacma baboon call types. In conclusion, we suggest that fuzzy clustering should be used in future studies to analyze the graded structure of vocal repertoires. Moreover, the use of factor analyses to

  8. Developmental Structuralist Approach to the Classification of Adaptive and Pathologic Personality Organizations: Infancy and Early Childhood.

    ERIC Educational Resources Information Center

    Greenspan, Stanley I.; Lourie, Reginald S.

    This paper applies a developmental structuralist approach to the classification of adaptive and pathologic personality organizations and behavior in infancy and early childhood, and it discusses implications of this approach for preventive intervention. In general, as development proceeds, the structural capacity of the developing infant and child…

  9. A New Approach in Teaching the Features and Classifications of Invertebrate Animals in Biology Courses

    ERIC Educational Resources Information Center

    Sezek, Fatih

    2013-01-01

    This study examined the effectiveness of a new learning approach in teaching classification of invertebrate animals in biology courses. In this approach, we used an impersonal style: the subject jigsaw, which differs from the other jigsaws in that both course topics and student groups are divided. Students in Jigsaw group were divided into five…

  10. The usefulness of a classification and regression tree algorithm for detecting perioperative transfusion-related pulmonary complications.

    PubMed

    Kim, Kyu Nam; Kim, Dong Won; Jeong, Mi Ae

    2015-11-01

    Transfusion-related acute lung injury (TRALI) and transfusion-associated circulatory overload (TACO) are leading causes of transfusion-related mortality. An electronic medical record-based screening classification and regression tree (CART) algorithm was previously developed for predicting transfusion-related pulmonary complications. In the Republic of Korea, TRALI is not sufficiently recognized and an accurate TRALI incidence has not been reported. Therefore, we carried out this study to assess the incidence of TRALI and to determine whether the CART algorithm can be applied to our hospital data. A retrospective analysis of all patients who received any type of transfusion during anesthesia was performed. After the patients were diagnosed by the relevant diagnostic criteria, they were reclassified by the CART algorithm. The validity of the algorithm was evaluated with sensitivity, specificity, likelihood ratios, and misclassification rate. Among 1948 patients who had received 11,269 units of transfusion, 26 TRALI and 20 TACO cases were identified. The incidence of TRALI among the transfused patients was 1.33% and per unit of transfused blood product was 0.23%. The sensitivity and specificity of the TRALI algorithm were estimated to be 73.1% (95% confidence interval [CI], 53.9%-86.3%) and 57.0% (95% CI, 52.5%-61.4%). For TACO, the sensitivity and specificity were 90.0% (95% CI, 69.9%-97.2%) and 56.0% (95% CI, 51.6%-60.4%), respectively. Low specificity of the CART algorithm adopted previously indicated its limited diagnostic value in the Republic of Korea. A new algorithm is needed to facilitate the detection of transfusion-related complications. © 2015 AABB.
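
    The incidence figures above follow directly from the reported counts, as the short sketch below shows. The true-positive counts used for the sensitivity check (19 of 26 TRALI cases, 18 of 20 TACO cases) are assumptions back-calculated from the reported percentages; the abstract does not state them, and the helper names are hypothetical.

```python
def percent(numerator, denominator):
    """Express a ratio as a percentage."""
    return 100.0 * numerator / denominator


def sensitivity(true_positives, false_negatives):
    """Share of actual cases that the algorithm flags, in percent."""
    return percent(true_positives, true_positives + false_negatives)


# Counts reported in the abstract.
PATIENTS = 1948
UNITS = 11269
TRALI_CASES = 26
TACO_CASES = 20

trali_per_patient = percent(TRALI_CASES, PATIENTS)  # reported as 1.33%
trali_per_unit = percent(TRALI_CASES, UNITS)        # reported as 0.23%

# Assumed counts, inferred from the reported sensitivities.
trali_sensitivity = sensitivity(19, TRALI_CASES - 19)
taco_sensitivity = sensitivity(18, TACO_CASES - 18)

if __name__ == "__main__":
    print(f"TRALI incidence per patient: {trali_per_patient:.2f}%")
    print(f"TRALI incidence per unit:    {trali_per_unit:.2f}%")
    print(f"TRALI sensitivity (assumed counts): {trali_sensitivity:.1f}%")
    print(f"TACO sensitivity (assumed counts):  {taco_sensitivity:.1f}%")
```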

  11. A multilayered approach for the analysis of perinatal mortality using different classification systems.

    PubMed

    Gordijn, Sanne J; Korteweg, Fleurisca J; Erwich, Jan Jaap H M; Holm, Jozien P; van Diem, Mariet Th; Bergman, Klasien A; Timmer, Albertus

    2009-06-01

    Many classification systems for perinatal mortality are available, all with their own strengths and weaknesses: none of them has been universally accepted. We present a systematic multilayered approach for the analysis of perinatal mortality based on information related to the moment of death, the conditions associated with death and the underlying cause of death, using a combination of representatives of existing classification systems. We compared the existing classification systems regarding their definition of the perinatal period, level of complexity, inclusion of maternal, foetal and/or placental factors and whether they focus on a clinical or pathological viewpoint. Furthermore, we allocated the classification systems to one of three categories: 'when', 'what' or 'why', depending on whether the allocation of the individual cases of perinatal mortality is based on the moment of death ('when'), the clinical conditions associated with death ('what'), or the underlying cause of death ('why'). A multilayered approach for the analysis and classification of perinatal mortality is possible by using combinations of existing systems; for example the Wigglesworth or Nordic Baltic ('when'), ReCoDe ('what') and Tulip ('why') classification systems. This approach is useful not only for in depth analysis of perinatal mortality in the developed world but also for analysis of perinatal mortality in developing countries, where resources to investigate death are often limited.

  12. The development of a classification schema for arts-based approaches to knowledge translation.

    PubMed

    Archibald, Mandy M; Caine, Vera; Scott, Shannon D

    2014-10-01

    Arts-based approaches to knowledge translation are emerging as powerful interprofessional strategies with potential to facilitate evidence uptake, communication, knowledge, attitude, and behavior change across healthcare provider and consumer groups. These strategies are in the early stages of development. To date, no classification system for arts-based knowledge translation exists, which limits development and understandings of effectiveness in evidence syntheses. We developed a classification schema of arts-based knowledge translation strategies based on two mechanisms by which these approaches function: (a) the degree of precision in key message delivery, and (b) the degree of end-user participation. We demonstrate how this classification is necessary to explore how context, time, and location shape arts-based knowledge translation strategies. Classifying arts-based knowledge translation strategies according to their core attributes extends understandings of the appropriateness of these approaches for various healthcare settings and provider groups. The classification schema developed may enhance understanding of how, where, and for whom arts-based knowledge translation approaches are effective, and enable theorizing of essential knowledge translation constructs, such as the influence of context, time, and location on utilization strategies. The classification schema developed may encourage systematic inquiry into the effectiveness of these approaches in diverse interprofessional contexts. © 2014 Sigma Theta Tau International.

  13. Frugivores bias seed-adult tree associations through nonrandom seed dispersal: a phylogenetic approach.

    PubMed

    Razafindratsima, Onja H; Dunham, Amy E

    2016-08-01

    Frugivores are the main seed dispersers in many ecosystems, such that behaviorally driven, nonrandom patterns of seed dispersal are a common process; but patterns are poorly understood. Characterizing these patterns may be essential for understanding spatial organization of fruiting trees and drivers of seed-dispersal limitation in biodiverse forests. To address this, we studied resulting spatial associations between dispersed seeds and adult tree neighbors in a diverse rainforest in Madagascar, using a temporal and phylogenetic approach. Data show that by using fruiting trees as seed-dispersal foci, frugivores bias seed dispersal under conspecific adults and under heterospecific trees that share dispersers and fruiting time with the dispersed species. Frugivore-mediated seed dispersal also resulted in nonrandom phylogenetic associations of dispersed seeds with their nearest adult neighbors, in nine out of the 16 months of our study. However, these nonrandom phylogenetic associations fluctuated unpredictably over time, ranging from clustered to overdispersed. The spatial and phylogenetic template of seed dispersal did not translate to similar patterns of association in adult tree neighborhoods, suggesting the importance of post-dispersal processes in structuring plant communities. Results suggest that frugivore-mediated seed dispersal is important for structuring early stages of plant-plant associations, setting the template for post-dispersal processes that influence ultimate patterns of plant recruitment. Importantly, if biased patterns of dispersal are common in other systems, frugivores may promote tree coexistence in biodiverse forests by limiting the frequency and diversity of heterospecific interactions of seeds they disperse. © 2016 by the Ecological Society of America.

  14. TRELLIS+: an effective approach for indexing genome-scale sequences using suffix trees.

    PubMed

    Phoophakdee, Benjarath; Zaki, Mohammed J

    2008-01-01

    With advances in high-throughput sequencing methods, and the corresponding exponential growth in sequence data, it has become critical to develop scalable data management techniques for sequence storage, retrieval and analysis. In this paper we present a novel disk-based suffix tree approach, called TRELLIS+, that effectively scales to massive amounts of sequence data using only a limited amount of main memory, based on a novel string buffering strategy. We show experimentally that TRELLIS+ outperforms existing suffix tree approaches; it is able to index genome-scale sequences (e.g., the entire human genome), and it also allows rapid query processing over the disk-based index. TRELLIS+ source code is available online at http://www.cs.rpi.edu/~zaki/software/trellis
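    To make the idea of a suffix-based index concrete, the following is a minimal in-memory sketch of a naive suffix trie and a substring query against it. It is not the TRELLIS+ algorithm: the disk-based construction and string buffering strategy are not described in the abstract, and the function names `build_suffix_trie` and `contains` are hypothetical, chosen only for illustration.

```python
def build_suffix_trie(text):
    """Build a naive suffix trie as nested dicts keyed by character.

    Illustration only: this takes O(n^2) time and space and lives
    entirely in memory, unlike a disk-based suffix tree.
    """
    root = {}
    for start in range(len(text)):
        node = root
        for ch in text[start:]:
            node = node.setdefault(ch, {})
    return root


def contains(trie, pattern):
    """Return True if pattern occurs as a substring of the indexed text."""
    node = trie
    for ch in pattern:
        if ch not in node:
            return False
        node = node[ch]
    return True


# Hypothetical toy sequence, not data from the paper.
index = build_suffix_trie("GATTACA")
print(contains(index, "TTAC"))
print(contains(index, "TAG"))
```

    Every substring of the text is a prefix of some suffix, so a query only has to walk the trie from the root, one character at a time, regardless of how long the indexed text is.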

  15. A Voronoi interior adjacency-based approach for generating a contour tree

    NASA Astrophysics Data System (ADS)

    Chen, Jun; Qiao, Chaofei; Zhao, Renliang

    2004-05-01

    A contour tree is a good graphical tool for representing the spatial relations of contour lines and has found many applications in map generalization, map annotation, terrain analysis, etc. A new approach for generating contour trees by introducing a Voronoi-based interior adjacency set concept is proposed in this paper. The immediate interior adjacency set is employed to identify all of the child contours of each contour without using contour elevations. It has advantages over existing methods such as the point-in-polygon method and the region growing-based method. This new approach can be used for spatial data mining and knowledge discovery, such as the automatic extraction of terrain features and the construction of multi-resolution digital elevation models.

  16. An improved spanning tree approach for the reliability analysis of supply chain collaborative network

    NASA Astrophysics Data System (ADS)

    Lam, C. Y.; Ip, W. H.

    2012-11-01

    A higher degree of reliability in the collaborative network can increase the competitiveness and performance of an entire supply chain. As supply chain networks grow more complex, the consequences of unreliable behaviour become increasingly severe in terms of cost, effort and time. Moreover, it is computationally difficult to calculate the network reliability of a Non-deterministic Polynomial-time hard (NP-hard) all-terminal network using state enumeration, as this may require a huge number of iterations for topology optimisation. Therefore, this paper proposes an alternative approach of an improved spanning tree for reliability analysis to help effectively evaluate and analyse the reliability of collaborative networks in supply chains and reduce the comparative computational complexity of algorithms. Set theory is employed to evaluate and model the all-terminal reliability of the improved spanning tree algorithm, and a case study of a supply chain used in lamp production is presented to illustrate the application of the proposed approach.
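    The cost of state enumeration mentioned above can be seen in a small sketch. The code below computes all-terminal reliability by brute force, summing the probability of every edge up/down state in which the network stays connected. It assumes independent edge failures and perfectly reliable nodes, and it shows only the baseline method; the improved spanning tree algorithm itself is not specified in the abstract. All names and the example network are hypothetical.

```python
from itertools import product


def is_connected(nodes, edges):
    """Check whether the working edges connect every node (depth-first search)."""
    nodes = list(nodes)
    if not nodes:
        return True
    adjacency = {n: set() for n in nodes}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    seen = {nodes[0]}
    stack = [nodes[0]]
    while stack:
        current = stack.pop()
        for neighbour in adjacency[current]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return len(seen) == len(nodes)


def all_terminal_reliability(nodes, edge_reliability):
    """Sum the probabilities of all edge states that keep the network connected.

    Assumes independent edge failures. The loop visits 2**m states for
    m edges, which is why enumeration does not scale.
    """
    edges = list(edge_reliability)
    total = 0.0
    for state in product([True, False], repeat=len(edges)):
        probability = 1.0
        working = []
        for edge, up in zip(edges, state):
            p = edge_reliability[edge]
            probability *= p if up else (1.0 - p)
            if up:
                working.append(edge)
        if is_connected(nodes, working):
            total += probability
    return total


# Hypothetical three-partner network with each link 90% reliable.
triangle = {("A", "B"): 0.9, ("B", "C"): 0.9, ("A", "C"): 0.9}
reliability = all_terminal_reliability(["A", "B", "C"], triangle)
print(round(reliability, 3))
```

    For the triangle, the network stays connected whenever at least two of the three links work, so the result equals 0.9^3 + 3(0.9^2)(0.1) = 0.972. Each additional edge doubles the number of states to visit.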

  17. Neuropsychological Test Selection for Cognitive Impairment Classification: A Machine Learning Approach

    PubMed Central

    Williams, Jennifer A.; Schmitter-Edgecombe, Maureen; Cook, Diane J.

    2016-01-01

    Introduction: Reducing the amount of testing required to accurately detect cognitive impairment is clinically relevant. The aim of this research was to determine the fewest number of clinical measures required to accurately classify participants as healthy older adult, mild cognitive impairment (MCI) or dementia using a suite of classification techniques. Methods: Two variable selection machine learning models (i.e., naive Bayes, decision tree), a logistic regression, and two participant datasets (i.e., clinical diagnosis, clinical dementia rating; CDR) were explored. Participants classified using clinical diagnosis criteria included 52 individuals with dementia, 97 with MCI, and 161 cognitively healthy older adults. Participants classified using CDR included 154 individuals with CDR = 0, 93 individuals with CDR = 0.5, and 25 individuals with CDR = 1.0+. Twenty-seven demographic, psychological, and neuropsychological variables were available for variable selection. Results: No significant difference was observed between naive Bayes, decision tree, and logistic regression models for classification of both clinical diagnosis and CDR datasets. Participant classification accuracy (70.0 – 99.1%), geometric mean (60.9 – 98.1%), sensitivity (44.2 – 100%), and specificity (52.7 – 100%) were generally satisfactory. Unsurprisingly, the MCI/CDR = 0.5 participant group was the most challenging to classify. Through variable selection only 2 – 9 variables were required for classification and varied between datasets in a clinically meaningful way. Conclusions: The current study results reveal that machine learning techniques can accurately classify cognitive impairment and reduce the number of measures required for diagnosis. PMID:26332171
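    The four reported measures can be computed from a confusion matrix, as the sketch below shows for a two-class case. The abstract does not define its geometric mean; the code assumes the common definition, the square root of sensitivity times specificity. The labels are made-up toy data, not the study's, and `binary_metrics` is a hypothetical helper name.

```python
import math


def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity, specificity and geometric mean.

    Assumption: geometric mean = sqrt(sensitivity * specificity).
    Ratios with a zero denominator are reported as NaN.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else float("nan")

    sensitivity = ratio(tp, tp + fn)
    specificity = ratio(tn, tn + fp)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "g_mean": math.sqrt(sensitivity * specificity),
    }


# Toy labels: 1 = impaired, 0 = healthy (illustrative only).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

    Unlike accuracy, the geometric mean drops sharply when either class is classified poorly, which makes it informative for unbalanced groups such as the small CDR = 1.0+ sample.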

  18. Tropical forest structure characterization using airborne lidar data: an individual tree level approach

    NASA Astrophysics Data System (ADS)

    Ferraz, A.; Saatchi, S. S.

    2015-12-01

    Fine-scale tropical forest structure characterization has been performed by means of field measurement techniques that record both the species and the diameter at breast height (dbh) for every tree within a given area. Due to dense and complex vegetation, additional important ecological variables (e.g. the tree height and crown size) are usually not measured because they are difficult to observe from the ground. The poor knowledge of the 3D tropical forest structure has been a major limitation for the understanding of different ecological issues such as the spatial distribution of carbon stocks, regeneration and competition dynamics and light penetration gradient assessments. Airborne laser scanning (ALS) is an active remote sensing technique that provides georeferenced distance measurements between the aircraft and the surface. It provides an unstructured 3D point cloud that is a high-resolution model of the forest. This study presents the first approach for tropical forest characterization at a fine scale using remote sensing data. The multi-modal lidar point cloud is decomposed into 3D clusters that correspond to single trees by means of a technique called Adaptive Mean Shift Segmentation (AMS3D). The ability of the corresponding individual tree metrics (tree height, crown area and crown volume) to estimate above ground biomass (AGB) over the 50 ha CTFS plot in Barro Colorado Island is assessed here. We conclude that our approach is able to map the AGB spatial distribution with an error of nearly 12% (RMSE = 28 Mg ha⁻¹) compared with field-based estimates over 1 ha plots.
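    As a brief illustration of how an RMSE and a percentage error of this kind relate, the sketch below computes both for per-plot biomass estimates. It assumes the percentage is the RMSE divided by the mean field-based value, which the abstract does not state. The plot values are invented for the example and are not the study's data.

```python
import math


def rmse(reference, predicted):
    """Root-mean-square error between paired reference and predicted values."""
    squared = [(p - r) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def relative_rmse(reference, predicted):
    """RMSE as a fraction of the mean reference value (an assumed definition)."""
    mean_reference = sum(reference) / len(reference)
    return rmse(reference, predicted) / mean_reference


# Invented per-plot above ground biomass values in Mg per hectare.
field_agb = [200.0, 250.0, 300.0]
lidar_agb = [210.0, 240.0, 320.0]
print(round(rmse(field_agb, lidar_agb), 2))
print(round(100 * relative_rmse(field_agb, lidar_agb), 1))
```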

  19. A distributed coding approach for stereo sequences in the tree structured Haar transform domain

    NASA Astrophysics Data System (ADS)

    Cancellaro, M.; Carli, M.; Neri, A.

    2009-02-01

    In this contribution, a novel method for distributed video coding of stereo sequences is proposed. The system independently encodes the left and right frames of the stereoscopic sequence. The decoder exploits the side information to achieve the best reconstruction of the correlated video streams. In particular, a syndrome coder approach based on a lifted Tree Structured Haar wavelet scheme has been adopted. The experimental results show the effectiveness of the proposed scheme.

  20. Classification of prosthetic heart valve sounds. A parametric approach

    SciTech Connect

    Candy, J.V.; Jones, H.E.

    1995-06-01

    People with heart problems have had their lives extended considerably with the development of the prosthetic heart valve. Great strides have been made in the development of the valves through the use of improved materials as well as efficient mechanical designs. However, since the valves operate continuously over a long period, structural failures can occur, even though they are relatively uncommon. Here the development of techniques to classify the valve either as having intact struts or as having a separated strut, commonly called single leg separation, is discussed. In this paper the signal processing techniques employed to extract the required signals/parameters are briefly reviewed and then it is shown how they can be used to simulate a synthetic heart valve database for eventual Monte Carlo testing. Next, the optimal classifier is developed under assumed conditions and its pe