Science.gov

Sample records for classification tree approach

  1. The decision tree approach to classification

    NASA Technical Reports Server (NTRS)

    Wu, C.; Landgrebe, D. A.; Swain, P. H.

    1975-01-01

    A class of multistage decision tree classifiers is proposed and studied relative to the classification of multispectral remotely sensed data. The decision tree classifiers are shown to have the potential for improving both the classification accuracy and the computation efficiency. Dimensionality in pattern recognition is discussed and two theorems on the lower bound of logic computation for multiclass classification are derived. The automatic or optimization approach is emphasized. Experimental results on real data are reported, which clearly demonstrate the usefulness of decision tree classifiers.

  2. Learning classification trees

    NASA Technical Reports Server (NTRS)

    Buntine, Wray

    1991-01-01

    Algorithms for learning classification trees have had successes in artificial intelligence and statistics over many years. How a tree learning algorithm can be derived from Bayesian decision theory is outlined. This introduces Bayesian techniques for splitting, smoothing, and tree averaging. The splitting rule turns out to be similar to Quinlan's information gain splitting rule, while smoothing and averaging replace pruning. Comparative experiments with reimplementations of a minimum encoding approach, Quinlan's C4, and Breiman et al.'s CART show that the full Bayesian algorithm is consistently as good as, or more accurate than, these other approaches, though at a computational price.
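
    As a rough illustration of the information gain criterion mentioned in this abstract (not the Bayesian splitting rule the paper derives), the sketch below scores a candidate split by the entropy it removes. The function names and the toy labels are assumptions made for this example.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(labels, groups):
    """Entropy of `labels` minus the size-weighted entropy of the `groups` it is split into."""
    total = len(labels)
    remainder = sum(len(g) / total * entropy(g) for g in groups)
    return entropy(labels) - remainder


# Toy labels (hypothetical): one split separates the classes, the other does not.
parent = ["a", "a", "b", "b"]
perfect_split = [["a", "a"], ["b", "b"]]
useless_split = [["a", "b"], ["a", "b"]]

print(information_gain(parent, perfect_split))
print(information_gain(parent, useless_split))
```

    A split that separates the two classes completely recovers the full parent entropy, while a split that leaves each branch as mixed as the parent gains nothing.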

  3. Decision tree approach for classification of remotely sensed satellite data using open source support

    NASA Astrophysics Data System (ADS)

    Sharma, Richa; Ghosh, Aniruddha; Joshi, P. K.

    2013-10-01

    In this study, an attempt has been made to develop a decision tree classification (DTC) algorithm for classification of remotely sensed satellite data (Landsat TM) using open source support. The decision tree is constructed by recursively partitioning the spectral distribution of the training dataset using WEKA, an open source data mining software package. The classified image is compared with the images classified using the classical ISODATA clustering and Maximum Likelihood Classifier (MLC) algorithms. The classification result based on the DTC method provided a better visual depiction than the results produced by ISODATA clustering or by MLC. The overall accuracy was found to be 90% (kappa = 0.88) using the DTC, 76.67% (kappa = 0.72) using the Maximum Likelihood Classifier, and 57.5% (kappa = 0.49) using the ISODATA clustering method. Based on the overall accuracy and kappa statistics, DTC was found to be preferable to the other classification approaches.
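
    The two figures of merit quoted in this abstract, overall accuracy and the kappa statistic, can both be computed from a confusion matrix. The sketch below is a minimal version of the standard Cohen's kappa formula; the function name and the 2-class matrix are made up for illustration and are not data from the study.

```python
def overall_accuracy_and_kappa(matrix):
    """matrix[i][j] = number of samples of reference class i assigned to class j."""
    total = sum(sum(row) for row in matrix)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical confusion matrix, not taken from the paper.
example = [[45, 5], [5, 45]]
accuracy, kappa = overall_accuracy_and_kappa(example)
print(accuracy, kappa)
```

    Kappa discounts the agreement that would be expected by chance, which is why it is lower than the overall accuracy in each of the reported comparisons.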

  4. Tree Classification Software

    NASA Technical Reports Server (NTRS)

    Buntine, Wray

    1993-01-01

    This paper introduces the IND Tree Package to prospective users. IND does supervised learning using classification trees. This learning task is a basic tool used in the development of diagnosis, monitoring and expert systems. The IND Tree Package was developed as part of a NASA project to semi-automate the development of data analysis and modelling algorithms using artificial intelligence techniques. The IND Tree Package integrates features from CART and C4 with newer Bayesian and minimum encoding methods for growing classification trees and graphs. The IND Tree Package also provides an experimental control suite on top. The newer features give improved probability estimates often required in diagnostic and screening tasks. The package comes with a manual, Unix 'man' entries, and a guide to tree methods and research. The IND Tree Package is implemented in C under Unix and was beta-tested at university and commercial research laboratories in the United States.

  5. Event-based prediction of stream turbidity using a combined cluster analysis and classification tree approach

    NASA Astrophysics Data System (ADS)

    Mather, Amanda L.; Johnson, Richard L.

    2015-11-01

    Stream turbidity typically increases during streamflow events; however, similar event hydrographs can produce markedly different event turbidity behaviors because many factors influence turbidity in addition to streamflow, including antecedent moisture conditions, season, and supply of turbidity-causing materials. Modeling of sub-hourly turbidity as a function of streamflow shows that event model parameters vary on an event-by-event basis. Here we examine the extent to which stream turbidity can be predicted through the prediction of event model parameters. Using three mid-sized streams from the Mid-Atlantic region of the U.S., we show the model parameter set for each event can be predicted based on the event characteristics (e.g., hydrologic, meteorologic and antecedent moisture conditions) using a combined cluster analysis and classification tree approach. The results suggest that the ratio of beginning event discharge to peak event discharge (an estimate of the event baseflow index), as well as catchment antecedent moisture, are important factors in the prediction of event turbidity. Indicators of antecedent moisture, particularly those derived from antecedent discharge, account for the majority of the splitting nodes in the classification trees for all three streams. For this study, prediction of turbidity during streamflow events is based upon observed data (e.g., measured streamflow, precipitation and air temperature). However, the results also suggest that the methods presented here can, in future work, be used in conjunction with forecasts of streamflow, precipitation and air temperature to forecast stream turbidity.

  6. Predictive Classification Trees

    NASA Astrophysics Data System (ADS)

    Dlugosz, Stephan; Müller-Funk, Ulrich

    CART (Breiman et al., Classification and Regression Trees, Chapman and Hall, New York, 1984) and (exhaustive) CHAID (Kass, Appl Stat 29:119-127, 1980) figure prominently among the procedures actually used in data based management, etc. CART is a well-established procedure that produces binary trees. CHAID, in contrast, admits multiple splittings, a feature that allows the splitting variable to be exploited more extensively. On the other hand, that procedure depends on premises that are questionable in practical applications. This can be put down to the fact that CHAID relies on simultaneous chi-square or F-tests. The null distribution of the second test statistic, for instance, relies on the normality assumption, which is not plausible in a data mining context. Moreover, none of these procedures - as implemented in SPSS, for instance - takes ordinal dependent variables into account. In the paper we suggest an alternative tree-algorithm that: Requires explanatory categorical variables

  7. Snow event classification with a 2D video disdrometer - A decision tree approach

    NASA Astrophysics Data System (ADS)

    Bernauer, F.; Hürkamp, K.; Rühm, W.; Tschiersch, J.

    2016-05-01

    Snowfall classification according to crystal type or degree of riming of the snowflakes is important for many atmospheric processes, e.g. wet deposition of aerosol particles. 2D video disdrometers (2DVD) have recently proved their capability to measure microphysical parameters of snowfall. The present work has the aim of classifying snowfall according to microphysical properties of single hydrometeors (e.g. shape and fall velocity) measured by means of a 2DVD. The constraints for the shape and velocity parameters that are used in a decision tree for classification of the 2DVD measurements are derived from detailed on-site observations, combining automatic 2DVD classification with visual inspection. The developed decision tree algorithm subdivides the detected events into three classes of dominating crystal type (single crystals, complex crystals and pellets) and three classes of dominating degree of riming (weak, moderate and strong). The classification results for the crystal type were validated with an independent data set proving the unambiguousness of the classification. In addition, for three long-term events, good agreement of the classification results with independently measured maximum dimension of snowflakes, snowflake bulk density and surrounding temperature was found. The developed classification algorithm is applicable for wind speeds below 5.0 m s⁻¹ and has the advantage of being easily implemented by other users.

  8. Chronic subdural hematoma: Surgical management and outcome in 986 cases: A classification and regression tree approach

    PubMed Central

    Rovlias, Aristedis; Theodoropoulos, Spyridon; Papoutsakis, Dimitrios

    2015-01-01

    Background: Chronic subdural hematoma (CSDH) is one of the most common clinical entities in daily neurosurgical practice which carries a most favorable prognosis. However, because of the advanced age and medical problems of patients, surgical therapy is frequently associated with various complications. This study evaluated the clinical features, radiological findings, and neurological outcome in a large series of patients with CSDH. Methods: A classification and regression tree (CART) technique was employed in the analysis of data from 986 patients who were operated on at Asclepeion General Hospital of Athens from January 1986 to December 2011. Burr hole evacuation with closed system drainage has been the operative technique of first choice at our institution for 29 consecutive years. A total of 27 prognostic factors were examined to predict the outcome at 3 months postoperatively. Results: Our results indicated that neurological status on admission was the best predictor of outcome. With regard to the other data, age, brain atrophy, thickness and density of hematoma, subdural accumulation of air, and antiplatelet and anticoagulant therapy were found to correlate significantly with prognosis. The overall cross-validated predictive accuracy of the CART model was 85.34%, with a cross-validated relative error of 0.326. Conclusions: Methodologically, the CART technique is quite different from the more commonly used methods, with the primary benefit of illustrating the important prognostic variables as related to outcome. Since the ideal therapy for the treatment of CSDH is still under debate, this technique may prove useful in developing new therapeutic strategies and approaches for patients with CSDH. PMID:26257985

  9. Quantification of chemical peptide reactivity for screening contact allergens: a classification tree model approach.

    PubMed

    Gerberick, G Frank; Vassallo, Jeffrey D; Foertsch, Leslie M; Price, Brad B; Chaney, Joel G; Lepoittevin, Jean-Pierre

    2007-06-01

    In the interest of reducing animal use, in vitro alternatives for skin sensitization testing are under development. One unifying characteristic of chemical allergens is the requirement that they react with proteins for the effective induction of skin sensitization. The majority of chemical allergens are electrophilic and react with nucleophilic amino acids. To determine whether and to what extent reactivity correlates with skin sensitization potential, 82 chemicals comprising allergens of different potencies and nonallergenic chemicals were evaluated for their ability to react with reduced glutathione (GSH) or with two synthetic peptides containing either a single cysteine or lysine. Following a 15-min reaction time with GSH, or a 24-h reaction time with the two synthetic peptides, the samples were analyzed by high-performance liquid chromatography. UV detection was used to monitor the depletion of GSH or the peptides. The peptide reactivity data were compared with existing local lymph node assay data using recursive partitioning methodology to build a classification tree that allowed a ranking of reactivity as minimal, low, moderate, and high. Generally, nonallergens and weak allergens demonstrated minimal to low peptide reactivity, whereas moderate to extremely potent allergens displayed moderate to high peptide reactivity. Classifying minimal reactivity as nonsensitizers and low, moderate, and high reactivity as sensitizers, it was determined that a model based on cysteine and lysine gave a prediction accuracy of 89%. The results of these investigations reveal that measurement of peptide reactivity has considerable potential utility as a screening approach for skin sensitization testing, and thereby for reducing reliance on animal-based test methods. PMID:17400584

  10. Applying an Ensemble Classification Tree Approach to the Prediction of Completion of a 12-Step Facilitation Intervention with Stimulant Abusers

    PubMed Central

    Doyle, Suzanne R.; Donovan, Dennis M.

    2014-01-01

    Aims: The purpose of this study was to explore the selection of predictor variables in the evaluation of drug treatment completion using an ensemble approach with classification trees. The basic methodology is reviewed and the subagging procedure of random subsampling is applied. Methods: Among 234 individuals with stimulant use disorders randomized to a 12-Step facilitative intervention shown to increase stimulant use abstinence, 67.52% were classified as treatment completers. A total of 122 baseline variables were used to identify factors associated with completion. Findings: The number of types of self-help activity involvement prior to treatment was the predominant predictor. Other effective predictors included better coping self-efficacy for substance use in high-risk situations, more days of prior meeting attendance, greater acceptance of the Disease model, higher confidence for not resuming use following discharge, lower ASI Drug and Alcohol composite scores, negative urine screens for cocaine or marijuana, and fewer employment problems. Conclusions: The application of an ensemble subsampling regression tree method utilizes the fact that classification trees are unstable but, on average, produce an improved prediction of the completion of drug abuse treatment. The results support the notion that there are early indicators of treatment completion that may allow for modification of approaches more tailored to fitting the needs of individuals and potentially provide more successful treatment engagement and improved outcomes. PMID:25134038
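
    The subagging idea named in this abstract (fit many trees on random subsamples drawn without replacement, then aggregate their votes) can be sketched as follows. This is only a toy under stated assumptions: a one-split "stump" stands in for a full classification tree, the single numeric predictor and the labels are invented, and the names `fit_stump` and `subagging` are hypothetical rather than the authors' procedure.

```python
import random
from collections import Counter


def fit_stump(samples):
    """Fit a one-split tree on (x, label) pairs by picking the threshold with fewest errors."""
    best = None
    for threshold in sorted({x for x, _ in samples}):
        left = [y for x, y in samples if x <= threshold]
        right = [y for x, y in samples if x > threshold]
        left_label = Counter(left).most_common(1)[0][0]
        # If nothing falls to the right, reuse the left label for that branch.
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = sum(y != left_label for y in left) + sum(y != right_label for y in right)
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    _, threshold, left_label, right_label = best
    return lambda x: left_label if x <= threshold else right_label


def subagging(samples, n_trees=25, fraction=0.5, seed=0):
    """Train stumps on subsamples drawn without replacement and predict by majority vote."""
    rng = random.Random(seed)
    size = max(1, int(len(samples) * fraction))
    trees = [fit_stump(rng.sample(samples, size)) for _ in range(n_trees)]

    def predict(x):
        votes = Counter(tree(x) for tree in trees)
        return votes.most_common(1)[0][0]

    return predict


# Invented data: one predictor, with larger values labelled as completers.
data = [(x, "completer" if x >= 5 else "non-completer") for x in range(10)]
model = subagging(data)
print(model(0), model(9))
```

    Each individual stump depends heavily on which half of the data it happened to see, which is the instability the abstract refers to; the vote over many subsamples smooths that out.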

  11. Applying an ensemble classification tree approach to the prediction of completion of a 12-step facilitation intervention with stimulant abusers.

    PubMed

    Doyle, Suzanne R; Donovan, Dennis M

    2014-12-01

    The purpose of this study was to explore the selection of predictor variables in the evaluation of drug treatment completion using an ensemble approach with classification trees. The basic methodology is reviewed, and the subagging procedure of random subsampling is applied. Among 234 individuals with stimulant use disorders randomized to a 12-step facilitative intervention shown to increase stimulant use abstinence, 67.52% were classified as treatment completers. A total of 122 baseline variables were used to identify factors associated with completion. The number of types of self-help activity involvement prior to treatment was the predominant predictor. Other effective predictors included better coping self-efficacy for substance use in high-risk situations, more days of prior meeting attendance, greater acceptance of the Disease model, higher confidence for not resuming use following discharge, lower Addiction Severity Index (ASI) Drug and Alcohol composite scores, negative urine screens for cocaine or marijuana, and fewer employment problems. The application of an ensemble subsampling regression tree method utilizes the fact that classification trees are unstable but, on average, produce an improved prediction of the completion of drug abuse treatment. The results support the notion that there are early indicators of treatment completion that may allow for modification of approaches more tailored to fitting the needs of individuals and potentially provide more successful treatment engagement and improved outcomes. PMID:25134038

  12. DIF Trees: Using Classification Trees to Detect Differential Item Functioning

    ERIC Educational Resources Information Center

    Vaughn, Brandon K.; Wang, Qiu

    2010-01-01

    A nonparametric tree classification procedure is used to detect differential item functioning for items that are dichotomously scored. Classification trees are shown to be an alternative procedure to detect differential item functioning other than the use of traditional Mantel-Haenszel and logistic regression analysis. A nonparametric…

  13. Mapping trees outside forests using high-resolution aerial imagery: a comparison of pixel- and object-based classification approaches.

    PubMed

    Meneguzzo, Dacia M; Liknes, Greg C; Nelson, Mark D

    2013-08-01

    Discrete trees and small groups of trees in nonforest settings are considered an essential resource around the world and are collectively referred to as trees outside forests (ToF). ToF provide important functions across the landscape, such as protecting soil and water resources, providing wildlife habitat, and improving farmstead energy efficiency and aesthetics. Despite the significance of ToF, forest and other natural resource inventory programs and geospatial land cover datasets that are available at a national scale do not include comprehensive information regarding ToF in the United States. Additional ground-based data collection and acquisition of specialized imagery to inventory these resources are expensive alternatives. As a potential solution, we identified two remote sensing-based approaches that use free high-resolution aerial imagery from the National Agriculture Imagery Program (NAIP) to map all tree cover in an agriculturally dominant landscape. We compared the results obtained using an unsupervised per-pixel classifier (independent component analysis [ICA]) and an object-based image analysis (OBIA) procedure in Steele County, Minnesota, USA. Three types of accuracy assessments were used to evaluate how each method performed in terms of: (1) producing a county-level estimate of total tree-covered area, (2) correctly locating tree cover on the ground, and (3) how tree cover patch metrics computed from the classified outputs compared to those delineated by a human photo interpreter. Both approaches were found to be viable for mapping tree cover over a broad spatial extent and could serve to supplement ground-based inventory data. The ICA approach produced an estimate of total tree cover more similar to the photo-interpreted result, but the output from the OBIA method was more realistic in terms of describing the actual observed spatial pattern of tree cover. PMID:23255169

  14. Phylogenetic classification and the universal tree.

    PubMed

    Doolittle, W F

    1999-06-25

    From comparative analyses of the nucleotide sequences of genes encoding ribosomal RNAs and several proteins, molecular phylogeneticists have constructed a "universal tree of life," taking it as the basis for a "natural" hierarchical classification of all living things. Although confidence in some of the tree's early branches has recently been shaken, new approaches could still resolve many methodological uncertainties. More challenging is evidence that most archaeal and bacterial genomes (and the inferred ancestral eukaryotic nuclear genome) contain genes from multiple sources. If "chimerism" or "lateral gene transfer" cannot be dismissed as trivial in extent or limited to special categories of genes, then no hierarchical universal classification can be taken as natural. Molecular phylogeneticists will have failed to find the "true tree," not because their methods are inadequate or because they have chosen the wrong genes, but because the history of life cannot properly be represented as a tree. However, taxonomies based on molecular sequences will remain indispensable, and understanding of the evolutionary process will ultimately be enriched, not impoverished. PMID:10381871

  15. Predicting 'very poor' beach water quality gradings using classification tree.

    PubMed

    Thoe, Wai; Choi, King Wah; Lee, Joseph Hun-wei

    2016-02-01

    A beach water quality prediction system has been developed in Hong Kong using multiple linear regression (MLR) models. However, linear models are found to be weak at capturing the infrequent 'very poor' water quality occasions when Escherichia coli (E. coli) concentration exceeds 610 counts/100 mL. This study uses a classification tree to increase the accuracy in predicting the 'very poor' water quality events at three Hong Kong beaches affected either by non-point source or point source pollution. Binary-output classification trees (to predict whether E. coli concentration exceeds 610 counts/100 mL) are developed over the periods before and after the implementation of the Harbour Area Treatment Scheme, when systematic changes in water quality were observed. Results show that classification trees can capture more 'very poor' events in both periods when compared to the corresponding linear models, with an increase in correct positives by an average of 20%. Classification trees are also developed at two beaches to predict the four-category Beach Water Quality Indices. They perform worse than the binary tree and give excessive false alarms of 'very poor' events. Finally, a combined modelling approach using both MLR model and classification tree is proposed to enhance the beach water quality prediction system for Hong Kong. PMID:26837834

  16. Classification based on full decision trees

    NASA Astrophysics Data System (ADS)

    Genrikhov, I. E.; Djukova, E. V.

    2012-04-01

    The ideas underlying a series of the authors' studies dealing with the design of classification algorithms based on full decision trees are further developed. It is shown that the decision tree construction under consideration takes into account all the features satisfying a branching criterion. Full decision trees with an entropy branching criterion are studied as applied to precedent-based pattern recognition problems with real-valued data. Recognition procedures are constructed for solving problems with incomplete data (gaps in the feature descriptions of the objects) in the case when the learning objects are nonuniformly distributed over the classes. The authors' basic results previously obtained in this area are overviewed.

  17. Fast Image Texture Classification Using Decision Trees

    NASA Technical Reports Server (NTRS)

    Thompson, David R.

    2011-01-01

    Texture analysis would permit improved autonomous, onboard science data interpretation for adaptive navigation, sampling, and downlink decisions. These analyses would assist with terrain analysis and instrument placement in both macroscopic and microscopic image data products. Unfortunately, most state-of-the-art texture analysis demands computationally expensive convolutions of filters involving many floating-point operations. This makes them infeasible for radiation-hardened computers and spaceflight hardware. A new method approximates traditional texture classification of each image pixel with a fast decision-tree classifier. The classifier uses image features derived from simple filtering operations involving integer arithmetic. The texture analysis method is therefore amenable to implementation on FPGA (field-programmable gate array) hardware. Image features based on the "integral image" transform produce descriptive and efficient texture descriptors. Training the decision tree on a set of training data yields a classification scheme that produces reasonable approximations of optimal "texton" analysis at a fraction of the computational cost. A decision-tree learning algorithm employing the traditional k-means criterion of inter-cluster variance is used to learn tree structure from training data. The result is an efficient and accurate summary of surface morphology in images. This work is an evolutionary advance that unites several previous algorithms (k-means clustering, integral images, decision trees) and applies them to a new problem domain (morphology analysis for autonomous science during remote exploration). Advantages include order-of-magnitude improvements in runtime, feasibility for FPGA hardware, and significant improvements in texture classification accuracy.
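
    The "integral image" transform mentioned in this abstract is what makes integer-only box filters cheap: once the running sums are stored, the sum over any rectangle costs four lookups. The sketch below shows the general technique only; the function names and the 3x3 test image are assumptions for illustration, not the features used in the report.

```python
def integral_image(image):
    """Return table S where S[r][c] is the sum of image rows < r and columns < c."""
    rows, cols = len(image), len(image[0])
    # One extra row and column of zeros simplifies the box-sum lookups.
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0
        for c in range(cols):
            row_sum += image[r][c]
            table[r + 1][c + 1] = table[r][c + 1] + row_sum
    return table


def box_sum(table, top, left, bottom, right):
    """Sum of pixels in rows top..bottom-1 and columns left..right-1 using four lookups."""
    return table[bottom][right] - table[top][right] - table[bottom][left] + table[top][left]


# Small integer test image (hypothetical values).
image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
table = integral_image(image)
print(box_sum(table, 0, 0, 3, 3))
print(box_sum(table, 1, 1, 3, 3))
```

    Because only additions and subtractions of integers are involved, a feature of this kind avoids the floating-point convolutions the abstract describes as infeasible for spaceflight hardware.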

  18. Seasonal Effect on Tree Species Classification in an Urban Environment Using Hyperspectral Data, LiDAR, and an Object-Oriented Approach

    PubMed Central

    Voss, Matthew; Sugumaran, Ramanathan

    2008-01-01

    The objective of the current study was to analyze the seasonal effect on differentiating tree species in an urban environment using multi-temporal hyperspectral data, Light Detection And Ranging (LiDAR) data, and a tree species database collected from the field. Two Airborne Imaging Spectrometer for Applications (AISA) hyperspectral images were collected, covering the Summer and Fall seasons. In order to make both datasets spatially and spectrally compatible, several preprocessing steps, including band reduction and a spatial degradation, were performed. An object-oriented classification was performed on both images using training data collected randomly from the tree species database. The seven dominant tree species (Gleditsia triacanthos, Acer saccharum, Tilia americana, Quercus palustris, Pinus strobus and Picea glauca) were used in the classification. The results from this analysis did not show any major difference in overall accuracy between the two seasons. Overall accuracy was approximately 57% for the Summer dataset and 56% for the Fall dataset. However, the Fall dataset provided more consistent results for all tree species while the Summer dataset had a few higher individual class accuracies. Further, adding LiDAR into the classification improved the results by 19% for both Fall and Summer. This is mainly due to the removal of the shadow effect and the addition of elevation data to separate low and high vegetation.

  19. Voxel classification based airway tree segmentation

    NASA Astrophysics Data System (ADS)

    Lo, Pechin; de Bruijne, Marleen

    2008-03-01

    This paper presents a voxel classification based method for segmenting the human airway tree in volumetric computed tomography (CT) images. In contrast to standard methods that use only voxel intensities, our method uses a more complex appearance model based on a set of local image appearance features and Kth nearest neighbor (KNN) classification. The optimal set of features for classification is selected automatically from a large set of features describing the local image structure at several scales. The use of multiple features enables the appearance model to differentiate between airway tree voxels and other voxels of similar intensities in the lung, thus making the segmentation robust to pathologies such as emphysema. The classifier is trained on imperfect segmentations that can easily be obtained using region growing with a manual threshold selection. Experiments show that the proposed method results in a more robust segmentation that can grow into the smaller airway branches without leaking into emphysematous areas, and is able to segment many branches that are not present in the training set.

  20. Semi-supervised SVM for individual tree crown species classification

    NASA Astrophysics Data System (ADS)

    Dalponte, Michele; Ene, Liviu Theodor; Marconcini, Mattia; Gobakken, Terje; Næsset, Erik

    2015-12-01

    In this paper a novel semi-supervised SVM classifier is presented, specifically developed for tree species classification at individual tree crown (ITC) level. In ITC tree species classification, all the pixels belonging to an ITC should have the same label. This assumption is used in the learning of the proposed semi-supervised SVM classifier (ITC-S3VM). This method exploits the information contained in the unlabeled ITC samples in order to improve the classification accuracy of a standard SVM. The ITC-S3VM method can be easily implemented using freely available software libraries. The datasets used in this study include hyperspectral imagery and laser scanning data acquired over two boreal forest areas characterized by the presence of three information classes (Pine, Spruce, and Broadleaves). The experimental results quantify the effectiveness of the proposed approach, which provides classification accuracies significantly higher (from 2% to above 27%) than those obtained by the standard supervised SVM and by a state-of-the-art semi-supervised SVM (S3VM). Particularly, by reducing the number of training samples (i.e. from 100% to 25%, and from 100% to 5% for the two datasets, respectively) the proposed method still exhibits results comparable to the ones of a supervised SVM trained with the full available training set. This property of the method makes it particularly suitable for practical forest inventory applications in which collection of in situ information can be very expensive both in terms of cost and time.

  1. Prediction of healthy blood with data mining classification by using Decision Tree, Naive Baysian and SVM approaches

    NASA Astrophysics Data System (ADS)

    Khalilinezhad, Mahdieh; Minaei, Behrooz; Vernazza, Gianni; Dellepiane, Silvana

    2015-03-01

    Data mining (DM) is the process of discovering knowledge from large databases. Applications of data mining in Blood Transfusion Organizations could be useful for improving the performance of the blood donation service. The aim of this research is the prediction of the healthiness of blood donors in a Blood Transfusion Organization (BTO). For this goal, three well-known algorithms (Decision Tree C4.5, the Naïve Bayesian classifier, and Support Vector Machine) have been chosen and applied to a real database of 11006 donors. Seven fields (sex, age, job, education, marital status, type of donor, and results of blood tests, i.e. doctors' comments and lab results about healthy or unhealthy blood donors) have been selected as input to these algorithms. The results of the three algorithms have been compared and an error cost analysis has been performed. According to this research and the obtained results, the best algorithm, with low error cost and high accuracy, is SVM. This research helps the BTO build a model of blood donors in each area in order to predict whether donors' blood is healthy or unhealthy. This research could be useful if used in parallel with laboratory tests to better separate unhealthy blood.

  2. Consensus of classification trees for skin sensitisation hazard prediction.

    PubMed

    Asturiol, D; Casati, S; Worth, A

    2016-10-01

    Since March 2013, it is no longer possible to market in the European Union (EU) cosmetics containing new ingredients tested on animals. Although several in silico alternatives are available and achievements have been made in the development and regulatory adoption of skin sensitisation non-animal tests, there is not yet a generally accepted approach for skin sensitisation assessment that would fully substitute the need for animal testing. The aim of this work was to build a defined approach (i.e. a predictive model based on readouts from various information sources that uses a fixed procedure for generating a prediction) for skin sensitisation hazard prediction (sensitiser/non-sensitiser) using Local Lymph Node Assay (LLNA) results as reference classifications. To derive the model, we built a dataset with high quality data from in chemico (DPRA) and in vitro (KeratinoSens™ and h-CLAT) methods, and it was complemented with predictions from several software packages. The modelling exercise showed that skin sensitisation hazard was better predicted by classification trees based on in silico predictions. The defined approach consists of a consensus of two classification trees that are based on descriptors that account for protein reactivity and structural features. The model showed an accuracy of 0.93, sensitivity of 0.98, and specificity of 0.85 for 269 chemicals. In addition, the defined approach provides a measure of confidence associated with the prediction. PMID:27458072
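
    Accuracy, sensitivity, and specificity as reported in this abstract are all derived from the same four counts of a binary comparison against reference classifications. The sketch below assumes the sensitiser class is treated as the positive class; the function name and the eight toy labels are invented for illustration and are not the study's data.

```python
def binary_metrics(reference, predicted, positive="sensitiser"):
    """Accuracy, sensitivity and specificity of predictions against reference labels."""
    pairs = list(zip(reference, predicted))
    tp = sum(r == positive and p == positive for r, p in pairs)  # true positives
    tn = sum(r != positive and p != positive for r, p in pairs)  # true negatives
    fp = sum(r != positive and p == positive for r, p in pairs)  # false positives
    fn = sum(r == positive and p != positive for r, p in pairs)  # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Hypothetical labels: four sensitisers and four non-sensitisers.
reference = ["sensitiser"] * 4 + ["non-sensitiser"] * 4
predicted = ["sensitiser"] * 4 + ["non-sensitiser"] * 2 + ["sensitiser"] * 2
metrics = binary_metrics(reference, predicted)
print(metrics)
```

    In this toy case every sensitiser is caught but half of the non-sensitisers are flagged, the same direction of trade-off as the reported sensitivity exceeding the reported specificity.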

  3. Evaluating multimedia chemical persistence: Classification and regression tree analysis

    SciTech Connect

    Bennett, D.H.; McKone, T.E.; Kastenberg, W.E.

    2000-04-01

    For the thousands of chemicals continuously released into the environment, it is desirable to make prospective assessments of those likely to be persistent. Widely distributed persistent chemicals are impossible to remove from the environment and remediation by natural processes may take decades, which is problematic if adverse health or ecological effects are discovered after prolonged release into the environment. A tiered approach using a classification scheme and a multimedia model for determining persistence is presented. Using specific criteria for persistence, a classification tree is developed to classify a chemical as persistent or nonpersistent based on the chemical properties. In this approach, the classification is derived from the results of a standardized unit world multimedia model. Thus, the classifications are more robust for multimedia pollutants than classifications using a single medium half-life. The method can be readily implemented and provides insight without requiring extensive and often unavailable data. This method can be used to classify chemicals when only a few properties are known and can be used to direct further data collection. Case studies are presented to demonstrate the advantages of the approach.

  4. Tree Classification with Fused Mobile Laser Scanning and Hyperspectral Data

    PubMed Central

    Puttonen, Eetu; Jaakkola, Anttoni; Litkey, Paula; Hyyppä, Juha

    2011-01-01

    Mobile Laser Scanning data were collected simultaneously with hyperspectral data using the Finnish Geodetic Institute Sensei system. The data were tested for tree species classification. The test area was an urban garden in the City of Espoo, Finland. Point clouds representing 168 individual tree specimens of 23 tree species were determined manually. The classification of the trees was done using first only the spatial data from point clouds, then with only the spectral data obtained with a spectrometer, and finally with the combined spatial and hyperspectral data from both sensors. Two classification tests were performed: the separation of coniferous and deciduous trees, and the identification of individual tree species. All determined tree specimens were used in distinguishing coniferous and deciduous trees. A subset of 133 trees and 10 tree species was used in the tree species classification. The best classification results for the fused data were 95.8% for the separation of the coniferous and deciduous classes. The best overall tree species classification succeeded with 83.5% accuracy for the best tested fused data feature combination. The respective results for paired structural features derived from the laser point cloud were 90.5% for the separation of the coniferous and deciduous classes and 65.4% for the species classification. Classification accuracies with paired hyperspectral reflectance value data were 90.5% for the separation of coniferous and deciduous classes and 62.4% for different species. The results are among the first of their kind and they show that mobile collected fused data outperformed single-sensor data in both classification tests and by a significant margin. PMID:22163894

  5. Sensitivity of missing values in classification tree for large sample

    NASA Astrophysics Data System (ADS)

    Hasan, Norsida; Adam, Mohd Bakri; Mustapha, Norwati; Abu Bakar, Mohd Rizam

    2012-05-01

    Missing values, whether in predictor or response variables, are a very common problem in statistics and data mining. Cases with missing values are often ignored, which results in loss of information and possible bias. The objective of our research was to investigate the sensitivity of a classification tree model to missing data for large samples. Data were obtained from one of the high-level educational institutions in Malaysia. Students' background data were randomly eliminated and a classification tree was used to predict students' degree classification. The results showed that, for large samples, the structure of the classification tree was sensitive to missing values, especially for samples containing more than ten percent missing values.

  6. A Mixtures-of-Trees Framework for Multi-Label Classification

    PubMed Central

    Hong, Charmgil; Batal, Iyad; Hauskrecht, Milos

    2015-01-01

    We propose a new probabilistic approach for multi-label classification that aims to represent the class posterior distribution P(Y|X). Our approach uses a mixture of tree-structured Bayesian networks, which can leverage the computational advantages of conditional tree-structured models and the abilities of mixtures to compensate for tree-structured restrictions. We develop algorithms for learning the model from data and for performing multi-label predictions using the learned model. Experiments on multiple datasets demonstrate that our approach outperforms several state-of-the-art multi-label classification methods. PMID:25927011

  7. [Automatic classification method of star spectrum data based on classification pattern tree].

    PubMed

    Zhao, Xu-Jun; Cai, Jiang-Hui; Zhang, Ji-Fu; Yang, Hai-Feng; Ma, Yang

    2013-10-01

    Frequent patterns, which appear frequently in a data set, play an important role in data mining. For stellar spectrum classification tasks, a classification rule mining method based on a classification pattern tree is presented on the basis of frequent patterns. The procedure is as follows. First, a new tree structure, i.e., the classification pattern tree, is introduced based on the different frequencies of stellar spectral attributes in the database and their different importance for classification. The related concepts and the construction method of the classification pattern tree are also described in this paper. Then, the characteristics of the stellar spectrum are mapped to the classification pattern tree. Two traversal modes, top-down and bottom-up, are used to traverse the classification pattern tree and extract the classification rules. Meanwhile, the concept of pattern capability is introduced to adjust the number of classification rules and improve the construction efficiency of the classification pattern tree. Finally, the SDSS (Sloan Digital Sky Survey) stellar spectral data provided by the National Astronomical Observatory are used to verify the accuracy of the method. The results show that a higher classification accuracy was achieved. PMID:24409754

  8. Watershed Merge Tree Classification for Electron Microscopy Image Segmentation

    SciTech Connect

    Liu, Ting; Jurrus, Elizabeth R.; Seyedhosseini, Mojtaba; Ellisman, Mark; Tasdizen, Tolga

    2012-11-11

    Automated segmentation of electron microscopy (EM) images is a challenging problem. In this paper, we present a novel method that utilizes a hierarchical structure and boundary classification for 2D neuron segmentation. With a membrane detection probability map, a watershed merge tree is built for the representation of hierarchical region merging from the watershed algorithm. A boundary classifier is learned with non-local image features to predict each potential merge in the tree, upon which merge decisions are made with consistency constraints in the sense of optimization to acquire the final segmentation. Independent of classifiers and decision strategies, our approach proposes a general framework for efficient hierarchical segmentation with statistical learning. We demonstrate that our method leads to a substantial improvement in segmentation accuracy.

  9. Classification of Liss IV Imagery Using Decision Tree Methods

    NASA Astrophysics Data System (ADS)

    Verma, Amit Kumar; Garg, P. K.; Prasad, K. S. Hari; Dadhwal, V. K.

    2016-06-01

    Image classification is a compulsory step in any remote sensing research. Classification uses the spectral information represented by the digital numbers in one or more spectral bands and attempts to classify each individual pixel based on this spectral information. Crop classification is a main concern of remote sensing applications for developing sustainable agriculture systems. Vegetation indices computed from satellite images give a good indication of the presence of vegetation; they describe the greenness, density and health of vegetation. Texture is also an important characteristic, which is used to identify objects or regions of interest in an image. This paper illustrates the use of the decision tree method to classify land into crop land and non-crop land and to classify different crops. We evaluate the possibility of crop classification using an integrated approach based on texture properties combined with different vegetation indices for single-date LISS IV sensor data at 5.8 meter spatial resolution. In the first approach, eleven vegetation indices (NDVI, DVI, GEMI, GNDVI, MSAVI2, NDWI, NG, NR, NNIR, OSAVI and VI green) were generated using the green, red and NIR bands, and the image was then classified using the decision tree method. The second approach integrates texture features (mean, variance, kurtosis and skewness) with these vegetation indices. The two methods are compared. The results indicate that including textural features with vegetation indices can produce classified maps with 8.33% higher accuracy for Indian satellite IRS-P6 LISS IV sensor images.
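
    For readers unfamiliar with the indices named in this record, the sketch below computes three of them (NDVI, GNDVI and DVI) from per-pixel band reflectances using their commonly cited definitions, followed by a single threshold rule of the kind a decision tree node might apply. The threshold of 0.3 and the function `is_vegetated` are hypothetical and are not taken from the paper.

```python
def ndvi(nir, red):
    """Normalised Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    denom = nir + red
    return (nir - red) / denom if denom else 0.0


def gndvi(nir, green):
    """Green NDVI: (NIR - Green) / (NIR + Green)."""
    denom = nir + green
    return (nir - green) / denom if denom else 0.0


def dvi(nir, red):
    """Difference Vegetation Index: NIR - Red."""
    return nir - red


def is_vegetated(nir, red, threshold=0.3):
    """Hypothetical single-node rule: call a pixel vegetated if NDVI exceeds
    an assumed threshold. A real tree would learn its thresholds from data."""
    return ndvi(nir, red) > threshold


# Example pixel reflectances (illustrative values only).
crop_pixel = is_vegetated(nir=0.5, red=0.1)
bare_pixel = is_vegetated(nir=0.2, red=0.18)
```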

  10. Classification Based on Tree-Structured Allocation Rules

    ERIC Educational Resources Information Center

    Vaughn, Brandon K.; Wang, Qui

    2008-01-01

    The authors consider the problem of classifying an unknown observation into 1 of several populations by using tree-structured allocation rules. Although many parametric classification procedures are robust to certain assumption violations, there is need for classification procedures that can be used regardless of the group-conditional…

  11. Urban Tree Classification Using Full-Waveform Airborne Laser Scanning

    NASA Astrophysics Data System (ADS)

    Koma, Zs.; Koenig, K.; Höfle, B.

    2016-06-01

    Vegetation mapping in urban environments plays an important role in biological research and urban management. Airborne laser scanning provides detailed 3D geodata, which allows single trees to be classified into different taxa. Until now, research dealing with tree classification focused on forest environments. This study investigates the object-based classification of urban trees at taxonomic family level, using full-waveform airborne laser scanning data captured in the city centre of Vienna (Austria). The data set is characterised by a variety of taxa, including deciduous trees (beeches, mallows, plane trees and soapberries) and the coniferous pine species. A workflow for tree object classification is presented using geometric and radiometric features. The derived features are related to point density, crown shape and radiometric characteristics. For the derivation of crown features, a prior detection of the crown base is performed. The effects of interfering objects (e.g. fences and cars which are typical in urban areas) on the feature characteristics and the subsequent classification accuracy are investigated. The applicability of the features is evaluated by Random Forest classification and exploratory analysis. The most reliable classification is achieved by using the combination of geometric and radiometric features, resulting in 87.5% overall accuracy. By using radiometric features only, a reliable classification with accuracy of 86.3% can be achieved. The influence of interfering objects on feature characteristics is identified, in particular for the radiometric features. The results indicate the potential of using radiometric features in urban tree classification and show its limitations due to anthropogenic influences at the same time.

  12. Decision tree methods: applications for classification and prediction.

    PubMed

    Song, Yan-Yan; Lu, Ying

    2015-04-25

    Decision tree methodology is a commonly used data mining method for establishing classification systems based on multiple covariates or for developing prediction algorithms for a target variable. This method classifies a population into branch-like segments that construct an inverted tree with a root node, internal nodes, and leaf nodes. The algorithm is non-parametric and can efficiently deal with large, complicated datasets without imposing a complicated parametric structure. When the sample size is large enough, study data can be divided into training and validation datasets: the training dataset is used to build a decision tree model and the validation dataset to decide on the appropriate tree size needed to achieve the optimal final model. This paper introduces frequently used algorithms for developing decision trees (including CART, C4.5, CHAID, and QUEST) and describes the SPSS and SAS programs that can be used to visualize tree structure. PMID:26120265
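
    To make the node-splitting step concrete, here is a minimal sketch of how a CART-style algorithm might choose one split on a single numeric covariate by minimising weighted Gini impurity. It is a simplified illustration, not the procedure of any particular package named above; the function names `gini` and `best_split` are assumptions.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for a pure or empty node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, weighted_gini) of the best binary split.

    Candidate thresholds are midpoints between consecutive distinct sorted
    values. If no split improves on the parent node, threshold is None.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))
    for i in range(1, n):
        low, high = pairs[i - 1][0], pairs[i][0]
        if low == high:
            continue  # no threshold can separate identical values
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = ((low + high) / 2, score)
    return best


# Toy data: the two classes are perfectly separable on this covariate.
threshold, impurity = best_split([1, 2, 3, 10, 11, 12],
                                 ["a", "a", "a", "b", "b", "b"])
```

    Applying the same search recursively to each child node grows the tree; a held-out validation set can then be used to decide how deep to let it grow.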

  13. Automatic Classification of Trees from Laser Scanning Point Clouds

    NASA Astrophysics Data System (ADS)

    Sirmacek, B.; Lindenbergh, R.

    2015-08-01

    Development of laser scanning technologies has promoted tree monitoring studies to a new level, as laser scanning point clouds enable accurate 3D measurements in a fast and environmentally friendly manner. In this paper, we introduce a probability matrix computation based algorithm for automatically classifying laser scanning point clouds into 'tree' and 'non-tree' classes. Our method uses the 3D coordinates of the laser scanning points as input and generates a new point cloud which holds a label for each point indicating whether it belongs to the 'tree' or 'non-tree' class. To do so, a grid surface is assigned to the lowest height level of the point cloud. The grid cells are filled with probability values which are calculated by checking the point density above each cell. Since the tree trunk locations appear with very high values in the probability matrix, selecting the local maxima of the grid surface helps to detect the tree trunks. Further points are assigned to tree trunks if they appear in the close proximity of trunks. Since heavy mathematical computations (such as point cloud organization, detailed 3D shape detection methods, graph network generation) are not required, the proposed algorithm works very fast compared to existing methods. The tree classification results are found reliable even on point clouds of cities containing many different objects. As the most significant weakness, false detection of light poles, traffic signs and other objects close to trees cannot be prevented. Nevertheless, the experimental results on mobile and airborne laser scanning point clouds indicate the possible usage of the algorithm as an important step for tree growth observation, tree counting and similar applications. While the laser scanning point cloud gives the opportunity to classify even very small trees, the accuracy of the results is reduced in low point density areas further away from the scanning location. These advantages and disadvantages of two laser scanning point
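
    The grid-and-local-maxima idea described in this record can be sketched roughly as follows. This is only an illustration under stated assumptions: the cell size, the normalisation of counts into probabilities, the 8-neighbour maximum test and the names `density_grid` and `local_maxima` are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def density_grid(points, cell=1.0):
    """Map each ground cell (i, j) to the fraction of all points above it.

    points: iterable of (x, y, z) tuples. Heights are ignored here because
    only the count of points stacked above each cell is needed.
    """
    points = list(points)
    counts = defaultdict(int)
    for x, y, _z in points:
        counts[(int(x // cell), int(y // cell))] += 1
    total = len(points)
    return {c: n / total for c, n in counts.items()} if total else {}


def local_maxima(grid, min_prob=0.0):
    """Return cells whose value is strictly greater than all 8 neighbours
    and at least min_prob; these are candidate trunk locations."""
    peaks = []
    for (i, j), p in grid.items():
        if p < min_prob:
            continue
        neighbours = [grid.get((i + di, j + dj), 0.0)
                      for di in (-1, 0, 1) for dj in (-1, 0, 1)
                      if (di, dj) != (0, 0)]
        if all(p > q for q in neighbours):
            peaks.append((i, j))
    return sorted(peaks)


# Toy cloud: two dense vertical columns plus a few stray points.
cloud = ([(0.2, 0.3, z) for z in range(5)] + [(1.5, 0.5, 0.1)]
         + [(5.5, 5.5, z) for z in range(4)] + [(5.5, 6.5, 0.2)])
trunks = local_maxima(density_grid(cloud), min_prob=0.2)
```

    As the abstract notes, any tall thin object (a light pole, a traffic sign) would produce the same kind of peak, so a sketch like this cannot distinguish such objects from trunks.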

  14. Ultraviolet stellar spectral classification using a multilevel tree neural network

    NASA Astrophysics Data System (ADS)

    Gulati, R. K.; Gupta, R.; Gothoskar, P.; Khobragade, S.

    Here we present a pattern classification technique based on an Artificial Neural Network (ANN) in a multi-level tree configuration to classify ultraviolet stellar spectra from the IUE Low-Dispersion Spectra Reference Atlas. Preliminary results of this technique show that 94% of the spectra have been classified correctly with an accuracy of one sub-class. A conventional $\overline{\chi}^2$ minimization scheme has also been applied to the data to compare the classification obtained from these schemes with that of the IUE catalog classification.

  15. Using Classification Trees to Predict Alumni Giving for Higher Education

    ERIC Educational Resources Information Center

    Weerts, David J.; Ronca, Justin M.

    2009-01-01

    As the relative level of public support for higher education declines, colleges and universities aim to maximize alumni-giving to keep their programs competitive. Anchored in a utility maximization framework, this study employs the classification and regression tree methodology to examine characteristics of alumni donors and non-donors at a…

  16. Growth in Mathematics Achievement: Analysis with Classification and Regression Trees

    ERIC Educational Resources Information Center

    Ma, Xin

    2005-01-01

    A recently developed statistical technique, often referred to as classification and regression trees (CART), holds great potential for researchers to discover how student-level (and school-level) characteristics interactively affect growth in mathematics achievement. CART is a host of advanced statistical methods that statistically cluster…

  17. A Section-based Method For Tree Species Classification Using Airborne LiDAR Discrete Points In Urban Areas

    NASA Astrophysics Data System (ADS)

    Chunjing, Y. C.; Hui, T.; Zhongjie, R.; Guikai, B.

    2015-12-01

    As a new approach to forest inventory, LiDAR remote sensing has become an important research topic. LiDAR research initially concentrated on mapping forests at the tree level and identifying important structural parameters, such as tree height, crown size, crown base height, individual tree species, and stem volume. For virtual city visualization and mapping, however, traditional tree classification methods cannot cope with the more complex conditions. Recently, advanced LiDAR technology has produced new full-waveform scanners that provide a higher point density and additional information about the reflecting characteristics of trees. Subsequently, it was demonstrated that it is feasible to detect individual overstorey trees in forests and classify species. However, important steps such as the calibration and decomposition of full-waveform data with a series of Gaussian functions usually take a lot of work. Moreover, vegetation detection and classification results rely heavily on the outcomes of prior steps. For these reasons, this paper proposes a section-based method for tree species classification using small-footprint, high-sampling-density LiDAR data, which can overcome tree species classification issues in urban areas. More specific objectives are to: (1) use local maximum height decision and four-direction section certification methods to get the precise locations of the trees; (2) develop new LiDAR-derived feature processing techniques for characterizing the section structure of individual tree crowns; (3) investigate several techniques for filtering and analyzing vertical profiles of individual trees to classify the trees, using expert decision skills based on percentile analysis; (4) assess the accuracy of estimating tree species for each tree; and (5) investigate which type of LiDAR data, point frequency or intensity, provides the most accurate estimate of tree species

  18. Combining QuickBird, LiDAR, and GIS topography indices to identify a single native tree species in a complex landscape using an object-based classification approach

    NASA Astrophysics Data System (ADS)

    Pham, Lien T. H.; Brabyn, Lars; Ashraf, Salman

    2016-08-01

    There is now a wide range of techniques that can be combined for image analysis. These include the use of object-based classifications rather than pixel-based classifiers, the use of LiDAR to determine vegetation height and vertical structure, as well as terrain variables such as topographic wetness index and slope that can be calculated using GIS. This research investigates the benefits of combining these techniques to identify individual tree species. A QuickBird image and low point density LiDAR data for a coastal region in New Zealand were used to examine the possibility of mapping Pohutukawa trees, which are regarded as an iconic tree in New Zealand. The study area included a mix of buildings and vegetation types. After image and LiDAR preparation, single tree objects were identified using a range of techniques including: a threshold of above ground height to eliminate ground based objects; Normalised Difference Vegetation Index and elevation difference between the first and last return of LiDAR data to distinguish vegetation from buildings; geometric information to separate clusters of trees from single trees; and treetop identification and region growing techniques to separate tree clusters into single tree crowns. Important feature variables were identified using Random Forest, and the Support Vector Machine provided the classification. The combined techniques using LiDAR and spectral data produced an overall accuracy of 85.4% (Kappa 80.6%). Classification using just the spectral data produced an overall accuracy of 75.8% (Kappa 67.8%). The research findings demonstrate how combining LiDAR and spectral data improves classification for Pohutukawa trees.

  19. Multiple Spectral-Spatial Classification Approach for Hyperspectral Data

    NASA Technical Reports Server (NTRS)

    Tarabalka, Yuliya; Benediktsson, Jon Atli; Chanussot, Jocelyn; Tilton, James C.

    2010-01-01

    A new multiple classifier approach for spectral-spatial classification of hyperspectral images is proposed. Several classifiers are used independently to classify an image. For every pixel, if all the classifiers have assigned this pixel to the same class, the pixel is kept as a marker, i.e., a seed of the spatial region, with the corresponding class label. We propose to use spectral-spatial classifiers at the preliminary step of the marker selection procedure, each of them combining the results of a pixel-wise classification and a segmentation map. Different segmentation methods based on dissimilar principles lead to different classification results. Furthermore, a minimum spanning forest is built, where each tree is rooted on a classification-driven marker and forms a region in the spectral-spatial classification map. Experimental results are presented for two hyperspectral airborne images. The proposed method significantly improves classification accuracies, when compared to previously proposed classification techniques.
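
    The marker selection rule stated in this record (keep a pixel as a marker only when every classifier assigns it the same class) can be sketched in a few lines. The representation of classification maps as nested lists and the name `select_markers` are assumptions for illustration; the subsequent minimum spanning forest step is not shown.

```python
def select_markers(label_maps):
    """Return a 2D marker map from several equally sized 2D label maps.

    A pixel keeps its class label if all classifiers agree on it;
    otherwise it is set to None (unlabelled, to be filled in later).
    """
    first, others = label_maps[0], label_maps[1:]
    markers = []
    for r, row in enumerate(first):
        marker_row = []
        for c, label in enumerate(row):
            agreed = all(other[r][c] == label for other in others)
            marker_row.append(label if agreed else None)
        markers.append(marker_row)
    return markers


# Three toy 2x2 classification maps (class labels are arbitrary integers).
map_a = [[1, 1], [2, 2]]
map_b = [[1, 2], [2, 2]]
map_c = [[1, 1], [2, 3]]
markers = select_markers([map_a, map_b, map_c])
```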

  20. A novel transferable individual tree crown delineation model based on Fishing Net Dragging and boundary classification

    NASA Astrophysics Data System (ADS)

    Liu, Tao; Im, Jungho; Quackenbush, Lindi J.

    2015-12-01

    This study provides a novel approach to individual tree crown delineation (ITCD) using airborne Light Detection and Ranging (LiDAR) data in dense natural forests using two main steps: crown boundary refinement based on a proposed Fishing Net Dragging (FiND) method, and segment merging based on boundary classification. FiND starts with approximate tree crown boundaries derived using a traditional watershed method with Gaussian filtering and refines these boundaries using an algorithm that mimics how a fisherman drags a fishing net. Random forest machine learning is then used to classify boundary segments into two classes: boundaries between trees and boundaries between branches that belong to a single tree. Three groups of LiDAR-derived features (two from the pseudo waveform generated along with crown boundaries and one from a canopy height model, CHM) were used in the classification. The proposed ITCD approach was tested using LiDAR data collected over a mountainous region in the Adirondack Park, NY, USA. Overall accuracy of boundary classification was 82.4%. Features derived from the CHM were generally more important in the classification than the features extracted from the pseudo waveform. A comprehensive accuracy assessment scheme for ITCD was also introduced by considering both area of crown overlap and crown centroids. Accuracy assessment using this new scheme shows the proposed ITCD achieved 74% and 78% as overall accuracy, respectively, for deciduous and mixed forest.

  1. An automated approach to the design of decision tree classifiers

    NASA Technical Reports Server (NTRS)

    Argentiero, P.; Chin, P.; Beaudet, P.

    1980-01-01

    The classification of high-dimensional data sets arising from the merging of remote sensing data with more traditional forms of ancillary data is considered. Decision tree classification, a popular approach to the problem, is characterized by the property that samples are subjected to a sequence of decision rules before they are assigned to a unique class. An automated technique for effective decision tree design which relies only on a priori statistics is presented. This procedure utilizes a set of two dimensional canonical transforms and Bayes table look-up decision rules. An optimal design at each node is derived based on the associated decision table. A procedure for computing the global probability of correct classification is also provided. An example is given in which class statistics obtained from an actual LANDSAT scene are used as input to the program. The resulting decision tree design has an associated probability of correct classification of 0.76 compared to the theoretically optimum 0.79 probability of correct classification associated with a full dimensional Bayes classifier. Recommendations for future research are included.

  2. A Dynamic Classification Approach for Nursing

    PubMed Central

    Hardiker, Nicholas R.; Kim, Tae Youn; Coenen, Amy M.; Jansen, Kay R.

    2011-01-01

    Nursing has a long tradition of classification, stretching back at least 150 years. The introduction of computers into health care towards the end of the 20th Century helped to focus efforts, culminating in the development of a range of standardized classifications. Many of these classifications are still in use today and, while content is periodically updated, the underlying classification structures remain relatively static. In this paper an approach to classification that is relatively new to nursing is presented; an approach that uses formal Web Ontology Language definitions for classes, and computer-based reasoning on those classes, to determine automatically classification structures that more flexibly meet the needs of users. A new proposed classification structure for the International Classification for Nursing Practice is derived under the new approach to provide a new view on the next release of the classification and to contribute to broader quality improvement processes. PMID:22195109

  3. Flood-type classification in mountainous catchments using crisp and fuzzy decision trees

    NASA Astrophysics Data System (ADS)

    Sikorska, Anna E.; Viviroli, Daniel; Seibert, Jan

    2015-10-01

    Floods are governed by widely varying processes and thus exhibit various behaviors. Classification of flood events into flood types and the determination of their respective frequency is therefore important for a better understanding and prediction of floods. This study presents a flood classification for identifying flood patterns at a catchment scale by means of a fuzzy decision tree. Hence, events are represented as a spectrum of six main possible flood types that are attributed with their degree of acceptance. Considered types are flash, short rainfall, long rainfall, snow-melt, rainfall on snow and, in high alpine catchments, glacier-melt floods. The fuzzy decision tree also makes it possible to acknowledge the uncertainty present in the identification of flood processes and thus allows for more reliable flood class estimates than using a crisp decision tree, which identifies one flood type per event. Based on the data set from nine Swiss mountainous catchments, it was demonstrated that this approach is less sensitive to uncertainties in the classification attributes than the classical crisp approach. These results show that the fuzzy approach bears additional potential for analyses of flood patterns at a catchment scale and thereby provides a more realistic representation of flood processes.

  4. Data mining in psychological treatment research: a primer on classification and regression trees.

    PubMed

    King, Matthew W; Resick, Patricia A

    2014-10-01

    Data mining of treatment study results can reveal unforeseen but critical insights, such as who receives the most benefit from treatment and under what circumstances. The usefulness and legitimacy of exploratory data analysis have received relatively little recognition, however, and analytic methods well suited to the task are not widely known in psychology. With roots in computer science and statistics, statistical learning approaches offer a credible option: These methods take a more inductive approach to building a model than is done in traditional regression, allowing the data a greater role in suggesting the correct relationships between variables rather than imposing them a priori. Classification and regression trees are presented as a powerful, flexible exemplar of statistical learning methods. Trees allow researchers to efficiently identify useful predictors of an outcome and discover interactions between predictors without the need to anticipate and specify these in advance, making them ideal for revealing patterns that inform hypotheses about treatment effects. Trees can also provide a predictive model for forecasting outcomes as an aid to clinical decision making. This primer describes how tree models are constructed, how the results are interpreted and evaluated, and how trees overcome some of the complexities of traditional regression. Examples are drawn from randomized clinical trial data and highlight some interpretations of particular interest to treatment researchers. The limitations of tree models are discussed, and suggestions for further reading and choices in software are offered. PMID:24588404

  5. Classification of dopamine, serotonin, and dual antagonists by decision trees.

    PubMed

    Kim, Hye-Jung; Choo, Hyunah; Cho, Yong Seo; Koh, Hun Yeong; No, Kyoung Tai; Pae, Ae Nim

    2006-04-15

    Dopamine antagonists (DA), serotonin antagonists (SA), and serotonin-dopamine dual antagonists (Dual) are being used as antipsychotics. Many dopamine and serotonin antagonists show non-selective binding affinity for these two receptors because the antagonists share common structural features originating from conserved residues in the binding site of the aminergic receptor family. Therefore, classification of dopamine and serotonin antagonists by their target receptors can be useful in designing selective antagonists for individual therapy of psychotic disorders. A data set containing 1135 dopamine antagonists (D2, D3, and D4), 1251 serotonin antagonists (5-HT1A, 5-HT2A, and 5-HT2C), and 386 serotonin-dopamine dual antagonists was collected from the MDDR database. Cerius2 descriptors were employed to develop a classification model for the 2772 compounds with antipsychotic activity. LDA (linear discriminant analysis), SIMCA (soft independent modeling of class analogy), RP (recursive partitioning), and ANN (artificial neural network) algorithms successfully classified the active class of each compound with an average accuracy of 73.6% and predicted it with an average accuracy of 69.8%. The decision trees from RP, the best model, were generated to identify and interpret those descriptors that discriminate the active classes more easily. These classification models could be used as a virtual screening tool to predict the active class of new candidates. PMID:16387502

  6. Superiority of Classification Tree versus Cluster, Fuzzy and Discriminant Models in a Heartbeat Classification System

    PubMed Central

    Krasteva, Vessela; Jekova, Irena; Leber, Remo; Schmid, Ramun; Abächerli, Roger

    2015-01-01

    This study presents a 2-stage heartbeat classifier of supraventricular (SVB) and ventricular (VB) beats. Stage 1 makes computationally-efficient classification of SVB-beats, using simple correlation threshold criterion for finding close match with a predominant normal (reference) beat template. The non-matched beats are next subjected to measurement of 20 basic features, tracking the beat and reference template morphology and RR-variability for subsequent refined classification in SVB or VB-class by Stage 2. Four linear classifiers are compared: cluster, fuzzy, linear discriminant analysis (LDA) and classification tree (CT), all subjected to iterative training for selection of the optimal feature space among extended 210-sized set, embodying interactive second-order effects between 20 independent features. The optimization process minimizes at equal weight the false positives in SVB-class and false negatives in VB-class. The training with European ST-T, AHA, MIT-BIH Supraventricular Arrhythmia databases found the best performance settings of all classification models: Cluster (30 features), Fuzzy (72 features), LDA (142 coefficients), CT (221 decision nodes) with top-3 best scored features: normalized current RR-interval, higher/lower frequency content ratio, beat-to-template correlation. Unbiased test-validation with MIT-BIH Arrhythmia database rates the classifiers in descending order of their specificity for SVB-class: CT (99.9%), LDA (99.6%), Cluster (99.5%), Fuzzy (99.4%); sensitivity for ventricular ectopic beats as part from VB-class (commonly reported in published beat-classification studies): CT (96.7%), Fuzzy (94.4%), LDA (94.2%), Cluster (92.4%); positive predictivity: CT (99.2%), Cluster (93.6%), LDA (93.0%), Fuzzy (92.4%). CT has superior accuracy by 0.3–6.8% points, with the advantage for easy model complexity configuration by pruning the tree consisted of easy interpretable ‘if-then’ rules. PMID:26461492
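
    The Stage 1 idea described above (accept a beat as matching the reference when its correlation with the template exceeds a threshold) can be sketched as follows. This is only an illustration: the function names, the sample waveforms, and the threshold of 0.95 are assumptions and are not taken from the study.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sample sequences."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("sequences must have equal length >= 2")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat signal carries no shape information
    return cov / math.sqrt(var_a * var_b)

def stage1_matches_template(beat, template, threshold=0.95):
    """True if the beat correlates closely with the reference template.

    threshold=0.95 is an illustrative value, not the one used in the study.
    """
    return pearson(beat, template) >= threshold

# Hypothetical waveforms: a reference template, a scaled copy of it,
# and a beat with a clearly different (wider) shape.
template = [0.0, 0.1, 1.0, 0.2, 0.0, -0.1]
similar_beat = [0.0, 0.2, 2.0, 0.4, 0.0, -0.2]
wide_beat = [0.0, 0.6, 0.8, 0.9, 0.7, 0.1]

print(stage1_matches_template(similar_beat, template))
print(stage1_matches_template(wide_beat, template))
```

    In such a scheme only the beats that fail the cheap correlation test would be passed on to feature measurement and the Stage 2 classifier.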

  7. Superiority of Classification Tree versus Cluster, Fuzzy and Discriminant Models in a Heartbeat Classification System.

    PubMed

    Krasteva, Vessela; Jekova, Irena; Leber, Remo; Schmid, Ramun; Abächerli, Roger

    2015-01-01

    This study presents a 2-stage heartbeat classifier of supraventricular (SVB) and ventricular (VB) beats. Stage 1 makes computationally-efficient classification of SVB-beats, using simple correlation threshold criterion for finding close match with a predominant normal (reference) beat template. The non-matched beats are next subjected to measurement of 20 basic features, tracking the beat and reference template morphology and RR-variability for subsequent refined classification in SVB or VB-class by Stage 2. Four linear classifiers are compared: cluster, fuzzy, linear discriminant analysis (LDA) and classification tree (CT), all subjected to iterative training for selection of the optimal feature space among extended 210-sized set, embodying interactive second-order effects between 20 independent features. The optimization process minimizes at equal weight the false positives in SVB-class and false negatives in VB-class. The training with European ST-T, AHA, MIT-BIH Supraventricular Arrhythmia databases found the best performance settings of all classification models: Cluster (30 features), Fuzzy (72 features), LDA (142 coefficients), CT (221 decision nodes) with top-3 best scored features: normalized current RR-interval, higher/lower frequency content ratio, beat-to-template correlation. Unbiased test-validation with MIT-BIH Arrhythmia database rates the classifiers in descending order of their specificity for SVB-class: CT (99.9%), LDA (99.6%), Cluster (99.5%), Fuzzy (99.4%); sensitivity for ventricular ectopic beats as part from VB-class (commonly reported in published beat-classification studies): CT (96.7%), Fuzzy (94.4%), LDA (94.2%), Cluster (92.4%); positive predictivity: CT (99.2%), Cluster (93.6%), LDA (93.0%), Fuzzy (92.4%). CT has superior accuracy by 0.3-6.8% points, with the advantage for easy model complexity configuration by pruning the tree consisted of easy interpretable 'if-then' rules. PMID:26461492

  8. Decision Tree Classifier for Classification of Plant and Animal Micro RNA's

    NASA Astrophysics Data System (ADS)

    Pant, Bhasker; Pant, Kumud; Pardasani, K. R.

    Gene expression is regulated by microRNAs (miRNAs), which are 21-23 nucleotides in length. They are non-coding RNAs that control gene expression either by translational repression or by mRNA degradation. Both plants and animals contain miRNAs, which have been classified by wet lab techniques. These techniques are highly expensive, labour intensive and time consuming, so faster and more economical computational approaches are needed. In view of the above, a machine learning model has been developed for classification of plant and animal miRNAs using a decision tree classifier. The model has been tested on available data and gives results with 91% accuracy.

  9. Support-vector-machine tree-based domain knowledge learning toward automated sports video classification

    NASA Astrophysics Data System (ADS)

    Xiao, Guoqiang; Jiang, Yang; Song, Gang; Jiang, Jianmin

    2010-12-01

    We propose a support-vector-machine (SVM) tree to hierarchically learn from domain knowledge represented by low-level features toward automatic classification of sports videos. The proposed SVM tree adopts a binary tree structure to exploit the nature of SVM's binary classification, where each internal node is a single SVM learning unit, and each external node represents the classified output type. Such a SVM tree presents a number of advantages, which include: 1. low computing cost; 2. integrated learning and classification while preserving individual SVM's learning strength; and 3. flexibility in both structure and learning modules, where different numbers of nodes and features can be added to address specific learning requirements, and various learning models can be added as individual nodes, such as neural networks, AdaBoost, hidden Markov models, dynamic Bayesian networks, etc. Experiments support that the proposed SVM tree achieves good performances in sports video classifications.
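
    The binary tree structure described above, with one binary learner per internal node and one output type per external node, can be sketched as a small data structure. The sketch below is an assumption-laden illustration: the linear rules stand in for trained SVMs, and the feature layout and class labels are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Union

Features = Sequence[float]

@dataclass
class Leaf:
    label: str  # classified output type (external node)

@dataclass
class Internal:
    decide: Callable[[Features], bool]  # stand-in for one trained binary SVM
    if_true: "Node"
    if_false: "Node"

Node = Union[Leaf, Internal]

def classify(node, x):
    """Walk from the root to a leaf, asking one binary learner per level."""
    while isinstance(node, Internal):
        node = node.if_true if node.decide(x) else node.if_false
    return node.label

def linear_rule(weights, bias):
    """Hypothetical linear decision function used in place of a real SVM."""
    return lambda x: sum(w * v for w, v in zip(weights, x)) + bias > 0

# Hypothetical two-feature tree with three output classes.
tree = Internal(
    decide=linear_rule([1.0, 0.0], -0.5),
    if_true=Internal(
        decide=linear_rule([0.0, 1.0], -0.5),
        if_true=Leaf("sport_A"),
        if_false=Leaf("sport_B"),
    ),
    if_false=Leaf("sport_C"),
)

print(classify(tree, [0.9, 0.8]))
```

    Because each node only needs a callable returning a boolean, other learners could be dropped into individual nodes, which mirrors the flexibility the abstract claims.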

  10. Graduates employment classification using data mining approach

    NASA Astrophysics Data System (ADS)

    Aziz, Mohd Tajul Rizal Ab; Yusof, Yuhanis

    2016-08-01

    Data mining is a means of extracting hidden knowledge from a collection of data. This study investigates a suitable classification model for graduate employment at one of the MARA Professional Colleges (KPM) in Malaysia. The aim is to classify graduates as employed, unemployed, or pursuing further study. Five data mining algorithms offered in WEKA were used: Naïve Bayes, Logistic regression, Multilayer perceptron, k-nearest neighbor and Decision tree J48. The results show that Logistic regression produces the highest classification accuracy, at 92.5%. This result was obtained using 80% of the data for training and 20% for testing. The resulting classification model will benefit the management of the college, as it provides insight into the quality of the graduates it produces and into how the curriculum can be improved to cater to the needs of industry.

  11. Real-time classification of humans versus animals using profiling sensors and hidden Markov tree model

    NASA Astrophysics Data System (ADS)

    Hossen, Jakir; Jacobs, Eddie L.; Chari, Srikant

    2015-07-01

    Linear pyroelectric array sensors have enabled useful classifications of objects such as humans and animals to be performed with relatively low-cost hardware in border and perimeter security applications. Ongoing research has sought to improve the performance of these sensors through signal processing algorithms. In the research presented here, we introduce the use of hidden Markov tree (HMT) models for object recognition in images generated by linear pyroelectric sensors. HMTs are trained to statistically model the wavelet features of individual objects through an expectation-maximization learning process. Human versus animal classification for a test object is made by evaluating its wavelet features against the trained HMTs using the maximum-likelihood criterion. The classification performance of this approach is compared to two other techniques; a texture, shape, and spectral component features (TSSF) based classifier and a speeded-up robust feature (SURF) classifier. The evaluation indicates that among the three techniques, the wavelet-based HMT model works well, is robust, and has improved classification performance compared to a SURF-based algorithm in equivalent computation time. When compared to the TSSF-based classifier, the HMT model has a slightly degraded performance but almost an order of magnitude improvement in computation time enabling real-time implementation.

  12. Discriminative Hierarchical K-Means Tree for Large-Scale Image Classification.

    PubMed

    Chen, Shizhi; Yang, Xiaodong; Tian, Yingli

    2015-09-01

    A key challenge in large-scale image classification is how to achieve efficiency in terms of both computation and memory without compromising classification accuracy. The learning-based classifiers achieve the state-of-the-art accuracies, but have been criticized for the computational complexity that grows linearly with the number of classes. The nonparametric nearest neighbor (NN)-based classifiers naturally handle large numbers of categories, but incur prohibitively expensive computation and memory costs. In this brief, we present a novel classification scheme, i.e., discriminative hierarchical K-means tree (D-HKTree), which combines the advantages of both learning-based and NN-based classifiers. The complexity of the D-HKTree only grows sublinearly with the number of categories, which is much better than the recent hierarchical support vector machines-based methods. The memory requirement is the order of magnitude less than the recent Naïve Bayesian NN-based approaches. The proposed D-HKTree classification scheme is evaluated on several challenging benchmark databases and achieves the state-of-the-art accuracies, while with significantly lower computation cost and memory requirement. PMID:25420271

  13. Integration of Classification Tree Analyses and Spatial Metrics to Assess Changes in Supraglacial Lakes in the Karakoram Himalaya

    NASA Astrophysics Data System (ADS)

    Bulley, H. N.; Bishop, M. P.; Shroder, J. F.; Haritashya, U. K.

    2007-12-01

    Alpine glacier responses to climate change reveal increases in retreat with corresponding increases in production of glacier melt water and development of supraglacial lakes. The rate of occurrence and spatial extent of lakes in the Himalaya are difficult to determine because current spectral-based image analyses of glacier surfaces are limited by anisotropic reflectance and the lack of high quality digital elevation models. Additionally, the limitations of multivariate classification algorithms to adequately segregate glacier features in satellite imagery have led to an increased interest in non-parametric methods, such as classification and regression trees. Our objectives are to demonstrate the utility of a semi-automated approach that integrates classification-tree-based image segmentation and object-oriented analysis to differentiate supraglacial lakes from glacier debris, ice cliffs, lateral and medial moraines. The classification-tree process involves a binary, recursive, partitioning non-parametric method that can account for non-linear relationships. We used 2002 and 2004 ASTER VNIR and SWIR imagery to assess the Baltoro Glacier in the Karakoram Himalaya. Other input variables include the normalized difference water index (NDWI), ratio images, Moran's I image, and fractal dimension. The classification tree was used to generate initial image segments and it was particularly effective in differentiating glacier features. The object-oriented analysis included the use of shape and spatial metrics to refine the classification-tree output. Classification-tree results show that NDWI is the most important single variable for characterizing the glacier-surface features, followed by NIR/IR ratio, IR band, and IR/Red ratio variables. Lake features extracted from both images show there were 142 lakes in 2002 as compared to 188 lakes in 2004. In general, there was a significant increase in planimetric area from 2002 to 2004, and we documented the formation of 46 new

  14. Tree species classification in subtropical forests using small-footprint full-waveform LiDAR data

    NASA Astrophysics Data System (ADS)

    Cao, Lin; Coops, Nicholas C.; Innes, John L.; Dai, Jinsong; Ruan, Honghua; She, Guanghui

    2016-07-01

    The accurate classification of tree species is critical for the management of forest ecosystems, particularly subtropical forests, which are highly diverse and complex ecosystems. While airborne Light Detection and Ranging (LiDAR) technology offers significant potential to estimate forest structural attributes, the capacity of this new tool to classify species is less well known. In this research, full-waveform metrics were extracted by a voxel-based composite waveform approach and examined with a Random Forests classifier to discriminate six subtropical tree species (i.e., Masson pine (Pinus massoniana Lamb.), Chinese fir (Cunninghamia lanceolata (Lamb.) Hook.), Slash pines (Pinus elliottii Engelm.), Sawtooth oak (Quercus acutissima Carruth.) and Chinese holly (Ilex chinensis Sims.)) at three levels of discrimination. As part of the analysis, the optimal voxel size for modelling the composite waveforms was investigated, the most important predictor metrics for species classification assessed and the effect of scan angle on species discrimination examined. Results demonstrate that all tree species were classified with relatively high accuracy (68.6% for six classes, 75.8% for four main species and 86.2% for conifers and broadleaved trees). Full-waveform metrics (based on height of median energy, waveform distance and number of waveform peaks) demonstrated high classification importance and were stable among various voxel sizes. The results also suggest that the voxel-based approach can alleviate some of the issues associated with large scan angles. In summary, the results indicate that full-waveform LiDAR data have significant potential for tree species classification in the subtropical forests.

  15. Aneurysmal subarachnoid hemorrhage prognostic decision-making algorithm using classification and regression tree analysis

    PubMed Central

    Lo, Benjamin W. Y.; Fukuda, Hitoshi; Angle, Mark; Teitelbaum, Jeanne; Macdonald, R. Loch; Farrokhyar, Forough; Thabane, Lehana; Levine, Mitchell A. H.

    2016-01-01

    Background: Classification and regression tree analysis involves the creation of a decision tree by recursive partitioning of a dataset into more homogeneous subgroups. Thus far, there is scarce literature on using this technique to create clinical prediction tools for aneurysmal subarachnoid hemorrhage (SAH). Methods: The classification and regression tree analysis technique was applied to the multicenter Tirilazad database (3551 patients) in order to create the decision-making algorithm. In order to elucidate prognostic subgroups in aneurysmal SAH, neurologic, systemic, and demographic factors were taken into account. The dependent variable used for analysis was the dichotomized Glasgow Outcome Score at 3 months. Results: Classification and regression tree analysis revealed seven prognostic subgroups. Neurological grade, occurrence of post-admission stroke, occurrence of post-admission fever, and age represented the explanatory nodes of this decision tree. Split sample validation revealed classification accuracy of 79% for the training dataset and 77% for the testing dataset. In addition, the occurrence of fever at 1-week post-aneurysmal SAH is associated with increased odds of post-admission stroke (odds ratio: 1.83, 95% confidence interval: 1.56–2.45, P < 0.01). Conclusions: A clinically useful classification tree was generated, which serves as a prediction tool to guide bedside prognostication and clinical treatment decision making. This prognostic decision-making algorithm also shed light on the complex interactions between a number of risk factors in determining outcome after aneurysmal SAH. PMID:27512607
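
    The recursive partitioning into more homogeneous subgroups that the abstract describes rests on repeatedly choosing the split that most reduces impurity. A minimal sketch of one such step is given below, using the Gini index that CART commonly uses for classification; the ages and outcomes are made-up illustrative values, not data from the Tirilazad database, and the function names are assumptions.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 0 for a pure group, larger for mixed groups."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(values, labels):
    """Find the threshold on one numeric predictor that minimizes the
    size-weighted Gini impurity of the two resulting subgroups.

    Returns (threshold, weighted_impurity); threshold is None if no
    split improves on the unsplit group.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # cannot separate identical predictor values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = ((lo + hi) / 2, score)
    return best

# Hypothetical patients: age and dichotomized outcome.
ages = [34, 41, 48, 55, 63, 70, 77, 82]
outcomes = ["good", "good", "good", "good", "poor", "good", "poor", "poor"]

threshold, impurity = best_split(ages, outcomes)
print(threshold, impurity)
```

    A full tree would apply the same search over every predictor and then recurse into each subgroup until a stopping rule is met; pruning and validation are omitted here.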

  16. Biosensor Approach to Psychopathology Classification

    PubMed Central

    Koshelev, Misha; Lohrenz, Terry; Vannucci, Marina; Montague, P. Read

    2010-01-01

    We used a multi-round, two-party exchange game in which a healthy subject played a subject diagnosed with a DSM-IV (Diagnostic and Statistics Manual-IV) disorder, and applied a Bayesian clustering approach to the behavior exhibited by the healthy subject. The goal was to characterize quantitatively the style of play elicited in the healthy subject (the proposer) by their DSM-diagnosed partner (the responder). The approach exploits the dynamics of the behavior elicited in the healthy proposer as a biosensor for cognitive features that characterize the psychopathology group at the other side of the interaction. Using a large cohort of subjects (n = 574), we found statistically significant clustering of proposers' behavior overlapping with a range of DSM-IV disorders including autism spectrum disorder, borderline personality disorder, attention deficit hyperactivity disorder, and major depressive disorder. To further validate these results, we developed a computer agent to replace the human subject in the proposer role (the biosensor) and show that it can also detect these same four DSM-defined disorders. These results suggest that the highly developed social sensitivities that humans bring to a two-party social exchange can be exploited and automated to detect important psychopathologies, using an interpersonal behavioral probe not directly related to the defining diagnostic criteria. PMID:20975934

  17. An automated approach to the design of decision tree classifiers

    NASA Technical Reports Server (NTRS)

    Argentiero, P.; Chin, R.; Beaudet, P.

    1982-01-01

    An automated technique is presented for designing effective decision tree classifiers predicated only on a priori class statistics. The procedure relies on linear feature extractions and Bayes table look-up decision rules. Associated error matrices are computed and utilized to provide an optimal design of the decision tree at each so-called 'node'. A by-product of this procedure is a simple algorithm for computing the global probability of correct classification assuming the statistical independence of the decision rules. Attention is given to a more precise definition of decision tree classification, the mathematical details on the technique for automated decision tree design, and an example of a simple application of the procedure using class statistics acquired from an actual Landsat scene.

  18. Information theoretic approach for accounting classification

    NASA Astrophysics Data System (ADS)

    Ribeiro, E. M. S.; Prataviera, G. A.

    2014-12-01

    In this paper we consider an information theoretic approach for the accounting classification process. We propose a matrix formalism and an algorithm for calculations of information theoretic measures associated to accounting classification. The formalism may be useful for further generalizations and computer-based implementation. Information theoretic measures, mutual information and symmetric uncertainty, were evaluated for daily transactions recorded in the chart of accounts of a small company during two years. Variation in the information measures due the aggregation of data in the process of accounting classification is observed. In particular, the symmetric uncertainty seems to be a useful parameter for comparing companies over time or in different sectors or different accounting choices and standards.
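
    The two measures named above can be computed from paired categorical observations. The sketch below uses the standard definitions, I(X;Y) = H(X) + H(Y) - H(X,Y) and SU = 2 I(X;Y) / (H(X) + H(Y)), rather than the paper's matrix formalism, and the account names in the example are hypothetical.

```python
import math
from collections import Counter

def entropy(counts):
    """Shannon entropy in bits of a distribution given as raw counts."""
    counts = list(counts)
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def mutual_information_and_su(pairs):
    """Return (mutual information in bits, symmetric uncertainty)
    for a list of (x, y) categorical observations."""
    h_x = entropy(Counter(x for x, _ in pairs).values())
    h_y = entropy(Counter(y for _, y in pairs).values())
    h_xy = entropy(Counter(pairs).values())
    mi = h_x + h_y - h_xy
    su = 2 * mi / (h_x + h_y) if (h_x + h_y) > 0 else 0.0
    return mi, su

# Hypothetical transactions as (debited account, credited account) pairs.
dependent = [("cash", "sales"), ("cash", "sales"),
             ("inventory", "payables"), ("inventory", "payables")]
independent = [("cash", "sales"), ("cash", "payables"),
               ("inventory", "sales"), ("inventory", "payables")]

print(mutual_information_and_su(dependent))
print(mutual_information_and_su(independent))
```

    Symmetric uncertainty is normalized to the range 0 to 1, which is what makes it convenient for comparisons across data sets of different size or aggregation level.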

  19. A statistical approach to root system classification

    PubMed Central

    Bodner, Gernot; Leitner, Daniel; Nakhforoosh, Alireza; Sobotik, Monika; Moder, Karl; Kaul, Hans-Peter

    2013-01-01

    Plant root systems have a key role in ecology and agronomy. In spite of fast increase in root studies, still there is no classification that allows distinguishing among distinctive characteristics within the diversity of rooting strategies. Our hypothesis is that a multivariate approach for “plant functional type” identification in ecology can be applied to the classification of root systems. The classification method presented is based on a data-defined statistical procedure without a priori decision on the classifiers. The study demonstrates that principal component based rooting types provide efficient and meaningful multi-trait classifiers. The classification method is exemplified with simulated root architectures and morphological field data. Simulated root architectures showed that morphological attributes with spatial distribution parameters capture most distinctive features within root system diversity. While developmental type (tap vs. shoot-borne systems) is a strong, but coarse classifier, topological traits provide the most detailed differentiation among distinctive groups. Adequacy of commonly available morphologic traits for classification is supported by field data. Rooting types emerging from measured data, mainly distinguished by diameter/weight and density dominated types. Similarity of root systems within distinctive groups was the joint result of phylogenetic relation and environmental as well as human selection pressure. We concluded that the data-define classification is appropriate for integration of knowledge obtained with different root measurement methods and at various scales. Currently root morphology is the most promising basis for classification due to widely used common measurement protocols. To capture details of root diversity efforts in architectural measurement techniques are essential. PMID:23914200

  20. A statistical approach to root system classification.

    PubMed

    Bodner, Gernot; Leitner, Daniel; Nakhforoosh, Alireza; Sobotik, Monika; Moder, Karl; Kaul, Hans-Peter

    2013-01-01

    Plant root systems have a key role in ecology and agronomy. In spite of fast increase in root studies, still there is no classification that allows distinguishing among distinctive characteristics within the diversity of rooting strategies. Our hypothesis is that a multivariate approach for "plant functional type" identification in ecology can be applied to the classification of root systems. The classification method presented is based on a data-defined statistical procedure without a priori decision on the classifiers. The study demonstrates that principal component based rooting types provide efficient and meaningful multi-trait classifiers. The classification method is exemplified with simulated root architectures and morphological field data. Simulated root architectures showed that morphological attributes with spatial distribution parameters capture most distinctive features within root system diversity. While developmental type (tap vs. shoot-borne systems) is a strong, but coarse classifier, topological traits provide the most detailed differentiation among distinctive groups. Adequacy of commonly available morphologic traits for classification is supported by field data. Rooting types emerging from measured data, mainly distinguished by diameter/weight and density dominated types. Similarity of root systems within distinctive groups was the joint result of phylogenetic relation and environmental as well as human selection pressure. We concluded that the data-define classification is appropriate for integration of knowledge obtained with different root measurement methods and at various scales. Currently root morphology is the most promising basis for classification due to widely used common measurement protocols. To capture details of root diversity efforts in architectural measurement techniques are essential. PMID:23914200

  1. An evaluation of popular hyperspectral images classification approaches

    NASA Astrophysics Data System (ADS)

    Kuznetsov, Andrey; Myasnikov, Vladislav

    2015-12-01

    This work addresses the problem of selecting the best classification algorithm for hyperspectral images. The following algorithms are compared: decision tree using full cross-validation; C4.5 decision tree; Bayesian classifier; maximum-likelihood method; MSE minimization classifier, including a special case, classification by conjugation; spectral angle classifier (for empirical mean and nearest neighbor); spectral mismatch classifier; and support vector machine (SVM). AVIRIS and SpecTIR hyperspectral images are used in the experiments.

  2. A neural network approach to cloud classification

    NASA Technical Reports Server (NTRS)

    Lee, Jonathan; Weger, Ronald C.; Sengupta, Sailes K.; Welch, Ronald M.

    1990-01-01

    It is shown that, using high-spatial-resolution data, very high cloud classification accuracies can be obtained with a neural network approach. A texture-based neural network classifier using only single-channel visible Landsat MSS imagery achieves an overall cloud identification accuracy of 93 percent. Cirrus can be distinguished from boundary layer cloudiness with an accuracy of 96 percent, without the use of an infrared channel. Stratocumulus is retrieved with an accuracy of 92 percent, cumulus at 90 percent. The use of the neural network does not improve cirrus classification accuracy. Rather, its main effect is in the improved separation between stratocumulus and cumulus cloudiness. While most cloud classification algorithms rely on linear parametric schemes, the present study is based on a nonlinear, nonparametric four-layer neural network approach. A three-layer neural network architecture, the nonparametric K-nearest neighbor approach, and the linear stepwise discriminant analysis procedure are compared. A significant finding is that significantly higher accuracies are attained with the nonparametric approaches using only 20 percent of the database as training data, compared to 67 percent of the database in the linear approach.

  3. A modified decision tree algorithm based on genetic algorithm for mobile user classification problem.

    PubMed

    Liu, Dong-sheng; Fan, Shu-jiang

    2014-01-01

    To offer mobile customers better service, mobile users must first be classified. To address the limitations of previous classification methods, this paper puts forward a modified decision tree algorithm for mobile user classification, which introduces a genetic algorithm to optimize the results of the decision tree algorithm. We also take context information as a classification attribute for the mobile user, and we divide context into public and private classes. We then analyze the processes and operators of the algorithm. Finally, in an experiment applying the algorithm to mobile users, we classify them into Basic service user, E-service user, Plus service user, and Total service user classes and also derive some rules about mobile users. Compared with the C4.5 decision tree algorithm and the SVM algorithm, the proposed algorithm has higher accuracy and greater simplicity. PMID:24688389

  4. A Modified Decision Tree Algorithm Based on Genetic Algorithm for Mobile User Classification Problem

    PubMed Central

    Liu, Dong-sheng; Fan, Shu-jiang

    2014-01-01

    To offer mobile customers better service, mobile users must first be classified. To address the limitations of previous classification methods, this paper puts forward a modified decision tree algorithm for mobile user classification, which introduces a genetic algorithm to optimize the results of the decision tree algorithm. We also take context information as a classification attribute for the mobile user, and we divide context into public and private classes. We then analyze the processes and operators of the algorithm. Finally, in an experiment applying the algorithm to mobile users, we classify them into Basic service user, E-service user, Plus service user, and Total service user classes and also derive some rules about mobile users. Compared with the C4.5 decision tree algorithm and the SVM algorithm, the proposed algorithm has higher accuracy and greater simplicity. PMID:24688389

  5. Tree Species Classification By Multiseasonal High Resolution Satellite Data

    NASA Astrophysics Data System (ADS)

    Elatawneh, Alata; Wallner, Adelheid; Straub, Christoph; Schneider, Thomas; Knoke, Thomas

    2013-12-01

    Accurate forest tree species mapping is a fundamental issue for sustainable forest management and planning. Forest tree species mapping by means of remote sensing data is still a topic to be investigated. The Bavaria state institute of forestry is investigating the potential of using digital aerial images for forest management purposes. However, using aerial images is still cost- and time-consuming, in addition to their acquisition restrictions. The new spaceborne sensor generations, such as RapidEye, with a very high temporal resolution and offering multiseasonal data, have the potential to improve forest tree species mapping. In this study, we investigated the potential of multiseasonal RapidEye data for mapping tree species in a Mid European forest in Southern Germany. The RapidEye data of level A3 were collected on ten different dates in the years 2009, 2010 and 2011. For data analysis, a model was developed that combines the Spectral Angle Mapper technique with a 10-fold cross-validation. The analysis succeeded in differentiating four tree species: Norway spruce (Picea abies L.), Silver Fir (Abies alba Mill.), European beech (Fagus sylvatica) and Maple (Acer pseudoplatanus). The model success was evaluated using digital aerial images acquired in the year 2009 and inventory point records from the 2008/09 inventory. Model results of the multiseasonal RapidEye data analysis achieved an overall accuracy of 76%. However, the success of the model was evaluated only for all the identified species together and not for the individual species.
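
    The Spectral Angle Mapper technique mentioned above assigns a pixel to the reference spectrum with which it forms the smallest angle, where the angle is arccos of the normalized dot product of the two spectra. The sketch below illustrates only that step, not the study's model or its cross-validation; the five-band reflectance values and the class names are invented for illustration.

```python
import math

def spectral_angle(pixel, reference):
    """Angle in radians between two spectra treated as vectors."""
    dot = sum(p * r for p, r in zip(pixel, reference))
    norm_p = math.sqrt(sum(p * p for p in pixel))
    norm_r = math.sqrt(sum(r * r for r in reference))
    if norm_p == 0 or norm_r == 0:
        raise ValueError("spectra must be non-zero vectors")
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, dot / (norm_p * norm_r)))
    return math.acos(cos_angle)

def classify_pixel(pixel, references):
    """Return the name of the reference spectrum with the smallest angle."""
    return min(references, key=lambda name: spectral_angle(pixel, references[name]))

# Hypothetical five-band reference spectra for two classes.
references = {
    "species_A": [0.05, 0.08, 0.06, 0.20, 0.40],
    "species_B": [0.05, 0.10, 0.12, 0.15, 0.20],
}

# A pixel with the shape of species_A but twice the brightness.
bright_pixel = [0.10, 0.16, 0.12, 0.40, 0.80]
print(classify_pixel(bright_pixel, references))
```

    Because the angle ignores vector length, a uniformly brighter or darker pixel keeps the same angle to each reference, which is the usual argument for the measure's tolerance to illumination differences.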

  6. Iqpc 2015 Track: Tree Separation and Classification in Mobile Mapping LIDAR Data

    NASA Astrophysics Data System (ADS)

    Gorte, B.; Oude Elberink, S.; Sirmacek, B.; Wang, J.

    2015-08-01

    The European FP7 project IQmulus yearly organizes several processing contests, where submissions are requested for novel algorithms for point cloud and other big geodata processing. This paper describes the set-up and execution of a contest having the purpose to evaluate state-of-the-art algorithms for Mobile Mapping System point clouds, in order to detect and identify (individual) trees. By the nature of MMS these are trees in the vicinity of the road network (rather than in forests). Therefore, part of the challenge is distinguishing between trees and other objects, such as buildings, street furniture, cars etc. Three submitted segmentation and classification algorithms are thus evaluated.

  7. A Systematic Approach to Subgroup Classification in Intellectual Disability

    ERIC Educational Resources Information Center

    Schalock, Robert L.; Luckasson, Ruth

    2015-01-01

    This article describes a systematic approach to subgroup classification based on a classification framework and sequential steps involved in the subgrouping process. The sequential steps are stating the purpose of the classification, identifying the classification elements, using relevant information, and using clearly stated and purposeful…

  8. Flow Analysis: A Novel Approach For Classification.

    PubMed

    Vakh, Christina; Falkova, Marina; Timofeeva, Irina; Moskvin, Alexey; Moskvin, Leonid; Bulatov, Andrey

    2016-09-01

    We suggest a novel approach for classification of flow analysis methods according to the conditions under which the mass transfer processes and chemical reactions take place in the flow mode: dispersion-convection flow methods and forced-convection flow methods. The first group includes continuous flow analysis, flow injection analysis, all injection analysis, sequential injection analysis, sequential injection chromatography, cross injection analysis, multi-commutated flow analysis, multi-syringe flow injection analysis, multi-pumping flow systems, loop flow analysis, and simultaneous injection effective mixing flow analysis. The second group includes segmented flow analysis, zone fluidics, flow batch analysis, sequential injection analysis with a mixing chamber, stepwise injection analysis, and multi-commutated stepwise injection analysis. The offered classification allows systematizing a large number of flow analysis methods. Recent developments and applications of dispersion-convection flow methods and forced-convection flow methods are presented. PMID:26364745

  9. Classification and concentration estimation of explosive precursors using nanowires sensor array and decision tree learning

    NASA Astrophysics Data System (ADS)

    Cho, Junghwan; Li, Xiaopeng; Gu, Zhiyong; Kurup, Pradeep

    2011-09-01

    This paper aims to classify and estimate concentrations of explosive precursors using a nanowire sensor array and a decision tree learning algorithm. The nanowire sensor array consists of tin oxide sensors with four different additives: platinum (Pt), copper (Cu), indium (In), and nickel (Ni). The nanowire sensor array was tested using the vapors from four explosive precursors (acetone, nitrobenzene, nitrotoluene, and octane) with 10 different concentration levels each. A pattern recognition technique based on decision tree learning was applied to classify the explosive precursors and estimate their concentration. Classification and regression tree (CART) analysis was used for classification. The CART was also utilized for the purpose of structure identification in a Sugeno fuzzy inference system (FIS) for estimating the concentration of the precursors. Two CARTs were trained and their testing results were investigated.

  10. Automatic Approach to VHR Satellite Image Classification

    NASA Astrophysics Data System (ADS)

    Kupidura, P.; Osińska-Skotak, K.; Pluto-Kossakowska, J.

    2016-06-01

    In this paper, we propose a fully automatic classification of VHR satellite images. Unlike the most widespread approaches (supervised classification, which requires prior definition of class signatures, or unsupervised classification, which must be followed by an interpretation of its results), the proposed method requires no human intervention except for the setting of the initial parameters. The presented approach is based on both spectral and textural analysis of the image and consists of 3 steps. The first step, the analysis of spectral data, relies on NDVI values. Its purpose is to distinguish between basic classes, such as water, vegetation and non-vegetation, which all differ significantly spectrally and can therefore be easily extracted by spectral analysis. The second step relies on granulometric maps. These are the product of local granulometric analysis of an image and present information on the texture of each pixel neighbourhood, depending on the texture grain. The purpose of texture analysis is to distinguish between classes that are spectrally similar but of different texture, e.g. bare soil from a built-up area, or low vegetation from a wooded area. Because granulometric analysis is based on the mathematical morphology operations of opening and closing, the results are resistant to the border effect (qualifying borders of objects in an image as spaces of high texture), which affects other methods of texture analysis such as GLCM statistics or fractal analysis. Therefore, the effectiveness of the analysis is relatively high. Several indices based on values of different granulometric maps have been developed to simplify the extraction of classes of different texture. The third and final step of the process relies on a vegetation index based on near infrared and blue bands. Its purpose is to correct partially misclassified pixels.
All the indices used in the classification model developed relate to reflectance values, so the preliminary step
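
    The first step described in this abstract separates water, vegetation and non-vegetation by NDVI. A minimal sketch of such a rule follows. NDVI itself is the standard (NIR - red) / (NIR + red) ratio, but the two thresholds and the function names are hypothetical choices for illustration and are not taken from the paper.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    total = nir + red
    if total == 0:
        return 0.0  # avoid division by zero for fully dark pixels
    return (nir - red) / total

def basic_class(nir, red, water_max=0.0, vegetation_min=0.3):
    """Assign a coarse class from NDVI; both thresholds are assumed values."""
    value = ndvi(nir, red)
    if value < water_max:
        return "water"
    if value >= vegetation_min:
        return "vegetation"
    return "non-vegetation"

# Illustrative reflectance pairs (NIR, red), not data from the study.
classes = [basic_class(0.50, 0.10), basic_class(0.02, 0.05), basic_class(0.25, 0.20)]
```

    In practice the thresholds would be among the initial parameters that the abstract says are set by the user.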

  11. A Classification of Recent Widespread Tree Mortality in the Western US

    NASA Astrophysics Data System (ADS)

    Hicke, J. A.; Anderegg, W.; Allen, C. D.; Stephenson, N.

    2015-12-01

    Widespread tree mortality has been documented across the western United States in recent decades. Climate change has been implicated in these events, in particular warming and associated effects on tree stress and biotic disturbance agents. Given projected future warming, the capability of accurately predicting future tree mortality is critical. However, sufficient ecological understanding is needed to do so. Here we describe differences in various mortality types associated with spatial characteristics and climate drivers. We loosely classify mortality types into four categories: 1) widespread but low severity background mortality that has been increasing mainly because of greater stress associated with rising climatic water deficit; 2) tree die-offs that are driven by severe, hotter drought in which biotic agents play minor roles, such as sudden aspen decline; 3) tree die-offs in which hotter droughts combined with outbreaks of biotic agents, often less aggressive bark beetles, to cause mortality, such as piñon pine mortality in the Southwest; and 4) tree die-offs that were initiated or facilitated by droughts but which were associated with aggressive biotic agents that can kill healthy trees at high populations, such as mountain pine beetle outbreaks. An important use of this classification is the different pathways by which climate change can cause tree mortality. For some classes (background and primarily drought-driven mortality), predictions may be sufficiently accurate based on climate (drought) metrics. For classes in which biotic agents play a role, the direct warming effect on insects may occur through mechanisms not related to drought, and therefore predictions may need to include mechanisms other than drought. We note that this is a simplistic classification designed to facilitate understanding of tree mortality, and that overlap occurs among categories.

  12. Automatic template-guided classification of remnant trees

    NASA Astrophysics Data System (ADS)

    Kennedy, Peter

    Spectral features within satellite images change so frequently and unpredictably that spectral definitions of land cover are often only accurate for a single image. Consequently, land-cover maps are expensive, because the superior pattern recognition skills of human analysts are required to manually tune spectral definitions of land cover to individual images. To reduce mapping costs, this study developed the Template-Guided Classification (TGC) algorithm, which classifies land cover automatically by reusing class information embedded in freely available large-area land-cover maps. TGC was applied to map remnant forest within six 10-m resolution SPOT images of the Vermilion River watershed in Alberta, Canada. Although the accuracy of the resulting forest maps was low (58 % forest user's accuracy and 67 % forest producer's accuracy), there were 25 % and 8 % fewer errors of omission and commission than the original maps, respectively. This improvement would be very useful if it could be obtained automatically over large areas.

  13. Decision Tree Approach for Soil Liquefaction Assessment

    PubMed Central

    Gandomi, Amir H.; Fridline, Mark M.; Roke, David A.

    2013-01-01

    In the current study, the performances of some decision tree (DT) techniques are evaluated for postearthquake soil liquefaction assessment. A database containing 620 records of seismic parameters and soil properties is used in this study. Three decision tree techniques are used here in two different ways, considering statistical and engineering points of view, to develop decision rules. The DT results are compared to the logistic regression (LR) model. The results of this study indicate that the DTs not only successfully predict liquefaction but they can also outperform the LR model. The best DT models are interpreted and evaluated based on an engineering point of view. PMID:24489498

  14. Effects of sample survey design on the accuracy of classification tree models in species distribution models

    USGS Publications Warehouse

    Edwards, T.C., Jr.; Cutler, D.R.; Zimmermann, N.E.; Geiser, L.; Moisen, G.G.

    2006-01-01

    We evaluated the effects of probabilistic (hereafter DESIGN) and non-probabilistic (PURPOSIVE) sample surveys on resultant classification tree models for predicting the presence of four lichen species in the Pacific Northwest, USA. Models derived from both survey forms were assessed using an independent data set (EVALUATION). Measures of accuracy as gauged by resubstitution rates were similar for each lichen species irrespective of the underlying sample survey form. Cross-validation estimates of prediction accuracies were lower than resubstitution accuracies for all species and both design types, and in all cases were closer to the true prediction accuracies based on the EVALUATION data set. We argue that greater emphasis should be placed on calculating and reporting cross-validation accuracy rates rather than simple resubstitution accuracy rates. Evaluation of the DESIGN and PURPOSIVE tree models on the EVALUATION data set shows significantly lower prediction accuracy for the PURPOSIVE tree models relative to the DESIGN models, indicating that non-probabilistic sample surveys may generate models with limited predictive capability. These differences were consistent across all four lichen species, with 11 of the 12 possible species and sample survey type comparisons having significantly lower accuracy rates. Some differences in accuracy were as large as 50%. The classification tree structures also differed considerably both among and within the modelled species, depending on the sample survey form. Overlap in the predictor variables selected by the DESIGN and PURPOSIVE tree models ranged from only 20% to 38%, indicating the classification trees fit the two evaluated survey forms on different sets of predictor variables. The magnitude of these differences in predictor variables throws doubt on ecological interpretation derived from prediction models based on non-probabilistic sample surveys. © 2006 Elsevier B.V. All rights reserved.

  15. Stroke Damage Detection Using Classification Trees on Electrical Bioimpedance Cerebral Spectroscopy Measurements

    PubMed Central

    Atefi, Seyed Reza; Seoane, Fernando; Thorlin, Thorleif; Lindecrantz, Kaj

    2013-01-01

    After cancer and cardio-vascular disease, stroke is the third greatest cause of death worldwide. Given the limitations of the current imaging technologies used for stroke diagnosis, the need for portable non-invasive and less expensive diagnostic tools is crucial. Previous studies have suggested that electrical bioimpedance (EBI) measurements from the head might contain useful clinical information related to changes produced in the cerebral tissue after the onset of stroke. In this study, we recorded 720 EBI Spectroscopy (EBIS) measurements from two different head regions of 18 hemispheres of nine subjects. Three of these subjects had suffered a unilateral haemorrhagic stroke. A number of features based on structural and intrinsic frequency-dependent properties of the cerebral tissue were extracted. These features were then fed into a classification tree. The results show that a full classification of damaged and undamaged cerebral tissue was achieved after three hierarchical classification steps. Lastly, the performance of the classification tree was assessed using Leave-One-Out Cross Validation (LOO-CV). Although the results of this study are limited to a small database, and the observations obtained must be verified further with a larger cohort of patients, these findings confirm that EBI measurements contain useful information for assessing the health of brain tissue after stroke and support the hypothesis that classification features based on Cole parameters, spectral information and the geometry of EBIS measurements are useful to differentiate between healthy and stroke-damaged brain tissue. PMID:23966181

  16. Computer-aided diagnosis of Alzheimer's disease using support vector machines and classification trees

    NASA Astrophysics Data System (ADS)

    Salas-Gonzalez, D.; Górriz, J. M.; Ramírez, J.; López, M.; Álvarez, I.; Segovia, F.; Chaves, R.; Puntonet, C. G.

    2010-05-01

    This paper presents a computer-aided diagnosis technique for improving the accuracy of early diagnosis of Alzheimer-type dementia. The proposed methodology is based on the selection of voxels whose Welch's t-test statistic between the two classes, normal and Alzheimer images, is greater than a given threshold. The mean and standard deviation of intensity values are calculated for the selected voxels. They are chosen as feature vectors for two different classifiers: support vector machines with linear kernel and classification trees. The proposed methodology reaches greater than 95% accuracy in the classification task.

  17. The PhyloFacts FAT-CAT web server: ortholog identification and function prediction using fast approximate tree classification.

    PubMed

    Afrasiabi, Cyrus; Samad, Bushra; Dineen, David; Meacham, Christopher; Sjölander, Kimmen

    2013-07-01

    The PhyloFacts 'Fast Approximate Tree Classification' (FAT-CAT) web server provides a novel approach to ortholog identification using subtree hidden Markov model-based placement of protein sequences to phylogenomic orthology groups in the PhyloFacts database. Results on a data set of microbial, plant and animal proteins demonstrate FAT-CAT's high precision at separating orthologs and paralogs and robustness to promiscuous domains. We also present results documenting the precision of ortholog identification based on subtree hidden Markov model scoring. The FAT-CAT phylogenetic placement is used to derive a functional annotation for the query, including confidence scores and drill-down capabilities. PhyloFacts' broad taxonomic and functional coverage, with >7.3 M proteins from across the Tree of Life, enables FAT-CAT to predict orthologs and assign function for most sequence inputs. Four pipeline parameter presets are provided to handle different sequence types, including partial sequences and proteins containing promiscuous domains; users can also modify individual parameters. PhyloFacts trees matching the query can be viewed interactively online using the PhyloScope Javascript tree viewer and are hyperlinked to various external databases. The FAT-CAT web server is available at http://phylogenomics.berkeley.edu/phylofacts/fatcat/. PMID:23685612

  18. Semi-automatic approach for music classification

    NASA Astrophysics Data System (ADS)

    Zhang, Tong

    2003-11-01

    Audio categorization is essential when managing a music database, either a professional library or a personal collection. However, a complete automation in categorizing music into proper classes for browsing and searching is not yet supported by today's technology. Also, the issue of music classification is subjective to some extent as each user may have his own criteria for categorizing music. In this paper, we propose the idea of semi-automatic music classification. With this approach, a music browsing system is set up which contains a set of tools for separating music into a number of broad types (e.g. male solo, female solo, string instruments performance, etc.) using existing music analysis methods. With results of the automatic process, the user may further cluster music pieces in the database into finer classes and/or adjust misclassifications manually according to his own preferences and definitions. Such a system may greatly improve the efficiency of music browsing and retrieval, while at the same time guarantee accuracy and user's satisfaction of the results. Since this semi-automatic system has two parts, i.e. the automatic part and the manual part, they are described separately in the paper, with detailed descriptions and examples of each step of the two parts included.

  19. A Nonparametric Approach to Estimate Classification Accuracy and Consistency

    ERIC Educational Resources Information Center

    Lathrop, Quinn N.; Cheng, Ying

    2014-01-01

    When cut scores for classifications occur on the total score scale, popular methods for estimating classification accuracy (CA) and classification consistency (CC) require assumptions about a parametric form of the test scores or about a parametric response model, such as item response theory (IRT). This article develops an approach to estimate CA…

  20. Identification and classification of dynamic event tree scenarios via possibilistic clustering: application to a steam generator tube rupture event.

    PubMed

    Mercurio, D; Podofillini, L; Zio, E; Dang, V N

    2009-11-01

    This paper illustrates a method to identify and classify scenarios generated in a dynamic event tree (DET) analysis. Identification and classification are carried out by means of an evolutionary possibilistic fuzzy C-means clustering algorithm which takes into account not only the final system states but also the timing of the events and the process evolution. An application is considered with regards to the scenarios generated following a steam generator tube rupture in a nuclear power plant. The scenarios are generated by the accident dynamic simulator (ADS), coupled to a RELAP code that simulates the thermo-hydraulic behavior of the plant and to an operators' crew model, which simulates their cognitive and procedures-guided responses. A set of 60 scenarios has been generated by the ADS DET tool. The classification approach has grouped the 60 scenarios into 4 classes of dominant scenarios, one of which was not anticipated a priori but was "discovered" by the classifier. The proposed approach may be considered as a first effort towards the application of identification and classification approaches to scenarios post-processing for real-scale dynamic safety assessments. PMID:19819366

  1. The minimum distance approach to classification

    NASA Technical Reports Server (NTRS)

    Wacker, A. G.; Landgrebe, D. A.

    1971-01-01

    Work to advance the state of the art of minimum distance classification is reported. This is accomplished through a combination of theoretical and comprehensive experimental investigations based on multispectral scanner data. A survey of the literature for suitable distance measures was conducted and the results of this survey are presented. It is shown that minimum distance classification, using density estimators and Kullback-Leibler numbers as the distance measure, is equivalent to a form of maximum likelihood sample classification. It is also shown that for the parametric case, minimum distance classification is equivalent to nearest neighbor classification in the parameter space.
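
    The parametric case mentioned in this abstract, nearest neighbor classification in the parameter space, can be illustrated in a strongly simplified form. The sketch below assumes that each class is summarized by its mean vector alone and that Euclidean distance is used. The report surveys other distance measures, including Kullback-Leibler numbers, which are not implemented here. All names and values are hypothetical.

```python
import math

def mean_vector(samples):
    """Component-wise mean of a list of equal-length measurement vectors."""
    n = len(samples)
    return [sum(column) / n for column in zip(*samples)]

def classify_sample(samples, class_means):
    """Assign a whole sample (e.g. the pixels of one field) to the nearest class mean."""
    estimate = mean_vector(samples)
    return min(class_means, key=lambda name: math.dist(estimate, class_means[name]))

# Hypothetical two-band class means and one field of three pixels.
class_means = {"wheat": [0.20, 0.60], "soil": [0.40, 0.30]}
field = [[0.22, 0.58], [0.18, 0.63], [0.21, 0.60]]
assigned = classify_sample(field, class_means)
```

    Classifying the sample as a unit, rather than pixel by pixel, is what makes this a sample classifier in the sense used by the abstract.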

  2. An object-oriented approach for agricultural land classification using RapidEye imagery

    NASA Astrophysics Data System (ADS)

    Sang, H.; Zhai, L.; Zhang, J.; An, F.

    2015-06-01

    With the improvement of remote sensing technology, the spatial, structural and texture information of land covers is clearly present in high resolution imagery, which enhances the ability of crop mapping. Since the RapidEye satellite was launched in 2009, high resolution multispectral imagery together with a wide red edge band has been utilized in vegetation monitoring. Vegetation indices based on the broad red edge band have improved land use classification and vegetation studies. RapidEye high resolution imagery acquired on May 29 and August 9 of 2012 was used in this study to evaluate the potential of the red edge band in agricultural land cover/use mapping using an object-oriented classification approach. A new object-oriented decision tree classifier was introduced in this study to map agricultural lands in the study area. Besides the five bands of the RapidEye image, the vegetation indices derived from spectral bands and the structural and texture features are utilized as inputs for agricultural land cover/use mapping in the study. Optimizing the input features for classification by reducing redundant information improves the mapping precision by over 9% for AdaTree.WL and 5% for SVM; the accuracy is over 90% for both approaches. Time-phase characteristics are very important for distinguishing agricultural lands, and they improve the classification accuracy by 7% for AdaTree.WL and 6% for SVM.

  3. Classification tree and minimum-volume ellipsoid analyses of the distribution of ponderosa pine in the western USA

    USGS Publications Warehouse

    Norris, Jodi R.; Jackson, Stephen T.; Betancourt, Julio L.

    2006-01-01

    Aim: Ponderosa pine (Pinus ponderosa Douglas ex Lawson & C. Lawson) is an economically and ecologically important conifer that has a wide geographic range in the western USA, but is mostly absent from the geographic centre of its distribution - the Great Basin and adjoining mountain ranges. Much of its modern range was achieved by migration of geographically distinct Sierra Nevada (P. ponderosa var. ponderosa) and Rocky Mountain (P. ponderosa var. scopulorum) varieties in the last 10,000 years. Previous research has confirmed genetic differences between the two varieties, and measurable genetic exchange occurs where their ranges now overlap in western Montana. A variety of approaches in bioclimatic modelling is required to explore the ecological differences between these varieties and their implications for historical biogeography and impending changes in western landscapes. Location: Western USA. Methods: We used a classification tree analysis and a minimum-volume ellipsoid as models to explain the broad patterns of distribution of ponderosa pine in modern environments using climatic and edaphic variables. Most biogeographical modelling assumes that the target group represents a single, ecologically uniform taxonomic population. Classification tree analysis does not require this assumption because it allows the creation of pathways that predict multiple positive and negative outcomes. Thus, classification tree analysis can be used to test the ecological uniformity of the species. In addition, a multidimensional ellipsoid was constructed to describe the niche of each variety of ponderosa pine, and distances from the niche were calculated and mapped on a 4-km grid for each ecological variable. Results: The resulting classification tree identified three dominant pathways predicting ponderosa pine presence. Two of these three pathways correspond roughly to the distribution of var. ponderosa, and the third pathway generally corresponds to the distribution of var

  4. Classification of Tree Species in Overstorey Canopy of Subtropical Forest Using QuickBird Images

    PubMed Central

    Lin, Chinsu; Popescu, Sorin C.; Thomson, Gavin; Tsogt, Khongor; Chang, Chein-I

    2015-01-01

    This paper proposes a supervised classification scheme to identify 40 tree species (2 coniferous, 38 broadleaf) belonging to 22 families and 36 genera in high spatial resolution QuickBird multispectral images (HMS). Overall kappa coefficient (OKC) and species conditional kappa coefficients (SCKC) were used to evaluate classification performance in training samples and estimate accuracy and uncertainty in test samples. Baseline classification performance using HMS images and vegetation index (VI) images were evaluated with an OKC value of 0.58 and 0.48 respectively, but performance improved significantly (up to 0.99) when used in combination with an HMS spectral-spatial texture image (SpecTex). One of the 40 species had very high conditional kappa coefficient performance (SCKC ≥ 0.95) using 4-band HMS and 5-band VIs images, but, only five species had lower performance (0.68 ≤ SCKC ≤ 0.94) using the SpecTex images. When SpecTex images were combined with a Visible Atmospherically Resistant Index (VARI), there was a significant improvement in performance in the training samples. The same level of improvement could not be replicated in the test samples indicating that a high degree of uncertainty exists in species classification accuracy which may be due to individual tree crown density, leaf greenness (inter-canopy gaps), and noise in the background environment (intra-canopy gaps). These factors increase uncertainty in the spectral texture features and therefore represent potential problems when using pixel-based classification techniques for multi-species classification. PMID:25978466

  5. Automatic lung nodule classification with radiomics approach

    NASA Astrophysics Data System (ADS)

    Ma, Jingchen; Wang, Qian; Ren, Yacheng; Hu, Haibo; Zhao, Jun

    2016-03-01

    Lung cancer is the leading cause of cancer death. Malignant lung nodules have extremely high mortality, while some benign nodules do not need any treatment. Thus, accurate differentiation between benign and malignant nodules is necessary. Notably, although an additional invasive biopsy or a second CT scan three months later may help radiologists to make judgments, easier diagnostic approaches are urgently needed. In this paper, we propose a novel CAD method to distinguish benign from malignant lung nodules directly from CT images, which can not only improve the efficiency of tumor diagnosis but also greatly decrease the pain and risk to patients in the biopsy collection process. Briefly, following the state-of-the-art radiomics approach, 583 features were used in the first step to measure nodules' intensity, shape, heterogeneity and information in multiple frequencies. Further, with the Random Forest method, we distinguish benign nodules from malignant nodules by analyzing all these features. Notably, our proposed scheme was tested on all 79 CT scans with diagnosis data available in The Cancer Imaging Archive (TCIA), which contain 127 nodules, each annotated by at least one of four radiologists participating in the project. This method achieved 82.7% accuracy in classifying malignant primary lung nodules and benign nodules. We believe it would bring much value to routine lung cancer diagnosis in CT imaging and provide improved decision support at much lower cost.

  6. Stratification of the severity of critically ill patients with classification trees

    PubMed Central

    2009-01-01

    Background: Development of three classification trees (CT) based on the CART (Classification and Regression Trees), CHAID (Chi-Square Automatic Interaction Detection) and C4.5 methodologies for the calculation of probability of hospital mortality; the comparison of the results with the APACHE II, SAPS II and MPM II-24 scores, and with a model based on multiple logistic regression (LR). Methods: Retrospective study of 2864 patients. Random partition (70:30) into a Development Set (DS) n = 1808 and Validation Set (VS) n = 808. Their properties of discrimination are compared with the ROC curve (AUC CI 95%), Percent of correct classification (PCC CI 95%); and the calibration with the Calibration Curve and the Standardized Mortality Ratio (SMR CI 95%). Results: CTs are produced with a different selection of variables and decision rules: CART (5 variables and 8 decision rules), CHAID (7 variables and 15 rules) and C4.5 (6 variables and 10 rules). The common variables were: inotropic therapy, Glasgow, age, (A-a)O2 gradient and antecedent of chronic illness. In VS: all the models achieved acceptable discrimination with AUC above 0.7. CT: CART (0.75(0.71-0.81)), CHAID (0.76(0.72-0.79)) and C4.5 (0.76(0.73-0.80)). PCC: CART (72(69-75)), CHAID (72(69-75)) and C4.5 (76(73-79)). Calibration (SMR) better in the CT: CART (1.04(0.95-1.31)), CHAID (1.06(0.97-1.15)) and C4.5 (1.08(0.98-1.16)). Conclusion: With different methodologies of CTs, trees are generated with different selection of variables and decision rules. The CTs are easy to interpret, and they stratify the risk of hospital mortality. The CTs should be taken into account for the classification of the prognosis of critically ill patients. PMID:20003229
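
    The Standardized Mortality Ratio used here for calibration is, in its usual definition, the number of observed deaths divided by the number of deaths the model expects, that is, the sum of the predicted probabilities. A minimal sketch under that definition follows. The function name and the four-patient example are hypothetical, and the confidence intervals reported in the abstract are not computed.

```python
def standardized_mortality_ratio(died, predicted_risk):
    """Observed deaths divided by expected deaths (sum of predicted probabilities)."""
    observed = sum(died)             # died: 1 if the patient died, else 0
    expected = sum(predicted_risk)   # model probability of death per patient
    return observed / expected

# Four hypothetical patients: outcomes and model-predicted mortality risk.
died = [1, 0, 0, 1]
predicted_risk = [0.5, 0.25, 0.25, 1.0]
smr = standardized_mortality_ratio(died, predicted_risk)
```

    A value close to 1 means the model predicts about as many deaths as were observed, which is why the abstract reports SMR values near 1 as good calibration.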

  7. Application of Decision Tree Algorithm for classification and identification of natural minerals using SEM-EDS

    NASA Astrophysics Data System (ADS)

    Akkaş, Efe; Akin, Lutfiye; Evren Çubukçu, H.; Artuner, Harun

    2015-07-01

    A mineral is a natural, homogeneous solid with a definite chemical composition and a highly ordered atomic arrangement. Recently, fast and accurate mineral identification/classification became a necessity. Energy Dispersive X-ray Spectrometers (EDS) integrated with Scanning Electron Microscopes (SEM) are used to obtain rapid and reliable elemental analysis or chemical characterization of a solid. However, mineral identification is challenging since there is a wide range of spectral datasets for natural minerals. The more mineralogical data are acquired, the more time is required for classification procedures. Moreover, the instrumental conditions applied on a SEM-EDS differ for various applications, affecting the produced X-ray patterns even for the same mineral. This study aims to test whether the C5.0 Decision Tree is a rapid and reliable algorithm for classification and identification of various natural magmatic minerals. Ten distinct mineral groups (olivine, orthopyroxene, clinopyroxene, apatite, amphibole, plagioclase, K-feldspar, zircon, magnetite, biotite) from different igneous rocks have been analyzed on SEM-EDS. 4601 elemental X-ray intensity data have been collected under various instrumental conditions. 2400 elemental data have been used for training and the remaining 2201 data have been used to test mineral identification. The vast majority of the test data have been classified accurately. Additionally, high accuracy has been reached on minerals with similar chemical composition, such as olivine ((Mg,Fe)2[SiO4]) and orthopyroxene ((Mg,Fe)2[Si2O6]). Furthermore, two members from the amphibole group (magnesiohastingsite, tschermakite) and two from the clinopyroxene group (diopside, hedenbergite) have been accurately identified by the Decision Tree Algorithm. These results demonstrate that the C5.0 Decision Tree Algorithm is an efficient method for mineral group classification and the identification of mineral members.

  8. An Optimized NBC Approach in Text Classification

    NASA Astrophysics Data System (ADS)

    Yao, Zhao; Zhi-Min, Chen

    State-of-the-art text classification algorithms are good at categorizing Web documents into a few categories. But such a classification method does not give very detailed topic-related class information for the user, because the first two levels are often too coarse in large-scale text hierarchies. In this paper, we propose a method named DNB which can effectively improve classification performance, as shown by experimental results.

  9. Comparing Methodologies for Developing an Early Warning System: Classification and Regression Tree Model versus Logistic Regression. REL 2015-077

    ERIC Educational Resources Information Center

    Koon, Sharon; Petscher, Yaacov

    2015-01-01

    The purpose of this report was to explicate the use of logistic regression and classification and regression tree (CART) analysis in the development of early warning systems. It was motivated by state education leaders' interest in maintaining high classification accuracy while simultaneously improving practitioner understanding of the rules…

  10. A systematic approach to the classification of diseases.

    PubMed

    Murthy, A R

    1993-01-01

    Ayurvedic texts have adopted multiple approaches to the classification of diseases. Caraka, while choosing a binary classification in Vimana sthana, declares that the classifications may be numerable and innumerable, based on the criteria chosen for such classification. He gives full liberty to the individual to adopt newer classifications, provided the criteria are different. Taking a cue from this statement, an attempt has been made at categorizing the diseases mentioned in Ayurvedic texts under different systems in keeping with the current practice in the Western Medical Sciences. PMID:22556612

  11. A SYSTEMATIC APPROACH TO THE CLASSIFICATION OF DISEASES

    PubMed Central

    Murthy, A.R.V.

    1993-01-01

    Ayurvedic texts have adopted multiple approaches to the classification of diseases. Caraka, while choosing a binary classification in Vimana sthana, declares that the classifications may be numerable and innumerable, based on the criteria chosen for such classification. He gives full liberty to the individual to adopt newer classifications, provided the criteria are different. Taking a cue from this statement, an attempt has been made at categorizing the diseases mentioned in Ayurvedic texts under different systems in keeping with the current practice in the Western Medical Sciences. PMID:22556612

  12. Applying Classification Trees to Hospital Administrative Data to Identify Patients with Lower Gastrointestinal Bleeding

    PubMed Central

    Siddique, Juned; Ruhnke, Gregory W.; Flores, Andrea; Prochaska, Micah T.; Paesch, Elizabeth; Meltzer, David O.; Whelan, Chad T.

    2015-01-01

    Background Lower gastrointestinal bleeding (LGIB) is a common cause of acute hospitalization. Currently, there is no accepted standard for identifying patients with LGIB in hospital administrative data. The objective of this study was to develop and validate a set of classification algorithms that use hospital administrative data to identify LGIB. Methods Our sample consists of patients admitted between July 1, 2001 and June 30, 2003 (derivation cohort) and July 1, 2003 and June 30, 2005 (validation cohort) to the general medicine inpatient service of the University of Chicago Hospital, a large urban academic medical center. Confirmed cases of LGIB in both cohorts were determined by reviewing the charts of those patients who had at least 1 of 36 principal or secondary International Classification of Diseases, Ninth revision, Clinical Modification (ICD-9-CM) diagnosis codes associated with LGIB. Classification trees were used on the data of the derivation cohort to develop a set of decision rules for identifying patients with LGIB. These rules were then applied to the validation cohort to assess their performance. Results Three classification algorithms were identified and validated: a high specificity rule with 80.1% sensitivity and 95.8% specificity, a rule that balances sensitivity and specificity (87.8% sensitivity, 90.9% specificity), and a high sensitivity rule with 100% sensitivity and 91.0% specificity. Conclusion These classification algorithms can be used in future studies to evaluate resource utilization and assess outcomes associated with LGIB without the use of chart review. PMID:26406318

  13. A discrete element modelling approach for block impacts on trees

    NASA Astrophysics Data System (ADS)

    Toe, David; Bourrier, Franck; Olmedo, Ignatio; Berger, Frederic

    2015-04-01

    In the past few years, rockfall models explicitly accounting for block shape, especially those using the Discrete Element Method (DEM), have shown a good ability to predict rockfall trajectories. Integrating forest effects into those models still remains challenging. This study aims at using a DEM approach to model impacts of blocks on trees and identify the key parameters controlling the block kinematics after the impact on a tree. A DEM impact model of a block on a tree was developed and validated using laboratory experiments. Then, key parameters were assessed using a global sensitivity analysis. Modelling the impact of a block on a tree using DEM allows taking into account large displacements, material non-linearities and contacts between the block and the tree. Tree stems are represented by flexible cylinders modelled as plastic beams sustaining normal, shearing, bending, and twisting loading. Root-soil interactions are modelled using a rotation stiffness acting on the bending moment at the bottom of the tree and a limit bending moment to account for tree overturning. The crown is taken into account using an additional mass distributed uniformly on the upper part of the tree. The block is represented by a sphere. The contact model between the block and the stem consists of an elastic frictional model. The DEM model was validated using laboratory impact tests carried out on 41 fresh beech (Fagus sylvatica) stems. Each stem was 1.3 m long with a diameter between 3 and 7 cm. Wood stems were clamped on a rigid structure and impacted by a 149 kg Charpy pendulum. Finally, an intensive simulation campaign of blocks impacting trees was done to identify the input parameters controlling the block kinematics after the impact on a tree. 20 input parameters were considered in the DEM simulation model: 12 parameters were related to the tree and 8 parameters to the block. The results highlight that the impact velocity, the stem diameter, and the block volume are the three input

  14. Analysis of Maryland Poisoning Deaths Using Classification And Regression Tree (CART) Analysis

    PubMed Central

    Pamer, Carol; Serpi, Tracey; Finkelstein, Joseph

    2008-01-01

    Our study is a cross-sectional analysis of Maryland poisoning deaths for years 2003 and 2004. We used Classification and Regression Tree (CART) methodology to classify 1,204 Maryland undetermined intent poisoning deaths as either unintentional or suicidal poisonings. The predictive ability of the selected set of variables (i.e., poisoned in the home or workplace, location type where poisoned, place of death, poison type, victim race and age, year of death) was extremely good. Of the 301 test cases, only eight were misclassified by the CART regression tree. Of 1,204 undetermined intent poisoning deaths, CART classified 903 as suicides and 301 as unintentional deaths. The major strength of our study is the use of CART to differentiate with a high degree of accuracy between unintentional and suicidal poisoning deaths among Maryland undetermined intent poisoning deaths. PMID:18999168

  15. Predicting Chemically Induced Duodenal Ulcer and Adrenal Necrosis with Classification Trees

    NASA Astrophysics Data System (ADS)

    Giampaolo, Casimiro; Gray, Andrew T.; Olshen, Richard A.; Szabo, Sandor

    1991-07-01

    Binary tree-structured statistical classification algorithms and properties of 56 model alkyl nucleophiles were brought to bear on two problems of experimental pharmacology and toxicology. Each rat of a learning sample of 745 was administered one compound and autopsied to determine the presence of duodenal ulcer or adrenal hemorrhagic necrosis. The cited statistical classification schemes were then applied to these outcomes and 67 features of the compounds to ascertain those characteristics that are associated with biologic activity. For predicting duodenal ulceration, dipole moment, melting point, and solubility in octanol are particularly important, while for predicting adrenal necrosis, important features include the number of sulfhydryl groups and double bonds. These methods may constitute inexpensive but powerful ways to screen untested compounds for possible organ-specific toxicity. Mechanisms for the etiology and pathogenesis of the duodenal and adrenal lesions are suggested, as are additional avenues for drug design.

  16. Using the PDD Behavior Inventory as a Level 2 Screener: A Classification and Regression Trees Analysis.

    PubMed

    Cohen, Ira L; Liu, Xudong; Hudson, Melissa; Gillis, Jennifer; Cavalari, Rachel N S; Romanczyk, Raymond G; Karmel, Bernard Z; Gardner, Judith M

    2016-09-01

    In order to improve discrimination accuracy between Autism Spectrum Disorder (ASD) and similar neurodevelopmental disorders, a data mining procedure, Classification and Regression Trees (CART), was used on a large multi-site sample of PDD Behavior Inventory (PDDBI) forms on children with and without ASD. Discrimination accuracy exceeded 80%, generalized to an independent validation set and across age groups and sites, and agreed well with ADOS classifications. Parent PDDBIs yielded better results than teacher PDDBIs but, when CART predictions agreed across informants, sensitivity increased. Results also revealed three subtypes of ASD: minimally verbal, verbal, and atypical; and two relatively common subtypes of non-ASD children: social pragmatic problems and good social skills. These subgroups corresponded to differences in behavior profiles and associated bio-medical findings. PMID:27318809

  17. Internal Carbon Recycling in Trees - New Approach, Findings, and Implications

    NASA Astrophysics Data System (ADS)

    Angert, A.; Hilman, B.

    2012-12-01

    The CO2 emitted by respiration in a tree's woody tissue (stem, branch, or root) is usually assumed to diffuse directly out to the atmosphere. Given that the internal concentrations of CO2 are one to two orders of magnitude higher than the atmospheric concentration, a reuse of this respired carbon can be beneficial to plants. We have developed a new method to track the fraction of respired CO2 not emitted from stems and branches, from the ratio of the CO2 efflux to the O2 influx. This ratio, which we defined as the apparent respiratory quotient (ARQ), is expected to equal 1.0 if carbohydrates are the substrate for respiration, and all respired CO2 is directly emitted. Using this approach we have recently shown that ~30% of the CO2 respired by Amazon forest tree stems was not directly emitted. In the current study we have applied this approach to 5 tree species growing in a Mediterranean climate, and have performed seasonal and diurnal ARQ measurements at different heights along the stem and branches. We found different seasonal variations in the ARQ of riparian versus drought-resilient trees. In addition, the ARQ diurnal cycle, together with the measurements at different heights, indicates that a considerable fraction of the CO2 not emitted is recycled within the tree.
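
    The ARQ definition in this abstract can be illustrated with a minimal Python sketch. The function names and the flux values are hypothetical and do not come from the study; reading 1 - ARQ as the share of respired CO2 not directly emitted rests on the abstract's stated assumption that carbohydrates are the respiratory substrate (expected quotient of 1.0).

```python
def apparent_respiratory_quotient(co2_efflux: float, o2_influx: float) -> float:
    """ARQ = CO2 efflux / O2 influx, both in the same units."""
    if o2_influx <= 0:
        raise ValueError("o2_influx must be positive")
    return co2_efflux / o2_influx


def fraction_not_emitted(arq: float, expected_quotient: float = 1.0) -> float:
    """Share of respired CO2 not emitted directly.

    Assumption: with a carbohydrate substrate the quotient would be 1.0
    if all respired CO2 left the stem, so the shortfall is 1 - ARQ.
    """
    return 1.0 - arq / expected_quotient


# Hypothetical fluxes chosen only to illustrate the arithmetic.
arq = apparent_respiratory_quotient(co2_efflux=0.7, o2_influx=1.0)
retained = fraction_not_emitted(arq)
print(f"ARQ = {arq:.2f}, fraction not directly emitted = {retained:.2f}")
```

    Under these made-up fluxes an ARQ of 0.7 corresponds to about 30% of the respired CO2 not being emitted directly, the same order as the Amazon result quoted in the abstract.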

  18. The Learning Tree Montessori Child Care: An Approach to Diversity

    ERIC Educational Resources Information Center

    Wick, Laurie

    2006-01-01

    In this article the author describes how she and her partners started The Learning Tree Montessori Child Care, a Montessori program with a different approach in Seattle in 1979. The author also relates that the other area Montessori schools then offered half-day programs, and as a result the children who attended were, for the most part,…

  19. A class-oriented model for hyperspectral image classification through hierarchy-tree-based selection

    NASA Astrophysics Data System (ADS)

    Tang, Zhongqi; Fu, Guangyuan; Zhao, XiaoLin; Chen, Jin; Zhang, Li

    2016-03-01

    With the development of hyperspectral sensors over the last few decades, hyperspectral images (HSIs) face new challenges in the field of data analysis. With such high-dimensional data, the most challenging issue is to select an effective yet minimal subset from a mass of bands. This paper proposes a class-oriented model to solve the task of classification by incorporating the spectral prior of the target, since different targets have different characteristics in spectral correlation. This model performs feature selection after a partition of hyperspectral data into groups along the spectral dimension. In the process of spectral partition, we group the raw data into several subsets by a hierarchy tree structure. In each group, band selection is performed via recursive support vector machine (R-SVM) learning, which reduces the computational cost while preserving the accuracy of classification. To ensure the robustness of the result, we also present a weight-voting strategy for result merging, in which both the spectral independence and the classification effectiveness are considered. Extensive experiments show that our model achieves better performance than the existing methods in task-dependent classifications, such as target detection and identification.

  20. Classification

    ERIC Educational Resources Information Center

    Clary, Renee; Wandersee, James

    2013-01-01

    In this article, Renee Clary and James Wandersee describe the beginnings of "Classification," which lies at the very heart of science and depends upon pattern recognition. Clary and Wandersee approach patterns by first telling the story of the "Linnaean classification system," introduced by Carl Linnaeus (1707-1778), who is…

  1. Classification of savanna tree species, in the Greater Kruger National Park region, by integrating hyperspectral and LiDAR data in a Random Forest data mining environment

    NASA Astrophysics Data System (ADS)

    Naidoo, L.; Cho, M. A.; Mathieu, R.; Asner, G.

    2012-04-01

    The accurate classification and mapping of individual trees at species level in the savanna ecosystem can provide numerous benefits for the managerial authorities. Such benefits include the mapping of economically useful tree species, which are a key source of food production and fuel wood for the local communities, and of problematic alien invasive and bush encroaching species, which can threaten the integrity of the environment and livelihoods of the local communities. Species level mapping is particularly challenging in African savannas which are complex, heterogeneous, and open environments with high intra-species spectral variability due to differences in geology, topography, rainfall, herbivory and human impacts within relatively short distances. Savanna vegetation is also highly irregular in canopy and crown shape, height and other structural dimensions with a combination of open grassland patches and dense woody thicket, a stark contrast to the more homogeneous forest vegetation. This study classified eight common savanna tree species in the Greater Kruger National Park region, South Africa, using a combination of hyperspectral and Light Detection and Ranging (LiDAR)-derived structural parameters, in the form of seven predictor datasets, in an automated Random Forest modelling approach. The most important predictors, which were found to play an important role in the different classification models and contributed to the success of the hybrid dataset model when combined, were species tree height; NDVI; the chlorophyll b wavelength (466 nm) and a selection of raw, continuum removed and Spectral Angle Mapper (SAM) bands. It was also concluded that the hybrid predictor dataset Random Forest model yielded the highest classification accuracy and prediction success for the eight savanna tree species with an overall classification accuracy of 87.68% and KHAT value of 0.843.

  2. Random subwindows and extremely randomized trees for image classification in cell biology

    PubMed Central

    Marée, Raphaël; Geurts, Pierre; Wehenkel, Louis

    2007-01-01

    Background With the improvements in biosensors and high-throughput image acquisition technologies, life science laboratories are able to perform an increasing number of experiments that involve the generation of a large number of images at different imaging modalities/scales. This stresses the need for computer vision methods that automate image classification tasks. Results We illustrate the potential of our image classification method in cell biology by evaluating it on four datasets of images related to protein distributions or subcellular localizations, and red-blood cell shapes. Accuracy results are quite good without any specific pre-processing or incorporation of domain knowledge. The method is implemented in Java and available upon request for evaluation and research purposes. Conclusion Our method is directly applicable to any image classification problem. We foresee the use of this automatic approach as a baseline method and first try on various biological image classification problems. PMID:17634092

  3. Exploiting machine learning algorithms for tree species classification in a semiarid woodland using RapidEye image

    NASA Astrophysics Data System (ADS)

    Adelabu, Samuel; Mutanga, Onisimo; Adam, Elhadi; Cho, Moses Azong

    2013-01-01

    Classification of different tree species in semiarid areas can be challenging as a result of the change in leaf structure and orientation due to soil moisture constraints. Tree species mapping is, however, a key parameter for forest management in semiarid environments. In this study, we examined the suitability of 5-band RapidEye satellite data for the classification of five tree species in mopane woodland of Botswana using machine learning algorithms with limited training samples. We performed classification using random forest (RF) and support vector machines (SVM) based on EnMap box. The overall accuracies for classifying the five tree species were 88.75% and 85% for SVM and RF, respectively. We also demonstrated that the new red-edge band in the RapidEye sensor has the potential for classifying tree species in semiarid environments when integrated with other standard bands. Similarly, we observed that where there are limited training samples, SVM is preferred over RF. Finally, we demonstrated that the two accuracy measures of quantity and allocation disagreement are simpler and more helpful for the vast majority of remote sensing classification processes than the kappa coefficient. Overall, high species classification accuracy can be achieved using strategically located RapidEye bands integrated with advanced processing algorithms.
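
    The two accuracy measures named in this abstract, quantity disagreement and allocation disagreement, can be sketched in a few lines of Python. This is an illustrative sketch using the definitions common in the remote sensing literature, not the authors' code; the confusion matrix is invented, and the sample counts are assumed to be proportional to the mapped area (no stratification weights).

```python
def disagreement(matrix):
    """Return (quantity, allocation, overall_accuracy) for a square
    confusion matrix where matrix[i][j] counts samples mapped as class i
    whose reference class is j."""
    k = len(matrix)
    total = sum(sum(row) for row in matrix)
    p = [[count / total for count in row] for row in matrix]
    mapped = [sum(p[g]) for g in range(k)]                           # row totals
    reference = [sum(p[i][g] for i in range(k)) for g in range(k)]   # column totals
    # Quantity disagreement: mismatch in class proportions.
    quantity = sum(abs(mapped[g] - reference[g]) for g in range(k)) / 2
    # Allocation disagreement: mismatch in spatial assignment given proportions.
    allocation = sum(
        2 * min(mapped[g] - p[g][g], reference[g] - p[g][g]) for g in range(k)
    ) / 2
    overall = sum(p[g][g] for g in range(k))
    return quantity, allocation, overall


# Hypothetical 3-class confusion matrix (100 samples), for illustration only.
example = [
    [30, 10, 0],
    [5, 25, 0],
    [0, 0, 30],
]
quantity, allocation, overall = disagreement(example)
print(quantity, allocation, overall)
```

    By construction the two components sum to the total disagreement, that is, one minus the overall accuracy, which is part of why they are easy to interpret.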

  4. The Tree of Life and a New Classification of Bony Fishes

    PubMed Central

    Betancur-R., Ricardo; Broughton, Richard E.; Wiley, Edward O.; Carpenter, Kent; López, J. Andrés; Li, Chenhong; Holcroft, Nancy I.; Arcila, Dahiana; Sanciangco, Millicent; Cureton II, James C; Zhang, Feifei; Buser, Thaddaeus; Campbell, Matthew A.; Ballesteros, Jesus A; Roa-Varon, Adela; Willis, Stuart; Borden, W. Calvin; Rowley, Thaine; Reneau, Paulette C.; Hough, Daniel J.; Lu, Guoqing; Grande, Terry; Arratia, Gloria; Ortí, Guillermo

    2013-01-01

    The tree of life of fishes is in a state of flux because we still lack a comprehensive phylogeny that includes all major groups. The situation is most critical for a large clade of spiny-finned fishes, traditionally referred to as percomorphs, whose uncertain relationships have plagued ichthyologists for over a century. Most of what we know about the higher-level relationships among fish lineages has been based on morphology, but rapid influx of molecular studies is changing many established systematic concepts. We report a comprehensive molecular phylogeny for bony fishes that includes representatives of all major lineages. DNA sequence data for 21 molecular markers (one mitochondrial and 20 nuclear genes) were collected for 1410 bony fish taxa, plus four tetrapod species and two chondrichthyan outgroups (total 1416 terminals). Bony fish diversity is represented by 1093 genera, 369 families, and all traditionally recognized orders. The maximum likelihood tree provides unprecedented resolution and high bootstrap support for most backbone nodes, defining for the first time a global phylogeny of fishes. The general structure of the tree is in agreement with expectations from previous morphological and molecular studies, but significant new clades arise. Most interestingly, the high degree of uncertainty among percomorphs is now resolved into nine well-supported supraordinal groups. The order Perciformes, considered by many a polyphyletic taxonomic waste basket, is defined for the first time as a monophyletic group in the global phylogeny. A new classification that reflects our phylogenetic hypothesis is proposed to facilitate communication about the newly found structure of the tree of life of fishes. Finally, the molecular phylogeny is calibrated using 60 fossil constraints to produce a comprehensive time tree. The new time-calibrated phylogeny will provide the basis for and stimulate new comparative studies to better understand the evolution of the amazing

  5. Application of classification-tree methods to identify nitrate sources in ground water

    USGS Publications Warehouse

    Spruill, T.B.; Showers, W.J.; Howe, S.S.

    2002-01-01

    A study was conducted to determine if nitrate sources in ground water (fertilizer on crops, fertilizer on golf courses, irrigation spray from hog (Sus scrofa) wastes, and leachate from poultry litter and septic systems) could be classified with 80% or greater success. Two statistical classification-tree models were devised from 48 water samples containing nitrate from five source categories. Model 1 was constructed by evaluating 32 variables and selecting four primary predictor variables (δ15N, nitrate to ammonia ratio, sodium to potassium ratio, and zinc) to identify nitrate sources. A δ15N value of nitrate plus potassium 18.2 indicated inorganic or soil organic N. A nitrate to ammonia ratio 575 indicated nitrate from golf courses. A sodium to potassium ratio 3.2 indicated spray or poultry wastes. A value for zinc 2.8 indicated poultry wastes. Model 2 was devised by using all variables except δ15N. This model also included four variables (sodium plus potassium, nitrate to ammonia ratio, calcium to magnesium ratio, and sodium to potassium ratio) to distinguish categories. Both models were able to distinguish all five source categories with better than 80% overall success and with 71 to 100% success in individual categories using the learning samples. Seventeen water samples that were not used in model development were tested using Model 2 for three categories, and all were correctly classified. Classification-tree models show great potential in identifying sources of contamination and variables important in the source-identification process.

  6. A Novel Modulation Classification Approach Using Gabor Filter Network

    PubMed Central

    Ghauri, Sajjad Ahmed; Qureshi, Ijaz Mansoor; Cheema, Tanveer Ahmed; Malik, Aqdas Naveed

    2014-01-01

    A Gabor filter network based approach is used for feature extraction and classification of digitally modulated signals by adaptively tuning the parameters of the Gabor filter network. Modulation classification of digitally modulated signals is done under the influence of additive white Gaussian noise (AWGN). The modulations considered for the classification purpose are PSK 2 to 64, FSK 2 to 64, and QAM 4 to 64. The Gabor filter network uses a two-layer network structure: the first layer, which is the input layer, constitutes the adaptive feature extraction part and the second layer constitutes the signal classification part. The Gabor atom parameters are tuned using the Delta rule, and the weights of the Gabor filter are updated using the least mean square (LMS) algorithm. The simulation results show that the proposed modulation classification algorithm has high classification accuracy at low signal-to-noise ratio (SNR) on an AWGN channel. PMID:25126603

  7. Non-Destructive Classification Approaches for Equilibrated Ordinary Chondrites

    NASA Astrophysics Data System (ADS)

    Righter, K.; Harrington, R.; Schroeder, C.; Morris, R. V.

    2013-09-01

    In order to compare a few non-destructive classification techniques with the standard approaches, we have characterized a group of chondrites from the Larkman Nunatak region using magnetic susceptibility and Mössbauer spectroscopy.

  8. Generation of 2D Land Cover Maps for Urban Areas Using Decision Tree Classification

    NASA Astrophysics Data System (ADS)

    Höhle, J.

    2014-09-01

    A 2D land cover map can automatically and efficiently be generated from high-resolution multispectral aerial images. First, a digital surface model is produced and each cell of the elevation model is then supplemented with attributes. A decision tree classification is applied to extract map objects like buildings, roads, grassland, trees, hedges, and walls from such an "intelligent" point cloud. The decision tree is derived from training areas whose borders are digitized on top of a false-colour orthoimage. The produced 2D land cover map with six classes is subsequently refined by using image analysis techniques. The proposed methodology is described step by step. The classification, assessment, and refinement are carried out by the open source software "R"; the generation of the dense and accurate digital surface model by the "Match-T DSM" program of the Trimble Company. A practical example of a 2D land cover map generation is carried out. Images of a multispectral medium-format aerial camera covering an urban area in Switzerland are used. The assessment of the produced land cover map is based on class-wise stratified sampling where reference values of samples are determined by means of stereo-observations of false-colour stereopairs. The stratified statistical assessment of the produced land cover map with six classes, based on 91 points per class, reveals a high thematic accuracy for classes "building" (99 %, 95 % CI: 95 %-100 %) and "road and parking lot" (90 %, 95 % CI: 83 %-95 %). Some other accuracy measures (overall accuracy, kappa value) and their 95 % confidence intervals are derived as well. The proposed methodology has a high potential for automation and fast processing and may be applied to other scenes and sensors.
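
    A per-class accuracy with a 95 % confidence interval, as reported in this abstract, can be computed from a count of correct samples. The sketch below uses the Wilson score interval as one common choice; the abstract does not say which interval method the author used, and the count of 82 correct points out of 91 is a hypothetical value picked only because it rounds to 90 %.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95 %)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need 0 <= successes <= n and n > 0")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


# Hypothetical: 82 of 91 reference points correctly classified for one class.
low, high = wilson_interval(82, 91)
print(f"accuracy = {82 / 91:.1%}, 95% CI = [{low:.1%}, {high:.1%}]")
```

    Other interval methods (for example an exact binomial interval) give slightly different bounds, so the output should not be expected to match the published figures digit for digit.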

  9. Identifying Transferable Skills: A Task Classification Approach.

    ERIC Educational Resources Information Center

    Ashley, William L.; Ammerman, Harry L.

    The feasibility of classifying occupational tasks as a basis for understanding better the occupational transferability of job skills was examined. To show general skill relationships among occupations, 5 classification schemes were applied to 50 selected task statements for each of 12 occupations. Ratings by five reasonably knowledgeable people…

  10. Eating Disorder Diagnoses: Empirical Approaches to Classification

    ERIC Educational Resources Information Center

    Wonderlich, Stephen A.; Joiner, Thomas E., Jr.; Keel, Pamela K.; Williamson, Donald A.; Crosby, Ross D.

    2007-01-01

    Decisions about the classification of eating disorders have significant scientific and clinical implications. The eating disorder diagnoses in the Diagnostic and Statistical Manual of Mental Disorders (4th ed.; DSM-IV; American Psychiatric Association, 1994) reflect the collective wisdom of experts in the field but are frequently not supported in…

  11. Advanced fractal approach for unsupervised classification of SAR images

    NASA Astrophysics Data System (ADS)

    Pant, Triloki; Singh, Dharmendra; Srivastava, Tanuja

    2010-06-01

    Unsupervised classification of Synthetic Aperture Radar (SAR) images is the alternative approach when little or no a priori information about the image is available. Therefore, an attempt has been made in the present paper to develop an unsupervised classification scheme for SAR images based on textural information. For extraction of textural features, two properties are used, viz. fractal dimension D and Moran's I. Using these indices, an algorithm is proposed for contextual classification of SAR images. The novelty of the algorithm is that it exploits the textural information available in a SAR image with the help of these two texture measures. For estimation of D, the Two Dimensional Variation Method (2DVM) has been revised and implemented, and its performance is compared with another method, the Triangular Prism Surface Area Method (TPSAM). The classification accuracy was also checked for various window sizes in order to determine the effect of window size and to select the size giving the best classification. The algorithm is applied to four SAR images of the Hardwar region, India, and classification accuracy has been computed. A comparison of the proposed algorithm, using both fractal dimension estimation methods, with the K-Means algorithm is discussed. The maximum overall classification accuracy with K-Means is 53.26%, whereas the overall classification accuracy with the proposed algorithm is 66.16% for TPSAM and 61.26% for 2DVM.
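
    One of the two texture measures mentioned here, Moran's I, can be computed for an image window as follows. This is a generic sketch of the standard statistic, assuming binary weights between 4-connected (rook) neighbours; the abstract does not specify the neighbourhood or weighting the authors used, and the test windows are made up.

```python
def morans_i(window):
    """Moran's I of a 2D list of pixel values with rook-adjacency weights."""
    rows, cols = len(window), len(window[0])
    n = rows * cols
    mean = sum(sum(row) for row in window) / n
    dev = [[value - mean for value in row] for row in window]
    variance_sum = sum(d * d for row in dev for d in row)
    if variance_sum == 0:
        return 0.0  # constant window: statistic undefined, 0 returned by assumption
    cross_sum = 0.0
    weight_sum = 0
    for i in range(rows):
        for j in range(cols):
            for di, dj in ((0, 1), (1, 0), (0, -1), (-1, 0)):
                ni, nj = i + di, j + dj
                if 0 <= ni < rows and 0 <= nj < cols:
                    cross_sum += dev[i][j] * dev[ni][nj]
                    weight_sum += 1
    return (n / weight_sum) * (cross_sum / variance_sum)


# Hypothetical windows: a checkerboard (strong negative autocorrelation)
# and two homogeneous halves (positive autocorrelation).
checker = [[0, 1], [1, 0]]
halves = [[0, 0, 1, 1] for _ in range(4)]
print(morans_i(checker), morans_i(halves))
```

    Sliding such a function over the image with different window sizes is one way to explore the window-size effect the abstract describes.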

  12. Prediction of radiation levels in residences: A methodological comparison of CART (Classification and Regression Tree Analysis) and conventional regression

    SciTech Connect

    Janssen, I.; Stebbings, J.H.

    1990-01-01

    In environmental epidemiology, trace and toxic substance concentrations frequently have very highly skewed distributions ranging over one or more orders of magnitude, and prediction by conventional regression is often poor. Classification and Regression Tree Analysis (CART) is an alternative in such contexts. To compare the techniques, two Pennsylvania data sets and three independent variables are used: house radon progeny (RnD) and gamma levels as predicted by construction characteristics in 1330 houses; and approximately 200 house radon (Rn) measurements as predicted by topographic parameters. CART may identify structural variables of interest not identified by conventional regression, and vice versa, but in general the regression models are similar. CART has major advantages in dealing with other common characteristics of environmental data sets, such as missing values, continuous variables requiring transformations, and large sets of potential independent variables. CART is most useful in the identification and screening of independent variables, greatly reducing the need for cross-tabulations and nested breakdown analyses. There is no need to discard cases with missing values for the independent variables because surrogate variables are intrinsic to CART. The tree-structured approach is also independent of the scale on which the independent variables are measured, so that transformations are unnecessary. CART identifies important interactions as well as main effects. The major advantages of CART appear to be in exploring data. Once the important variables are identified, conventional regressions seem to lead to results similar but more interpretable by most audiences. 12 refs., 8 figs., 10 tabs.
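
    The claim that tree splits do not depend on the measurement scale of a predictor can be checked with a small sketch. The code below is a simplified single-split (stump) search that minimises the within-group sum of squared errors, not a full CART implementation, and the predictor and response values are invented. Because a threshold split depends only on the ordering of the predictor, a monotone transformation such as a logarithm leaves the chosen partition unchanged.

```python
import math


def _sse(values):
    """Sum of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(xs, ys):
    """Return (indices in the left group, total SSE) for the best
    single-threshold split of ys on predictor xs."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    best = None
    for cut in range(1, len(order)):
        if xs[order[cut - 1]] == xs[order[cut]]:
            continue  # no threshold can separate tied predictor values
        left, right = order[:cut], order[cut:]
        total = _sse([ys[i] for i in left]) + _sse([ys[i] for i in right])
        if best is None or total < best[1]:
            best = (frozenset(left), total)
    return best


# Hypothetical skewed predictor spanning orders of magnitude, and a response.
x = [1.0, 3.0, 10.0, 30.0, 100.0, 300.0]
y = [0.5, 0.7, 0.6, 2.1, 2.4, 2.2]

raw = best_split(x, y)
logged = best_split([math.log(v) for v in x], y)
print(raw[0] == logged[0])  # same observations fall on each side of the split
```

    A least-squares regression on the same data would, by contrast, give different fits for the raw and log-transformed predictor, which is the practical point the abstract makes about transformations.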

  13. The creation of a digital soil map for Cyprus using decision-tree classification techniques

    NASA Astrophysics Data System (ADS)

    Camera, Corrado; Zomeni, Zomenia; Bruggeman, Adriana; Noller, Joy; Zissimos, Andreas

    2014-05-01

    Considering the increasing threats that soils face, especially in semi-arid Mediterranean environments like Cyprus (erosion, contamination, sealing and salinisation), producing a high-resolution, reliable soil map is essential for further soil conservation studies. This study aims to create a 1:50,000 soil map covering the area under the direct control of the Republic of Cyprus (5,760 km²). The study consists of two major steps. The first is the creation of a raster database of predictive variables selected according to the scorpan formula (McBratney et al., 2003). Of particular interest is the possibility of using, as soil properties, data from three older island-wide soil maps and the recently published geochemical atlas of Cyprus (Cohen et al., 2011). Ten highly characterizing elements were selected and used as predictors in the present study. For the other factors, the usual variables were used: temperature and aridity index for climate; total loss on ignition, vegetation and forestry type maps for organic matter; the DEM and related relief derivatives (slope, aspect, curvature, landscape units); bedrock, surficial geology and geomorphology (Noller, 2009) for parent material and age; and a sub-watershed map to better bound location related to parent material sources. In the second step, the digital soil map is created using the Random Forests package in R. Random Forests is a decision tree classification technique where many trees, instead of a single one, are developed and compared to increase the stability and the reliability of the prediction. The model is trained and verified on areas where published 1:25,000 soil maps obtained from field work are available, and it is then applied for predictive mapping to the other areas. Preliminary results obtained in a small area in the plain around the city of Lefkosia, where eight different soil classes are present, show that the method performs very well. 
The Random Forest approach leads to reproduce soil
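
    The "many trees instead of a single one" idea described in this record can be sketched as bootstrap resampling plus a majority vote. This is only a minimal Python illustration, not the R Random Forests workflow used in the study: the one-feature threshold learner (a "stump"), the toy data, and every function name below are assumptions made for illustration.

```python
import random
from collections import Counter

def fit_stump(rows):
    """Fit a one-split tree on (x, label) rows: keep the threshold with fewest errors."""
    best = None
    for t in sorted({x for x, _ in rows}):
        left = [lab for x, lab in rows if x <= t]
        right = [lab for x, lab in rows if x > t]
        left_lab = Counter(left).most_common(1)[0][0]
        right_lab = Counter(right).most_common(1)[0][0] if right else left_lab
        errors = (sum(lab != left_lab for lab in left)
                  + sum(lab != right_lab for lab in right))
        if best is None or errors < best[0]:
            best = (errors, t, left_lab, right_lab)
    _, t, left_lab, right_lab = best
    return lambda x: left_lab if x <= t else right_lab

def fit_forest(rows, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        sample = [rng.choice(rows) for _ in rows]
        forest.append(fit_stump(sample))
    return forest

def predict(forest, x):
    """Majority vote over all trees."""
    votes = Counter(tree(x) for tree in forest)
    return votes.most_common(1)[0][0]

# Hypothetical training data: one predictor value and a soil class label.
rows = [(0.1, "A"), (0.2, "A"), (0.3, "A"), (0.4, "A"),
        (0.6, "B"), (0.7, "B"), (0.8, "B"), (0.9, "B")]
forest = fit_forest(rows)
print(predict(forest, 0.2), predict(forest, 0.8))
```

    Individual stumps vary with their bootstrap sample, but the vote across trees is far more stable, which is the property the abstract refers to.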

  14. Morphological and molecular characteristics do not confirm popular classification of the Brazil nut tree in Acre, Brazil.

    PubMed

    Sujii, P S; Fernandes, E T M B; Azevedo, V C R; Ciampi, A Y; Martins, K; de O Wadt, L H

    2013-01-01

    In the State of Acre, the Brazil nut tree, Bertholletia excelsa (Lecythidaceae), is classified by the local population into two types according to morphological characteristics, including color and quality of wood, shape of the trunk and crown, and fruit production. We examined the reliability of this classification by comparing morphological and molecular data of four populations of Brazil nut trees from Vale do Rio Acre in the Brazilian Amazon. For the morphological analysis, we evaluated qualitative and quantitative information of the trees, fruits, and seeds. The molecular analysis was performed using RAPD and ISSR markers, with cluster analysis. Significant differences were found between the two types of Brazil nut trees for the characters diameter at breast height, fruit yield, fruit size, and number of seeds per fruit. Despite the significant correlation between the morphological characteristics and the popular classification, we observed all possible combinations of morphological characteristics in both types of Brazil nut trees. In some individuals, the classification did not correspond to any of the characteristics. The results obtained with molecular markers showed that the two locally classified types of Brazil nut trees did not differ genetically, indicating that there is no consistent separation between them. PMID:24089091

  15. Tree Crown Delineation on Vhr Aerial Imagery with Svm Classification Technique Optimized by Taguchi Method: a Case Study in Zagros Woodlands

    NASA Astrophysics Data System (ADS)

    Erfanifard, Y.; Behnia, N.; Moosavi, V.

    2013-09-01

    The Support Vector Machine (SVM) is a theoretically superior machine learning methodology with great results in classification of remotely sensed datasets. Determination of the optimal parameters applied in SVM is still vague to some scientists. In this research, it is suggested to use the Taguchi method to optimize these parameters. The objective of this study was to detect tree crowns on very high resolution (VHR) aerial imagery in Zagros woodlands by SVM optimized by the Taguchi method. A 30 ha plot of Persian oak (Quercus persica) coppice trees was selected in Zagros woodlands, Iran. The VHR aerial imagery of the plot with 0.06 m spatial resolution was obtained from the National Geographic Organization (NGO), Iran, to extract the crowns of Persian oak trees in this study. The SVM parameters were optimized by the Taguchi method and, thereafter, the imagery was classified by the SVM with optimal parameters. The results showed that the Taguchi method is a very useful approach to optimize the combination of parameters of SVM. It was also concluded that the SVM method could detect the tree crowns with a KHAT coefficient of 0.961, which showed strong agreement with the observed samples, and an overall accuracy of 97.7%, which reflects the accuracy of the final map. Finally, the authors suggest applying this method to optimize the parameters of classification techniques like SVM.
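
    The two accuracy measures quoted in this record, overall accuracy and the KHAT (kappa) coefficient, are both derived from a confusion matrix. The following Python sketch shows the standard computation; the 2x2 matrix is made-up illustrative data (not the study's results) and the function names are assumptions.

```python
def overall_accuracy(cm):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total

def kappa(cm):
    """Cohen's kappa: agreement corrected for chance agreement."""
    k = len(cm)
    total = sum(sum(row) for row in cm)
    p_observed = overall_accuracy(cm)
    row_totals = [sum(row) for row in cm]
    col_totals = [sum(cm[i][j] for i in range(k)) for j in range(k)]
    p_chance = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (p_observed - p_chance) / (1 - p_chance)

# Hypothetical counts: rows = reference class, columns = mapped class
# (crown, background).
cm = [[40, 10],
      [10, 40]]
print(overall_accuracy(cm), kappa(cm))
```

    Kappa is lower than overall accuracy here because half of the observed agreement would be expected by chance with two equally frequent classes.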

  16. Tree-Level Hydrodynamic Approach for Improved Stomatal Conductance Parameterization

    NASA Astrophysics Data System (ADS)

    Mirfenderesgi, G.; Bohrer, G.; Matheny, A. M.; Ivanov, V. Y.

    2014-12-01

    The land-surface models do not mechanistically resolve hydrodynamic processes within the tree. The Finite-Elements Tree-Crown Hydrodynamics model version 2 (FETCH2) is based on the previous FETCH model approach, but with finite difference numerics and a simplified single-beam conduit system. FETCH2 simulates water flow through the tree as a simplified system of porous media conduits. It explicitly resolves spatiotemporal hydraulic stresses throughout the tree's vertical extent that cannot be easily represented using other stomatal-conductance models. Empirical equations relate water potential at the stem to stomatal conductance at leaves connected to the stem (through unresolved branches) at that height. While highly simplified, this approach brings some realism to the simulation of stomatal conductance because the stomata can respond to stem water potential, rather than an assumed direct relationship with soil moisture, as is currently the case in almost all models. By enabling mechanistic simulation of hydrological traits, such as xylem conductivity, conductive area per DBH, vertical distribution of leaf area and maximal and minimal water content in the xylem, and their effect on the dynamics of water flow in the tree system, the FETCH2 modeling system enhanced our understanding of the role of hydraulic limitations on an experimental forest plot short-term water stresses that lead to tradeoffs between water and light availability for transpiring leaves in forest ecosystems. FETCH2 is particularly suitable to resolve the effects of structural differences between tree and species and size groups, and the consequences of differences in hydraulic strategies of different species. We leverage a large dataset of sap flow from 60 trees of 4 species at our experimental plot at the University of Michigan Biological Station. 
Comparison of the sap flow and transpiration patterns in this site and an undisturbed control site shows significant difference in hydraulic strategies

  17. A simulation approach for change-points on phylogenetic trees.

    PubMed

    Persing, Adam; Jasra, Ajay; Beskos, Alexandros; Balding, David; De Iorio, Maria

    2015-01-01

    We observe n sequences at each of m sites and assume that they have evolved from an ancestral sequence that forms the root of a binary tree of known topology and branch lengths, but the sequence states at internal nodes are unknown. The topology of the tree and branch lengths are the same for all sites, but the parameters of the evolutionary model can vary over sites. We assume a piecewise constant model for these parameters, with an unknown number of change-points and hence a transdimensional parameter space over which we seek to perform Bayesian inference. We propose two novel ideas to deal with the computational challenges of such inference. Firstly, we approximate the model based on the time machine principle: the top nodes of the binary tree (near the root) are replaced by an approximation of the true distribution; as more nodes are removed from the top of the tree, the cost of computing the likelihood is reduced linearly in n. The approach introduces a bias, which we investigate empirically. Secondly, we develop a particle marginal Metropolis-Hastings (PMMH) algorithm, that employs a sequential Monte Carlo (SMC) sampler and can use the first idea. Our time-machine PMMH algorithm copes well with one of the bottle-necks of standard computational algorithms: the transdimensional nature of the posterior distribution. The algorithm is implemented on simulated and real data examples, and we empirically demonstrate its potential to outperform competing methods based on approximate Bayesian computation (ABC) techniques. PMID:25506749

  18. Identifying Population Groups with Low Palliative Care Program Enrolment Using Classification and Regression Tree Analysis

    PubMed Central

    Gao, Jun; Lavergne, M. Ruth; McIntyre, Paul

    2013-01-01

    Classification and regression tree (CART) analysis was used to identify subpopulations with lower palliative care program (PCP) enrolment rates. CART analysis uses recursive partitioning to group predictors. The PCP enrolment rate was 72 percent for the 6,892 adults who died of cancer between 2000 and 2005 in two counties in Nova Scotia, Canada. The lowest PCP enrolment rates were for nursing home residents over 82 years (27 percent), a group residing more than 43 kilometres from the PCP (31 percent), and another group living less than two weeks after their cancer diagnosis (37 percent). The highest rate (86 percent) was for the 2,118 persons who received palliative radiation. Findings from multiple logistic regression (MLR) were provided for comparison. CART findings identified low PCP enrolment subpopulations that were defined by interactions among demographic, social, medical, and health system predictors. PMID:21805944
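
    Recursive partitioning, as used by CART, repeatedly picks the predictor threshold that best separates the outcome, then repeats the search inside each resulting group. The Python sketch below shows a single such step using the Gini impurity, the usual CART criterion for classification. The data are invented for illustration and the function names are assumptions; this is not the software or data used in the study.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 outcomes."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2 * p * (1 - p)

def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best split 'value <= threshold'."""
    pairs = list(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))           # no split: impurity of the whole group
    for t in sorted(set(values))[:-1]:    # splitting at the maximum is pointless
        left = [y for x, y in pairs if x <= t]
        right = [y for x, y in pairs if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (t, score)
    return best

# Hypothetical records: a numeric predictor and a binary outcome (1 = enrolled).
ages = [60, 65, 70, 75, 80, 85, 90, 95]
enrolled = [1, 1, 1, 1, 1, 0, 0, 0]
print(best_split(ages, enrolled))
```

    A full tree would call `best_split` again on the left and right groups, over every available predictor, until a stopping rule is met.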

  19. Classification tree models for predicting distributions of Michigan stream fish from landscape variables

    USGS Publications Warehouse

    Steen, P.J.; Zorn, T.G.; Seelbach, P.W.; Schaeffer, J.S.

    2008-01-01

    Traditionally, fish habitat requirements have been described from local-scale environmental variables. However, recent studies have shown that studying landscape-scale processes improves our understanding of what drives species assemblages and distribution patterns across the landscape. Our goal was to learn more about constraints on the distribution of Michigan stream fish by examining landscape-scale habitat variables. We used classification trees and landscape-scale habitat variables to create and validate presence-absence models and relative abundance models for Michigan stream fishes. We developed 93 presence-absence models that on average were 72% correct in making predictions for an independent data set, and we developed 46 relative abundance models that were 76% correct in making predictions for independent data. The models were used to create statewide predictive distribution and abundance maps that have the potential to be used for a variety of conservation and scientific purposes. © Copyright by the American Fisheries Society 2008.

  20. Genetic Algorithms and Classification Trees in Feature Discovery: Diabetes and the NHANES database

    SciTech Connect

    Heredia-Langner, Alejandro; Jarman, Kristin H.; Amidan, Brett G.; Pounds, Joel G.

    2013-09-01

    This paper presents a feature selection methodology that can be applied to datasets containing a mixture of continuous and categorical variables. Using a Genetic Algorithm (GA), this method explores a dataset and selects a small set of features relevant for the prediction of a binary (1/0) response. Binary classification trees and an objective function based on conditional probabilities are used to measure the fitness of a given subset of features. The method is applied to health data in order to find factors useful for the prediction of diabetes. Results show that our algorithm is capable of narrowing down the set of predictors to around 8 factors that can be validated using reputable medical and public health resources.

  1. Classification of oxide glasses: A polarizability approach

    SciTech Connect

    Dimitrov, Vesselin; Komatsu, Takayuki . E-mail: komatsu@chem.nagaokaut.ac.jp

    2005-03-15

    A classification of binary oxide glasses has been proposed taking into account the values obtained on their refractive index-based oxide ion polarizability $\alpha_{O^{2-}}(n_0)$, optical basicity $\Lambda(n_0)$, metallization criterion $M(n_0)$, interaction parameter $A(n_0)$, and ion's effective charges as well as O1s and metal binding energies determined by XPS. Four groups of oxide glasses have been established: glasses formed by two glass-forming acidic oxides; glasses formed by glass-forming acidic oxide and modifier's basic oxide; glasses formed by glass-forming acidic and conditional glass-forming basic oxide; glasses formed by two basic oxides. The role of electronic ion polarizability in chemical bonding of oxide glasses has been also estimated. Good agreement has been found with the previous results concerning classification of simple oxides. The results obtained probably provide good basis for prediction of type of bonding in oxide glasses on the basis of refractive index as well as for prediction of new nonlinear optical materials.

  2. Discriminating Geriatric and Nongeriatric Patients Using Functional Status Information: An Example of Classification Tree Analysis via UniODA.

    ERIC Educational Resources Information Center

    Yarnold, Paul R.

    1996-01-01

    A procedure is described that involves iterative use of univariable optimal discriminant analysis (UniODA) to construct a classification tree model for discriminating observations from different groups. The procedure is illustrated using an application that involved discriminating 125 geriatric and nongeriatric patients on the basis of their…

  3. Identification of Sexually Abused Female Adolescents at Risk for Suicidal Ideations: A Classification and Regression Tree Analysis

    ERIC Educational Resources Information Center

    Brabant, Marie-Eve; Hebert, Martine; Chagnon, Francois

    2013-01-01

    This study explored the clinical profiles of 77 female teenager survivors of sexual abuse and examined the association of abuse-related and personal variables with suicidal ideations. Analyses revealed that 64% of participants experienced suicidal ideations. Findings from classification and regression tree analysis indicated that depression,…

  4. Tree species classification in the Southern Sierra Nevada Mountains based on MASTER and LIDAR imagery

    NASA Astrophysics Data System (ADS)

    Gibbons, S.; Grigsby, S.; Ustin, S.

    2013-12-01

    NASA recently collected MASTER (MODIS/ASTER) imagery over the Southern Sierra Nevada Mountains as part of the HyspIRI (Hyperspectral Infrared Imager) preparatory campaign, a location that was chosen for its distinct changes in vegetative species with elevation. Differentiation between functional types based on spectral data has been successful; however, classification between individual species is more difficult to accomplish with only the visible and near infrared portions of the spectrum. I used MASTER imagery in combination with Critical Zone Observatory LIDAR data to map species across both a low and high elevation site in the San Joaquin Experimental Range. While the visible and thermal bands of MASTER images provided an improved classification over shortwave bands, the physical characteristics from the LIDAR data showed the most contrast between the land covers, including tree species. The National Ecological Observation Network (NEON) plans to use LIDAR and spectral data to monitor 20 domains, including the San Joaquin Experimental Range, for the next thirty years. Understanding the current species distributions not only provides insight on the available resources of the area but will also act as a baseline to determine the effects of environmental changes on vegetation using future NEON data.

  5. Effect of training characteristics on object classification: An application using Boosted Decision Trees

    NASA Astrophysics Data System (ADS)

    Sevilla-Noarbe, I.; Etayo-Sotos, P.

    2015-06-01

    We present an application of a particular machine-learning method (Boosted Decision Trees, BDTs, using AdaBoost) to separate stars and galaxies in photometric images using their catalog characteristics. BDTs are a well-established machine learning technique used for classification purposes. They have been widely used, especially in the field of particle and astroparticle physics, and we use them here in an optical astronomy application. This algorithm is able to improve on simple thresholding cuts on standard separation variables that may be affected by local effects such as blending or badly calculated background levels, or that do not include information in other bands. The improvements are shown using the Sloan Digital Sky Survey Data Release 9, with respect to the type photometric classifier. We obtain an improvement in the impurity of the galaxy sample of a factor 2-4 for this particular dataset, adjusting for the same efficiency of the selection. Another main goal of this study is to verify the effects that different input vectors and training sets have on the classification performance, the results being of wider use to other machine learning techniques.
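
    To make the contrast between a single thresholding cut and a boosted ensemble concrete, here is a minimal AdaBoost sketch in Python with one-threshold decision stumps as weak learners. It is an illustration under stated assumptions, not the authors' pipeline: the single feature, the toy labels (+1 for one class, -1 for the other) and all names are hypothetical, and no real catalog data are involved.

```python
import math

def train_adaboost(xs, ys, rounds=3):
    """xs: feature values, ys: labels in {+1, -1}.
    Returns a list of (alpha, threshold, polarity) weighted stumps."""
    n = len(xs)
    w = [1.0 / n] * n                      # start with uniform sample weights
    model = []
    for _ in range(rounds):
        best = None
        for t in sorted(set(xs)):
            for pol in (1, -1):            # stump: pol if x <= t else -pol
                preds = [pol if x <= t else -pol for x in xs]
                err = sum(wi for wi, p, y in zip(w, preds, ys) if p != y)
                if best is None or err < best[0]:
                    best = (err, t, pol, preds)
        err, t, pol, preds = best
        err = min(max(err, 1e-10), 1 - 1e-10)   # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, t, pol))
        # Increase the weight of misclassified samples, then renormalise.
        w = [wi * math.exp(-alpha * y * p) for wi, y, p in zip(w, ys, preds)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def predict(model, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(a * (pol if x <= t else -pol) for a, t, pol in model)
    return 1 if score >= 0 else -1

# Toy data that no single threshold can separate.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
model = train_adaboost(xs, ys, rounds=3)
print([predict(model, x) for x in xs])
```

    The best single stump misclassifies three of the ten points, while the weighted vote of three stumps classifies all of them correctly, which is the sense in which boosting improves on a simple cut.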

  6. Object classification in images for Epo doping control based on fuzzy decision trees

    NASA Astrophysics Data System (ADS)

    Bajla, Ivan; Hollander, Igor; Heiss, Dorothea; Granec, Reinhard; Minichmayr, Markus

    2005-02-01

    Erythropoietin (Epo) is a hormone which can be misused as a doping substance. Its detection involves analysis of images containing specific objects (bands), whose position and intensity are critical for doping positivity. Within a research project of the World Anti-Doping Agency (WADA) we are implementing the GASepo software that should serve for Epo testing in doping control laboratories world-wide. For identification of the bands we have developed a segmentation procedure based on a sequence of filters and edge detectors. Whereas all true bands are properly segmented, the procedure generates a relatively high number of false positives (artefacts). To separate these artefacts we suggested a post-segmentation supervised classification using real-valued geometrical measures of objects. The method is based on the ID3 (Ross Quinlan's) rule generation method, where fuzzy representation is used for linking the linguistic terms to quantitative data. The fuzzy modification of the ID3 method provides a framework that generates fuzzy decision trees, as well as fuzzy sets for input data. Using the MLTTM software (Machine Learning Framework) we have generated a set of fuzzy rules explicitly describing bands and artefacts. The method eliminated most of the artefacts. The contribution includes a comparison of the obtained misclassification errors to the errors produced by some other statistical classification methods.
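
    ID3 grows a tree by choosing, at each node, the attribute with the highest information gain. The Python sketch below shows that selection criterion for the crisp (non-fuzzy) case only; the fuzzy modification described in this record is not reproduced. The attribute names, linguistic terms and rows are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, attr, target="label"):
    """Entropy reduction obtained by splitting rows on a categorical attribute."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attr] for r in rows}:
        subset = [r[target] for r in rows if r[attr] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder

# Hypothetical segmented objects described by linguistic terms.
rows = [
    {"elongation": "high", "area": "large", "label": "band"},
    {"elongation": "high", "area": "small", "label": "band"},
    {"elongation": "low", "area": "large", "label": "artefact"},
    {"elongation": "low", "area": "small", "label": "artefact"},
]
for attr in ("elongation", "area"):
    print(attr, information_gain(rows, attr))
```

    In this toy set `elongation` separates the two classes perfectly while `area` carries no information, so ID3 would split on `elongation` first.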

  7. Autoimmune hemolytic anemia: classification and therapeutic approaches.

    PubMed

    Sève, Pascal; Philippe, Pierre; Dufour, Jean-François; Broussolle, Christiane; Michel, Marc

    2008-12-01

    Autoimmune hemolytic anemia (AIHA) is a relatively uncommon cause of anemia. Classifications of AIHA include warm AIHA, cold AIHA (including mainly chronic cold agglutinin disease and paroxysmal cold hemoglobinuria), mixed-type AIHA and drug-induced AIHA. AIHA may also be further subdivided on the basis of etiology. Management of AIHA is based mainly on empirical data and on small, retrospective, uncontrolled studies. The therapeutic options for treating AIHA are increasing with monoclonal antibodies and, potentially, complement inhibitory drugs. Based on data available in the literature and our experience, we propose algorithms for the treatment of warm AIHA and cold agglutinin disease in adults. Therapeutic trials are needed in order to better stratify treatment, taking into account the promising efficacy of rituximab. PMID:21082924

  8. Color Image Magnification: Geometrical Pattern Classification Approach

    NASA Astrophysics Data System (ADS)

    Yong, Tien Fui; Choo, Wou Onn; Meian Kok, Hui

    In an era where technology keeps advancing, it is vital that high-resolution images are available to produce high-quality displayed images and fine-quality prints. The problem is that it is quite impossible to produce high-resolution images with acceptable clarity even with the latest digital cameras. Therefore, there is a need to enlarge the original images using an effective and efficient algorithm. The main contribution of this paper is to produce an enlarged color image with high visual quality, up to four times the original size of a 100x100 pixel image. In the classification phase, the basic idea is to separate the interpolation region in the form of a geometrical shape. Then, in the intensity determination phase, the interpolator assigns a proper color intensity value to the undefined pixel inside the interpolation region. This paper discusses the problem statement, literature review, research methodology, research outcome, initial results, and finally, the conclusion.

  9. A Neuro-Fuzzy Approach in the Classification of Students' Academic Performance

    PubMed Central

    2013-01-01

    Classifying the student academic performance with high accuracy facilitates admission decisions and enhances educational services at educational institutions. The purpose of this paper is to present a neuro-fuzzy approach for classifying students into different groups. The neuro-fuzzy classifier used previous exam results and other related factors as input variables and labeled students based on their expected academic performance. The results showed that the proposed approach achieved a high accuracy. The results were also compared with those obtained from other well-known classification approaches, including support vector machine, Naive Bayes, neural network, and decision tree approaches. The comparative analysis indicated that the neuro-fuzzy approach performed better than the others. It is expected that this work may be used to support student admission procedures and to strengthen the services of educational institutions. PMID:24302928

  10. Identifying tree crown delineation shapes and need for remediation on high resolution imagery using an evidence based approach

    NASA Astrophysics Data System (ADS)

    Leckie, Donald G.; Walsworth, Nicholas; Gougeon, François A.

    2016-04-01

    In order to fully realize the benefits of automated individual tree mapping for tree species, health, forest inventory attribution and forest management decision making, the tree delineations should be as good as possible. The concept of identifying poorly delineated tree crowns and suggesting likely types of remediation was investigated. Delineations (isolations or isols) were classified into shape types reflecting whether they were realistic tree shapes and the likely kind of remediation needed. Shape type was classified by an evidence based rules approach using primitives based on isol size, shape indices, morphology, the presence of local maxima, and matches with template models representing trees of different sizes. A test set containing 50,000 isols based on an automated tree delineation of 40 cm multispectral airborne imagery of a diverse temperate-boreal forest site was used. Isolations representing single trees or several trees were the focus, as opposed to cases where a tree is split into several isols. For eight shape classes from regular through to convolute, shape classification accuracy was in the order of 62%; simplifying to six classes accuracy was 83%. Shape type did give an indication of the type of remediation and there were 6% false alarms (i.e., isols classed as needing remediation but did not). Alternately, there were 5% omissions (i.e., isols of regular shape and not earmarked for remediation that did need remediation). The usefulness of the concept of identifying poor delineations in need of remediation was demonstrated and one suite of methods developed and shown to be effective.

  11. Annual Crop Type Classification of the U.S. Great Plains for 2000 - 2011: An Application of Classification Tree Modeling using Remote Sensing and Ancillary Environmental Data (Invited)

    NASA Astrophysics Data System (ADS)

    Howard, D. M.; Wylie, B. K.

    2013-12-01

    The purpose of this study was to increase spatial and temporal availability of crop classification data using reliable source data that have the potential of being applied on local, regional, national, and global levels. This study implemented classification tree modeling to map annual crop types throughout the U.S. Great Plains from 2000 - 2011. Classification tree modeling has been shown in numerous studies to be an effective tool for developing classification models. In this study, nearly 18 million crop observation points, derived from annual U.S. Department of Agriculture (USDA) National Agriculture Statistics Service (NASS) Cropland Data Layers (CDLs), were used in the training, development, and validation of a classification tree crop type model (CTM). Each observation point was further defined by weekly Normalized Difference Vegetation Index (NDVI) readings, annual climatic conditions, soil conditions, and a number of other biogeophysical environmental characteristics. The CTM accounted for the most prevalent crop types in the area, including corn, soybeans, winter wheat, spring wheat, cotton, sorghum, and alfalfa. Other crops that did not fit into any of these classes were identified and grouped into a miscellaneous class. An 87% success rate was achieved on the classification of 1.8 million observation points (10% of total observation points) that were withheld from training. The CTM was applied to create annual crop maps of the U.S. Great Plains for 2000 - 2011 at a spatial resolution of 250 meters. Product validation was performed by comparing county acreage derived from the modeled crop maps and county acreage data from the USDA NASS Survey Program for each crop type and each year. More than 15,000 county records from 2001 - 2010 were compared with a Pearson's correlation coefficient of r = 0.87.
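
    Two validation steps appear in this record: withholding 10% of the observation points from training, and comparing modeled with surveyed county acreage through Pearson's correlation coefficient. Both can be sketched in a few lines of Python. The acreage numbers are invented and the function names are assumptions; only the 10% fraction comes from the abstract.

```python
import math
import random

def holdout_split(points, frac=0.10, seed=0):
    """Shuffle a copy of the points and withhold a fraction for validation."""
    rng = random.Random(seed)
    shuffled = points[:]
    rng.shuffle(shuffled)
    n_held = int(round(len(shuffled) * frac))
    return shuffled[n_held:], shuffled[:n_held]   # (training, held out)

def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

training, held_out = holdout_split(list(range(20)))

# Hypothetical county acreages: modeled map versus survey.
modeled = [100, 200, 300, 400]
survey = [110, 190, 310, 390]
print(len(training), len(held_out), pearson_r(modeled, survey))
```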

  12. A Hybrid Sensing Approach for Pure and Adulterated Honey Classification

    PubMed Central

    Subari, Norazian; Saleh, Junita Mohamad; Shakaff, Ali Yeon Md; Zakaria, Ammar

    2012-01-01

    This paper presents a comparison between data from single modality and fusion methods to classify Tualang honey as pure or adulterated using Linear Discriminant Analysis (LDA) and Principal Component Analysis (PCA) statistical classification approaches. Ten different brands of certified pure Tualang honey were obtained throughout peninsular Malaysia and Sumatera, Indonesia. Various concentrations of two types of sugar solution (beet and cane sugar) were used in this investigation to create honey samples of 20%, 40%, 60% and 80% adulteration concentrations. Honey data extracted from an electronic nose (e-nose) and Fourier Transform Infrared Spectroscopy (FTIR) were gathered, analyzed and compared based on fusion methods. Visual observation of classification plots revealed that the PCA approach was able to distinguish pure and adulterated honey samples better than the LDA technique. Overall, the validated classification results based on FTIR data (88.0%) gave higher classification accuracy than e-nose data (76.5%) using the LDA technique. Honey classification based on normalized low-level and intermediate-level FTIR and e-nose fusion data scored classification accuracies of 92.2% and 88.7%, respectively, using the Stepwise LDA method. The results suggested that pure and adulterated honey samples were better classified using FTIR and e-nose fusion data than single modality data. PMID:23202033

  13. Rational approaches to improving the isolation of endophytic actinobacteria from Australian native trees.

    PubMed

    Kaewkla, Onuma; Franco, Christopher M M

    2013-02-01

    In recent years, new actinobacterial species have been isolated as endophytes of plants and shrubs and are sought after both for their role as potential producers of new drug candidates for the pharmaceutical industry and as biocontrol inoculants for sustainable agriculture. Molecular-based approaches to the study of microbial ecology generally reveal a broader microbial diversity than can be obtained by cultivation methods. This study aimed to improve the success of isolating individual members of the actinobacterial population as pure cultures as well as improving the ability to characterise the large numbers obtained in pure culture. To achieve this objective, our study successfully employed rational and holistic approaches including the use of isolation media with low concentrations of nutrients normally available to the microorganism in the plant, plating larger quantities of plant sample, incubating isolation plates for up to 16 weeks, excising colonies when they are visible and choosing Australian endemic trees as the source of the actinobacteria. A hierarchy of polyphasic methods based on culture morphology, amplified 16S rRNA gene restriction analysis and limited sequencing was used to classify all 576 actinobacterial isolates from leaf, stem and root samples of two eucalypts: a Grey Box and Red Gum, a native apricot tree and a native pine tree. The classification revealed that, in addition to 413 Streptomyces spp., isolates belonged to 16 other actinobacterial genera: Actinomadura (two strains), Actinomycetospora (six), Actinopolymorpha (two), Amycolatopsis (six), Gordonia (one), Kribbella (25), Micromonospora (six), Nocardia (ten), Nocardioides (11), Nocardiopsis (one), Nonomuraea (one), Polymorphospora (two), Promicromonospora (51), Pseudonocardia (36), Williamsia (two) and a novel genus Flindersiella (one). In order to prove novelty, 12 strains were characterised fully to the species level based on polyphasic taxonomy. One strain represented a novel

  14. An efficient tree classifier ensemble-based approach for pedestrian detection.

    PubMed

    Xu, Yanwu; Cao, Xianbin; Qiao, Hong

    2011-02-01

    Classification-based pedestrian detection systems (PDSs) are currently a hot research topic in the field of intelligent transportation. A PDS detects pedestrians in real time on moving vehicles. A practical PDS demands not only high detection accuracy but also high detection speed. However, most of the existing classification-based approaches mainly seek high detection accuracy, while the detection speed is not purposely optimized for practical application. At the same time, the performance, particularly the speed, is primarily tuned based on experiments without theoretical foundations, leading to a long training procedure. This paper starts with measuring and optimizing detection speed, and then a practical classification-based pedestrian detection solution with high detection speed and training speed is described. First, an extended classification/detection speed metric, named feature-per-object (fpo), is proposed to measure the detection speed independently from execution. Then, an fpo minimization model with accuracy constraints is formulated based on a tree classifier ensemble, where the minimum fpo can guarantee the highest detection speed. Finally, the minimization problem is solved efficiently by using nonlinear fitting based on radial basis function neural networks. In addition, the optimal solution is directly used to instruct classifier training; thus, the training speed could be accelerated greatly. Therefore, a rapid and accurate classification-based detection technique is proposed for the PDS. Experimental results on urban traffic videos show that the proposed method has a high detection speed with an acceptable detection rate and a false-alarm rate for onboard detection; moreover, the training procedure is also very fast. PMID:20457550

  15. Multidisciplinary approach to tumors of the pancreas and biliary tree.

    PubMed

    Brown, Kimberly M

    2009-02-01

    Tumors of the pancreas and biliary tree remain formidable challenges to patients and clinicians. These tumors elude early detection, rapidly spread locally and systemically, and frequently recur despite apparently complete resection. Cystic tumors of the pancreas, however, may represent a subset of patients who do not uniformly require aggressive resection, and a thoughtful, evidence-based approach to work-up allows for the rational application of surgical therapy. Increasing evidence supports treating patients who have pancreaticobiliary disease in a multidisciplinary setting. PMID:19186234

  16. Impact of atmospheric correction and image filtering on hyperspectral classification of tree species using support vector machine

    NASA Astrophysics Data System (ADS)

    Shahriari Nia, Morteza; Wang, Daisy Zhe; Bohlman, Stephanie Ann; Gader, Paul; Graves, Sarah J.; Petrovic, Milenko

    2015-01-01

    Hyperspectral images can be used to identify savannah tree species at the landscape scale, which is a key step in measuring biomass and carbon, and tracking changes in species distributions, including invasive species, in these ecosystems. Before automated species mapping can be performed, image processing and atmospheric correction are often performed, which can potentially affect the performance of classification algorithms. We determine how three processing and correction techniques (atmospheric correction, Gaussian filters, and shade/green vegetation filters) affect the prediction accuracy of classification of tree species at pixel level from airborne visible/infrared imaging spectrometer imagery of longleaf pine savanna in Central Florida, United States. Species classification using fast line-of-sight atmospheric analysis of spectral hypercubes (FLAASH) atmospheric correction outperformed ATCOR in the majority of cases. Green vegetation (normalized difference vegetation index) and shade (near-infrared) filters did not increase classification accuracy when applied to large and continuous patches of specific species. Finally, applying a Gaussian filter reduces interband noise and increases species classification accuracy. Using the optimal preprocessing steps, our classification accuracy of six species classes is about 75%.
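
    The green vegetation and shade filters mentioned in this record rest on simple per-pixel band arithmetic. The Python sketch below computes the normalized difference vegetation index, NDVI = (NIR - Red) / (NIR + Red), and applies two masks. The threshold values, the reflectance numbers and the function names are assumptions for illustration; the abstract does not state the thresholds used.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index for one pixel."""
    denom = nir + red
    if denom == 0:
        return 0.0            # avoid division by zero on empty pixels
    return (nir - red) / denom

def keep_pixel(nir, red, ndvi_min=0.4, nir_min=0.2):
    """Keep a pixel only if it is green (NDVI) and not shaded (NIR).
    Both thresholds are hypothetical."""
    return ndvi(nir, red) >= ndvi_min and nir >= nir_min

# Hypothetical (NIR, red) reflectance pairs.
pixels = {
    "sunlit canopy": (0.50, 0.10),
    "shaded canopy": (0.15, 0.03),
    "bare soil": (0.30, 0.25),
}
for name, (nir, red) in pixels.items():
    print(name, round(ndvi(nir, red), 3), keep_pixel(nir, red))
```

    Note that the shaded canopy pixel still has a high NDVI, which is why a separate near-infrared threshold is used to flag shade.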

  17. An approach for quantifying the efficacy of ecological classification schemes as management tools

    NASA Astrophysics Data System (ADS)

    Flanagan, A. M.; Cerrato, R. M.

    2015-10-01

    Rigorous assessments of ecological classification schemes being applied to submerged environments are needed to evaluate their utility as management tools. Verification that a scheme can quantitatively capture habitat and community variation would be of considerable value to individuals responsible for making difficult management decisions relevant to widespread environmental challenges including those in fisheries, preservation or restoration of critical habitats, and climate change. In this paper, an assessment approach that evaluates a scheme by treating it like a quantitative statistical model is presented. It couples two direct gradient, multivariate statistical techniques, multivariate regression trees (MRT) and redundancy analysis (RDA), with a modelling protocol involving model formulation, model selection, parameter estimation, and measurement of precision to produce a very flexible strategy for analyzing structure in ecological data. To illustrate the proposed approach, the assessment focused on benthic infauna and evaluating the Folk grain size classification scheme, along with some alternative grain size models. Analysis of data sets revealed that while it was fairly easy to uncover biotic-environmental relationships that were over-fitted, the community structure inherent in the data tended to be robustly discernible and preserved across all grain size models, but rigidly parameterized models (i.e., a one size fits all approach for grain size characterization with fixed boundaries) were generally ineffective. The proposed approach provided a clear, detailed, and rigorous assessment of Folk and several alternative models and can be used for the quantitative evaluation of existing ecological classification schemes and/or in the development of new schemes.

  18. An overview of the phase-modular fault tree approach to phased mission system analysis

    NASA Technical Reports Server (NTRS)

    Meshkat, L.; Xing, L.; Donohue, S. K.; Ou, Y.

    2003-01-01

    In this paper, we look at how fault tree analysis (FTA), a primary means of performing reliability analysis of phased mission systems (PMS), can meet this challenge by presenting an overview of the modular approach to solving fault trees that represent PMS.

  19. Classification

    NASA Astrophysics Data System (ADS)

    Oza, Nikunj

    2012-03-01

    would represent one sunspot’s classification (y_i) and the corresponding set of measurements (x_i). The output of a supervised learning algorithm is a model h that approximates the unknown mapping from the inputs to the outputs. In our example, h would map from the sunspot measurements to the type of sunspot. We may have a test set S, a set of examples not used in training that we use to test how well the model h predicts the outputs on new examples. Just as with the examples in T, the examples in S are assumed to be independent and identically distributed (i.i.d.) draws from the distribution D. We measure the error of h on the test set as the proportion of test cases that h misclassifies: $\frac{1}{|S|} \sum_{(x,y) \in S} I(h(x) \neq y)$, where I(v) is the indicator function: it returns 1 if v is true and 0 otherwise. In our sunspot classification example, we would identify additional examples of sunspots that were not used in generating the model, and use these to determine how accurate the model is, that is, the fraction of the test samples that the model classifies correctly. An example of a classification model is the decision tree shown in Figure 23.1. We will discuss the decision tree learning algorithm in more detail later; for now, we assume that, given a training set with examples of sunspots, this decision tree is derived. This can be used to classify previously unseen examples of sunspots. For example, if a new sunspot’s inputs indicate that its "Group Length" is in the range 10-15, then the decision tree would classify the sunspot as being of type "E," whereas if the "Group Length" is "NULL," the "Magnetic Type" is "bipolar," and the "Penumbra" is "rudimentary," then it would be classified as type "C." In this chapter, we will add to the above description of classification problems. We will discuss decision trees and several other classification models. In particular, we will discuss the learning algorithms that generate these classification models, how to use them to
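
    The test-set error defined above can be sketched in a few lines. The model h below encodes only the two decision rules quoted in the text; the rest of the tree in Figure 23.1 is not available here, so every other input falls through to "unknown". The dictionary keys, the test set S, and its labels are hypothetical.

```python
def misclassification_rate(h, S):
    """Fraction of examples (x, y) in S for which h(x) != y."""
    if not S:
        raise ValueError("test set S must not be empty")
    return sum(1 for x, y in S if h(x) != y) / len(S)


def h(x):
    """Hypothetical model covering only the two rules quoted in the text."""
    length = x.get("group_length")
    if length is not None and 10 <= length <= 15:
        return "E"
    if (length is None
            and x.get("magnetic_type") == "bipolar"
            and x.get("penumbra") == "rudimentary"):
        return "C"
    return "unknown"  # remaining branches of the tree are not given


# Hypothetical test set: (measurements, true type) pairs.
S = [
    ({"group_length": 12}, "E"),
    ({"group_length": None, "magnetic_type": "bipolar",
      "penumbra": "rudimentary"}, "C"),
    ({"group_length": 12}, "D"),
    ({"group_length": 3}, "A"),
]
error = misclassification_rate(h, S)
print(error)
```

    Accuracy, the fraction classified correctly, is simply one minus this error.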

  20. [It is normal for classification approaches to be diverse].

    PubMed

    Pavlinov, I Ia

    2003-01-01

    It is asserted that the postmodern concept of science, unlike the classical ideal, presumes the necessary existence of various classification approaches (schools) in taxonomy, each corresponding to a particular aspect of consideration of the "taxic reality". They are set up by diversity of initial epistemological and ontological backgrounds which fix in a certain way a) fragments of that reality allowable for investigation, and b) allowable methods of exploration of the fragments being fixed. It makes it possible to define a taxonomic school as a unity of the above backgrounds together with consideration aspect delimited by them. Two extreme positions of these backgrounds could be recognized in recent taxonomic thought. One of them follows the scholastic tradition of elaboration of a formal and, hence, universal classificatory method ("new typology", numerical phenetics, pattern cladistics). Another one asserts dependence of classificatory approach on the judgment of the nature of taxic reality (natural philosophy, evolutionary schools of taxonomy). Some arguments are put forward in favor of significant impact of evolutionary thinking on the theory of modern taxonomy. This impact is manifested by the correspondence principle which makes classificatory algorithms (and hence resulting classifications) dependent on initial assumptions about causes of taxic diversity. It is asserted that criteria of "quality" of both classifications proper and classificatory methods can be correctly formulated within the framework of a particular consideration aspect only. For any group of organisms, several particular classifications are rightful to exist, each corresponding to a particular consideration aspect. These classifications could not be arranged along the "better-worse" scale, as they reflect different fragments of the taxic reality. Their mutual interpretation depends on degree of compatibility of background assumptions and of the tasks being resolved. Extensionally

  1. Aerial Images from a UAV System: 3D Modeling and Tree Species Classification in a Park Area

    NASA Astrophysics Data System (ADS)

    Gini, R.; Passoni, D.; Pinto, L.; Sona, G.

    2012-07-01

    The use of aerial imagery acquired by Unmanned Aerial Vehicles (UAVs) is scheduled within the FoGLIE project (Fruition of Goods Landscape in Interactive Environment): it starts from the need to enhance the natural, artistic and cultural heritage, to produce a better usability of it by employing audiovisual movable systems of 3D reconstruction and to improve monitoring procedures, by using new media for integrating the fruition phase with the preservation phase. The pilot project focuses on a test area, Parco Adda Nord, which encloses various types of goods (small buildings, agricultural fields and different tree species and bushes). Multispectral high resolution images were taken by two digital compact cameras: a Pentax Optio A40 for RGB photos and a Sigma DP1 modified to acquire the NIR band. Then, some tests were performed in order to analyze the UAV images' quality for both photogrammetric and photo-interpretation purposes, to validate the vector-sensor system, the image block geometry and to study the feasibility of tree species classification. Many pre-signalized Control Points were surveyed through GPS to allow accuracy analysis. Aerial Triangulations (ATs) were carried out with photogrammetric commercial software, Leica Photogrammetry Suite (LPS) and PhotoModeler, with manual or automatic selection of Tie Points, to pick out pros and cons of each package in managing non-conventional aerial imagery as well as the differences in the modeling approach. Further analyses were carried out on the differences between the EO parameters and the corresponding data coming from the on-board UAV navigation system.

  2. A conceptual approach to approximate tree root architecture in infinite slope models

    NASA Astrophysics Data System (ADS)

    Schmaltz, Elmar; Glade, Thomas

    2016-04-01

    paraboloids represent a cordate-root-system with radius r, height h and a constant, species-independent curvature. This procedure simplifies the classification of tree species into the three defined geometric solids. In this study we introduce a conceptual approach to estimate the 2- and 3-dimensional distribution of different tree root systems, and to implement it in a raster environment, as it is used in infinite slope models. To this end, we used the PCRaster extension in a Python framework. The results show that root distribution and root growth are spatially reproducible in a simple raster framework. The outputs exhibit significant effects for a synthetically generated slope on local scale for equal time-steps. The preliminary results depict an initial step to develop a vegetation module that can be coupled with hydro-mechanical slope stability models. This approach is expected to yield a valuable contribution to the implementation of vegetation-related properties, in particular effects of root-reinforcement, into physically-based approaches using infinite slope models.

  3. Classification Algorithms for Big Data Analysis, a Map Reduce Approach

    NASA Astrophysics Data System (ADS)

    Ayma, V. A.; Ferreira, R. S.; Happ, P.; Oliveira, D.; Feitosa, R.; Costa, G.; Plaza, A.; Gamba, P.

    2015-03-01

    For many years, the scientific community has been concerned with how to increase the accuracy of different classification methods, and major achievements have been made so far. Besides this issue, the increasing amount of data generated every day by remote sensors raises further challenges. In this work, a tool within the scope of the InterIMAGE Cloud Platform (ICP), an open-source, distributed framework for automatic image interpretation, is presented. The tool, named ICP: Data Mining Package, is able to perform supervised classification procedures on huge amounts of data, usually referred to as big data, on a distributed infrastructure using Hadoop MapReduce. The tool has four classification algorithms implemented, taken from WEKA's machine learning library, namely: Decision Trees, Naïve Bayes, Random Forest and Support Vector Machines (SVM). The results of an experimental analysis using an SVM classifier on data sets of different sizes for different cluster configurations demonstrate the potential of the tool, as well as aspects that affect its performance.
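
    The general map/reduce pattern for applying an already trained classifier to partitioned data can be sketched as follows. This is a single-process illustration in plain Python, not the ICP tool, Hadoop, or WEKA; the `classify` rule, the record layout, and the chunking are all hypothetical.

```python
from collections import defaultdict


def classify(record):
    """Hypothetical stand-in for a trained classifier (one threshold rule)."""
    return "vegetation" if record["ndvi"] > 0.4 else "other"


def map_phase(chunk):
    """Map step: emit a (predicted_class, 1) pair for each record in a chunk."""
    return [(classify(record), 1) for record in chunk]


def reduce_phase(pairs):
    """Reduce step: sum the counts emitted for each class key."""
    counts = defaultdict(int)
    for key, value in pairs:
        counts[key] += value
    return dict(counts)


# Hypothetical data split into chunks, as a cluster would partition it.
chunks = [[{"ndvi": 0.7}, {"ndvi": 0.1}], [{"ndvi": 0.5}]]
mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
counts = reduce_phase(mapped)
print(counts)
```

    Because each chunk is mapped independently, the map step can run in parallel on separate nodes, which is what makes the pattern attractive for large remote sensing data sets.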

  4. Trees

    ERIC Educational Resources Information Center

    Al-Khaja, Nawal

    2007-01-01

    This is a thematic lesson plan for young learners about palm trees and the importance of taking care of them. The two part lesson teaches listening, reading and speaking skills. The lesson includes parts of a tree; the modal auxiliary, can; dialogues and a role play activity.

  5. Unified framework for triaxial accelerometer-based fall event detection and classification using cumulants and hierarchical decision tree classifier

    PubMed Central

    Kambhampati, Satya Samyukta; Singh, Vishal; Ramkumar, Barathram

    2015-01-01

    In this Letter, the authors present a unified framework for fall event detection and classification using the cumulants extracted from the acceleration (ACC) signals acquired using a single waist-mounted triaxial accelerometer. The main objective of this Letter is to find suitable representative cumulants and classifiers in effectively detecting and classifying different types of fall and non-fall events. It was discovered that the first level of the proposed hierarchical decision tree algorithm implements fall detection using fifth-order cumulants and support vector machine (SVM) classifier. In the second level, the fall event classification algorithm uses the fifth-order cumulants and SVM. Finally, human activity classification is performed using the second-order cumulants and SVM. The detection and classification results are compared with those of the decision tree, naive Bayes, multilayer perceptron and SVM classifiers with different types of time-domain features including the second-, third-, fourth- and fifth-order cumulants and the signal magnitude vector and signal magnitude area. The experimental results demonstrate that the second- and fifth-order cumulant features and SVM classifier can achieve optimal detection and classification rates of above 95%, as well as the lowest false alarm rate of 1.03%. PMID:26609414
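
    To make the cumulant features concrete, the sketch below computes zero-lag cumulants of orders two to five from the central moments of a signal window, using the standard relations κ2 = μ2, κ3 = μ3, κ4 = μ4 - 3μ2² and κ5 = μ5 - 10μ3μ2. Whether the Letter uses exactly these zero-lag population estimates is an assumption; the window values are hypothetical and not taken from the study.

```python
def central_moment(x, k):
    """k-th central moment of the samples in x (population estimate)."""
    mean = sum(x) / len(x)
    return sum((v - mean) ** k for v in x) / len(x)


def cumulants(x):
    """Zero-lag cumulants of orders 2 to 5, derived from central moments."""
    m2, m3, m4, m5 = (central_moment(x, k) for k in (2, 3, 4, 5))
    return {
        2: m2,                  # variance
        3: m3,                  # third cumulant equals third central moment
        4: m4 - 3 * m2 ** 2,
        5: m5 - 10 * m3 * m2,
    }


# Hypothetical acceleration window (arbitrary units), symmetric about zero.
window = [-2.0, -1.0, 0.0, 1.0, 2.0]
features = cumulants(window)
print(features)
```

    For a window that is symmetric about its mean, the odd-order cumulants vanish, so non-zero third- or fifth-order values indicate asymmetry in the signal. A feature vector of this kind would then be passed to a classifier such as an SVM.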

  6. Unified framework for triaxial accelerometer-based fall event detection and classification using cumulants and hierarchical decision tree classifier.

    PubMed

    Kambhampati, Satya Samyukta; Singh, Vishal; Manikandan, M Sabarimalai; Ramkumar, Barathram

    2015-08-01

    In this Letter, the authors present a unified framework for fall event detection and classification using the cumulants extracted from the acceleration (ACC) signals acquired using a single waist-mounted triaxial accelerometer. The main objective of this Letter is to find suitable representative cumulants and classifiers in effectively detecting and classifying different types of fall and non-fall events. It was discovered that the first level of the proposed hierarchical decision tree algorithm implements fall detection using fifth-order cumulants and support vector machine (SVM) classifier. In the second level, the fall event classification algorithm uses the fifth-order cumulants and SVM. Finally, human activity classification is performed using the second-order cumulants and SVM. The detection and classification results are compared with those of the decision tree, naive Bayes, multilayer perceptron and SVM classifiers with different types of time-domain features including the second-, third-, fourth- and fifth-order cumulants and the signal magnitude vector and signal magnitude area. The experimental results demonstrate that the second- and fifth-order cumulant features and SVM classifier can achieve optimal detection and classification rates of above 95%, as well as the lowest false alarm rate of 1.03%. PMID:26609414

  7. Improving Crop Classification Techniques Using Optical Remote Sensing Imagery, High-Resolution Agriculture Resource Inventory Shapefiles and Decision Trees

    NASA Astrophysics Data System (ADS)

    Melnychuk, A. L.; Berg, A. A.; Sweeney, S.

    2010-12-01

    Recognition of anthropogenic effects of land use management practices on bodies of water is important for remediating and preventing eutrophication. In the case of Lake Simcoe, Ontario, the main surrounding land use is agriculture. To better manage the nutrient flow into the lake, knowledge of the management of the agricultural land is important. For this basin, a comprehensive agricultural resource inventory is required for assessment of policy and for input into water quality management and assessment tools. Supervised decision tree classification schemes, used in many previous applications, have yielded reliable classifications in agricultural land-use systems. However, when using these classification techniques the user is confronted with numerous data sources. In this study we use a large inventory of optical satellite image products (Landsat, AWiFS, SPOT and MODIS) and ancillary data sources (temporal MODIS-NDVI product signatures, digital elevation models and soil maps) at various spatial and temporal resolutions in a decision tree classification scheme. The sensitivity of the classification accuracy to various products is assessed to identify optimal data sources for classifying crop systems.

  8. Pattern classification approach to rocket engine diagnostics

    SciTech Connect

    Tulpule, S.

    1989-01-01

    This paper presents a systems-level approach to integrate state-of-the-art rocket engine technology with advanced computational techniques to develop an integrated diagnostic system (IDS) for future rocket propulsion systems. The key feature of this IDS is the use of advanced diagnostic algorithms for failure detection as opposed to the current practice of redline-based failure detection methods. The paper presents a top-down analysis of rocket engine diagnostic requirements, rocket engine operation, applicable diagnostic algorithms, and algorithm design techniques, which serve as a basis for the IDS. The concepts of hierarchical, model-based information processing are described, together with the use of signal processing, pattern recognition, and artificial intelligence techniques which are an integral part of this diagnostic system. 27 refs.

  9. Comparing ANNs, EAs, and Trees: a basic machine-learning approach to predictive environmental models.

    NASA Astrophysics Data System (ADS)

    Williams, J.; Poff, N.

    2005-05-01

    Machine learning techniques for ecological applications or "eco-informatics" are becoming increasingly useful and accessible for ecologists. We evaluated the predictive ability of three commercially available (i.e. user-friendly) software packages for artificial neural networks (ANNs), evolutionary algorithms (EAs), and classification/regression trees (Trees). We analyzed fish and habitat data for streams in the mid-Atlantic region of the U.S., which was collected by the U.S. Environmental Protection Agency (EPA). The data includes over 200 environmental descriptors summarizing watershed, stream, and water chemistry characteristics in addition to derived fish community metrics (i.e. richness, IBI scores, % exotics). In our analysis we predicted individual species presence/absence and fish community metrics as a function of these local and regional scale habitat variables. Predictive ability is evaluated with independent validation data. These approaches could prove especially useful for conservation or management applications where ecologists seek to utilize the most comprehensive data to make predictions at various scales. By employing "user-friendly" software we hope to show that ecologists, without extensive knowledge of computational science, can benefit from these techniques by extracting more information about complex ecosystems. Relative strengths and weaknesses of these three approaches are compared and recommendations for their use in conservation applications are presented.

  10. Simulating California reservoir operation using the classification and regression-tree algorithm combined with a shuffled cross-validation scheme

    NASA Astrophysics Data System (ADS)

    Yang, Tiantian; Gao, Xiaogang; Sorooshian, Soroosh; Li, Xin

    2016-03-01

    The controlled outflows from a reservoir or dam are highly dependent on the decisions made by the reservoir operators, instead of a natural hydrological process. Differences exist between the natural upstream inflows to reservoirs and the controlled outflows from reservoirs that supply the downstream users. With the decision maker's awareness of changing climate, reservoir management requires adaptable means to incorporate more information into decision making, such as water delivery requirement, environmental constraints, dry/wet conditions, etc. In this paper, a robust reservoir outflow simulation model is presented, which incorporates one of the well-developed data-mining models (Classification and Regression Tree) to predict the complicated human-controlled reservoir outflows and extract the reservoir operation patterns. A shuffled cross-validation approach is further implemented to improve CART's predictive performance. An application study of nine major reservoirs in California is carried out. Results produced by the enhanced CART, original CART, and random forest are compared with observations. The statistical measurements show that the enhanced CART and random forest outperform the CART control run in general, and the enhanced CART algorithm gives better predictive performance than random forest in simulating the peak flows. The results also show that the proposed model is able to consistently and reasonably predict the expert release decisions. Experiments indicate that the release operation in Oroville Lake is significantly dominated by SWP allocation amount and reservoirs with low elevation are more sensitive to inflow amount than others.
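
    One common reading of "shuffled cross-validation" is k-fold cross-validation in which the sample order is randomized before the folds are formed, so that each fold mixes different periods of the record. The sketch below generates such splits; it is an assumption about the general idea, not the authors' exact scheme, and the function name, fold count, and seed are hypothetical.

```python
import random


def shuffled_kfold(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs after shuffling the indices once."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    folds = [indices[i::k] for i in range(k)]  # k nearly equal folds
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


splits = list(shuffled_kfold(n_samples=10, k=5))
for train_idx, test_idx in splits:
    print(sorted(test_idx))
```

    A tree would be fitted on each training index set and scored on the matching test set, and the fold scores would then be compared or averaged.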

  11. Study and Ranking of Determinants of Taenia solium Infections by Classification Tree Models

    PubMed Central

    Mwape, Kabemba E.; Phiri, Isaac K.; Praet, Nicolas; Dorny, Pierre; Muma, John B.; Zulu, Gideon; Speybroeck, Niko; Gabriël, Sarah

    2015-01-01

    Taenia solium taeniasis/cysticercosis is an important public health problem occurring mainly in developing countries. This work aimed to study the determinants of human T. solium infections in the Eastern province of Zambia and rank them in order of importance. A household (HH)-level questionnaire was administered to 680 HHs from 53 villages in two rural districts and the taeniasis and cysticercosis status determined. A classification tree model (CART) was used to define the relative importance and interactions between different predictor variables in their effect on taeniasis and cysticercosis. The Katete study area had a significantly higher taeniasis and cysticercosis prevalence than the Petauke area. The CART analysis for Katete showed that the most important determinant for cysticercosis infections was the number of HH inhabitants (6 to 10) and for taeniasis was the number of HH inhabitants > 6. The most important determinant in Petauke for cysticercosis was the age of head of household > 32 years and for taeniasis it was age < 55 years. The CART analysis showed that the most important determinant for both taeniasis and cysticercosis infections was the number of HH inhabitants (6 to 10) in Katete district and age in Petauke. The results suggest that control measures should target HHs with a high number of inhabitants and older individuals. PMID:25404073
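
    A classification tree ranks candidate splits by how much they reduce node impurity. The sketch below evaluates one split of the kind reported above (number of household inhabitants above a cutoff) using Gini impurity, a criterion commonly used in CART. The data are invented for illustration and are not from the study; the function names are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of binary (0/1) labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def impurity_decrease(rows, threshold):
    """Impurity reduction from splitting on `inhabitants > threshold`."""
    left = [y for x, y in rows if x <= threshold]
    right = [y for x, y in rows if x > threshold]
    n = len(rows)
    parent = gini([y for _, y in rows])
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return parent - weighted


# Hypothetical (household size, infected) pairs; not data from the study.
rows = [(3, 0), (4, 0), (5, 0), (6, 0), (7, 1), (8, 1), (9, 0), (10, 1)]
gain = impurity_decrease(rows, threshold=6)
print(gain)
```

    The predictor and cutoff with the largest decrease are chosen at each node, and summing these decreases per variable is one way to obtain a ranking of determinants.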

  12. Multicenter study on caries risk assessment in adults using survival Classification and Regression Trees

    PubMed Central

    Arino, Masumi; Ito, Ataru; Fujiki, Shozo; Sugiyama, Seiichi; Hayashi, Mikako

    2016-01-01

    Dental caries is an important public health problem worldwide. This study aims to prove how preventive therapies reduce the onset of caries in adult patients, and to identify patients with high or low risk of caries by using Classification and Regression Trees based survival analysis (survival CART). A clinical data set of 732 patients aged 20 to 64 years in nine Japanese general practices was analyzed with the following parameters: age, DMFT, number of mutans streptococci (SM) and Lactobacilli (LB), secretion rate and buffer capacity of saliva, and compliance with a preventive program. Results showed the incidence of primary carious lesion was affected by SM, LB and compliance with a preventive program; secondary carious lesion was affected by DMFT, SM and LB. Survival CART identified high-risk patients for primary carious lesion according to their poor compliance with a preventive program and SM (≥10^6 CFU/ml) with a hazard ratio of 3.66 (p = 0.0002). In the case of secondary caries, patients with LB (≥10^5 CFU/ml) and DMFT (>15) were identified as high risk with a hazard ratio of 3.50 (p < 0.0001). We conclude that preventive programs can be effective in limiting the incidence of primary carious lesion. PMID:27381750

  13. Multicenter study on caries risk assessment in adults using survival Classification and Regression Trees.

    PubMed

    Arino, Masumi; Ito, Ataru; Fujiki, Shozo; Sugiyama, Seiichi; Hayashi, Mikako

    2016-01-01

    Dental caries is an important public health problem worldwide. This study aims to prove how preventive therapies reduce the onset of caries in adult patients, and to identify patients with high or low risk of caries by using Classification and Regression Trees based survival analysis (survival CART). A clinical data set of 732 patients aged 20 to 64 years in nine Japanese general practices was analyzed with the following parameters: age, DMFT, number of mutans streptococci (SM) and Lactobacilli (LB), secretion rate and buffer capacity of saliva, and compliance with a preventive program. Results showed the incidence of primary carious lesion was affected by SM, LB and compliance with a preventive program; secondary carious lesion was affected by DMFT, SM and LB. Survival CART identified high-risk patients for primary carious lesion according to their poor compliance with a preventive program and SM (≥10^6 CFU/ml) with a hazard ratio of 3.66 (p = 0.0002). In the case of secondary caries, patients with LB (≥10^5 CFU/ml) and DMFT (>15) were identified as high risk with a hazard ratio of 3.50 (p < 0.0001). We conclude that preventive programs can be effective in limiting the incidence of primary carious lesion. PMID:27381750

  14. Prediction of cadmium enrichment in reclaimed coastal soils by classification and regression tree

    NASA Astrophysics Data System (ADS)

    Ru, Feng; Yin, Aijing; Jin, Jiaxin; Zhang, Xiuying; Yang, Xiaohui; Zhang, Ming; Gao, Chao

    2016-08-01

    Reclamation of coastal land is one of the most common ways to obtain land resources in China. However, it has long been acknowledged that the artificial interference with coastal land has disadvantageous effects, such as heavy metal contamination. This study aimed to develop a prediction model for cadmium enrichment levels and assess the importance of affecting factors in typical reclaimed land in Eastern China (DFCL: Dafeng Coastal Land). Two hundred and twenty seven surficial soil/sediment samples were collected and analyzed to identify the enrichment levels of cadmium and the possible affecting factors in soils and sediments. The classification and regression tree (CART) model was applied in this study to predict cadmium enrichment levels. The prediction results showed that cadmium enrichment levels assessed by the CART model had an accuracy of 78.0%. The CART model could extract more information on factors affecting the environmental behavior of cadmium than correlation analysis. The integration of correlation analysis and the CART model showed that fertilizer application and organic carbon accumulation were the most important factors affecting soil/sediment cadmium enrichment levels, followed by particle size effects (Al2O3, TFe2O3 and SiO2), contents of Cl and S, surrounding construction areas and reclamation history.

  15. Study and ranking of determinants of Taenia solium infections by classification tree models.

    PubMed

    Mwape, Kabemba E; Phiri, Isaac K; Praet, Nicolas; Dorny, Pierre; Muma, John B; Zulu, Gideon; Speybroeck, Niko; Gabriël, Sarah

    2015-01-01

    Taenia solium taeniasis/cysticercosis is an important public health problem occurring mainly in developing countries. This work aimed to study the determinants of human T. solium infections in the Eastern province of Zambia and rank them in order of importance. A household (HH)-level questionnaire was administered to 680 HHs from 53 villages in two rural districts and the taeniasis and cysticercosis status determined. A classification tree model (CART) was used to define the relative importance and interactions between different predictor variables in their effect on taeniasis and cysticercosis. The Katete study area had a significantly higher taeniasis and cysticercosis prevalence than the Petauke area. The CART analysis for Katete showed that the most important determinant for cysticercosis infections was the number of HH inhabitants (6 to 10) and for taeniasis was the number of HH inhabitants > 6. The most important determinant in Petauke for cysticercosis was the age of head of household > 32 years and for taeniasis it was age < 55 years. The CART analysis showed that the most important determinant for both taeniasis and cysticercosis infections was the number of HH inhabitants (6 to 10) in Katete district and age in Petauke. The results suggest that control measures should target HHs with a high number of inhabitants and older individuals. PMID:25404073

  16. Knowledge-based approach to video content classification

    NASA Astrophysics Data System (ADS)

    Chen, Yu; Wong, Edward K.

    2001-01-01

    A framework for video content classification using a knowledge-based approach is herein proposed. This approach is motivated by the fact that videos are rich in semantic contents, which can best be interpreted and analyzed by human experts. We demonstrate the concept by implementing a prototype video classification system using the rule-based programming language CLIPS 6.05. Knowledge for video classification is encoded as a set of rules in the rule base. The left-hand sides of rules contain high-level and low-level features, while the right-hand sides of rules contain intermediate results or conclusions. Our current implementation includes features computed from motion, color, and text extracted from video frames. Our current rule set allows us to classify input video into one of five classes: news, weather reporting, commercial, basketball and football. We use MYCIN's inexact reasoning method for combining evidence and for handling the uncertainties in the features and in the classification results. We obtained good results in a preliminary experiment, which demonstrated the validity of the proposed approach.
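
    MYCIN's inexact reasoning combines the certainty factors (CFs) contributed by separate rules for the same conclusion. The sketch below implements the standard MYCIN combination function for CFs in the range -1 to 1. It is written in Python for illustration rather than CLIPS, and the rule outcomes in the usage lines are hypothetical, not taken from the authors' rule base.

```python
def combine_cf(a, b):
    """Combine two certainty factors in [-1, 1] with the MYCIN rule."""
    if a >= 0 and b >= 0:
        return a + b * (1 - a)      # two supporting pieces of evidence
    if a < 0 and b < 0:
        return a + b * (1 + a)      # two opposing pieces of evidence
    denom = 1 - min(abs(a), abs(b))
    if denom == 0:
        raise ValueError("CFs of +1 and -1 cannot be combined")
    return (a + b) / denom          # conflicting evidence


# Hypothetical example: a motion rule and a text rule both support "news",
# while a color rule argues against it.
support = combine_cf(0.6, 0.4)
final = combine_cf(support, -0.4)
print(support, final)
```

    Supporting evidence moves the combined CF toward 1 without exceeding it, and the result does not depend on the order in which two CFs are combined.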

  17. Knowledge-based approach to video content classification

    NASA Astrophysics Data System (ADS)

    Chen, Yu; Wong, Edward K.

    2000-12-01

    A framework for video content classification using a knowledge-based approach is herein proposed. This approach is motivated by the fact that videos are rich in semantic contents, which can best be interpreted and analyzed by human experts. We demonstrate the concept by implementing a prototype video classification system using the rule-based programming language CLIPS 6.05. Knowledge for video classification is encoded as a set of rules in the rule base. The left-hand sides of rules contain high-level and low-level features, while the right-hand sides of rules contain intermediate results or conclusions. Our current implementation includes features computed from motion, color, and text extracted from video frames. Our current rule set allows us to classify input video into one of five classes: news, weather reporting, commercial, basketball and football. We use MYCIN's inexact reasoning method for combining evidence and for handling the uncertainties in the features and in the classification results. We obtained good results in a preliminary experiment, which demonstrated the validity of the proposed approach.

  18. New Approach for Segmentation and Extraction of Single Tree from Point Clouds Data and Aerial Images

    NASA Astrophysics Data System (ADS)

    Homainejad, A. S.

    2016-06-01

    This paper addresses a new approach for reconstructing 3D models of single trees from Airborne Laser Scanner (ALS) data and aerial images. The approach detects and extracts single trees from ALS data and aerial images. Existing approaches can provide bulk segmentation of a group of trees; however, some methods have focused on detection and extraction of a particular tree from ALS data and images. Segmenting a single tree within a group of trees is largely infeasible, since detecting the boundary lines between the trees is a tedious task. In this approach, an experimental formula based on the height of the trees was developed and applied in order to define the boundary lines between the trees. As a result, each single tree was segmented and extracted, and a 3D model was later created. Trees extracted with this approach have a unique identification and attribute. The output has applications in various fields of science and engineering such as forestry, urban planning, and agriculture. For example, in forestry the result can be used for studies of ecological diversity, biodiversity and ecosystems.

  19. Land cover and forest formation distributions for St. Kitts, Nevis, St. Eustatius, Grenada and Barbados from decision tree classification of cloud-cleared satellite imagery

    USGS Publications Warehouse

    Helmer, E.H.; Kennaway, T.A.; Pedreros, D.H.; Clark, M.L.; Marcano-Vega, H.; Tieszen, L.L.; Ruzycki, T.R.; Schill, S.R.; Carrington, C.M.S.

    2008-01-01

    Satellite image-based mapping of tropical forests is vital to conservation planning. Standard methods for automated image classification, however, limit classification detail in complex tropical landscapes. In this study, we test an approach to Landsat image interpretation on four islands of the Lesser Antilles, including Grenada and St. Kitts, Nevis and St. Eustatius, testing a more detailed classification than earlier work in the latter three islands. Secondly, we estimate the extents of land cover and protected forest by formation for five islands and ask how land cover has changed over the second half of the 20th century. The image interpretation approach combines image mosaics and ancillary geographic data, classifying the resulting set of raster data with decision tree software. Cloud-free image mosaics for one or two seasons were created by applying regression tree normalization to scene dates that could fill cloudy areas in a base scene. Such mosaics are also known as cloud-filled, cloud-minimized or cloud-cleared imagery, mosaics, or composites. The approach accurately distinguished several classes that more standard methods would confuse; the seamless mosaics aided reference data collection; and the multiseason imagery allowed us to separate drought deciduous forests and woodlands from semi-deciduous ones. Cultivated land areas declined 60 to 100 percent from about 1945 to 2000 on several islands. Meanwhile, forest cover has increased 50 to 950%. This trend will likely continue where sugar cane cultivation has dominated. Like the island of Puerto Rico, most higher-elevation forest formations are protected in formal or informal reserves. Also similarly, lowland forests, which are drier forest types on these islands, are not well represented in reserves. Former cultivated lands in lowland areas could provide lands for new reserves of drier forest types. 
The land-use history of these islands may provide insight for planners in countries currently considering

  20. ADHD classification using bag of words approach on network features

    NASA Astrophysics Data System (ADS)

    Solmaz, Berkan; Dey, Soumyabrata; Rao, A. Ravishankar; Shah, Mubarak

    2012-02-01

    Attention Deficit Hyperactivity Disorder (ADHD) is receiving a lot of attention nowadays, mainly because it is one of the common brain disorders among children and little is known about its cause. In this study, we propose a novel approach for automatic classification of ADHD-conditioned subjects and control subjects using functional Magnetic Resonance Imaging (fMRI) data of resting-state brains. For this purpose, we compute the correlation between every possible pair of voxels within a subject over the time frame of the experimental protocol. A network of voxels is constructed by representing a high correlation value between any two voxels as an edge. A Bag-of-Words (BoW) approach is used to represent each subject as a histogram of network features, such as the degree of each voxel. The classification is done using a Support Vector Machine (SVM). We also investigate the use of raw intensity values in the time series for each voxel. Here, every subject is represented as a combined histogram of network and raw intensity features. Experimental results verified that the classification accuracy improves when the combined histogram is used. We tested our approach on a highly challenging dataset released by NITRC for the ADHD-200 competition and obtained promising results. The dataset not only is large but also includes subjects from different demographic and age groups. To the best of our knowledge, this is the first paper to propose a BoW approach in any functional brain disorder classification, and we believe that this approach will be useful in the analysis of many brain-related conditions.
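
    The network-feature step described above (correlate voxel time series, threshold to edges, histogram the degrees) can be sketched as follows. This is an illustration under stated assumptions: the threshold of 0.8, the number of histogram bins, the function names and the toy time series are all hypothetical, and the SVM stage is omitted.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def degree_histogram(series, threshold=0.8, n_bins=4):
    """Histogram of voxel degrees in the thresholded correlation network.

    series: one time series per voxel. Degrees >= n_bins - 1 share the last bin.
    """
    n = len(series)
    degree = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            # An edge joins two voxels whose time series are highly correlated.
            if pearson(series[i], series[j]) > threshold:
                degree[i] += 1
                degree[j] += 1
    hist = [0] * n_bins
    for d in degree:
        hist[min(d, n_bins - 1)] += 1
    return hist


# Toy subject with four voxels and four time points (made-up values).
subject = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1], [1, 0, 1, 0]]
print(degree_histogram(subject))
```

    The resulting fixed-length histogram is what would be passed to a classifier, so subjects with different numbers of voxels still yield comparable feature vectors.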

  1. Using PPI network autocorrelation in hierarchical multi-label classification trees for gene function prediction

    PubMed Central

    2013-01-01

    Background Ontologies and catalogs of gene functions, such as the Gene Ontology (GO) and MIPS-FUN, assume that functional classes are organized hierarchically, that is, general functions include more specific ones. This has recently motivated the development of several machine learning algorithms for gene function prediction that leverage this hierarchical organization, where instances may belong to multiple classes. In addition, it is possible to exploit relationships among examples, since it is plausible that related genes tend to share functional annotations. Although these relationships have been identified and extensively studied in the area of protein-protein interaction (PPI) networks, they have not received much attention in hierarchical and multi-class gene function prediction. Relations between genes introduce autocorrelation in functional annotations and violate the assumption that instances are independently and identically distributed (i.i.d.), which underlies most machine learning algorithms. Although the explicit consideration of these relations brings additional complexity to the learning process, we expect substantial benefits in predictive accuracy of learned classifiers. Results This article demonstrates the benefits (in terms of predictive accuracy) of considering autocorrelation in multi-class gene function prediction. We develop a tree-based algorithm for considering network autocorrelation in the setting of Hierarchical Multi-label Classification (HMC). We empirically evaluate the proposed algorithm, called NHMC (Network Hierarchical Multi-label Classification), on 12 yeast datasets using each of the MIPS-FUN and GO annotation schemes and exploiting 2 different PPI networks. The results clearly show that taking autocorrelation into account improves the predictive performance of the learned models for predicting gene function.
Conclusions Our newly developed method for HMC takes into account network information in the learning phase: When

  2. Non-Destructive Classification Approaches for Equilibrated Ordinary Chondrites

    NASA Technical Reports Server (NTRS)

    Righter, K.; Harrington, R.; Schroeder, C.; Morris, R. V.

    2013-01-01

    Classification of meteorites is most effectively carried out by petrographic and mineralogic studies of thin sections, but a rapid and accurate classification technique for the many samples collected in dense collection areas (hot and cold deserts) is of great interest. Oil immersion techniques have been used to classify a large proportion of the US Antarctic meteorite collections since the mid-1980s [1]. This approach has allowed rapid characterization of thousands of samples over time, but nonetheless utilizes a piece of the sample that has been ground to grains or a powder. In order to compare a few non-destructive techniques with the standard approaches, we have characterized a group of chondrites from the Larkman Nunatak region using magnetic susceptibility and Moessbauer spectroscopy.

  3. Is protein classification necessary? Towards alternative approaches to function annotation

    PubMed Central

    Petrey, Donald; Honig, Barry

    2009-01-01

    The current non-redundant protein sequence database contains over seven million entries and the number of individual functional domains is significantly larger than this value. The vast quantity of data associated with these proteins poses enormous challenges to any attempt at function annotation. Classification of proteins into sequence and structural groups has been widely used as an approach to simplifying the problem. In this article we question such strategies. We describe how the multi-functionality and structural diversity of even closely related proteins confounds efforts to assign function based on overall sequence or structural similarity. Rather, we suggest that strategies that avoid classification may offer a more robust approach to protein function annotation. PMID:19269161

  4. "Trees and Things That Live in Trees": Three Children with Special Needs Experience the Project Approach

    ERIC Educational Resources Information Center

    Griebling, Susan; Elgas, Peg; Konerman, Rachel

    2015-01-01

    The authors report on research conducted during a project investigation undertaken with preschool children, ages 3-5. The report focuses on three children with special needs and the positive outcomes for each child as they engaged in the project Trees and Things That Live in Trees. Two of the children were diagnosed with developmental delays, and…

  5. Availability and Capacity of Substance Abuse Programs in Correctional Settings: A Classification and Regression Tree Analysis

    PubMed Central

    Kitsantas, Panagiota

    2009-01-01

    Objective to be addressed The purpose of this study was to investigate the structural and organizational factors that contribute to the availability and increased capacity for substance abuse treatment programs in correctional settings. We used Classification and Regression Tree statistical procedures to identify how multi-level data can explain the variability in availability and capacity of substance abuse treatment programs in jails and probation/parole offices. Methods The data for this study combined the National Criminal Justice Treatment Practices survey (NCJTP) and the 2000 Census. The NCJTP survey was a nationally representative sample of correctional administrators for jails and probation/parole agencies. The sample size included 295 substance abuse treatment programs that were classified according to the intensity of their services: high, medium, and low. The independent variables included jurisdictional-level structural variables, attributes of the correctional administrators, and program and service delivery characteristics of the correctional agency. Results The two most important variables in predicting the availability of all three types of services were stronger working relationships with other organizations and the adoption of a standardized substance abuse screening tool by correctional agencies. For high and medium intensive programs, the capacity increased when an organizational learning strategy was used by administrators and the organization used a substance abuse screening tool. Implications on advancing treatment practices in correctional settings are discussed, including further work to test theories on how to better understand access to intensive treatment services. This study presents the first phase of understanding capacity-related issues regarding treatment programs offered in correctional settings. PMID:19395204

  6. Interactive change detection based on dissimilarity image and decision tree classification

    NASA Astrophysics Data System (ADS)

    Wang, Yan; Crouzil, Alain; Puel, Jean-Baptiste

    2015-02-01

    Our study mainly focuses on detecting changed regions in two images of the same scene taken by digital cameras at different times. The images taken by digital cameras generally provide less information than multi-channel remote sensing images. Moreover, application-dependent insignificant changes, such as shadows or clouds, may cause classical methods based on image differences to fail. The machine learning approach seems promising, but the lack of a sufficient volume of training data for photographic landscape observatories rules out many methods. We therefore investigate an interactive learning approach and provide a discriminative model built on a 16-dimensional feature space comprising textural appearance and contextual information. Dissimilarity measures in different neighborhood sizes are used to detect the difference within the neighborhood of an image pair. To detect changes between two images, the user designates change and non-change samples (pixel sets) in the images using a selection tool. These data are used to train a classifier with a decision tree training method, which is then applied to all the other pixels of the image pair. The experiments demonstrate the potential of the proposed approach.

  7. A hybrid ensemble learning approach to star-galaxy classification

    NASA Astrophysics Data System (ADS)

    Kim, Edward J.; Brunner, Robert J.; Carrasco Kind, Matias

    2015-10-01

    There exist a variety of star-galaxy classification techniques, each with their own strengths and weaknesses. In this paper, we present a novel meta-classification framework that combines and fully exploits different techniques to produce a more robust star-galaxy classification. To demonstrate this hybrid, ensemble approach, we combine a purely morphological classifier, a supervised machine learning method based on random forest, an unsupervised machine learning method based on self-organizing maps, and a hierarchical Bayesian template-fitting method. Using data from the CFHTLenS survey (Canada-France-Hawaii Telescope Lensing Survey), we consider different scenarios: when a high-quality training set is available with spectroscopic labels from DEEP2 (Deep Extragalactic Evolutionary Probe Phase 2), SDSS (Sloan Digital Sky Survey), VIPERS (VIMOS Public Extragalactic Redshift Survey), and VVDS (VIMOS VLT Deep Survey), and when the demographics of sources in a low-quality training set do not match the demographics of objects in the test data set. We demonstrate that our Bayesian combination technique improves the overall performance over any individual classification method in these scenarios. Thus, strategies that combine the predictions of different classifiers may prove to be optimal in currently ongoing and forthcoming photometric surveys, such as the Dark Energy Survey and the Large Synoptic Survey Telescope.

  8. Cluster Stability Estimation Based on a Minimal Spanning Trees Approach

    NASA Astrophysics Data System (ADS)

    Volkovich, Zeev (Vladimir); Barzily, Zeev; Weber, Gerhard-Wilhelm; Toledano-Kitai, Dvora

    2009-08-01

    Among the areas of data and text mining employed today in science, economy and technology, clustering theory serves as a preprocessing step in data analysis. However, many open questions still await theoretical and practical treatment; e.g., the problem of determining the true number of clusters has not been satisfactorily solved. In the current paper, this problem is addressed by the cluster stability approach. For several possible numbers of clusters we estimate the stability of partitions obtained from clustering of samples. Partitions are considered consistent if their clusters are stable. Cluster validity is measured as the total number of edges, in the clusters' minimal spanning trees, connecting points from different samples. That is, we use the Friedman and Rafsky two-sample test statistic. The homogeneity hypothesis, of well-mingled samples within the clusters, leads to an asymptotically normal distribution of the considered statistic. Resting upon this fact, the standard score of the mentioned edge count is computed, and the partition quality is represented by the worst cluster, corresponding to the minimal standard score value. It is natural to expect that the true number of clusters can be characterized by the empirical distribution having the shortest left tail. The proposed methodology sequentially creates the described value distribution and estimates its left-asymmetry. Numerical experiments, presented in the paper, demonstrate the ability of the approach to detect the true number of clusters.
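
    The edge count at the heart of the Friedman and Rafsky statistic can be sketched as below: build the minimal spanning tree over the pooled points of one cluster and count the edges that join points drawn from different samples. This is only an illustration; the function names and toy points are assumptions, and the standard-score normalization described in the abstract is not reproduced.

```python
from math import dist


def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph; returns a list of (i, j) edges."""
    n = len(points)
    in_tree = [False] * n
    best = [float("inf")] * n   # cheapest known connection cost to the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the cheapest vertex not yet in the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges


def cross_sample_edges(points, sample_ids):
    """Number of MST edges whose endpoints come from different samples."""
    return sum(1 for i, j in mst_edges(points) if sample_ids[i] != sample_ids[j])


# Four collinear points from two samples (toy data).
pts = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
print(cross_sample_edges(pts, [0, 1, 0, 1]))  # well-mingled samples
print(cross_sample_edges(pts, [0, 0, 1, 1]))  # separated samples
```

    Well-mingled samples produce many cross-sample edges, while separated samples produce few, which is why a low count signals an unstable cluster.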

  9. Use of classification trees to apportion single echo detections to species: Application to the pelagic fish community of Lake Superior

    USGS Publications Warehouse

    Yule, Daniel L.; Adams, Jean V.; Hrabik, Thomas R.; Vinson, Mark R.; Woiak, Zebadiah; Ahrenstroff, Tyler D.

    2013-01-01

    Acoustic methods are used to estimate the density of pelagic fish in large lakes with results of midwater trawling used to assign species composition. Apportionment in lakes having mixed species can be challenging because only a small fraction of the water sampled acoustically is sampled with trawl gear. Here we describe a new method where single echo detections (SEDs) are assigned to species based on classification tree models developed from catch data that separate species based on fish size and the spatial habitats they occupy. During the summer of 2011, we conducted a spatially-balanced lake-wide acoustic and midwater trawl survey of Lake Superior. A total of 51 sites in four bathymetric depth strata (0–30 m, 30–100 m, 100–200 m, and >200 m) were sampled. We developed classification tree models for each stratum and found fish length was the most important variable for separating species. To apply these trees to the acoustic data, we needed to identify a target strength to length (TS-to-L) relationship appropriate for all abundant Lake Superior pelagic species. We tested performance of 7 general (i.e., multi-species) relationships derived from three published studies. The best-performing relationship was identified by comparing predicted and observed catch compositions using a second independent Lake Superior data set. Once identified, the relationship was used to predict lengths of SEDs from the lake-wide survey, and the classification tree models were used to assign each SED to a species. Exotic rainbow smelt (Osmerus mordax) were the most common species at bathymetric depths 100 m (384 million; 6.0 kt). Cisco (Coregonus artedi) were widely distributed over all strata with their population estimated at 182 million (44 kt). The apportionment method we describe should be transferable to other large lakes provided fish are not tightly aggregated, and an appropriate TS-to-L relationship for abundant pelagic fish species can be determined.

  10. A Dynamic Tree Approach to Environmental Transport on Hillslopes

    NASA Astrophysics Data System (ADS)

    Passalacqua, P.; Zaliapin, I.; Foufoula-Georgiou, E.; Ghil, M.; Dietrich, W. E.

    2010-12-01

    The concept of a dynamic tree was introduced in Zaliapin et al. (2010) as the basis of an extended conceptual framework to study the transport of spatially heterogeneous fluxes as they propagate down a network of a given topology. Here we are interested in extending this framework over the whole basin by incorporating the hillslope paths and their geometry, which are known to differ from those of the river network. Focusing on the fluxes that start at a source, propagate downstream and have constant velocity, we first capture the static structure of the hillslope network by representing it by a tree (static tree). We then describe the transport down the hillslope tree as a particular case of nearest-neighbor hierarchical aggregation, thus obtaining the so-called dynamic tree. The properties of both the dynamic and static trees are analyzed by applying Horton-Strahler and Tokunaga taxonomies. The results obtained in three hillslope areas of different characteristics, two located in California and one in Oregon, show that both the static and the dynamic tree can be well approximated by Tokunaga self-similar trees (SSTs), in agreement with what was previously obtained for the channelized paths of the river network but with different parameters. The degree of side branching is larger for the static tree than for the dynamic. We also observed a phase transition in the dynamics of the three systems which reflects an abrupt emergence of a giant cluster of connected streams.
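
    The Horton-Strahler taxonomy applied in this abstract assigns order 1 to every source, and a junction takes the highest order among its tributaries, incremented by one only when that highest order occurs at least twice. A minimal sketch follows, assuming a hypothetical dictionary representation in which each node maps to the list of its upstream tributaries; the node names are made up.

```python
def strahler(tree, node):
    """Horton-Strahler order of `node` in a rooted tree.

    tree: dict mapping a node to the list of its upstream children;
          nodes that are absent or map to an empty list are sources (order 1).
    """
    children = tree.get(node, [])
    if not children:
        return 1
    orders = sorted((strahler(tree, c) for c in children), reverse=True)
    # Order increases only where two streams of the same top order meet.
    if len(orders) >= 2 and orders[0] == orders[1]:
        return orders[0] + 1
    return orders[0]


# Toy network: two order-2 branches merge at the outlet.
network = {
    "outlet": ["a", "b"],
    "a": ["a1", "a2"],
    "b": ["b1", "b2"],
}
print(strahler(network, "outlet"))
```

    The recursion is fine for small illustrative trees; a large network would call for an iterative post-order traversal to avoid deep recursion.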

  11. A methodological approach to the classification of dermoscopy images

    PubMed Central

    Celebi, M. Emre; Kingravi, Hassan A.; Uddin, Bakhtiyar; Iyatomi, Hitoshi; Aslandogan, Y. Alp; Stoecker, William V.; Moss, Randy H.

    2011-01-01

    In this paper a methodological approach to the classification of pigmented skin lesions in dermoscopy images is presented. First, automatic border detection is performed to separate the lesion from the background skin. Shape features are then extracted from this border. For the extraction of color and texture related features, the image is divided into various clinically significant regions using the Euclidean distance transform. This feature data is fed into an optimization framework, which ranks the features using various feature selection algorithms and determines the optimal feature subset size according to the area under the ROC curve measure obtained from support vector machine classification. The issue of class imbalance is addressed using various sampling strategies, and the classifier generalization error is estimated using Monte Carlo cross validation. Experiments on a set of 564 images yielded a specificity of 92.34% and a sensitivity of 93.33%. PMID:17387001

  12. A Transform-Based Feature Extraction Approach for Motor Imagery Tasks Classification

    PubMed Central

    Khorshidtalab, Aida; Mesbah, Mostefa; Salami, Momoh J. E.

    2015-01-01

    In this paper, we present a new motor imagery classification method in the context of electroencephalography (EEG)-based brain–computer interface (BCI). This method uses a signal-dependent orthogonal transform, referred to as linear prediction singular value decomposition (LP-SVD), for feature extraction. The transform defines the mapping as the left singular vectors of the LP coefficient filter impulse response matrix. Using a logistic tree-based model classifier, the extracted features are classified into one of four motor imagery movements. The proposed approach was first benchmarked against two related state-of-the-art feature extraction approaches, namely, discrete cosine transform (DCT) and adaptive autoregressive (AAR)-based methods. By achieving an accuracy of 67.35%, the LP-SVD approach outperformed the other approaches by large margins (25% compared with DCT and 6% compared with AAR-based methods). To further improve the discriminatory capability of the extracted features and reduce the computational complexity, we enlarged the extracted feature subset by incorporating two extra features, namely, Q- and the Hotelling’s $T^{2}$ statistics of the transformed EEG and introduced a new EEG channel selection method. The performance of the EEG classification based on the expanded feature set and channel selection method was compared with that of a number of the state-of-the-art classification methods previously reported with the BCI IIIa competition data set. Our method came second with an average accuracy of 81.38%. PMID:27170898

  13. Classification tree for risk assessment in patients suffering from congestive heart failure via long-term heart rate variability.

    PubMed

    Melillo, Paolo; De Luca, Nicola; Bracale, Marcello; Pecchia, Leandro

    2013-05-01

    This study aims to develop an automatic classifier for risk assessment in patients suffering from congestive heart failure (CHF). The proposed classifier separates lower risk patients from higher risk ones, using standard long-term heart rate variability (HRV) measures. Patients are labeled as lower or higher risk according to the New York Heart Association classification (NYHA). A retrospective analysis on two public Holter databases was performed, analyzing the data of 12 patients suffering from mild CHF (NYHA I and II), labeled as lower risk, and 32 suffering from severe CHF (NYHA III and IV), labeled as higher risk. Only patients with a fraction of total heartbeat intervals (RR) classified as normal-to-normal (NN) intervals (NN/RR) higher than 80% were selected as eligible in order to have a satisfactory signal quality. Classification and regression tree (CART) was employed to develop the classifiers. A total of 30 higher risk and 11 lower risk patients were included in the analysis. The proposed classification trees achieved a sensitivity and a specificity rate of 93.3% and 63.6%, respectively, in identifying higher risk patients. Finally, the rules obtained by CART are comprehensible and consistent with the consensus shown by previous studies that depressed HRV is a useful tool for risk assessment in patients suffering from CHF. PMID:24592473
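
    The reported rates can be checked against the cohort sizes with a short calculation. The confusion counts used here (28 of 30 higher risk patients and 7 of 11 lower risk patients correctly identified) are not stated in the abstract; they are inferred as the only whole-number counts consistent with 93.3% and 63.6%, and should be read as an assumption.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # share of higher risk patients detected
    specificity = tn / (tn + fp)   # share of lower risk patients cleared
    return sensitivity, specificity


# Inferred counts (assumption): 30 higher risk and 11 lower risk patients.
sens, spec = sensitivity_specificity(tp=28, fn=2, tn=7, fp=4)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```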

  14. A comprehensive but efficient framework of proposing and validating feature parameters from airborne LiDAR data for tree species classification

    NASA Astrophysics Data System (ADS)

    Lin, Yi; Hyyppä, Juha

    2016-04-01

    Tree species information is crucial for digital forestry, and efficient techniques for classifying tree species are extensively demanded. To this end, airborne light detection and ranging (LiDAR) has been introduced. However, the literature review suggests that most of the previous airborne LiDAR-based studies were only based on limited kinds of tree signatures. To address this gap, this study proposed developing a novel modular framework for LiDAR-based tree species classification, by deriving feature parameters in a systematic way. Specifically, feature parameters of point-distribution (PD), laser pulse intensity (IN), crown-internal (CI) and tree-external (TE) structures were proposed and derived. With a support-vector-machine (SVM) classifier used, the classifications were conducted in a leave-one-out-for-cross-validation (LOOCV) mode. Based on the samples of four typical boreal tree species, i.e., Picea abies, Pinus sylvestris, Populus tremula and Quercus robur, tests showed that the accuracies of the classifications based on the acquired PD-, IN-, CI- and TE-categorized feature parameters as well as the integration of their individual optimal parameters are 65.00%, 80.00%, 82.50%, 85.00% and 92.50%, respectively. These results indicate that the procedures proposed in this study can be used as a comprehensive but efficient framework of proposing and validating feature parameters from airborne LiDAR data for tree species classification.
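
    The leave-one-out evaluation mode mentioned above can be sketched as follows: each sample is held out once, a classifier is fitted on the remainder, and accuracy is the share of held-out samples predicted correctly. To keep the example self-contained, a 1-nearest-neighbour rule stands in for the SVM used in the study; the function name and the toy feature vectors are hypothetical.

```python
from math import dist


def loocv_accuracy(features, labels):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier (stand-in for an SVM)."""
    n = len(features)
    correct = 0
    for i in range(n):
        # Hold out sample i and "train" on all the others.
        nearest = min(
            (j for j in range(n) if j != i),
            key=lambda j: dist(features[i], features[j]),
        )
        correct += labels[nearest] == labels[i]
    return correct / n


# Made-up two-dimensional feature vectors for two species.
X = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
y = ["Picea abies", "Picea abies", "Quercus robur", "Quercus robur"]
print(loocv_accuracy(X, y))
```

    Leave-one-out uses almost all data for training in every fold, which suits small sample sets at the cost of one model fit per sample.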

  15. Prediction of an Epidemic Curve: A Supervised Classification Approach

    PubMed Central

    Nsoesie, Elaine O.; Beckman, Richard; Marathe, Madhav; Lewis, Bryan

    2012-01-01

    Classification methods are widely used for identifying underlying groupings within datasets and predicting the class for new data objects given a trained classifier. This study introduces a project aimed at using a combination of simulations and classification techniques to predict epidemic curves and infer underlying disease parameters for an ongoing outbreak. Six supervised classification methods (random forest, support vector machines, nearest neighbor with three decision rules, linear and flexible discriminant analysis) were used in identifying partial epidemic curves from six agent-based stochastic simulations of influenza epidemics. The accuracy of the methods was compared using a performance metric based on the McNemar test. The findings showed that: (1) assumptions made by the methods regarding the structure of an epidemic curve influence their performance, i.e., methods with fewer assumptions perform best, (2) the performance of most methods is consistent across different individual-based networks for Seattle, Los Angeles and New York and (3) combining classifiers using a weighting approach does not guarantee better prediction. PMID:22997545
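
    The McNemar test underlying the comparison metric looks only at the cases on which two classifiers disagree. The sketch below computes the basic chi-square statistic, optionally with the usual continuity correction; it is not the specific performance metric built in the paper, and the disagreement counts are made-up values.

```python
def mcnemar_statistic(b, c, continuity=True):
    """McNemar chi-square statistic (1 degree of freedom).

    b: cases classifier A got right and classifier B got wrong
    c: cases classifier A got wrong and classifier B got right
    """
    if b + c == 0:
        return 0.0  # the classifiers never disagree
    diff = abs(b - c) - (1 if continuity else 0)
    return diff ** 2 / (b + c)


# Hypothetical disagreement counts between two classifiers.
stat = mcnemar_statistic(b=15, c=5)
print(stat, stat > 3.841)  # 3.841 is the chi-square critical value at the 0.05 level, 1 d.o.f.
```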

  16. Colorectal Cancer Classification and Cell Heterogeneity: A Systems Oncology Approach

    PubMed Central

    Blanco-Calvo, Moisés; Concha, Ángel; Figueroa, Angélica; Garrido, Federico; Valladares-Ayerbes, Manuel

    2015-01-01

    Colorectal cancer is a heterogeneous disease that manifests through diverse clinical scenarios. For many years, our knowledge about the variability of colorectal tumors was limited to the histopathological analysis from which generic classifications associated with different clinical expectations are derived. However, we are now beginning to understand that strong genetic and biological heterogeneity underlies the intense pathological and clinical variability of these tumors. Thus, with the increasing available information on inter-tumor and intra-tumor heterogeneity, the classical pathological approach is being displaced in favor of novel molecular classifications. In the present article, we summarize the most relevant proposals of molecular classifications obtained from the analysis of colorectal tumors using powerful high-throughput techniques and devices. We also discuss the role that cancer systems biology may play in the integration and interpretation of the large amount of data generated and the challenges to be addressed in the future development of precision oncology. In addition, we review the current state of implementation of these novel tools in the pathological laboratory and in clinical practice. PMID:26084042

  17. A new classification approach for detecting severe weather patterns

    NASA Astrophysics Data System (ADS)

    Teixeira de Lima, Glauston R.; Stephany, Stephan

    2013-08-01

    Early detection of possible occurrences of severe convective events would be useful in order to avoid, or at least mitigate, the environmental and socio-economic damages caused by such events. However, the enormous volume of meteorological data currently available makes its analysis by meteorologists difficult, if not impossible. In addition, severe convective events may occur at very different spatial and temporal scales, precluding their early and accurate prediction. In this work, we propose an innovative approach for the classification of meteorological data based on the frequency of occurrence of the values of different variables provided by a weather forecast model. It is possible to identify patterns that may be associated with severe convective activity. In the considered classification problem, the information attributes are variables output by the weather forecast model Eta, while the decision attribute is given by the density of occurrence of cloud-to-ground atmospheric electrical discharges, assumed to be correlated with the level of convective activity. Results show good classification performance for some selected mini-regions of Brazil during the summer of 2007. We expect that the screening of the outputs of the meteorological model Eta by the proposed classifier could serve as a support tool for meteorologists in order to identify in advance patterns associated with severe convective events.

  18. Full Hierarchic Versus Non-Hierarchic Classification Approaches for Mapping Sealed Surfaces at the Rural-Urban Fringe Using High-Resolution Satellite Data

    PubMed Central

    De Roeck, Tim; Van de Voorde, Tim; Canters, Frank

    2009-01-01

    Since 2008 more than half of the world's population has been living in cities, and urban sprawl is continuing. Because of these developments, the mapping and monitoring of urban environments and their surroundings is becoming increasingly important. In this study two object-oriented approaches for high-resolution mapping of sealed surfaces are compared: a standard non-hierarchic approach and a full hierarchic approach using both multi-layer perceptrons and decision trees as learning algorithms. Both methods outperform the standard nearest neighbour classifier, which is used as a benchmark scenario. For the multi-layer perceptron approach, applying a hierarchic classification strategy substantially increases the accuracy of the classification. For the decision tree approach a one-against-all hierarchic classification strategy does not lead to an improvement of classification accuracy compared to the standard all-against-all approach. Best results are obtained with the hierarchic multi-layer perceptron classification strategy, producing a kappa value of 0.77. A simple shadow reclassification procedure based on characteristics of neighbouring objects further increases the kappa value to 0.84. PMID:22389586
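
    The kappa values quoted above measure agreement between the classified map and the reference data after discounting the agreement expected by chance. A minimal sketch of Cohen's kappa computed from a confusion matrix follows; the matrix entries are made-up counts, not results from the study.

```python
def cohen_kappa(confusion):
    """Cohen's kappa from a square confusion matrix (rows: reference, columns: predicted)."""
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: share of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(k)) / total
    # Chance agreement: product of matching row and column marginals.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion) for i in range(k)
    ) / total ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical two-class result: sealed vs. non-sealed surface.
matrix = [[45, 5],
          [10, 40]]
print(round(cohen_kappa(matrix), 2))
```

    Here the overall accuracy is 85%, but half of that agreement would be expected by chance, so kappa is noticeably lower than the raw accuracy.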

  19. AutoClass: A Bayesian Approach to Classification

    NASA Technical Reports Server (NTRS)

    Stutz, John; Cheeseman, Peter; Hanson, Robin; Taylor, Will; Lum, Henry, Jr. (Technical Monitor)

    1994-01-01

    We describe a Bayesian approach to the untutored discovery of classes in a set of cases, sometimes called finite mixture separation or clustering. The main difference between clustering and our approach is that we search for the "best" set of class descriptions rather than grouping the cases themselves. We describe our classes in terms of a probability distribution or density function, and the locally maximal posterior probability valued function parameters. We rate our classifications with an approximate joint probability of the data and functional form, marginalizing over the parameters. Approximation is necessitated by the computational complexity of the joint probability. Thus, we marginalize w.r.t. local maxima in the parameter space. We discuss the rationale behind our approach to classification. We give the mathematical development for the basic mixture model and describe the approximations needed for computational tractability. We instantiate the basic model with the discrete Dirichlet distribution and multivariate Gaussian density likelihoods. Then we show some results for both constructed and actual data.

  20. Human and tree classification based on a model using 3D ladar in a GPS-denied environment

    NASA Astrophysics Data System (ADS)

    Cho, Kuk; Baeg, Seung-Ho; Park, Sangdeok

    2013-05-01

    This study explained a method to classify humans and trees by extracting their geometric and statistical features from data obtained from 3D LADAR. In a wooded GPS-denied environment, it is difficult to identify the location of unmanned ground vehicles and it is also difficult to properly recognize the environment in which these vehicles move. In this study, using the point cloud data obtained via 3D LADAR, a method to extract the features of humans, trees, and other objects within an environment was implemented and verified through the processes of segmentation, feature extraction, and classification. First, for the segmentation, the radially bounded nearest neighbor method was applied. Second, for the feature extraction, each segmented object was divided into three parts, and then their geometrical and statistical features were extracted. A human was divided into three parts: the head, trunk and legs. A tree was also divided into three parts: the top, middle, and bottom. The geometric features were the variance of the x-y data for the center of each part in an object, using the distance between the two central points for each part, using K-means clustering. The statistical features were the variance of each of the parts. In this study, three, six and six features of data were extracted, respectively, resulting in a total of 15 features. Finally, after training on the extracted data via an artificial neural network, new data were classified. This study showed the results of an experiment that applied the proposed algorithm with a vehicle equipped with 3D LADAR in a thickly forested area, which is a GPS-denied environment. A total of 5,158 segments were obtained and the classification rates for humans and trees were 82.9% and 87.4%, respectively.

  1. Application of object-oriented method for classification of VHR satellite images using rule-based approach and texture measures

    NASA Astrophysics Data System (ADS)

    Lewinski, S.; Bochenek, Z.; Turlej, K.

    2010-01-01

    A new approach for classification of high-resolution satellite images is presented in the article. The approach has been developed at the Institute of Geodesy and Cartography, Warsaw, within the Geoland 2 project - SATChMo Core Mapping Service. The classification algorithm, aimed at recognition of generic land cover categories, has been elaborated using the object-oriented approach. Its functionality was tested on the basis of KOMPSAT-2 satellite images, recorded in four multispectral bands (4 m ground resolution) and in panchromatic mode (1 m ground resolution). The structure of the algorithm resembles a decision tree and consists of a sequence of processes. The main assumption of the presented approach is to divide image contents into objects characterized by high and low texture measures. The texture measures are generated on the basis of a panchromatic image transformed by Sigma filters. Objects belonging to the so-called high texture group are classified in the first steps. In the following steps the classification of the remaining objects takes place. By applying parametric recognition criteria to the first group of objects, four generic land cover classes are classified: forests, sparse woody vegetation, urban / artificial areas and bare ground. Non-classified areas are automatically assigned to the second group of objects, which contains water and agricultural land. In the course of the classification process a few segmentations are performed, which are dedicated to particular land cover categories. Classified objects smaller than 0.25 ha are removed in the process of generalization.

  2. Active Optical Sensors for Tree Stem Detection and Classification in Nurseries

    PubMed Central

    Garrido, Miguel; Perez-Ruiz, Manuel; Valero, Constantino; Gliever, Chris J.; Hanson, Bradley D.; Slaughter, David C.

    2014-01-01

    Active optical sensing (LIDAR and light curtain transmission) devices mounted on a mobile platform can correctly detect, localize, and classify trees. To conduct an evaluation and comparison of the different sensors, an optical encoder wheel was used for vehicle odometry and provided a measurement of the linear displacement of the prototype vehicle along a row of tree seedlings as a reference for each recorded sensor measurement. The field trials were conducted in a juvenile tree nursery with one-year-old grafted almond trees at Sierra Gold Nurseries, Yuba City, CA, United States. Through these tests and subsequent data processing, each sensor was individually evaluated to characterize its reliability, as well as its advantages and disadvantages for the proposed task. Test results indicated that 95.7% and 99.48% of the trees were successfully detected with the LIDAR and light curtain sensors, respectively. LIDAR correctly classified trees as alive or dead at a 93.75% success rate, compared to 94.16% for the light curtain sensor. These results can help system designers select the most reliable sensor for the accurate detection and localization of each tree in a nursery, which might allow labor-intensive tasks, such as weeding, to be automated without damaging crops. PMID:24949638

  3. Active optical sensors for tree stem detection and classification in nurseries.

    PubMed

    Garrido, Miguel; Perez-Ruiz, Manuel; Valero, Constantino; Gliever, Chris J; Hanson, Bradley D; Slaughter, David C

    2014-01-01

    Active optical sensing (LIDAR and light curtain transmission) devices mounted on a mobile platform can correctly detect, localize, and classify trees. To conduct an evaluation and comparison of the different sensors, an optical encoder wheel was used for vehicle odometry and provided a measurement of the linear displacement of the prototype vehicle along a row of tree seedlings as a reference for each recorded sensor measurement. The field trials were conducted in a juvenile tree nursery with one-year-old grafted almond trees at Sierra Gold Nurseries, Yuba City, CA, United States. Through these tests and subsequent data processing, each sensor was individually evaluated to characterize its reliability, as well as its advantages and disadvantages for the proposed task. Test results indicated that 95.7% and 99.48% of the trees were successfully detected with the LIDAR and light curtain sensors, respectively. LIDAR correctly classified trees as alive or dead at a 93.75% success rate, compared to 94.16% for the light curtain sensor. These results can help system designers select the most reliable sensor for the accurate detection and localization of each tree in a nursery, which might allow labor-intensive tasks, such as weeding, to be automated without damaging crops. PMID:24949638

  4. Classification of Bent-Double Galaxies: Experiences with Ensembles of Decision Trees

    SciTech Connect

    Kamath, C; Cantu-Paz, E

    2002-01-08

    In earlier work, we have described our experiences with the use of decision tree classifiers to identify radio-emitting galaxies with a bent-double morphology in the FIRST astronomical survey. We now extend this work to include ensembles of decision tree classifiers, including two algorithms developed by us. These algorithms randomize the decision at each node of the tree, and because they consider fewer candidate splitting points, are faster than other methods for creating ensembles. The experiments presented in this paper with our astronomy data show that our algorithms are competitive in accuracy, but faster than other ensemble techniques such as Boosting, Bagging, and Arcx4 with different split criteria.

  5. Reflectance properties of West African savanna trees from ground radiometer measurements. II - Classification of components

    NASA Technical Reports Server (NTRS)

    Hanan, N. P.; Prince, S. D.; Franklin, J.

    1993-01-01

    A pole-mounted radiometer was used to measure the reflectance properties in the red and near-IR of three Sahelian tree species. These properties are classified depending on their location over the canopy. A geometrical description of the patterns of shadow and sunlight on and beneath a model tree when viewed from above is given, and six components are defined. Tree canopies are found to be dark in the red waveband with respect to the soil, but have little or no effect on the near-IR.

  6. A triple stable isotope approach in tree rings for detecting the impact of nitrogen emissions on tree physiology

    NASA Astrophysics Data System (ADS)

    Guerrieri, M. R.; Siegwolf, R. T. W.; Saurer, M.; Jaeggi, M.; Cherubini, P.; Ripullone, F.; Borghetti, M.

    2009-04-01

    Over the last decades, human activities have contributed to increasing reactive nitrogen (N) in the atmosphere (such as NOx and NHx compounds) and its deposition on terrestrial ecosystems. The relevance of the current N deposition (Ndep) on carbon (C) sequestration has lately been questioned by both experimental and modelling approaches. Widely different estimates of C sensitivity to Ndep have been reported in recent investigations (Magnani et al., 2007; Högberg 2007; De Vries et al. 2008; Magnani et al. 2008; Sutton et al. 2008), which highlights the need for a thorough re-assessment of all the physiological mechanisms and processes involved. The impact of Ndep on forest ecosystems can be investigated near the pollution sources, where the effects are expected to be easily detectable. Therefore, tree rings represent a valuable archive for disturbances due to pollution events, which can be detected by combining d13C, d18O, d15N and dendrochronological approaches. The aim of this research was to investigate the impact of long term exposure to NOx emissions on two tree species, namely: a broadleaved species (Quercus cerris) that was located close to an oil refinery in Southern Italy, and a coniferous species (Picea abies) located close to a freeway in Switzerland. The analysis of d15N in tree rings allowed detection of the input of N from anthropogenic emissions. Further, variations in the ratio of intercellular and ambient CO2 concentrations (ci/ca) and the distinction between stomatal (gs) and photosynthetic (A) responses to NOx emissions in trees were assessed using a conceptual model (Scheidegger et al., 2000), which combines d13C and d18O in tree rings. The strongest fingerprint of N emissions was detected for Q. cerris at the oil refinery site, as assessed by d15N. Long-term exposure to NOx emissions had a different impact on the ci/ca ratio in the two experimental sites: at the oil refinery (Quercus cerris), gs influenced ci/ca more, as assessed by d18O, while at

  7. GIS-based groundwater potential mapping using boosted regression tree, classification and regression tree, and random forest machine learning models in Iran.

    PubMed

    Naghibi, Seyed Amir; Pourghasemi, Hamid Reza; Dixon, Barnali

    2016-01-01

    Groundwater is considered one of the most valuable fresh water resources. The main objective of this study was to produce groundwater spring potential maps in the Koohrang Watershed, Chaharmahal-e-Bakhtiari Province, Iran, using three machine learning models: boosted regression tree (BRT), classification and regression tree (CART), and random forest (RF). Thirteen hydrological-geological-physiographical (HGP) factors that influence locations of springs were considered in this research. These factors include slope degree, slope aspect, altitude, topographic wetness index (TWI), slope length (LS), plan curvature, profile curvature, distance to rivers, distance to faults, lithology, land use, drainage density, and fault density. Subsequently, groundwater spring potential was modeled and mapped using CART, RF, and BRT algorithms. The predicted results from the three models were validated using the receiver operating characteristics curve (ROC). From 864 springs identified, 605 (≈70 %) locations were used for the spring potential mapping, while the remaining 259 (≈30 %) springs were used for the model validation. The area under the curve (AUC) for the BRT model was calculated as 0.8103 and for CART and RF the AUC were 0.7870 and 0.7119, respectively. Therefore, it was concluded that the BRT model produced the best prediction results while predicting locations of springs followed by CART and RF models, respectively. Geospatially integrated BRT, CART, and RF methods proved to be useful in generating the spring potential map (SPM) with reasonable accuracy. PMID:26687087
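
    The validation step described above, comparing models by the area under the ROC curve on held-out springs, can be sketched as follows. This is a minimal rank-based AUC computation, not the authors' code; the labels and scores are invented and the function name `roc_auc` is an assumption.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive sample
    receives a higher score than a randomly chosen negative sample,
    with ties counted as one half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical validation set: 1 = spring present, 0 = spring absent.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]  # made-up model outputs
auc = roc_auc(labels, scores)
```

    Computing this value once per model on the same validation locations gives figures that can be ranked in the way the abstract ranks BRT, CART, and RF.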

  8. Hydrometeor classification from polarimetric radar measurements: a clustering approach

    NASA Astrophysics Data System (ADS)

    Grazioli, J.; Tuia, D.; Berne, A.

    2015-01-01

    A data-driven approach to the classification of hydrometeors from measurements collected with polarimetric weather radars is proposed. In a first step, the optimal number of hydrometeor classes (nopt) that can be reliably identified from a large set of polarimetric data is determined. This is done by means of an unsupervised clustering technique guided by criteria related both to data similarity and to spatial smoothness of the classified images. In a second step, the nopt clusters are assigned to the appropriate hydrometeor class by means of human interpretation and comparisons with the output of other classification techniques. The main innovation in the proposed method is the unsupervised part: the hydrometeor classes are not defined a priori, but they are learned from data. The approach is applied to data collected by an X-band polarimetric weather radar during two field campaigns (from which about 50 precipitation events are used in the present study). Seven hydrometeor classes (nopt = 7) have been found in the data set, and they have been identified as light rain (LR), rain (RN), heavy rain (HR), melting snow (MS), ice crystals/small aggregates (CR), aggregates (AG), and rimed-ice particles (RI).

  9. Hydrometeor classification from polarimetric radar measurements: a clustering approach

    NASA Astrophysics Data System (ADS)

    Grazioli, J.; Tuia, D.; Berne, A.

    2014-08-01

    A data-driven approach to the classification of hydrometeors from measurements collected with polarimetric weather radars is proposed. In a first step, the optimal number nopt of hydrometeor classes that can be reliably identified from a large set of polarimetric data is determined. This is done by means of an unsupervised clustering technique guided by criteria related both to data similarity and to spatial smoothness of the classified images. In a second step, the nopt clusters are assigned to the appropriate hydrometeor class by means of human interpretation and comparisons with the output of other classification techniques. The main innovation in the proposed method is the unsupervised part: the hydrometeor classes are not defined a-priori, but they are learned from data. The proposed approach is applied to data collected by an X-band polarimetric weather radar during two field campaigns (totalling about 3000 h of precipitation). Seven hydrometeor classes have been found in the data set and they have been associated to drizzle (DZ), light rain (LR), heavy rain (HR), melting snow (MS), ice crystals/small aggregates (CR), aggregates (AG), rimed particles (RI).

  10. ECOLOGICAL RESPONSE SURFACES FOR NORTH AMERICAN BOREAL TREE SPECIES AND THEIR USE IN FOREST CLASSIFICATION

    EPA Science Inventory

    Empirical ecological response surfaces were derived for eight dominant tree species in the boreal forest region of Canada. Stepwise logistic regression was used to model species dominance as a response to five climatic predictor variables. The predictor variables (annual snowfall, ...

  11. An improved classification tree analysis of high cost modules based upon an axiomatic definition of complexity

    NASA Technical Reports Server (NTRS)

    Tian, Jianhui; Porter, Adam; Zelkowitz, Marvin V.

    1992-01-01

    Identification of high cost modules has been viewed as one mechanism to improve overall system reliability, since such modules tend to produce more than their share of problems. A decision tree model was used to identify such modules. In this current paper, a previously developed axiomatic model of program complexity is merged with the previously developed decision tree process for an improvement in the ability to identify such modules. This improvement was tested using data from the NASA Software Engineering Laboratory.

  12. A comparison of feature selection methods for multitemporal tree species classification

    NASA Astrophysics Data System (ADS)

    Pipkins, Kyle; Förster, Michael; Clasen, Anne; Schmidt, Tobias; Kleinschmit, Birgit

    2014-10-01

    The problem of feature selection is a significant one in classification problems, where the addition of too many features to the classification fails to lead to significant increases in classification accuracy. This problem is especially significant within the context of multitemporal remote sensing classifications, where the costs and efforts associated with the acquisition of additional imagery can be extensive. It would thus be beneficial to identify the most important seasons for acquiring imagery for specific land cover types. This study uses a phenologically-adjusted 21 date RapidEye time-series in order to evaluate two methods of feature selection. The two methods compared in this study are a genetic algorithm (GA) and a semi-exhaustive method (EXH), both of which compare permutations of sequential date and band combinations. These methods are employed using a seven class support vector machine classification on a Normalized Difference Vegetation Index (NDVI)-transformed dataset. Overall accuracy (OAA) is used as the performance metric, and OAA significance is assessed using the McNemar test. The results from the feature selection methods are compared on the basis of phenological seasons selected across all iterations and the ideal number of combinations, based on the ratio of better performing classifications to all other classifications. The results suggest that the GA has a moderate but insignificant correlation when compared with the EXH for identifying ideal phenological seasons (overall Spearman's ρ= 0.60, p = 0.13), but is comparable when considering the number of seasons and image combinations.
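
    The McNemar test mentioned above compares two classifiers evaluated on the same samples using only the discordant counts. The sketch below uses the continuity-corrected chi-square form, which is one common variant and is assumed here since the abstract does not say which was used; the counts are invented.

```python
import math


def mcnemar(b, c):
    """McNemar test with continuity correction.

    b: samples classified correctly by A but incorrectly by B.
    c: samples classified incorrectly by A but correctly by B.
    Returns (chi-square statistic, p-value) for 1 degree of freedom.
    """
    if b < 0 or c < 0:
        raise ValueError("counts must be non-negative")
    if b + c == 0:
        return 0.0, 1.0  # no discordant pairs, no evidence of a difference
    statistic = (abs(b - c) - 1) ** 2 / (b + c)
    # Survival function of chi-square with 1 d.o.f.: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical discordant counts between two classifications.
stat, p = mcnemar(15, 5)
```

    Samples that both classifications get right or both get wrong do not enter the statistic, which is why only `b` and `c` are needed.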

  13. A comparison of ARA and DNA data for microbial source tracking based on source-classification models developed using classification trees.

    PubMed

    Price, Bertram; Venso, Elichia; Frana, Mark; Greenberg, Joshua; Ware, Adam

    2007-08-01

    The literature on microbial source tracking (MST) suggests that DNA analysis of fecal samples leads to more reliable determinations of bacterial sources of surface water contamination than antibiotic resistance analysis (ARA). Our goal is to determine whether the increased reliability, if any, in library-based MST developed with DNA data is sufficient to justify its higher cost, where the bacteria source predictions are used in TMDL surface water management programs. We describe an application of classification trees for MST applied to ARA and DNA data from samples collected in the Potomac River Watershed in Maryland. Conclusions concerning the comparison of ARA and DNA data, although preliminary at the current time, suggest that the added cost of obtaining DNA data in comparison to the cost of ARA data may not be justified, where MST is applied in TMDL surface water management programs. PMID:17599384

  14. Rule based fuzzy logic approach for classification of fibromyalgia syndrome.

    PubMed

    Arslan, Evren; Yildiz, Sedat; Albayrak, Yalcin; Koklukaya, Etem

    2016-06-01

    Fibromyalgia syndrome (FMS) is a chronic muscle and skeletal system disease observed generally in women, manifesting itself with widespread pain and impairing the individual's quality of life. FMS diagnosis is made based on the American College of Rheumatology (ACR) criteria. However, the applicability and sufficiency of the ACR criteria have recently come under debate. In this context, several evaluation methods, including clinical evaluation methods, were proposed by researchers. Accordingly, the ACR had to update its criteria, announced back in 1990, in 2010 and 2011. The proposed rule based fuzzy logic method aims to evaluate FMS from a different angle as well. This method contains a rule base derived from the 1990 ACR criteria and the individual experiences of specialists. The study was conducted using the data collected from 60 inpatients and 30 healthy volunteers. Several tests and physical examinations were administered to the participants. The fuzzy logic rule base was structured using the parameters of tender point count, chronic widespread pain period, pain severity, fatigue severity and sleep disturbance level, which were deemed important in FMS diagnosis. It has been observed that the fuzzy predictor was generally 95.56 % consistent with at least one of the specialists, who were not creators of the fuzzy rule base. Thus, in diagnosis classification where the severity of FMS was classified as well, consistent findings were obtained from the comparison of interpretations and experiences of specialists and the fuzzy logic approach. The study proposes a rule base, which could eliminate the shortcomings of 1990 ACR criteria during the FMS evaluation process. Furthermore, the proposed method presents a classification on the severity of the disease, which was not available with the ACR criteria. The study was not limited to only disease classification but at the same time the probability of occurrence and severity was classified. In addition, those who were not suffering from FMS were

  15. Target-classification approach applied to active UXO sites

    NASA Astrophysics Data System (ADS)

    Shubitidze, F.; Fernández, J. P.; Shamatava, Irma; Barrowes, B. E.; O'Neill, K.

    2013-06-01

    This study is designed to illustrate the discrimination performance at two UXO active sites (Oklahoma's Fort Sill and the Massachusetts Military Reservation) of a set of advanced electromagnetic induction (EMI) inversion/discrimination models which include the orthonormalized volume magnetic source (ONVMS), joint diagonalization (JD), and differential evolution (DE) approaches and whose power and flexibility greatly exceed those of the simple dipole model. The Fort Sill site is highly contaminated by a mix of the following types of munitions: 37-mm target practice tracers, 60-mm illumination mortars, 75-mm and 4.5'' projectiles, 3.5'', 2.36'', and LAAW rockets, antitank mine fuzes with and without hex nuts, practice MK2 and M67 grenades, 2.5'' ballistic windshields, M2A1-mines with/without bases, M19-14 time fuzes, and 40-mm practice grenades with/without cartridges. The MMR site contains targets of yet different sizes. In this work we apply our models to EMI data collected using the MetalMapper (MM) and 2 × 2 TEMTADS sensors. The data for each anomaly are inverted to extract estimates of the extrinsic and intrinsic parameters associated with each buried target. (The latter include the total volume magnetic source or NVMS, which relates to size, shape, and material properties; the former include location, depth, and orientation.) The estimated intrinsic parameters are then used for classification performed via library matching and the use of statistical classification algorithms; this process yielded prioritized dig-lists that were submitted to the Institute for Defense Analyses (IDA) for independent scoring. The models' classification performance is illustrated and assessed based on these independent evaluations.

  16. A regional classification scheme for estimating reference water quality in streams using land-use-adjusted spatial regression-tree analysis

    USGS Publications Warehouse

    Robertson, D.M.; Saad, D.A.; Heisey, D.M.

    2006-01-01

    Various approaches are used to subdivide large areas into regions containing streams that have similar reference or background water quality and that respond similarly to different factors. For many applications, such as establishing reference conditions, it is preferable to use physical characteristics that are not affected by human activities to delineate these regions. However, most approaches, such as ecoregion classifications, rely on land use to delineate regions or have difficulties compensating for the effects of land use. Land use not only directly affects water quality, but it is often correlated with the factors used to define the regions. In this article, we describe modifications to SPARTA (spatial regression-tree analysis), a relatively new approach applied to water-quality and environmental characteristic data to delineate zones with similar factors affecting water quality. In this modified approach, land-use-adjusted (residualized) water quality and environmental characteristics are computed for each site. Regression-tree analysis is applied to the residualized data to determine the most statistically important environmental characteristics describing the distribution of a specific water-quality constituent. Geographic information for small basins throughout the study area is then used to subdivide the area into relatively homogeneous environmental water-quality zones. For each zone, commonly used approaches are subsequently used to define its reference water quality and how its water quality responds to changes in land use. SPARTA is used to delineate zones of similar reference concentrations of total phosphorus and suspended sediment throughout the upper Midwestern part of the United States. © 2006 Springer Science+Business Media, Inc.

  17. A Fault Tree Approach to Needs Assessment -- An Overview.

    ERIC Educational Resources Information Center

    Stephens, Kent G.

    A "failsafe" technology is presented based on a new unified theory of needs assessment. Basically the paper discusses fault tree analysis as a technique for enhancing the probability of success in any system by analyzing the most likely modes of failure that could occur and then suggesting high priority avoidance strategies for those failure…

  18. Hierarchical Multinomial Processing Tree Models: A Latent-Class Approach

    ERIC Educational Resources Information Center

    Klauer, Karl Christoph

    2006-01-01

    Multinomial processing tree models are widely used in many areas of psychology. Their application relies on the assumption of parameter homogeneity, that is, on the assumption that participants do not differ in their parameter values. Tests for parameter homogeneity are proposed that can be routinely used as part of multinomial model analyses to…

  19. A Fault Tree Approach to Analysis of Organizational Communication Systems.

    ERIC Educational Resources Information Center

    Witkin, Belle Ruth; Stephens, Kent G.

    Fault Tree Analysis (FTA) is a method of examining communication in an organization by focusing on: (1) the complex interrelationships in human systems, particularly in communication systems; (2) interactions across subsystems and system boundaries; and (3) the need to select and "prioritize" channels which will eliminate noise in the system and…

  20. Hierarchical Multinomial Processing Tree Models: A Latent-Trait Approach

    ERIC Educational Resources Information Center

    Klauer, Karl Christoph

    2010-01-01

    Multinomial processing tree models are widely used in many areas of psychology. A hierarchical extension of the model class is proposed, using a multivariate normal distribution of person-level parameters with the mean and covariance matrix to be estimated from the data. The hierarchical model allows one to take variability between persons into…

  1. Comparison of four approaches to a rock facies classification problem

    USGS Publications Warehouse

    Dubois, M.K.; Bohling, G.C.; Chakrabarti, S.

    2007-01-01

    In this study, seven classifiers based on four different approaches were tested in a rock facies classification problem: classical parametric methods using Bayes' rule, and non-parametric methods using fuzzy logic, k-nearest neighbor, and feed forward-back propagating artificial neural network. Determining the most effective classifier for geologic facies prediction in wells without cores in the Panoma gas field, in Southwest Kansas, was the objective. Study data include 3600 samples with known rock facies class (from core) with each sample having either four or five measured properties (wire-line log curves), and two derived geologic properties (geologic constraining variables). The sample set was divided into two subsets, one for training and one for testing the ability of the trained classifier to correctly assign classes. Artificial neural networks clearly outperformed all other classifiers and are effective tools for this particular classification problem. Classical parametric models were inadequate due to the nature of the predictor variables (high dimensional and not linearly correlated), and feature space of the classes (overlapping). The other non-parametric methods tested, k-nearest neighbor and fuzzy logic, would need considerable improvement to match the neural network effectiveness, but further work, possibly combining certain aspects of the three non-parametric methods, may be justified. © 2006 Elsevier Ltd. All rights reserved.

  2. An Empirical Study of Different Approaches for Protein Classification

    PubMed Central

    Nanni, Loris

    2014-01-01

    Many domains would benefit from reliable and efficient systems for automatic protein classification. An area of particular interest in recent studies on automatic protein classification is the exploration of new methods for extracting features from a protein that work well for specific problems. These methods, however, are not generalizable and have proven useful in only a few domains. Our goal is to evaluate several feature extraction approaches for representing proteins by testing them across multiple datasets. Different types of protein representations are evaluated: those starting from the position specific scoring matrix of the proteins (PSSM), those derived from the amino-acid sequence, two matrix representations, and features taken from the 3D tertiary structure of the protein. We also test new variants of proteins descriptors. We develop our system experimentally by comparing and combining different descriptors taken from the protein representations. Each descriptor is used to train a separate support vector machine (SVM), and the results are combined by sum rule. Some stand-alone descriptors work well on some datasets but not on others. Through fusion, the different descriptors provide a performance that works well across all tested datasets, in some cases performing better than the state-of-the-art. PMID:25028675
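
    The sum-rule fusion mentioned above can be sketched in a few lines. The sketch assumes each descriptor's classifier already outputs comparable per-class scores (for example normalized to sum to one), which the abstract does not specify; the function name `sum_rule` and all scores are invented for illustration.

```python
def sum_rule(score_sets):
    """Fuse per-class scores from several classifiers by summation.

    score_sets: one list of per-class scores per classifier,
    all for the same sample and in the same class order.
    Returns (index of the winning class, fused score list).
    """
    if not score_sets:
        raise ValueError("need scores from at least one classifier")
    n_classes = len(score_sets[0])
    fused = [0.0] * n_classes
    for scores in score_sets:
        if len(scores) != n_classes:
            raise ValueError("all classifiers must score the same classes")
        for k, s in enumerate(scores):
            fused[k] += s
    best = max(range(n_classes), key=lambda k: fused[k])
    return best, fused


# Hypothetical scores from three descriptor-specific classifiers, two classes.
best_class, fused_scores = sum_rule([
    [0.6, 0.4],  # classifier trained on descriptor 1
    [0.3, 0.7],  # classifier trained on descriptor 2
    [0.2, 0.8],  # classifier trained on descriptor 3
])
```

    In this made-up case the first classifier alone would pick class 0, while the fused scores favor class 1, which shows how fusion can override a single weak descriptor.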

  3. Multinomial tree models for assessing the status of the reference in studies of the accuracy of tools for binary classification

    PubMed Central

    Botella, Juan; Huang, Huiling; Suero, Manuel

    2013-01-01

    Studies that evaluate the accuracy of binary classification tools are needed. Such studies provide 2 × 2 cross-classifications of test outcomes and the categories according to an unquestionable reference (or gold standard). However, sometimes a suboptimal reliability reference is employed. Several methods have been proposed to deal with studies where the observations are cross-classified with an imperfect reference. These methods require that the status of the reference, as a gold standard or as an imperfect reference, is known. In this paper a procedure for determining whether it is appropriate to maintain the assumption that the reference is a gold standard or an imperfect reference, is proposed. This procedure fits two nested multinomial tree models, and assesses and compares their absolute and incremental fit. Its implementation requires the availability of the results of several independent studies. These should be carried out using similar designs to provide frequencies of cross-classification between a test and the reference under investigation. The procedure is applied in two examples with real data. PMID:24106484

  4. A Visual Analytics Approach for Correlation, Classification, and Regression Analysis

    SciTech Connect

    Steed, Chad A; SwanII, J. Edward; Fitzpatrick, Patrick J.; Jankun-Kelly, T.J.

    2012-02-01

    New approaches that combine the strengths of humans and machines are necessary to equip analysts with the proper tools for exploring today's increasingly complex, multivariate data sets. In this paper, a novel visual data mining framework, called the Multidimensional Data eXplorer (MDX), is described that addresses the challenges of today's data by combining automated statistical analytics with a highly interactive parallel coordinates based canvas. In addition to several intuitive interaction capabilities, this framework offers a rich set of graphical statistical indicators, interactive regression analysis, visual correlation mining, automated axis arrangements and filtering, and data classification techniques. The current work provides a detailed description of the system as well as a discussion of key design aspects and critical feedback from domain experts.

  5. An Approach for Automatic Classification of Radiology Reports in Spanish.

    PubMed

    Cotik, Viviana; Filippo, Darío; Castaño, José

    2015-01-01

    Automatic detection of relevant terms in medical reports is useful for educational purposes and for clinical research. Natural language processing (NLP) techniques can be applied in order to identify them. In this work we present an approach to classify radiology reports written in Spanish into two sets: the ones that indicate pathological findings and the ones that do not. In addition, the entities corresponding to pathological findings are identified in the reports. We use RadLex, a lexicon of English radiology terms, and NLP techniques to identify the occurrence of pathological findings. Reports are classified using a simple algorithm based on the presence of pathological findings, negation and hedge terms. The implemented algorithms were tested with a test set of 248 reports annotated by an expert, obtaining a best result of 0.72 F1 measure. The output of the classification task can be used to look for specific occurrences of pathological findings. PMID:26262128
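
A rule-based classifier of the kind this abstract describes, together with the F1 measure used to evaluate it, might look like the sketch below. The term lists are hypothetical English stand-ins (the paper works on Spanish reports with RadLex-derived terms), and the rule that a negation or hedge term blocks every finding later in the same sentence is an assumption made here for illustration, not the authors' algorithm.

```python
import re
from typing import Sequence

# Hypothetical term lists, for illustration only.
FINDING_TERMS = {"nodule", "fracture", "effusion"}
NEGATION_TERMS = {"no", "without"}
HEDGE_TERMS = {"possible", "probable"}


def has_asserted_finding(sentence: str) -> bool:
    """True if a finding term occurs with no negation or hedge term before it."""
    blocked = False
    for token in re.findall(r"[a-z]+", sentence.lower()):
        if token in NEGATION_TERMS or token in HEDGE_TERMS:
            blocked = True  # assumed scope: rest of the sentence
        elif token in FINDING_TERMS and not blocked:
            return True
    return False


def classify_report(report: str) -> str:
    """Label a report 'pathological' if any sentence asserts a finding."""
    sentences = [s for s in report.split(".") if s.strip()]
    if any(has_asserted_finding(s) for s in sentences):
        return "pathological"
    return "normal"


def f1_score(gold: Sequence[str], predicted: Sequence[str],
             positive: str = "pathological") -> float:
    """F1 = 2TP / (2TP + FP + FN) for the positive class."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


print(classify_report("No fracture seen. Small nodule in left lobe."))
```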

  6. Robust Orbit Determination and Classification: A Learning Theoretic Approach

    NASA Astrophysics Data System (ADS)

    Sharma, S.; Cutler, J. W.

    2015-11-01

    Orbit determination involves estimation of a non-linear mapping from feature vectors associated with the position of the spacecraft to its orbital parameters. The de facto standard in orbit determination in real-world scenarios for spacecraft has been linearized estimators such as the extended Kalman filter. Such an estimator, while very accurate and convergent over its linear region, is hard to generalize over arbitrary gravitational potentials and diverse sets of measurements. It is also challenging to perform exact mathematical characterizations of the Kalman filter performance over such general systems. Here we present a new approach to orbit determination as a learning problem involving distribution regression and, also, for the multiple-spacecraft scenario, a transfer learning system for classification of feature vectors associated with spacecraft, and provide some associated analysis of such systems.

  7. A Visual Analytics Approach for Correlation, Classification, and Regression Analysis

    SciTech Connect

    Steed, Chad A; SwanII, J. Edward; Fitzpatrick, Patrick J.; Jankun-Kelly, T.J.

    2013-01-01

    New approaches that combine the strengths of humans and machines are necessary to equip analysts with the proper tools for exploring today's increasingly complex, multivariate data sets. In this paper, a visual data mining framework, called the Multidimensional Data eXplorer (MDX), is described that addresses the challenges of today's data by combining automated statistical analytics with a highly interactive parallel coordinates based canvas. In addition to several intuitive interaction capabilities, this framework offers a rich set of graphical statistical indicators, interactive regression analysis, visual correlation mining, automated axis arrangements and filtering, and data classification techniques. This chapter provides a detailed description of the system as well as a discussion of key design aspects and critical feedback from domain experts.

  8. A Novel Approach on Designing Augmented Fuzzy Cognitive Maps Using Fuzzified Decision Trees

    NASA Astrophysics Data System (ADS)

    Papageorgiou, Elpiniki I.

    This paper proposes a new methodology for designing Fuzzy Cognitive Maps using crisp decision trees that have been fuzzified. A fuzzy cognitive map is a knowledge-based technique that works as an artificial cognitive network inheriting the main aspects of cognitive maps and artificial neural networks. Decision trees, on the other hand, are well-known intelligent techniques that extract rules from both symbolic and numeric data. Fuzzy theoretical techniques are used to fuzzify crisp decision trees in order to soften the decision boundaries at decision nodes inherent in this type of tree. Comparisons between crisp decision trees and the fuzzified decision trees suggest that the latter are significantly more robust and produce more balanced decision making. The approach proposed in this paper could incorporate any type of fuzzy decision tree. Through this methodology, new linguistic weights were determined in the FCM model, thus producing an augmented FCM tool. The framework consists of a new fuzzy algorithm that generates linguistic weights describing the cause-effect relationships among the concepts of the FCM model from induced fuzzy decision trees.
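
The idea of softening a crisp decision boundary can be sketched as follows. The abstract does not say which membership functions the author uses, so the sigmoid membership, the `softness` parameter, and all function names below are assumptions chosen only to show the contrast between a crisp and a fuzzified split.

```python
import math


def crisp_split(x: float, threshold: float) -> float:
    """Crisp node: full membership (1.0) in the left branch iff x <= threshold."""
    return 1.0 if x <= threshold else 0.0


def fuzzy_split(x: float, threshold: float, softness: float) -> float:
    """Assumed sigmoid membership in the left branch; exactly 0.5 at the threshold."""
    return 1.0 / (1.0 + math.exp((x - threshold) / softness))


def fuzzy_node_output(x: float, threshold: float, left_value: float,
                      right_value: float, softness: float) -> float:
    """Blend the two leaf values by their branch memberships."""
    left = fuzzy_split(x, threshold, softness)
    return left * left_value + (1.0 - left) * right_value


# Near the threshold the crisp node jumps from one leaf to the other,
# while the fuzzified node changes gradually.
for x in (4.9, 5.0, 5.1):
    print(x, crisp_split(x, 5.0), round(fuzzy_node_output(x, 5.0, 0.0, 1.0, 0.5), 3))
```

A smaller `softness` makes the fuzzified node behave more like the crisp one; a larger value spreads the transition over a wider range of inputs.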

  9. Increased tree establishment in Lithuanian peat bogs--insights from field and remotely sensed approaches.

    PubMed

    Edvardsson, Johannes; Šimanauskienė, Rasa; Taminskas, Julius; Baužienė, Ieva; Stoffel, Markus

    2015-02-01

    Over the past century an ongoing establishment of Scots pine (Pinus sylvestris L.), sometimes at accelerating rates, is noted at three studied Lithuanian peat bogs, namely Kerėplis, Rėkyva and Aukštumala, all representing different degrees of tree coverage and geographic settings. Present establishment rates seem to depend on tree density on the bog surface and are most significant at sparsely covered sites where about three-fourths of the trees have established since the mid-1990s, whereas the initial establishment in general was during the early to mid-19th century. Three methods were used to detect, compare and describe tree establishment: (1) tree counts in small plots, (2) dendrochronological dating of bog pine trees, and (3) interpretation of aerial photographs and historical maps of the study areas. In combination, the different approaches provide complementary information and also compensate for each other's drawbacks. Tree counts in plots provided a reasonable overview of age class distributions and enabled capturing of the most recently established trees with ages less than 50 years. The dendrochronological analysis yielded accurate tree ages and a good temporal resolution of long-term changes. Tree establishment and spread interpreted from aerial photographs and historical maps provided a good overview of tree spread and total affected area. It also helped to verify the results obtained with the other methods and to upscale the findings to the entire peat bogs. The ongoing spread of trees in predominantly undisturbed peat bogs is related to warmer and/or drier climatic conditions, and to a minor degree to land-use changes. Our results therefore provide valuable insights into vegetation changes in peat bogs, also with respect to bog response to ongoing and future climatic changes. PMID:25310886

  10. Addition of wsp sequences to the Wolbachia phylogenetic tree and stability of the classification.

    PubMed

    Pintureau, B; Chaudier, S; Lassablière, F; Charles, H; Grenier, S

    2000-10-01

    Wolbachia are symbiotic bacteria altering reproductive characters of numerous arthropods. Their most recent phylogeny and classification are based on sequences of the wsp gene. We sequenced wsp gene from six Wolbachia strains infecting six Trichogramma species that live as egg parasitoids on many insects. This allows us to test the effect of the addition of sequences on the Wolbachia phylogeny and to check the classification of Wolbachia infecting Trichogramma. The six Wolbachia studied are classified in the B supergroup. They confirm the monophyletic structure of the B Wolbachia in Trichogramma but introduce small differences in the Wolbachia classification. Modifications include the definition of a new group, Sem, for Wolbachia of T. semblidis and the merging of the two closely related groups, Sib and Kay. Specific primers were determined and tested for the Sem group. PMID:11040288

  11. Spectral difference analysis and airborne imaging classification for citrus greening infected trees

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Citrus greening, also called Huanglongbing (HLB), became a devastating disease spread through citrus groves in Florida, since it was first found in 2005. Multispectral (MS) and hyperspectral (HS) airborne images of citrus groves in Florida were acquired to detect citrus greening infected trees in 20...

  12. Identification, classification and differential expression of oleosin genes in tung tree (Vernicia fordii)

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Triacylglycerols (TAG) are the major molecules of energy storage in eukaryotes. TAG are packed in subcellular structures called oil bodies or lipid droplets. Oleosins (OLE) are the major proteins in plant oil bodies. Multiple isoforms of OLE are present in plants such as tung tree (Vernicia fordii),...

  13. Chemical classification of cattle. 2. Phylogenetic tree and specific status of the Zebu.

    PubMed

    Manwell, C; Baker, C M

    1980-01-01

    Phylogenetic trees for the ten major breed groups of cattle were constructed by Farris's (1972) maximum parsimony method, or Fitch & Margoliash's (1967) method, which averages out the deviation over the entire assemblage. Both techniques yield essentially identical trees. The phylogenetic tree for the ten major cattle breed groups can be superimposed on a map of Europe and western Asia, the root of the tree being close to the 'fertile crescent' in Asia Minor, believed to be a primary centre of bovine domestication. For some but not all protein variants there is a cline of gene frequencies as one proceeds from the British Isles and northwest Europe towards southeast Europe and Asia Minor, with the most extreme gene frequencies in the Zebu breeds of India. It is not clear to what extent the observed clines are primary or secondary, i.e., consequent to the initial migrations of cattle towards the end of the Pleistocene or consequent to the many migrations of man with his domesticated cattle. Such clines as exist are not in themselves sufficient to prove either selection versus genetic drift or to establish taxonomic ranking. Contrary to some suggestions in the literature, the biochemical evidence supports Linnaeus's original conclusions: Bos taurus and Bos indicus are distinct species. PMID:7458002

  14. Classification of tissue pathological state using optical multiparametric monitoring approach

    NASA Astrophysics Data System (ADS)

    Kutai-Asis, Hofit; Kanter, Ido; Barbiro-Michaely, Efrat; Mayevsky, Avraham

    2008-12-01

    In order to diagnose the development of pathophysiological events in the brain, the evaluation of multiparametric data in real time is highly important. The current work presents a new approach that uses cluster analysis to evaluate the relationship between mitochondrial NADH, tissue blood flow and hemoglobin oxygenation under various pathophysiological conditions. The Time-Sharing Fluorometer Reflectometer (TSFR) was used to monitor mitochondrial NADH, oxyhemoglobin (HbO2), and microcirculatory blood flow simultaneously at the same location on the rat or gerbil cortex. This allows a more accurate assessment of brain functions in real time and a better understanding of the relationship between tissue oxygen supply and demand. Moreover, in some pathophysiological cases, monitoring of only one or two parameters in the cerebral cortex may be misleading. The classification was based on the data collected in experiments where different pathophysiological conditions, such as anoxia, ischemia, and SD, were used. These three parameters were plotted in three dimensions. The clustering approach results showed similar patterns in each type of treatment. The distribution of data points in space was used to define the spatial behavior of each treatment in order to produce an index for identifying different treatments. In conclusion, our present study offers a new approach to data analysis that can serve as a reliable tool for tissue pathophysiology.

  15. Identification, Classification and Differential Expression of Oleosin Genes in Tung Tree (Vernicia fordii)

    PubMed Central

    Cao, Heping; Zhang, Lin; Tan, Xiaofeng; Long, Hongxu; Shockey, Jay M.

    2014-01-01

    Triacylglycerols (TAG) are the major molecules of energy storage in eukaryotes. TAG are packed in subcellular structures called oil bodies or lipid droplets. Oleosins (OLE) are the major proteins in plant oil bodies. Multiple isoforms of OLE are present in plants such as tung tree (Vernicia fordii), whose seeds are rich in novel TAG with a wide range of industrial applications. The objectives of this study were to identify OLE genes, classify OLE proteins and analyze OLE gene expression in tung trees. We identified five tung tree OLE genes coding for small hydrophobic proteins. Genome-wide phylogenetic analysis and multiple sequence alignment demonstrated that the five tung OLE genes represented the five OLE subfamilies and all contained the “proline knot” motif (PX5SPX3P) shared among 65 OLE from 19 tree species, including the sequenced genomes of Prunus persica (peach), Populus trichocarpa (poplar), Ricinus communis (castor bean), Theobroma cacao (cacao) and Vitis vinifera (grapevine). Tung OLE1, OLE2 and OLE3 belong to the S type and OLE4 and OLE5 belong to the SM type of Arabidopsis OLE. TaqMan and SYBR Green qPCR methods were used to study the differential expression of OLE genes in tung tree tissues. Expression results demonstrated that 1) All five OLE genes were expressed in developing tung seeds, leaves and flowers; 2) OLE mRNA levels were much higher in seeds than leaves or flowers; 3) OLE1, OLE2 and OLE3 genes were expressed in tung seeds at much higher levels than OLE4 and OLE5 genes; 4) OLE mRNA levels rapidly increased during seed development; and 5) OLE gene expression was well-coordinated with tung oil accumulation in the seeds. These results suggest that tung OLE genes 1–3 probably play major roles in tung oil accumulation and/or oil body development. Therefore, they might be preferred targets for tung oil engineering in transgenic plants. PMID:24516650

  16. Impacts of age-dependent tree sensitivity and dating approaches on dendrogeomorphic time series of landslides

    NASA Astrophysics Data System (ADS)

    Šilhán, Karel; Stoffel, Markus

    2015-05-01

    Different approaches and thresholds have been utilized in the past to date landslides with growth ring series of disturbed trees. Past work was mostly based on conifer species because of their well-defined ring boundaries and the easy identification of compression wood after stem tilting. More recently, work has been expanded to include broad-leaved trees, which are thought to produce less and less evident reactions after landsliding. This contribution reviews recent progress made in dendrogeomorphic landslide analysis and introduces a new approach in which landslides are dated via ring eccentricity formed after tilting. We compare results of this new and the more conventional approaches. In addition, the paper also addresses tree sensitivity to landslide disturbance as a function of tree age and trunk diameter using 119 common beech (Fagus sylvatica L.) and 39 Crimean pine (Pinus nigra ssp. pallasiana) trees growing on two landslide bodies. The landslide events reconstructed with the classical approach (reaction wood) also appear as events in the eccentricity analysis, but the inclusion of eccentricity clearly allowed for more (162%) landslides to be detected in the tree-ring series. With respect to tree sensitivity, conifers and broad-leaved trees show the strongest reactions to landslides at ages comprised between 40 and 60 years, with a second phase of increased sensitivity in P. nigra at ages of ca. 120-130 years. These phases of highest sensitivities correspond with trunk diameters at breast height of 6-8 and 18-22 cm, respectively (P. nigra). This study thus calls for the inclusion of eccentricity analyses in future landslide reconstructions as well as for the selection of trees belonging to different age and diameter classes to allow for a well-balanced and more complete reconstruction of past events.

  17. Narrowing historical uncertainty: probabilistic classification of ambiguously identified tree species in historical forest survey data

    USGS Publications Warehouse

    Mladenoff, D.J.; Dahir, S.E.; Nordheim, E.V.; Schulte, L.A.; Guntenspergen, G.R.

    2002-01-01

    Historical data have increasingly become appreciated for insight into the past conditions of ecosystems. Uses of such data include assessing the extent of ecosystem change; deriving ecological baselines for management, restoration, and modeling; and assessing the importance of past conditions on the composition and function of current systems. One historical data set of this type is the Public Land Survey (PLS) of the United States General Land Office, which contains data on multiple tree species, sizes, and distances recorded at each survey point, located at half-mile (0.8 km) intervals on a 1-mi (1.6 km) grid. This survey method was begun in the 1790s on US federal lands extending westward from Ohio. Thus, the data have the potential of providing a view of much of the US landscape from the mid-1800s, and they have been used extensively for this purpose. However, historical data sources, such as those describing the species composition of forests, can often be limited in the detail recorded and the reliability of the data, since the information was often not originally recorded for ecological purposes. Forest trees are sometimes recorded ambiguously, using generic or obscure common names. For the PLS data of northern Wisconsin, USA, we developed a method to classify ambiguously identified tree species using logistic regression analysis, using data on trees that were clearly identified to species and a set of independent predictor variables to build the models. The models were first created on partial data sets for each species and then tested for fit against the remaining data. Validations were conducted using repeated, random subsets of the data. Model prediction accuracy ranged from 81% to 96% in differentiating congeneric species among oak, pine, ash, maple, birch, and elm. Major predictor variables were tree size, associated species, landscape classes indicative of soil type, and spatial location within the study region. Results help to clarify ambiguities

  18. Hydrometeor classification from polarimetric radar measurements: a clustering approach

    NASA Astrophysics Data System (ADS)

    Grazioli, Jacopo; Tuia, Devis; Berne, Alexis

    2015-04-01

    Hydrometeor classification is the process that aims at identifying the dominant type of hydrometeor (e.g. rain, hail, snow aggregates, graupel, ice crystals) in a domain covered by a polarimetric weather radar during precipitation. The techniques documented in the literature are mostly based on numerical simulations and fuzzy logic. This involves the arbitrary selection of a set of hydrometeor classes and the numerical simulation of theoretical radar observations associated with each class. The information derived from the simulation is then applied to actual radar measurements by means of fuzzy logic input-output association. This approach has some limitations: the number and type of the hydrometeor categories undergoing identification are selected arbitrarily and the scattering simulations are based on constraining assumptions, especially in the case of solid hydrometeors. Furthermore, in the presence of noise and uncertainties, it is not guaranteed that the selected hydrometeor classes can be effectively identified in actual observations. In the present work we propose a different starting point for the classification task, which is based on observations instead of numerical simulations. We provide criteria for the selection of the number of hydrometeor classes that can be identified, by looking at how polarimetric observations collected over different precipitation events form clusters in the multi-dimensional space of the polarimetric variables. Two datasets, collected by an X-band weather radar, are employed in the study. The first dataset covers mountainous weather conditions (Swiss Alps), while the second includes Mediterranean orographic precipitation events collected during the special observation period (SOP) 2012 of the HyMeX campaign. We employ an unsupervised hierarchical clustering method to group the observations into clusters and we introduce a spatial smoothness constraint for the groups, assuming that the hydrometeor type changes smoothly in space

  19. Identification of sexually abused female adolescents at risk for suicidal ideations: a classification and regression tree analysis.

    PubMed

    Brabant, Marie-Eve; Hébert, Martine; Chagnon, François

    2013-01-01

    This study explored the clinical profiles of 77 female teenager survivors of sexual abuse and examined the association of abuse-related and personal variables with suicidal ideations. Analyses revealed that 64% of participants experienced suicidal ideations. Findings from classification and regression tree analysis indicated that depression, posttraumatic stress symptoms, and hopelessness discriminated profiles of suicidal and nonsuicidal survivors. The elevated prevalence of suicidal ideations among adolescent survivors of sexual abuse underscores the importance of investigating the presence of suicidal ideations in sexual abuse survivors. However, suicidal ideation is not the sole variable that needs to be investigated; depression, hopelessness and posttraumatic stress symptoms are also related to suicidal ideations in survivors and could therefore guide interventions. PMID:23428149

  20. Simple, novel approaches to investigating biophysical characteristics of individual mid-latitude deciduous trees

    NASA Astrophysics Data System (ADS)

    Kalibo, Humphrey Wafula

    Forests play a critical role in the functioning of the biosphere and support the livelihoods of millions of people. With increasing anthropogenic influences and looming effects associated with climatic variability, it is crucial that the research community and policy makers take advantage of the capabilities afforded by remote sensing technologies to generate reliable and timely data to support management decisions. Set in the species-rich woodland of Prairie Pines in Lincoln, Nebraska, this research addresses three distinct objectives that could contribute towards forest research and management. First, three supervised classification algorithms were applied to two hyperspectral AISA-Eagle images to evaluate their capability for spectrally identifying selected tree species. The findings show that each algorithm had low to moderate overall classification accuracies (46%-62%), probably due to mixed pixels resulting from pronounced heterogeneity in tree diversity; however, the algorithms could be a rapid means to assess species composition. The second objective is an investigation into how twelve individual morphologically different deciduous trees transmit incoming photosynthetically active radiation (PAR) over the course of the growing season. It was found that more diffuse light was transmitted than direct light, dictated by seasonality, vegetation fraction (VF), and leaf size. In the final objective, VF derived from upward-looking hemispherical photographs of twelve deciduous tree canopies and eight spectral vegetation indices (VIs) calculated from in situ single leaf-level reflectance data were used to investigate whether the VIs could mimic and estimate the temporal patterns of measured VF of each tree over the growing season. The findings show that all the indices accurately depicted the temporal patterns of the photo-derived VF. 
    NDVI and SAVI had the highest correlations (R² > 0.7; RMSE 0.7; E > 0.8) and closely mirrored the temporal patterns of VF for nine

  1. The Iqmulus Urban Showcase: Automatic Tree Classification and Identification in Huge Mobile Mapping Point Clouds

    NASA Astrophysics Data System (ADS)

    Böhm, J.; Bredif, M.; Gierlinger, T.; Krämer, M.; Lindenberg, R.; Liu, K.; Michel, F.; Sirmacek, B.

    2016-06-01

    Current 3D data capturing, as implemented on, for example, airborne or mobile laser scanning systems, is able to efficiently sample the surface of a city by billions of unselective points during one working day. What is still difficult is to extract and visualize meaningful information hidden in these point clouds with the same efficiency. This is where the FP7 IQmulus project enters the scene. IQmulus is an interactive facility for processing and visualizing big spatial data. In this study the potential of IQmulus is demonstrated on a laser mobile mapping point cloud of 1 billion points sampling ~ 10 km of street environment in Toulouse, France. After the data is uploaded to the IQmulus Hadoop Distributed File System, a workflow is defined by the user consisting of retiling the data followed by a PCA driven local dimensionality analysis, which runs efficiently on the IQmulus cloud facility using a Spark implementation. Points scattering in 3 directions are clustered in the tree class, and are separated next into individual trees. Five hours of processing on the 12-node computing cluster results in the automatic identification of 4000+ urban trees. Visualization of the results in the IQmulus fat client helps users to appreciate the results, and developers to identify remaining flaws in the processing workflow.

  2. Tree carbon allocation dynamics determined using a carbon mass balance approach.

    PubMed

    Klein, Tamir; Hoch, Günter

    2015-01-01

    Tree internal carbon (C) fluxes between compound and compartment pools are difficult to measure directly. Here we used a C mass balance approach to decipher these fluxes and provide a full description of tree C allocation dynamics. We collected independent measurements of tree C sinks, source and pools in Pinus halepensis in a semi-arid forest, and converted all fluxes to g C per tree d⁻¹. Using this data set, a process flowchart was created to describe and quantify the tree C allocation on diurnal to annual time-scales. The annual C source of 24.5 kg C per tree yr⁻¹ was balanced by C sinks of 23.5 kg C per tree yr⁻¹, which partitioned into 70%, 17% and 13% between respiration, growth, and litter (plus export to soil), respectively. Large imbalances (up to 57 g C per tree d⁻¹) were observed as C excess during the wet season, and as C deficit during the dry season. Concurrent changes in C reserves (starch) were sufficient to buffer these transient C imbalances. The C pool dynamics calculated using the flowchart were in general agreement with the observed pool sizes, providing confidence regarding our estimations of the timing, magnitude, and direction of the internal C fluxes. PMID:25157793
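
The annual figures quoted in this abstract can be checked with a short worked calculation. The source, sink, and percentage values are taken from the abstract; the variable names and the use of a 365-day year to express the annual source as a mean daily rate are assumptions made here for illustration.

```python
# Values reported in the abstract (kg C per tree per year).
ANNUAL_SOURCE_KG = 24.5
ANNUAL_SINK_KG = 23.5
SINK_FRACTIONS = {"respiration": 0.70, "growth": 0.17, "litter_and_export": 0.13}

# Partition the total sink into its reported components.
sink_kg = {name: ANNUAL_SINK_KG * share for name, share in SINK_FRACTIONS.items()}

# Annual residual between source and sinks.
residual_kg = ANNUAL_SOURCE_KG - ANNUAL_SINK_KG

# Mean daily source in g C per tree per day, assuming a 365-day year.
mean_daily_source_g = ANNUAL_SOURCE_KG * 1000.0 / 365.0

for name, value in sink_kg.items():
    print(f"{name}: {value:.2f} kg C per tree per year")
print(f"residual: {residual_kg:.1f} kg C per tree per year")
print(f"mean daily source: {mean_daily_source_g:.1f} g C per tree per day")
```

Under these assumptions the annual budget closes to within about 1 kg C per tree, while the reported transient imbalances of up to 57 g C per tree per day are of the same order as the mean daily source, which is consistent with the abstract's point that seasonal reserves are needed to buffer them.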

  3. An efficient approach to 3D single tree-crown delineation in LiDAR data

    NASA Astrophysics Data System (ADS)

    Mongus, Domen; Žalik, Borut

    2015-10-01

    This paper proposes a new method for 3D delineation of single tree-crowns in LiDAR data by exploiting the complementarities of treetop and tree trunk detections. A unified mathematical framework is provided based on graph theory, allowing for all the segmentations to be achieved using marker-controlled watersheds. Treetops are defined by detecting concave neighbourhoods within the canopy height model using locally fitted surfaces. These serve as markers for watershed segmentation of the canopy layer, where possible oversegmentation is reduced by merging the regions based on their heights, areas, and shapes. Additional tree crowns are delineated from mid- and under-storey layers based on tree trunk detection. A new approach for estimating the verticalities of the points' distributions is proposed for this purpose. The watershed segmentation is then applied on a density function within the voxel space, while boundaries of delineated trees from the canopy layer are used to prevent the overspreading of regions. The experiments show an approximately 6% increase in the efficiency of the proposed treetop definition based on locally fitted surfaces in comparison with the traditionally used local maxima of the smoothed canopy height model. In addition, a 4% increase in efficiency is achieved by the proposed tree trunk detection. Although the tree trunk detection alone is dependent on the data density, when it is supplemented with the treetop detection the proposed approach is efficient even when dealing with low density point-clouds.

  4. Crop classification in the U.S. Corn Belt using MODIS imagery

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Land cover classification is essential in studies of land cover change, climate, hydrology, carbon sequestration and yield prediction. Land cover classification uses pattern recognition technique that includes supervised / unsupervised approaches and decision tree technique. Land cover maps for re...

  5. Assessing College Student Interest in Math and/or Computer Science in a Cross-National Sample Using Classification and Regression Trees

    ERIC Educational Resources Information Center

    Kitsantas, Anastasia; Kitsantas, Panagiota; Kitsantas, Thomas

    2012-01-01

    The purpose of this exploratory study was to assess the relative importance of a number of variables in predicting students' interest in math and/or computer science. Classification and regression trees (CART) were employed in the analysis of survey data collected from 276 college students enrolled in two U.S. and Greek universities. The…

  6. Assessment on the classification of landslide risk level using Genetic Algorithm of Operation Tree in central Taiwan

    NASA Astrophysics Data System (ADS)

    Wei, Chiang; Yeh, Hui-Chung; Chen, Yen-Chang

    2015-04-01

    This study assessed the classification of landslide areas by the Genetic Algorithm of Operation Tree (GAOT) in the upstream Chen-Yu-Lan River watershed of the National Taiwan University Experimental Forest (NTUEF) after Typhoon Morakot in 2009, using remotely sensed and geological data. Landslides covering 624.5 ha, accounting for 1.9% of the total area, were delineated with thresholds of slope (22°) and area size (1 hectare); 48 landslide sites were located in the upstream Chen-Yu-Lan watershed using FORMOSAT-II satellite imagery, aerial photos and GIS-related coverage. The five risk levels of these landslide areas were classified by area, elevation, slope order, aspect, erosion order and geological factor order using the Simplicity Method suggested in the Technical Regulations for Soil and Water Conservation of Taiwan. When all the landslide sites were considered, the classification accuracy of GAOT was 97.9%, superior to K-means, the Ward method, the Shared Nearest Neighbor method, the Maximum Likelihood Classifier and the Bayesian Classifier; when 36 sites were used as training samples and the remaining 12 sites were tested, the accuracy still reached 81.3%. More geological data, anthropogenic influence and hydrological factors may be necessary for clarifying the landslide area, and the results benefit the authorities' assessment for future correction and management.

  7. Evaluation of Current Approaches to Stream Classification and a Heuristic Guide to Developing Classifications of Integrated Aquatic Networks

    NASA Astrophysics Data System (ADS)

    Melles, S. J.; Jones, N. E.; Schmidt, B. J.

    2014-03-01

    Conservation and management of fresh flowing waters involves evaluating and managing effects of cumulative impacts on the aquatic environment from disturbances such as: land use change, point and nonpoint source pollution, the creation of dams and reservoirs, mining, and fishing. To assess effects of these changes on associated biotic communities it is necessary to monitor and report on the status of lotic ecosystems. A variety of stream classification methods are available to assist with these tasks, and such methods attempt to provide a systematic approach to modeling and understanding complex aquatic systems at various spatial and temporal scales. Of the vast number of approaches that exist, it is useful to group them into three main types. The first involves modeling longitudinal species turnover patterns within large drainage basins and relating these patterns to environmental predictors collected at reach and upstream catchment scales; the second uses regionalized hierarchical classification to create multi-scale, spatially homogenous aquatic ecoregions by grouping adjacent catchments together based on environmental similarities; and the third approach groups sites together on the basis of similarities in their environmental conditions both within and between catchments, independent of their geographic location. We review the literature with a focus on more recent classifications to examine the strengths and weaknesses of the different approaches. We identify gaps or problems with the current approaches, and we propose an eight-step heuristic process that may assist with development of more flexible and integrated aquatic classifications based on the current understanding, network thinking, and theoretical underpinnings.

  8. One or Two Dimensions in Spontaneous Classification: A Simplicity Approach

    ERIC Educational Resources Information Center

    Pothos, Emmanuel M.; Close, James

    2008-01-01

    When participants are asked to spontaneously categorize a set of items, they typically produce unidimensional classifications, i.e., categorize the items on the basis of only one of their dimensions of variation. We examine whether it is possible to predict unidimensional vs. two-dimensional classification on the basis of the abstract stimulus…

  9. RAVEN. Dynamic Event Tree Approach Level III Milestone

    SciTech Connect

    Alfonsi, Andrea; Rabiti, Cristian; Mandelli, Diego; Cogliati, Joshua; Kinoshita, Robert

    2014-07-01

    Conventional Event-Tree (ET) based methodologies are extensively used as tools to perform reliability and safety assessment of complex and critical engineering systems. One of the disadvantages of these methods is that timing/sequencing of events and system dynamics are not explicitly accounted for in the analysis. In order to overcome these limitations, several techniques, also known as Dynamic Probabilistic Risk Assessment (DPRA), have been developed. Monte-Carlo (MC) and Dynamic Event Tree (DET) are two of the most widely used DPRA methodologies to perform safety assessment of Nuclear Power Plants (NPP). In the past two years, the Idaho National Laboratory (INL) has developed its own tool to perform Dynamic PRA: RAVEN (Reactor Analysis and Virtual control ENvironment). RAVEN has been designed to perform two main tasks: 1) control logic driver for the new Thermo-Hydraulic code RELAP-7 and 2) post-processing tool. In the first task, RAVEN acts as a deterministic controller in which the set of control logic laws (user defined) monitors the RELAP-7 simulation and controls the activation of specific systems. Moreover, the control logic infrastructure is used to model stochastic events, such as component failures, and perform uncertainty propagation. Such stochastic modeling is deployed using both MC and DET algorithms. In the second task, RAVEN processes the large amount of data generated by RELAP-7 using data-mining based algorithms. This report focuses on the analysis of dynamic stochastic systems using the newly developed RAVEN DET capability. As an example, a DPRA analysis, using DET, of a simplified pressurized water reactor for a Station Black-Out (SBO) scenario is presented.

  10. RAVEN: Dynamic Event Tree Approach Level III Milestone

    SciTech Connect

    Andrea Alfonsi; Cristian Rabiti; Diego Mandelli; Joshua Cogliati; Robert Kinoshita

    2013-07-01

    Conventional Event-Tree (ET) based methodologies are extensively used as tools to perform reliability and safety assessment of complex and critical engineering systems. One of the disadvantages of these methods is that timing/sequencing of events and system dynamics are not explicitly accounted for in the analysis. In order to overcome these limitations, several techniques, also known as Dynamic Probabilistic Risk Assessment (DPRA), have been developed. Monte-Carlo (MC) and Dynamic Event Tree (DET) are two of the most widely used DPRA methodologies to perform safety assessment of Nuclear Power Plants (NPP). In the past two years, the Idaho National Laboratory (INL) has developed its own tool to perform Dynamic PRA: RAVEN (Reactor Analysis and Virtual control ENvironment). RAVEN has been designed to perform two main tasks: 1) control logic driver for the new Thermo-Hydraulic code RELAP-7 and 2) post-processing tool. In the first task, RAVEN acts as a deterministic controller in which the set of control logic laws (user defined) monitors the RELAP-7 simulation and controls the activation of specific systems. Moreover, the control logic infrastructure is used to model stochastic events, such as component failures, and perform uncertainty propagation. Such stochastic modeling is deployed using both MC and DET algorithms. In the second task, RAVEN processes the large amount of data generated by RELAP-7 using data-mining based algorithms. This report focuses on the analysis of dynamic stochastic systems using the newly developed RAVEN DET capability. As an example, a DPRA analysis, using DET, of a simplified pressurized water reactor for a Station Black-Out (SBO) scenario is presented.

  11. Integrated Analysis of Tropical Trees Growth: A Multivariate Approach

    PubMed Central

    YÁÑEZ-ESPINOSA, LAURA; TERRAZAS, TERESA; LÓPEZ-MATA, LAURO

    2006-01-01

    • Background and Aims One of the problems analysing cause–effect relationships of growth and environmental factors is that a single factor could be correlated with other ones directly influencing growth. One attempt to understand tropical trees' growth cause–effect relationships is integrating research about anatomical, physiological and environmental factors that influence growth in order to develop mathematical models. The relevance is to understand the nature of the process of growth and to model this as a function of the environment. • Methods The relationships of Aphananthe monoica, Pleuranthodendron lindenii and Psychotria costivenia radial growth and phenology with environmental factors (local climate, vertical strata microclimate and physical and chemical soil variables) were evaluated from April 2000 to September 2001. The association among these groups of variables was determined by generalized canonical correlation analysis (GCCA), which considers the probable associations of three or more data groups and the selection of the most important variables for each data group. • Key Results The GCCA allowed determination of a general model of relationships among tree phenology and radial growth with climate, microclimate and soil factors. A strong influence of climate in phenology and radial growth existed. Leaf initiation and cambial activity periods were associated with maximum temperature and day length, and vascular tissue differentiation with soil moisture and rainfall. The analyses of individual species detected different relationships for the three species. • Conclusions The analyses of the individual species suggest that each one takes advantage in a different way of the environment in which they are growing, allowing them to coexist. PMID:16822807

  12. Single-cell approaches for molecular classification of endocrine tumors

    PubMed Central

    Koh, James; Allbritton, Nancy L.; Sosa, Julie A.

    2015-01-01

    Purpose of review In this review, we summarize recent developments in single-cell technologies that can be employed for the functional and molecular classification of endocrine cells in normal and neoplastic tissue. Recent findings The emergence of new platforms for the isolation, analysis, and dynamic assessment of individual cell identity and reactive behavior enables experimental deconstruction of intratumoral heterogeneity and other contexts, where variability in cell signaling and biochemical responsiveness inform biological function and clinical presentation. These tools are particularly appropriate for examining and classifying endocrine neoplasias, as the clinical sequelae of these tumors are often driven by disrupted hormonal responsiveness secondary to compromised cell signaling. Single-cell methods allow for multidimensional experimental designs incorporating both spatial and temporal parameters with the capacity to probe dynamic cell signaling behaviors and kinetic response patterns dependent upon sequential agonist challenge. Summary Intratumoral heterogeneity in the provenance, composition, and biological activity of different forms of endocrine neoplasia presents a significant challenge for prognostic assessment. Single-cell technologies provide an array of powerful new approaches uniquely well suited for dissecting complex endocrine tumors. Studies examining the relationship between clinical behavior and tumor compositional variations in cellular activity are now possible, providing new opportunities to deconstruct the underlying mechanisms of endocrine neoplasia. PMID:26632769

  13. Coal waste classification and approaches to utilization in China

    SciTech Connect

    Xu Zesheng; Yang Qiaowen; Wang Zuna

    1998-12-31

    The amounts of coal waste or coal refuse from mining and coal preparation are growing rapidly in China because of the increased production of coal. The coal refuse disposed of in 1996 amounted to 610 million tons. The stockpiled coal refuse had reached 3 billion tons by the end of 1996, occupying an area of about 8,000 hectares. It is very important to classify coal refuse scientifically, including its chemical composition and physical chemistry, for proper treatment and comprehensive utilization. The goals of proper classification of coal waste are: first, to make full use of coal waste on the basis of its useful mineral content and grade; second, to advance utilization methods for coal waste that use no processing technology so as to save processing time and cost; third, to be able to determine relatively precisely coal waste quality and quantity so as to decrease manmade stockpiled coal waste mixtures and to be able to utilize all kinds of coal waste; and finally, to guide development of new refuse utilization approaches. According to the characteristics of coal waste resources in China, all coal wastes are classified into six main classes on the basis of their source and stockpiled situation. The six main classes are coal-heading coal waste, rock-heading coal waste, spontaneous combustion coal waste, mechanical separating coal waste, sorting coal waste and rock-stripping coal waste. Each is described.

  14. Genome trees constructed using five different approaches suggest new major bacterial clades

    PubMed Central

    Wolf, Yuri I; Rogozin, Igor B; Grishin, Nick V; Tatusov, Roman L; Koonin, Eugene V

    2001-01-01

    Background The availability of multiple complete genome sequences from diverse taxa prompts the development of new phylogenetic approaches, which attempt to incorporate information derived from comparative analysis of complete gene sets or large subsets thereof. Such attempts are particularly relevant because of the major role of horizontal gene transfer and lineage-specific gene loss, at least in the evolution of prokaryotes. Results Five largely independent approaches were employed to construct trees for completely sequenced bacterial and archaeal genomes: i) presence-absence of genomes in clusters of orthologous genes; ii) conservation of local gene order (gene pairs) among prokaryotic genomes; iii) parameters of identity distribution for probable orthologs; iv) analysis of concatenated alignments of ribosomal proteins; v) comparison of trees constructed for multiple protein families. All constructed trees support the separation of the two primary prokaryotic domains, bacteria and archaea, as well as some terminal bifurcations within the bacterial and archaeal domains. Beyond these obvious groupings, the trees made with different methods appeared to differ substantially in terms of the relative contributions of phylogenetic relationships and similarities in gene repertoires caused by similar life styles and horizontal gene transfer to the tree topology. The trees based on presence-absence of genomes in orthologous clusters and the trees based on conserved gene pairs appear to be strongly affected by gene loss and horizontal gene transfer. The trees based on identity distributions for orthologs and particularly the tree made of concatenated ribosomal protein sequences seemed to carry a stronger phylogenetic signal. The latter tree supported three potential high-level bacterial clades: i) Chlamydia-Spirochetes, ii) Thermotogales-Aquificales (bacterial hyperthermophiles), and iii) Actinomycetes-Deinococcales-Cyanobacteria. The latter group also appeared to join the

  15. Bayesian decision tree for the classification of the mode of motion in single-molecule trajectories.

    PubMed

    Türkcan, Silvan; Masson, Jean-Baptiste

    2013-01-01

    Membrane proteins move in heterogeneous environments with spatially (sometimes temporally) varying friction and with biochemical interactions with various partners. It is important to reliably distinguish different modes of motion to improve our knowledge of the membrane architecture and to understand the nature of interactions between membrane proteins and their environments. Here, we present an analysis technique for single molecule tracking (SMT) trajectories that can determine the preferred model of motion that best matches observed trajectories. The method is based on Bayesian inference to calculate the a posteriori probability of an observed trajectory according to a certain model. Information theory criteria, such as the Bayesian information criterion (BIC), the Akaike information criterion (AIC), and modified AIC (AICc), are used to select the preferred model. The considered group of models includes free Brownian motion, and confined motion in 2nd or 4th order potentials. We determine the best information criteria for classifying trajectories. We tested its limits through simulations matching large sets of experimental conditions and we built a decision tree. This decision tree first uses the BIC to distinguish between free Brownian motion and confined motion. In a second step, it classifies the confining potential further using the AIC. We apply the method to experimental Clostridium perfringens ε-toxin (CPεT) receptor trajectories to show that these receptors are confined by a spring-like potential. An adaptation of this technique was applied on a sliding window in the temporal dimension along the trajectory. We applied this adaptation to experimental CPεT trajectories that lose confinement due to disaggregation of confining domains. This new technique adds another dimension to the discussion of SMT data.
The mode of motion of a receptor might hold more biologically relevant information than the diffusion
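
    The two-step selection described in this abstract (BIC to separate free from confined motion, then AIC to choose between confining potentials) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the model names, the parameter counts, and the use of a maximized log-likelihood per model as input are all assumptions made for the example.

```python
import math


def aic(log_likelihood, k):
    """Akaike information criterion for a model with k free parameters."""
    return 2 * k - 2 * log_likelihood


def aicc(log_likelihood, k, n):
    """Small-sample corrected AIC for n data points (requires n > k + 1)."""
    return aic(log_likelihood, k) + (2 * k * (k + 1)) / (n - k - 1)


def bic(log_likelihood, k, n):
    """Bayesian information criterion for n data points."""
    return k * math.log(n) - 2 * log_likelihood


def classify_trajectory(fits, n):
    """Pick a motion model with a two-step decision tree.

    fits maps a model name to (maximized log-likelihood, parameter count).
    The key "free" (hypothetical name) is the free Brownian motion model;
    every other key is treated as a confined-motion model.
    """
    confined = {name: fit for name, fit in fits.items() if name != "free"}
    # Step 1: BIC decides between free and confined motion.
    free_bic = bic(*fits["free"], n)
    best_confined_bic = min(bic(ll, k, n) for ll, k in confined.values())
    if free_bic <= best_confined_bic:
        return "free"
    # Step 2: AIC decides which confining potential fits best.
    return min(confined, key=lambda name: aic(*confined[name]))
```

    For example, `classify_trajectory({"free": (-120.0, 1), "harmonic": (-100.0, 3), "quartic": (-99.5, 3)}, n=50)` compares one free model against two confined ones; the log-likelihood values here are made up purely to exercise the function.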

  16. A Method for Application of Classification Tree Models to Map Aquatic Vegetation Using Remotely Sensed Images from Different Sensors and Dates

    PubMed Central

    Jiang, Hao; Zhao, Dehua; Cai, Ying; An, Shuqing

    2012-01-01

    In previous attempts to identify aquatic vegetation from remotely-sensed images using classification trees (CT), the images used to apply CT models to different times or locations necessarily originated from the same satellite sensor as that from which the original images used in model development came, greatly limiting the application of CT. We have developed an effective normalization method to improve the robustness of CT models when applied to images originating from different sensors and dates. A total of 965 ground-truth samples of aquatic vegetation types were obtained in 2009 and 2010 in Taihu Lake, China. Using relevant spectral indices (SI) as classifiers, we manually developed a stable CT model structure and then applied a standard CT algorithm to obtain quantitative (optimal) thresholds from 2009 ground-truth data and images from Landsat7-ETM+, HJ-1B-CCD, Landsat5-TM and ALOS-AVNIR-2 sensors. Optimal CT thresholds produced average classification accuracies of 78.1%, 84.7% and 74.0% for emergent vegetation, floating-leaf vegetation and submerged vegetation, respectively. However, the optimal CT thresholds for different sensor images differed from each other, with an average relative variation (RV) of 6.40%. We developed and evaluated three new approaches to normalizing the images. The best-performing method (Method of 0.1% index scaling) normalized the SI images using tailored percentages of extreme pixel values. Using the images normalized by Method of 0.1% index scaling, CT models for a particular sensor in which thresholds were replaced by those from the models developed for images originating from other sensors provided average classification accuracies of 76.0%, 82.8% and 68.9% for emergent vegetation, floating-leaf vegetation and submerged vegetation, respectively. Applying the CT models developed for normalized 2009 images to 2010 images resulted in high classification (78.0%–93.3%) and overall (92.0%–93.1%) accuracies. Our results suggest
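
    The abstract states only that the best normalization rescales spectral-index images using percentages of extreme pixel values. One plausible reading, shown here purely as an assumption, is a percentile stretch: clip each index image at its lower and upper 0.1% tails and rescale to [0, 1], so that one set of tree thresholds can be reused across sensors. The function names and the symmetric-tail choice are hypothetical.

```python
def percentile(sorted_values, p):
    """Linearly interpolated percentile (p in [0, 100]) of a sorted list."""
    if not sorted_values:
        raise ValueError("need at least one value")
    position = (len(sorted_values) - 1) * p / 100.0
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] * (1 - fraction) + sorted_values[upper] * fraction


def scale_index(values, tail_percent=0.1):
    """Rescale spectral-index values to [0, 1] between two tail percentiles.

    Values below the lower percentile map to 0, values above the upper
    percentile map to 1. tail_percent=0.1 mirrors the "0.1%" in the method
    name; the symmetric treatment of both tails is an assumption.
    """
    ordered = sorted(values)
    low = percentile(ordered, tail_percent)
    high = percentile(ordered, 100.0 - tail_percent)
    if high == low:
        raise ValueError("index image has no dynamic range")
    return [min(1.0, max(0.0, (v - low) / (high - low))) for v in values]
```

    After such scaling, a threshold learned on one sensor's normalized index (say, a hypothetical cut at 0.4) refers to the same relative position in the value distribution of another sensor's image, which is the property the cross-sensor experiment relies on.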

  17. Idiopathic interstitial pneumonias and emphysema: detection and classification using a texture-discriminative approach

    NASA Astrophysics Data System (ADS)

    Fetita, C.; Chang-Chien, K. C.; Brillet, P. Y.; Prêteux, F.; Chang, R. F.

    2012-03-01

    Our study aims at developing a computer-aided diagnosis (CAD) system for fully automatic detection and classification of pathological lung parenchyma patterns in idiopathic interstitial pneumonias (IIP) and emphysema using multi-detector computed tomography (MDCT). The proposed CAD system is based on three-dimensional (3-D) mathematical morphology, texture and fuzzy logic analysis, and can be divided into four stages: (1) a multi-resolution decomposition scheme based on a 3-D morphological filter was exploited to discriminate the lung region patterns at different analysis scales. (2) An additional spatial lung partitioning based on the lung tissue texture was introduced to reinforce the spatial separation between patterns extracted at the same resolution level in the decomposition pyramid. Then, (3) a hierarchic tree structure was exploited to describe the relationship between patterns at different resolution levels, and for each pattern, six fuzzy membership functions were established for assigning a probability of association with a normal tissue or a pathological target. Finally, (4) a decision step exploiting the fuzzy-logic assignments selects the target class of each lung pattern among the following categories: normal (N), emphysema (EM), fibrosis/honeycombing (FHC), and ground glass (GDG). According to a preliminary evaluation on an extended database, the proposed method can overcome the drawbacks of a previously developed approach and achieve higher sensitivity and specificity.

  18. New Approaches to Object Classification in Synoptic Sky Surveys

    SciTech Connect

    Donalek, C.; Mahabal, A.; Djorgovski, S. G.; Marney, S.; Drake, A.; Glikman, E.; Graham, M. J.; Williams, R.

    2008-12-05

    Digital synoptic sky surveys pose several new object classification challenges. In surveys where real-time detection and classification of transient events is a science driver, there is a need for an effective elimination of instrument-related artifacts which can masquerade as transient sources in the detection pipeline, e.g., unremoved large cosmic rays, saturation trails, reflections, crosstalk artifacts, etc. We have implemented such an Artifact Filter, using a supervised neural network, for the real-time processing pipeline in the Palomar-Quest (PQ) survey. After the training phase, for each object it takes as input a set of measured morphological parameters and returns the probability of it being a real object. Despite the relatively low number of training cases for many kinds of artifacts, the overall artifact classification rate is around 90%, with no genuine transients misclassified during our real-time scans. Another question is how to assign an optimal star-galaxy classification in a multi-pass survey, where seeing and other conditions change between different epochs, potentially producing inconsistent classifications for the same object. We have implemented a star/galaxy multipass classifier that makes use of external and a priori knowledge to find the optimal classification from the individually derived ones. Both these techniques can be applied to other, similar surveys and data sets.

  19. PoMo: An Allele Frequency-Based Approach for Species Tree Estimation

    PubMed Central

    De Maio, Nicola; Schrempf, Dominik; Kosiol, Carolin

    2015-01-01

    Incomplete lineage sorting can cause incongruencies of the overall species-level phylogenetic tree with the phylogenetic trees for individual genes or genomic segments. If these incongruencies are not accounted for, it is possible to incur several biases in species tree estimation. Here, we present a simple maximum likelihood approach that accounts for ancestral variation and incomplete lineage sorting. We use a POlymorphisms-aware phylogenetic MOdel (PoMo) that we have recently shown to efficiently estimate mutation rates and fixation biases from within and between-species variation data. We extend this model to perform efficient estimation of species trees. We test the performance of PoMo in several different scenarios of incomplete lineage sorting using simulations and compare it with existing methods both in accuracy and computational speed. In contrast to other approaches, our model does not use coalescent theory but is allele frequency based. We show that PoMo is well suited for genome-wide species tree estimation and that on such data it is more accurate than previous approaches. PMID:26209413

  20. Predictive mapping of soil organic carbon in wet cultivated lands using classification-tree based models: the case study of Denmark.

    PubMed

    Bou Kheir, Rania; Greve, Mogens H; Bøcher, Peder K; Greve, Mette B; Larsen, René; McCloy, Keith

    2010-05-01

    Soil organic carbon (SOC) is one of the most important carbon stocks globally and has large potential to affect global climate. Distribution patterns of SOC in Denmark constitute a nation-wide baseline for studies on soil carbon changes (with respect to the Kyoto Protocol). This paper predicts and maps the geographic distribution of SOC across Denmark using remote sensing (RS), geographic information systems (GISs) and decision-tree modeling (un-pruned and pruned classification trees). Seventeen parameters, i.e. parent material, soil type, landscape type, elevation, slope gradient, slope aspect, mean curvature, plan curvature, profile curvature, flow accumulation, specific catchment area, tangent slope, tangent curvature, steady-state wetness index, Normalized Difference Vegetation Index (NDVI), Normalized Difference Wetness Index (NDWI) and Soil Color Index (SCI), were generated to statistically explain SOC field measurements in the area of interest (Denmark). A large number of tree-based classification models (588) were developed using (i) all of the parameters, (ii) all Digital Elevation Model (DEM) parameters only, (iii) the primary DEM parameters only, (iv) the remote sensing (RS) indices only, (v) selected pairs of parameters, (vi) soil type, parent material and landscape type only, and (vii) the parameters having a high impact on SOC distribution in built pruned trees. The three best classification tree models, with the lowest misclassification error (ME) and the lowest number of nodes (N), are: (i) the tree (T1) combining all of the parameters (ME=29.5%; N=54); (ii) the tree (T2) based on the parent material, soil type and landscape type (ME=31.5%; N=14); and (iii) the tree (T3) constructed using parent material, soil type, landscape type, elevation, tangent slope and SCI (ME=30%; N=39).
The produced SOC maps at 1:50,000 cartographic scale using these trees are highly matching with coincidence values equal to 90.5% (Map T1

  1. Text Categorization Based on K-Nearest Neighbor Approach for Web Site Classification.

    ERIC Educational Resources Information Center

    Kwon, Oh-Woog; Lee, Jong-Hyeok

    2003-01-01

    Discusses text categorization and Web site classification and proposes a three-step classification system that includes the use of Web pages linked with the home page. Highlights include the k-nearest neighbor (k-NN) approach; improving performance with a feature selection method and a term weighting scheme using HTML tags; and similarity…

  2. Mathematical Programming Approaches for the Classification Problem in Two-Group Discriminant Analysis.

    ERIC Educational Resources Information Center

    Joachimsthaler, Erich A.; Stam, Antonie

    1990-01-01

    Mathematical programming formulations are introduced as new approaches to solve the classification problem in discriminant analysis. The research literature is reviewed, and an illustration using a real-world classification problem is provided. Issues relevant to potential uses of these formulations are discussed. (TJH)

  3. Machine Learning Approaches for High-resolution Urban Land Cover Classification: A Comparative Study

    SciTech Connect

    Vatsavai, Raju; Chandola, Varun; Cheriyadat, Anil M; Bright, Eddie A; Bhaduri, Budhendra L; Graesser, Jordan B

    2011-01-01

    The proliferation of several machine learning approaches makes it difficult to identify a suitable classification technique for analyzing high-resolution remote sensing images. In this study, ten classification techniques were compared from five broad machine learning categories. Surprisingly, the performance of simple statistical classification schemes like maximum likelihood and logistic regression is very close to that of complex and recent techniques. Given that these two classifiers require little input from the user, they should still be considered for most classification tasks. Multiple classifier systems are a good choice if resources permit.

  4. Neural network approaches versus statistical methods in classification of multisource remote sensing data

    NASA Technical Reports Server (NTRS)

    Benediktsson, Jon A.; Swain, Philip H.; Ersoy, Okan K.

    1990-01-01

    Neural network learning procedures and statistical classification methods are applied and compared empirically in classification of multisource remote sensing and geographic data. Statistical multisource classification by means of a method based on Bayesian classification theory is also investigated and modified. The modifications permit control of the influence of the data sources involved in the classification process. Reliability measures are introduced to rank the quality of the data sources. The data sources are then weighted according to these rankings in the statistical multisource classification. Four data sources are used in experiments: Landsat MSS data and three forms of topographic data (elevation, slope, and aspect). Experimental results show that the two different approaches have unique advantages and disadvantages in this classification application.

  5. Biodiversity among Lactobacillus helveticus Strains Isolated from Different Natural Whey Starter Cultures as Revealed by Classification Trees

    PubMed Central

    Gatti, Monica; Trivisano, Carlo; Fabrizi, Enrico; Neviani, Erasmo; Gardini, Fausto

    2004-01-01

    Lactobacillus helveticus is a homofermentative thermophilic lactic acid bacterium used extensively for manufacturing Swiss type and aged Italian cheese. In this study, the phenotypic and genotypic diversity of strains isolated from different natural dairy starter cultures used for Grana Padano, Parmigiano Reggiano, and Provolone cheeses was investigated by a classification tree technique. A data set was used that consists of 119 L. helveticus strains, each of which was studied for its physiological characters, as well as surface protein profiles and hybridization with a species-specific DNA probe. The methodology employed in this work allowed the strains to be grouped into terminal nodes without difficult and subjective interpretation. In particular, good discrimination was obtained between L. helveticus strains isolated, respectively, from Grana Padano and from Provolone natural whey starter cultures. The method used in this work allowed identification of the main characteristics that permit discrimination of biotypes. In order to understand what kind of genes could code for phenotypes of technological relevance, evidence that specific DNA sequences are present only in particular biotypes may be of great interest. PMID:14711641

  6. Rapid Erosion Modeling in a Western Kenya Watershed using Visible Near Infrared Reflectance, Classification Tree Analysis and 137Cesium

    PubMed Central

    deGraffenried, Jeff B.; Shepherd, Keith D.

    2010-01-01

    Human-induced soil erosion has severe economic and environmental impacts throughout the world. It is more severe in the tropics than elsewhere and results in diminished food production and security. Kenya has limited arable land and 30 percent of the country experiences severe to very severe human-induced soil degradation. The purpose of this research was to test visible near infrared diffuse reflectance spectroscopy (VNIR) as a tool for rapid assessment and benchmarking of soil condition and erosion severity class. The study was conducted in the Saiwa River watershed in the northern Rift Valley Province of western Kenya, a tropical highland area. Soil 137Cs concentration was measured to validate spectrally derived erosion classes and establish the background levels for different land use types. Results indicate VNIR could be used to accurately evaluate a large and diverse soil data set and predict soil erosion characteristics. Soil condition was spectrally assessed and modeled. Analysis of mean raw spectra indicated significant reflectance differences between soil erosion classes. The largest differences occurred between 1,350 and 1,950 nm with the largest separation occurring at 1,920 nm. Classification and Regression Tree (CART) analysis indicated that the spectral model had practical predictive success (72%) with Receiver Operating Characteristic (ROC) of 0.74. The change in 137Cs concentrations supported the premise that VNIR is an effective tool for rapid screening of soil erosion condition. PMID:27397933

  7. Total system performance assessment for waste disposal using a logic tree approach.

    PubMed

    Kessler, J H; McGuire, R K

    1999-10-01

    The Electric Power Research Institute (EPRI) has sponsored the development of a model to assess the long-term, overall "performance" of the candidate spent fuel and high-level radioactive waste (HLW) disposal facility at Yucca Mountain, Nevada. The model simulates the processes that lead to HLW container corrosion, HLW mobilization from the spent fuel, and transport by groundwater, and contaminated groundwater usage by future hypothetical individuals leading to radiation doses to those individuals. The model must incorporate a multitude of complex, coupled processes across a variety of technical disciplines. Furthermore, because of the very long time frames involved in the modeling effort (> 10^4 years), the relative lack of directly applicable data, and many uncertainties and variabilities in those data, a probabilistic approach to model development was necessary. The developers of the model chose a logic tree approach to represent uncertainties in both conceptual models and model parameter values. The developers felt the logic tree approach was the most appropriate. This paper discusses the value and use of logic trees applied to assessing the uncertainties in HLW disposal, the components of the model, and a few of the results of that model. The paper concludes with a comparison of logic trees and Monte Carlo approaches. PMID:10765439

  8. Evaluating Two Approaches to Helping College Students Understand Evolutionary Trees through Diagramming Tasks

    ERIC Educational Resources Information Center

    Perry, Judy; Meir, Eli; Herron, Jon C.; Maruca, Susan; Stal, Derek

    2008-01-01

    To understand evolutionary theory, students must be able to understand and use evolutionary trees and their underlying concepts. Active, hands-on curricula relevant to macroevolution can be challenging to implement across large college-level classes where textbook learning is the norm. We evaluated two approaches to helping students learn…

  9. Neural network approach to classification of infrasound signals

    NASA Astrophysics Data System (ADS)

    Lee, Dong-Chang

    As part of the International Monitoring Systems of the Preparatory Commissions for the Comprehensive Nuclear Test-Ban Treaty Organization, the Infrasound Group at the University of Alaska Fairbanks maintains and operates two infrasound stations to monitor global nuclear activity. In addition, the group specializes in detecting and classifying the man-made and naturally produced signals recorded at both stations by computing various characterization parameters (e.g. mean of the cross correlation maxima, trace velocity, direction of arrival, and planarity values) using the in-house developed weighted least-squares algorithm. Classifying commonly observed low-frequency (0.015–0.1 Hz) signals at our stations, namely mountain associated waves and high trace-velocity signals, using a traditional approach (e.g. analysis of power spectral density) presents a problem. Such signals can be separated statistically by setting a window on the trace-velocity estimate for each signal type, and the feasibility of this technique is demonstrated by displaying and comparing various summary plots (e.g. universal, seasonal and azimuthal variations) produced by analyzing infrasound data (2004–2007) from the Fairbanks and Antarctic arrays. Such plots, together with magnetic activity information (from the College International Geophysical Observatory located at Fairbanks, Alaska), lead to possible physical sources of the two signal types. Throughout this thesis a newly developed robust algorithm (sum of squares of variance ratios) with improved detection quality (under low signal to noise ratios) over two well-known detection algorithms (mean of the cross correlation maxima and Fisher Statistics) is investigated for its efficacy as a new detector. A neural network is examined for its ability to automatically classify the two signals described above against clutter (spurious signals with common characteristics). Four identical perceptron networks are trained and validated (with

  10. Using hydrogeomorphic criteria to classify wetlands on Mt. Desert Island, Maine - approach, classification system, and examples

    USGS Publications Warehouse

    Nielsen, Martha G.; Guntenspergen, Glenn R.; Neckles, Hilary A.

    2005-01-01

    A wetland classification system was designed for Mt. Desert Island, Maine, to help categorize the large number of wetlands (over 1,200 mapped units) as an aid to understanding their hydrologic functions. The classification system, developed by the U.S. Geological Survey (USGS), in cooperation with the National Park Service, uses a modified hydrogeomorphic (HGM) approach, and assigns categories based on position in the landscape, soils and surficial geologic setting, and source of water. A dichotomous key was developed to determine a preliminary HGM classification of wetlands on the island. This key is designed for use with USGS topographic maps and 1:24,000 geographic information system (GIS) coverages as an aid to the classification, but may also be used with field data. Hydrologic data collected from a wetland monitoring study were used to determine whether the preliminary classification of individual wetlands using the HGM approach yielded classes that were consistent with actual hydroperiod data. Preliminary HGM classifications of the 20 wetlands in the monitoring study were consistent with the field hydroperiod data. The modified HGM classification approach appears robust, although the method apparently works somewhat better with undisturbed wetlands than with disturbed wetlands. This wetland classification system could be applied to other hydrogeologically similar areas of northern New England.

  11. Comparison of Sub-pixel Classification Approaches for Crop-specific Mapping

    EPA Science Inventory

    The Moderate Resolution Imaging Spectroradiometer (MODIS) data has been increasingly used for crop mapping and other agricultural applications. Phenology-based classification approaches using the NDVI (Normalized Difference Vegetation Index) 16-day composite (250 m) data product...

  12. Data-Driven Multimodal Sleep Apnea Events Detection : Synchrosquezing Transform Processing and Riemannian Geometry Classification Approaches.

    PubMed

    Rutkowski, Tomasz M

    2016-07-01

    A novel multimodal and bio-inspired approach to biomedical signal processing and classification is presented in the paper. This approach allows for an automatic semantic labeling (interpretation) of sleep apnea events based on the proposed data-driven biomedical signal processing and classification. The presented signal processing and classification methods have already been successfully applied to real-time unimodal brainwaves (EEG only) decoding in brain-computer interfaces developed by the author. In the current project, very encouraging results are obtained using multimodal biomedical (brainwaves and peripheral physiological) signals in a unified processing approach allowing for the automatic semantic data description. The results thus support the hypothesis that the data-driven and bio-inspired signal processing approach is valid for semantic interpretation of medical data based on machine-learning classification of sleep apnea events. PMID:27194241

  13. A classification of the Chloridoideae (Poaceae) based on multi-gene phylogenetic trees.

    PubMed

    Peterson, Paul M; Romaschenko, Konstantin; Johnson, Gabriel

    2010-05-01

    Blepharidachne, Dasyochloa, Erioneuron, Munroa, Scleropogon, and Swallenia); Traginae (Tragus with Monelytrum, Polevansia, and Willkommia all embedded); Tridentinae (includes Gouinia, Tridens, Triplasis, and Vaseyochloa); Triodiinae (Triodia); and the Tripogoninae (Melanocenchris and Tripogon with Eragrostiella embedded). In our study the Cynodonteae still include 19 genera and the Zoysieae include a single genus that are not yet placed in a subtribe. The tribe Triraphideae and the subtribe Aeluropodinae are newly treated at that rank. We propose a new tribal and subtribal classification for all known genera in the Chloridoideae. The subfamily might have originated in Africa and/or Asia since the basal lineage, the Triraphideae, includes species with African and Asian distribution. PMID:20096795

  14. Bayesian Evidence Framework for Decision Tree Learning

    NASA Astrophysics Data System (ADS)

    Chatpatanasiri, Ratthachat; Kijsirikul, Boonserm

    2005-11-01

    This work is primarily interested in the problem of selecting a single decision (or classification) tree given the observed data. Although a single decision tree has a high risk of overfitting, the induced tree is easily interpreted. Researchers have invented various methods such as tree pruning or tree averaging for preventing the induced tree from overfitting (and from underfitting) the data. In this paper, instead of using those conventional approaches, we apply the Bayesian evidence framework of Gull, Skilling and MacKay to the process of selecting a decision tree. We derive a formal function to measure 'the fitness' of each decision tree given a set of observed data. Our method is, in fact, analogous to a well-known Bayesian model selection method for interpolating noisy continuous-value data. As in regression problems, given reasonable assumptions, this derived score function automatically quantifies the principle of Ockham's razor, and hence reasonably deals with the underfitting-overfitting tradeoff.

  15. Oregon Hydrologic Landscapes: An Approach for Broadscale Hydrologic Classification

    EPA Science Inventory

    Gaged streams represent only a small percentage of watershed hydrologic conditions throughout the United States and globe, but there is a growing need for hydrologic classification systems that can serve as the foundation for broad-scale assessments of the hydrologic functions of...

  16. Image classification approach for automatic identification of grassland weeds

    NASA Astrophysics Data System (ADS)

    Gebhardt, Steffen; Kühbauch, Walter

    2006-08-01

    The potential of digital image processing for weed mapping in arable crops has been widely investigated in recent decades. In grassland farming these techniques have rarely been applied so far. The project presented here focuses on the automatic identification of one of the most invasive and persistent grassland weed species, the broad-leaved dock (Rumex obtusifolius L.), in complex mixtures of grass and herbs. A total of 108 RGB-images were acquired in near range from a field experiment under constant illumination conditions using a commercial digital camera. The objects of interest were separated from the background by transforming the 24 bit RGB-images into 8 bit intensities and then calculating the local homogeneity images. These images were binarised by applying a dynamic grey value threshold. Finally, morphological opening was applied to the binary images. The remaining contiguous regions were considered to be objects. In order to classify these objects into 3 different weed species, a soil and a residue class, a total of 17 object-features related to shape, color and texture of the weeds were extracted. Using MANOVA, 12 of them were identified as contributing to classification. Maximum-likelihood classification was conducted to discriminate the weed species. The total classification rate across all classes ranged from 76 % to 83 %. The classification of Rumex obtusifolius achieved detection rates between 85 % and 93 % with misclassification rates below 10 %. Further, Rumex obtusifolius distribution and density maps were generated based on the classification results and a transformation of image coordinates into the Gauss-Krueger system. These promising results show the high potential of image analysis for weed mapping in grassland and for the implementation of site-specific herbicide spraying.
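
    The segmentation steps described in this abstract (intensity conversion, binarisation, morphological opening, extraction of contiguous regions) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the local homogeneity step is omitted, a fixed threshold stands in for the dynamic grey value threshold, the 3 x 3 structuring element is an assumption, and all function names and the toy image are hypothetical.

```python
def to_intensity(rgb_image):
    """Convert rows of (r, g, b) tuples to 8-bit grey values (plain channel mean)."""
    return [[(r + g + b) // 3 for (r, g, b) in row] for row in rgb_image]


def binarise(grey, threshold):
    """Fixed threshold; the abstract uses a dynamic threshold instead (assumption)."""
    return [[1 if value >= threshold else 0 for value in row] for row in grey]


def _neighbourhood(img, y, x):
    """Values in the 3 x 3 window around (y, x); pixels outside the image count as 0."""
    h, w = len(img), len(img[0])
    return [img[j][i] if 0 <= j < h and 0 <= i < w else 0
            for j in range(y - 1, y + 2) for i in range(x - 1, x + 2)]


def erode(img):
    return [[1 if all(_neighbourhood(img, y, x)) else 0
             for x in range(len(img[0]))] for y in range(len(img))]


def dilate(img):
    return [[1 if any(_neighbourhood(img, y, x)) else 0
             for x in range(len(img[0]))] for y in range(len(img))]


def opening(img):
    """Morphological opening: erosion followed by dilation."""
    return dilate(erode(img))


def count_regions(img):
    """Count 4-connected foreground regions with an iterative flood fill."""
    h, w = len(img), len(img[0])
    seen = set()
    regions = 0
    for y in range(h):
        for x in range(w):
            if img[y][x] and (y, x) not in seen:
                regions += 1
                seen.add((y, x))
                stack = [(y, x)]
                while stack:
                    cy, cx = stack.pop()
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                        if 0 <= ny < h and 0 <= nx < w and img[ny][nx] and (ny, nx) not in seen:
                            seen.add((ny, nx))
                            stack.append((ny, nx))
    return regions


# Toy 7 x 7 image: one 3 x 3 bright object plus an isolated bright speck.
bright, dark = (200, 200, 200), (10, 10, 10)
image = [[dark] * 7 for _ in range(7)]
for y in range(1, 4):
    for x in range(1, 4):
        image[y][x] = bright
image[5][5] = bright

binary = binarise(to_intensity(image), threshold=128)
cleaned = opening(binary)
```

    In this toy case the opening removes the single-pixel speck and keeps the larger object, which is the kind of clean-up the abstract attributes to that step before features are extracted from the remaining regions.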

  17. A new approach to a maximum à posteriori-based kernel classification method.

    PubMed

    Nopriadi; Yamashita, Yukihiko

    2012-09-01

    This paper presents a new approach to a maximum a posteriori (MAP)-based classification, specifically, MAP-based kernel classification trained by linear programming (MAPLP). Unlike traditional MAP-based classifiers, MAPLP does not directly estimate a posterior probability for classification. Instead, it introduces a kernelized function to an objective function that behaves similarly to a MAP-based classifier. To evaluate the performance of MAPLP, a binary classification experiment was performed with 13 datasets. The results of this experiment are compared with those coming from conventional MAP-based kernel classifiers and also from other state-of-the-art classification methods. It shows that MAPLP performs promisingly against the other classification methods. It is argued that the proposed approach makes a significant contribution to MAP-based classification research; the approach widens the freedom to choose an objective function, it is not constrained to the strict sense Bayesian, and can be solved by linear programming. A substantial advantage of our proposed approach is that the objective function is undemanding, having only a single parameter. This simplicity, thus, allows for further research development in the future. PMID:22721808

  18. Multistage classification of multispectral Earth observational data: The design approach

    NASA Technical Reports Server (NTRS)

    Bauer, M. E. (Principal Investigator); Muasher, M. J.; Landgrebe, D. A.

    1981-01-01

    An algorithm is proposed which predicts the optimal features at every node in a binary tree procedure. The algorithm estimates the probability of error by approximating the area under the likelihood ratio function for two classes and taking into account the number of training samples used in estimating each of these two classes. Some results on feature selection techniques, particularly in the presence of a very limited set of training samples, are presented. Results comparing probabilities of error predicted by the proposed algorithm as a function of dimensionality as compared to experimental observations are shown for aircraft and LANDSAT data. Results are obtained for both real and simulated data. Finally, two binary tree examples which use the algorithm are presented to illustrate the usefulness of the procedure.

  19. Seeing the trees yet not missing the forest: an airborne lidar approach

    NASA Astrophysics Data System (ADS)

    Guo, Q.; Li, W.; Flanagan, J.

    2011-12-01

    Light Detection and Ranging (lidar) is an optical remote sensing technology that measures properties of scattered light to find range and/or other information of a distant object. Due to its ability to generate 3-dimensional data with high spatial resolution and accuracy, lidar technology is being increasingly used in ecology, geography, geology, geomorphology, seismology, remote sensing, and atmospheric physics. In this study, we acquire airborne lidar data for the study of hydrologic, geomorphologic, and geochemical processes at six Critical Zone Observatories: Southern Sierra, Boulder Creek, Shale Hills, Luquillo, Jemez, and Christina River Basin. Each site will have two lidar flights (leaf on/off, or snow on/off). Based on lidar data, we derive various products, including high resolution Digital Elevation Model (DEM), Digital Surface Model (DSM), Canopy Height Model (CHM), canopy cover & closure, tree height, DBH, canopy base height, canopy bulk density, biomass, LAI, etc. A novel approach is also developed to map individual trees based on segmentation of lidar point clouds, and a virtual forest is simulated using the location of individual trees as well as tree structure information. The simulated image is then compared to a camera photo taken at the same location. The two images look very similar, but our simulated image provides not only a visually impressive rendering of the landscape but also all the detailed information about individual tree locations and forest structure properties.

  20. Two Approaches to Estimation of Classification Accuracy Rate under Item Response Theory

    ERIC Educational Resources Information Center

    Lathrop, Quinn N.; Cheng, Ying

    2013-01-01

    Within the framework of item response theory (IRT), there are two recent lines of work on the estimation of classification accuracy (CA) rate. One approach estimates CA when decisions are made based on total sum scores, the other based on latent trait estimates. The former is referred to as the Lee approach, and the latter, the Rudner approach,…

  1. A knowledge-based approach of satellite image classification for urban wetland detection

    NASA Astrophysics Data System (ADS)

    Xu, Xiaofan

    It has been a technical challenge to accurately detect urban wetlands with remotely sensed data by means of pixel-based image classification. This is mainly caused by inadequate spatial resolutions of satellite imagery, spectral similarities between urban wetlands and adjacent land covers, and the spatial complexity of wetlands in human-transformed, heterogeneous urban landscapes. Knowledge-based classification, with great potential to overcome or reduce these technical impediments, has been applied to various image classifications focusing on urban land use/land cover and forest wetlands, but rarely to mapping the wetlands in urban landscapes. This study aims to improve the mapping accuracy of urban wetlands by integrating the pixel-based classification with the knowledge-based approach. The study area is the metropolitan area of Kansas City, USA. SPOT satellite images of 1992, 2008, and 2010 were classified into four classes - wetland, farmland, built-up land, and forestland - using the pixel-based supervised maximum likelihood classification method. The products of supervised classification are used as the comparative base maps. For our new classification approach, a knowledge base is developed to improve urban wetland detection, which includes a set of decision rules of identifying wetland cover in relation to its elevation, spatial adjacencies, habitat conditions, hydro-geomorphological characteristics, and relevant geostatistics. Using ERDAS Imagine software's knowledge classifier tool, the decision rules are applied to the base maps in order to identify wetlands that are not able to be detected based on the pixel-based classification. The results suggest that the knowledge-based image classification approach can enhance the urban wetland detection capabilities and classification accuracies with remotely sensed satellite imagery.

  2. An ant colony approach for image texture classification

    NASA Astrophysics Data System (ADS)

    Ye, Zhiwei; Zheng, Zhaobao; Ning, Xiaogang; Yu, Xin

    2005-10-01

    Ant colonies, and more generally social insect societies, are distributed systems that show a highly structured social organization in spite of the simplicity of their individuals. As a result of this swarm intelligence, ant colonies can accomplish complex tasks that far exceed the individual capacities of a single ant. It is well known that aerial image texture classification is a long-standing, difficult problem that has not been fully solved. This paper presents an ant colony optimization methodology for image texture classification, which assigns N images to K types of clusters, with clustering viewed as a combinatorial optimization problem. The algorithm has been tested on some real images, and its performance is superior to that of the k-means algorithm. Computational simulations reveal very encouraging results in terms of the quality of the solution found.

  3. Neural network approach to classification of traffic flow states

    SciTech Connect

    Yang, H.; Qiao, F.

    1998-11-01

    The classification of traffic flow states in China has traditionally been based on the Highway Capacity Manual, published in the United States. Because traffic conditions are generally different from country to country, though, it is important to develop a practical and useful classification method applicable to Chinese highway traffic. In view of the difficulty and complexity of a mathematical and physical realization, modern pattern recognition methods are considered practical in fulfilling this goal. This study applies a self-organizing neural network pattern recognition method to classify highway traffic states into some distinctive cluster centers. A small scale test with actual data is conducted, and the method is found to be potentially applicable in practice.

  4. A science based approach to topical drug classification system (TCS).

    PubMed

    Shah, Vinod P; Yacobi, Avraham; Rădulescu, Flavian Ştefan; Miron, Dalia Simona; Lane, Majella E

    2015-08-01

    The Biopharmaceutics Classification System (BCS) for oral immediate release solid drug products has been very successful; its implementation in the drug industry and in regulatory approval has shown significant progress. This has been the case primarily because BCS was developed using sound scientific judgment. Following the success of BCS, we have considered topical drug products for a similar classification system based on sound scientific principles. In the USA, most generic topical drug products have qualitatively (Q1) and quantitatively (Q2) the same excipients as the reference listed drug (RLD). The applications of in vitro release (IVR) and in vitro characterization are considered for a range of dosage forms (suspensions, creams, ointments and gels) of differing strengths. We advance a Topical Drug Classification System (TCS) based on a consideration of Q1, Q2 as well as the arrangement of matter and microstructure of topical formulations (Q3). Four distinct classes are presented for the various scenarios that may arise, depending on whether a biowaiver can be granted or not. PMID:26070249

  5. A Novel Anti-classification Approach for Knowledge Protection.

    PubMed

    Lin, Chen-Yi; Chen, Tung-Shou; Tsai, Hui-Fang; Lee, Wei-Bin; Hsu, Tien-Yu; Kao, Yuan-Hung

    2015-10-01

    Classification is the problem of identifying to which of a set of categories new data belong, on the basis of a set of training data whose category membership is known. It is widely applied, for example in the medical science domain. The protection of classification knowledge has received increasing attention in recent years because of the popularity of cloud environments. In this paper, we propose a Shaking Sorted-Sampling (triple-S) algorithm for protecting the classification knowledge of a dataset. The triple-S algorithm sorts the data of an original dataset according to the projection results of principal component analysis so that the features of adjacent data are similar. Then, we generate noise data with incorrect classes and add those data to the original dataset. In addition, we develop an effective positioning strategy, which determines the positions at which noise data are added to the original dataset, to ensure that the original dataset can be restored after those noise data are removed. The experimental results show that the disturbance effect of the triple-S algorithm on the CLC, MySVM, and LibSVM classifiers increases when the noise data ratio increases. In addition, compared with existing methods, the disturbance effect of the triple-S algorithm is more significant on MySVM and LibSVM once the amount of noise data added to the original dataset reaches a certain level. PMID:26277613

  6. A Phenotypic Approach for IUIS PID Classification and Diagnosis: Guidelines for Clinicians at the Bedside

    PubMed Central

    Jeddane, Leïla; Ailal, Fatima; Al Herz, Waleed; Conley, Mary Ellen; Cunningham-Rundles, Charlotte; Etzioni, Amos; Fischer, Alain; Franco, Jose Luis; Geha, Raif S.; Hammarström, Lennart; Nonoyama, Shigeaki; Ochs, Hans D.; Roifman, Chaim M.; Seger, Reinhard; Tang, Mimi L. K.; Puck, Jennifer M.; Chapel, Helen; Notarangelo, Luigi D.; Casanova, Jean-Laurent

    2014-01-01

    The number of genetically defined Primary Immunodeficiency Diseases (PID) has increased exponentially, especially in the past decade. The biennial classification published by the IUIS PID expert committee is therefore quickly expanding, providing valuable information regarding the disease-causing genotypes, the immunological anomalies, and the associated clinical features of PIDs. These are grouped in eight, somewhat overlapping, categories of immune dysfunction. However, based on this immunological classification, the diagnosis of a specific PID from the clinician’s observation of an individual clinical and/or immunological phenotype remains difficult, especially for non-PID specialists. The purpose of this work is to suggest a phenotypic classification that forms the basis for diagnostic trees, leading the physician to particular groups of PIDs, starting from clinical features and combining routine immunological investigations along the way. We present 8 colored diagnostic figures that correspond to the 8 PID groups in the IUIS Classification, including all the PIDs cited in the 2011 update of the IUIS classification and most of those reported since. PMID:23657403

  7. Whole-fish versus filet polychlorinated-biphenyl concentrations: An analysis using classification and regression tree models

    SciTech Connect

    Amrhein, J.F.; Stow, C.A.; Wible, C.

    1999-08-01

    Fish polychlorinated-biphenyl (PCB) measurements usually represent one of two different sample types: filets or homogenized whole fish. Filet measurements are more appropriate for use if the goal of analysis is estimating human PCB consumption, while whole-fish analysis may be more useful for quantifying and understanding processes of contaminant flow and bioaccumulation. While it is generally assumed that whole-fish PCB concentrations exceed filet concentrations because of the presence of fatty internal organs in whole-fish samples, the literature contains no reported comparisons of filet versus whole-fish PCB concentrations. The authors measured total PCB concentrations in filets and whole-fish samples from the same individuals in Lake Michigan coho salmon (Oncorhynchus kisutch) and rainbow trout (Oncorhynchus mykiss). The average whole-fish to filet PCB concentration ratio was 1.70 for coho salmon and 1.47 for rainbow trout, but it varied considerably among individuals, with a few fish exhibiting a higher concentration in the filet than in the whole-fish sample. Classification and regression tree (CART) models indicated that filet PCB concentration and fish length were the best predictors of whole-fish PCB concentration, whereas filet and whole-fish lipid concentrations were less important predictors. Lipid normalization of the PCB data decreased within-individual variability, was equivocal with respect to variability among individuals, and accentuated the between-species difference. Both species exhibit a pronounced 1:1 relationship between the whole-fish to filet PCB concentration ratio and the whole-fish to filet lipid concentration ratio; however, the authors point out that there is a strong spurious component to this relationship, which indicates that the relationship may be more algebraic rather than an indication of underlying mechanisms.

  8. Modern technology calls for a modern approach to classification of epileptic seizures and the epilepsies.

    PubMed

    Lüders, Hans O; Amina, Shahram; Baumgartner, Christopher; Benbadis, Selim; Bermeo-Ovalle, Adriana; Devereaux, Michael; Diehl, Beate; Edwards, Jonathan; Baca-Vaca, Guadalupe Fernandez; Hamer, Hajo; Ikeda, Akio; Kaiboriboon, Kitti; Kellinghaus, Christoph; Koubeissi, Mohamad; Lardizabal, David; Lhatoo, Samden; Lüders, Jürgen; Mani, Jayanti; Mayor, Luis Carlos; Miller, Jonathan; Noachtar, Soheyl; Pestana, Elia; Rosenow, Felix; Sakamoto, Americo; Shahid, Asim; Steinhoff, Bernhard J; Syed, Tanvir; Tanner, Adriana; Tsuji, Sadatoshi

    2012-03-01

    In the last 10-15 years the ILAE Commission on Classification and Terminology has been presenting proposals to modernize the current ILAE Classification of Epileptic Seizures and Epilepsies. These proposals were discussed extensively in a series of articles published recently in Epilepsia and Epilepsy Currents. There is almost universal consensus that the availability of new diagnostic techniques, as well as a modern understanding of epilepsy, calls for a complete revision of the Classification of Epileptic Seizures and Epilepsies. Unfortunately, however, the Commission is still not prepared to take a bold step ahead and completely revisit our approach to classification of epileptic seizures and epilepsies. In this manuscript we critically analyze the current proposals of the Commission and make suggestions for a classification system that reflects modern diagnostic techniques and our current understanding of epilepsy. PMID:22332669

  9. A statistical approach to material classification using image patch exemplars.

    PubMed

    Varma, Manik; Zisserman, Andrew

    2009-11-01

    In this paper, we investigate material classification from single images obtained under unknown viewpoint and illumination. It is demonstrated that materials can be classified using the joint distribution of intensity values over extremely compact neighborhoods (starting from as small as 3 × 3 pixels square) and that this can outperform classification using filter banks with large support. It is also shown that the performance of filter banks is inferior to that of image patches with equivalent neighborhoods. We develop novel texton-based representations which are suited to modeling this joint neighborhood distribution for Markov random fields. The representations are learned from training images and then used to classify novel images (with unknown viewpoint and lighting) into texture classes. Three such representations are proposed and their performance is assessed and compared to that of filter banks. The power of the method is demonstrated by classifying 2,806 images of all 61 materials present in the Columbia-Utrecht database. The classification performance surpasses that of recent state-of-the-art filter bank-based classifiers such as Leung and Malik (IJCV 01), Cula and Dana (IJCV 04), and Varma and Zisserman (IJCV 05). We also benchmark performance by classifying all of the textures present in the UIUC, Microsoft Textile, and San Francisco outdoor data sets. We conclude with discussions on why features based on compact neighborhoods can correctly discriminate between textures with large global structure and why the performance of filter banks is not superior to that of the source image patches from which they were derived. PMID:19762929

  10. Classification

    NASA Technical Reports Server (NTRS)

    Oza, Nikunj C.

    2011-01-01

    A supervised learning task involves constructing a mapping from input data (normally described by several features) to the appropriate outputs. Within supervised learning, one type of task is a classification learning task, in which each output is one or more classes to which the input belongs. In supervised learning, a set of training examples (examples with known output values) is used by a learning algorithm to generate a model. This model is intended to approximate the mapping between the inputs and outputs. This model can be used to generate predicted outputs for inputs that have not been seen before. For example, we may have data consisting of observations of sunspots. In a classification learning task, our goal may be to learn to classify sunspots into one of several types. Each example may correspond to one candidate sunspot with various measurements or just an image. A learning algorithm would use the supplied examples to generate a model that approximates the mapping between each supplied set of measurements and the type of sunspot. This model can then be used to classify previously unseen sunspots based on the candidate's measurements. This chapter discusses methods to perform machine learning, with examples involving astronomy.
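
    The train-then-predict workflow outlined in this abstract can be sketched with a nearest-centroid classifier, chosen here only because it is short; the chapter itself does not name this method. The feature values, the class labels "simple" and "complex", and the function names are made up for illustration.

```python
import math
from collections import defaultdict


def train_centroids(examples):
    """Build a model from (features, label) pairs: one mean feature vector per class."""
    sums = {}
    counts = defaultdict(int)
    for features, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        for i, value in enumerate(features):
            sums[label][i] += value
        counts[label] += 1
    return {label: [s / counts[label] for s in vec] for label, vec in sums.items()}


def predict(model, features):
    """Assign the class whose centroid is closest in Euclidean distance."""
    return min(model, key=lambda label: math.dist(model[label], features))


# Hypothetical training examples: two measurements per candidate sunspot.
training = [
    ((1.0, 2.0), "simple"),
    ((1.5, 1.0), "simple"),
    ((8.0, 9.0), "complex"),
    ((9.0, 7.5), "complex"),
]
model = train_centroids(training)
```

    Calling `predict(model, (2.0, 2.0))` then classifies a previously unseen measurement vector, which corresponds to the final step described in the abstract.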

  11. An improved spanning tree approach for the reliability analysis of supply chain collaborative network

    NASA Astrophysics Data System (ADS)

    Lam, C. Y.; Ip, W. H.

    2012-11-01

    A higher degree of reliability in the collaborative network can increase the competitiveness and performance of an entire supply chain. As supply chain networks grow more complex, the consequences of unreliable behaviour become increasingly severe in terms of cost, effort and time. Moreover, it is computationally difficult to calculate the network reliability of a Non-deterministic Polynomial-time hard (NP-hard) all-terminal network using state enumeration, as this may require a huge number of iterations for topology optimisation. Therefore, this paper proposes an alternative approach of an improved spanning tree for reliability analysis to help effectively evaluate and analyse the reliability of collaborative networks in supply chains and reduce the comparative computational complexity of algorithms. Set theory is employed to evaluate and model the all-terminal reliability of the improved spanning tree algorithm and present a case study of a supply chain used in lamp production to illustrate the application of the proposed approach.

  12. Hierarchical Object-based Image Analysis approach for classification of sub-meter multispectral imagery in Tanzania

    NASA Astrophysics Data System (ADS)

    Chung, C.; Nagol, J. R.; Tao, X.; Anand, A.; Dempewolf, J.

    2015-12-01

    Increasing agricultural production while at the same time preserving the environment has become a challenging task. There is a need for new approaches for use of multi-scale and multi-source remote sensing data as well as ground based measurements for mapping and monitoring crop and ecosystem state to support decision making by governmental and non-governmental organizations for sustainable agricultural development. High resolution sub-meter imagery plays an important role in such an integrative framework of landscape monitoring. It helps link the ground based data to more easily available coarser resolution data, facilitating calibration and validation of derived remote sensing products. Here we present a hierarchical Object Based Image Analysis (OBIA) approach to classify sub-meter imagery. The primary reason for choosing OBIA is to accommodate pixel sizes smaller than the object or class of interest. Especially in non-homogeneous savannah regions of Tanzania, this is an important concern and the traditional pixel based spectral signature approach often fails. Ortho-rectified, calibrated, pan sharpened 0.5 meter resolution data acquired from DigitalGlobe's WorldView-2 satellite sensor was used for this purpose. Multi-scale hierarchical segmentation was performed using a multi-resolution segmentation approach to facilitate the use of texture, neighborhood context, and the relationship between super and sub objects for training and classification. eCognition, a commonly used OBIA software program, was used for this purpose. Both decision tree and random forest approaches for classification were tested. The Kappa index of agreement for both algorithms surpassed 85%. The results demonstrate that using hierarchical OBIA can effectively and accurately discriminate classes even at the LCCS-3 legend level.
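    The Kappa index of agreement reported here is commonly computed as Cohen's kappa from a confusion matrix. The sketch below assumes that definition. The function name and the 3-class matrix are hypothetical and do not come from the study.

```python
def cohen_kappa(confusion):
    """Cohen's kappa for a square confusion matrix.

    Rows are reference classes, columns are predicted classes.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(n)) / total
    # Chance agreement: product of row and column marginals, summed over classes.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(n)
    ) / (total * total)
    return (observed - expected) / (1.0 - expected)


# Hypothetical confusion matrix (illustrative values only).
example = [
    [45, 3, 2],
    [4, 40, 6],
    [1, 5, 44],
]
kappa = cohen_kappa(example)
```

    Kappa discounts the agreement expected by chance, so it is lower than the overall accuracy whenever chance agreement is non-zero.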

  13. Tropical forest structure characterization using airborne lidar data: an individual tree level approach

    NASA Astrophysics Data System (ADS)

    Ferraz, A.; Saatchi, S. S.

    2015-12-01

    Fine scale tropical forest structure characterization has been performed by means of field measurement techniques that record both the species and the diameter at the breast height (dbh) for every tree within a given area. Due to dense and complex vegetation, additional important ecological variables (e.g. the tree height and crown size) are usually not measured because they are hardly recognized from the ground. The poor knowledge of the 3D tropical forest structure has been a major limitation for the understanding of different ecological issues such as the spatial distribution of carbon stocks, regeneration and competition dynamics and light penetration gradient assessments. Airborne laser scanning (ALS) is an active remote sensing technique that provides georeferenced distance measurements between the aircraft and the surface. It provides an unstructured 3D point cloud that is a high-resolution model of the forest. This study presents the first approach for tropical forest characterization at a fine scale using remote sensing data. The multi-modal lidar point cloud is decomposed into 3D clusters that correspond to single trees by means of a technique called Adaptive Mean Shift Segmentation (AMS3D). The ability of the corresponding individual tree metrics (tree height, crown area and crown volume) for the estimation of above ground biomass (agb) over the 50 ha CTFS plot in Barro Colorado Island is here assessed. We conclude that our approach is able to map the agb spatial distribution with an error of nearly 12% (RMSE = 28 Mg ha⁻¹) compared with field-based estimates over 1 ha plots.

  14. Classification and therapeutic approaches in autoimmune hemolytic anemia: an update.

    PubMed

    Michel, Marc

    2011-12-01

    Autoimmune hemolytic anemia (AIHA) is an uncommon autoantibody-mediated immune disorder that affects both children and adults. The diagnosis of AIHA relies mainly on the direct antiglobulin test, which is a highly sensitive and relatively specific test. The classification of AIHA is based on the pattern of the direct antiglobulin test and on the immunochemical properties of the autoantibody (warm or cold type), but also on the presence or absence of an underlying condition or disease (secondary vs primary AIHAs) that may have an impact on treatment and outcome. The distinction between AIHAs due to warm antibody (wAIHA) and AIHAs due to cold antibody is a crucial step of the diagnostic procedure as it influences the therapeutic strategy. Whereas corticosteroids are the cornerstone of treatment in wAIHA, they have no or little efficacy in cold AIHA. In wAIHA that is refractory or dependent to corticosteroids, splenectomy and rituximab are both good alternatives and the benefit-risk ratio of each option must be discussed on an individual basis. In chronic agglutinin disease, the most common variety of cold AIHA in adults, beyond supportive measures, rituximab given either alone or in combination with chemotherapy may be helpful. In this article, the classification of AIHA and the recent progress in therapeutics are discussed. PMID:22077525

  15. Ventricular fibrillation and tachycardia classification using a machine learning approach.

    PubMed

    Li, Qiao; Rajagopalan, Cadathur; Clifford, Gari D

    2014-06-01

    Correct detection and classification of ventricular fibrillation (VF) and rapid ventricular tachycardia (VT) is of pivotal importance for an automatic external defibrillator and patient monitoring. In this paper, a VF/VT classification algorithm using a machine learning method, a support vector machine, is proposed. A total of 14 metrics were extracted from a specific window length of the electrocardiogram (ECG). A genetic algorithm was then used to select the optimal variable combinations. Three annotated public domain ECG databases (the American Heart Association Database, the Creighton University Ventricular Tachyarrhythmia Database, and the MIT-BIH Malignant Ventricular Arrhythmia Database) were used as training, test, and validation datasets. Different window sizes, varying from 1 to 10 s were tested. An accuracy (Ac) of 98.1%, sensitivity (Se) of 98.4%, and specificity (Sp) of 98.0% were obtained on the in-sample training data with 5 s-window size and two selected metrics. On the out-of-sample validation data, an Ac of 96.3% ± 3.4%, Se of 96.2% ± 2.7%, and Sp of 96.2% ± 4.6% were obtained by fivefold cross validation. The results surpass those of current reported methods. PMID:23899591
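    The accuracy (Ac), sensitivity (Se) and specificity (Sp) figures quoted above follow the standard binary definitions, which can be sketched as follows. The function name and the ten example labels are hypothetical and are not data from the paper.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity and specificity for a binary classifier."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # shockable rhythm correctly detected
            else:
                fn += 1  # shockable rhythm missed
        else:
            if pred == positive:
                fp += 1  # false alarm
            else:
                tn += 1  # non-shockable rhythm correctly rejected
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Hypothetical labels: 1 = VF/VT segment, 0 = other rhythm.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
metrics = binary_metrics(y_true, y_pred)
```

    In a fivefold cross validation such as the one reported, these three values would be computed once per held-out fold and then summarized as mean ± standard deviation.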

  16. Spectral data analysis approaches for improved provenance classification

    NASA Astrophysics Data System (ADS)

    Sorauf, Kellen J.; Bauer, Amy J. R.; Miziolek, Andrzej W.; De Lucia, Frank C.

    2015-06-01

    In the last 10 years various chemometric methods have been developed and used for the analysis of spectra generated by Laser Induced Breakdown Spectroscopy (LIBS). One of the more successful and proven methods is Partial Least Squares Discriminant Analysis (PLS-DA). Recently PLS-DA was utilized for purposes of provenance of spent brass cartridges and achieved correct classification at around 93% with a false alarm rate of around 5%. The LIBS spectra from the cartridge samples are rich in emission lines from numerous mostly metallic elements comprising the brass and the cited results were based on the analysis of the full broadband high resolution spectra. It was observed that some of the lines were clearly saturated in all spectra, while others were sometimes saturated due to pulse-to-pulse variation. The pulse-to-pulse variation was also evident in the intensity variations of the spectra within cartridges and between cartridges. In order to improve on the accuracy of the classification we have developed some preprocessing strategies including the removal of spectral wavelength ranges susceptible to saturation and normalization techniques to diminish the effects of intensity variations in the spectra. The results indicate incremental improvements when applying additional preprocessing steps to the limit of 100% True Positives and 0% False Positives when utilizing selected wavelengths that are normalized and averaged.

  17. Gene selection approach based on improved swarm intelligent optimisation algorithm for tumour classification.

    PubMed

    Jin, Cong; Jin, Shu-Wei

    2016-06-01

    A number of different gene selection approaches based on gene expression profiles (GEP) have been developed for tumour classification. A gene selection approach selects the most informative genes from the whole gene space, which is an important process for tumour classification using GEP. This study presents an improved swarm intelligent optimisation algorithm to select genes for maintaining the diversity of the population. The most essential characteristic of the proposed approach is that it can automatically determine the number of the selected genes. On the basis of the gene selection, the authors construct a variety of the tumour classifiers, including the ensemble classifiers. Four gene datasets are used to evaluate the performance of the proposed approach. The experimental results confirm that the proposed classifiers for tumour classification are indeed effective. PMID:27187989

  18. Multi-variate flood damage assessment: a tree-based data-mining approach

    NASA Astrophysics Data System (ADS)

    Merz, B.; Kreibich, H.; Lall, U.

    2013-01-01

    The usual approach for flood damage assessment consists of stage-damage functions which relate the relative or absolute damage for a certain class of objects to the inundation depth. Other characteristics of the flooding situation and of the flooded object are rarely taken into account, although flood damage is influenced by a variety of factors. We apply a group of data-mining techniques, known as tree-structured models, to flood damage assessment. A very comprehensive data set of more than 1000 records of direct building damage of private households in Germany is used. Each record contains details about a large variety of potential damage-influencing characteristics, such as hydrological and hydraulic aspects of the flooding situation, early warning and emergency measures undertaken, state of precaution of the household, building characteristics and socio-economic status of the household. Regression trees and bagging decision trees are used to select the more important damage-influencing variables and to derive multi-variate flood damage models. It is shown that these models outperform existing models, and that tree-structured models are a promising alternative to traditional damage models.

  19. A New Approach in Teaching the Features and Classifications of Invertebrate Animals in Biology Courses

    ERIC Educational Resources Information Center

    Sezek, Fatih

    2013-01-01

    This study examined the effectiveness of a new learning approach in teaching classification of invertebrate animals in biology courses. In this approach, we used an impersonal style: the subject jigsaw, which differs from the other jigsaws in that both course topics and student groups are divided. Students in Jigsaw group were divided into five…

  20. Comprehensive Decision Tree Models in Bioinformatics

    PubMed Central

    Stiglic, Gregor; Kocbek, Simon; Pernek, Igor; Kokol, Peter

    2012-01-01

    Purpose: Classification is an important and widely used machine learning technique in bioinformatics. Researchers and other end-users of machine learning software often prefer to work with comprehensible models where knowledge extraction and explanation of reasoning behind the classification model are possible. Methods: This paper presents an extension to an existing machine learning environment and a study on visual tuning of decision tree classifiers. The motivation for this research comes from the need to build effective and easily interpretable decision tree models by the so-called one-button data mining approach where no parameter tuning is needed. To avoid bias in classification, no classification performance measure is used during the tuning of the model that is constrained exclusively by the dimensions of the produced decision tree. Results: The proposed visual tuning of decision trees was evaluated on 40 datasets containing classical machine learning problems and 31 datasets from the field of bioinformatics. Although we did not expect significant differences in classification performance, the results demonstrate a significant increase of accuracy in less complex visually tuned decision trees. In contrast to classical machine learning benchmarking datasets, we observe higher accuracy gains in bioinformatics datasets. Additionally, a user study was carried out to confirm the assumption that the tree tuning times are significantly lower for the proposed method in comparison to manual tuning of the decision tree. Conclusions: The empirical results demonstrate that by building simple models constrained by predefined visual boundaries, one not only achieves good comprehensibility, but also very good classification performance that does not differ from usually more complex models built using default settings of the classical decision tree algorithm. In addition, our study demonstrates the suitability of visually tuned decision trees for datasets with binary class

  1. Investigating the limitations of tree species classification using the Combined Cluster and Discriminant Analysis method for low density ALS data from a dense forest region in Aggtelek (Hungary)

    NASA Astrophysics Data System (ADS)

    Koma, Zsófia; Deák, Márton; Kovács, József; Székely, Balázs; Kelemen, Kristóf; Standovár, Tibor

    2016-04-01

    Airborne Laser Scanning (ALS) is a widely used technology for forestry classification applications. However, single tree detection and species classification from low density ALS point cloud is limited in a dense forest region. In this study we investigate the division of a forest into homogeneous groups at stand level. The study area is located in the Aggtelek karst region (Northeast Hungary) with a complex relief topography. The ALS dataset contained only 4 discrete echoes (at 2-4 pt/m² density) from the study area during leaf-on season. Ground-truth measurements about canopy closure and proportion of tree species cover are available for every 70 meters in 500 square meter circular plots. In the first step, ALS data were processed and geometrical and intensity based features were calculated into a 5×5 meter raster based grid. The derived features contained: basic statistics of relative height, canopy RMS, echo ratio, openness, pulse penetration ratio, basic statistics of radiometric feature. In the second step the data were investigated using Combined Cluster and Discriminant Analysis (CCDA, Kovács et al., 2014). The CCDA method first determines a basic grouping for the multiple circle shaped sampling locations using hierarchical clustering and then for the arising grouping possibilities a core cycle is executed comparing the goodness of the investigated groupings with random ones. Out of these comparisons difference values arise, yielding information about the optimal grouping out of the investigated ones. If sub-groups are then further investigated, one might even find homogeneous groups. We found that low density ALS data classification into homogeneous groups is highly dependent on canopy closure and the proportion of the dominant tree species. The presented results show high potential using CCDA for determination of homogeneous separable groups in LiDAR based tree species classification. 
Aggtelek Karst/Slovakian Karst Caves" (HUSK/1101/221/0180, Aggtelek NP

  2. Robust Machine Learning Applied to Astronomical Data Sets. I. Star-Galaxy Classification of the Sloan Digital Sky Survey DR3 Using Decision Trees

    NASA Astrophysics Data System (ADS)

    Ball, Nicholas M.; Brunner, Robert J.; Myers, Adam D.; Tcheng, David

    2006-10-01

    We provide classifications for all 143 million nonrepeat photometric objects in the Third Data Release of the SDSS using decision trees trained on 477,068 objects with SDSS spectroscopic data. We demonstrate that these star/galaxy classifications are expected to be reliable for approximately 22 million objects with r ≲ 20. The general machine learning environment Data-to-Knowledge and supercomputing resources enabled extensive investigation of the decision tree parameter space. This work presents the first public release of objects classified in this way for an entire SDSS data release. The objects are classified as either galaxy, star, or nsng (neither star nor galaxy), with an associated probability for each class. To demonstrate how to effectively make use of these classifications, we perform several important tests. First, we detail selection criteria within the probability space defined by the three classes to extract samples of stars and galaxies to a given completeness and efficiency. Second, we investigate the efficacy of the classifications and the effect of extrapolating from the spectroscopic regime by performing blind tests on objects in the SDSS, 2dFGRS, and 2QZ surveys. Given the photometric limits of our spectroscopic training data, we effectively begin to extrapolate past our star-galaxy training set at r ∼ 18. By comparing the number counts of our training sample with the classified sources, however, we find that our efficiencies appear to remain robust to r ∼ 20. As a result, we expect our classifications to be accurate for 900,000 galaxies and 6.7 million stars and remain robust via extrapolation for a total of 8.0 million galaxies and 13.9 million stars.

  3. Voxel-Based Approach for Estimating Urban Tree Volume from Terrestrial Laser Scanning Data

    NASA Astrophysics Data System (ADS)

    Vonderach, C.; Voegtle, T.; Adler, P.

    2012-07-01

    The importance of single trees and the determination of related parameters has been recognized in recent years, e.g. for forest inventories or management. For urban areas an increasing interest in the data acquisition of trees can be observed concerning aspects like urban climate, CO₂ balance, and environmental protection. Urban trees differ significantly from natural systems with regard to the site conditions (e.g. technogenic soils, contaminants, lower groundwater level, regular disturbance), climate (increased temperature, reduced humidity) and species composition and arrangement (habitus and health status) and therefore allometric relations cannot be transferred from natural sites to urban areas. To overcome this problem an extended approach was developed for a fast and non-destructive extraction of branch volume, DBH (diameter at breast height) and height of single trees from point clouds of terrestrial laser scanning (TLS). For data acquisition, the trees were scanned with the highest scan resolution from several (up to five) positions located around the tree. The resulting point clouds (20 to 60 million points) are analysed with an algorithm based on voxel (volume elements) structure, leading to an appropriate data reduction. In a first step, two kinds of noise reduction are carried out: the elimination of isolated voxels as well as voxels with marginal point density. To obtain correct volume estimates, the voxels inside the stem and branches (interior voxels), which contain no laser points, must be taken into account. For this filling process, an easy and robust approach was developed based on a layer-wise (horizontal layers of the voxel structure) intersection of four orthogonal viewing directions. However, this procedure also generates several erroneous "phantom" voxels, which have to be eliminated. For this purpose the previous approach was extended by a special region growing algorithm. 
In a final step the volume is determined layer-wise based on the extracted

  4. Characterizing Vocal Repertoires—Hard vs. Soft Classification Approaches

    PubMed Central

    Wadewitz, Philip; Hammerschmidt, Kurt; Battaglia, Demian; Witt, Annette; Wolf, Fred; Fischer, Julia

    2015-01-01

    To understand the proximate and ultimate causes that shape acoustic communication in animals, objective characterizations of the vocal repertoire of a given species are critical, as they provide the foundation for comparative analyses among individuals, populations and taxa. Progress in this field has been hampered by a lack of standardized methodology, however. One problem is that researchers may settle on different variables to characterize the calls, which may impact on the classification of calls. More important, there is no agreement on how to best characterize the overall structure of the repertoire in terms of the amount of gradation within and between call types. Here, we address these challenges by examining 912 calls recorded from wild chacma baboons (Papio ursinus). We extracted 118 acoustic variables from spectrograms, from which we constructed different sets of acoustic features, containing 9, 38, and 118 variables, as well as 19 factors derived from principal component analysis. We compared and validated the resulting classifications of k-means and hierarchical clustering. Datasets with a higher number of acoustic features lead to better clustering results than datasets with only a few features. The use of factors in the cluster analysis resulted in an extremely poor resolution of emerging call types. Another important finding is that none of the applied clustering methods gave strong support to a specific cluster solution. Instead, the cluster analysis revealed that within distinct call types, subtypes may exist. Because hard clustering methods are not well suited to capture such gradation within call types, we applied a fuzzy clustering algorithm. We found that this algorithm provides a detailed and quantitative description of the gradation within and between chacma baboon call types. In conclusion, we suggest that fuzzy clustering should be used in future studies to analyze the graded structure of vocal repertoires. 
Moreover, the use of factor analyses to

  5. Neuropsychological Test Selection for Cognitive Impairment Classification: A Machine Learning Approach

    PubMed Central

    Williams, Jennifer A.; Schmitter-Edgecombe, Maureen; Cook, Diane J.

    2016-01-01

    Introduction: Reducing the amount of testing required to accurately detect cognitive impairment is clinically relevant. The aim of this research was to determine the fewest number of clinical measures required to accurately classify participants as healthy older adult, mild cognitive impairment (MCI) or dementia using a suite of classification techniques. Methods: Two variable selection machine learning models (i.e., naive Bayes, decision tree), a logistic regression, and two participant datasets (i.e., clinical diagnosis, clinical dementia rating; CDR) were explored. Participants classified using clinical diagnosis criteria included 52 individuals with dementia, 97 with MCI, and 161 cognitively healthy older adults. Participants classified using CDR included 154 individuals with CDR = 0, 93 individuals with CDR = 0.5, and 25 individuals with CDR = 1.0+. Twenty-seven demographic, psychological, and neuropsychological variables were available for variable selection. Results: No significant difference was observed between naive Bayes, decision tree, and logistic regression models for classification of both clinical diagnosis and CDR datasets. Participant classification (70.0 – 99.1%), geometric mean (60.9 – 98.1%), sensitivity (44.2 – 100%), and specificity (52.7 – 100%) were generally satisfactory. Unsurprisingly, the MCI/CDR = 0.5 participant group was the most challenging to classify. Through variable selection only 2 – 9 variables were required for classification and varied between datasets in a clinically meaningful way. Conclusions: The current study results reveal that machine learning techniques can accurately classify cognitive impairment and reduce the number of measures required for diagnosis. PMID:26332171

  6. A Consensus Tree Approach for Reconstructing Human Evolutionary History and Detecting Population Substructure

    NASA Astrophysics Data System (ADS)

    Tsai, Ming-Chi; Blelloch, Guy; Ravi, R.; Schwartz, Russell

    The random accumulation of variations in the human genome over time implicitly encodes a history of how human populations have arisen, dispersed, and intermixed since we emerged as a species. Reconstructing that history is a challenging computational and statistical problem but has important applications both to basic research and to the discovery of genotype-phenotype correlations. In this study, we present a novel approach to inferring human evolutionary history from genetic variation data. Our approach uses the idea of consensus trees, a technique generally used to reconcile species trees from divergent gene trees, adapting it to the problem of finding the robust relationships within a set of intraspecies phylogenies derived from local regions of the genome. We assess the quality of the method on two large-scale genetic variation data sets: the HapMap Phase II and the Human Genome Diversity Project. Qualitative comparison to a consensus model of the evolution of modern human population groups shows that our inferences closely match our best current understanding of human evolutionary history. A further comparison with results of a leading method for the simpler problem of population substructure assignment verifies that our method provides comparable accuracy in identifying meaningful population subgroups in addition to inferring the relationships among them.

  7. Classification of Sherry vinegars by combining multidimensional fluorescence, parafac and different classification approaches.

    PubMed

    Callejón, Raquel M; Amigo, José Manuel; Pairo, Erola; Garmón, Sergio; Ocaña, Juan Antonio; Morales, Maria Lourdes

    2012-01-15

    Sherry vinegar is a much appreciated product from Jerez-Xérès-Sherry, Manzanilla de Sanlúcar and Vinagre de Jerez Protected Designation in southwestern Spain. Its complexity and the extraordinary organoleptic properties are acquired thanks to the method of production followed, the so-called "criaderas y solera" ageing system. Three qualities for Sherry vinegar are considered according to ageing time in oak barrels: "Vinagre de Jerez" (minimum of 6 months), "Reserva" (at least 2 years) and "Gran Reserva" (at least 10 years). In the last few years, there has been an increasing need to develop rapid, inexpensive and effective analytical methods that require little sample manipulation for the analysis and characterization of Sherry vinegar. Fluorescence spectroscopy is emerging as a competitive technique for this purpose, since it provides in a few seconds an excitation-emission landscape that may be used as a fingerprint of the vinegar. Multi-way analysis, specifically Parallel Factor Analysis (PARAFAC), is a powerful tool for simultaneous determination of fluorescent components, because it extracts the most relevant information from the data and allows building robust models. Moreover, the information obtained by PARAFAC can be used to build robust and reliable classification and discrimination models (e.g. by using Support Vector Machines and Partial Least Squares-Discriminant Analysis models). In this context, the aim of this work was to study the possibilities of multi-way fluorescence linked to PARAFAC and to classify the different Sherry vinegars according to their ageing. The results demonstrated that the proposed analytical and chemometric tools are a perfect combination to extract relevant chemical information about the vinegars as well as to classify and discriminate them considering the different ageing. PMID:22265526

  8. Statistical methods and neural network approaches for classification of data from multiple sources

    NASA Technical Reports Server (NTRS)

    Benediktsson, Jon Atli; Swain, Philip H.

    1990-01-01

    Statistical methods for classification of data from multiple data sources are investigated and compared to neural network models. A problem with using conventional multivariate statistical approaches for classification of data of multiple types is in general that a multivariate distribution cannot be assumed for the classes in the data sources. Another common problem with statistical classification methods is that the data sources are not equally reliable. This means that the data sources need to be weighted according to their reliability but most statistical classification methods do not have a mechanism for this. This research focuses on statistical methods which can overcome these problems: a method of statistical multisource analysis and consensus theory. Reliability measures for weighting the data sources in these methods are suggested and investigated. Secondly, this research focuses on neural network models. The neural networks are distribution free since no prior knowledge of the statistical distribution of the data is needed. This is an obvious advantage over most statistical classification methods. The neural networks also automatically take care of the problem involving how much weight each data source should have. On the other hand, their training process is iterative and can take a very long time. Methods to speed up the training procedure are introduced and investigated. Experimental results of classification using both neural network models and statistical methods are given, and the approaches are compared based on these results.

  9. Identifying Risk and Protective Factors in Recidivist Juvenile Offenders: A Decision Tree Approach.

    PubMed

    Ortega-Campos, Elena; García-García, Juan; Gil-Fenoy, Maria José; Zaldívar-Basurto, Flor

    2016-01-01

    Research on juvenile justice aims to identify profiles of risk and protective factors in juvenile offenders. This paper presents a study of profiles of risk factors that influence young offenders toward committing sanctionable antisocial behavior (S-ASB). Decision tree analysis is used as a multivariate approach to the phenomenon of repeated sanctionable antisocial behavior in juvenile offenders in Spain. The study sample was made up of the set of juveniles who were charged in a court case in the Juvenile Court of Almeria (Spain). The period of study of recidivism was two years from the baseline. The object of study is presented through the implementation of a decision tree. Two profiles of risk and protective factors are found. Risk factors associated with higher rates of recidivism are antisocial peers, age at baseline S-ASB, problems in school and criminality in family members. PMID:27611313

  10. Sovereign debt crisis in the European Union: A minimum spanning tree approach

    NASA Astrophysics Data System (ADS)

    Dias, João

    2012-03-01

    In the wake of the financial crisis, a sovereign debt crisis has emerged and is severely affecting some countries in the European Union, threatening the viability of the euro and even the EU itself. This paper applies recent developments in econophysics, in particular the minimum spanning tree approach and the associated hierarchical tree, to analyze the asynchronization between the four most affected countries and other resilient countries in the euro area. For this purpose, daily government bond yield rates are used, covering the period from April 2007 to October 2010, thus including yield rates before, during and after the financial crises. The results show an increasing separation of the two groups of euro countries with the deepening of the government bond crisis.
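
    As an illustration of the minimum spanning tree approach, the sketch below builds a tree from pairwise correlations of yield series. It assumes the distance d = sqrt(2(1 - rho)) that is common in econophysics and uses Kruskal's algorithm; the abstract does not state which distance or algorithm was used, and the country labels and series are made up.

```python
import math


def correlation(x, y):
    """Pearson correlation of two equally long series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def minimum_spanning_tree(series):
    """Kruskal's algorithm on correlation distances d = sqrt(2 * (1 - rho))."""
    names = sorted(series)
    edges = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            rho = correlation(series[a], series[b])
            edges.append((math.sqrt(max(0.0, 2.0 * (1.0 - rho))), a, b))
    edges.sort()

    parent = {name: name for name in names}

    def find(node):
        # Union-find lookup with path halving.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    tree = []
    for distance, a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # the edge joins two separate components
            parent[root_a] = root_b
            tree.append((a, b, distance))
    return tree


# Hypothetical yield series for four countries labelled A to D.
series = {
    "A": [1, 2, 3, 4, 5],
    "B": [1, 2, 3, 4, 6],
    "C": [5, 3, 4, 1, 2],
    "D": [5, 3, 4, 2, 1],
}
tree = minimum_spanning_tree(series)
for a, b, distance in tree:
    print(a, b, round(distance, 3))
```

    Countries whose yields move together end up adjacent in the tree, so two groups that drift apart show up as clusters joined by a single long edge.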

  11. Marker-Based Hierarchical Segmentation and Classification Approach for Hyperspectral Imagery

    NASA Technical Reports Server (NTRS)

    Tarabalka, Yuliya; Tilton, James C.; Benediktsson, Jon Atli; Chanussot, Jocelyn

    2011-01-01

    The Hierarchical SEGmentation (HSEG) algorithm, which is a combination of hierarchical step-wise optimization and spectral clustering, has given good performances for hyperspectral image analysis. This technique produces at its output a hierarchical set of image segmentations. The automated selection of a single segmentation level is often necessary. We propose and investigate the use of automatically selected markers for this purpose. In this paper, a novel Marker-based HSEG (M-HSEG) method for spectral-spatial classification of hyperspectral images is proposed. First, pixelwise classification is performed and the most reliably classified pixels are selected as markers, with the corresponding class labels. Then, a novel constrained marker-based HSEG algorithm is applied, resulting in a spectral-spatial classification map. The experimental results show that the proposed approach yields accurate segmentation and classification maps, and thus is attractive for hyperspectral image analysis.

  12. Classification of prosthetic heart valve sounds. A parametric approach

    SciTech Connect

    Candy, J.V.; Jones, H.E.

    1995-06-01

    People with heart problems have had their lives extended considerably with the development of the prosthetic heart valve. Great strides have been made in the development of the valves through the use of improved materials as well as efficient mechanical designs. However, since the valves operate continuously over a long period, structural failures can occur, even though they are relatively uncommon. Here the development of techniques to classify the valve either as having intact struts or as having a separated strut, commonly called single leg separation, is discussed. In this paper the signal processing techniques employed to extract the required signals/parameters are briefly reviewed and then it is shown how they can be used to simulate a synthetic heart valve database for eventual Monte Carlo testing. Next, the optimal classifier is developed under assumed conditions and its performance is compared to that of an adaptive-type classifier implemented with a probabilistic neural network. Finally, the adaptive classifier is applied to a data set and its performance is analyzed. Based on synthetic data it is shown that excellent performance of the classifiers can be achieved, implying a potentially robust solution to this classification problem. 21 refs., 11 figs., 1 tab.

  13. A comparison of two classification based approaches for downscaling of monthly PM10 concentrations

    NASA Astrophysics Data System (ADS)

    Beck, Christoph; Weitnauer, Claudia; Jacobeit, Jucundus

    2013-04-01

    Circulation type classifications may be utilised for the downscaling of local climatic and environmental target variables in different methodological settings. In this contribution we apply and compare two different classification based approaches for downscaling of monthly indices of PM10 concentrations (monthly mean and number of days exceeding a certain threshold) at different stations in Bavaria (Germany) during the period 1979 to 2010. The first approach uses monthly frequencies of circulation types as predictors in multiple linear regression models (stepwise regression) to estimate monthly predictand values (monthly PM10 indices). The second approach utilizes type specific mean values of the target variable, determined for a calibration period, to estimate predictand values in the validation period. Both approaches are run using varying circulation classifications. This comprises different methodological concepts for circulation classification (e.g. threshold based methods, leader algorithms, cluster analysis) as well as different temporal (1-day or multiple day sequences) and spatial domains (synoptic to continental scale). All models are applied to multiple calibration and validation samples and different skill scores (e.g. reduction of variance, Pearson R) are estimated for each of the validation samples in order to quantify model performance. As main preliminary findings we may state that: (1) the regression based downscaling approach in most cases clearly outperforms the approach that uses type specific mean values (reference forecasting); (2) best skill is reached in winter (DJF) and spring (MAM); (3) comparable model skill is reached for the downscaling of monthly means and extremes indicators (number of days exceeding a certain threshold).
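
    The second approach (type specific mean values) and the reduction-of-variance skill score can be sketched as follows. The aggregation of type means into a monthly estimate by day counts is an assumption, as are the function names, the circulation type labels and all numbers.

```python
from collections import defaultdict


def type_specific_means(circulation_types, pm10_values):
    """Mean PM10 per circulation type over a calibration period."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for ctype, value in zip(circulation_types, pm10_values):
        totals[ctype] += value
        counts[ctype] += 1
    return {ctype: totals[ctype] / counts[ctype] for ctype in totals}


def estimate_monthly_mean(type_means, type_day_counts):
    """Monthly estimate: type means weighted by how often each type occurred."""
    days = sum(type_day_counts.values())
    return sum(type_means[c] * n for c, n in type_day_counts.items()) / days


def reduction_of_variance(observed, predicted, reference):
    """Skill score 1 - MSE / MSE_ref, with a constant reference forecast."""
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    mse_ref = sum((o - reference) ** 2 for o in observed) / len(observed)
    return 1.0 - mse / mse_ref


# Hypothetical calibration days: circulation type and PM10 concentration.
means = type_specific_means(["W", "W", "E", "E"], [20.0, 30.0, 50.0, 70.0])
# Hypothetical validation month with 3 days of type W and 1 day of type E.
estimate = estimate_monthly_mean(means, {"W": 3, "E": 1})
print(means, estimate)
```

    A reduction of variance of 1 indicates a perfect estimate and 0 indicates no improvement over the constant reference.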

  14. Comparing two approaches to the K-theory classification of D-branes

    NASA Astrophysics Data System (ADS)

    Ferrari Ruffino, Fabio; Savelli, Raffaele

    2011-01-01

    We consider the two main classification methods of D-brane charges via K-theory, in type II superstring theory with vanishing B-field: the Gysin map approach and the one based on the Atiyah-Hirzebruch spectral sequence. Then, we find an explicit link between these two approaches: the Gysin map provides a representative element of the equivalence class obtained via the spectral sequence. We also briefly discuss the case of rational coefficients, characterized by a complete equivalence between the two classification methods.

  15. The practice of classification and the theory of evolution, and what the demise of Charles Darwin's tree of life hypothesis means for both of them

    PubMed Central

    Doolittle, W. Ford

    2009-01-01

    Debates over the status of the tree of life (TOL) often proceed without agreement as to what it is supposed to be: a hierarchical classification scheme, a tracing of genomic and organismal history or a hypothesis about evolutionary processes and the patterns they can generate. I will argue that for Darwin it was a hypothesis, which lateral gene transfer in prokaryotes now shows to be false. I will propose a more general and relaxed evolutionary theory and point out why anti-evolutionists should take no comfort from disproof of the TOL hypothesis. PMID:19571242

  16. Risk assessment for enterprise resource planning (ERP) system implementations: a fault tree analysis approach

    NASA Astrophysics Data System (ADS)

    Zeng, Yajun; Skibniewski, Miroslaw J.

    2013-08-01

    Enterprise resource planning (ERP) system implementations are often characterised by large capital outlay, long implementation duration, and high risk of failure. In order to avoid ERP implementation failure and realise the benefits of the system, sound risk management is the key. This paper proposes a probabilistic risk assessment approach for ERP system implementation projects based on fault tree analysis, which models the relationship between ERP system components and specific risk factors. Unlike traditional risk management approaches that have been mostly focused on meeting project budget and schedule objectives, the proposed approach intends to address the risks that may cause ERP system usage failure. The approach can be used to identify the root causes of ERP system implementation usage failure and quantify the impact of critical component failures or critical risk events in the implementation process.
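
    To show how a fault tree quantifies a top-level failure, the sketch below evaluates AND and OR gates under the assumption that basic events are independent. The event names and probabilities are hypothetical and are not taken from the paper.

```python
def and_gate(probabilities):
    """All inputs must occur (independent events): product of probabilities."""
    result = 1.0
    for p in probabilities:
        result *= p
    return result


def or_gate(probabilities):
    """At least one input occurs (independent events): 1 - product of (1 - p)."""
    none_occur = 1.0
    for p in probabilities:
        none_occur *= 1.0 - p
    return 1.0 - none_occur


# Hypothetical basic events for an ERP usage failure.
p_poor_training = 0.10
p_no_user_support = 0.20
p_data_migration_error = 0.05

# Users fail to adopt the system only if training AND support are both missing.
p_adoption_failure = and_gate([p_poor_training, p_no_user_support])
# The top event occurs if adoption fails OR the data migration goes wrong.
p_top_event = or_gate([p_adoption_failure, p_data_migration_error])
print(p_top_event)
```

    Changing one basic-event probability and re-evaluating the tree gives a simple measure of how much that event contributes to the top event.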

  17. A hybrid approach of stepwise regression, logistic regression, support vector machine, and decision tree for forecasting fraudulent financial statements.

    PubMed

    Chen, Suduan; Goo, Yeong-Jia James; Shen, Zone-De

    2014-01-01

    As fraudulent financial statements by enterprises become an increasingly serious problem, establishing a valid model for forecasting fraudulent financial statements has become an important question for academic research and financial practice. After screening the important variables using stepwise regression, the study applies logistic regression, support vector machine, and decision tree methods to construct classification models and compares them. The study adopts financial and nonfinancial variables to assist in establishing the forecasting model. The sample comprises companies that issued fraudulent and nonfraudulent financial statements between 1998 and 2012. The findings are that financial and nonfinancial information can be used effectively to distinguish fraudulent financial statements, and that the decision tree C5.0 has the best classification performance (85.71%). PMID:25302338

  18. A Hybrid Approach of Stepwise Regression, Logistic Regression, Support Vector Machine, and Decision Tree for Forecasting Fraudulent Financial Statements

    PubMed Central

    Goo, Yeong-Jia James; Shen, Zone-De

    2014-01-01

    As fraudulent financial statements by enterprises become an increasingly serious problem, establishing a valid model for forecasting fraudulent financial statements has become an important question for academic research and financial practice. After screening the important variables using stepwise regression, the study applies logistic regression, support vector machine, and decision tree methods to construct classification models and compares them. The study adopts financial and nonfinancial variables to assist in establishing the forecasting model. The sample comprises companies that issued fraudulent and nonfraudulent financial statements between 1998 and 2012. The findings are that financial and nonfinancial information can be used effectively to distinguish fraudulent financial statements, and that the decision tree C5.0 has the best classification performance (85.71%). PMID:25302338

  19. A game-theoretic tree matching approach for object detection in high-resolution remotely sensed images

    NASA Astrophysics Data System (ADS)

    Liang, Yilong; Cahill, Nathan D.; Saber, Eli; Messinger, David W.

    2015-10-01

    In this paper, we propose a game-theoretic tree matching algorithm for object detection in high resolution (HR) remotely sensed images, where, given a scene image and an object image, the goal is to determine whether or not the object exists in the scene image. To that effect, tree based representations of the images are obtained using a hierarchical scale space approach. The nodes of the tree denote regions in the image and edges represent the relative containment between different regions. Once we have the tree representations of each image, the task of object detection is reformulated as a tree matching problem. We propose a game-theoretic technique to search for the node correspondences between a pair of trees. This method involves defining a non-cooperative matching game, where strategies denote the possible pairs of matching regions and payoffs determine the compatibilities between these strategies. Trees are matched by finding the evolutionary stable states (ESS) of the game. To validate the effectiveness of the proposed algorithm, we perform experiments on both synthetic and HR remotely sensed images. Our results demonstrate the robustness of the tree representation with respect to different spatial variations of the images, as well as the effectiveness of the proposed game-theoretic tree matching algorithm.
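
    The abstract does not say how the evolutionary stable states are computed. One common choice for matching games of this kind is discrete-time replicator dynamics, sketched below under that assumption. The payoff matrix is hypothetical: each strategy stands for one candidate pair of matching regions, and a positive entry means two pairs are compatible.

```python
def replicator_dynamics(payoff, steps=100):
    """Iterate x_i <- x_i * (A x)_i / (x^T A x) from the uniform distribution."""
    n = len(payoff)
    x = [1.0 / n] * n
    for _ in range(steps):
        fitness = [sum(payoff[i][j] * x[j] for j in range(n)) for i in range(n)]
        average = sum(x[i] * fitness[i] for i in range(n))
        if average == 0:
            break  # no compatible strategies remain; keep the current state
        x = [x[i] * fitness[i] / average for i in range(n)]
    return x


# Hypothetical game: strategies 0 and 1 are mutually compatible, strategy 2 is not.
payoff = [
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]
state = replicator_dynamics(payoff)
matched = [i for i, share in enumerate(state) if share > 1e-6]
print(state, matched)
```

    Strategies that keep a non-zero share in the final state are read as the selected region correspondences.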

  20. A High Performance Computing Approach to Tree Cover Delineation in 1-m NAIP Imagery using a Probabilistic Learning Framework

    NASA Astrophysics Data System (ADS)

    Basu, S.; Ganguly, S.; Michaelis, A.; Votava, P.; Roy, A.; Mukhopadhyay, S.; Nemani, R. R.

    2015-12-01

    Tree cover delineation is a useful instrument in deriving Above Ground Biomass (AGB) density estimates from Very High Resolution (VHR) airborne imagery data. Numerous algorithms have been designed to address this problem, but most of them do not scale to these datasets which are of the order of terabytes. In this paper, we present a semi-automated probabilistic framework for the segmentation and classification of 1-m National Agriculture Imagery Program (NAIP) for tree-cover delineation for the whole of Continental United States, using a High Performance Computing Architecture. Classification is performed using a multi-layer Feedforward Backpropagation Neural Network and segmentation is performed using a Statistical Region Merging algorithm. The results from the classification and segmentation algorithms are then consolidated into a structured prediction framework using a discriminative undirected probabilistic graphical model based on Conditional Random Field, which helps in capturing the higher order contextual dependencies between neighboring pixels. Once the final probability maps are generated, the framework is updated and re-trained by relabeling misclassified image patches. This leads to a significant improvement in the true positive rates and reduction in false positive rates. The tree cover maps were generated for the whole state of California, spanning a total of 11,095 NAIP tiles covering a total geographical area of 163,696 sq. miles. The framework produced true positive rates of around 88% for fragmented forests and 74% for urban tree cover areas, with false positive rates lower than 2% for both landscapes. Comparative studies with the National Land Cover Data (NLCD) algorithm and the LiDAR canopy height model (CHM) showed the effectiveness of our framework for generating accurate high-resolution tree-cover maps.
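
    For reference, the true positive and false positive rates quoted above can be computed from binary tree/non-tree labels as in this short sketch. The label vectors are hypothetical, not NAIP data.

```python
def true_and_false_positive_rates(truth, predicted):
    """Return (TPR, FPR) for binary labels, where 1 means tree cover."""
    tp = sum(1 for t, p in zip(truth, predicted) if t and p)
    fn = sum(1 for t, p in zip(truth, predicted) if t and not p)
    fp = sum(1 for t, p in zip(truth, predicted) if not t and p)
    tn = sum(1 for t, p in zip(truth, predicted) if not t and not p)
    return tp / (tp + fn), fp / (fp + tn)


# Hypothetical reference labels and classifier output for eight image patches.
truth = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 1, 0, 0, 0, 1, 0, 1]
tpr, fpr = true_and_false_positive_rates(truth, predicted)
print(tpr, fpr)
```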

  1. A High Performance Computing Approach to Tree Cover Delineation in 1-m NAIP Imagery Using a Probabilistic Learning Framework

    NASA Technical Reports Server (NTRS)

    Basu, Saikat; Ganguly, Sangram; Michaelis, Andrew; Votava, Petr; Roy, Anshuman; Mukhopadhyay, Supratik; Nemani, Ramakrishna

    2015-01-01

    Tree cover delineation is a useful instrument in deriving Above Ground Biomass (AGB) density estimates from Very High Resolution (VHR) airborne imagery data. Numerous algorithms have been designed to address this problem, but most of them do not scale to these datasets, which are of the order of terabytes. In this paper, we present a semi-automated probabilistic framework for the segmentation and classification of 1-m National Agriculture Imagery Program (NAIP) for tree-cover delineation for the whole of Continental United States, using a High Performance Computing Architecture. Classification is performed using a multi-layer Feedforward Backpropagation Neural Network and segmentation is performed using a Statistical Region Merging algorithm. The results from the classification and segmentation algorithms are then consolidated into a structured prediction framework using a discriminative undirected probabilistic graphical model based on Conditional Random Field, which helps in capturing the higher order contextual dependencies between neighboring pixels. Once the final probability maps are generated, the framework is updated and re-trained by relabeling misclassified image patches. This leads to a significant improvement in the true positive rates and reduction in false positive rates. The tree cover maps were generated for the whole state of California, spanning a total of 11,095 NAIP tiles covering a total geographical area of 163,696 sq. miles. The framework produced true positive rates of around 88% for fragmented forests and 74% for urban tree cover areas, with false positive rates lower than 2% for both landscapes. Comparative studies with the National Land Cover Data (NLCD) algorithm and the LiDAR canopy height model (CHM) showed the effectiveness of our framework for generating accurate high-resolution tree-cover maps.

  2. Optimization of a Non-traditional Unsupervised Classification Approach for Land Cover Analysis

    NASA Technical Reports Server (NTRS)

    Boyd, R. K.; Brumfield, J. O.; Campbell, W. J.

    1982-01-01

    The conditions under which a hybrid of clustering and canonical analysis for image classification produces optimum results were analyzed. The approach involves generation of classes by clustering for input to canonical analysis. The importance of the number of clusters input and the effect of other parameters of the clustering algorithm (ISOCLS) were examined. The approach derives its final result by clustering the canonically transformed data. Therefore the importance of the number of clusters requested in this final stage was also examined. The effects of these variables were studied in terms of the average separability (as measured by transformed divergence) of the final clusters, the transformation matrices resulting from different numbers of input classes, and the accuracy of the final classifications. The research was performed with LANDSAT MSS data over the Hazleton/Berwick Pennsylvania area. Final classifications were compared pixel by pixel with an existing geographic information system to provide an indication of their accuracy.

  3. Tree Scanning

    PubMed Central

    Templeton, Alan R.; Maxwell, Taylor; Posada, David; Stengård, Jari H.; Boerwinkle, Eric; Sing, Charles F.

    2005-01-01

    We use evolutionary trees of haplotypes to study phenotypic associations by exhaustively examining all possible biallelic partitions of the tree, a technique we call tree scanning. If the first scan detects significant associations, additional rounds of tree scanning are used to partition the tree into three or more allelic classes. Two worked examples are presented. The first is a reanalysis of associations between haplotypes at the Alcohol Dehydrogenase locus in Drosophila melanogaster that was previously analyzed using a nested clade analysis, a more complicated technique for using haplotype trees to detect phenotypic associations. Tree scanning and the nested clade analysis yield the same inferences when permutation testing is used with both approaches. The second example is an analysis of associations between variation in various lipid traits and genetic variation at the Apolipoprotein E (APOE) gene in three human populations. Tree scanning successfully identified phenotypic associations expected from previous analyses. Tree scanning for the most part detected more associations and provided a better biological interpretative framework than single SNP analyses. We also show how prior information can be incorporated into the tree scan by starting with the traditional three electrophoretic alleles at APOE. Tree scanning detected genetically determined phenotypic heterogeneity within all three electrophoretic allelic classes. Overall, tree scanning is a simple, powerful, and flexible method for using haplotype trees to detect phenotype/genotype associations at candidate loci. PMID:15371364
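
    The first scan can be pictured as cutting each branch of the haplotype tree in turn, which splits the haplotypes into two allelic classes, and scoring each split. The sketch below uses the absolute difference of class means as a simplified stand-in for the test statistic and omits permutation testing. The tree and phenotype values are hypothetical.

```python
def biallelic_partitions(edges):
    """Yield the two node sets obtained by removing each edge of a tree."""
    nodes = sorted({node for edge in edges for node in edge})
    partitions = []
    for cut in edges:
        adjacency = {node: set() for node in nodes}
        for a, b in edges:
            if (a, b) != cut:
                adjacency[a].add(b)
                adjacency[b].add(a)
        # Collect everything still reachable from one end of the cut edge.
        seen = {cut[0]}
        stack = [cut[0]]
        while stack:
            node = stack.pop()
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        partitions.append((frozenset(seen), frozenset(nodes) - seen))
    return partitions


def mean(values):
    return sum(values) / len(values)


def best_partition(edges, phenotype):
    """Partition with the largest absolute difference in mean phenotype."""
    def score(partition):
        left, right = partition
        return abs(mean([phenotype[h] for h in left]) - mean([phenotype[h] for h in right]))
    return max(biallelic_partitions(edges), key=score)


# Hypothetical haplotype tree (H2 is the interior node) and mean phenotypes.
edges = [("H1", "H2"), ("H2", "H3"), ("H2", "H4")]
phenotype = {"H1": 1.0, "H2": 1.1, "H3": 3.0, "H4": 1.2}
partitions = biallelic_partitions(edges)
best = best_partition(edges, phenotype)
print(best)
```

    A tree with n branches yields n candidate partitions, which is why an exhaustive scan stays cheap for typical haplotype trees.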

  4. A Three-Dimensional Approach to Modeling Root Water Uptake by Multiple Trees

    NASA Astrophysics Data System (ADS)

    Manoli, G.; Bonetti, S.; Domec, J.; Putti, M.; Katul, G. G.; Marani, M.

    2013-12-01

    Competition for water among rooting zones of multiple trees is a ubiquitous feature of canopy-scale transpiration, yet rarely incorporated in watershed scale models. To accommodate the three-dimensional rooting overlap in space, a three-dimensional approach to modeling soil moisture and root-water uptake is developed in which the overlap of the rooting system is allowed. The model is based on the 3-D Richards equation and uses an Ohm's law type model to account for root, xylem and stomatal conductances needed to describe root water uptake (RWU). The hydraulic model is then linked to the atmosphere via a stomatal conductance, where the stomatal aperture is regulated so as to maximize carbon gain for a given water loss. Because of this tight coupling between Fickian mass transfer of CO2 and H2O through the stomatal pores, plant hydraulics and biochemical demand for CO2 via photosynthesis are simultaneously considered in the estimation of RWU. The model is then evaluated with field data and used to investigate tree-to-tree interactions in a well-drained loblolly pine plantation. The importance of three-dimensional modeling to upscale plant-water relations at large scales is discussed.
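
    An Ohm's law type model treats water flow like electric current: flux equals an effective conductance times a potential difference. The sketch below assumes the conductances act in series along the flow path, which the abstract does not state explicitly. The values and units are hypothetical.

```python
def series_conductance(conductances):
    """Effective conductance of elements in series: 1 / sum(1 / g_i)."""
    return 1.0 / sum(1.0 / g for g in conductances)


def water_flux(conductances, psi_upstream, psi_downstream):
    """Ohm's law analogue: flux = g_effective * (psi_upstream - psi_downstream)."""
    return series_conductance(conductances) * (psi_upstream - psi_downstream)


# Hypothetical conductances along the pathway and water potentials at its two ends.
g_effective = series_conductance([2.0, 4.0, 4.0])
flux = water_flux([2.0, 4.0, 4.0], psi_upstream=-0.1, psi_downstream=-1.5)
print(g_effective, flux)
```

    In a series pathway the smallest conductance dominates, so the effective value can never exceed the smallest element.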

  5. UAV based tree height estimation in apple orchards: potential of multiple approaches

    NASA Astrophysics Data System (ADS)

    Mejia-Aguilar, Abraham; Tomelleri, Enrico; Vilardi, Andrea; Zebisch, Marc

    2015-04-01

    Canopy height, as part of vegetation structure, is important for ecological studies on biomass, matter flows or meteorology. Measuring the growth of canopy can be undertaken by the use of multiple remote sensing techniques. In this study, we firstly use data generated from an Unmanned Aerial Vehicle (UAV) with simultaneous consumer-grade RGB and modified IR cameras, configured in nadir and multi-angle views to generate 3D models for Digital Surface Model (DSM) and Digital Terrain Models (DTM) in order to estimate tree height in apple orchards in South Tyrol, Italy. We evaluate the use of Ground Control Points (GCP) to minimize the error in scale and orientation. Then, we validate and compare the results of our primary data collection with data generated by geolocated field measurements over several selected tree species. Additionally, we compare DSM and DTM obtained from a recent 1-meter resolution LIDAR campaign (Light Detection and Ranging). The main purpose of this study is to contrast multiple estimation approaches and evaluate their utility for the estimation of canopy height, highlighting the use of UAV systems as a fast, reliable and inexpensive technique especially for small scale applications. The study is conducted in a homogeneous tree canopy consisting of apple orchards located in Caldaro, South Tyrol, Italy. We end by proposing a potential low-cost application combining models for DSM from the UAV with DTM obtained from LIDAR for applications that should be updated frequently.
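
    Tree height from a DSM and a DTM is usually obtained as their cell-by-cell difference, often called a canopy height model. The sketch below assumes both grids are already co-registered at the same resolution. The elevations are hypothetical.

```python
def canopy_height_model(dsm, dtm):
    """Cell-wise DSM minus DTM, clipped at zero (heights in the grid's units)."""
    return [
        [max(0.0, surface - terrain) for surface, terrain in zip(dsm_row, dtm_row)]
        for dsm_row, dtm_row in zip(dsm, dtm)
    ]


# Hypothetical 2 x 2 elevation grids in metres.
dsm = [[203.5, 200.1], [204.0, 200.0]]  # surface, including tree crowns
dtm = [[200.0, 200.0], [200.5, 200.2]]  # bare ground
chm = canopy_height_model(dsm, dtm)
tree_height = max(max(row) for row in chm)
print(tree_height)
```

    Combining a UAV-derived DSM with a LIDAR-derived DTM, as the abstract proposes, amounts to taking the two inputs from different sensors.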

  6. Head Pose Estimation on Eyeglasses Using Line Detection and Classification Approach

    NASA Astrophysics Data System (ADS)

    Setthawong, Pisal; Vannija, Vajirasak

    This paper proposes a unique approach for head pose estimation of subjects with eyeglasses by using a combination of line detection and classification approaches. Head pose estimation is considered an important non-verbal form of communication and could also be used in the area of Human-Computer Interface. A major improvement of the proposed approach is that it allows estimation of head poses at a high yaw/pitch angle when compared with existing geometric approaches, does not require expensive data preparation and training, and is generally fast when compared with other approaches.

  7. Updating the US Hydrologic Classification: An Approach to Clustering and Stratifying Ecohydrologic Data

    SciTech Connect

    McManamay, Ryan A; Bevelhimer, Mark S; Kao, Shih-Chieh

    2013-01-01

    Hydrologic classifications unveil the structure of relationships among groups of streams with differing stream flow and provide a foundation for drawing inferences about the principles that govern those relationships. Hydrologic classes provide a template to describe ecological patterns, generalize hydrologic responses to disturbance, and stratify research and management needs applicable to ecohydrology. We developed two updated hydrologic classifications for the continental US using two streamflow datasets of varying reference standards. Using only reference-quality gages, we classified 1715 stream gages into 12 classes across the US. By including more streamflow gages (n=2618) in a separate classification, we increased the dimensionality (i.e. classes) and hydrologic distinctiveness within regions at the expense of decreasing the natural flow standards (i.e. reference quality). Greater numbers of classes and higher regional affiliation within our hydrologic classifications compared to that of the previous US hydrologic classification (Poff, 1996) suggested that the level of hydrologic variation and resolution was not completely represented in smaller sample sizes. Part of the utility of classification systems rests in their ability to classify new objects and stratify analyses. We constructed separate random forests to predict hydrologic class membership based on hydrologic indices or landscape variables. In addition, we provide an approach to assessing potential outliers due to hydrologic alteration based on class assignment. Departures from class membership due to disturbance take into account multiple hydrologic indices simultaneously; thus, classes can be used to determine if disturbed streams are functioning within the realm of natural hydrology.

  8. A novel information transferring approach for the classification of remote sensing images

    NASA Astrophysics Data System (ADS)

    Gao, Jianqiang; Xu, Lizhong; Shen, Jie; Huang, Fengchen; Xu, Feng

    2015-12-01

    Traditional remote sensing image classification methods focused on using a large amount of labeled target data to train an efficient classification model. However, these approaches were generally based on the target data without considering a host of auxiliary data or the additional information of auxiliary data. If the valuable information from auxiliary data could be successfully transferred to the target data, the performance of the classification model would be improved. In addition, from the perspective of practical application, this valuable information from auxiliary data should be fully used. Therefore, in this paper, based on the transfer learning idea, we propose a novel information transferring approach to improve remote sensing image classification performance. The main rationale of this approach is that first, the information of the same areas associated with each pixel is modeled as the intra-class set, and the information of different areas associated with each pixel is modeled as the inter-class set, and then the obtained texture feature information of each area from the auxiliary data is transferred to the target data set such that the inter-class set is separated and the intra-class set is gathered as far as possible. Experiments show that the proposed approach is effective and feasible.

  9. A new approach to plane-sweep overlay: topological structuring and line-segment classification

    USGS Publications Warehouse

    van Roessel, Jan W.

    1991-01-01

    An integrated approach to spatial overlay was developed with the objective of creating a single function that can perform most of the tasks now assigned to discrete functions in current systems. Two important components of this system are a unique method for topological structuring, and a method for attribute propagation and line-segment classification. -Author

  10. Decision-tree-model identification of nitrate pollution activities in groundwater: A combination of a dual isotope approach and chemical ions

    NASA Astrophysics Data System (ADS)

    Xue, Dongmei; Pang, Fengmei; Meng, Fanqiao; Wang, Zhongliang; Wu, Wenliang

    2015-09-01

    To develop management practices for agricultural crops to protect against NO3- contamination in groundwater, dominant pollution activities require reliable classification. In this study, we (1) classified potential NO3- pollution activities via an unsupervised learning algorithm based on δ15N- and δ18O-NO3- and physico-chemical properties of groundwater at 55 sampling locations; and (2) determined which water quality parameters could be used to identify the sources of NO3- contamination via a decision tree model. When a combination of δ15N-, δ18O-NO3- and physico-chemical properties of groundwater was used as an input for the k-means clustering algorithm, it allowed for a reliable clustering of the 55 sampling locations into 4 corresponding agricultural activities: well irrigated agriculture (28 sampling locations), sewage irrigated agriculture (16 sampling locations), a combination of sewage irrigated agriculture, farm and industry (5 sampling locations) and a combination of well irrigated agriculture and farm (6 sampling locations). A decision tree model with 97.5% classification success was developed based on SO42- and Cl- variables. The NO3- and the δ15N- and δ18O-NO3- variables demonstrated limitation in developing a decision tree model as multiple N sources and fractionation processes both resulted in difficulties of discriminating NO3- concentrations and isotopic values. Although only the SO42- and Cl- were selected as important discriminating variables, concentration data alone could not identify the specific NO3- sources responsible for groundwater contamination. This is a result of comprehensive analysis. To further reduce NO3- contamination, an integrated approach should be set-up by combining N and O isotopes of NO3- with land-uses and physico-chemical properties, especially in areas with complex agricultural activities.
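
    The two-stage idea (cluster first, then find which variables separate the clusters) can be sketched with a small k-means routine followed by a single-split decision stump. This is a simplified illustration with two clusters and two features. The sulfate and chloride concentrations, the starting centroids and the function names are all hypothetical.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, centroids, iterations=10):
    """Plain k-means from given starting centroids; returns (labels, centroids)."""
    centroids = [tuple(c) for c in centroids]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid for every point.
        labels = [
            min(range(len(centroids)), key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                centroids[k] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


def best_threshold(values, labels):
    """Single split 'value > threshold' that best separates two classes (0 and 1)."""
    ordered = sorted(set(values))
    best = None
    for low, high in zip(ordered, ordered[1:]):
        threshold = (low + high) / 2
        errors = sum((v > threshold) != bool(label) for v, label in zip(values, labels))
        errors = min(errors, len(values) - errors)  # allow either side to be class 1
        if best is None or errors < best[1]:
            best = (threshold, errors)
    return best


# Hypothetical (sulfate, chloride) concentrations in mg/L at six wells.
samples = [(20, 15), (25, 18), (22, 14), (90, 70), (95, 80), (88, 75)]
labels, centroids = kmeans(samples, centroids=[(20, 15), (90, 70)])
split = best_threshold([s[0] for s in samples], labels)
print(labels, split)
```

    A full decision tree repeats the threshold search recursively on each side of the split and across all candidate variables.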

  11. Decision-tree-model identification of nitrate pollution activities in groundwater: A combination of a dual isotope approach and chemical ions.

    PubMed

    Xue, Dongmei; Pang, Fengmei; Meng, Fanqiao; Wang, Zhongliang; Wu, Wenliang

    2015-09-01

    To develop management practices for agricultural crops to protect against NO3(-) contamination in groundwater, dominant pollution activities require reliable classification. In this study, we (1) classified potential NO3(-) pollution activities via an unsupervised learning algorithm based on δ(15)N- and δ(18)O-NO3(-) and physico-chemical properties of groundwater at 55 sampling locations; and (2) determined which water quality parameters could be used to identify the sources of NO3(-) contamination via a decision tree model. When a combination of δ(15)N-, δ(18)O-NO3(-) and physico-chemical properties of groundwater was used as an input for the k-means clustering algorithm, it allowed for a reliable clustering of the 55 sampling locations into 4 corresponding agricultural activities: well irrigated agriculture (28 sampling locations), sewage irrigated agriculture (16 sampling locations), a combination of sewage irrigated agriculture, farm and industry (5 sampling locations) and a combination of well irrigated agriculture and farm (6 sampling locations). A decision tree model with 97.5% classification success was developed based on SO4(2-) and Cl(-) variables. The NO3(-) and the δ(15)N- and δ(18)O-NO3(-) variables demonstrated limitation in developing a decision tree model as multiple N sources and fractionation processes both resulted in difficulties of discriminating NO3(-) concentrations and isotopic values. Although only the SO4(2-) and Cl(-) were selected as important discriminating variables, concentration data alone could not identify the specific NO3(-) sources responsible for groundwater contamination. This is a result of comprehensive analysis. To further reduce NO3(-) contamination, an integrated approach should be set-up by combining N and O isotopes of NO3(-) with land-uses and physico-chemical properties, especially in areas with complex agricultural activities. PMID:26231989

  12. A sampling and classification item selection approach with content balancing.

    PubMed

    Chen, Pei-Hua

    2015-03-01

    Existing automated test assembly methods typically employ constrained combinatorial optimization. Constructing forms sequentially based on an optimization approach usually results in unparallel forms and requires heuristic modifications. Methods based on a random search approach have the major advantage of producing parallel forms sequentially without further adjustment. This study incorporated a flexible content-balancing element into the statistical perspective item selection method of the cell-only method (Chen et al. in Educational and Psychological Measurement, 72(6), 933-953, 2012). The new method was compared with a sequential interitem distance weighted deviation model (IID WDM) (Swanson & Stocking in Applied Psychological Measurement, 17(2), 151-166, 1993), a simultaneous IID WDM, and a big-shadow-test mixed integer programming (BST MIP) method to construct multiple parallel forms based on matching a reference form item-by-item. The results showed that the cell-only method with content balancing and the sequential and simultaneous versions of IID WDM yielded results comparable to those obtained using the BST MIP method. The cell-only method with content balancing is computationally less intensive than the sequential and simultaneous versions of IID WDM. PMID:24610145

  13. Gaussian Kernel Based Classification Approach for Wheat Identification

    NASA Astrophysics Data System (ADS)

    Aggarwal, R.; Kumar, A.; Raju, P. L. N.; Krishna Murthy, Y. V. N.

    2014-11-01

    Agriculture plays a pivotal role in India, which is basically an agrarian economy. Crop type identification is a key issue for monitoring agriculture and is the basis for crop acreage and yield estimation. However, it is very challenging to identify a specific crop using single-date imagery. Hence, a multi-temporal analysis approach is important for specific crop identification. This research work deals with the implementation of a fuzzy classifier, Possibilistic c-Means (PCM), with and without a kernel-based approach, using temporal data of Landsat 8 OLI (Operational Land Imager) for identification of wheat in Radaur City, Haryana. The multi-temporal dataset covers the complete phenological cycle of wheat crop growth, from seedling to ripening. The experimental results show that including a Gaussian kernel with the Euclidean norm (ED Norm) in Possibilistic c-Means (KPCM), a soft classifier, made identification of the wheat crop more robust. Also, identification of all the wheat fields depends on appropriate selection of the temporal dates. The best combination of temporal data corresponds to the tillering, stem extension, heading and ripening stages of the wheat crop. Entropy at wheat testing sites was used to validate the classified results. The entropy value at the testing sites was observed to be low, implying low uncertainty about the existence of any other class at wheat test sites and high certainty of the existence of the wheat crop.
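
    The entropy-based validation mentioned in this abstract can be sketched as follows. The abstract does not state the exact entropy formula, so this sketch assumes Shannon entropy over class memberships that are first normalized to sum to one; the function name `membership_entropy` and the membership values are hypothetical.

```python
from math import log2


def membership_entropy(memberships):
    """Shannon entropy (in bits) of a pixel's class memberships.

    Assumption: memberships are non-negative and are normalized here,
    because possibilistic memberships need not sum to one.
    """
    total = sum(memberships)
    probabilities = [m / total for m in memberships if m > 0]
    return -sum(p * log2(p) for p in probabilities)


# Hypothetical membership vectors (wheat, other crop, fallow) for two pixels.
confident_pixel = [0.95, 0.03, 0.02]
mixed_pixel = [0.40, 0.35, 0.25]
print(membership_entropy(confident_pixel), membership_entropy(mixed_pixel))
```

    A pixel dominated by one class yields an entropy close to zero, while a pixel with similar memberships in several classes yields a higher value, which is the sense in which low entropy indicates low classification uncertainty.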

  14. An ensemble classification approach for improved Land use/cover change detection

    NASA Astrophysics Data System (ADS)

    Chellasamy, M.; Ferré, T. P. A.; Humlekrog Greve, M.; Larsen, R.; Chinnasamy, U.

    2014-11-01

    Change Detection (CD) methods based on post-classification comparison are claimed to provide potentially reliable results. They are considered to be the most obvious quantitative method for analyzing Land Use Land Cover (LULC) changes, and they provide from-to change information. However, the performance of post-classification comparison approaches depends heavily on the classification accuracy of the individual images used for comparison. Hence, we present a classification approach that produces accurate classified results, which helps to obtain improved change detection results. Machine learning is part of a broader framework in change detection, where neural networks have drawn much attention. Neural network algorithms adaptively estimate continuous functions from input data without a mathematical representation of the dependence of output on input. A common practice for classification is to use a Multi-Layer Perceptron (MLP) neural network with the backpropagation learning algorithm for prediction. To increase the ability of learning and prediction, multiple inputs (spectral, texture, topography, and multi-temporal information) are generally stacked to incorporate diversity of information. On the other hand, the literature claims that the backpropagation algorithm exhibits weak and unstable learning when multiple inputs are used with complex datasets characterized by mixed uncertainty levels. To address the problem of learning complex information, we propose an ensemble classification technique that incorporates multiple inputs for classification, unlike the traditional stacking of multiple input data. In this paper, we present an Endorsement Theory based ensemble classification that integrates multiple sources of information, in terms of prediction probabilities, to produce the final classification results. Three different input datasets are used in this study: spectral, texture and indices, from SPOT-4 multispectral imagery captured in 1998 and 2003. Each SPOT image is classified

  15. Application Of Decision Tree Approach To Student Selection Model- A Case Study

    NASA Astrophysics Data System (ADS)

    Harwati; Sudiya, Amby

    2016-01-01

    The main purpose of the institution is to provide quality education to its students and to improve the quality of managerial decisions. One way to improve the quality of students is to make the selection of new students more selective. This research takes as its case the selection of new students at the Islamic University of Indonesia, Yogyakarta, Indonesia. One of the university's selection routes is an administrative screening based on the high school records of prospective students, without a written test. Currently, this kind of selection does not yet have a standard model and criteria. Selection is done only by comparing candidates' application files, so subjective assessment is very likely because there are no standard criteria that can differentiate the quality of one student from another. By applying data mining classification techniques, a selection model for new students can be built that includes standard criteria such as area of origin, school status, average value and so on. These criteria are determined using rules derived from classifying the academic achievement (GPA) of students from previous years who entered the university through the same route. The decision tree method with the C4.5 algorithm is used here. The results show that the students given priority for admission are those who meet the following criteria: they come from the island of Java, attended a public school, majored in science, have an average value above 75, and have at least one achievement during their high school studies.
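
    C4.5 chooses the attribute to split on by gain ratio, that is, information gain divided by the split information of the attribute. The following minimal sketch shows that criterion on a toy table. The records, attribute names and the function names `entropy` and `gain_ratio` are hypothetical and are not taken from the study's data.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (bits) of a list of categorical values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def gain_ratio(rows, attribute, target):
    """Gain ratio of splitting `rows` on a categorical `attribute`."""
    n = len(rows)
    base = entropy([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[attribute], []).append(r[target])
    # Expected entropy of the target after the split.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    split_info = entropy([r[attribute] for r in rows])
    if split_info == 0:
        return 0.0  # attribute has a single value, so it cannot split the data
    return (base - conditional) / split_info


# Invented applicant records; "gpa" is the class label to predict.
rows = [
    {"school": "public", "major": "science", "gpa": "high"},
    {"school": "public", "major": "science", "gpa": "high"},
    {"school": "private", "major": "science", "gpa": "high"},
    {"school": "public", "major": "social", "gpa": "low"},
    {"school": "private", "major": "social", "gpa": "low"},
    {"school": "private", "major": "social", "gpa": "low"},
]
for attribute in ("school", "major"):
    print(attribute, gain_ratio(rows, attribute, "gpa"))
```

    In this toy table "major" separates the classes perfectly and would be chosen as the first split. A full C4.5 implementation also handles numeric thresholds, missing values and pruning, which this sketch omits.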

  16. A unified approach to the classification of visual data systems.

    NASA Technical Reports Server (NTRS)

    Black, J.

    1971-01-01

    Description of an approach to attaining a unified means of characterizing film, TV, and optical data systems. The concept is based on the premise that all of these imaging systems can be described by an equation similar to the ideal imaging system described by Rose (1948). This technique permits the direct comparison of film and TV performance without converting speed, film resolution lines, TV resolution lines, highlight output current and video bandwidth into compatible units, only to find some essential element of the conversion has been omitted from the particular specification in use. Most important, it permits system performance criteria to be based on input and output criteria without extensive manipulation of the elements between input and output.

  17. An ensemble approach for phenotype classification based on fuzzy partitioning of gene expression data.

    PubMed

    Dragomir, A; Maraziotis, I; Bezerianos, A

    2006-01-01

    We focus on developing a pattern recognition method suitable for performing supervised analysis tasks on molecular data resulting from microarray experiments. Molecular characterization of tissue samples using microarray gene expression profiling is expected to uncover fundamental aspects related to cancer diagnosis and drug discovery. There is therefore a need for reliable, accurate classification methods. With this study we propose a framework for constructing an ensemble of individually trained SVM classifiers, each of them specialized on subsets of the input space. The fuzzy approach used for partitioning the data produces overlapping subsets of the input space that facilitate subsequent classification tasks. PMID:17946338

  18. Ecophysiological responses of trees to long-term N deposition: a multi isotopes approach

    NASA Astrophysics Data System (ADS)

    Battipaglia, G.; Lubritto, C.; Altieri, S.; Marzaioli, F.; Cherubini, P.; Cotrufo, M. F.

    2009-04-01

    Anthropogenic emissions of nitrogen compounds, principally derived from the burning of fossil fuels, have led to regional changes in atmospheric and precipitation chemistry. The fate and environmental consequences of these changes for ecosystem functions and forest growth have attracted considerable research. δ15N measurements have been used successfully for detecting changes in N deposition and the incorporation of atmospheric N into leaves (Siegwolf et al., 2001) and tree rings (Poulson et al., 1995; Saurer et al., 2004; Guerrieri et al., 2009). We show the main results arising from a study of mature Pinus pinea individuals exposed to large amounts of traffic exhaust for 20 years. Specifically, we examined the time-related trend in the growth residuals through dendrochronological analysis and C and N isotopes. A consistent decrease in ring width starting from 1980, with a slight increase in δ13C values, was found as a consequence of an environmental stress event. Moreover, the effect of fossil-source 14C dilution on the bomb-enriched atmospheric background has been detected in tree rings over the last decades, as a consequence of the increased uptake of traffic exhaust. The great variability in δ15N values of tree rings over time underlines the difficulties we encountered in using N as an environmental tool and opens new questions and research avenues. Guerrieri M.R., Siegwolf R.T.W., Saurer M., Jäggi M., Cherubini., Ripullone F., Borghetti M. (2009). "Impact of different nitrogen emission sources on tree physiology as assessed by a triple stable isotope approach". Atmospheric Environment 43: 410-418. Pearson J., Wellis D.M., Seller K.J., Bennet A., Soares A., Woodall J., Ingroulle M.J. (2000). Traffic exposure increases natural 15N and heavy metal concentrations in mosses. New Phytologist 147: 317-326. Siegwolf R.T.W., Matyssek R., Saurer M., Maurer S., Günthardt-Georg M.S., Schmutz P. and Bucher J.B. "Stable isotope analysis reveals differential effects of

  19. What determines tree mortality in dry environments? A multi-perspective approach.

    PubMed

    Dorman, Michael; Svoray, Tal; Perevolotsky, Avi; Moshe, Yitzhak; Sarris, Dimitrios

    2015-06-01

    dendrochronological and remotely sensed performance indicators, in contrast to potential bias when using a single approach. For example, dendrochronological data suggested highly resilient tree growth, since it was based only on the "surviving" portion of the population, thus failing to identify past demographic changes evident through remote sensing. We therefore suggest that evaluation of forest resilience should be based on several metrics, each suited for detecting transitions at a different level of organization. PMID:26465042

  20. A latent discriminative model-based approach for classification of imaginary motor tasks from EEG data.

    PubMed

    Saa, Jaime F Delgado; Çetin, Müjdat

    2012-04-01

    We consider the problem of classification of imaginary motor tasks from electroencephalography (EEG) data for brain-computer interfaces (BCIs) and propose a new approach based on hidden conditional random fields (HCRFs). HCRFs are discriminative graphical models that are attractive for this problem because they (1) exploit the temporal structure of EEG; (2) include latent variables that can be used to model different brain states in the signal; and (3) involve learned statistical models matched to the classification task, avoiding some of the limitations of generative models. Our approach involves spatial filtering of the EEG signals and estimation of power spectra based on autoregressive modeling of temporal segments of the EEG signals. Given this time-frequency representation, we select certain frequency bands that are known to be associated with execution of motor tasks. These selected features constitute the data that are fed to the HCRF, parameters of which are learned from training data. Inference algorithms on the HCRFs are used for the classification of motor tasks. We experimentally compare this approach to the best performing methods in BCI competition IV as well as a number of more recent methods and observe that our proposed method yields better classification accuracy. PMID:22414728

  1. Stygoregions – a promising approach to a bioregional classification of groundwater systems

    PubMed Central

    Stein, Heide; Griebler, Christian; Berkhoff, Sven; Matzke, Dirk; Fuchs, Andreas; Hahn, Hans Jürgen

    2012-01-01

    Linked to diverse biological processes, groundwater ecosystems deliver essential services to mankind, the most important of which is the provision of drinking water. In contrast to surface waters, ecological aspects of groundwater systems are ignored by the current European Union and national legislation. Groundwater management and protection measures refer exclusively to its good physicochemical and quantitative status. Current initiatives in developing ecologically sound integrative assessment schemes by taking groundwater fauna into account depend on the initial classification of subsurface bioregions. In a large scale survey, the regional and biogeographical distribution patterns of groundwater dwelling invertebrates were examined for many parts of Germany. Following an exploratory approach, our results underline that the distribution patterns of invertebrates in groundwater are not in accordance with any existing bioregional classification system established for surface habitats. In consequence, we propose to develop a new classification scheme for groundwater ecosystems based on stygoregions. PMID:22993698

  2. An approach for classification of hydrogeological systems at the regional scale based on groundwater hydrographs

    NASA Astrophysics Data System (ADS)

    Haaf, Ezra; Barthel, Roland

    2016-04-01

    When assessing hydrogeological conditions at the regional scale, the analyst is often confronted with uncertainty of structures, inputs and processes while having to base inference on scarce and patchy data. Haaf and Barthel (2015) proposed a concept for handling this predicament by developing a groundwater systems classification framework, where information is transferred from similar, but well-explored and better understood to poorly described systems. The concept is based on the central hypothesis that similar systems react similarly to the same inputs and vice versa. It is conceptually related to PUB (Prediction in ungauged basins) where organization of systems and processes by quantitative methods is intended and used to improve understanding and prediction. Furthermore, using the framework it is expected that regional conceptual and numerical models can be checked or enriched by ensemble generated data from neighborhood-based estimators. In a first step, groundwater hydrographs from a large dataset in Southern Germany are compared in an effort to identify structural similarity in groundwater dynamics. A number of approaches to group hydrographs, mostly based on a similarity measure - which have previously only been used in local-scale studies, can be found in the literature. These are tested alongside different global feature extraction techniques. The resulting classifications are then compared to a visual "expert assessment"-based classification which serves as a reference. A ranking of the classification methods is carried out and differences shown. Selected groups from the classifications are related to geological descriptors. Here we present the most promising results from a comparison of classifications based on series correlation, different series distances and series features, such as the coefficients of the discrete Fourier transform and the intrinsic mode functions of empirical mode decomposition. Additionally, we show examples of classes
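
    One of the similarity measures named in this abstract, series correlation, can be turned into a distance for grouping hydrographs. The sketch below assumes a Pearson correlation distance (1 minus r) between equally long, equally sampled series; the abstract does not say which correlation variant the authors used, and the function names and groundwater levels are invented.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)  # undefined for constant series


def correlation_distance(x, y):
    """0 for perfectly correlated series, 2 for perfectly anti-correlated ones."""
    return 1.0 - pearson(x, y)


# Hypothetical monthly groundwater levels (metres) at three wells.
well_a = [10.0, 10.4, 10.9, 10.6, 10.1, 9.8]
well_b = [5.1, 5.5, 6.0, 5.7, 5.2, 4.9]   # same dynamics, different level
well_c = [7.9, 7.6, 7.2, 7.5, 7.8, 8.1]   # opposite dynamics
print(correlation_distance(well_a, well_b), correlation_distance(well_a, well_c))
```

    Because correlation ignores offset and scale, wells with the same dynamics at different absolute levels end up close together, which suits a comparison of groundwater dynamics rather than absolute heads.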

  3. Factors Associated with Caregiver Stability in Permanent Placements: A Classification Tree Approach

    ERIC Educational Resources Information Center

    Proctor, Laura J.; Van Dusen Randazzo, Katherine; Litrownik, Alan J.; Newton, Rae R.; Davis, Inger P.; Villodas, Miguel

    2011-01-01

    Objective: Identify individual and environmental variables associated with caregiver stability and instability for children in diverse permanent placement types (i.e., reunification, adoption, and long-term foster care/guardianship with relatives or non-relatives), following 5 or more months in out-of-home care prior to age 4 due to substantiated…

  4. A Comparative Assessment of the Influences of Human Impacts on Soil Cd Concentrations Based on Stepwise Linear Regression, Classification and Regression Tree, and Random Forest Models

    PubMed Central

    Qiu, Lefeng; Wang, Kai; Long, Wenli; Wang, Ke; Hu, Wei; Amable, Gabriel S.

    2016-01-01

    Soil cadmium (Cd) contamination has attracted a great deal of attention because of its detrimental effects on animals and humans. This study aimed to develop and compare the performances of stepwise linear regression (SLR), classification and regression tree (CART) and random forest (RF) models in the prediction and mapping of the spatial distribution of soil Cd and to identify likely sources of Cd accumulation in Fuyang County, eastern China. Soil Cd data from 276 topsoil (0–20 cm) samples were collected and randomly divided into calibration (222 samples) and validation datasets (54 samples). Auxiliary data, including detailed land use information, soil organic matter, soil pH, and topographic data, were incorporated into the models to simulate the soil Cd concentrations and further identify the main factors influencing soil Cd variation. The predictive models for soil Cd concentration exhibited acceptable overall accuracies (72.22% for SLR, 70.37% for CART, and 75.93% for RF). The SLR model exhibited the largest predicted deviation, with a mean error (ME) of 0.074 mg/kg, a mean absolute error (MAE) of 0.160 mg/kg, and a root mean squared error (RMSE) of 0.274 mg/kg, and the RF model produced the results closest to the observed values, with an ME of 0.002 mg/kg, an MAE of 0.132 mg/kg, and an RMSE of 0.198 mg/kg. The RF model also exhibited the greatest R2 value (0.772). The CART model predictions closely followed, with ME, MAE, RMSE, and R2 values of 0.013 mg/kg, 0.154 mg/kg, 0.230 mg/kg and 0.644, respectively. The three prediction maps generally exhibited similar and realistic spatial patterns of soil Cd contamination. The heavily Cd-affected areas were primarily located in the alluvial valley plain of the Fuchun River and its tributaries because of the dramatic industrialization and urbanization processes that have occurred there. The most important variable for explaining high levels of soil Cd accumulation was the presence of metal smelting industries. The
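
    The error statistics quoted in this abstract (ME, MAE, RMSE and R2) can be computed as in the following minimal sketch. Two conventions are assumed here and may differ from the paper: residuals are taken as predicted minus observed, and R2 is the coefficient of determination rather than a squared correlation. The function name and sample values are hypothetical.

```python
from math import sqrt


def regression_errors(observed, predicted):
    """Return ME, MAE, RMSE and R2 for paired observed/predicted values."""
    n = len(observed)
    residuals = [p - o for o, p in zip(observed, predicted)]
    squared_error_sum = sum(r * r for r in residuals)
    mean_observed = sum(observed) / n
    total_sum_of_squares = sum((o - mean_observed) ** 2 for o in observed)
    return {
        "ME": sum(residuals) / n,                  # signed bias
        "MAE": sum(abs(r) for r in residuals) / n,
        "RMSE": sqrt(squared_error_sum / n),
        "R2": 1 - squared_error_sum / total_sum_of_squares,
    }


# Invented soil Cd values in mg/kg, not data from the study.
observed = [0.0, 1.0, 2.0, 3.0]
predicted = [0.5, 1.5, 2.5, 3.5]
print(regression_errors(observed, predicted))
```

    ME can be near zero even when individual errors are large, because positive and negative residuals cancel, which is why it is usually reported together with MAE and RMSE.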

  5. A Comparative Assessment of the Influences of Human Impacts on Soil Cd Concentrations Based on Stepwise Linear Regression, Classification and Regression Tree, and Random Forest Models.

    PubMed

    Qiu, Lefeng; Wang, Kai; Long, Wenli; Wang, Ke; Hu, Wei; Amable, Gabriel S

    2016-01-01

    Soil cadmium (Cd) contamination has attracted a great deal of attention because of its detrimental effects on animals and humans. This study aimed to develop and compare the performances of stepwise linear regression (SLR), classification and regression tree (CART) and random forest (RF) models in the prediction and mapping of the spatial distribution of soil Cd and to identify likely sources of Cd accumulation in Fuyang County, eastern China. Soil Cd data from 276 topsoil (0-20 cm) samples were collected and randomly divided into calibration (222 samples) and validation datasets (54 samples). Auxiliary data, including detailed land use information, soil organic matter, soil pH, and topographic data, were incorporated into the models to simulate the soil Cd concentrations and further identify the main factors influencing soil Cd variation. The predictive models for soil Cd concentration exhibited acceptable overall accuracies (72.22% for SLR, 70.37% for CART, and 75.93% for RF). The SLR model exhibited the largest predicted deviation, with a mean error (ME) of 0.074 mg/kg, a mean absolute error (MAE) of 0.160 mg/kg, and a root mean squared error (RMSE) of 0.274 mg/kg, and the RF model produced the results closest to the observed values, with an ME of 0.002 mg/kg, an MAE of 0.132 mg/kg, and an RMSE of 0.198 mg/kg. The RF model also exhibited the greatest R2 value (0.772). The CART model predictions closely followed, with ME, MAE, RMSE, and R2 values of 0.013 mg/kg, 0.154 mg/kg, 0.230 mg/kg and 0.644, respectively. The three prediction maps generally exhibited similar and realistic spatial patterns of soil Cd contamination. The heavily Cd-affected areas were primarily located in the alluvial valley plain of the Fuchun River and its tributaries because of the dramatic industrialization and urbanization processes that have occurred there. The most important variable for explaining high levels of soil Cd accumulation was the presence of metal smelting industries. The

  6. A simple semi-automatic approach for land cover classification from multispectral remote sensing imagery.

    PubMed

    Jiang, Dong; Huang, Yaohuan; Zhuang, Dafang; Zhu, Yunqiang; Xu, Xinliang; Ren, Hongyan

    2012-01-01

    Land cover data represent a fundamental data source for various types of scientific research. The classification of land cover based on satellite data is a challenging task, and an efficient classification method is needed. In this study, an automatic scheme is proposed for the classification of land use using multispectral remote sensing images based on change detection and a semi-supervised classifier. The satellite image can be automatically classified using only the prior land cover map and existing images; therefore human involvement is reduced to a minimum, ensuring the operability of the method. The method was tested in the Qingpu District of Shanghai, China. Using Environment Satellite 1 (HJ-1) images of 2009 with 30 m spatial resolution, the areas were classified into five main types of land cover based on previous land cover data and spectral features. In validation, the results agreed well with land cover maps, with a Kappa value of 0.79 and statistical area biases in proportion of less than 6%. This study proposed a simple semi-automatic approach for land cover classification using prior maps with satisfactory accuracy, which integrated the accuracy of visual interpretation and the performance of automatic classification methods. The method can conveniently be used for land cover mapping in areas lacking ground reference information or for identifying regions of rapid land cover variation (such as rapid urbanization). PMID:23049886
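
    The Kappa value reported in this abstract is Cohen's kappa, which corrects the overall accuracy of a confusion matrix for agreement expected by chance. A minimal sketch follows; the function name and the two-class matrix are hypothetical and unrelated to the study's five land cover classes.

```python
def overall_accuracy_and_kappa(matrix):
    """Overall accuracy and Cohen's kappa from a square confusion matrix.

    Rows are reference classes and columns are mapped classes (the kappa
    value is the same if the two roles are swapped).
    """
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    row_totals = [sum(row) for row in matrix]
    column_totals = [sum(column) for column in zip(*matrix)]
    # Agreement expected if reference and map labels were independent.
    expected = sum(r * c for r, c in zip(row_totals, column_totals)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Invented validation counts for two classes.
confusion = [
    [45, 5],
    [5, 45],
]
accuracy, kappa = overall_accuracy_and_kappa(confusion)
print(accuracy, kappa)
```

    Kappa is lower than the overall accuracy whenever some agreement would be expected by chance alone, which is why both figures are commonly reported together.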

  7. A Simple Semi-Automatic Approach for Land Cover Classification from Multispectral Remote Sensing Imagery

    PubMed Central

    Jiang, Dong; Huang, Yaohuan; Zhuang, Dafang; Zhu, Yunqiang; Xu, Xinliang; Ren, Hongyan

    2012-01-01

    Land cover data represent a fundamental data source for various types of scientific research. The classification of land cover based on satellite data is a challenging task, and an efficient classification method is needed. In this study, an automatic scheme is proposed for the classification of land use using multispectral remote sensing images based on change detection and a semi-supervised classifier. The satellite image can be automatically classified using only the prior land cover map and existing images; therefore human involvement is reduced to a minimum, ensuring the operability of the method. The method was tested in the Qingpu District of Shanghai, China. Using Environment Satellite 1 (HJ-1) images of 2009 with 30 m spatial resolution, the areas were classified into five main types of land cover based on previous land cover data and spectral features. In validation, the results agreed well with land cover maps, with a Kappa value of 0.79 and statistical area biases in proportion of less than 6%. This study proposed a simple semi-automatic approach for land cover classification using prior maps with satisfactory accuracy, which integrated the accuracy of visual interpretation and the performance of automatic classification methods. The method can conveniently be used for land cover mapping in areas lacking ground reference information or for identifying regions of rapid land cover variation (such as rapid urbanization). PMID:23049886

  8. Ship classification using nonlinear features of radiated sound: an approach based on empirical mode decomposition.

    PubMed

    Bao, Fei; Li, Chen; Wang, Xinlong; Wang, Qingfu; Du, Shuanping

    2010-07-01

    Classification for ship-radiated underwater sound is one of the most important and challenging subjects in underwater acoustical signal processing. An approach to ship classification is proposed in this work based on analysis of ship-radiated acoustical noise in subspaces of intrinsic mode functions attained via the ensemble empirical mode decomposition. It is shown that detection and acquisition of stable and reliable nonlinear features become practically feasible by nonlinear analysis of the time series of individual decomposed components, each of which is simple enough and well represents an oscillatory mode of ship dynamics. Surrogate and nonlinear predictability analysis are conducted to probe and measure the nonlinearity and regularity. The results of both methods, which verify each other, substantiate that ship-radiated noises contain components with deterministic nonlinear features well serving for efficient classification of ships. The approach perhaps opens an alternative avenue in the direction toward object classification and identification. It may also import a new view of signals as complex as ship-radiated sound. PMID:20649216

  9. A probabilistic neural network approach for modeling and classification of bacterial growth/no-growth data.

    PubMed

    Hajmeer, M; Basheer, I

    2002-10-01

    In this paper, we propose to use probabilistic neural networks (PNNs) for classification of bacterial growth/no-growth data and modeling the probability of growth. The PNN approach combines Bayes' theorem of conditional probability with Parzen's method for estimating the probability density functions of the random variables. Unlike other neural network training paradigms, PNNs are characterized by high training speed and by their ability to produce confidence levels for their classification decisions. As a practical application of the proposed approach, PNNs were investigated for their ability to classify the growth/no-growth state of a pathogenic Escherichia coli R31 in response to temperature and water activity. A comparison with the most frequently used traditional statistical method, based on logistic regression, and with a multilayer feedforward artificial neural network (MFANN) trained by error backpropagation was also carried out. The PNN-based models were found to outperform linear and nonlinear logistic regression and MFANN in both classification accuracy and the ease with which PNN-based models are developed. PMID:12133614
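
    The combination of Parzen density estimation and Bayes' theorem that defines a PNN can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' model: it uses an isotropic Gaussian kernel with a single smoothing parameter `sigma`, features scaled to the range 0 to 1, and invented training points; the function name `pnn_posteriors` is hypothetical.

```python
from math import exp


def pnn_posteriors(x, training, sigma=0.2, priors=None):
    """Class posteriors from Parzen-window densities combined by Bayes' theorem.

    `training` maps each class label to a list of feature tuples. The Gaussian
    normalizing constant is omitted because it is identical for all classes
    (same sigma and dimension) and cancels in the posterior.
    """
    densities = {}
    for label, samples in training.items():
        kernel_sum = sum(
            exp(-sum((a - b) ** 2 for a, b in zip(x, s)) / (2 * sigma ** 2))
            for s in samples
        )
        densities[label] = kernel_sum / len(samples)
    if priors is None:
        priors = {label: 1 / len(training) for label in training}  # equal priors
    weighted = {label: priors[label] * densities[label] for label in training}
    total = sum(weighted.values())  # may underflow to 0 far from all samples
    return {label: w / total for label, w in weighted.items()}


# Invented (scaled temperature, water activity) observations.
training = {
    "growth": [(0.8, 0.9), (0.7, 0.95)],
    "no-growth": [(0.2, 0.3), (0.1, 0.4)],
}
posteriors = pnn_posteriors((0.75, 0.9), training)
print(posteriors)
```

    There is no iterative training: the "network" simply stores the training samples, which is consistent with the high training speed noted in the abstract, and the posterior doubles as a confidence level for the decision.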

  10. Accurate multi-source forest species mapping using the multiple spectral-spatial classification approach

    NASA Astrophysics Data System (ADS)

    Stavrakoudis, Dimitris; Gitas, Ioannis; Karydas, Christos; Kolokoussis, Polychronis; Karathanassi, Vassilia

    2015-10-01

    This paper proposes an efficient methodology for combining multiple remotely sensed images in order to increase the classification accuracy in complex forest species mapping tasks. The proposed scheme follows a decision fusion approach, whereby each image is first classified separately by means of a pixel-wise Fuzzy-Output Support Vector Machine (FO-SVM) classifier. Subsequently, the multiple results are fused according to the so-called multiple spectral-spatial classifier using the minimum spanning forest (MSSC-MSF) approach, which constitutes an effective post-regularization procedure for enhancing the result of a single pixel-based classification. For this purpose, the original MSSC-MSF has been extended in order to handle multiple classifications. In particular, the fuzzy outputs of the pixel-based classifiers are stacked and used to grow the MSF, whereas the markers are also determined considering both classifications. The proposed methodology has been tested on a challenging forest species mapping task in northern Greece, considering a multispectral (GeoEye) and a hyper-spectral (CASI) image. The pixel-wise classifications resulted in overall accuracies (OA) of 68.71% for the GeoEye and 77.95% for the CASI images, respectively. Both of them are characterized by high levels of speckle noise. Applying the proposed multi-source MSSC-MSF fusion, the OA climbs to 90.86%, which is attributed both to the ability of MSSC-MSF to tackle the salt-and-pepper effect and to the fact that the fusion approach exploits the relative advantages of both information sources.

  11. A novel approach to probabilistic biomarker-based classification using functional near-infrared spectroscopy.

    PubMed

    Hahn, Tim; Marquand, Andre F; Plichta, Michael M; Ehlis, Ann-Christine; Schecklmann, Martin W; Dresler, Thomas; Jarczok, Tomasz A; Eirich, Elisa; Leonhard, Christine; Reif, Andreas; Lesch, Klaus-Peter; Brammer, Michael J; Mourao-Miranda, Janaina; Fallgatter, Andreas J

    2013-05-01

    Pattern recognition approaches to the analysis of neuroimaging data have brought new applications such as the classification of patients and healthy controls within reach. In our view, the reliance on expensive neuroimaging techniques which are not well tolerated by many patient groups and the inability of most current biomarker algorithms to accommodate information about prior class frequencies (such as a disorder's prevalence in the general population) are key factors limiting practical application. To overcome both limitations, we propose a probabilistic pattern recognition approach based on cheap and easy-to-use multi-channel near-infrared spectroscopy (fNIRS) measurements. We show the validity of our method by applying it to data from healthy controls (n = 14) enabling differentiation between the conditions of a visual checkerboard task. Second, we show that high-accuracy single subject classification of patients with schizophrenia (n = 40) and healthy controls (n = 40) is possible based on temporal patterns of fNIRS data measured during a working memory task. For classification, we integrate spatial and temporal information at each channel to estimate overall classification accuracy. This yields an overall accuracy of 76% which is comparable to the highest ever achieved in biomarker-based classification of patients with schizophrenia. In summary, the proposed algorithm in combination with fNIRS measurements enables the analysis of sub-second, multivariate temporal patterns of BOLD responses and high-accuracy predictions based on low-cost, easy-to-use fNIRS patterns. In addition, our approach can easily compensate for variable class priors, which is highly advantageous in making predictions in a wide range of clinical neuroimaging applications. PMID:22965654
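
    The remark about compensating for variable class priors can be illustrated with Bayes' rule for a two-class probabilistic classifier. The abstract does not describe how the authors implement this, so the sketch below is a generic prior-shift correction, with a hypothetical function name and invented numbers.

```python
def adjust_for_prior(p_positive, train_prior, target_prior):
    """Re-weight a predicted class probability for a different class prior.

    `p_positive` is the classifier's probability for the positive class when
    that class had frequency `train_prior` in training; `target_prior` is its
    assumed frequency in the population of interest.
    """
    positive = p_positive * target_prior / train_prior
    negative = (1 - p_positive) * (1 - target_prior) / (1 - train_prior)
    return positive / (positive + negative)


# A classifier trained on balanced groups outputs 0.8 for "patient".
balanced_output = 0.8
# Assumed prevalence of 1% in the population of interest (illustrative only).
adjusted = adjust_for_prior(balanced_output, train_prior=0.5, target_prior=0.01)
print(adjusted)
```

    With a balanced training set and a rare condition, the adjusted probability is far lower than the raw output, which shows why ignoring prevalence can make a biomarker look more decisive than it is.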

  12. Detection of fallen trees in ALS point clouds using a Normalized Cut approach trained by simulation

    NASA Astrophysics Data System (ADS)

    Polewski, Przemyslaw; Yao, Wei; Heurich, Marco; Krzystek, Peter; Stilla, Uwe

    2015-07-01

    Downed dead wood is regarded as an important part of forest ecosystems from an ecological perspective, which drives the need for investigating its spatial distribution. Based on several studies, Airborne Laser Scanning (ALS) has proven to be a valuable remote sensing technique for obtaining such information. This paper describes a unified approach to the detection of fallen trees from ALS point clouds based on merging short segments into whole stems using the Normalized Cut algorithm. We introduce a new method of defining the segment similarity function for the clustering procedure, where the attribute weights are learned from labeled data. Based on a relationship between Normalized Cut's similarity function and a class of regression models, we show how to learn the similarity function by training a classifier. Furthermore, we propose using an appearance-based stopping criterion for the graph cut algorithm as an alternative to the standard Normalized Cut threshold approach. We set up a virtual fallen tree generation scheme to simulate complex forest scenarios with multiple overlapping fallen stems. This simulated data is then used as a basis to learn both the similarity function and the stopping criterion for Normalized Cut. We evaluate our approach on 5 plots from the strictly protected mixed mountain forest within the Bavarian Forest National Park using reference data obtained via a manual field inventory. The experimental results show that our method is able to detect up to 90% of fallen stems in plots having 30-40% overstory cover with a correctness exceeding 80%, even in quite complex forest scenes. Moreover, the performance for feature weights trained on simulated data is competitive with the case when the weights are calculated using a grid search on the test data, which indicates that the learned similarity function and stopping criterion can generalize well on new plots.

  13. Bag-of-features approach for improvement of lung tissue classification in diffuse lung disease

    NASA Astrophysics Data System (ADS)

    Kato, Noriji; Fukui, Motofumi; Isozaki, Takashi

    2009-02-01

    Many automated techniques have been proposed to classify diffuse lung disease patterns. Most of the techniques utilize texture analysis approaches with second and higher order statistics, and show successful classification results among various lung tissue patterns. However, the approaches do not work well for patterns with inhomogeneous texture distribution within a region of interest (ROI), such as reticular and honeycombing patterns, because the statistics can only capture features averaged over the ROI. In this work, we have introduced the bag-of-features approach to overcome this difficulty. In the approach, texture images are represented as histograms or distributions of a few basic primitives, which are obtained by clustering local image features. The intensity descriptor and the Scale-Invariant Feature Transform (SIFT) descriptor are utilized to extract the local features, which have significant discriminatory power due to their specificity to a particular image class. In contrast, the drawback of the local features is a lack of invariance under translation and rotation. We improved the invariance by sampling many local regions so that the distribution of the local features is unchanged. We evaluated the performance of our system in a classification task with 5 image classes (ground glass, reticular, honeycombing, emphysema, and normal) using 1109 ROIs from 211 patients. Our system achieved a high classification accuracy of 92.8%, which is superior to that of the conventional system with the gray level co-occurrence matrix (GLCM) feature, especially for inhomogeneous texture patterns.
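
    The histogram representation described above can be sketched in a few lines. This is an illustrative sketch only: it assumes a codebook of primitives has already been obtained by some clustering step, and the function name `bag_of_features`, the two-dimensional descriptors, and the codebook values are hypothetical rather than taken from the paper.

```python
def bag_of_features(descriptors, codebook):
    """Return the normalized histogram of nearest-codeword assignments."""
    if not descriptors:
        raise ValueError("at least one local descriptor is required")
    counts = [0] * len(codebook)
    for d in descriptors:
        # Squared Euclidean distance from this descriptor to every codeword.
        dists = [sum((a - b) ** 2 for a, b in zip(d, c)) for c in codebook]
        counts[dists.index(min(dists))] += 1
    return [n / len(descriptors) for n in counts]

# Hypothetical codebook (cluster centres) and local descriptors from one ROI.
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
descriptors = [(0.1, 0.2), (0.9, 1.1), (1.2, 0.8), (4.8, 5.1)]
hist = bag_of_features(descriptors, codebook)
print(hist)  # [0.25, 0.5, 0.25]
```

    Because the histogram only counts which primitives occur, it does not depend on where in the ROI each local region was sampled, which is the property the abstract relies on for inhomogeneous textures.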

  14. A splay tree-based approach for efficient resource location in P2P networks.

    PubMed

    Zhou, Wei; Tan, Zilong; Yao, Shaowen; Wang, Shipu

    2014-01-01

    Resource location in structured P2P systems has a critical influence on system performance. Existing analytical studies of the Chord protocol have shown some potential improvements in performance. In this paper, a new splay tree-based Chord structure called SChord is proposed to improve the efficiency of locating resources. We consider a novel implementation of the Chord finger table (routing table) based on the splay tree. This approach extends the Chord finger table with additional routing entries. An adaptive routing algorithm is proposed for implementation, and it can be shown that hop count is significantly reduced without introducing any other protocol overheads. We analyze the hop count of the adaptive routing algorithm, as compared to Chord variants, and demonstrate sharp upper and lower bounds for both worst-case and average-case settings. In addition, we theoretically analyze the hop reduction in SChord and show that SChord can significantly reduce the routing hops as compared to Chord. Several simulations are presented to evaluate the performance of the algorithm and support our analytical findings. The simulation results show the efficiency of SChord. PMID:24778602
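
    For orientation, the baseline that SChord extends can be sketched as follows. This shows only the standard Chord finger table, in which entry i of node n points to the successor of (n + 2^i) mod 2^m; the splay tree organisation and the additional routing entries of SChord are not specified in the abstract and are not reproduced here. The function names and the three-node ring are illustrative assumptions.

```python
import bisect

def successor(ring, key):
    """First node ID >= key on the sorted ring, wrapping to the smallest ID."""
    i = bisect.bisect_left(ring, key)
    return ring[i % len(ring)]

def finger_table(n, nodes, m):
    """Standard Chord fingers of node n in an m-bit identifier space."""
    ring = sorted(nodes)
    return [successor(ring, (n + 2 ** i) % 2 ** m) for i in range(m)]

# Illustrative ring with a 3-bit identifier space and nodes 0, 1 and 3.
print(finger_table(0, [0, 1, 3], 3))  # [1, 3, 0]
```

    Each finger roughly halves the remaining distance to a key, which gives plain Chord its logarithmic hop count; adding or reordering routing entries is one way a variant could shorten lookups further.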

  15. A Splay Tree-Based Approach for Efficient Resource Location in P2P Networks

    PubMed Central

    Zhou, Wei; Tan, Zilong; Yao, Shaowen; Wang, Shipu

    2014-01-01

    Resource location in structured P2P systems has a critical influence on system performance. Existing analytical studies of the Chord protocol have shown some potential improvements in performance. In this paper, a new splay tree-based Chord structure called SChord is proposed to improve the efficiency of locating resources. We consider a novel implementation of the Chord finger table (routing table) based on the splay tree. This approach extends the Chord finger table with additional routing entries. An adaptive routing algorithm is proposed for implementation, and it can be shown that hop count is significantly reduced without introducing any other protocol overheads. We analyze the hop count of the adaptive routing algorithm, as compared to Chord variants, and demonstrate sharp upper and lower bounds for both worst-case and average-case settings. In addition, we theoretically analyze the hop reduction in SChord and show that SChord can significantly reduce the routing hops as compared to Chord. Several simulations are presented to evaluate the performance of the algorithm and support our analytical findings. The simulation results show the efficiency of SChord. PMID:24778602

  16. A Unified Experimental Approach for Estimation of Irrigation Water and Nitrate Leaching in Tree Crops

    NASA Astrophysics Data System (ADS)

    Hopmans, J. W.; Kandelous, M. M.; Moradi, A. B.

    2014-12-01

    Groundwater quality is particularly vulnerable in irrigated agricultural lands in California and many other (semi-)arid regions of the world. The routine application of nitrogen fertilizers with irrigation water in California is likely responsible for the high nitrate concentrations in groundwater underlying much of its main agricultural areas. To optimize irrigation/fertigation practices, it is essential that irrigation and fertilizers are applied at the optimal concentration, place, and time to ensure maximum root uptake and minimize leaching losses to the groundwater. The applied irrigation water and dissolved fertilizer, as well as root growth and associated nitrate and water uptake, interact with soil properties and fertilizer source(s) in a complex manner that cannot easily be resolved. Coupled experimental-modeling studies are therefore required to unravel the relevant complexities that result from typical field-wide spatial variations of soil texture and layering across farmer-managed fields. We present experimental approaches across a network of tree crop orchards in the San Joaquin Valley that provide the necessary soil data on soil moisture, water potential, and nitrate concentration to evaluate and optimize irrigation water management practices. Specifically, deep tensiometers were used to monitor in-situ continuous soil water potential gradients in order to compute leaching fluxes of water and nitrate at both the individual tree and field scale.

  17. A regression tree approach to identifying subgroups with differential treatment effects.

    PubMed

    Loh, Wei-Yin; He, Xu; Man, Michael

    2015-05-20

    In the fight against hard-to-treat diseases such as cancer, it is often difficult to discover new treatments that benefit all subjects. For regulatory agency approval, it is more practical to identify subgroups of subjects for whom the treatment has an enhanced effect. Regression trees are natural for this task because they partition the data space. We briefly review existing regression tree algorithms. Then, we introduce three new ones that are practically free of selection bias and are applicable to data from randomized trials with two or more treatments, censored response variables, and missing values in the predictor variables. The algorithms extend the generalized unbiased interaction detection and estimation (GUIDE) approach by using three key ideas: (i) treatment as a linear predictor, (ii) chi-squared tests to detect residual patterns and lack of fit, and (iii) proportional hazards modeling via Poisson regression. Importance scores with thresholds for identifying influential variables are obtained as by-products. A bootstrap technique is used to construct confidence intervals for the treatment effects in each node. The methods are compared using real and simulated data. PMID:25656439

  18. A regression tree approach to identifying subgroups with differential treatment effects

    PubMed Central

    Loh, Wei-Yin; He, Xu; Man, Michael

    2015-01-01

    In the fight against hard-to-treat diseases such as cancer, it is often difficult to discover new treatments that benefit all subjects. For regulatory agency approval, it is more practical to identify subgroups of subjects for whom the treatment has an enhanced effect. Regression trees are natural for this task because they partition the data space. We briefly review existing regression tree algorithms. Then we introduce three new ones that are practically free of selection bias and are applicable to data from randomized trials with two or more treatments, censored response variables, and missing values in the predictor variables. The algorithms extend the GUIDE approach by using three key ideas: (i) treatment as a linear predictor, (ii) chi-squared tests to detect residual patterns and lack of fit, and (iii) proportional hazards modeling via Poisson regression. Importance scores with thresholds for identifying influential variables are obtained as by-products. A bootstrap technique is used to construct confidence intervals for the treatment effects in each node. The methods are compared using real and simulated data. PMID:25656439

  19. [Proposals for social class classification based on the Spanish National Classification of Occupations 2011 using neo-Weberian and neo-Marxist approaches].

    PubMed

    Domingo-Salvany, Antònia; Bacigalupe, Amaia; Carrasco, José Miguel; Espelt, Albert; Ferrando, Josep; Borrell, Carme

    2013-01-01

    In Spain, the new National Classification of Occupations (Clasificación Nacional de Ocupaciones [CNO-2011]) is substantially different to the 1994 edition, and requires adaptation of occupational social classes for use in studies of health inequalities. This article presents two proposals to measure social class: the new classification of occupational social class (CSO-SEE12), based on the CNO-2011 and a neo-Weberian perspective, and a social class classification based on a neo-Marxist approach. The CSO-SEE12 is the result of a detailed review of the CNO-2011 codes. In contrast, the neo-Marxist classification is derived from variables related to capital and organizational and skill assets. The proposed CSO-SEE12 consists of seven classes that can be grouped into a smaller number of categories according to study needs. The neo-Marxist classification consists of 12 categories in which home owners are divided into three categories based on capital goods and employed persons are grouped into nine categories composed of organizational and skill assets. These proposals are complemented by a proposed classification of educational level that integrates the various curricula in Spain and provides correspondences with the International Standard Classification of Education. PMID:23394892

  20. An Efficient Approach for Automated Mass Segmentation and Classification in Mammograms.

    PubMed

    Dong, Min; Lu, Xiangyu; Ma, Yide; Guo, Yanan; Ma, Yurun; Wang, Keju

    2015-10-01

    Breast cancer is becoming a leading cause of death among women all over the world; clinical experiments demonstrate that early detection and accurate diagnosis can increase the potential of treatment. To improve the precision of breast cancer diagnosis, this paper presents a novel automated segmentation and classification method for mammograms. We conduct the experiment on both the DDSM and MIAS databases. First, we extract the regions of interest (ROIs) with chain codes and use the rough set (RS) method to enhance the ROIs. Second, we segment the mass region from the located ROIs with an improved vector field convolution (VFC) snake, then extract features from the mass region and its surroundings and establish a feature database with 32 dimensions. Finally, these features are used as input to several classification techniques. In our work, the random forest is used and compared with support vector machine (SVM), genetic algorithm support vector machine (GA-SVM), particle swarm optimization support vector machine (PSO-SVM), and decision tree. The effectiveness of our method is evaluated by a comprehensive and objective evaluation system; the Matthews correlation coefficient (MCC) is also used as an indicator. Among the state-of-the-art classifiers, our method achieves the best performance, with a best accuracy of 97.73% and MCC values of 0.8668 on the DDSM database alone and 0.8652 on both databases combined. Experimental results show that the proposed method outperforms the other methods; it could be applied in CAD systems to assist physicians in breast cancer diagnosis. PMID:25776767
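
    For readers unfamiliar with the MCC indicator reported above, the sketch below computes it from a binary confusion matrix using the standard definition. The function name `mcc` and the counts in the example are hypothetical and are not taken from the paper; returning 0.0 when a marginal total is zero is a common convention assumed here.

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # assumed convention when any marginal total is zero
    return (tp * tn - fp * fn) / denom

# Hypothetical counts: 90 true positives, 85 true negatives,
# 15 false positives, 10 false negatives.
print(mcc(90, 85, 15, 10))  # roughly 0.75
```

    Unlike accuracy, the MCC uses all four cells of the confusion matrix, so it stays informative when the benign and malignant classes are unbalanced.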

  1. An Introduction to Recursive Partitioning: Rationale, Application, and Characteristics of Classification and Regression Trees, Bagging, and Random Forests

    ERIC Educational Resources Information Center

    Strobl, Carolin; Malley, James; Tutz, Gerhard

    2009-01-01

    Recursive partitioning methods have become popular and widely used tools for nonparametric regression and classification in many scientific fields. Especially random forests, which can deal with large numbers of predictor variables even in the presence of complex interactions, have been applied successfully in genetics, clinical medicine, and…

  2. A water balance approach for reconstructing streamflow using tree-ring proxy records

    NASA Astrophysics Data System (ADS)

    Saito, Laurel; Biondi, Franco; Devkota, Rajan; Vittori, Jasmine; Salas, Jose D.

    2015-10-01

    Tree-ring data have been used to augment limited instrumental records of climate and provide a longer view of past variability, thus improving assessments of future scenarios. For streamflow reconstructions, traditional regression-based approaches cannot examine factors that may alter streamflow independently of climate, such as changes in land use or land cover. In this study, seasonal water balance models were used as a mechanistic approach to reconstruct streamflow with proxy inputs of precipitation and air temperature. We examined a Thornthwaite water balance model modified to have seasonal components and a simple water balance model with a snow component. These two models were calibrated with a shuffled complex evolution approach using PRISM and proxy seasonal temperature and precipitation to reconstruct streamflow for the upper reaches of the West Walker River basin at Coleville, CA. Overall, the modified Thornthwaite model performed best during calibration, with R2 values of 0.96 and 0.80 using PRISM and proxy inputs, respectively. The modified Thornthwaite model was then used to reconstruct streamflow during AD 1500-1980 for the West Walker River basin. The reconstruction included similar wet and dry episodes as other regression-based records for the Great Basin, and provided estimates of actual evapotranspiration and of April 1 snow water equivalence. Given its limited input requirements, this approach is suitable in areas where sparse instrumental data are available to improve proxy-based streamflow reconstructions and to explore non-climatic reasons for streamflow variability during the reconstruction period.

  3. Image-Based Airborne Sensors: A Combined Approach for Spectral Signatures Classification through Deterministic Simulated Annealing

    PubMed Central

    Guijarro, María; Pajares, Gonzalo; Herrera, P. Javier

    2009-01-01

    The advancing technology of high-resolution airborne image sensors, including those on board Unmanned Aerial Vehicles, demands automatic solutions for processing, either on-line or off-line, the huge amounts of image data sensed during the flights. The classification of natural spectral signatures in images is one potential application. The current trend in classification is towards the combination of simple classifiers. In this paper we propose a combined strategy based on the Deterministic Simulated Annealing (DSA) framework. The simple classifiers used are the well-tested supervised parametric Bayesian estimator and Fuzzy Clustering. The DSA is an optimization approach, which minimizes an energy function. The main contribution of DSA is its ability to avoid local minima during the optimization process thanks to the annealing scheme. It outperforms the simple classifiers used for the combination and some combined strategies, including a scheme based on fuzzy cognitive maps and an optimization approach based on the Hopfield neural network paradigm. PMID:22399989

  4. Combining different classification approaches to improve off-line Arabic handwritten word recognition

    NASA Astrophysics Data System (ADS)

    Zavorin, Ilya; Borovikov, Eugene; Davis, Ericson; Borovikov, Anna; Summers, Kristen

    2008-01-01

    Machine perception and recognition of handwritten text in any language is a difficult problem. Even for Latin script most solutions are restricted to specific domains like bank checks courtesy amount recognition. Arabic script presents additional challenges for handwriting recognition systems due to its highly connected nature, numerous forms of each letter, and other factors. In this paper we address the problem of offline Arabic handwriting recognition of pre-segmented words. Rather than focusing on a single classification approach and trying to perfect it, we propose to combine heterogeneous classification methodologies. We evaluate our system on the IFN/ENIT corpus of Tunisian village and town names and demonstrate that the combined approach yields results that are better than those of the individual classifiers.

  5. Towards biological plausibility of electronic noses: A spiking neural network based approach for tea odour classification.

    PubMed

    Sarkar, Sankho Turjo; Bhondekar, Amol P; Macaš, Martin; Kumar, Ritesh; Kaur, Rishemjit; Sharma, Anupma; Gulati, Ashu; Kumar, Amod

    2015-11-01

    The paper presents a novel encoding scheme for neuronal code generation for odour recognition using an electronic nose (EN). This scheme is based on channel encoding using multiple Gaussian receptive fields superimposed over the temporal EN responses. The encoded data is further applied to a spiking neural network (SNN) for pattern classification. Two forms of SNN, a back-propagation based SpikeProp and a dynamic evolving SNN are used to learn the encoded responses. The effects of information encoding on the performance of SNNs have been investigated. Statistical tests have been performed to determine the contribution of the SNN and the encoding scheme to overall odour discrimination. The approach has been implemented in odour classification of orthodox black tea (Kangra-Himachal Pradesh Region) thereby demonstrating a biomimetic approach for EN data analysis. PMID:26356597

  6. A new computer approach to mixed feature classification for forestry application

    NASA Technical Reports Server (NTRS)

    Kan, E. P.

    1976-01-01

    A computer approach for mapping mixed forest features (i.e., types, classes) from computer classification maps is discussed. Mixed features such as mixed softwood/hardwood stands are treated as admixtures of softwood and hardwood areas. Large-area mixed features are identified and small-area features neglected when the nominal size of a mixed feature can be specified. The computer program merges small isolated areas into surrounding areas by the iterative manipulation of the postprocessing algorithm that eliminates small connected sets. For a forestry application, computer-classified LANDSAT multispectral scanner data of the Sam Houston National Forest were used to demonstrate the proposed approach. The technique was successful in cleaning the salt-and-pepper appearance of multiclass classification maps and in mapping admixtures of softwood areas and hardwood areas. However, the computer-mapped mixed areas matched very poorly with the ground truth because of inadequate resolution and inappropriate definition of mixed features.

  7. A Pareto evolutionary artificial neural network approach for remote sensing image classification

    NASA Astrophysics Data System (ADS)

    Liu, Fujiang; Wu, Xincai; Guo, Yan; Sun, Huashan; Zhou, Feng; Mei, Linlu

    2006-10-01

    This paper presents a Pareto evolutionary artificial neural network (Pareto-EANN) approach, based on evolutionary algorithms for multiobjective optimization augmented with local search, for the classification of remote sensing images. Its novelty lies in the use of a multiobjective genetic algorithm in which single-hidden-layer Multilayer Perceptrons (MLPs) are employed to indicate the accuracy/complexity trade-off. Advantages of this approach include the ability to accommodate multiple criteria such as the accuracy of the classifier and the number of hidden units. We compared the results of Pareto-EANN classifiers on remote sensing image classification against standard backpropagation neural network classifiers and EANN classifiers; the experiments show the efficiency of the proposed methodology.

  8. Automatic Training Sample Selection for a Multi-Evidence Based Crop Classification Approach

    NASA Astrophysics Data System (ADS)

    Chellasamy, M.; Ferre, P. A. Ty; Humlekrog Greve, M.

    2014-09-01

    An approach that uses the available agricultural parcel information to automatically select training samples for crop classification is investigated. Previous research addressed the multi-evidence crop classification approach using an ensemble classifier. This first produced confidence measures using three Multi-Layer Perceptron (MLP) neural networks trained separately with spectral, texture and vegetation indices; classification labels were then assigned based on Endorsement Theory. The present study proposes an approach to feed this ensemble classifier with automatically selected training samples. The available vector data representing crop boundaries with corresponding crop codes are used as a source for training samples. These vector data are created by farmers to support subsidy claims and are, therefore, prone to errors such as mislabeling of crop codes and boundary digitization errors. The proposed approach is named ECRA (Ensemble based Cluster Refinement Approach). ECRA first automatically removes mislabeled samples and then selects the refined training samples in an iterative training-reclassification scheme. Mislabel removal is based on the expectation that mislabels in each class will be far from the cluster centroid. However, this must be a soft constraint, especially when working with a hypothesis space that does not contain a good approximation of the target classes. Difficulty in finding a good approximation often exists due to either less informative data or a large hypothesis space. Thus this approach uses the spectral, texture and indices domains in an ensemble framework to iteratively remove the mislabeled pixels from the crop clusters declared by the farmers. Once the clusters are refined, the selected border samples are used for final learning and the unknown samples are classified using the multi-evidence approach. The study is implemented with WorldView-2 multispectral imagery acquired for a study area containing 10 crop classes. The proposed

  9. Dynamic frequency feature selection based approach for classification of motor imageries.

    PubMed

    Luo, Jing; Feng, Zuren; Zhang, Jun; Lu, Na

    2016-08-01

    Electroencephalography (EEG) is one of the most popular techniques for recording brain activities such as motor imagery, but it has a low signal-to-noise ratio, which can lead to high classification error. Therefore, selecting the most discriminative features can be crucial to improving classification performance. However, the traditional feature selection methods employed in the brain-computer interface (BCI) field (e.g. Mutual Information-based Best Individual Feature (MIBIF), Mutual Information-based Rough Set Reduction (MIRSR) and cross-validation) mainly focus on the overall performance on all the trials in the training set, and thus may perform very poorly on some specific samples, which is not acceptable. To address this problem, a novel sequential forward feature selection approach called Dynamic Frequency Feature Selection (DFFS) is proposed in this paper. The DFFS method emphasizes the importance of the samples that were misclassified rather than only pursuing high overall classification performance. In the DFFS-based classification scheme, the EEG data are first transformed to the frequency domain using Wavelet Packet Decomposition (WPD), and the result is employed as the candidate set for further discriminatory feature selection. The features are selected one by one in a boosting manner. After a feature is selected, the importance of the samples correctly classified using that feature is decreased, which is equivalent to increasing the importance of the misclassified samples. Therefore, a feature complementary to the current features can be selected in the next run. The selected features are then fed to a classifier trained by the random forest algorithm. Finally, a time series voting-based method is utilized to improve the classification performance. Comparisons between the DFFS-based approach and state-of-the-art methods on BCI competition IV data set 2b have been conducted and show the superiority of the proposed algorithm. PMID:27253616
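
    The boosting-style selection loop described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the published DFFS procedure: each candidate feature is summarized by a hypothetical per-sample correctness table, the score is weighted accuracy, and the multiplicative `decay` factor is an assumed way of lowering the importance of correctly classified samples.

```python
def select_features(correct, n_select, decay=0.5):
    """Greedy boosting-style feature selection.

    correct[f][s] is True when candidate feature f alone classifies
    sample s correctly. Returns the selected feature indices in order.
    """
    n_samples = len(correct[0])
    weights = [1.0 / n_samples] * n_samples
    selected = []
    for _ in range(n_select):
        remaining = [f for f in range(len(correct)) if f not in selected]
        # Pick the remaining feature with the highest weighted accuracy.
        best = max(
            remaining,
            key=lambda f: sum(w for w, ok in zip(weights, correct[f]) if ok),
        )
        selected.append(best)
        # Down-weight samples the chosen feature handles, then renormalize.
        weights = [w * decay if ok else w
                   for w, ok in zip(weights, correct[best])]
        total = sum(weights)
        weights = [w / total for w in weights]
    return selected

# Hypothetical correctness table: 3 candidate features, 4 training samples.
correct = [
    [True, True, True, False],    # feature 0: best overall
    [True, True, False, False],   # feature 1: redundant with feature 0
    [False, False, False, True],  # feature 2: covers the sample feature 0 misses
]
print(select_features(correct, 2, decay=0.25))  # [0, 2]
```

    With `decay=1.0` (no reweighting) the same table yields `[0, 1]`, so the example shows how shifting weight onto misclassified samples favours a complementary feature over a redundant one.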

  10. A bag of cells approach for antinuclear antibodies HEp-2 image classification.

    PubMed

    Wiliem, Arnold; Hobson, Peter; Minchin, Rodney F; Lovell, Brian C

    2015-06-01

    The antinuclear antibody (ANA) test via indirect immunofluorescence applied on Human Epithelial type 2 (HEp-2) cells is a pathology test commonly used to identify connective tissue diseases (CTDs). Despite its effectiveness, the test is still considered labor intensive and time consuming. Applying image-based computer aided diagnosis (CAD) systems is one of the possible ways to address these issues. Ideally, a CAD system should be able to classify ANA HEp-2 images taken by a camera fitted to a fluorescence microscope. Unfortunately, most prior works have primarily focused on the HEp-2 cell image classification problem which is one of the early essential steps in the system pipeline. In this work we directly tackle the specimen image classification problem. We aim to develop a system that can be easily scaled and has competitive accuracy. ANA HEp-2 images or ANA images are generally comprised of a number of cells. Patterns exhibiting in the cells are then used to make inference on the ANA image pattern. To that end, we adapted a popular approach for general image classification problems, namely a bag of visual words approach. Each specimen is considered as a visual document containing visual vocabularies represented by its cells. A specimen image is then represented by a histogram of visual vocabulary occurrences. We name this approach as the Bag of Cells approach. We studied the performance of the proposed approach on a set of images taken from 262 ANA positive patient sera. The results show the proposed approach has competitive performance compared to the recent state-of-the-art approaches. Our proposal can also be expanded to other tests involving examining patterns of human cells to make inferences. PMID:25492545

  11. A Serial Risk Score Approach to Disease Classification that Accounts for Accuracy and Cost

    PubMed Central

    Huynh, Dat; Laeyendecker, Oliver; Brookmeyer, Ron

    2016-01-01

    Summary: The performance of diagnostic tests for disease classification is often measured by accuracy (e.g. sensitivity or specificity); however, costs of the diagnostic test are a concern as well. Combinations of multiple diagnostic tests may improve accuracy, but incur additional costs. Here we consider serial testing approaches that maintain accuracy while controlling costs of the diagnostic tests. We present a serial risk score classification approach. The basic idea is to sequentially test with additional diagnostic tests just until persons are classified. In this way, it is not necessary to test all persons with all tests. The methods are studied in simulations and compared with logistic regression. We applied the methods to data from HIV cohort studies to identify HIV infected individuals who are recently infected (< 1 year) by testing with assays for multiple biomarkers. We find that the serial risk score classification approach can maintain accuracy while achieving a reduction in cost compared to testing all individuals with all assays. PMID:25156309
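
    The idea of testing only until a person is classified can be sketched as a simple stopping rule. The sketch below is an assumption-laden illustration rather than the authors' method: it assumes each test contributes an additive amount to a running risk score, that fixed lower and upper thresholds decide when to stop, and that the sign of the final score is used as a fallback. All names, scores, and costs are hypothetical.

```python
def serial_classify(test_results, lower, upper):
    """Apply tests in order and stop as soon as the running score is decisive.

    test_results: list of (score_contribution, cost) pairs in testing order.
    Returns (label, total_cost, tests_used).
    """
    score = 0.0
    cost = 0.0
    for used, (contribution, test_cost) in enumerate(test_results, start=1):
        score += contribution
        cost += test_cost
        if score >= upper:
            return "positive", cost, used
        if score <= lower:
            return "negative", cost, used
    # No threshold was crossed: fall back to the sign of the final score.
    label = "positive" if score > 0 else "negative"
    return label, cost, len(test_results)

# Hypothetical person: three assays with score contributions and costs.
results = [(2, 10), (1, 25), (-1, 40)]
print(serial_classify(results, lower=-2, upper=3))  # ('positive', 35.0, 2)
```

    In this illustration the third and most expensive assay is never run, which is the source of the cost saving relative to applying every test to every person.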

  12. [From population genetics to population genomics of forest trees: integrated population genomics approach].

    PubMed

    Krutovskiĭ, K V

    2006-10-01

    Early works by Altukhov and his associates on pine and spruce laid the foundation for Russian population genetic studies on tree species with the use of molecular genetic markers. In recent years, these species have become especially popular as nontraditional eukaryotic models for population and evolutionary genomic research. Tree species with large, cross-pollinating native populations, high genetic and phenotypic variation, growing in diverse environments and affected by environmental changes during hundreds of years of their individual development, are an ideal model for studying the molecular genetic basis of adaptation. The great advance in this field is due to the rapid development of population genomics in the last few years. In the broad sense, population genomics is a novel, fast-developing discipline, combining traditional population genetic approaches with the genomic level of analysis. Thousands of genes with known function and sometimes known genomic localization can be simultaneously studied in many individuals. This opens new prospects for obtaining statistical estimates for a great number of genes and segregating elements. Mating system, gene exchange, reproductive population size, population disequilibrium, interaction among populations, and many other traditional problems of population genetics can be now studied using data on variation in many genes. Moreover, population genomic analysis allows one to distinguish factors that affect individual genes, alleles, or nucleotides (such as, for example, natural selection) from factors affecting the entire genome (e.g., demography). This paper presents a brief review of traditional methods of studying genetic variation in forest tree species and introduces a new, integrated population genomics approach. The main stages of the latter are: (1) selection of genes, which are tentatively involved in variation of adaptive traits, by means of a detailed examination of the regulation and the expression of

  13. Functional classification of CATH superfamilies: a domain-based approach for protein function annotation

    PubMed Central

    Das, Sayoni; Lee, David; Sillitoe, Ian; Dawson, Natalie L.; Lees, Jonathan G.; Orengo, Christine A.

    2015-01-01

    Motivation: Computational approaches that can predict protein functions are essential to bridge the widening function annotation gap especially since <1.0% of all proteins in UniProtKB have been experimentally characterized. We present a domain-based method for protein function classification and prediction of functional sites that exploits functional sub-classification of CATH superfamilies. The superfamilies are sub-classified into functional families (FunFams) using a hierarchical clustering algorithm supervised by a new classification method, FunFHMMer. Results: FunFHMMer generates more functionally coherent groupings of protein sequences than other domain-based protein classifications. This has been validated using known functional information. The conserved positions predicted by the FunFams are also found to be enriched in known functional residues. Moreover, the functional annotations provided by the FunFams are found to be more precise than other domain-based resources. FunFHMMer currently identifies 110 439 FunFams in 2735 superfamilies which can be used to functionally annotate > 16 million domain sequences. Availability and implementation: All FunFam annotation data are made available through the CATH webpages (http://www.cathdb.info). The FunFHMMer webserver (http://www.cathdb.info/search/by_funfhmmer) allows users to submit query sequences for assignment to a CATH FunFam. Contact: sayoni.das.12@ucl.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26139634

  14. Computational Classification Approach to Profile Neuron Subtypes from Brain Activity Mapping Data

    PubMed Central

    Li, Meng; Zhao, Fang; Lee, Jason; Wang, Dong; Kuang, Hui; Tsien, Joe Z.

    2015-01-01

    The analysis of cell type-specific activity patterns during behaviors is important for a better understanding of how neural circuits generate cognition, but has not been well explored in in vivo neurophysiological datasets. Here, we describe a computational approach to uncover distinct cell subpopulations from in vivo neural spike datasets. This method, termed “inter-spike-interval classification-analysis” (ISICA), comprises four major steps: spike pattern feature-extraction, pre-clustering analysis, clustering classification, and unbiased classification-dimensionality selection. By using two key features of spike dynamics, namely gamma distribution shape factors and a coefficient of variation of the inter-spike interval, we show that this ISICA method provides invariant classification for dopaminergic neurons or CA1 pyramidal cell subtypes regardless of the brain states from which spike data were collected. Moreover, we show that these ISICA-classified neuron subtypes underlie distinct physiological functions. We demonstrate that the uncovered dopaminergic neuron subtypes encoded distinct aspects of fearful experiences such as valence or value, whereas distinct hippocampal CA1 pyramidal cells responded differentially to ketamine-induced anesthesia. This ISICA method should be useful for better data mining of large-scale in vivo neural datasets, leading to novel insights into circuit dynamics associated with cognition. PMID:26212360
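
    The two spike-dynamics features named in this abstract can be illustrated with a minimal sketch. The function name `isi_features` is hypothetical, and the gamma shape factor is estimated here by the method of moments (shape = mean² / variance), which is an assumption made for illustration; the abstract does not say how the authors fitted the gamma distribution.

```python
import math


def isi_features(spike_times):
    """Return (cv, gamma_shape) for a sorted sequence of spike times."""
    # Inter-spike intervals: differences between consecutive spike times.
    isis = [b - a for a, b in zip(spike_times, spike_times[1:])]
    if len(isis) < 2:
        raise ValueError("need at least three spikes")
    mean = sum(isis) / len(isis)
    var = sum((x - mean) ** 2 for x in isis) / (len(isis) - 1)
    # Coefficient of variation: standard deviation relative to the mean.
    cv = math.sqrt(var) / mean
    # Method-of-moments gamma shape (an assumption, not the paper's fit).
    shape = float("inf") if var == 0 else mean * mean / var
    return cv, shape
```

    A perfectly regular spike train gives a coefficient of variation of zero and an unbounded shape estimate, while burstier trains push the coefficient of variation up and the shape estimate down.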

  15. A Comparison of Computer-Based Classification Testing Approaches Using Mixed-Format Tests with the Generalized Partial Credit Model

    ERIC Educational Resources Information Center

    Kim, Jiseon

    2010-01-01

    Classification testing has been widely used to make categorical decisions by determining whether an examinee has a certain degree of ability required by established standards. As computer technologies have developed, classification testing has become more computerized. Several approaches have been proposed and investigated in the context of…

  16. A combined segmentation and pixel-based classification approach of QuickBird imagery for land cover mapping

    NASA Astrophysics Data System (ADS)

    Wang, Jianmei; Li, Deren; Qin, Wenzhong

    2005-10-01

    Recent advances in remote-sensing technology suggest that satellite-based earth observation (EO) has great potential for providing and updating spatial information in a timely and cost-effective manner. However, as the spatial resolution of satellite images has improved, the detail in the images has become more complicated. Even when texture features are included for multi-spectral high-resolution satellite imagery, conventional methods for pixel-based classification have limited success. In order to take better advantage of the spatial information in high-resolution satellite imagery, a combined segmentation and pixel-based classification approach is presented in this paper. First, a pixel-based multi-spectral maximum-likelihood classification produces an initial classification result. Second, an image segmentation is created by watershed transform and region merging. Finally, the final classification map is obtained from the proportions of each class present in each segment. A QuickBird image of a suburban area of Shanghai, China, is used to validate the proposed method. The experiment shows that the classification map produced by the combined approach is free of visual noise, has clean borders, and has better classification accuracy than the map produced by the pixel-based classification approach.
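
    The final step described above, combining a per-pixel class map with a segmentation, might be sketched as follows. The helper name `relabel_by_segment` is hypothetical, and assigning each segment its most frequent class is an assumed reading of "based on the proportions of each class"; the abstract does not give the exact decision rule.

```python
from collections import Counter, defaultdict


def relabel_by_segment(pixel_classes, segment_ids):
    """Give every pixel the most frequent class found within its segment.

    Both arguments are 2-D lists of the same shape: per-pixel class labels
    and per-pixel segment identifiers.
    """
    counts = defaultdict(Counter)
    for class_row, seg_row in zip(pixel_classes, segment_ids):
        for cls, seg in zip(class_row, seg_row):
            counts[seg][cls] += 1
    # Assumed rule: the class with the largest share wins the whole segment.
    winner = {seg: c.most_common(1)[0][0] for seg, c in counts.items()}
    return [[winner[seg] for seg in seg_row] for seg_row in segment_ids]
```

    Isolated pixels that disagree with the rest of their segment are overwritten, which is one way the speckle of a per-pixel map can be suppressed.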

  17. Exploring tree species signature using waveform LiDAR data

    NASA Astrophysics Data System (ADS)

    Zhou, T.; Popescu, S. C.; Krause, K.

    2015-12-01

    Successful classification of tree species with waveform LiDAR data would be of considerable value for estimating biomass stocks and changes in forests. Current approaches emphasize converting the full waveform data into discrete points to obtain a larger number of parameters, and identify tree species using several discrete-point variables. However, this ignores intensity values and waveform shapes, which convey important structural characteristics. The overall goal of this study was to employ the intensity and waveform shape of individual trees as the waveform signature to detect tree species. The data were acquired by the National Ecological Observatory Network (NEON) within a 250 × 250 m study area located in the San Joaquin Experimental Range. Specific objectives were to: (1) segment individual trees using the smoothed canopy height model (CHM) derived from discrete LiDAR points; (2) link waveform LiDAR with the above individual tree boundaries to derive sample signatures of three tree species and use these signatures to discriminate tree species in a large area; and (3) compare tree species detection results from discrete LiDAR data and waveform LiDAR data. An overall accuracy of more than 80% was obtained for the individual tree segmentation. The preliminary results show that, compared with the discrete LiDAR data, the waveform LiDAR signature has a higher potential for accurate tree species classification.

  18. A tree-based statistical classification algorithm (CHAID) for identifying variables responsible for the occurrence of faecal indicator bacteria during waterworks operations

    NASA Astrophysics Data System (ADS)

    Bichler, Andrea; Neumaier, Arnold; Hofmann, Thilo

    2014-11-01

    Microbial contamination of groundwater used for drinking water can affect public health and is of major concern to local water authorities and water suppliers. Potential hazards need to be identified in order to protect raw water resources. We propose a non-parametric data mining technique for exploring the presence of total coliforms (TC) in a groundwater abstraction well and its relationship to readily available, continuous time series of hydrometric monitoring parameters (seven year records of precipitation, river water levels, and groundwater heads). The original monitoring parameters were used to create an extensive generic dataset of explanatory variables by considering different accumulation or averaging periods, as well as temporal offsets of the explanatory variables. A classification tree based on the Chi-Squared Automatic Interaction Detection (CHAID) recursive partitioning algorithm revealed statistically significant relationships between precipitation and the presence of TC in both a production well and a nearby monitoring well. Different secondary explanatory variables were identified for the two wells. Elevated water levels and short-term water table fluctuations in the nearby river were found to be associated with TC in the observation well. The presence of TC in the production well was found to relate to elevated groundwater heads and fluctuations in groundwater levels. The generic variables created proved useful for increasing significance levels. The tree-based model was used to predict the occurrence of TC on the basis of hydrometric variables.
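
    The construction of explanatory variables from accumulation periods and temporal offsets can be sketched as below. The function name `lagged_accumulations` and the use of simple sums over daily values are assumptions for illustration; the abstract also mentions averaging periods and does not specify the window lengths or offsets used.

```python
def lagged_accumulations(series, windows, offsets):
    """Build {(window, offset): values} from a daily time series.

    values[t] is the sum over `window` consecutive days ending `offset`
    days before day t, or None where the record does not reach back far
    enough. Windows must be >= 1 and offsets >= 0.
    """
    features = {}
    for w in windows:
        for k in offsets:
            column = []
            for t in range(len(series)):
                end = t - k          # last day included in the window
                start = end - w + 1  # first day included in the window
                if start >= 0:
                    column.append(sum(series[start:end + 1]))
                else:
                    column.append(None)
            features[(w, k)] = column
    return features
```

    Each resulting column is a candidate predictor, for example "2-day precipitation total, lagged by 1 day", that a tree-growing algorithm such as CHAID could then test against the presence or absence of total coliforms.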

  19. Comparison of Standard and Novel Signal Analysis Approaches to Obstructive Sleep Apnea Classification

    PubMed Central

    Roebuck, Aoife; Clifford, Gari D.

    2015-01-01

    Obstructive sleep apnea (OSA) is a disorder characterized by repeated pauses in breathing during sleep, which leads to deoxygenation and voiced chokes at the end of each episode. OSA is associated with daytime sleepiness and an increased risk of serious conditions such as cardiovascular disease, diabetes, and stroke. Between 2 and 7% of the adult population globally has OSA, but it is estimated that up to 90% of those are undiagnosed and untreated. Diagnosis of OSA requires expensive and cumbersome screening. Audio offers a potential non-contact alternative, particularly with the ubiquity of excellent signal processing on every phone. Previous studies have focused on the classification of snoring and apneic chokes. However, such approaches require accurate identification of events. This leads to limited accuracy and small study populations. In this work, we propose an alternative approach which uses multiscale entropy (MSE) coefficients presented to a classifier to identify disorder in vocal patterns indicative of sleep apnea. A database of 858 patients was used, the largest reported in this domain. Apneic choke, snore, and noise events encoded with speech analysis features were input into a linear classifier. Coefficients of MSE derived from the first 4 h of each recording were used to train and test a random forest to classify patients as apneic or not. Standard speech analysis approaches for event classification achieved an out-of-sample accuracy (Ac) of 76.9% with a sensitivity (Se) of 29.2% and a specificity (Sp) of 88.7% but high variance. For OSA severity classification, MSE provided an out-of-sample Ac of 79.9%, Se of 66.0%, and Sp of 88.8%. Including demographic information improved the MSE-based classification performance to Ac = 80.5%, Se = 69.2%, and Sp = 87.9%. These results indicate that audio recordings could be used in screening for OSA, but are generally under-sensitive. PMID:26380256
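
    The three figures of merit quoted above (Ac, Se, Sp) follow from the counts of a binary confusion matrix. The sketch below uses a hypothetical helper `screening_metrics` and made-up labels, not data from the study; it only shows how accuracy can stay fairly high while sensitivity is low when negatives dominate.

```python
def screening_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity for binary labels (1 = apneic)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return {
        "Ac": (tp + tn) / len(pairs),
        "Se": tp / (tp + fn),  # share of true cases that were detected
        "Sp": tn / (tn + fp),  # share of non-cases correctly cleared
    }
```

    With three true cases of which one is detected, and five non-cases of which four are cleared, this gives Ac = 0.625, Se ≈ 0.33 and Sp = 0.8.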

  20. Operational optimization of irrigation scheduling for citrus trees using an ensemble based data assimilation approach

    NASA Astrophysics Data System (ADS)

    Hendricks Franssen, H.; Han, X.; Martinez, F.; Jimenez, M.; Manzano, J.; Chanzy, A.; Vereecken, H.

    2013-12-01

    Data assimilation (DA) techniques, like the local ensemble transform Kalman filter (LETKF), not only offer the opportunity to update model predictions by assimilating new measurement data in real time, but also provide an improved basis for real-time (DA-based) control. This study focuses on the optimization of real-time irrigation scheduling for fields of citrus trees near Picassent (Spain). For three selected fields the irrigation was optimized with DA-based control, and for other fields irrigation was optimized on the basis of a more traditional approach where reference evapotranspiration for citrus trees was estimated using the FAO-method. The performance of the two methods is compared for the year 2013. The DA-based real-time control approach is based on ensemble predictions of soil moisture profiles, using the Community Land Model (CLM). The uncertainty in the model predictions is introduced by feeding the model with weather predictions from an ensemble prediction system (EPS) and uncertain soil hydraulic parameters. The model predictions are updated daily by assimilating soil moisture data measured by capacitance probes. The measurement data are assimilated with the help of the LETKF. The irrigation need was calculated for each of the ensemble members and averaged, and logistical constraints (hydraulics, energy costs) were taken into account for the final assignment of irrigation in space and time. For the operational scheduling based on this approach only model states and no model parameters were updated by the model. Other, non-operational simulation experiments for the same period were carried out where (1) neither ensemble weather forecast nor DA were used (open loop), (2) only ensemble weather forecast was used, (3) only DA was used, (4) also soil hydraulic parameters were updated in data assimilation and (5) both soil hydraulic and plant specific parameters were updated. The FAO-based and DA-based real-time irrigation control are compared in terms of soil moisture

  1. Ottawa's urban forest: A geospatial approach to data collection for the UFORE/i-Tree Eco ecosystem services valuation model

    NASA Astrophysics Data System (ADS)

    Palmer, Michael D.

    The i-Tree Eco model, developed by the U.S. Forest Service, is commonly used to estimate the value of the urban forest and the ecosystem services trees provide. The model relies on field-based measurements to estimate ecosystem service values. However, the methods for collecting the field data required for the model can be extensive and costly for large areas, and data collection can thus be a barrier to implementing the model for many cities. This study investigated the use of geospatial technologies as a means to collect urban forest structure measurements within the City of Ottawa, Ontario. Results show that geospatial data collection methods can serve as a proxy for urban forest structure parameters required by i-Tree Eco. Valuations using the geospatial approach are shown to be less accurate than those developed from field-based data, but significantly less expensive. Planners must weigh the limitations of either approach when planning assessment projects.

  2. Insights into geomorphic and vegetation spatial patterns within dynamic river floodplains using soft classification approaches

    NASA Astrophysics Data System (ADS)

    Guneralp, I.; Filippi, A. M.; Guneralp, B.; You, M.

    2014-12-01

    Lowland rivers in broad alluvial floodplains create one of the most dynamic landscapes, governed by multiple, and commonly nonlinear, interactions among geomorphic, hydrologic, and ecologic processes. Fluvial landforms and land-cover patches composing the floodplains of lowland rivers vary in their shapes and sizes because of variations in vegetation biomass, topography, and soil composition (e.g., of abandoned meanders versus accreting bars) across space. Such floodplain heterogeneity, in turn, influences future river-channel evolution by creating variability in channel-migration rates. In this study, using Landsat 5 Thematic Mapper data and alternative image-classification approaches, we investigate geomorphic and vegetation spatial patterns in a dynamic large tropical river. Specifically, we examine the spatial relations between river-channel planform and fluvial-landform and land-cover patterns across the floodplain. We classify the images using both hard and soft classification algorithms. We characterize the structure of geomorphic landform and vegetation components of the floodplain by computing a range of class-level landscape metrics based on the classified images. Results indicate that comparable classification accuracies are achieved for the inherently hard and (hardened) soft classification images, ranging from 89.8% to 91.8% overall accuracy. However, soft classification images provide unique information regarding spatially-varying similarities and differences in water-column properties of oxbow lakes and the main river channel. Proximity analyses, where buffer zones along the river with distances corresponding to 5, 10, and 20 river-channel widths are constructed, reveal that the average size of forest patches first increases away from the river banks, but the patches become sparse beyond a distance of 10 channel widths from the river.

  3. Extended Gabor approach applied to classification of emphysematous patterns in computed tomography

    PubMed Central

    Escalante-Ramírez, Boris; Cristóbal, Gabriel; Estépar, Raúl San José

    2014-01-01

    Chronic obstructive pulmonary disease (COPD) is a progressive and irreversible lung condition typically related to emphysema. It hinders air from passing through the airways and causes the alveolar sacs to lose their elasticity. Findings of COPD may be manifested in a variety of computed tomography (CT) studies. Nevertheless, visual assessment of CT images is time-consuming and depends on trained observers. Hence, a reliable computer-aided diagnosis system would be useful to reduce time and inter-evaluator variability. In this paper, we propose a new emphysema classification framework based on complex Gabor filters and local binary patterns. This approach simultaneously encodes global characteristics and local information to describe emphysema morphology in CT images. Kernel Fisher analysis was used to reduce dimensionality and to find the most discriminant nonlinear boundaries among classes. Finally, classification was performed using the k-nearest neighbor classifier. The results show the effectiveness of our approach for quantifying lesions due to emphysema and that the combination of descriptors yields better classification performance. PMID:24496558
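
    Of the components listed in this abstract, the local binary pattern descriptor is the simplest to illustrate. The sketch below is the basic 8-neighbour variant on a 2-D list of intensities; the function names are hypothetical, and the Gabor filtering, kernel Fisher analysis and k-nearest neighbor stages of the proposed framework are not reproduced here.

```python
def lbp_code(img, r, c):
    """Basic 8-neighbour local binary pattern code of interior pixel (r, c)."""
    centre = img[r][c]
    # Neighbours visited clockwise, starting at the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # Set the bit when the neighbour is at least as bright as the centre.
        if img[r + dr][c + dc] >= centre:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist
```

    A histogram of this kind is a typical local texture feature vector; in a pipeline like the one described, it would presumably be combined with the Gabor-based global features before dimensionality reduction and classification.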

  4. An approach to multi-temporal MODIS image analysis using image classification and segmentation

    NASA Astrophysics Data System (ADS)

    Senthilnath, J.; Bajpai, Shivesh; Omkar, S. N.; Diwakar, P. G.; Mani, V.

    2012-11-01

    This paper discusses an approach for river mapping and flood evaluation based on multi-temporal time series analysis of satellite images utilizing pixel spectral information for image classification and region-based segmentation for extracting water-covered regions. Analysis of MODIS satellite images is applied in three stages: before flood, during flood and after flood. Water regions are extracted from the MODIS images using image classification (based on spectral information) and image segmentation (based on spatial information). Multi-temporal MODIS images from "normal" (non-flood) and flood time-periods are processed in two steps. In the first step, image classifiers such as Support Vector Machines (SVM) and Artificial Neural Networks (ANN) separate the image pixels into water and non-water groups based on their spectral features. The classified image is then segmented using spatial features of the water pixels to remove the misclassified water. From the results obtained, we evaluate the performance of the method and conclude that the use of image classification (SVM and ANN) and region-based image segmentation is an accurate and reliable approach for the extraction of water-covered regions.

  5. A hybrid classifier fusion approach for motor unit potential classification during EMG signal decomposition.

    PubMed

    Rasheed, Sarbast; Stashuk, Daniel W; Kamel, Mohamed S

    2007-09-01

    In this paper, we propose a hybrid classifier fusion scheme for motor unit potential classification during electromyographic (EMG) signal decomposition. The scheme uses an aggregator module consisting of two stages of classifier fusion: the first at the abstract level using class labels and the second at the measurement level using confidence values. Performance of the developed system was evaluated using one set of real signals and two sets of simulated signals and was compared with the performance of the constituent base classifiers and the performance of a one-stage classifier fusion approach. Across the EMG signal data sets used and relative to the performance of base classifiers, the hybrid approach had better average classification performance overall. For the set of simulated signals of varying intensity, the hybrid classifier fusion system had on average an improved correct classification rate (CCr) (6.1%) and reduced error rate (Er) (0.4%). For the set of simulated signals of varying amounts of shape and/or firing pattern variability, the hybrid classifier fusion system had on average an improved CCr (6.2%) and reduced Er (0.9%). For real signals, the hybrid classifier fusion system had on average an improved CCr (7.5%) and reduced Er (1.7%). PMID:17867366
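
    A two-stage fusion of this general kind might look like the sketch below. It is only one possible arrangement, assumed for illustration: a strict majority vote over class labels is tried first, and the averaged confidence values decide when no majority exists. The function name `hybrid_fusion` and this fallback rule are hypothetical; the abstract does not describe how the aggregator module combines its two stages.

```python
from collections import Counter


def hybrid_fusion(labels, confidences):
    """Fuse base classifier outputs for one candidate motor unit potential.

    labels: one predicted class label per base classifier.
    confidences: one dict {class_label: confidence} per base classifier.
    """
    # Stage 1 (abstract level): accept a strict majority of class labels.
    top_label, top_count = Counter(labels).most_common(1)[0]
    if top_count > len(labels) / 2:
        return top_label
    # Stage 2 (measurement level): sum confidences per class and pick the
    # class with the highest average (assumed tie-breaking rule).
    totals = Counter()
    for conf in confidences:
        totals.update(conf)
    return max(totals, key=lambda cls: totals[cls] / len(confidences))
```

    When the base classifiers agree, the cheap label vote settles the decision; the confidence values are consulted only for the ambiguous cases.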

  6. Extended Gabor approach applied to classification of emphysematous patterns in computed tomography.

    PubMed

    Nava, Rodrigo; Escalante-Ramírez, Boris; Cristóbal, Gabriel; Estépar, Raúl San José

    2014-04-01

    Chronic obstructive pulmonary disease (COPD) is a progressive and irreversible lung condition typically related to emphysema. It hinders air from passing through the airways and causes the alveolar sacs to lose their elasticity. Findings of COPD may be manifested in a variety of computed tomography (CT) studies. Nevertheless, visual assessment of CT images is time-consuming and depends on trained observers. Hence, a reliable computer-aided diagnosis system would be useful to reduce time and inter-evaluator variability. In this paper, we propose a new emphysema classification framework based on complex Gabor filters and local binary patterns. This approach simultaneously encodes global characteristics and local information to describe emphysema morphology in CT images. Kernel Fisher analysis was used to reduce dimensionality and to find the most discriminant nonlinear boundaries among classes. Finally, classification was performed using the k-nearest neighbor classifier. The results show the effectiveness of our approach for quantifying lesions due to emphysema and that the combination of descriptors yields better classification performance. PMID:24496558

  7. A novel, element-based approach for the objective classification of bloodstain patterns.

    PubMed

    Arthur, Ravishka M; Cockerton, Sarah L; de Bruin, Karla G; Taylor, Michael C

    2015-12-01

    The classification of bloodstain patterns has been identified as a challenging part of bloodstain pattern analysis (BPA) due to the lack of a widely accepted and well-defined methodology and the ambiguity often associated with examining bloodstain patterns. The main aim of this study was to develop an objective, science-based method for classifying bloodstain patterns through the development of a common language that could be used by BPA experts to describe the appearance of the pattern. This novel approach encourages a shift in the mindset of a BPA analyst by bringing them 'back to the basics', treating components of a bloodstain pattern as discrete, observable and measurable units. One of the principal problems with current pattern classification methods is that pattern types are generally described in terms of the mechanism of pattern formation rather than grouped according to observable pattern characteristics. This study extends current BPA classification methodologies by developing and validating mechanism-free nomenclature that arises from observing and documenting the physical characteristics of bloodstain patterns. Following the grouping of bloodstain components on the basis of their physical characteristics, the formation evolution of these components is then investigated using concepts drawn from the fluid dynamics of bloodstain pattern formation. This study offers a promising approach to distinguishing between different bloodstain pattern types through the use of visual aids in the form of colour maps, high-speed video and static digital images. PMID:26386338

  8. Impact of different nitrogen emission sources on tree physiology as assessed by a triple stable isotope approach

    NASA Astrophysics Data System (ADS)

    Guerrieri, M. R.; Siegwolf, R. T. W.; Saurer, M.; Jäggi, M.; Cherubini, P.; Ripullone, F.; Borghetti, M.

    The importance that nitrogen (N) deposition has in driving the carbon (C) sequestration of forests has recently been investigated using both experimental and modeling approaches. Whether increased N deposition has positive or negative effects on such ecosystems depends on the status of the N and the duration of the deposition. By combining δ13C, δ18O, δ15N and dendrochronological approaches, we analyzed the impact of two different sources of NOx emissions on two tree species, namely: a broadleaved species (Quercus cerris) that was located close to an oil refinery in Southern Italy, and a coniferous species (Picea abies) located close to a freeway in Switzerland. Variations in the ci/ca ratio and the distinction between stomatal and photosynthetic responses to NOx emissions in trees were assessed using a conceptual model, which combines δ13C and δ18O. δ15N in leaves, needles and tree rings was found to be a bioindicator of N input from anthropogenic emissions, especially at the oil refinery site. We observed that N fertilization had a stimulatory effect on tree growth near the oil refinery, while the opposite effect was found for trees at the freeway site. Changes in the ci/ca ratio were mostly related to variations in δ13C at the freeway site and, thus, were driven by photosynthesis. At the oil refinery site they were mainly related to stomatal conductance, as assessed using δ18O. This study demonstrates that a single method approach does not always provide a complete picture of which physiological traits are more affected by N emissions. The triple isotope approach combined with dendrochronological analyses proved to be a very promising tool for monitoring the ecophysiological responses of trees to long-term N deposition.

  9. An inverse modeling approach for tree-ring-based climate reconstructions under changing atmospheric CO2 concentrations

    NASA Astrophysics Data System (ADS)

    Boucher, É.; Guiot, J.; Hatté, C.; Daux, V.; Danis, P.-A.; Dussouillez, P.

    2013-11-01

    Over the last decades, dendroclimatologists have relied upon linear transfer functions to reconstruct historical climate. Transfer functions need to be calibrated using recent data from periods where CO2 concentrations reached unprecedented levels (near 400 ppm). Based on these transfer functions, dendroclimatologists must then reconstruct a different past, a past where CO2 concentrations were much below 300 ppm. However, relying upon transfer functions calibrated in this way may introduce an unanticipated bias in the reconstruction of past climate, particularly if CO2 levels have had a noticeable fertilizing effect since the beginning of the industrial era. As an alternative to the transfer function approach, we run the MAIDENiso ecophysiological model in an inverse mode to link together climatic variables, atmospheric CO2 concentrations and tree growth parameters. Our approach endeavors to find the optimal combination of meteorological conditions that best simulate observed tree ring patterns. We test our approach in the Fontainebleau forest (France). By comparing two different CO2 scenarios, we present evidence that increasing CO2 concentrations have had a slight, yet significant, effect on reconstruction results. We demonstrate that higher CO2 concentrations augment the efficiency of water use by trees, therefore favoring the reconstruction of a warmer and drier climate. Under elevated CO2 concentrations, trees close their stomata and need less water to produce the same amount of wood. Inverse process-based modeling represents a powerful alternative to the transfer function technique, especially for the study of divergent tree-ring-to-climate relationships. The approach has several advantages, most notably its ability to distinguish between climatic effects and CO2 imprints on tree growth. Therefore our method produces reconstructions that are less biased by anthropogenic greenhouse gas emissions and that are based on sound ecophysiological knowledge.

  10. On the Biogeography of Centipeda: A Species-Tree Diffusion Approach

    PubMed Central

    Nylinder, Stephan; Lemey, Philippe; De Bruyn, Mark; Suchard, Marc A.; Pfeil, Bernard E.; Walsh, Neville; Anderberg, Arne A.

    2014-01-01

    Reconstructing the biogeographic history of groups present in continuous arid landscapes is challenging due to the difficulties in defining discrete areas for analyses, and even more so when species largely overlap both in terms of geography and habitat preference. In this study, we use a novel approach to estimate ancestral areas for the small plant genus Centipeda. We apply continuous diffusion of geography by a relaxed random walk where each species is sampled from its extant distribution on an empirical distribution of time-calibrated species-trees. Using a distribution of previously published substitution rates of the internal transcribed spacer (ITS) for Asteraceae, we show how the evolution of Centipeda correlates with the temporal increase of aridity in the arid zone since the Pliocene. Geographic estimates of ancestral species show a consistent pattern of speciation of early lineages in the Lake Eyre region, with a division into more northerly and southerly groups since ∼840 ka. Summarizing the geographic slices of species-trees at the time of the latest speciation event (∼20 ka) indicates no presence of the genus in Australia west of the combined desert belt of the Nullarbor Plain, the Great Victoria Desert, the Gibson Desert, and the Great Sandy Desert, or beyond the main continental shelf of Australia. The result indicates all western occurrences of the genus to be a result of recent dispersal rather than ancient vicariance. This study contributes to our understanding of the spatiotemporal processes shaping the flora of the arid zone, and offers a significant improvement in inference of ancestral areas for any organismal group distributed where it remains difficult to describe geography in terms of discrete areas. PMID:24335493

  11. An object-based approach to hierarchical classification of the Earth's topography from SRTM data

    NASA Astrophysics Data System (ADS)

    Eisank, C.; Dragut, L.

    2012-04-01

    Digital classification of the Earth's surface has significantly benefited from the availability of global DEMs and recent advances in image processing techniques. One such innovative approach is object-based analysis, which integrates multi-scale segmentation and rule-based classification. Since the classification is based on spatially configured objects and no longer solely on thematically defined cells, the resulting landforms or landform types are represented in a more realistic way. However, up to now, the object-based approach has not been adopted for broad-scale topographic modelling. Existing global to almost-global terrain classification systems have been implemented on per-cell schemes, accepting disadvantages such as the speckled character of outputs and the non-consideration of space. We introduce the first object-based method to automatically classify the Earth's surface as represented by the SRTM into a three-level hierarchy of topographic regions. The new method relies on the concept of decomposing land-surface complexity into ever more homogeneous domains. The SRTM elevation layer is automatically segmented and classified at three levels that represent domains of complexity by using self-adaptive, data-driven techniques. For each domain, scales in the data are detected with the help of local variance and segmentation is performed at these recognised scales. Objects resulting from segmentation are partitioned into sub-domains based on thresholds given by the mean values of elevation and standard deviation of elevation respectively. Results resemble patterns of existing global and regional classifications, displaying a level of detail close to manually drawn maps. Statistical evaluation indicates that most of the classes satisfy the regionalisation requirements of maximising internal homogeneity while minimising external homogeneity. Most objects have boundaries matching natural discontinuities at the regional level. The method is simple and fully

  12. Early and Mid-Holocene Climate Variability - A Multi-Proxy Approach from Multi-Millennial Tree Ring Records

    NASA Astrophysics Data System (ADS)

    Ziehmer, Malin Michelle; Nicolussi, Kurt; Schlüchter, Christian; Leuenberger, Markus

    2016-04-01

    Most reconstructions of Holocene climate variability in the Alps are based on low-frequency archives such as glacier and tree line fluctuations. However, recent finds of wood remains in glacier forefields in the Alps reveal a unique high-frequency archive allowing climate reconstruction over the entire Holocene. The evolution of Holocene climate can be reconstructed by using a multi-proxy approach combining tree ring width and multiple stable isotope chronologies by establishing highly resolved stable isotope records from calendar-dated wood which covers the past 9000 years b2k. Therefore, we collected samples in the Alps covering a large SW-NE transect, primarily in glacier forefields but also in peat bogs and small lakes. The multiple sample locations allow the analysis of climatic conditions along a climatic gradient characterized by the change from an Atlantic to a more continental climate. Subsequently, tree ring widths are measured and samples are calendrically dated by means of tree ring analysis. Due to the large number of samples for stable isotope analysis (> 8000 samples to cover the entire Holocene by guaranteeing a sample replication of 4 samples per time unit of 5 years), dated wood samples are separated into 5-year tree ring blocks. These blocks are sliced and the cellulose is extracted following a standardized procedure and crushed by ultrasonic homogenization. In order to establish multi-proxy records, the stable isotopes of carbon, oxygen and hydrogen are simultaneously measured. Both the 5-year tree ring width and multiple stable isotope series offer new insights into the Early and Mid-Holocene climate and its variability in the Alps. The stable isotope records reveal interesting low-frequency variability. But they also display expected offsets caused by the measurement of individual trees revealing effects of sampling site, tree species and growth trend. These effects offer an additional insight into the tree growth and stand behavior of single

  13. A data mining approach to optimize pellets manufacturing process based on a decision tree algorithm.

    PubMed

    Ronowicz, Joanna; Thommes, Markus; Kleinebudde, Peter; Krysiński, Jerzy

    2015-06-20

    The present study is focused on the thorough analysis of cause-effect relationships between pellet formulation characteristics (pellet composition as well as process parameters) and the selected quality attribute of the final product. The quality of the pellets was expressed by their shape, measured as the aspect ratio. A data matrix for chemometric analysis consisted of 224 pellet formulations prepared with eight different active pharmaceutical ingredients and several excipients, using different extrusion/spheronization process conditions. The data set contained 14 input variables (both formulation and process variables) and one output variable (pellet aspect ratio). A tree regression algorithm consistent with the Quality by Design concept was applied to obtain deeper understanding and knowledge of formulation and process parameters affecting the final pellet sphericity. A clear, interpretable set of decision rules was generated. The spheronization speed, spheronization time, number of holes and water content of extrudate have been recognized as the key factors influencing pellet aspect ratio. The most spherical pellets were achieved by using a large number of holes during extrusion, a high spheronizer speed and longer time of spheronization. The described data mining approach enhances knowledge about the pelletization process and simultaneously facilitates searching for the optimal process conditions which are necessary to achieve ideal spherical pellets, resulting in good flow characteristics. This data mining approach can be taken into consideration by industrial formulation scientists to support rational decision making in the field of pellets technology. PMID:25835791

  14. Tailored approach in inguinal hernia repair - decision tree based on the guidelines.

    PubMed

    Köckerling, Ferdinand; Schug-Pass, Christine

    2014-01-01

    The endoscopic procedures TEP and TAPP and the open techniques Lichtenstein, Plug and Patch, and PHS currently represent the gold standard in inguinal hernia repair recommended in the guidelines of the European Hernia Society, the International Endohernia Society, and the European Association of Endoscopic Surgery. Eighty-two percent of experienced hernia surgeons use the "tailored approach," the differentiated use of the several inguinal hernia repair techniques depending on the findings of the patient, trying to minimize the risks. The following differential therapeutic situations must be distinguished in inguinal hernia repair: unilateral in men, unilateral in women, bilateral, scrotal, after previous pelvic and lower abdominal surgery, no general anesthesia possible, recurrence, and emergency surgery. Evidence-based guidelines and consensus conferences of experts give recommendations for the best approach in the individual situation of a patient. This review tries to summarize the recommendations of the various guidelines and to transfer them into a practical decision tree for the daily work of surgeons performing inguinal hernia repair. PMID:25593944

  15. Semi-automatic classification of glaciovolcanic landforms: An object-based mapping approach based on geomorphometry

    NASA Astrophysics Data System (ADS)

    Pedersen, G. B. M.

    2016-02-01

    A new object-oriented approach is developed to classify glaciovolcanic landforms (Procedure A) and their landform elements boundaries (Procedure B). It utilizes the principle that glaciovolcanic edifices are geomorphometrically distinct from lava shields and plains (Pedersen and Grosse, 2014), and the approach is tested on data from Reykjanes Peninsula, Iceland. The outlined procedures utilize slope and profile curvature attribute maps (20 m/pixel) and the classified results are evaluated quantitatively through error matrix maps (Procedure A) and visual inspection (Procedure B). In procedure A, the highest obtained accuracy is 94.1%, but even simple mapping procedures provide good results (> 90% accuracy). Successful classification of glaciovolcanic landform element boundaries (Procedure B) is also achieved and this technique has the potential to delineate the transition from intraglacial to subaerial volcanic activity in orthographic view. This object-oriented approach based on geomorphometry overcomes issues with vegetation cover, which has been typically problematic for classification schemes utilizing spectral data. Furthermore, it handles complex edifice outlines well and is easily incorporated into a GIS environment, where results can be edited or fused with other mapping results. The approach outlined here is designed to map glaciovolcanic edifices within the Icelandic neovolcanic zone but may also be applied to similar subaerial or submarine volcanic settings, where steep volcanic edifices are surrounded by flat plains.

  16. A hybrid approach to crowd density estimation using statistical learning and texture classification

    NASA Astrophysics Data System (ADS)

    Li, Yin; Zhou, Bowen

    2013-12-01

    Crowd density estimation is a hot topic in the computer vision community. Established algorithms for crowd density estimation mainly focus on moving crowds, employing background modeling to obtain crowd blobs. However, people's motion is not obvious on most occasions, such as in the waiting hall of an airport or the lobby of a railway station. Moreover, conventional algorithms for crowd density estimation cannot yield desirable results for all levels of crowding due to occlusion and clutter. We propose a hybrid method to address the aforementioned problems. First, statistical learning is introduced for background subtraction, which comprises a training phase and a test phase. The crowd images are gridded into small blocks which denote foreground or background. Then HOG features are extracted and are fed into a binary SVM for each block. Hence, crowd blobs can be obtained from the classification results of the trained classifier. Second, the crowd images are treated as texture images. Therefore, the estimation problem can be formulated as texture classification. The density level can be derived according to the classification results. We validate the proposed algorithm on some real scenarios where the crowd motion is not so obvious. Experimental results demonstrate that our approach can obtain the foreground crowd blobs accurately and work well for different levels of crowding.

  17. A multi-label, semi-supervised classification approach applied to personality prediction in social media.

    PubMed

    Lima, Ana Carolina E S; de Castro, Leandro Nunes

    2014-10-01

    Social media allow web users to create and share content pertaining to different subjects, exposing their activities, opinions, feelings and thoughts. In this context, online social media has attracted the interest of data scientists seeking to understand behaviours and trends, whilst collecting statistics for social sites. One potential application for these data is personality prediction, which aims to understand a user's behaviour within social media. Traditional personality prediction relies on users' profiles, their status updates, the messages they post, etc. Here, a personality prediction system for social media data is introduced that differs from most approaches in the literature, in that it works with groups of texts, instead of single texts, and does not take users' profiles into account. Also, the proposed approach extracts meta-attributes from texts and does not work directly with the content of the messages. The set of possible personality traits is taken from the Big Five model and allows the problem to be characterised as a multi-label classification task. The problem is then transformed into a set of five binary classification problems and solved by means of a semi-supervised learning approach, due to the difficulty in annotating the massive amounts of data generated in social media. In our implementation, the proposed system was trained with three well-known machine-learning algorithms, namely a Naïve Bayes classifier, a Support Vector Machine, and a Multilayer Perceptron neural network. The system was applied to predict the personality of Tweets taken from three datasets available in the literature, and resulted in an approximately 83% accurate prediction, with some of the personality traits presenting better individual classification rates than others. PMID:24969690
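
    The transformation of the multi-label task into five binary problems, as described in the abstract above, can be sketched as follows. This is a minimal illustration only: the trait ordering, the function name `to_binary_problems`, and the two sample feature vectors are assumptions made for the example, not details taken from the paper.

```python
# Big Five trait names; the ordering is an illustrative assumption.
TRAITS = [
    "openness",
    "conscientiousness",
    "extraversion",
    "agreeableness",
    "neuroticism",
]


def to_binary_problems(samples):
    """Split multi-label samples into one binary dataset per trait.

    `samples` is a list of (features, set_of_traits) pairs.
    Returns a dict mapping each trait to a list of (features, 0 or 1) pairs.
    """
    problems = {trait: [] for trait in TRAITS}
    for features, labels in samples:
        for trait in TRAITS:
            problems[trait].append((features, 1 if trait in labels else 0))
    return problems


# Hypothetical meta-attribute vectors for two groups of texts.
samples = [
    ([0.2, 0.7], {"openness", "extraversion"}),
    ([0.9, 0.1], {"neuroticism"}),
]
problems = to_binary_problems(samples)
```

    Each of the five resulting datasets can then be handed to any binary classifier independently; the semi-supervised training step itself is not shown here.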

  18. Proposition of novel classification approach and features for improved real-time arrhythmia monitoring.

    PubMed

    Kim, Yoon Jae; Heo, Jeong; Park, Kwang Suk; Kim, Sungwan

    2016-08-01

    Arrhythmia refers to a group of conditions in which the heartbeat is irregular, fast, or slow due to abnormal electrical activity in the heart. Some types of arrhythmia such as ventricular fibrillation may result in cardiac arrest or death. Thus, arrhythmia detection becomes an important issue, and various studies have been conducted. Additionally, an arrhythmia detection algorithm for portable devices such as mobile phones has recently been developed because of increasing interest in e-health care. This paper proposes a novel classification approach and features, which are validated for improved real-time arrhythmia monitoring. The classification approach that was employed for arrhythmia detection is based on the concept of ensemble learning and the Taguchi method and has the advantage of being accurate and computationally efficient. The electrocardiography (ECG) data for arrhythmia detection was obtained from the MIT-BIH Arrhythmia Database (n=48). A novel feature, namely the heart rate variability calculated from 5s segments of ECG, which was not considered previously, was used. The novel classification approach and feature demonstrated arrhythmia detection accuracy of 89.13%. When the same data was classified using the conventional support vector machine (SVM), the obtained accuracy was 91.69%, 88.14%, and 88.74% for Gaussian, linear, and polynomial kernels, respectively. In terms of computation time, the proposed classifier was 5821.7 times faster than conventional SVM. In conclusion, the proposed classifier and feature showed performance comparable to those of previous studies, while the computational complexity and update interval were highly reduced. PMID:27318329

  19. Cloud field classification based upon high spatial resolution textural features. II - Simplified vector approaches

    NASA Technical Reports Server (NTRS)

    Chen, D. W.; Sengupta, S. K.; Welch, R. M.

    1989-01-01

    This paper compares the results of cloud-field classification derived from two simplified vector approaches, the Sum and Difference Histogram (SADH) and the Gray Level Difference Vector (GLDV), with the results produced by the Gray Level Cooccurrence Matrix (GLCM) approach described by Welch et al. (1988). It is shown that the SADH method produces accuracies equivalent to those obtained using the GLCM method, while the GLDV method fails to resolve error clusters. Compared to the GLCM method, the SADH method leads to a 31 percent saving in run time and a 50 percent saving in storage requirements, while the GLDV approach leads to a 40 percent saving in run time and an 87 percent saving in storage requirements.
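
    To illustrate why the simplified vector approaches need less storage than a cooccurrence matrix, the sketch below builds sum and difference histograms and a gray-level difference vector for a tiny image. It uses the standard textbook definitions (sums and differences of gray levels over pixel pairs at a fixed displacement); the function names, the displacement, and the toy image are assumptions for illustration and are not taken from the paper.

```python
from collections import Counter


def sum_diff_histograms(image, dx=1, dy=0):
    """Return (sum_hist, diff_hist) over pixel pairs separated by (dx, dy)."""
    rows, cols = len(image), len(image[0])
    sums, diffs = Counter(), Counter()
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                a, b = image[r][c], image[r2][c2]
                sums[a + b] += 1   # histogram of gray-level sums
                diffs[a - b] += 1  # histogram of signed gray-level differences
    return sums, diffs


def gray_level_difference_vector(diff_hist):
    """Fold the signed difference histogram into absolute differences."""
    gldv = Counter()
    for d, n in diff_hist.items():
        gldv[abs(d)] += n
    return gldv


# Toy 2x3 image with gray levels 0..3 (hypothetical data).
image = [
    [0, 1, 1],
    [2, 2, 3],
]
sums, diffs = sum_diff_histograms(image)
gldv = gray_level_difference_vector(diffs)
```

    For N gray levels, the two histograms hold on the order of 2N entries each and the difference vector N entries, whereas a cooccurrence matrix holds N x N entries.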

  20. A tri-fold hybrid classification approach for diagnostics with unexampled faulty states

    NASA Astrophysics Data System (ADS)

    Tamilselvan, Prasanna; Wang, Pingfeng

    2015-01-01

    System health diagnostics provides diversified benefits such as improved safety, improved reliability and reduced costs for the operation and maintenance of engineered systems. Successful health diagnostics requires the knowledge of system failures. However, with increasing system complexity, it is extraordinarily difficult to have a well-tested system so that all potential faulty states can be realized and studied at the product testing stage. Thus, real-time health diagnostics requires automatic detection of unexampled system faulty states based upon sensory data to avoid sudden catastrophic system failures. This paper presents a trifold hybrid classification (THC) approach for structural health diagnosis with unexampled health states (UHS), which comprises preliminary UHS identification using a new thresholded Mahalanobis distance (TMD) classifier, UHS diagnostics using a two-class support vector machine (SVM) classifier, and exampled health states diagnostics using a multi-class SVM classifier. The proposed THC approach, which takes advantage of both TMD and SVM-based classification techniques, is able to identify and isolate the unexampled faulty states through interactively detecting the deviation of sensory data from the exampled health states and forming new ones autonomously. The proposed THC approach is further extended to a generic framework for health diagnostics problems with unexampled faulty states and demonstrated with health diagnostics case studies for power transformers and rolling bearings.
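
    One plausible reading of the thresholded Mahalanobis distance step is sketched below: a sample is flagged as a candidate unexampled state when its Mahalanobis distance to every exampled class exceeds a threshold. This is an assumption about how such a classifier could work, not the authors' implementation; the two-dimensional restriction, the function names, the threshold value, and the training points are all hypothetical.

```python
import math


def mean_and_cov_2d(points):
    """Sample mean and covariance (sxx, sxy, syy) of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def mahalanobis_2d(x, mean, cov):
    """Mahalanobis distance using the closed-form inverse of a 2x2 covariance."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    q = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(q)


def is_unexampled(x, classes, threshold):
    """True if x is farther than `threshold` from every exampled class."""
    return all(mahalanobis_2d(x, m, c) > threshold for m, c in classes)


# Hypothetical sensor readings for a single exampled (healthy) state.
healthy = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
classes = [mean_and_cov_2d(healthy)]
far_sample_flag = is_unexampled((5.0, 5.0), classes, threshold=3.0)
near_sample_flag = is_unexampled((0.6, 0.4), classes, threshold=3.0)
```

    In this sketch the distant reading is flagged and the nearby one is not; in the approach described above, flagged samples would then be passed on to the SVM stages.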

  1. Membrane positioning for high- and low-resolution protein structures through a binary classification approach.

    PubMed

    Postic, Guillaume; Ghouzam, Yassine; Guiraud, Vincent; Gelly, Jean-Christophe

    2016-03-01

    The critical importance of algorithms for orienting proteins in the lipid bilayer stems from the extreme difficulty in obtaining experimental data about the membrane boundaries. Here, we present a computational method for positioning protein structures in the membrane, based on the sole alpha carbon coordinates and, therefore, compatible with both high and low structural resolutions. Our algorithm follows a new and simple approach, by treating the membrane assignment problem as a binary classification. Compared with the state-of-the-art algorithms, our method achieves similar accuracy, while being faster. Finally, our open-source software is also capable of processing coarse-grained models of protein structures. PMID:26685702

  2. A novel approach of mining strong jumping emerging patterns based on BSC-tree

    NASA Astrophysics Data System (ADS)

    Liu, Quanzhong; Shi, Peng; Hu, Zhengguo; Zhang, Yang

    2014-03-01

    It is a great challenge to discover strong jumping emerging patterns (SJEPs) from a high-dimensional dataset because of the huge pattern space. In this article, we propose a dynamically growing contrast pattern tree (DGCP-tree) structure to store grown patterns and their path codes arrays with 1-bit counts, which are from the constructed bit string compression tree. A method of mining SJEPs based on DGCP-tree is developed. In order to reduce the pattern search space, we introduce a novel pattern pruning method, which dramatically reduces non-minimal jumping emerging patterns (JEPs) during the mining process. Experiments are performed on three real cancer datasets and three datasets from the University of California, Irvine machine-learning repository. Compared with the well-known CP-tree method, the results show that the proposed method is substantially faster, able to handle higher-dimensional datasets and to prune more non-minimal JEPs.

  3. A renormalization approach to a class of exponential random processes with application to the bronchial tree

    NASA Astrophysics Data System (ADS)

    Ovidiu Vlad, Marcel

    1992-07-01

    A new stochastic renormalization approach for multi-step decay phenomena is developed. A simple physical interpretation of the renormalization method is suggested. It consists in grouping successions of variable numbers of decay events into blocks. The law of probability multiplication leads to an exponential random process $Y = Y_0^{X_1 + X_2 + \cdots}$, where both the exponents $X_1, X_2, \ldots$ and the basis $Y_0$ are random variables. The physically consistent solution of the model corresponds to a “super-strong” renormalization regime. The dependence of the moments of the decay parameter on the number of decay steps $q$ can be exactly determined. It consists in a linear superposition of inverse power laws in $q$ modulated by periodic functions in $\ln q$, having different periods. The theory is applied to the renormalization of the bronchial tree. Our computation shows that the lung structure is very tolerant to fluctuations. This result supports the mechanism of morphogenesis of fractal biological organs suggested by West (Ann. Biomed. Eng. 18 (1990) 135). The possibilities of application to the scattering phenomena in one-dimensional disordered media are also investigated.

  4. Development of a Decision Support Tree Approach for Mapping Urban Vegetation Cover From Hyperspectral Imagery and GIS: the case of Athens, Greece

    NASA Astrophysics Data System (ADS)

    Georgopoulou, Iro; Petropoulos, George P.; Kalivas, Dionissios P.

    2013-04-01

    Urban vegetation represents one of the main factors directly influencing human life. Consequently, extracting information on its spatial distribution is of crucial importance to ensure, among other things, sustainable urban planning and successful environmental management. To this end, remote sensing and Geographical Information Systems (GIS) technology has proved to be a very promising, viable solution. In comparison to multispectral systems, the use of hyperspectral imagery in particular dramatically enhances our ability to accurately identify different targets on the Earth's surface. In our study, a decision tree-based classification method is presented for mapping urban vegetation cover from hyperspectral imagery. The ability of the proposed method is demonstrated using as a case study the city of Athens, Greece, for which satellite hyperspectral imagery from the Hyperion sensor has been acquired. Hyperion collects spectral data in 242 spectral bands from the visible to middle-infrared regions of the electromagnetic spectrum and at a spatial resolution of 30 meters. Validation of our proposed method is carried out in a GIS environment based on the error matrix statistics, using as reference very high resolution imagery acquired nearly concurrently with Hyperion over our study region, supported by field visits conducted in the studied area. Additionally, the urban vegetation cover maps derived from our proposed technique are compared with analogous results obtained from other classification methods traditionally used in mapping urban vegetation cover. Our results confirmed the ability of our approach combined with Hyperion imagery to extract urban vegetation cover for the case of a densely-populated city with complex urban features, such as Athens. Our findings can potentially offer significant information at local scale as regards the presence of open green spaces in urban environment, since such information is vital for the successful infrastructure development, urban

  5. Repeated measurements of blood lactate concentration as a prognostic marker in horses with acute colitis evaluated with classification and regression trees (CART) and random forest analysis.

    PubMed

    Petersen, M B; Tolver, A; Husted, L; Tølbøll, T H; Pihl, T H

    2016-07-01

    The objective of this study was to investigate the prognostic value of single and repeated measurements of blood l-lactate (Lac) and ionised calcium (iCa) concentrations, packed cell volume (PCV) and plasma total protein (TP) concentration in horses with acute colitis. A total of 66 adult horses admitted with acute colitis (<24 h) to a referral hospital in the 2002-2011 period were included. The prognostic value of Lac, iCa, PCV and TP recorded at admission and 6 h post admission was analysed with univariate analysis, logistic regression, classification and regression trees, as well as random forest analysis. Ponies and Icelandic horses made up 59% of the population, whilst the remaining 41% were horses. Blood lactate concentration at admission was the only individual parameter significantly associated with probability of survival to discharge (P < 0.001). In a training sample, a Lac cut-off value of 7 mmol/L had a sensitivity of 0.66 and a specificity of 0.92 in predicting survival. In independent test data, the sensitivity was 0.69 and the specificity was 0.76. At the observed survival rate (38%), the optimal decision tree identified horses as non-survivors when the Lac at admission was ≥4.3 mmol/L and the Lac 6 h post admission stayed at >2 mmol/L (sensitivity, 0.72; specificity, 0.8). In conclusion, blood lactate concentration measured at admission and repeated 6 h later aided the prognostic evaluation of horses with acute colitis in this population with a very high mortality rate. This should allow clinicians to give a more reliable prognosis for the horse. PMID:27240909
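
    The two-measurement rule reported in the abstract above (admission Lac ≥4.3 mmol/L and 6 h Lac >2 mmol/L) can be written out directly, together with the usual sensitivity and specificity calculation. This is a sketch under stated assumptions: the records below are invented for illustration and are not study data, and treating "non-survivor" as the positive class is an assumption about the abstract's convention.

```python
def predict_non_survivor(lac_admission, lac_6h):
    """Rule from the abstract: admission Lac >= 4.3 mmol/L and 6 h Lac > 2 mmol/L."""
    return lac_admission >= 4.3 and lac_6h > 2.0


def sensitivity_specificity(records):
    """records: (lac_admission, lac_6h, died) tuples; positive class = non-survivor."""
    tp = fn = tn = fp = 0
    for lac0, lac6, died in records:
        predicted = predict_non_survivor(lac0, lac6)
        if died and predicted:
            tp += 1
        elif died and not predicted:
            fn += 1
        elif not died and predicted:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical horses: (Lac at admission, Lac at 6 h, died before discharge).
records = [
    (8.1, 5.0, True),
    (5.0, 1.5, False),
    (3.0, 2.5, False),
    (6.2, 3.1, True),
    (4.5, 2.4, False),
    (3.9, 3.0, True),
]
sensitivity, specificity = sensitivity_specificity(records)
```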

  6. Selection bias in species distribution models: An econometric approach on forest trees based on structural modeling

    NASA Astrophysics Data System (ADS)

    Martin-StPaul, N. K.; Ay, J. S.; Guillemot, J.; Doyen, L.; Leadley, P.

    2014-12-01

    Species distribution models (SDMs) are widely used to study and predict the outcome of global changes on species. In human dominated ecosystems the presence of a given species is the result of both its ecological suitability and human footprint on nature such as land use choices. Land use choices may thus be responsible for a selection bias in the presence/absence data used in SDM calibration. We present a structural modelling approach (i.e. based on structural equation modelling) that accounts for this selection bias. The new structural species distribution model (SSDM) estimates simultaneously land use choices and species responses to bioclimatic variables. A land use equation based on an econometric model of landowner choices was joined to an equation of species response to bioclimatic variables. SSDM allows the residuals of both equations to be dependent, taking into account the possibility of shared omitted variables and measurement errors. We provide a general description of the statistical theory and a set of applications on forest trees over France using databases of climate and forest inventory at different spatial resolution (from 2km to 8km). We also compared the outputs of the SSDM with outputs of a classical SDM (i.e. Biomod ensemble modelling) in terms of bioclimatic response curves and potential distributions under current climate and climate change scenarios. The shapes of the bioclimatic response curves and the modelled species distribution maps differed markedly between SSDM and classical SDMs, with contrasted patterns according to species and spatial resolutions. The magnitude and directions of these differences were dependent on the correlations between the errors from both equations and were highest for higher spatial resolutions. A first conclusion is that the use of classical SDMs can potentially lead to strong misestimation of the actual and future probability of presence modelled. Beyond this selection bias, the SSDM we propose represents

  7. Classification of non native tree species in Adda Park (Italy) through multispectral and multitemporal surveys from UAV

    NASA Astrophysics Data System (ADS)

    Pinto, Livio; Sona, Giovanna; Biffi, Andrea; Dosso, Paolo; Passoni, Daniele; Baracani, Matteo

    2014-05-01

    July, was realized over a longer period, from 09/07/2013 to 28/08/2013, due to weather conditions and technical reasons. In any case the vegetation characteristics remained unchanged. The second set of flights, in autumn, was done in a shorter period, during the days 16-17-18 October 2013, thus obtaining even better homogeneity of the vegetation conditions. Image and data processing are based on standard classification techniques, both pixel and object based, applied simultaneously on multispectral and multitemporal data, with the aim of producing a thematic map of the species of interest. The classification accuracies will be computed on the basis of ground truth comparison, to study possible misclassification among species.

  8. Towards global empirical upscaling of FLUXNET eddy covariance observations: validation of a model tree ensemble approach using a biosphere model

    NASA Astrophysics Data System (ADS)

    Jung, M.; Reichstein, M.; Bondeau, A.

    2009-10-01

    Global, spatially and temporally explicit estimates of carbon and water fluxes derived from empirical up-scaling eddy covariance measurements would constitute a new and possibly powerful data stream to study the variability of the global terrestrial carbon and water cycle. This paper introduces and validates a machine learning approach dedicated to the upscaling of observations from the current global network of eddy covariance towers (FLUXNET). We present a new model TRee Induction ALgorithm (TRIAL) that performs hierarchical stratification of the data set into units where particular multiple regressions for a target variable hold. We propose an ensemble approach (Evolving tRees with RandOm gRowth, ERROR) where the base learning algorithm is perturbed in order to gain a diverse sequence of different model trees which evolves over time. We evaluate the efficiency of the model tree ensemble (MTE) approach using an artificial data set derived from the Lund-Potsdam-Jena managed Land (LPJmL) biosphere model. We aim at reproducing global monthly gross primary production as simulated by LPJmL from 1998-2005 using only locations and months where high quality FLUXNET data exist for the training of the model trees. The model trees are trained with the LPJmL land cover and meteorological input data, climate data, and the fraction of absorbed photosynthetic active radiation simulated by LPJmL. Given that we know the "true result" in the form of global LPJmL simulations we can effectively study the performance of the MTE upscaling and associated problems of extrapolation capacity. We show that MTE is able to explain 92% of the variability of the global LPJmL GPP simulations. The mean spatial pattern and the seasonal variability of GPP that constitute the largest sources of variance are very well reproduced (96% and 94% of variance explained respectively) while the monthly interannual anomalies which occupy much less variance are less well matched (41% of variance explained

  9. Neuropsychological assessment of individuals with brain tumor: comparison of approaches used in the classification of impairment.

    PubMed

    Dwan, Toni Maree; Ownsworth, Tamara; Chambers, Suzanne; Walker, David G; Shum, David H K

    2015-01-01

    Approaches to classifying neuropsychological impairment after brain tumor vary according to testing level (individual tests, domains, or global index) and source of reference (i.e., norms, controls, and pre-morbid functioning). This study aimed to compare rates of impairment according to different classification approaches. Participants were 44 individuals (57% female) with a primary brain tumor diagnosis (mean age = 45.6 years) and 44 matched control participants (59% female, mean age = 44.5 years). All participants completed a test battery that assesses pre-morbid IQ (Wechsler adult reading test), attention/processing speed (digit span, trail making test A), memory (Hopkins verbal learning test-revised, Rey-Osterrieth complex figure-recall), and executive function (trail making test B, Rey-Osterrieth complex figure copy, controlled oral word association test). Results indicated that across the different sources of reference, 86-93% of participants were classified as impaired at a test-specific level, 61-73% were classified as impaired at a domain-specific level, and 32-50% were classified as impaired at a global level. Rates of impairment did not significantly differ according to source of reference (p > 0.05); however, at the individual participant level, classification based on estimated pre-morbid IQ was often inconsistent with classification based on the norms or controls. Participants with brain tumor performed significantly poorer than matched controls on tests of neuropsychological functioning, including executive function (p = 0.001) and memory (p < 0.001), but not attention/processing speed (p > 0.05). These results highlight the need to examine individuals' performance across a multi-faceted neuropsychological test battery to avoid over- or under-estimation of impairment. PMID:25815271

  10. An Introduction to Recursive Partitioning: Rationale, Application and Characteristics of Classification and Regression Trees, Bagging and Random Forests

    PubMed Central

    Strobl, Carolin; Malley, James; Tutz, Gerhard

    2010-01-01

    Recursive partitioning methods have become popular and widely used tools for non-parametric regression and classification in many scientific fields. Especially random forests, that can deal with large numbers of predictor variables even in the presence of complex interactions, have been applied successfully in genetics, clinical medicine and bioinformatics within the past few years. High dimensional problems are common not only in genetics, but also in some areas of psychological research, where only few subjects can be measured due to time or cost constraints, yet a large amount of data is generated for each subject. Random forests have been shown to achieve a high prediction accuracy in such applications, and provide descriptive variable importance measures reflecting the impact of each variable in both main effects and interactions. The aim of this work is to introduce the principles of the standard recursive partitioning methods as well as recent methodological improvements, to illustrate their usage for low and high dimensional data exploration, but also to point out limitations of the methods and potential pitfalls in their practical application. Application of the methods is illustrated using freely available implementations in the R system for statistical computing. PMID:19968396
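
    The core step of recursive partitioning, choosing the split of a predictor that makes the resulting groups as pure as possible, can be sketched for a single numeric predictor as below. The sketch uses Gini impurity, one common splitting criterion for classification trees; the function names and the toy data are assumptions for illustration, and the paper's own examples use R implementations rather than this code.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 for a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split(xs, ys):
    """Return (threshold, weighted_impurity) of the best split x <= threshold.

    Candidate thresholds are midpoints between consecutive distinct values.
    Returns (None, parent_impurity) if no split improves on the parent node.
    """
    best = (None, gini(ys))
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if score < best[1]:
            best = (t, score)
    return best


# Toy data: one predictor, two classes (hypothetical).
xs = [1.0, 2.0, 3.0, 4.0]
ys = ["a", "a", "b", "b"]
threshold, impurity = best_split(xs, ys)
```

    A tree is grown by applying the same search recursively to the left and right subsets; bagging and random forests repeat this on resampled data, with random forests additionally restricting the predictors considered at each split.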

  11. Detection and classification of interstitial lung diseases and emphysema using a joint morphological-fuzzy approach

    NASA Astrophysics Data System (ADS)

    Chang Chien, Kuang-Che; Fetita, Catalin; Brillet, Pierre-Yves; Prêteux, Françoise; Chang, Ruey-Feng

    2009-02-01

    Multi-detector computed tomography (MDCT) has high accuracy and specificity in volumetrically capturing serial images of the lung. It increases the capability of computerized classification for lung tissue in medical research. This paper proposes a three-dimensional (3D) automated approach based on mathematical morphology and fuzzy logic for quantifying and classifying interstitial lung diseases (ILDs) and emphysema. The proposed methodology is composed of several stages: (1) an image multi-resolution decomposition scheme based on a 3D morphological filter is used to detect and analyze the different density patterns of the lung texture. Then, (2) for each pattern in the multi-resolution decomposition, six features are computed, for which fuzzy membership functions define a probability of association with a pathology class. Finally, (3) for each pathology class, the probabilities are combined according to the weight assigned to each membership function and two threshold values are used to decide the final class of the pattern. The proposed approach was tested on 10 MDCT cases and the classification accuracy was: emphysema: 95%, fibrosis/honeycombing: 84% and ground glass: 97%.

  12. Semantic classification of urban buildings combining VHR image and GIS data: An improved random forest approach

    NASA Astrophysics Data System (ADS)

    Du, Shihong; Zhang, Fangli; Zhang, Xiuyuan

    2015-07-01

    While most existing studies have focused on extracting geometric information on buildings, only a few have concentrated on semantic information. The lack of semantic information cannot satisfy many demands on resolving environmental and social issues. This study presents an approach to semantically classify buildings into much finer categories than those of existing studies by learning random forest (RF) classifier from a large number of imbalanced samples with high-dimensional features. First, a two-level segmentation mechanism combining GIS and VHR image produces single image objects at a large scale and intra-object components at a small scale. Second, a semi-supervised method chooses a large number of unbiased samples by considering the spatial proximity and intra-cluster similarity of buildings. Third, two important improvements in RF classifier are made: a voting-distribution ranked rule for reducing the influences of imbalanced samples on classification accuracy and a feature importance measurement for evaluating each feature's contribution to the recognition of each category. Fourth, the semantic classification of urban buildings is practically conducted in Beijing city, and the results demonstrate that the proposed approach is effective and accurate. The seven categories used in the study are finer than those in existing work and more helpful to studying many environmental and social problems.

  13. Model-based approach to the detection and classification of mines in sidescan sonar.

    PubMed

    Reed, Scott; Petillot, Yvan; Bell, Judith

    2004-01-10

    This paper presents a model-based approach to mine detection and classification by use of sidescan sonar. Advances in autonomous underwater vehicle technology have increased the interest in automatic target recognition systems in an effort to automate a process that is currently carried out by a human operator. Current automated systems generally require training and thus produce poor results when the test data set is different from the training set. This has led to research into unsupervised systems, which are able to cope with the large variability in conditions and terrains seen in sidescan imagery. The system presented in this paper first detects possible minelike objects using a Markov random field model, which operates well on noisy images, such as sidescan, and allows a priori information to be included through the use of priors. The highlight and shadow regions of the object are then extracted with a cooperating statistical snake, which assumes these regions are statistically separate from the background. Finally, a classification decision is made using Dempster-Shafer theory, where the extracted features are compared with synthetic realizations generated with a sidescan sonar simulator model. Results for the entire process are shown on real sidescan sonar data. Similarities between the sidescan sonar and synthetic aperture radar (SAR) imaging processes ensure that the approach outlined here could be applied to SAR image analysis. PMID:14735943
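The evidence-fusion step in Dempster-Shafer theory can be illustrated with Dempster's rule of combination. The sketch below shows only the rule itself. The two sources, their mass values, and the hypothesis names are assumptions made up for illustration, and the abstract does not say how the paper derives masses from the comparison with simulated sonar images.

```python
def combine(m1, m2):
    """Dempster's rule of combination.

    m1 and m2 map frozenset hypotheses to masses that each sum to 1.
    Mass assigned to empty intersections (conflict) is discarded and the
    remaining mass is renormalised.
    """
    joint = {}
    conflict = 0.0
    for b, mass_b in m1.items():
        for c, mass_c in m2.items():
            intersection = b & c
            if intersection:
                joint[intersection] = joint.get(intersection, 0.0) + mass_b * mass_c
            else:
                conflict += mass_b * mass_c
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; rule is undefined")
    return {a: mass / (1.0 - conflict) for a, mass in joint.items()}


# Hypothetical frame of discernment and mass assignments.
MINE = frozenset({"mine"})
CLUTTER = frozenset({"clutter"})
EITHER = MINE | CLUTTER  # ignorance: mass not committed to one hypothesis

shadow_evidence = {MINE: 0.6, EITHER: 0.4}
highlight_evidence = {MINE: 0.5, CLUTTER: 0.3, EITHER: 0.2}

fused = combine(shadow_evidence, highlight_evidence)
```

Keeping mass on `EITHER` lets a weak source express ignorance instead of being forced to split its belief between the two hypotheses.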

  14. Model-based approach to the detection and classification of mines in sidescan sonar

    NASA Astrophysics Data System (ADS)

    Reed, Scott; Petillot, Yvan; Bell, Judith

    2004-01-01

    This paper presents a model-based approach to mine detection and classification by use of sidescan sonar. Advances in autonomous underwater vehicle technology have increased the interest in automatic target recognition systems in an effort to automate a process that is currently carried out by a human operator. Current automated systems generally require training and thus produce poor results when the test data set is different from the training set. This has led to research into unsupervised systems, which are able to cope with the large variability in conditions and terrains seen in sidescan imagery. The system presented in this paper first detects possible minelike objects using a Markov random field model, which operates well on noisy images, such as sidescan, and allows a priori information to be included through the use of priors. The highlight and shadow regions of the object are then extracted with a cooperating statistical snake, which assumes these regions are statistically separate from the background. Finally, a classification decision is made using Dempster-Shafer theory, where the extracted features are compared with synthetic realizations generated with a sidescan sonar simulator model. Results for the entire process are shown on real sidescan sonar data. Similarities between the sidescan sonar and synthetic aperture radar (SAR) imaging processes ensure that the approach outlined here could be applied to SAR image analysis.

  15. Texture classification of anatomical structures in CT using a context-free machine learning approach

    NASA Astrophysics Data System (ADS)

    Jiménez del Toro, Oscar A.; Foncubierta-Rodríguez, Antonio; Depeursinge, Adrien; Müller, Henning

    2015-03-01

    Medical images contain a large amount of visual information about structures and anomalies in the human body. To make sense of this information, human interpretation is often essential. On the other hand, computer-based approaches can exploit information contained in the images by numerically measuring and quantifying specific visual features. Annotation of organs and other anatomical regions is an important step before computing numerical features on medical images. In this paper, a texture-based organ classification algorithm is presented, which can be used to reduce the time required for annotating medical images. The texture of organs is analyzed using a combination of state-of-the-art techniques: the Riesz transform and a bag of meaningful visual words. The effect of a meaningfulness transformation in the visual word space yields two important advantages that can be seen in the results. The number of descriptors is enormously reduced down to 10% of the original size, whereas classification accuracy is improved by up to 25% with respect to the baseline approach.

  16. A probabilistic approach to segmentation and classification of neoplasia in uterine cervix images using color and geometric features

    NASA Astrophysics Data System (ADS)

    Srinivasan, Yeshwanth; Hernes, Dana; Tulpule, Bhakti; Yang, Shuyu; Guo, Jiangling; Mitra, Sunanda; Yagneswaran, Sriraja; Nutter, Brian; Jeronimo, Jose; Phillips, Benny; Long, Rodney; Ferris, Daron

    2005-04-01

    Automated segmentation and classification of diagnostic markers in medical imagery are challenging tasks. Numerous algorithms for segmentation and classification based on statistical approaches of varying complexity are found in the literature. However, the design of an efficient and automated algorithm for precise classification of desired diagnostic markers is extremely image-specific. The National Library of Medicine (NLM), in collaboration with the National Cancer Institute (NCI), is creating an archive of 60,000 digitized color images of the uterine cervix. NLM is developing tools for the analysis and dissemination of these images over the Web for the study of visual features correlated with precancerous neoplasia and cancer. To enable indexing of images of the cervix, it is essential to develop algorithms for the segmentation of regions of interest, such as acetowhitened regions, and automatic identification and classification of regions exhibiting mosaicism and punctation. Success of such algorithms depends, primarily, on the selection of relevant features representing the region of interest. We present color and geometric features based statistical classification and segmentation algorithms yielding excellent identification of the regions of interest. The distinct classification of the mosaic regions from the non-mosaic ones has been obtained by clustering multiple geometric and color features of the segmented sections using various morphological and statistical approaches. Such automated classification methodologies will facilitate content-based image retrieval from the digital archive of uterine cervix and have the potential of developing an image based screening tool for cervical cancer.

  17. A novel semi-supervised hyperspectral image classification approach based on spatial neighborhood information and classifier combination

    NASA Astrophysics Data System (ADS)

    Tan, Kun; Hu, Jun; Li, Jun; Du, Peijun

    2015-07-01

    In the process of semi-supervised hyperspectral image classification, spatial neighborhood information of training samples is widely applied to solve the small sample size problem. However, the neighborhood information of unlabeled samples is usually ignored. In this paper, we propose a new algorithm for hyperspectral image semi-supervised classification in which the spatial neighborhood information is combined with a classifier to enhance the classification ability in determining the class label of the selected unlabeled samples. There are two key points in this algorithm: (1) it is considered that the correct label should appear in the spatial neighborhood of unlabeled samples; (2) the combination of classifiers can obtain better results. Two classifiers, multinomial logistic regression (MLR) and k-nearest neighbor (KNN), are combined in the above way to further improve the performance. The performance of the proposed approach was assessed with two real hyperspectral data sets, and the obtained results indicate that the proposed approach is effective for hyperspectral classification.

  18. A case-comparison study of automatic document classification utilizing both serial and parallel approaches

    NASA Astrophysics Data System (ADS)

    Wilges, B.; Bastos, R. C.; Mateus, G. P.; Dantas, M. A. R.

    2014-10-01

    A well-known problem faced by any organization nowadays is the high volume of data that is available and the required process to transform this volume into differential information. In this study, a case-comparison study of an automatic document classification (ADC) approach is presented, utilizing both serial and parallel paradigms. The serial approach was implemented by adopting the RapidMiner software tool, which is recognized as the world-leading open-source system for data mining. On the other hand, considering the MapReduce programming model, the Hadoop software environment has been used. The main goal of this case-comparison study is to exploit differences between these two paradigms, especially when large volumes of data such as Web text documents are utilized to build a category database. In the literature, many studies point out that distributed processing of unstructured documents has been yielding efficient results when utilizing Hadoop. Results from our research indicate a threshold to such efficiency.

  19. Land cover classification of Landsat 8 satellite data based on Fuzzy Logic approach

    NASA Astrophysics Data System (ADS)

    Taufik, Afirah; Sakinah Syed Ahmad, Sharifah

    2016-06-01

    The aim of this paper is to propose a method to classify the land covers of a satellite image based on a fuzzy rule-based system approach. The study uses bands in Landsat 8 and other indices, such as the Normalized Difference Water Index (NDWI), Normalized Difference Built-up Index (NDBI) and Normalized Difference Vegetation Index (NDVI), as input for the fuzzy inference system. The three selected indices represent our three main classes, called water, built-up land, and vegetation. The combination of the original multispectral bands and selected indices provides more information about the image. The parameter selection of fuzzy membership is performed by using a supervised method known as ANFIS (adaptive neuro-fuzzy inference system) training. The fuzzy system is tested for the classification on the land cover image that covers the Klang Valley area. The results showed that the fuzzy system approach is effective and can be explored and implemented for other areas of Landsat data.
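The three indices named above are all normalized band differences, which the following sketch computes for a single pixel. The abstract does not give the formulas, so two assumptions are made here: NDWI is taken in its green/near-infrared form (a SWIR-based variant also exists), and the inputs are surface reflectances from Landsat 8 OLI bands 3 (green), 4 (red), 5 (near-infrared) and 6 (shortwave infrared 1). The function names and pixel values are hypothetical.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or 0.0 when both inputs are zero."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0


def spectral_indices(green, red, nir, swir1):
    """Compute NDVI, NDWI and NDBI for one pixel from band reflectances."""
    return {
        "NDVI": normalized_difference(nir, red),     # high over vegetation
        "NDWI": normalized_difference(green, nir),   # high over open water
        "NDBI": normalized_difference(swir1, nir),   # high over built-up land
    }


# Hypothetical reflectances for a densely vegetated pixel.
idx = spectral_indices(green=0.08, red=0.05, nir=0.45, swir1=0.20)
```

Each index lies in [-1, 1], which makes the values convenient inputs for fuzzy membership functions whose parameters are then tuned by a method such as ANFIS.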

  20. PTrees: A point-based approach to forest tree extraction from lidar data

    NASA Astrophysics Data System (ADS)

    Vega, C.; Hamrouni, A.; El Mokhtari, S.; Morel, J.; Bock, J.; Renaud, J.-P.; Bouvier, M.; Durrieu, S.

    2014-12-01

    This paper introduces PTrees, a multi-scale dynamic point cloud segmentation dedicated to forest tree extraction from lidar point clouds. The method processes the point data using the raw elevation values (Z) and computes height (H = Z - ground elevation) during post-processing using an innovative procedure that preserves the geometry of crown points. Multiple segmentations are done at different scales. Segmentation criteria are then applied to dynamically select the best set of apices from the tree segments extracted at the various scales. The selected set of apices is then used to generate a final segmentation. PTrees has been tested in 3 different forest types, detecting 82% of the trees with a false detection rate under 10%. Future development will integrate crown profile estimation during the segmentation process in order to both maximize the detection of suppressed trees and minimize false detections.

  1. A Decision Tree Approach to the Interpretation of Multivariate Statistical Techniques.

    ERIC Educational Resources Information Center

    Fok, Lillian Y.; And Others

    1995-01-01

    Discusses the nature, power, and limitations of four multivariate techniques: factor analysis, multiple analysis of variance, multiple regression, and multiple discriminant analysis. Shows how decision trees assist in interpreting results. (SK)

  2. A novel approach to phylogenetic tree construction using stochastic optimization and clustering

    PubMed Central

    Qin, Ling; Chen, Yixin; Pan, Yi; Chen, Ling

    2006-01-01

    Background: The problem of inferring the evolutionary history and constructing the phylogenetic tree with high performance has become one of the major problems in computational biology. Results: A new phylogenetic tree construction method from a given set of objects (proteins, species, etc.) is presented. As an extension of ant colony optimization, this method proposes an adaptive phylogenetic clustering algorithm based on a digraph to find a tree structure that defines the ancestral relationships among the given objects. Conclusion: Our phylogenetic tree construction method is tested to compare its results with those of the genetic algorithm (GA). Experimental results show that our algorithm converges much faster and also achieves higher quality than GA. PMID:17217517

  3. Regression-Based Approach For Feature Selection In Classification Issues. Application To Breast Cancer Detection And Recurrence

    NASA Astrophysics Data System (ADS)

    Belciug, Smaranda; Serbanescu, Mircea-Sebastian

    2015-09-01

    Feature selection is considered a key factor in classification/decision problems. It is currently used in designing intelligent decision systems to choose the best features which allow the best performance. This paper proposes a regression-based approach to select the most important predictors to significantly increase the classification performance. Application to breast cancer detection and recurrence using publicly available datasets proved the efficiency of this technique.

  4. The Application of Classification and Regression Trees for the Triage of Women for Referral to Colposcopy and the Estimation of Risk for Cervical Intraepithelial Neoplasia: A Study Based on 1625 Cases with Incomplete Data from Molecular Tests

    PubMed Central

    Pouliakis, Abraham; Karakitsou, Efrossyni; Chrelias, Charalampos; Pappas, Asimakis; Panayiotides, Ioannis; Valasoulis, George; Kyrgiou, Maria; Paraskevaidis, Evangelos; Karakitsos, Petros

    2015-01-01

    Objective. Numerous ancillary techniques detecting HPV DNA and mRNA now compete with cytology; however, no perfect test exists. In this study we evaluated classification and regression trees (CARTs) for producing triage rules and estimating the risk of cervical intraepithelial neoplasia (CIN) in cases with ASCUS+ in cytology. Study Design. We used 1625 cases. In contrast to other approaches we included cases with missing data to increase the data volume, obtain more accurate results, and simulate real conditions in the everyday practice of gynecologic clinics and laboratories. The proposed CART was based on the cytological result, HPV DNA typing, HPV mRNA detection based on NASBA and flow cytometry, p16 immunocytochemical expression, and finally age and parous status. Results. Algorithms useful for the triage of women were produced; gynecologists could apply these in conjunction with available examination results and arrive at an estimate of the risk that a woman harbors CIN, expressed as a probability. Conclusions. The most important test was the cytological examination; however, the CART handled cases with inadequate cytological outcome and increased the diagnostic accuracy by exploiting the results of ancillary techniques even when those results were inadequate or missing. The CART performance was better than any other single test involved in this study. PMID:26339651

  5. Identifying changes in dissolved organic matter content and characteristics by fluorescence spectroscopy coupled with self-organizing map and classification and regression tree analysis during wastewater treatment.

    PubMed

    Yu, Huibin; Song, Yonghui; Liu, Ruixia; Pan, Hongwei; Xiang, Liancheng; Qian, Feng

    2014-10-01

    The stabilization of latent tracers of dissolved organic matter (DOM) of wastewater was analyzed by three-dimensional excitation-emission matrix (EEM) fluorescence spectroscopy coupled with self-organizing map and classification and regression tree analysis (CART) to assess wastewater treatment performance. DOM of water samples collected from primary sedimentation, anaerobic, anoxic, oxic and secondary sedimentation tanks in a large-scale wastewater treatment plant contained four fluorescence components: tryptophan-like (C1), tyrosine-like (C2), microbial humic-like (C3) and fulvic-like (C4) materials extracted by self-organizing map. These components showed good positive linear correlations with dissolved organic carbon of DOM. C1 and C2 were representative components in the wastewater, and they were removed to a greater extent than C3 and C4 in the treatment process. C2 was a latent parameter determined by CART to differentiate water samples of oxic and secondary sedimentation tanks from the successive treatment units, indirectly proving that most of the tyrosine-like material was degraded by anaerobic microorganisms. C1 was an accurate parameter to comprehensively separate the samples of the five treatment units from each other, indirectly indicating that tryptophan-like material was decomposed by anaerobic and aerobic bacteria. EEM fluorescence spectroscopy in combination with self-organizing map and CART analysis can be an effective nondestructive method for characterizing structural components of DOM fractions and monitoring organic matter removal in the wastewater treatment process. PMID:25065793

  6. Evaluating an ensemble classification approach for crop diversity verification in Danish greening subsidy control

    NASA Astrophysics Data System (ADS)

    Chellasamy, Menaka; Ferré, Ty Paul Andrew; Greve, Mogens Humlekrog

    2016-07-01

    Beginning in 2015, Danish farmers are obliged to meet specific crop diversification rules based on total land area and number of crops cultivated to be eligible for new greening subsidies. Hence, there is a need for the Danish government to extend their subsidy control system to verify farmers' declarations to warrant greening payments under the new crop diversification rules. Remote Sensing (RS) technology has been used since 1992 to control farmers' subsidies in Denmark. However, a proper RS-based approach is yet to be finalised to validate new crop diversity requirements designed for assessing compliance under the recent subsidy scheme (2014-2020). This study uses an ensemble classification approach (proposed by the authors in previous studies) for validating the crop diversity requirements of the new rules. The approach uses a neural network ensemble classification system with bi-temporal (spring and early summer) WorldView-2 imagery (WV2) and includes the following steps: (1) automatic computation of pixel-based prediction probabilities using multiple neural networks; (2) quantification of the classification uncertainty using Endorsement Theory (ET); (3) discrimination of crop pixels and validation of the crop diversification rules at farm level; and (4) identification of farmers who are violating the requirements for greening subsidies. The prediction probabilities are computed by a neural network ensemble supplied with training samples selected automatically using farmers' declared parcels (field vectors containing crop information and the field boundary of each crop). Crop discrimination is performed by considering a set of conclusions derived from individual neural networks based on ET. Verification of the diversification rules is performed by incorporating pixel-based classification uncertainty or confidence intervals with the class labels at the farmer level.
The proposed approach was tested with WV2 imagery acquired in 2011 for a study area in Vennebjerg

  7. Sensitivity of Bovine Tuberculosis Surveillance in Wildlife in France: A Scenario Tree Approach.

    PubMed

    Rivière, Julie; Le Strat, Yann; Dufour, Barbara; Hendrikx, Pascal

    2015-01-01

    Bovine tuberculosis (bTB) is a common disease in cattle and wildlife, with an impact on animal and human health, and economic implications. Infected wild animals have been detected in some European countries, and bTB reservoirs in wildlife have been identified, potentially hindering the eradication of bTB from cattle populations. However, the surveillance of bTB in wildlife involves several practical difficulties and is not currently covered by EU legislation. We report here the first assessment of the sensitivity of the bTB surveillance system for free-ranging wildlife launched in France in 2011 (the Sylvatub system), based on scenario tree modelling. Three surveillance system components were identified: (i) passive scanning surveillance for hunted wild boar, red deer and roe deer, based on carcass examination, (ii) passive surveillance on animals found dead, moribund or with abnormal behaviour, for wild boar, red deer, roe deer and badger and (iii) active surveillance for wild boar and badger. The application of these three surveillance system components depends on the geographic risk of bTB infection in wildlife, which in turn depends on the prevalence of bTB in cattle. We estimated the effectiveness of the three components of the Sylvatub surveillance system quantitatively, for each species separately. Active surveillance and passive scanning surveillance by carcass examination were the approaches most likely to detect at least one infected animal in a population with a given design prevalence, regardless of the local risk level and species considered. The awareness of hunters, which depends on their training and the geographic risk, was found to affect surveillance sensitivity. 
The results obtained are relevant for hunters and veterinary authorities wishing to determine the actual efficacy of wildlife bTB surveillance as a function of geographic area and species, and could provide support for decision-making processes concerning the enhancement of surveillance

  8. Sensitivity of Bovine Tuberculosis Surveillance in Wildlife in France: A Scenario Tree Approach

    PubMed Central

    Rivière, Julie

    2015-01-01

    Bovine tuberculosis (bTB) is a common disease in cattle and wildlife, with an impact on animal and human health, and economic implications. Infected wild animals have been detected in some European countries, and bTB reservoirs in wildlife have been identified, potentially hindering the eradication of bTB from cattle populations. However, the surveillance of bTB in wildlife involves several practical difficulties and is not currently covered by EU legislation. We report here the first assessment of the sensitivity of the bTB surveillance system for free-ranging wildlife launched in France in 2011 (the Sylvatub system), based on scenario tree modelling. Three surveillance system components were identified: (i) passive scanning surveillance for hunted wild boar, red deer and roe deer, based on carcass examination, (ii) passive surveillance on animals found dead, moribund or with abnormal behaviour, for wild boar, red deer, roe deer and badger and (iii) active surveillance for wild boar and badger. The application of these three surveillance system components depends on the geographic risk of bTB infection in wildlife, which in turn depends on the prevalence of bTB in cattle. We estimated the effectiveness of the three components of the Sylvatub surveillance system quantitatively, for each species separately. Active surveillance and passive scanning surveillance by carcass examination were the approaches most likely to detect at least one infected animal in a population with a given design prevalence, regardless of the local risk level and species considered. The awareness of hunters, which depends on their training and the geographic risk, was found to affect surveillance sensitivity. 
The results obtained are relevant for hunters and veterinary authorities wishing to determine the actual efficacy of wildlife bTB surveillance as a function of geographic area and species, and could provide support for decision-making processes concerning the enhancement of surveillance

  9. An allometry-based approach for understanding forest structure, predicting tree-size distribution and assessing the degree of disturbance

    PubMed Central

    Anfodillo, Tommaso; Carrer, Marco; Simini, Filippo; Popa, Ionel; Banavar, Jayanth R.; Maritan, Amos

    2013-01-01

    Tree-size distribution is one of the most investigated subjects in plant population biology. The forestry literature reports that tree-size distribution trajectories vary across different stands and/or species, whereas the metabolic scaling theory suggests that the tree number scales universally as −2 power of diameter. Here, we propose a simple functional scaling model in which these two opposing results are reconciled. Basic principles related to crown shape, energy optimization and the finite-size scaling approach were used to define a set of relationships based on a single parameter that allows us to predict the slope of the tree-size distributions in a steady-state condition. We tested the model predictions on four temperate mountain forests. Plots (4 ha each, fully mapped) were selected with different degrees of human disturbance (semi-natural stands versus formerly managed). Results showed that the size distribution range successfully fitted by the model is related to the degree of forest disturbance: in semi-natural forests the range is wide, whereas in formerly managed forests, the agreement with the model is confined to a very restricted range. We argue that simple allometric relationships, at an individual level, shape the structure of the whole forest community. PMID:23193128

  10. A High Throughput Ambient Mass Spectrometric Approach to Species Identification and Classification from Chemical Fingerprint Signatures.

    PubMed

    Musah, Rabi A; Espinoza, Edgard O; Cody, Robert B; Lesiak, Ashton D; Christensen, Earl D; Moore, Hannah E; Maleknia, Simin; Drijfhout, Falko P

    2015-01-01

    A high throughput method for species identification and classification through chemometric processing of direct analysis in real time (DART) mass spectrometry-derived fingerprint signatures has been developed. The method entails introduction of samples to the open air space between the DART ion source and the mass spectrometer inlet, with the entire observed mass spectral fingerprint subjected to unsupervised hierarchical clustering processing. A range of both polar and non-polar chemotypes are instantaneously detected. The result is identification and species level classification based on the entire DART-MS spectrum. Here, we illustrate how the method can be used to: (1) distinguish between endangered woods regulated by the Convention for the International Trade of Endangered Flora and Fauna (CITES) treaty; (2) assess the origin and by extension the properties of biodiesel feedstocks; (3) determine insect species from analysis of puparial casings; (4) distinguish between psychoactive plant products; and (5) differentiate between Eucalyptus species. An advantage of the hierarchical clustering approach to processing of the DART-MS derived fingerprint is that it shows both similarities and differences between species based on their chemotypes. Furthermore, full knowledge of the identities of the constituents contained within the small molecule profile of analyzed samples is not required. PMID:26156000

  11. A physical approach to the automated classification of clinical percussion sounds.

    PubMed

    Pantea, M A; Maev, R Gr; Malyarenko, E V; Baylor, A E

    2012-01-01

    Chest percussion is a traditional technique used for the physical examination of pulmonary injuries and diseases. It is a method of tapping body parts with fingers or small instruments to evaluate the size, consistency, borders, and presence of fluid/air in the lungs and abdomen. Percussion has been successfully used for the diagnosis of such potentially lethal conditions as traumatic and tension pneumothorax. This technique, however, has certain shortcomings, including limitations of the human ear and the subjectivity of the administrator, that lead to overall low sensitivity. Automation of the method by using a standardized percussion source and computerized classification of digitized signals would remove the subjective factor and other limitations of the technique. It would also enable rapid on-site diagnostics of pulmonary traumas when thorough clinical examination is impossible. This paper lays the groundwork for an objective signal classification approach based on a general physical model of a damped harmonic oscillator. Using this concept, critical parameters that effectively subdivide percussion signals into three main groups, historically known as "tympanic," "resonant," and "dull," are identified, opening the possibility for automated diagnostics of air/liquid inclusions in the thorax and abdomen. The key role of damping in forming the character of the percussion signal is investigated using a 3D thorax phantom. The contribution of the abdominal component into the complex multimode spectrum of chest percussion signals is demonstrated. PMID:22280623
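The damped harmonic oscillator model mentioned above describes a signal of the form x(t) = A·exp(-γt)·cos(2πft). As a rough illustration of how a damping parameter could be recovered from a digitized signal, the sketch below synthesizes such a signal and estimates γ from the logarithmic decrement between its first and last positive peaks. The function names, sampling rate, frequency and damping value are assumptions for illustration. The abstract does not specify the paper's actual parameters or the values that separate "tympanic," "resonant," and "dull" sounds, so none are given here.

```python
import math


def damped_signal(amplitude, damping, freq_hz, n_samples, rate_hz):
    """Sample x(t) = A * exp(-damping * t) * cos(2 * pi * f * t)."""
    return [
        amplitude
        * math.exp(-damping * (i / rate_hz))
        * math.cos(2 * math.pi * freq_hz * (i / rate_hz))
        for i in range(n_samples)
    ]


def estimate_damping(signal, rate_hz):
    """Estimate the damping coefficient (1/s) from positive local maxima.

    Successive peaks of a damped cosine decay as exp(-damping * t), so the
    log of the ratio between two peak amplitudes, divided by the time
    between them, gives the damping coefficient.
    """
    peaks = [
        (i / rate_hz, signal[i])
        for i in range(1, len(signal) - 1)
        if signal[i] > 0 and signal[i] > signal[i - 1] and signal[i] >= signal[i + 1]
    ]
    if len(peaks) < 2:
        raise ValueError("need at least two peaks to estimate damping")
    (t_first, a_first), (t_last, a_last) = peaks[0], peaks[-1]
    return math.log(a_first / a_last) / (t_last - t_first)


# Assumed test signal: 200 Hz tone, damping 30 1/s, 0.1 s at 8 kHz.
signal = damped_signal(amplitude=1.0, damping=30.0, freq_hz=200.0,
                       n_samples=800, rate_hz=8000.0)
estimated = estimate_damping(signal, rate_hz=8000.0)
```

Real percussion recordings contain several modes and noise, so a practical estimator would need filtering or model fitting instead of the simple peak picking used here.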

  12. A High Throughput Ambient Mass Spectrometric Approach to Species Identification and Classification from Chemical Fingerprint Signatures

    NASA Astrophysics Data System (ADS)

    Musah, Rabi A.; Espinoza, Edgard O.; Cody, Robert B.; Lesiak, Ashton D.; Christensen, Earl D.; Moore, Hannah E.; Maleknia, Simin; Drijfhout, Falko P.

    2015-07-01

    A high throughput method for species identification and classification through chemometric processing of direct analysis in real time (DART) mass spectrometry-derived fingerprint signatures has been developed. The method entails introduction of samples to the open air space between the DART ion source and the mass spectrometer inlet, with the entire observed mass spectral fingerprint subjected to unsupervised hierarchical clustering processing. A range of both polar and non-polar chemotypes are instantaneously detected. The result is identification and species level classification based on the entire DART-MS spectrum. Here, we illustrate how the method can be used to: (1) distinguish between endangered woods regulated by the Convention for the International Trade of Endangered Flora and Fauna (CITES) treaty; (2) assess the origin and by extension the properties of biodiesel feedstocks; (3) determine insect species from analysis of puparial casings; (4) distinguish between psychoactive plants products; and (5) differentiate between Eucalyptus species. An advantage of the hierarchical clustering approach to processing of the DART-MS derived fingerprint is that it shows both similarities and differences between species based on their chemotypes. Furthermore, full knowledge of the identities of the constituents contained within the small molecule profile of analyzed samples is not required.

  13. A Hybrid BPSO-CGA Approach for Gene Selection and Classification of Microarray Data

    PubMed Central

    Chuang, Li-Yeh; Yang, Cheng-Huei; Li, Jung-Chike

    2012-01-01

    Microarray analysis promises to detect variations in gene expression and changes in the transcription rates of an entire genome in vivo. Microarray gene expression profiles indicate the relative abundance of mRNA corresponding to the genes. The selection of relevant genes from microarray data poses a formidable challenge to researchers due to the high dimensionality of the features, the multiclass categories involved, and the usually small sample size. A classification process that decreases the dimensionality of the microarray data is often employed. In order to correctly analyze microarray data, the goal is to find an optimal subset of features (genes) which adequately represents the original set of features. A hybrid method of binary particle swarm optimization (BPSO) and a combat genetic algorithm (CGA) is used to perform the microarray data selection. The K-nearest neighbor (K-NN) method with leave-one-out cross-validation (LOOCV) served as a classifier. The proposed BPSO-CGA approach is evaluated on ten microarray data sets from the literature. The experimental results indicate that the proposed method not only effectively reduces the number of gene expression levels, but also achieves a low classification error rate. PMID:21210743
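
    As an illustration of the evaluation step described above, the following minimal sketch scores one candidate gene subset (a binary mask, as a BPSO particle or GA chromosome might encode it) by the leave-one-out accuracy of a K-NN classifier. The function name, the Euclidean distance, and the majority vote are assumptions made for the example; the search procedure itself (BPSO-CGA) is not shown.

```python
import math


def loocv_knn_accuracy(samples, labels, mask, k=1):
    """Leave-one-out accuracy of k-NN restricted to the features where mask is 1.

    Hypothetical fitness function for a feature-selection search.
    """
    selected = [j for j, bit in enumerate(mask) if bit]
    if not selected:
        return 0.0  # an empty subset cannot classify anything
    correct = 0
    for i, query in enumerate(samples):
        query_vec = [query[j] for j in selected]
        # Distance from the held-out sample to every other sample.
        neighbours = sorted(
            (math.dist(query_vec, [other[j] for j in selected]), labels[n])
            for n, other in enumerate(samples)
            if n != i
        )
        votes = {}
        for _, label in neighbours[:k]:
            votes[label] = votes.get(label, 0) + 1
        predicted = max(votes, key=votes.get)
        if predicted == labels[i]:
            correct += 1
    return correct / len(samples)
```

    A search algorithm would call such a function once per candidate mask and prefer masks with high accuracy and few selected genes.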

  14. Machine Learning Based Classification of Microsatellite Variation: An Effective Approach for Phylogeographic Characterization of Olive Populations.

    PubMed

    Torkzaban, Bahareh; Kayvanjoo, Amir Hossein; Ardalan, Arman; Mousavi, Soraya; Mariotti, Roberto; Baldoni, Luciana; Ebrahimie, Esmaeil; Ebrahimi, Mansour; Hosseini-Mazinani, Mehdi

    2015-01-01

    Finding efficient analytical techniques is overwhelmingly turning into a bottleneck for the effectiveness of large biological data. Machine learning offers a novel and powerful tool to advance classification and modeling solutions in molecular biology. However, these methods have been less frequently used with empirical population genetics data. In this study, we developed a new combined approach of data analysis using microsatellite marker data from our previous studies of olive populations using machine learning algorithms. Herein, 267 olive accessions of various origins including 21 reference cultivars, 132 local ecotypes, and 37 wild olive specimens from the Iranian plateau, together with 77 of the most represented Mediterranean varieties were investigated using a finely selected panel of 11 microsatellite markers. We organized data in two '4-targeted' and '16-targeted' experiments. A strategy of assaying different machine based analyses (i.e. data cleaning, feature selection, and machine learning classification) was devised to identify the most informative loci and the most diagnostic alleles to represent the population and the geography of each olive accession. These analyses revealed microsatellite markers with the highest differentiating capacity and proved efficiency for our method of clustering olive accessions to reflect upon their regions of origin. A distinguished highlight of this study was the discovery of the best combination of markers for better differentiating of populations via machine learning models, which can be exploited to distinguish among other biological populations. PMID:26599001

  15. High Throughput Ambient Mass Spectrometric Approach to Species Identification and Classification from Chemical Fingerprint Signatures

    DOE PAGESBeta

    Musah, Rabi A.; Espinoza, Edgard O.; Cody, Robert B.; Lesiak, Ashton D.; Christensen, Earl D.; Moore, Hannah E.; Maleknia, Simin; Drijfhout, Falko P.

    2015-07-09

    A high throughput method for species identification and classification through chemometric processing of direct analysis in real time (DART) mass spectrometry-derived fingerprint signatures has been developed. The method entails introduction of samples to the open air space between the DART ion source and the mass spectrometer inlet, with the entire observed mass spectral fingerprint subjected to unsupervised hierarchical clustering processing. Moreover, a range of both polar and non-polar chemotypes are instantaneously detected. The result is identification and species level classification based on the entire DART-MS spectrum. In this paper, we illustrate how the method can be used to: (1) distinguish between endangered woods regulated by the Convention for the International Trade of Endangered Flora and Fauna (CITES) treaty; (2) assess the origin and by extension the properties of biodiesel feedstocks; (3) determine insect species from analysis of puparial casings; (4) distinguish between psychoactive plants products; and (5) differentiate between Eucalyptus species. An advantage of the hierarchical clustering approach to processing of the DART-MS derived fingerprint is that it shows both similarities and differences between species based on their chemotypes. Furthermore, full knowledge of the identities of the constituents contained within the small molecule profile of analyzed samples is not required.

  16. High Throughput Ambient Mass Spectrometric Approach to Species Identification and Classification from Chemical Fingerprint Signatures

    SciTech Connect

    Musah, Rabi A.; Espinoza, Edgard O.; Cody, Robert B.; Lesiak, Ashton D.; Christensen, Earl D.; Moore, Hannah E.; Maleknia, Simin; Drijfhout, Falko P.

    2015-07-09

    A high throughput method for species identification and classification through chemometric processing of direct analysis in real time (DART) mass spectrometry-derived fingerprint signatures has been developed. The method entails introduction of samples to the open air space between the DART ion source and the mass spectrometer inlet, with the entire observed mass spectral fingerprint subjected to unsupervised hierarchical clustering processing. Moreover, a range of both polar and non-polar chemotypes are instantaneously detected. The result is identification and species level classification based on the entire DART-MS spectrum. In this paper, we illustrate how the method can be used to: (1) distinguish between endangered woods regulated by the Convention for the International Trade of Endangered Flora and Fauna (CITES) treaty; (2) assess the origin and by extension the properties of biodiesel feedstocks; (3) determine insect species from analysis of puparial casings; (4) distinguish between psychoactive plants products; and (5) differentiate between Eucalyptus species. An advantage of the hierarchical clustering approach to processing of the DART-MS derived fingerprint is that it shows both similarities and differences between species based on their chemotypes. Furthermore, full knowledge of the identities of the constituents contained within the small molecule profile of analyzed samples is not required.

  17. Machine Learning Based Classification of Microsatellite Variation: An Effective Approach for Phylogeographic Characterization of Olive Populations

    PubMed Central

    Mousavi, Soraya; Mariotti, Roberto; Baldoni, Luciana; Ebrahimie, Esmaeil; Ebrahimi, Mansour; Hosseini-Mazinani, Mehdi

    2015-01-01

    Finding efficient analytical techniques is overwhelmingly turning into a bottleneck for the effectiveness of large biological data. Machine learning offers a novel and powerful tool to advance classification and modeling solutions in molecular biology. However, these methods have been less frequently used with empirical population genetics data. In this study, we developed a new combined approach of data analysis using microsatellite marker data from our previous studies of olive populations using machine learning algorithms. Herein, 267 olive accessions of various origins including 21 reference cultivars, 132 local ecotypes, and 37 wild olive specimens from the Iranian plateau, together with 77 of the most represented Mediterranean varieties were investigated using a finely selected panel of 11 microsatellite markers. We organized data in two ‘4-targeted’ and ‘16-targeted’ experiments. A strategy of assaying different machine based analyses (i.e. data cleaning, feature selection, and machine learning classification) was devised to identify the most informative loci and the most diagnostic alleles to represent the population and the geography of each olive accession. These analyses revealed microsatellite markers with the highest differentiating capacity and proved efficiency for our method of clustering olive accessions to reflect upon their regions of origin. A distinguished highlight of this study was the discovery of the best combination of markers for better differentiating of populations via machine learning models, which can be exploited to distinguish among other biological populations. PMID:26599001

  18. TransportTP: A two-phase classification approach for membrane transporter prediction and characterization

    PubMed Central

    2009-01-01

    Background: Membrane transporters play crucial roles in living cells. Experimental characterization of transporters is costly and time-consuming. Current computational methods for transporter characterization still require extensive curation efforts, especially for eukaryotic organisms. We developed a novel genome-scale transporter prediction and characterization system called TransportTP that combined homology-based and machine learning methods in a two-phase classification approach. First, traditional homology methods were employed to predict novel transporters based on sequence similarity to known classified proteins in the Transporter Classification Database (TCDB). Second, machine learning methods were used to integrate a variety of features to refine the initial predictions. A set of rules based on transporter features was developed by machine learning using well-curated proteomes as guides. Results: In a cross-validation using the yeast proteome for training and the proteomes of ten other organisms for testing, TransportTP achieved an equivalent recall and precision of 81.8%, based on TransportDB, a manually annotated transporter database. In an independent test using the Arabidopsis proteome for training and four recently sequenced plant proteomes for testing, it achieved a recall of 74.6% and a precision of 73.4%, according to our manual curation. Conclusions: TransportTP is the most effective tool for eukaryotic transporter characterization to date. PMID:20003433

  19. Sparse representation approaches for the classification of high-dimensional biological data

    PubMed Central

    2013-01-01

    Background: High-throughput genomic and proteomic data have important applications in medicine, including prevention, diagnosis, treatment, and prognosis of diseases, and in molecular biology, for example pathway identification. Many such applications can be formulated as classification and dimension reduction problems in machine learning. Accurately classifying such data is computationally challenging because of dimensionality, noise, and redundancy, to name a few issues. The principle of sparse representation has been applied to analyzing high-dimensional biological data within the frameworks of clustering, classification, and dimension reduction approaches. However, the existing sparse representation methods are inefficient. The kernel extensions are not well addressed either. Moreover, the sparse representation techniques have not yet been comprehensively studied in bioinformatics. Results: In this paper, a Bayesian treatment is presented on sparse representations. Various sparse coding and dictionary learning models are discussed. We propose a fast parallel active-set optimization algorithm for each model. Kernel versions are devised based on their dimension-free property. These models are applied for classifying high-dimensional biological data. Conclusions: In our experiment, we compared our models with other methods on both accuracy and computing time. It is shown that our models can achieve satisfactory accuracy and that they are computationally very efficient. PMID:24565287

  20. An effective band selection approach for classification in remote sensing imagery

    NASA Astrophysics Data System (ADS)

    Cukur, Hüseyin; Binol, Hamidullah; Uslu, Faruk S.; Bal, Abdullah

    2015-10-01

    Hyperspectral imagery (HSI) is a special imaging form characterized by high spectral resolution, with up to hundreds of very narrow and contiguous bands ranging from the visible to the infrared region. Since HSI contains more distinctive features than conventional images, its processing cost is very high. For this reason, dimensionality reduction has become important for classification performance. In this study, dimension reduction has been achieved via a VNS-based band selection method on hyperspectral images. This method is based on a systematic change of the neighborhood used in the search space. In order to improve the band selection performance, we apply a clustering technique based on mutual information (MI) before applying VNS. The combined technique is called MI-VNS. A Support Vector Machine (SVM) has been used as a classifier to evaluate the performance of the proposed band selection technique. The experimental results show that the MI-VNS approach increases the classification performance and decreases the computational time compared to classification without band selection and to conventional VNS.
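
    The mutual information measure mentioned above can be sketched as follows. This is a minimal histogram-based estimate between two bands whose pixel values have already been discretized; the function name and the discretization are assumptions made for illustration, and the abstract does not state how the authors estimate MI.

```python
import math
from collections import Counter


def mutual_information(band_a, band_b):
    """Mutual information in bits between two equally long sequences of
    discrete pixel values (hypothetical helper, plug-in histogram estimate)."""
    n = len(band_a)
    if n == 0 or n != len(band_b):
        raise ValueError("bands must be non-empty and of equal length")
    count_a = Counter(band_a)
    count_b = Counter(band_b)
    count_ab = Counter(zip(band_a, band_b))
    mi = 0.0
    for (a, b), c in count_ab.items():
        # p(a, b) * log2( p(a, b) / (p(a) * p(b)) )
        mi += (c / n) * math.log2(c * n / (count_a[a] * count_b[b]))
    return mi
```

    Bands with high pairwise MI carry largely redundant information, which is one plausible basis for grouping them before a band selection search.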

  1. Land cover data from Landsat single-date archive imagery: an integrated classification approach

    NASA Astrophysics Data System (ADS)

    Bajocco, Sofia; Ceccarelli, Tomaso; Rinaldo, Simone; De Angelis, Antonella; Salvati, Luca; Perini, Luigi

    2012-10-01

    The analysis of land cover dynamics provides insight into many environmental problems. However, there are few data sources which can be used to derive consistent time series, remote sensing being one of the most valuable ones. Due to its multi-temporal and spatial coverage needs, such analysis is usually based on large land cover datasets, which require automated, objective and repeatable procedures. The USGS Landsat archives provide free access to multispectral, high-resolution remotely sensed data starting from the mid-eighties; in many cases, however, only single date images are available. This paper suggests an objective approach for generating land cover information from 30 m resolution, single date Landsat archive satellite imagery. A procedure was developed integrating pixel-based and object-oriented classifiers, which consists of the following basic steps: i) pre-processing of the satellite image, including radiance and reflectance calibration, texture analysis and derivation of vegetation indices, ii) segmentation of the pre-processed image, iii) its classification integrating both radiometric and textural properties. The integrated procedure was tested for an area in Sardinia Region, Italy, and compared with a purely pixel-based one. Results demonstrated that a better overall accuracy, evaluated against the available land cover cartography, was obtained with the integrated (86%) compared to the pixel-based classification (68%) at the first CORINE Land Cover level. The proposed methodology needs to be further tested for evaluating its transferability in time (constructing comparable land cover time series) and space (for covering larger areas).

  2. A multimodal temporal panorama approach for moving vehicle detection, reconstruction, and classification

    NASA Astrophysics Data System (ADS)

    Wang, Tao; Zhu, Zhigang

    2012-06-01

    Moving vehicle detection and classification using multimodal data is a challenging task in data collection, audio-visual alignment, data labeling and feature selection under uncontrolled environments with occlusions, motion blurs, varying image resolutions and perspective distortions. In this work, we propose an effective multimodal temporal panorama (MTP) approach for the task using a novel long-range audio-visual sensing system. A new audio-visual vehicle (AVV) dataset for moving vehicle detection and classification is created, which features automatic vehicle detection and audio-visual alignment, accurate vehicle extraction and reconstruction, and efficient data labeling. In particular, vehicles' visual images are reconstructed once detected in order to remove most of the occlusions, motion blurs, and variations of perspective views. Multimodal audio-visual features are extracted, including global geometric features (aspect ratios, profiles), local structure features (HOGs), as well as various audio features (MFCCs, etc.). Using radial-based SVMs, the effectiveness of the integration of these multimodal features is thoroughly and systematically studied. The concept of MTP may not be limited only to visual, motion and audio modalities; it could also be applicable to other sensing modalities that can obtain data in the temporal domain.

  3. A High Throughput Ambient Mass Spectrometric Approach to Species Identification and Classification from Chemical Fingerprint Signatures

    PubMed Central

    Musah, Rabi A.; Espinoza, Edgard O.; Cody, Robert B.; Lesiak, Ashton D.; Christensen, Earl D.; Moore, Hannah E.; Maleknia, Simin; Drijfhout, Falko P.

    2015-01-01

    A high throughput method for species identification and classification through chemometric processing of direct analysis in real time (DART) mass spectrometry-derived fingerprint signatures has been developed. The method entails introduction of samples to the open air space between the DART ion source and the mass spectrometer inlet, with the entire observed mass spectral fingerprint subjected to unsupervised hierarchical clustering processing. A range of both polar and non-polar chemotypes are instantaneously detected. The result is identification and species level classification based on the entire DART-MS spectrum. Here, we illustrate how the method can be used to: (1) distinguish between endangered woods regulated by the Convention for the International Trade of Endangered Flora and Fauna (CITES) treaty; (2) assess the origin and by extension the properties of biodiesel feedstocks; (3) determine insect species from analysis of puparial casings; (4) distinguish between psychoactive plants products; and (5) differentiate between Eucalyptus species. An advantage of the hierarchical clustering approach to processing of the DART-MS derived fingerprint is that it shows both similarities and differences between species based on their chemotypes. Furthermore, full knowledge of the identities of the constituents contained within the small molecule profile of analyzed samples is not required. PMID:26156000

  4. Classification of emotional states from electrocardiogram signals: a non-linear approach based on hurst

    PubMed Central

    2013-01-01

    Background: Identifying the emotional state is helpful in applications involving patients with autism and other intellectual disabilities, computer-based training, human-computer interaction, etc. Electrocardiogram (ECG) signals, being an activity of the autonomic nervous system (ANS), reflect the underlying true emotional state of a person. However, the performance of various methods developed so far lacks accuracy, and more robust methods need to be developed to identify the emotional pattern associated with ECG signals. Methods: Emotional ECG data was obtained from sixty participants by inducing the six basic emotional states (happiness, sadness, fear, disgust, surprise and neutral) using audio-visual stimuli. The non-linear feature ‘Hurst’ was computed using Rescaled Range Statistics (RRS) and Finite Variance Scaling (FVS) methods. New Hurst features were proposed by combining the existing RRS and FVS methods with Higher Order Statistics (HOS). The features were then classified using four classifiers: Bayesian Classifier, Regression Tree, K-nearest neighbor and Fuzzy K-nearest neighbor. Seventy percent of the features were used for training and thirty percent for testing the algorithm. Results: Analysis of Variance (ANOVA) conveyed that Hurst and the proposed features were statistically significant (p < 0.001). Hurst computed using RRS and FVS methods showed similar classification accuracy. The features obtained by combining FVS and HOS performed better, with a maximum accuracy of 92.87% and 76.45% for classifying the six emotional states using random and subject independent validation respectively. Conclusions: The results indicate that the combination of non-linear analysis and HOS tends to capture the finer emotional changes that can be seen in healthy ECG data. This work can be further fine-tuned to develop a real-time system. PMID:23680041
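
    For readers unfamiliar with the rescaled range statistic named above, here is a minimal sketch of a classic R/S estimate of the Hurst exponent. It fits the slope of log(R/S) against log(window size) by least squares. The function names and the choice of window sizes are assumptions; the FVS method and the HOS-based features proposed in the paper are not reproduced here.

```python
import math


def rescaled_range(chunk):
    """R/S of one window: range of the cumulative mean-adjusted sums
    divided by the (population) standard deviation."""
    n = len(chunk)
    mean = sum(chunk) / n
    cumulative, total = [], 0.0
    for value in chunk:
        total += value - mean
        cumulative.append(total)
    spread = max(cumulative) - min(cumulative)
    std = math.sqrt(sum((v - mean) ** 2 for v in chunk) / n)
    return spread / std if std > 0 else 0.0


def hurst_rs(series, window_sizes):
    """Estimate the Hurst exponent as the slope of log(mean R/S) vs log(size).

    Hypothetical helper; window_sizes is chosen by the caller.
    """
    xs, ys = [], []
    for size in window_sizes:
        # Non-overlapping windows of the given size.
        chunks = [series[i:i + size]
                  for i in range(0, len(series) - size + 1, size)]
        values = [v for v in map(rescaled_range, chunks) if v > 0]
        if values:
            xs.append(math.log(size))
            ys.append(math.log(sum(values) / len(values)))
    if len(xs) < 2:
        raise ValueError("need at least two usable window sizes")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope /= sum((x - mean_x) ** 2 for x in xs)
    return slope
```

    A call such as hurst_rs(ecg_samples, [8, 16, 32, 64, 128]) would yield one scalar feature per recording, where ecg_samples is a placeholder for a list of signal values.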

  5. Land use analyses in the African Sahel: an object-oriented classification approach using TerraSAR-X data

    NASA Astrophysics Data System (ADS)

    Biro, Khalid; Sulieman, Hussein; Pradhan, Biswajeet; Buchroithner, Manfred

    Recently, object-oriented classification techniques based on image segmentation approaches have been actively studied in high-resolution image processing to extract a variety of thematic information. Land use patterns in the semi-arid Sahel zone of Africa in Sudan are characterized by complex vegetation cover with a scattered distribution of small and large agricultural fields. Therefore, an image segmentation approach might not be suitable due to the presence of these land cover variations. In this study, land use types in the African Sahel drylands were analysed using the object-oriented classification approach. TerraSAR-X data (X-band in HH and HV polarisation) with 3.0 m spatial resolution was used for the land cover classification. Using a feature space optimization tool based on a nearest neighbour classifier, the attributes of the TerraSAR-X image were optimized to obtain the best separability among the classes for the land use mapping. The results highlighted the importance of both TerraSAR-X and object-oriented classification approaches as a useful source of information and technique for land use analyses over drylands of the African Sahel in Sudan. Keywords: Object-oriented classification; TerraSAR-X; land use; drylands; African Sahel

  6. Use of a Novel Grammatical Inference Approach in Classification of Amyloidogenic Hexapeptides

    PubMed Central

    2016-01-01

    The present paper is a novel contribution to the field of bioinformatics by using grammatical inference in the analysis of data. We developed an algorithm for generating star-free regular expressions which turned out to be good recommendation tools, as they are characterized by a relatively high correlation coefficient between the observed and predicted binary classifications. The experiments have been performed for three datasets of amyloidogenic hexapeptides, and our results are compared with those obtained using the graph approaches, the current state-of-the-art methods in heuristic automata induction, and the support vector machine. The results showed the superior performance of the new grammatical inference algorithm on fixed-length amyloid datasets. PMID:27051459

  7. Use of a Novel Grammatical Inference Approach in Classification of Amyloidogenic Hexapeptides.

    PubMed

    Wieczorek, Wojciech; Unold, Olgierd

    2016-01-01

    The present paper is a novel contribution to the field of bioinformatics by using grammatical inference in the analysis of data. We developed an algorithm for generating star-free regular expressions which turned out to be good recommendation tools, as they are characterized by a relatively high correlation coefficient between the observed and predicted binary classifications. The experiments have been performed for three datasets of amyloidogenic hexapeptides, and our results are compared with those obtained using the graph approaches, the current state-of-the-art methods in heuristic automata induction, and the support vector machine. The results showed the superior performance of the new grammatical inference algorithm on fixed-length amyloid datasets. PMID:27051459

  8. ISOLATING CONTENT AND METADATA FROM WEBLOGS USING CLASSIFICATION AND RULE-BASED APPROACHES

    SciTech Connect

    Marshall, Eric J.; Bell, Eric B.

    2011-09-04

    The emergence and increasing prevalence of social media, such as internet forums, weblogs (blogs), wikis, etc., has created a new opportunity to measure public opinion, attitude, and social structures. A major challenge in leveraging this information is isolating the content and metadata in weblogs, as there is no standard, universally supported, machine-readable format for presenting this information. We present two algorithms for isolating this information. The first uses web block classification, where each node in the Document Object Model (DOM) for a page is classified according to one of several pre-defined attributes from a common blog schema. The second uses a set of heuristics to select web blocks. These algorithms perform at a level suitable for initial use, validating this approach for isolating content and metadata from blogs. The resultant data serves as a starting point for analytical work on the content and substance of collections of weblog pages.

  9. An automated approach to passive sonar classification using binary image features

    NASA Astrophysics Data System (ADS)

    Vahidpour, Vahid; Rastegarnia, Amir; Khalili, Azam

    2015-07-01

    This paper proposes a new method for ship recognition and classification using sound produced and radiated underwater. To do so, a three-step procedure is proposed. First, the preprocessing operations are utilized to reduce noise effects and provide signal for feature extraction. Second, a binary image, made from frequency spectrum of signal segmentation, is formed to extract effective features. Third, a neural classifier is designed to classify the signals. Two approaches, the proposed method and the fractal-based method are compared and tested on real data. The comparative results indicated better recognition ability and more robust performance of the proposed method than the fractal-based method. Therefore, the proposed method could improve the recognition accuracy of underwater acoustic targets.

  10. Fractal geometry-based classification approach for the recognition of lung cancer cells

    NASA Astrophysics Data System (ADS)

    Xia, Deshen; Gao, Wenqing; Li, Hua

    1994-05-01

    This paper describes a new fractal geometry-based classification approach for the recognition of lung cancer cells, which is used in health inspection for lung cancers. Because cancer cells grow much faster and more irregularly than normal cells do, the shape of the segmented cancer cells is very irregular and is considered as a graph without characteristic length. We use Texture Energy Intensity Rn to do fractal preprocessing to segment the cells from the image and to calculate the fractal dimension value for extracting the fractal features, so that we can get the figure characteristics of different cancer cells and normal cells respectively. Fractal geometry gives us a correct description of cancer-cell shapes. Through this method, a good recognition of Adenoma, Squamous, and small cancer cells can be obtained.

  11. Schizophrenia Detection and Classification by Advanced Analysis of EEG Recordings Using a Single Electrode Approach

    PubMed Central

    Dvey-Aharon, Zack; Fogelson, Noa; Peled, Avi; Intrator, Nathan

    2015-01-01

    Electroencephalographic (EEG) analysis has emerged as a powerful tool for brain state interpretation and diagnosis, but not for the diagnosis of mental disorders; this may be explained by its low spatial resolution or depth sensitivity. This paper concerns the diagnosis of schizophrenia using EEG, which currently suffers from several cardinal problems: it heavily depends on assumptions, conditions and prior knowledge regarding the patient. Additionally, the diagnostic experiments take hours, and the accuracy of the analysis is low or unreliable. This article presents the “TFFO” (Time-Frequency transformation followed by Feature-Optimization), a novel approach for schizophrenia detection showing great success in classification accuracy with no false positives. The methodology is designed for single electrode recording, and it attempts to make the data acquisition process feasible and quick for most patients. PMID:25837521

  12. Single event and TREE latchup mitigation for a star tracker sensor: An innovative approach to system level latchup mitigation

    SciTech Connect

    Kimbrough, J.R.; Colella, N.J.; Davis, R.W.; Bruener, D.B.; Coakley, P.G.; Lutjens, S.W.; Mallon, C.E.

    1994-08-01

    Electronic packages designed for spacecraft should be fault-tolerant and operate without ground control intervention through extremes in the space radiation environment. If designed for military use, the electronics must survive and function in a nuclear radiation environment. This paper presents an innovative "blink" approach rather than the typical "operate through" approach to achieve system level latchup mitigation on a prototype star tracker camera. Included are circuit designs, flash x-ray test data, and heavy ion data demonstrating latchup mitigation protecting micro-electronics from current latchup and burnout due to Single Event Latchup (SEL) and Transient Radiation Effects on Electronics (TREE).

  13. Effective Key Parameter Determination for an Automatic Approach to Land Cover Classification Based on Multispectral Remote Sensing Imagery

    PubMed Central

    Wang, Yong; Jiang, Dong; Zhuang, Dafang; Huang, Yaohuan; Wang, Wei; Yu, Xinfang

    2013-01-01

    The classification of land cover based on satellite data is important for many areas of scientific research. Unfortunately, some traditional land cover classification methods (e.g., supervised classification) are very labor-intensive and subjective because of the required human involvement. Jiang et al. proposed a simple but robust method for land cover classification using a prior classification map and a current multispectral remote sensing image. This new method has proven to be a suitable classification method; however, its drawback is that it is semi-automatic because the key parameters cannot be selected automatically. In this study, we propose an approach in which the two key parameters are chosen automatically. The proposed method consists primarily of the following three interdependent parts: the selection procedure for the pure-pixel training-sample dataset, the method to determine the key parameters, and the optimal combination model. The proposed approach employs overall accuracy, the Kappa coefficient (KC), and time consumption (TC, in seconds) to select the two key parameters automatically instead of using a test-decision, which avoids subjective bias. A case study of Weichang District of Hebei Province, China, using Landsat-5/TM data of 2010 with 30 m spatial resolution and a prior classification map of 2005 recognised as relatively precise data, was conducted to test the performance of this method. The experimental results show that the methodology determining the key parameters uses the portfolio optimisation model and increases the degree of automation of Jiang et al.'s classification method, which may have a wide scope of scientific application. PMID:24204582
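
    Overall accuracy and the Kappa coefficient, the two accuracy criteria named above, can be computed from a confusion matrix as in the minimal sketch below. It uses the standard Cohen's kappa definition; the function name and the matrix layout (rows as reference classes, columns as predicted classes) are assumptions made for the example.

```python
def overall_accuracy_and_kappa(confusion):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix
    given as a list of rows (rows: reference class, columns: predicted class)."""
    size = len(confusion)
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(size)) / total
    # Chance agreement from the row and column marginals.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(size)
    ) / (total * total)
    if expected == 1.0:
        return observed, 0.0  # degenerate case: a single class everywhere
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa
```

    For a two-class matrix [[45, 5], [10, 40]], the observed agreement is 0.85 and the chance agreement is 0.5, giving a kappa of 0.7.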

  14. Classification of boreal forest by satellite and inventory data using neural network approach

    NASA Astrophysics Data System (ADS)

    Romanov, A. A.

    2012-12-01

    The main objective of this research was to develop methodology for boreal (Siberian Taiga) land cover classification at a high level of accuracy. The study area covers the territories of Central Siberian several parts along the Yenisei River (60-62 degrees North Latitude): the right bank includes mixed forest and dark taiga, the left - pine forests; so were taken as a high heterogeneity and statistically equal surfaces concerning spectral characteristics. Two main types of data were used: time series of middle spatial resolution satellite images (Landsat 5, 7 and SPOT4) and inventory datasets from the nature fieldworks (used for training samples sets preparation). Method of collecting field datasets included a short botany description (type/species of vegetation, density, compactness of the crowns, individual height and max/min diameters representative of each type, surface altitude of the plot), at the same time the geometric characteristic of each training sample unit corresponded to the spatial resolution of satellite images and geo-referenced (prepared datasets both of the preliminary processing and verification). The network of test plots was planned as irregular and determined by the landscape oriented approach. The main focus of the thematic data processing has been allocated for the use of neural networks (including fuzzy logic); therefore, the results of field studies have been converting input parameter of type / species of vegetation cover of each unit and the degree of variability. Proposed approach involves the processing of time series separately for each image mainly for the verification: shooting parameters taken into consideration (time, albedo) and thus expected to assess the quality of mapping. So the input variables for the networks were sensor bands, surface altitude, solar angles and land surface temperature (for a few experiments); also given attention to the formation of the formula class on the basis of statistical pre-processing of results of

  15. Tropical dendrochemistry: A novel approach to estimate age and growth from ringless trees

    SciTech Connect

    Poussart,P.; Myneni, S.; Lanzirotti, A.

    2006-01-01

    Although tropical forests play an active role in the global carbon cycle and climate, their growth history remains poorly characterized compared to other ecosystems on the planet. Trees are prime candidates for the extraction of paleoclimate archives as they can be probed sub-annually, are widely distributed and can live for over 1400 years. However, dendrochronological techniques have found limited applications in the tropics because trees often lack visible growth rings. Alternative methods exist (dendrometry, radio- and stable isotopes), but the derived records are either of short-duration, lack seasonal resolution or are prohibitively labor intensive to produce. Here, we show the first X-ray microprobe synchrotron record of calcium (Ca) from a ringless Miliusa velutina tree from Thailand and use it to estimate the tree's age and growth history. The Ca age model agrees within ≤2 years of bomb-radiocarbon age estimates and confirms that the cycles are seasonal. The amplitude of the Ca annual cycle is correlated significantly with growth and annual Ca maxima correlate with the amount of dry season rainfall. Synchrotron measurements are fast and producing sufficient numbers of replicated multi-century tropical dendrochemical climate records now seems analytically feasible.

  16. A Fault Tree Approach to Analysis of Behavioral Systems: An Overview.

    ERIC Educational Resources Information Center

    Stephens, Kent G.

    Developed at Brigham Young University, Fault Tree Analysis (FTA) is a technique for enhancing the probability of success in any system by analyzing the most likely modes of failure that could occur. It provides a logical, step-by-step description of possible failure events within a system and their interaction--the combinations of potential…

  17. Clinical features of organophosphate poisoning: A review of different classification systems and approaches

    PubMed Central

    Peter, John Victor; Sudarsan, Thomas Isiah; Moran, John L.

    2014-01-01

    Purpose: The typical toxidrome in organophosphate (OP) poisoning comprises the Salivation, Lacrimation, Urination, Defecation, Gastric cramps, Emesis (SLUDGE) symptoms. However, several other manifestations are described. We review the spectrum of symptoms and signs in OP poisoning as well as the different approaches to clinical features in these patients. Materials and Methods: Articles were obtained by electronic search of PubMed® between 1966 and April 2014 using the search terms organophosphorus compounds or phosphoric acid esters AND poison or poisoning AND manifestations. Results: Of the 5026 articles on OP poisoning, 2584 articles pertained to human poisoning; 452 articles focusing on clinical manifestations in human OP poisoning were retrieved for detailed evaluation. In addition to the traditional approach of symptoms and signs of OP poisoning as peripheral (muscarinic, nicotinic) and central nervous system receptor stimulation, symptoms were alternatively approached using a time-based classification. In this, symptom onset was categorized as acute (within 24-h), delayed (24-h to 2-week) or late (beyond 2-week). Although most symptoms occur within minutes or hours following acute exposure, delayed onset symptoms, occurring after a period of minimal or mild symptoms, may impact treatment and timing of the discharge following acute exposure. Symptoms and signs were also viewed in an organ-specific manner as cardiovascular, respiratory or neurological manifestations. An organ-specific approach enables focused management of individual organ dysfunction that may vary with different OP compounds. Conclusions: Different approaches to the symptoms and signs in OP poisoning may better our understanding of the underlying mechanism that in turn may assist with the management of acutely poisoned patients. PMID:25425841

  18. Contrasting regional and national mechanisms for predicting elevated arsenic in private wells across the United States using classification and regression trees.

    PubMed

    Frederick, Logan; VanDerslice, James; Taddie, Marissa; Malecki, Kristen; Gregg, Josh; Faust, Nicholas; Johnson, William P

    2016-03-15

    Arsenic contamination in groundwater is a public health and environmental concern in the United States (U.S.), particularly where monitoring is not required under the Safe Drinking Water Act. Previous studies suggest the influence of regional mechanisms for arsenic mobilization into groundwater; however, no study has examined how influencing parameters change at a continental scale spanning multiple regions. We herein examine covariates for groundwater in the western, central and eastern U.S. regions representing mechanisms associated with arsenic concentrations exceeding the U.S. Environmental Protection Agency maximum contaminant level (MCL) of 10 parts per billion (ppb). Statistically significant covariates were identified via classification and regression tree (CART) analysis, and included hydrometeorological and groundwater chemical parameters. The CART analyses were performed at two scales: national and regional; for which three physiographic regions located in the western (Payette Section and the Snake River Plain), central (Osage Plains of the Central Lowlands), and eastern (Embayed Section of the Coastal Plains) U.S. were examined. Validity of each of the three regional CART models was indicated by values >85% for the area under the receiver-operating characteristic curve. Aridity (precipitation minus potential evapotranspiration) was identified as the primary covariate associated with elevated arsenic at the national scale. At the regional scale, aridity and pH were the major covariates in the arid to semi-arid (western) region; whereas dissolved iron (taken to represent chemically reducing conditions) and pH were major covariates in the temperate (eastern) region, although additional important covariates emerged, including elevated phosphate. Analysis in the central U.S. region indicated that elevated arsenic concentrations were driven by a mixture of those observed in the western and eastern regions. PMID:26803265
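
    The area under the receiver-operating characteristic curve used here as a validity measure can be read as the probability that a randomly chosen exceedance case receives a higher model score than a randomly chosen non-exceedance case. The following is a minimal sketch of that pairwise definition, assuming binary labels (1 = above the MCL, 0 = below); the function name and the example scores are hypothetical and are not taken from the study.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the positive
    sample has the higher score; ties count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical model scores for four wells (illustrative only).
auc = roc_auc([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0])
```

    The double loop is quadratic in the number of samples; it is written this way for clarity rather than speed.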

  19. The suitability of the dual isotope approach (δ13C and δ18O) in tree ring studies

    NASA Astrophysics Data System (ADS)

    Siegwolf, Rolf; Saurer, Matthias

    2016-04-01

    The use of stable isotopes, complementary to tree ring width data in tree ring research has proven to be a powerful tool in studying the impact of environmental parameters on tree physiology and growth. These three proxies are thus instrumental for climate reconstruction and improve the understanding of underlying causes of growth changes. In various cases, however, their use suggests non-plausible interpretations. Often the use of one isotope alone does not allow the detection of such "erroneous isotope responses". A careful analysis of these deviating results shows that either the validity of the carbon isotope discrimination concept is no longer true (Farquhar et al., 1982) or the assumptions for the leaf water enrichment model (Cernusak et al., 2003) are violated and thus both fractionation models are not applicable. In this presentation we discuss such cases when the known fractionation concepts fail and do not allow a correct interpretation of the isotope data. With the help of the dual isotope approach (Scheidegger et al., 2000) we demonstrate how to detect and uncover the causes for such anomalous isotope data. The fractionation concepts and their combinations against the background of CO2 and H2O gas exchange are briefly explained and the specific use of the dual isotope approach for tree ring data analyses and interpretations is demonstrated. References: Cernusak, L. A., Arthur, D. J., Pate, J. S. and Farquhar, G. D.: Water relations link carbon and oxygen isotope discrimination to phloem sap sugar concentration in Eucalyptus globulus, Plant Physiol., 131, 1544-1554, 2003. Farquhar, G. D., O'Leary, M. H. and Berry, J. A.: On the relationship between carbon isotope discrimination and the intercellular carbon dioxide concentration in leaves, Aust. J. Plant Physiol., 9, 121-137, 1982. Scheidegger, Y., Saurer, M., Bahn, M. and Siegwolf, R.: Linking stable oxygen and carbon isotopes with stomatal conductance and photosynthetic capacity: A conceptual model

  20. Alternative standardization approaches to improving streamflow reconstructions with ring-width indices of riparian trees

    USGS Publications Warehouse

    Meko, David M; Friedman, Jonathan M.; Touchan, Ramzi; Edmondson, Jesse R.; Griffin, Eleanor R.; Scott, Julian A.

    2015-01-01

    Old, multi-aged populations of riparian trees provide an opportunity to improve reconstructions of streamflow. Here, ring widths of 394 plains cottonwood (Populus deltoides ssp. monilifera) trees in the North Unit of Theodore Roosevelt National Park, North Dakota, are used to reconstruct streamflow along the Little Missouri River (LMR), North Dakota, US. Different versions of the cottonwood chronology are developed by (1) age-curve standardization (ACS), using age-stratified samples and a single estimated curve of ring width against estimated ring age, and (2) time-curve standardization (TCS), using a subset of longer ring-width series individually detrended with cubic smoothing splines of width against year. The cottonwood chronologies are combined with the first principal component of four upland conifer chronologies developed by conventional methods to investigate the possible value of riparian tree-ring chronologies for streamflow reconstruction of the LMR. Regression modeling indicates that the statistical signal for flow is stronger in the riparian cottonwood than in the upland chronologies. The flow signal from cottonwood complements rather than repeats the signal from upland conifers and is especially strong in young trees (e.g. 5–35 years). Reconstructions using a combination of cottonwoods and upland conifers are found to explain more than 50% of the variance of LMR flow over a 1935–1990 calibration period and to yield reconstruction of flow to 1658. The low-frequency component of reconstructed flow is sensitive to the choice of standardization method for the cottonwood. In contrast to the TCS version, the ACS reconstruction features persistent low flows in the 19th century. Results demonstrate the value to streamflow reconstruction of riparian cottonwood and suggest that more studies are needed to exploit the low-frequency streamflow signal in densely sampled age-stratified stands of riparian trees.

  1. Automatic approach to solve the morphological galaxy classification problem using the sparse representation technique and dictionary learning

    NASA Astrophysics Data System (ADS)

    Diaz-Hernandez, R.; Ortiz-Esquivel, A.; Peregrina-Barreto, H.; Altamirano-Robles, L.; Gonzalez-Bernal, J.

    2016-04-01

    The observation of celestial objects in the sky is a practice that helps astronomers to understand the way in which the Universe is structured. However, due to the large number of observed objects with modern telescopes, the analysis of these by hand is a difficult task. An important part in galaxy research is the morphological structure classification based on the Hubble sequence. In this research, we present an approach to solve the morphological galaxy classification problem in an automatic way by using the Sparse Representation technique and dictionary learning with K-SVD. For the tests in this work, we use a database of galaxies extracted from the Principal Galaxy Catalog (PGC) and the APM Equatorial Catalogue of Galaxies obtaining a total of 2403 useful galaxies. In order to represent each galaxy frame, we propose to calculate a set of 20 features such as Hu's invariant moments, galaxy nucleus eccentricity, gabor galaxy ratio and some other features commonly used in galaxy classification. A stage of feature relevance analysis was performed using Relief-f in order to determine which are the best parameters for the classification tests using 2, 3, 4, 5, 6 and 7 galaxy classes making signal vectors of different length values with the most important features. For the classification task, we use a 20-random cross-validation technique to evaluate classification accuracy with all signal sets achieving a score of 82.27 % for 2 galaxy classes and up to 44.27 % for 7 galaxy classes.

  2. Automatic approach to solve the morphological galaxy classification problem using the sparse representation technique and dictionary learning

    NASA Astrophysics Data System (ADS)

    Diaz-Hernandez, R.; Ortiz-Esquivel, A.; Peregrina-Barreto, H.; Altamirano-Robles, L.; Gonzalez-Bernal, J.

    2016-06-01

    The observation of celestial objects in the sky is a practice that helps astronomers to understand the way in which the Universe is structured. However, due to the large number of observed objects with modern telescopes, the analysis of these by hand is a difficult task. An important part in galaxy research is the morphological structure classification based on the Hubble sequence. In this research, we present an approach to solve the morphological galaxy classification problem in an automatic way by using the Sparse Representation technique and dictionary learning with K-SVD. For the tests in this work, we use a database of galaxies extracted from the Principal Galaxy Catalog (PGC) and the APM Equatorial Catalogue of Galaxies obtaining a total of 2403 useful galaxies. In order to represent each galaxy frame, we propose to calculate a set of 20 features such as Hu's invariant moments, galaxy nucleus eccentricity, gabor galaxy ratio and some other features commonly used in galaxy classification. A stage of feature relevance analysis was performed using Relief-f in order to determine which are the best parameters for the classification tests using 2, 3, 4, 5, 6 and 7 galaxy classes making signal vectors of different length values with the most important features. For the classification task, we use a 20-random cross-validation technique to evaluate classification accuracy with all signal sets achieving a score of 82.27 % for 2 galaxy classes and up to 44.27 % for 7 galaxy classes.

  3. Automated melanoma detection: multispectral imaging and neural network approach for classification.

    PubMed

    Tomatis, Stefano; Bono, Aldo; Bartoli, Cesare; Carrara, Mauro; Lualdi, Manuela; Tragni, Gabrina; Marchesini, Renato

    2003-02-01

    Our aim in the present research is to investigate the diagnostic performance of artificial neural networks (ANNs) applied to multispectral images of cutaneous pigmented skin lesions as well as to compare this approach to a standard traditional linear classification method, such as discriminant function analysis. This study involves a series of 534 patients with 573 cutaneous pigmented lesions (132 melanomas and 441 nonmelanoma lesions). Each lesion was analyzed by a telespectrophotometric system (TS) in vivo, before surgery. The system is able to acquire a set of 17 images at selected wavelengths from 400 to 1040 nm. For each wavelength, five lesion descriptors were extracted, related to the criteria of the ABCD (for asymmetry, border, color, and dimension) clinical guide for melanoma diagnosis. These variables were first reduced in dimension by the use of factor analysis techniques and then used as input data in an ANN. Multivariate discriminant analysis (MDA) was also performed on the same dataset. The whole dataset was split into two independent groups: i.e., train (the first 400 cases, 95 melanomas) and verification set (last 173 cases, 37 melanomas). Factor analysis was able to summarize the data structure into ten variables, accounting for at least 90% of the original parameters variance. After proper training, the ANN was able to classify the population with 80% sensitivity, 72% specificity, and 78% sensitivity, 76% specificity for the train and validation set, respectively. Following ROC analysis, area under curve (AUC) was 0.852 (train) and 0.847 (verify). Sensitivity and specificity values obtained by the standard discriminant analysis classifier resulted in a figure of 80% sensitivity, 60% specificity and 76% sensitivity, 57% specificity for the train and validation set, respectively. AUC for MDA was 0.810 and 0.764 for the train and verify set, respectively. Classification results were significantly different between the two methods both for diagnostic
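
    Sensitivity and specificity, the two figures this abstract reports for each classifier, are ratios taken from the four cells of a binary confusion table. The following is a minimal sketch of that arithmetic; the function name and the counts are hypothetical illustrations and are not the study's data.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from binary confusion counts.

    tp/fn: diseased cases correctly / incorrectly classified.
    tn/fp: non-diseased cases correctly / incorrectly classified.
    """
    sensitivity = tp / (tp + fn)  # share of true cases that were detected
    specificity = tn / (tn + fp)  # share of non-cases correctly ruled out
    return sensitivity, specificity


# Hypothetical counts chosen for round numbers (not from the paper).
sens, spec = sensitivity_specificity(tp=40, fn=10, tn=90, fp=30)
```

    With these illustrative counts, 40 of 50 true cases are detected and 90 of 120 non-cases are correctly excluded.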

  4. Validation of a novel classification model of psychogenic nonepileptic seizures by video-EEG analysis and a machine learning approach.

    PubMed

    Magaudda, Adriana; Laganà, Angela; Calamuneri, Alessandro; Brizzi, Teresa; Scalera, Cinzia; Beghi, Massimiliano; Cornaggia, Cesare Maria; Di Rosa, Gabriella

    2016-07-01

    The aim of this study was to validate a novel classification for the diagnosis of PNESs. Fifty-five PNES video-EEG recordings were retrospectively analyzed by four epileptologists and one psychiatrist in a blind manner and classified into four distinct groups: Hypermotor (H), Akinetic (A), Focal Motor (FM), and with Subjective Symptoms (SS). Eleven signs and symptoms, which are frequently found in PNESs, were chosen for statistical validation of our classification. An artificial neural network (ANN) analyzed PNES video recordings based on the signs and symptoms mentioned above. By comparing results produced by the ANN with classifications given by examiners, we were able to understand whether such classification was objective and generalizable. Through accordance metrics based on signs and symptoms (range: 0-100%), we found that most of the seizures belonging to class A showed a high degree of accordance (mean±SD=73%±5%); a similar pattern was found for class SS (80%); slightly lower accordance was reported for class H (58%±18%), with a minimum of 30% in some cases. Low agreement arose from the FM group. Seizures were univocally assigned to a given class in 83.6% of seizures. The ANN classified PNESs in the same way as visual examination in 86.7%. Agreement between ANN classification and visual classification reached 83.3% (SD=17.8%) accordance for class H, 100% (SD=22%) for class A, 83.3% (SD=21.2%) for class SS, and 50% (SD=19.52%) for class FM. This is the first study in which the validity of a new PNES classification was established and reached in two different ways. Video-EEG evaluation needs to be performed by an experienced clinician, but later on, it may be fed into ANN analysis, whose feedback will provide guidance for differential diagnosis. Our analysis, supported by the ML approach, showed that this model of classification could be objectively performed by video-EEG examination. PMID:27208925

  5. An Effective Big Data Supervised Imbalanced Classification Approach for Ortholog Detection in Related Yeast Species.

    PubMed

    Galpert, Deborah; Del Río, Sara; Herrera, Francisco; Ancede-Gallardo, Evys; Antunes, Agostinho; Agüero-Chapin, Guillermin

    2015-01-01

    Orthology detection requires more effective scaling algorithms. In this paper, a set of gene pair features based on similarity measures (alignment scores, sequence length, gene membership to conserved regions, and physicochemical profiles) are combined in a supervised pairwise ortholog detection approach to improve effectiveness considering low ortholog ratios in relation to the possible pairwise comparison between two genomes. In this scenario, big data supervised classifiers managing imbalance between ortholog and nonortholog pair classes allow for an effective scaling solution built from two genomes and extended to other genome pairs. The supervised approach was compared with RBH, RSD, and OMA algorithms by using the following yeast genome pairs: Saccharomyces cerevisiae-Kluyveromyces lactis, Saccharomyces cerevisiae-Candida glabrata, and Saccharomyces cerevisiae-Schizosaccharomyces pombe as benchmark datasets. Because of the large amount of imbalanced data, the building and testing of the supervised model were only possible by using big data supervised classifiers managing imbalance. Evaluation metrics taking low ortholog ratios into account were applied. From the effectiveness perspective, MapReduce Random Oversampling combined with Spark SVM outperformed RBH, RSD, and OMA, probably because of the consideration of gene pair features beyond alignment similarities combined with the advances in big data supervised classification. PMID:26605337
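
    Random oversampling, the resampling step named in this abstract, balances the classes by duplicating randomly chosen minority-class samples until every class matches the size of the largest one. The following is a minimal single-machine sketch of that general idea; it is not the MapReduce implementation used in the study, and the function name, seed parameter and toy data are hypothetical.

```python
import random


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until all classes are equal in size."""
    rng = random.Random(seed)
    # Group the samples by class label.
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())
    out_samples, out_labels = [], []
    for y, xs in by_class.items():
        # Draw with replacement to fill the gap up to the majority size.
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_samples.append(x)
            out_labels.append(y)
    return out_samples, out_labels


# Toy data: 8 non-ortholog pairs (label 0) and 2 ortholog pairs (label 1).
toy_samples = list(range(10))
toy_labels = [0] * 8 + [1] * 2
balanced_samples, balanced_labels = random_oversample(toy_samples, toy_labels)
```

    Oversampling should be applied to the training data only, so that duplicated samples do not leak into the evaluation set.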

  6. An Effective Big Data Supervised Imbalanced Classification Approach for Ortholog Detection in Related Yeast Species

    PubMed Central

    Galpert, Deborah; del Río, Sara; Herrera, Francisco; Ancede-Gallardo, Evys; Antunes, Agostinho; Agüero-Chapin, Guillermin

    2015-01-01

    Orthology detection requires more effective scaling algorithms. In this paper, a set of gene pair features based on similarity measures (alignment scores, sequence length, gene membership to conserved regions, and physicochemical profiles) are combined in a supervised pairwise ortholog detection approach to improve effectiveness considering low ortholog ratios in relation to the possible pairwise comparison between two genomes. In this scenario, big data supervised classifiers managing imbalance between ortholog and nonortholog pair classes allow for an effective scaling solution built from two genomes and extended to other genome pairs. The supervised approach was compared with RBH, RSD, and OMA algorithms by using the following yeast genome pairs: Saccharomyces cerevisiae-Kluyveromyces lactis, Saccharomyces cerevisiae-Candida glabrata, and Saccharomyces cerevisiae-Schizosaccharomyces pombe as benchmark datasets. Because of the large amount of imbalanced data, the building and testing of the supervised model were only possible by using big data supervised classifiers managing imbalance. Evaluation metrics taking low ortholog ratios into account were applied. From the effectiveness perspective, MapReduce Random Oversampling combined with Spark SVM outperformed RBH, RSD, and OMA, probably because of the consideration of gene pair features beyond alignment similarities combined with the advances in big data supervised classification. PMID:26605337

  7. An Abstract Description Approach to the Discovery and Classification of Bioinformatics Web Sources

    SciTech Connect

    Rocco, D; Critchlow, T J

    2003-05-01

    The World Wide Web provides an incredible resource to genomics researchers in the form of dynamic data sources--e.g. BLAST sequence homology search interfaces. The growth rate of these sources outpaces the speed at which they can be manually classified, meaning that the available data is not being utilized to its full potential. Existing research has not addressed the problems of automatically locating, classifying, and integrating classes of bioinformatics data sources. This paper presents an overview of a system for finding classes of bioinformatics data sources and integrating them behind a unified interface. We examine an approach to classifying these sources automatically that relies on an abstract description format: the service class description. This format allows a domain expert to describe the important features of an entire class of services without tying that description to any particular Web source. We present the features of this description format in the context of BLAST sources to show how the service class description relates to Web sources that are being described. We then show how a service class description can be used to classify an arbitrary Web source to determine if that source is an instance of the described service. To validate the effectiveness of this approach, we have constructed a prototype that can correctly classify approximately two-thirds of the BLAST sources we tested. We then examine these results, consider the factors that affect correct automatic classification, and discuss future work.

  8. Multi-locus tree and species tree approaches toward resolving a complex clade of downy mildews (Straminipila, Oomycota), including pathogens of beet and spinach.

    PubMed

    Choi, Young-Joon; Klosterman, Steven J; Kummer, Volker; Voglmayr, Hermann; Shin, Hyeon-Dong; Thines, Marco

    2015-05-01

    Accurate species determination of plant pathogens is a prerequisite for their control and quarantine, and further for assessing their potential threat to crops. The family Peronosporaceae (Straminipila; Oomycota) consists of obligate biotrophic pathogens that cause downy mildew disease on angiosperms, including a large number of cultivated plants. In the largest downy mildew genus Peronospora, a phylogenetically complex clade includes the economically important downy mildew pathogens of spinach and beet, as well as the type species of the genus Peronospora. To resolve this complex clade at the species level and to infer evolutionary relationships among them, we used multi-locus phylogenetic analysis and species tree estimation. Both approaches discriminated all nine currently accepted species and revealed four previously unrecognized lineages, which are specific to a host genus or species. This is in line with a narrow species concept, i.e. that a downy mildew species is associated with only a particular host plant genus or species. Instead of applying the dubious name Peronospora farinosa, which has been proposed for formal rejection, our results provide strong evidence that Peronospora schachtii is an independent species from lineages on Atriplex and apparently occurs exclusively on Beta vulgaris. The members of the clade investigated, the Peronospora rumicis clade, associate with three different host plant families, Amaranthaceae, Caryophyllaceae, and Polygonaceae, suggesting that they may have speciated following at least two recent inter-family host shifts, rather than contemporary cospeciation with the host plants. PMID:25772799

  9. An automatic indexing and neural network approach to concept retrieval and classification of multilingual (Chinese-English) documents.

    PubMed

    Lin, C H; Chen, H

    1996-01-01

    An automatic indexing and concept classification approach to a multilingual (Chinese and English) bibliographic database is presented. We introduced a multi-linear term-phrasing technique to extract concept descriptors (terms or keywords) from a Chinese-English bibliographic database. A concept space of related descriptors was then generated using a co-occurrence analysis technique. Like a man-made thesaurus, the system-generated concept space can be used to generate additional semantically-relevant terms for search. For concept classification and clustering, a variant of a Hopfield neural network was developed to cluster similar concept descriptors and to generate a small number of concept groups to represent (summarize) the subject matter of the database. The concept space approach to information classification and retrieval has been adopted by the authors in other scientific databases and business applications, but multilingual information retrieval presents a unique challenge. This research reports our experiment on multilingual databases. Our system was initially developed in the MS-DOS environment, running ETEN Chinese operating system. For performance reasons, it was then tested on a UNIX-based system. Due to the unique ideographic nature of the Chinese language, a Chinese term-phrase indexing paradigm considering the ideographic characteristics of Chinese was developed as a multilingual information classification model. By applying the neural network based concept classification technique, the model presents a novel way of organizing unstructured multilingual information. PMID:18263007

  10. Data mining approach identifies research priorities and data requirements for resolving the red algal tree of life

    PubMed Central

    2010-01-01

    Background: The assembly of the tree of life has seen significant progress in recent years but algae and protists have been largely overlooked in this effort. Many groups of algae and protists have ancient roots and it is unclear how much data will be required to resolve their phylogenetic relationships for incorporation in the tree of life. The red algae, a group of primary photosynthetic eukaryotes of more than a billion years old, provide the earliest fossil evidence for eukaryotic multicellularity and sexual reproduction. Despite this evolutionary significance, their phylogenetic relationships are understudied. This study aims to infer a comprehensive red algal tree of life at the family level from a supermatrix containing data mined from GenBank. We aim to locate remaining regions of low support in the topology, evaluate their causes and estimate the amount of data required to resolve them. Results: Phylogenetic analysis of a supermatrix of 14 loci and 98 red algal families yielded the most complete red algal tree of life to date. Visualization of statistical support showed the presence of five poorly supported regions. Causes for low support were identified with statistics about the age of the region, data availability and node density, showing that poor support has different origins in different parts of the tree. Parametric simulation experiments yielded optimistic estimates of how much data will be needed to resolve the poorly supported regions (ca. 10^3 to ca. 10^4 nucleotides for the different regions). Nonparametric simulations gave a markedly more pessimistic image, some regions requiring more than 2.8 × 10^5 nucleotides or not achieving the desired level of support at all. The discrepancies between parametric and nonparametric simulations are discussed in light of our dataset and known attributes of both approaches. Conclusions: Our study takes the red algae one step closer to meaningful inclusion in the tree of life. In addition to the recovery of stable

  11. A discriminative model-constrained EM approach to 3D MRI brain tissue classification and intensity non-uniformity correction

    NASA Astrophysics Data System (ADS)

    Wels, Michael; Zheng, Yefeng; Huber, Martin; Hornegger, Joachim; Comaniciu, Dorin

    2011-06-01

    We describe a fully automated method for tissue classification, which is the segmentation into cerebral gray matter (GM), cerebral white matter (WM), and cerebral spinal fluid (CSF), and intensity non-uniformity (INU) correction in brain magnetic resonance imaging (MRI) volumes. It combines supervised MRI modality-specific discriminative modeling and unsupervised statistical expectation maximization (EM) segmentation into an integrated Bayesian framework. While both the parametric observation models and the non-parametrically modeled INUs are estimated via EM during segmentation itself, a Markov random field (MRF) prior model regularizes segmentation and parameter estimation. Firstly, the regularization takes into account knowledge about spatial and appearance-related homogeneity of segments in terms of pairwise clique potentials of adjacent voxels. Secondly and more importantly, patient-specific knowledge about the global spatial distribution of brain tissue is incorporated into the segmentation process via unary clique potentials. They are based on a strong discriminative model provided by a probabilistic boosting tree (PBT) for classifying image voxels. It relies on the surrounding context and alignment-based features derived from a probabilistic anatomical atlas. The context considered is encoded by 3D Haar-like features of reduced INU sensitivity. Alignment is carried out fully automatically by means of an affine registration algorithm minimizing cross-correlation. Both types of features do not immediately use the observed intensities provided by the MRI modality but instead rely on specifically transformed features, which are less sensitive to MRI artifacts. Detailed quantitative evaluations on standard phantom scans and standard real-world data show the accuracy and robustness of the proposed method. They also demonstrate relative superiority in comparison to other state-of-the-art approaches to this kind of computational task: our method achieves average

  12. Tropical dendrochemistry: A novel approach for reconstructing seasonally-resolved growth rates from ringless tropical trees

    NASA Astrophysics Data System (ADS)

    Poussart, P. M.; Myneni, S. C.

    2005-12-01

    Although tropical forests play an active role in the global carbon cycle and are host to a variety of pristine paleoclimate archives, they remain poorly characterized as compared to other ecosystems on the planet. In particular, dating and reconstructing the growth rate history of tropical trees remains a challenge and continues to delay research efforts towards understanding tropical forest dynamics. Traditional dendrochronological techniques have found limited applications in the tropics because temperature seasonality is often too small to initiate the production of visible annual growth rings. Dendrometers, cambium scarring methods and sub-annual records of oxygen and carbon isotopes from tree cellulose may be used to estimate growth rate histories when growth rings are absent. However, dendrometer records rarely extend beyond the past couple of decades and the generation of seasonally-resolved isotopic records remains labour intensive, currently prohibiting the level of record replication necessary for statistical analysis. Here, we present evidence that Ca may also be used as a proxy for dating and reconstructing growth rates of trees lacking visible growth rings. Using the Brookhaven National Lab Synchrotron, we recover a radial record of cyclic variations in Ca from a Miliusa velutina tree from northern Thailand. We determine that the Ca cycles are seasonal based on a comparison between radiocarbon age estimates and a trace element age model, which agree within 2 years over the period of 1955 to 2000. The amplitude of the Ca annual cycle is significantly correlated with growth rate estimates, which are also correlated to the amount of dry season rainfall. The measurements at the Synchrotron are fast, non-destructive and require little sample preparation. Application of this technique in the tropics holds the potential to resolve longstanding questions about tropical forest dynamics and interannual to decadal changes in the carbon cycle.

  13. A differential geometric approach to automated segmentation of human airway tree.

    PubMed

    Pu, Jiantao; Fuhrman, Carl; Good, Walter F; Sciurba, Frank C; Gur, David

    2011-02-01

    Airway diseases are frequently associated with morphological changes that may affect the physiology of the lungs. Accurate characterization of airways may be useful for quantitatively assessing prognosis and for monitoring therapeutic efficacy. The information gained may also provide insight into the underlying mechanisms of various lung diseases. We developed a computerized scheme to automatically segment the 3-D human airway tree depicted on computed tomography (CT) images. The method takes advantage of both principal curvatures and principal directions in differentiating airways from other tissues in geometric space. A "puzzle game" procedure is used to identify false negative regions and reduce false positive regions that do not meet the shape analysis criteria. The negative impact of partial volume effects on small airway detection is partially alleviated by repeating the developed differential geometric analysis on lung anatomical structures modeled at multiple iso-values (thresholds). In addition to having advantages, such as full automation, easy implementation and relative insensitivity to image noise and/or artifacts, this scheme has virtually no leakage issues and can be easily extended to the extraction or the segmentation of other tubular type structures (e.g., vascular tree). The performance of this scheme was assessed quantitatively using 75 chest CT examinations acquired on 45 subjects with different slice thicknesses and using 20 publicly available test cases that were originally designed for evaluating the performance of different airway tree segmentation algorithms. PMID:20851792

  14. A Differential Geometric Approach to Automated Segmentation of Human Airway Tree

    PubMed Central

    Pu, Jiantao; Fuhrman, Carl; Good, Walter F; Sciurba, Frank C; Gur, David

    2012-01-01

    Airway diseases are frequently associated with morphological changes that may affect the physiology of the lungs. Accurate characterization of airways may be useful for quantitatively assessing prognosis and for monitoring therapeutic efficacy. The information gained may also provide insight into the underlying mechanisms of various lung diseases. We developed a computerized scheme to automatically segment the three-dimensional human airway tree depicted on CT images. The method takes advantage of both principal curvatures and principal directions in differentiating airways from other tissues in geometric space. A “puzzle game” procedure is used to identify false negative regions and reduce false positive regions that do not meet the shape analysis criteria. The negative impact of partial volume effects on small airway detection is partially alleviated by repeating the developed differential geometric analysis on lung anatomical structures modeled at multiple iso-values (thresholds). In addition to having advantages, such as full automation, easy implementation and relative insensitivity to image noise and/or artifacts, this scheme has virtually no leakage issues and can be easily extended to the extraction or the segmentation of other tubular type structures (e.g., vascular tree). The performance of this scheme was assessed quantitatively using 75 chest CT examinations acquired on 45 subjects with different slice thicknesses and using 20 publicly available test cases that were originally designed for evaluating the performance of different airway tree segmentation algorithms. PMID:20851792

  15. Buildings classification from airborne LiDAR point clouds through OBIA and ontology driven approach

    NASA Astrophysics Data System (ADS)

    Tomljenovic, Ivan; Belgiu, Mariana; Lampoltshammer, Thomas J.

    2013-04-01

    In recent years, airborne Light Detection and Ranging (LiDAR) data proved to be a valuable information resource for a vast number of applications ranging from land cover mapping to individual surface feature extraction from complex urban environments. To extract information from LiDAR data, users apply prior knowledge. Unfortunately, there is no consistent initiative for structuring this knowledge into data models that can be shared and reused across different applications and domains. The absence of such models poses great challenges to data interpretation, data fusion and integration as well as information transferability. The intention of this work is to describe the design, development and deployment of an ontology-based system to classify buildings from airborne LiDAR data. The novelty of this approach consists of the development of a domain ontology that specifies explicitly the knowledge used to extract features from airborne LiDAR data. The overall goal of this approach is to investigate the possibility for classification of features of interest from LiDAR data by means of domain ontology. The proposed workflow is applied to the building extraction process for the region of "Biberach an der Riss" in South Germany. Strip-adjusted and georeferenced airborne LiDAR data is processed based on geometrical and radiometric signatures stored within the point cloud. Region-growing segmentation algorithms are applied and segmented regions are exported to the GeoJSON format. Subsequently, the data is imported into the ontology-based reasoning process used to automatically classify exported features of interest. Based on the ontology it becomes possible to define domain concepts, associated properties and relations. As a consequence, the resulting specific body of knowledge restricts possible interpretation variants. Moreover, ontologies are machine-readable and thus it is possible to run reasoning on top of them. Available reasoners (FACT++, JESS, Pellet) are used to check

  16. Machine learning in soil classification.

    PubMed

    Bhattacharya, B; Solomatine, D P

    2006-03-01

    In a number of engineering problems, e.g. in geotechnics and petroleum engineering, intervals of measured series data (signals) must be assigned a class while maintaining the constraint of contiguity, and standard classification methods can be inadequate. Classification in this case requires an expert who observes the magnitude and trends of the signals in addition to any a priori information that might be available. In this paper, an approach for automating this classification procedure is presented. Firstly, a segmentation algorithm is developed and applied to segment the measured signals. Secondly, the salient features of these segments are extracted using the boundary energy method. Classifiers are then built to assign classes to the segments based on the measured data and extracted features; they employ Decision Trees, ANN and Support Vector Machines. The methodology was tested in classifying sub-surface soil using measured data from Cone Penetration Testing and satisfactory results were obtained. PMID:16530382

  17. Tree Testing of Hierarchical Menu Structures for Health Applications

    PubMed Central

    Le, Thai; Chaudhuri, Shomir; Chung, Jane; Thompson, Hilaire J; Demiris, George

    2014-01-01

    To address the need for greater evidence-based evaluation of Health Information Technology (HIT) systems we introduce a method of usability testing termed tree testing. In a tree test, participants are presented with an abstract hierarchical tree of the system taxonomy and asked to navigate through the tree in completing representative tasks. We apply tree testing to a commercially available health application, demonstrating a use case and providing a comparison with more traditional in-person usability testing methods. Online tree tests (N=54) and in-person usability tests (N=15) were conducted from August to September 2013. Tree testing provided a method to quantitatively evaluate the information structure of a system using various navigational metrics including completion time, task accuracy, and path length. The results of the analyses compared favorably to the results seen from the traditional usability test. Tree testing provides a flexible, evidence-based approach for researchers to evaluate the information structure of HITs. In addition, remote tree testing provides a quick, flexible, and high volume method of acquiring feedback in a structured format that allows for quantitative comparisons. With the diverse nature and often large quantities of health information available, addressing issues of terminology and concept classifications during the early development process of a health information system will improve navigation through the system and save future resources. Tree testing is a usability method that can be used to quickly and easily assess information hierarchy of health information systems. PMID:24582924
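    The navigational metrics named in this abstract (completion time, task accuracy, path length) can be illustrated with a minimal sketch. The record layout, node names, and numbers below are hypothetical and are not taken from the study; path length is assumed here to mean the number of moves between nodes.

```python
from statistics import mean

# Hypothetical records: one dict per participant attempt at a single task.
# "path" is the sequence of tree nodes visited; "target" is the correct node.
attempts = [
    {"seconds": 12.0, "path": ["Home", "Records", "Lab results"], "target": "Lab results"},
    {"seconds": 30.0, "path": ["Home", "Messages", "Home", "Records", "Lab results"], "target": "Lab results"},
    {"seconds": 18.0, "path": ["Home", "Messages", "Inbox"], "target": "Lab results"},
]

def summarize(attempts):
    """Return mean completion time, task accuracy, and mean path length."""
    # An attempt counts as correct when the last node visited is the target.
    correct = [a["path"][-1] == a["target"] for a in attempts]
    return {
        "mean_seconds": mean(a["seconds"] for a in attempts),
        "accuracy": sum(correct) / len(attempts),
        # Path length counted as moves between nodes (assumption), not nodes visited.
        "mean_path_length": mean(len(a["path"]) - 1 for a in attempts),
    }

summary = summarize(attempts)
```

    Comparing the mean path length with the shortest possible path to the target is one way such a summary could flag confusing menu branches.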

  18. Nitrogen isotopes in Tree-Rings - An approach combining soil biogeochemistry and isotopic long series with statistical modeling

    NASA Astrophysics Data System (ADS)

    Savard, Martine M.; Bégin, Christian; Paré, David; Marion, Joëlle; Laganière, Jérôme; Séguin, Armand; Stefani, Franck; Smirnoff, Anna

    2016-04-01

    Monitoring atmospheric emissions from industrial centers in North America generally started less than 25 years ago. To compensate for the lack of monitoring, previous investigations have interpreted tree-ring N changes using the known chronology of human activities, without facing the challenge of separating climatic effects from potential anthropogenic impacts. Here we document such an attempt conducted in the oil sands (OS) mining region of Northeastern Alberta, Canada. The reactive nitrogen (Nr)-emitting oil extraction operations began in 1967, but air quality measurements were only initiated in 1997. To investigate if the beginning and intensification of OS operations induced changes in the forest N-cycle, we sampled white spruce (Picea glauca (Moench) Voss) stands located at various distances from the main mining area, and receiving low, but different N deposition. Our approach combines soil biogeochemical and metagenomic characterization with long, well dated, tree-ring isotopic series. To objectively delineate the natural N isotopic behaviour in trees, we have characterized tree-ring N isotope (15N/14N) ratios between 1880 and 2009, used statistical analyses of the isotopic values and local climatic parameters of the pre-mining period to calibrate response functions and project the isotopic responses to climate during the extraction period. During that period, the measured series depart negatively from the projected natural trends. In addition, these long-term negative isotopic trends are better reproduced by multiple-regression models combining climatic parameters with the proxy for regional mining Nr emissions. These negative isotopic trends point towards changes in the forest soil biogeochemical N cycle. The biogeochemical data and ultimate soil mechanisms responsible for such changes will be discussed during the presentation.
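    The calibrate-then-project idea in this abstract can be sketched as follows. This is a simplified illustration with a single climate variable and ordinary least squares; the study's actual response functions are not specified in the abstract, and every number below is synthetic, chosen only to make the arithmetic easy to follow.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

# Synthetic series (NOT study data): the isotope value follows climate exactly,
# then shifts by -1.0 from 1967 onward to mimic a non-climatic influence.
years = list(range(1960, 1974))
climate = [float(y % 5) for y in years]
isotope = [2.0 + 0.5 * c - (1.0 if y >= 1967 else 0.0) for y, c in zip(years, climate)]

pre = [i for i, y in enumerate(years) if y < 1967]    # calibration period
post = [i for i, y in enumerate(years) if y >= 1967]  # projection period

# Calibrate on the pre-1967 period only, then project the climate-driven response.
intercept, slope = fit_line([climate[i] for i in pre], [isotope[i] for i in pre])
departure = [isotope[i] - (intercept + slope * climate[i]) for i in post]
```

    In this toy series every entry of `departure` is close to -1.0, which is the kind of negative departure from the projected natural trend that the abstract describes.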

  19. Linking Tree Growth Response to Measured Microclimate - A Field Based Approach

    NASA Astrophysics Data System (ADS)

    Martin, J. T.; Hoylman, Z. H.; Looker, N. T.; Jencso, K. G.; Hu, J.

    2015-12-01

    The general relationship between climate and tree growth is a well established and important tenet shaping both paleo and future perspectives of forest ecosystem growth dynamics. Across much of the American west, water limits growth via physiological mechanisms that tie regional and local climatic conditions to forest productivity in a relatively predictable way, and these growth responses are clearly evident in tree ring records. However, within the annual cycle of a forest landscape, water availability varies across both time and space, and interacts with other potentially growth limiting factors such as temperature, light, and nutrients. In addition, tree growth responses may lag climate drivers and may vary in terms of where in a tree carbon is allocated. As such, determining when and where water actually limits forest growth in real time can be a significant challenge. Despite these challenges, we present data suggestive of real-time growth limitation driven by soil moisture supply and atmospheric water demand reflected in high frequency field measurements of stem radii and cell structure across ecological gradients. The experiment was conducted at the Lubrecht Experimental Forest in western Montana where, over two years, we observed intra-annual growth rates of four dominant conifer species: Douglas fir, Ponderosa Pine, Engelmann Spruce and Western Larch using point dendrometers and microcores. In all four species studied, compensatory use of stored water (inferred from stem water deficit) appears to exhibit a threshold relationship with a critical balance point between water supply and demand. The occurrence of this point in time coincided with a decrease in stem growth rates, and while the timing varied by up to one month across topographic and elevational gradients, the onset date of growth limitation was a reliable predictor of overall annual growth. Our findings support previous model-based observations of nonlinearity in the relationship between

  20. Downscaling Transpiration from the Field to the Tree Scale using the Neural Network Approach

    NASA Astrophysics Data System (ADS)

    Hopmans, J. W.

    2015-12-01

    Estimating actual evapotranspiration (ETa) spatial variability in orchards is key when trying to quantify water (and associated nutrients) leaching, both with the mass balance and inverse modeling methods. ETa measurements however generally occur at larger scales (e.g. Eddy-covariance method) or have a limited quantitative accuracy. In this study we propose to establish a statistical relation between field ETa and field averaged variables known to be closely related to it, such as stem water potential (WP), soil water storage (WS) and ETc. For that we use 4 years of soil and almond trees water status data to train artificial neural networks (ANNs) predicting field scale ETa and downscale the relation to the individual tree scale. ANNs composed of only two neurons in a hidden layer (11 parameters in total) proved to be the most accurate (overall RMSE = 0.0246 mm/h, R² = 0.944), seemingly because adding more neurons generated overfitting of noise in the training dataset. According to the optimized weights in the best ANNs, the first hidden neuron could be considered in charge of relaying the ETc information while the other one would deal with the water stress response to stem WP, soil WS, and ETc. As individual trees had specific signatures for combinations of these variables, variability was generated in their ETa responses. The relative canopy cover was the main source of variability of ETa while stem WP was the most influential factor for the ETa / ETc ratio. Trees on drip-irrigated side of the orchard appeared to be less affected by low estimated soil WS in the root zone than on the fanjet micro-sprinklers side, possibly due to a combination of (i) more substantial root biomass increasing the plant hydraulic conductance, (ii) bias in the soil WS estimation due to soil moisture heterogeneity on the drip-side, and (iii) the access to deeper water resource. Tree scale ETa responses are in good agreement with soil-plant water relations reported in the literature, and
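    The figure of 11 parameters for a network with two hidden neurons can be checked with a short sketch. It assumes three inputs (stem WP, soil WS, ETc) and one output (ETa), which is consistent with the abstract but not stated in it as the exact architecture; the tanh activation is likewise an assumption.

```python
import math

def count_parameters(n_inputs, n_hidden, n_outputs=1):
    """Weights plus biases of a one-hidden-layer feed-forward network."""
    hidden = n_inputs * n_hidden + n_hidden      # input-to-hidden weights + hidden biases
    output = n_hidden * n_outputs + n_outputs    # hidden-to-output weights + output biases
    return hidden + output

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass with tanh hidden units (assumed) and a linear output."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(w_hidden, b_hidden)]
    return sum(w * hi for w, hi in zip(w_out, h)) + b_out

# Assumed layout: 3 inputs, 2 hidden neurons, 1 output.
# 3*2 weights + 2 biases + 2*1 weights + 1 bias = 11
n_params = count_parameters(n_inputs=3, n_hidden=2)
```

    Under the same assumptions a third hidden neuron would raise the count to 16, which illustrates how quickly the parameter count grows relative to a small training set.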

  1. Toward the Improvement of Trail Classification in National Parks Using the Recreation Opportunity Spectrum Approach

    NASA Astrophysics Data System (ADS)

    Oishi, Yoshitaka

    2013-06-01

    Trail settings in national parks are essential management tools for improving both ecological conservation efforts and the quality of visitor experiences. This study proposes a plan for the appropriate maintenance of trails in Chubusangaku National Park, Japan, based on the recreation opportunity spectrum (ROS) approach. First, we distributed 452 questionnaires to determine park visitors' preferences for setting a trail (response rate = 68 %). Respondents' preferences were then evaluated according to the following seven parameters: access, remoteness, naturalness, facilities and site management, social encounters, visitor impact, and visitor management. Using nonmetric multidimensional scaling and cluster analysis, the visitors were classified into seven groups. Last, we classified the actual trails according to the visitor questionnaire criteria to examine the discrepancy between visitors' preferences and actual trail settings. The actual trail classification indicated that while most developed trails were located in accessible places, primitive trails were located in remote areas. However, interestingly, two visitor groups seemed to prefer a well-conserved natural environment and, simultaneously, easily accessible trails. This finding does not correspond to a premise of the ROS approach, which supposes that primitive trails should be located in remote areas without ready access. Based on this study's results, we propose that creating trails that afford visitors the opportunity to experience a well-conserved natural environment in accessible areas is a useful means of providing visitors with diverse recreation opportunities. The process of data collection and analysis in this study can be one approach to produce ROS maps for providing visitors with recreational opportunities of greater diversity and higher quality.

  2. Phylogeny and Classification of the Trapdoor Spider Genus Myrmekiaphila: An Integrative Approach to Evaluating Taxonomic Hypotheses

    PubMed Central

    Bailey, Ashley L.; Brewer, Michael S.; Hendrixson, Brent E.; Bond, Jason E.

    2010-01-01

    Background Revised by Bond and Platnick in 2007, the trapdoor spider genus Myrmekiaphila comprises 11 species. Species delimitation and placement within one of three species groups was based on modifications of the male copulatory device. Because a phylogeny of the group was not available these species groups might not represent monophyletic lineages; species definitions likewise were untested hypotheses. The purpose of this study is to reconstruct the phylogeny of Myrmekiaphila species using molecular data to formally test the delimitation of species and species-groups. We seek to refine a set of established systematic hypotheses by integrating across molecular and morphological data sets. Methods and Findings Phylogenetic analyses comprising Bayesian searches were conducted for a mtDNA matrix composed of contiguous 12S rRNA, tRNA-val, and 16S rRNA genes and a nuclear DNA matrix comprising the glutamyl and prolyl tRNA synthetase gene each consisting of 1348 and 481 bp, respectively. Separate analyses of the mitochondrial and nuclear genome data and a concatenated data set yield M. torreya and M. millerae paraphyletic with respect to M. coreyi and M. howelli and polyphyletic fluviatilis and foliata species groups. Conclusions Despite the perception that molecular data present a solution to a crisis in taxonomy, studies like this demonstrate the efficacy of an approach that considers data from multiple sources. A DNA barcoding approach during the species discovery process would fail to recognize at least two species (M. coreyi and M. howelli) whereas a combined approach more accurately assesses species diversity and illuminates speciation pattern and process. Concomitantly these data also demonstrate that morphological characters likewise fail in their ability to recover monophyletic species groups and result in an unnatural classification. Optimizations of these characters demonstrate a pattern of “Dollo evolution” wherein a complex character evolves only once

  3. Adhesive restorations in the posterior area with subgingival cervical margins: new classification and differentiated treatment approach.

    PubMed

    Veneziani, Marco

    2010-01-01

    The aim of this article is to analyze some of the issues related to the adhesive restoration of teeth with deep cervical and/or subgingival margins in the posterior area. Three different problems tend to occur during restoration: loss of dental substance, detection of subgingival cervical margins, and dentin sealing of the cervical margins. These conditions, together with the presence of medium/large-sized cavities associated with cuspal involvement and absence of cervical enamel, are indications for indirect adhesive restorations. Subgingival margins are associated with biological and technical problems such as difficulty in isolating the working field with a dental dam, adhesion procedures, impression taking, and final positioning of the restoration itself. A new classification is suggested based on two clinical parameters: 1) a technical-operative parameter (possibility of correct isolation through the dental dam) and 2) a biological parameter (depending on the biologic width). Three different clinical situations and three different therapeutic approaches are identified (1st, 2nd, and 3rd, respectively): coronal relocation of the margin, surgical exposure of the margin, and clinical crown lengthening. The latter is associated with three further operative sequences: immediate, early, or delayed impression taking. The different therapeutic options are described and illustrated by several clinical cases. The surgical-restorative approach, whereby surgery is strictly associated with buildup, onlay preparation, and impression taking is particularly interesting. The restoration is cemented after only 1 week. This approach makes it possible to speed up the therapy by eliminating the intermediate phases associated with positioning the provisional restorations, and with fast and efficient healing of the soft marginal tissue. PMID:20305873

  4. Detection of dispersed radio pulses: a machine learning approach to candidate identification and classification

    NASA Astrophysics Data System (ADS)

    Devine, Thomas Ryan; Goseva-Popstojanova, Katerina; McLaughlin, Maura

    2016-06-01

    Searching for extraterrestrial, transient signals in astronomical data sets is an active area of current research. However, machine learning techniques are lacking in the literature concerning single-pulse detection. This paper presents a new, two-stage approach for identifying and classifying dispersed pulse groups (DPGs) in single-pulse search output. The first stage identified DPGs and extracted features to characterize them using a new peak identification algorithm which tracks sloping tendencies around local maxima in plots of signal-to-noise ratio versus dispersion measure. The second stage used supervised machine learning to classify DPGs. We created four benchmark data sets: one unbalanced and three balanced versions using three different imbalance treatments. We empirically evaluated 48 classifiers by training and testing binary and multiclass versions of six machine learning algorithms on each of the four benchmark versions. While each classifier had advantages and disadvantages, all classifiers with imbalance treatments had higher recall values than those with unbalanced data, regardless of the machine learning algorithm used. Based on the benchmarking results, we selected a subset of classifiers to classify the full, unlabelled data set of over 1.5 million DPGs identified in 42 405 observations made by the Green Bank Telescope. Overall, the classifiers using a multiclass ensemble tree learner in combination with two oversampling imbalance treatments were the most efficient; they identified additional known pulsars not in the benchmark data set and provided six potential discoveries, with significantly fewer false positives than the other classifiers.
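    The count of 48 classifiers follows from six algorithms, two label schemes, and four benchmark versions, as the sketch below enumerates. The algorithm and treatment names are placeholders because the abstract does not list them, and the random oversampling helper is only one generic imbalance treatment, assumed here for illustration rather than taken from the paper.

```python
import random
from itertools import product

# Placeholder names: the abstract does not name the learners or the treatments.
algorithms = [f"algorithm_{i}" for i in range(1, 7)]
modes = ["binary", "multiclass"]
datasets = ["unbalanced", "treatment_1", "treatment_2", "treatment_3"]

configs = list(product(algorithms, modes, datasets))  # 6 * 2 * 4 = 48

def random_oversample(labels, seed=0):
    """Return row indices in which minority classes are duplicated at random
    until every class has as many rows as the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    target = max(len(idx) for idx in by_class.values())
    out = []
    for idx in by_class.values():
        out.extend(idx)                                                  # keep every original row
        out.extend(rng.choice(idx) for _ in range(target - len(idx)))   # pad the minority class
    return out

# Toy labels: 8 noise examples and 2 pulsar examples.
balanced = random_oversample(["noise"] * 8 + ["pulsar"] * 2)
```

    After oversampling, the toy set holds 16 indices with both classes equally represented, which is the situation in which the abstract reports higher recall.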

  5. Classification of first-episode psychosis: a multi-modal multi-feature approach integrating structural and diffusion imaging.

    PubMed

    Peruzzo, Denis; Castellani, Umberto; Perlini, Cinzia; Bellani, Marcella; Marinelli, Veronica; Rambaldelli, Gianluca; Lasalvia, Antonio; Tosato, Sarah; De Santi, Katia; Murino, Vittorio; Ruggeri, Mirella; Brambilla, Paolo

    2015-06-01

    To date, most classification studies of psychosis have focused on chronic patients and employed single machine learning approaches. To overcome these limitations, we here compare, to the best of our knowledge for the first time, different classification methods of first-episode psychosis (FEP) using multi-modal imaging data exploited on several cortical and subcortical structures and white matter fiber bundles. 23 FEP patients and 23 age-, gender-, and race-matched healthy participants were included in the study. An innovative multivariate approach based on multiple kernel learning (MKL) methods was implemented on structural MRI and diffusion tensor imaging. MKL provides the best classification performances in comparison with the more widely used support vector machine, enabling the definition of a reliable automatic decisional system based on the integration of multi-modal imaging information. Our results show a discrimination accuracy greater than 90 % between healthy subjects and patients with FEP. Regions with an accuracy greater than 70 % on different imaging sources and measures were middle and superior frontal gyrus, parahippocampal gyrus, uncinate fascicles, and cingulum. This study shows that multivariate machine learning approaches integrating multi-modal and multisource imaging data can classify FEP patients with high accuracy. Interestingly, specific grey matter structures and white matter bundles reach high classification reliability when using different imaging modalities and indices, potentially outlining a prefronto-limbic network impaired in FEP with particular regard to the right hemisphere. PMID:25344845
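    A core ingredient of multiple kernel learning is a combined kernel built as a weighted sum of per-source kernels. The sketch below shows only that combination step with fixed weights; in MKL the weights are learned, and the abstract does not say which kernels or solver were used. The feature vectors are made up for illustration.

```python
def linear_kernel(X):
    """Gram matrix of dot products between the rows of X."""
    return [[sum(a * b for a, b in zip(xi, xj)) for xj in X] for xi in X]

def combine_kernels(kernels, weights):
    """Weighted sum of same-sized Gram matrices (weights are fixed here, learned in MKL)."""
    n = len(kernels[0])
    return [[sum(w * K[i][j] for w, K in zip(weights, kernels)) for j in range(n)]
            for i in range(n)]

# Made-up feature vectors for three subjects from two imaging sources.
structural = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
diffusion = [[2.0], [0.0], [1.0]]

K = combine_kernels([linear_kernel(structural), linear_kernel(diffusion)], [0.5, 0.5])
```

    The combined matrix `K` could then be passed to any kernel classifier that accepts a precomputed Gram matrix.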

  6. Bayesian Ensemble Trees (BET) for Clustering and Prediction in Heterogeneous Data

    PubMed Central

    Duan, Leo L.; Clancy, John P.; Szczesniak, Rhonda D.

    2016-01-01

    We propose a novel “tree-averaging” model that utilizes the ensemble of classification and regression trees (CART). Each constituent tree is estimated with a subset of similar data. We treat this grouping of subsets as Bayesian Ensemble Trees (BET) and model them as a Dirichlet process. We show that BET determines the optimal number of trees by adapting to the data heterogeneity. Compared with the other ensemble methods, BET requires far fewer trees and shows equivalent prediction accuracy using weighted averaging. Moreover, each tree in BET provides a variable selection criterion and interpretation for each subset. We developed an efficient estimating procedure with improved estimation strategies in both CART and mixture models. We demonstrate these advantages of BET with simulations and illustrate the approach with a real-world data example involving regression of lung function measurements obtained from patients with cystic fibrosis. Supplemental materials are available online. PMID:27524872

  7. Fault tree handbook

    SciTech Connect

    Haasl, D.F.; Roberts, N.H.; Vesely, W.E.; Goldberg, F.F.

    1981-01-01

    This handbook describes a methodology for reliability analysis of complex systems such as those which comprise the engineered safety features of nuclear power generating stations. After an initial overview of the available system analysis approaches, the handbook focuses on a description of the deductive method known as fault tree analysis. The following aspects of fault tree analysis are covered: basic concepts for fault tree analysis; basic elements of a fault tree; fault tree construction; probability, statistics, and Boolean algebra for the fault tree analyst; qualitative and quantitative fault tree evaluation techniques; and computer codes for fault tree evaluation. Also discussed are several example problems illustrating the basic concepts of fault tree construction and evaluation.
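    The quantitative evaluation mentioned in this abstract rests on combining basic-event probabilities through AND and OR gates. The sketch below assumes statistically independent basic events; the events and probabilities are hypothetical and are not taken from the handbook.

```python
from math import prod

def and_gate(probabilities):
    """Output event occurs only if all independent input events occur."""
    return prod(probabilities)

def or_gate(probabilities):
    """Output event occurs if at least one independent input event occurs."""
    return 1.0 - prod(1.0 - p for p in probabilities)

# Hypothetical top event: (pump A fails AND pump B fails) OR control power fails.
p_top = or_gate([and_gate([0.1, 0.1]), 0.01])
```

    With these made-up inputs the redundant pumps contribute 0.01 and the top-event probability is about 0.0199, slightly less than the simple sum of 0.02 because the OR gate does not double-count the case in which both inputs occur.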

  8. A Philosophical Approach to Describing Science Content: An Example From Geologic Classification.

    ERIC Educational Resources Information Center

    Finley, Fred N.

    1981-01-01

    Examines how research of philosophers of science may be useful to science education researchers and curriculum developers in the development of descriptions of science content related to classification schemes. Provides examples of concept analysis of two igneous rock classification schemes. (DS)

  9. Building and Solving Odd-One-Out Classification Problems: A Systematic Approach

    ERIC Educational Resources Information Center

    Ruiz, Philippe E.

    2011-01-01

    Classification problems ("find the odd-one-out") are frequently used as tests of inductive reasoning to evaluate human or animal intelligence. This paper introduces a systematic method for building the set of all possible classification problems, followed by a simple algorithm for solving the problems of the R-ASCM, a psychometric test derived…

  10. Classification and characterisation of SRF produced from different flows of processed MSW in the Navarra region and its co-combustion performance with olive tree pruning residues.

    PubMed

    Ramos Casado, Raquel; Arenales Rivera, Jorge; Borjabad García, Elena; Escalada Cuadrado, Ricardo; Fernández Llorente, Miguel; Bados Sevillano, Raquel; Pascual Delgado, Alfonso

    2016-01-01

    The scope of this work is to study the co-combustion of a solid recovered fuel (SRF) produced from household wastes and packaging wastes recovered from selective collection (SC) in the autonomous community of Navarra, located in the northeast of Spain. The municipal solid waste (MSW) is subjected to a mechanical biological treatment (MBT) in order to stabilize the organic matter and recover the recyclable materials, as is done for packaging wastes. Afterwards, rejects from this treatment plant were preconditioned and compressed by a pelletizing process to produce a secondary fuel according to the quality and classification criteria of EN 15359, producing the so-called SRF. A fuel characterisation was carried out according to CEN standards and the SRF was classified as follows: NCV 2; Cl 3; Hg 1. SRF pellets were cofired with residual biomass pellets from olive tree pruning (OTP) in a bubbling fluidised bed combustor, as an option for energy recovery. The mixture of fuels, with a mixing ratio close to 50% by weight, showed a significant calorific value of 18.25 MJ/kg at 8% moisture content. In addition, the elemental composition of the mixture in terms of nitrogen (N), sulphur (S) and chlorine (Cl) (1% N, 0.2% S and 0.4% Cl) was not far from that of some herbaceous biomasses. The co-combustion showed good results as an energy recovery technology because of the synergies of both fuels, notably improving the combustion conditions and significantly reducing CO concentration compared with the combustion of OTP, though other contaminants such as NOx and HCl increased. During eight hours of stable operation, the concentration of dioxins and furans was measured, obtaining a value of 7.68 ng/Nm³ (toxic equivalence: i-TEQ of 0.33 ng/Nm³). Proportions of SRF lower than 50% in the mixtures should be tested in order to cut down the emissions of these pollutants, or an abatement system for organochloride compounds may be required. PMID:26072185

  11. Spectral-spatial classification of hyperspectral data based on a stochastic minimum spanning forest approach.

    PubMed

    Bernard, Kévin; Tarabalka, Yuliya; Angulo, Jesús; Chanussot, Jocelyn; Benediktsson, Jón Atli

    2012-04-01

    In this paper, a new method for supervised hyperspectral data classification is proposed. In particular, the notion of stochastic minimum spanning forest (MSF) is introduced. For a given hyperspectral image, a pixelwise classification is first performed. From this classification map, M marker maps are generated by randomly selecting pixels and labeling them as markers for the construction of MSFs. The next step consists in building an MSF from each of the M marker maps. Finally, all the M realizations are aggregated with a maximum vote decision rule in order to build the final classification map. The proposed method is tested on three different data sets of hyperspectral airborne images with different resolutions and contexts. The influences of the number of markers and of the number of realizations M on the results are investigated in experiments. The performance of the proposed method is compared to several classification techniques (both pixelwise and spectral-spatial) using standard quantitative criteria and visual qualitative evaluation. PMID:22086502
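
    The final aggregation step described above (combining the M realizations with a maximum vote rule) can be sketched as follows. This covers only the voting step, not the construction of the minimum spanning forests. The flat list-of-labels representation and the function name are assumptions made for illustration.

```python
from collections import Counter


def majority_vote(label_maps):
    """Fuse M label maps (lists of per-pixel labels) by per-pixel majority vote.

    Ties go to the label seen first among the realizations.
    """
    n_pixels = len(label_maps[0])
    if any(len(m) != n_pixels for m in label_maps):
        raise ValueError("all label maps must cover the same pixels")
    fused = []
    for pixel_labels in zip(*label_maps):
        fused.append(Counter(pixel_labels).most_common(1)[0][0])
    return fused


# Three hypothetical realizations over four pixels.
realizations = [
    [1, 2, 2, 3],
    [1, 2, 3, 3],
    [2, 2, 3, 1],
]
final_map = majority_vote(realizations)
print(final_map)
```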

  12. A divide and conquer approach for imbalanced multi-class classification and its application to medical decision making.

    PubMed

    Li, Hu

    2016-03-01

    Many real-world datasets contain more than two categories, and the number of instances in each category differs greatly. In medical diagnostic data, for example, there may be several types of cancer, each with only tens of instances, alongside many more normal instances. Similarly, a pharmaceutical test may contain very few abnormal samples, yet these may cause great harm. Classification of such data is often referred to as imbalanced multi-class classification. Most existing research studies multi-class classification and imbalanced data classification separately; few works study them in combination, particularly for medical diagnosis data. In the context of medical diagnosis and pharmaceutical testing, we propose a divide and conquer approach to partition multi-class data and a self-adaptive data resampling method for imbalanced data. The proposed methods are tested on 23 UCI datasets in medical, pharmaceutical and other fields. Experimental results show that the proposed methods outperform the other compared methods, in particular on the medical and pharmaceutical datasets. PMID:27113314

  13. Moving beyond the Galloway diagrams for delta classification: A graph-theoretic approach.

    NASA Astrophysics Data System (ADS)

    Tejedor, Alejandro; Longjas, Anthony; Caldwell, Rebecca; Edmonds, Douglas; Zaliapin, Ilya; Foufoula-Georgiou, Efi

    2016-04-01

    Delta channel networks self-organize to a variety of stunning and complex patterns in response to different forcings (e.g., river, tides and waves) and the physical properties of their sediment (e.g., particle size, cohesiveness). Understanding and quantifying properties of these patterns is an essential step to solve the inverse problem of inferring process from form. A recently introduced framework based on spectral graph theory allows us to assess delta channel network complexity from a topologic (channel connectivity) and dynamic (flux exchange) perspective [Tejedor et al., 2015a,b]. We demonstrate the potential of this framework, together with numerical and experimental deltas, wherein different delta properties can be varied individually, to replace the qualitative approach still in use today [Galloway, 1975; Orton and Reading, 1993]. Specifically, in this work we have examined the effect of sediment parameters (grain size, cohesiveness) on the channel structure of river dominated deltas generated by a morphodynamic model (Delft3D). Our analysis shows that deltas with coarser incoming sediment are more complex topologically (increased number of looped pathways) but simpler dynamically (reduced flux exchange between subnetworks). We capitalize on the combined approach of controlled simulation (with known drivers) and quantitative comparison by positioning field and simulated deltas in the so-called TopoDynamic space to open up a path to provide valuable information towards a refined classification and inference scheme of delta morphology. Furthermore, numerical deltas allow us to explore the delta channel structure not only in a spatially explicit manner but also temporally, since the complete temporal record of delta evolution is available

  14. A hybrid gene selection approach for microarray data classification using cellular learning automata and ant colony optimization.

    PubMed

    Vafaee Sharbaf, Fatemeh; Mosafer, Sara; Moattar, Mohammad Hossein

    2016-06-01

    This paper proposes an approach for gene selection in microarray data. The proposed approach begins with a filter step using the Fisher criterion, which reduces the initial set of genes and hence the search space and time complexity. Then, a wrapper approach based on cellular learning automata (CLA) optimized with the ant colony method (ACO) is used to find the set of features that improves the classification accuracy. CLA is applied due to its capability to learn and model complicated relationships. The features selected in the last phase are evaluated using the ROC curve, and the smallest feature subset that remains most effective is determined. The classifiers evaluated in the proposed framework are K-nearest neighbor, support vector machine and naïve Bayes. The proposed approach is evaluated on 4 microarray datasets. The evaluations confirm that the proposed approach can find the smallest subset of genes while approaching the maximum accuracy. PMID:27154739
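
    The filter step can be sketched as a per-gene ranking. The abstract does not give the exact form of the Fisher criterion used, so this sketch assumes the common two-class form: the squared difference of class means divided by the sum of class variances. Gene names, expression values and function names are hypothetical.

```python
from statistics import mean, pvariance


def fisher_score(values_a, values_b):
    """Two-class Fisher score for one gene: squared mean gap over summed variances."""
    gap = (mean(values_a) - mean(values_b)) ** 2
    spread = pvariance(values_a) + pvariance(values_b)
    if spread == 0:
        return float("inf") if gap > 0 else 0.0
    return gap / spread


def top_k_genes(class_a, class_b, k):
    """Return the k gene names with the highest Fisher score."""
    scores = {g: fisher_score(class_a[g], class_b[g]) for g in class_a}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical expression values per gene for two classes of samples.
class_a = {"g1": [1.0, 1.2, 0.8], "g2": [5.0, 5.1, 4.9], "g3": [2.0, 3.0, 4.0]}
class_b = {"g1": [1.1, 0.9, 1.0], "g2": [8.0, 8.2, 7.8], "g3": [2.5, 3.5, 4.5]}
selected = top_k_genes(class_a, class_b, k=2)
print(selected)
```

    Only the genes that survive this ranking would be passed on to the more expensive wrapper search.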

  15. An approach for automated fault diagnosis based on a fuzzy decision tree and boundary analysis of a reconstructed phase space.

    PubMed

    Aydin, Ilhan; Karakose, Mehmet; Akin, Erhan

    2014-03-01

    Although reconstructed phase space is one of the most powerful methods for analyzing a time series, it can fail in fault diagnosis of an induction motor when appropriate pre-processing is not performed. Therefore, a new feature extraction method based on boundary analysis in phase space is proposed for the diagnosis of induction motor faults. The proposed approach requires the measurement of one phase current signal to construct the phase space representation. Each phase space is converted into an image, and the boundary of each image is extracted by a boundary detection algorithm. A fuzzy decision tree has been designed to detect broken rotor bars and broken connector faults. The results indicate that the proposed approach has a higher recognition rate than other methods on the same dataset. PMID:24296116
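
    A phase space is typically reconstructed from a single measured signal by time-delay embedding. The sketch below assumes that standard construction; the abstract does not state the embedding dimension or delay, so `dim` and `tau` are illustrative choices, and the sine wave merely stands in for a phase current signal.

```python
import math


def delay_embed(signal, dim=2, tau=1):
    """Return delay vectors (x[i], x[i + tau], ..., x[i + (dim - 1) * tau])."""
    n_points = len(signal) - (dim - 1) * tau
    if n_points <= 0:
        raise ValueError("signal too short for this dim and tau")
    return [
        tuple(signal[i + j * tau] for j in range(dim))
        for i in range(n_points)
    ]


# Synthetic stand-in for one phase current: a sine with a 20-sample period.
signal = [math.sin(2 * math.pi * k / 20) for k in range(100)]
points = delay_embed(signal, dim=2, tau=5)
print(len(points), points[0])
```

    With a delay of a quarter period, the two coordinates are roughly a sine and a cosine, so the points trace a closed loop. A loop of this kind is the sort of shape a boundary detector could then outline once the points are rendered as an image.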

  16. Knowledge-based multisensoral and multitemporal approach for land use classification in rugged terrain using Landsat TM and ERS SAR

    NASA Astrophysics Data System (ADS)

    Stolz, Roswitha; Strasser, Gertrud; Mauser, Wolfram

    1999-12-01

    Land use has an important impact on the climatic and hydrological cycle. To model this impact, detailed knowledge of the land use and land cover pattern is necessary. Optical remote sensing data are good information sources for deriving land use classifications for large areas. However, because commonly used classification algorithms are based solely on spectral information, they often produce misclassifications, since different classes can show similar spectral signatures. This is especially true for areas where a high rate of cloudiness reduces the availability of data. These are often heterogeneous and rugged areas such as mountains and their forelands. Advanced knowledge-based classification approaches which integrate non-spectral geographical ancillary data (i.e. climatic and terrain data) can improve the classification accuracy drastically. Still, the method fails if spatially distributed ancillary data are not available or show no influence on the land use structure. The major advantage of the approach described in this paper is that it uses data that are based solely on remotely sensed images and is therefore independent of map sources. The lack of multitemporal satellite data is clear