Science.gov

Sample records for classification tree approach

  1. The decision tree approach to classification

    NASA Technical Reports Server (NTRS)

    Wu, C.; Landgrebe, D. A.; Swain, P. H.

    1975-01-01

    A class of multistage decision tree classifiers is proposed and studied relative to the classification of multispectral remotely sensed data. The decision tree classifiers are shown to have the potential for improving both classification accuracy and computational efficiency. Dimensionality in pattern recognition is discussed and two theorems on the lower bound of logic computation for multiclass classification are derived. The automatic, optimization-based design approach is emphasized. Experimental results on real data are reported, which clearly demonstrate the usefulness of decision tree classifiers.
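
    A multistage tree of this kind routes each pixel through a sequence of simple per-node decisions. As a rough modern illustration only (not the classifier studied in the paper), the sketch below trains a single decision tree on synthetic multispectral pixel vectors with scikit-learn; the band values and class labels are invented placeholders.

    ```python
    # Illustrative sketch only: a decision tree trained on hypothetical
    # multispectral pixel vectors (one row per pixel, one column per band).
    # The band values and labels are made-up placeholders, not data from the paper.
    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    rng = np.random.default_rng(0)
    n_pixels, n_bands = 600, 4
    X = rng.normal(size=(n_pixels, n_bands))          # reflectance-like features
    y = (X[:, 0] + 0.5 * X[:, 2] > 0).astype(int)     # synthetic class labels

    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
    clf = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X_train, y_train)
    print("held-out accuracy:", clf.score(X_test, y_test))
    ```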

  2. Learning classification trees

    NASA Technical Reports Server (NTRS)

    Buntine, Wray

    1991-01-01

    Algorithms for learning classification trees have had successes in artificial intelligence and statistics over many years. This paper outlines how a tree learning algorithm can be derived from Bayesian decision theory, which introduces Bayesian techniques for splitting, smoothing, and tree averaging. The splitting rule turns out to be similar to Quinlan's information gain splitting rule, while smoothing and averaging replace pruning. Comparative experiments with reimplementations of a minimum encoding approach, Quinlan's C4, and Breiman et al.'s CART show that the full Bayesian algorithm is consistently as accurate as, or more accurate than, these other approaches, though at a computational price.
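
    For reference, the information gain criterion that the Bayesian splitting rule is compared against can be written in a few lines. The sketch below is the generic Quinlan-style measure, not Buntine's Bayesian rule.

    ```python
    # Generic sketch of the information-gain splitting rule used for comparison;
    # not the Bayesian splitting rule proposed in the paper.
    import numpy as np

    def entropy(labels):
        _, counts = np.unique(labels, return_counts=True)
        p = counts / counts.sum()
        return -np.sum(p * np.log2(p))

    def information_gain(labels, left_mask):
        """Gain of splitting `labels` into left/right subsets by a boolean mask."""
        left, right = labels[left_mask], labels[~left_mask]
        n = len(labels)
        weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
        return entropy(labels) - weighted

    y = np.array([0, 0, 1, 1, 1, 0])
    print(information_gain(y, np.array([True, True, True, False, False, False])))
    ```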

  3. Decision tree approach for classification of remotely sensed satellite data using open source support

    NASA Astrophysics Data System (ADS)

    Sharma, Richa; Ghosh, Aniruddha; Joshi, P. K.

    2013-10-01

    In this study, an attempt has been made to develop a decision tree classification (DTC) algorithm for classification of remotely sensed satellite data (Landsat TM) using open source support. The decision tree is constructed by recursively partitioning the spectral distribution of the training dataset using WEKA, an open-source data mining package. The classified image is compared with images classified using the classical ISODATA clustering and Maximum Likelihood Classifier (MLC) algorithms. The classification result based on the DTC method provided a better visual depiction than the results produced by the ISODATA clustering or MLC algorithms. The overall accuracy was found to be 90% (kappa = 0.88) using the DTC, 76.67% (kappa = 0.72) using the MLC, and 57.5% (kappa = 0.49) using the ISODATA clustering method. Based on the overall accuracy and kappa statistics, DTC was found to be the preferred classification approach.
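
    The overall accuracy and kappa figures quoted here are standard confusion-matrix summaries. The sketch below shows how such figures are computed in general; the 3x3 matrix is an invented toy example, not the study's results.

    ```python
    # How overall accuracy and Cohen's kappa are derived from a confusion matrix.
    # The 3x3 matrix below is an invented toy example, not the study's results.
    import numpy as np

    cm = np.array([[50, 3, 2],
                   [4, 45, 6],
                   [1, 5, 44]], dtype=float)

    total = cm.sum()
    overall_accuracy = np.trace(cm) / total
    expected = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / total**2  # chance agreement
    kappa = (overall_accuracy - expected) / (1 - expected)
    print(f"overall accuracy = {overall_accuracy:.3f}, kappa = {kappa:.3f}")
    ```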

  4. Tree Classification Software

    NASA Technical Reports Server (NTRS)

    Buntine, Wray

    1993-01-01

    This paper introduces the IND Tree Package to prospective users. IND does supervised learning using classification trees. This learning task is a basic tool used in the development of diagnosis, monitoring and expert systems. The IND Tree Package was developed as part of a NASA project to semi-automate the development of data analysis and modelling algorithms using artificial intelligence techniques. The IND Tree Package integrates features from CART and C4 with newer Bayesian and minimum encoding methods for growing classification trees and graphs. The IND Tree Package also provides an experimental control suite on top. The newer features give improved probability estimates often required in diagnostic and screening tasks. The package comes with a manual, Unix 'man' entries, and a guide to tree methods and research. The IND Tree Package is implemented in C under Unix and was beta-tested at university and commercial research laboratories in the United States.

  5. Classification tree methods provide a multifactorial approach to predicting insular body size evolution in rodents.

    PubMed

    Durst, Paul A P; Roth, V Louise

    2012-04-01

    Many hypotheses have been proposed to explain size changes in insular mammals, but no single variable suffices to explain the diversity of responses, particularly within Rodentia. Here in a data set on insular rodents, we observe strong consistency in the direction of size change within islands and within species but (outside of Heteromyidae) little consistency at broader taxonomic scales. Using traits of islands and of species in a classification tree analysis, we find the most important factor predicting direction of change to be mainland body mass (large rodents decrease, small ones increase); other variables (island climate, number of rodent species, and area) were significant, although their roles as revealed by the classification tree were context dependent. Ecological interactions appear relatively uninformative, and on any given island, the largest and smallest rodent species converged or diverged in size with equal frequency. Our approach provides a promising framework for continuing examination of insular body size evolution. PMID:22437183

  6. Snow event classification with a 2D video disdrometer - A decision tree approach

    NASA Astrophysics Data System (ADS)

    Bernauer, F.; Hürkamp, K.; Rühm, W.; Tschiersch, J.

    2016-05-01

    Snowfall classification according to crystal type or degree of riming of the snowflakes is important for many atmospheric processes, e.g. wet deposition of aerosol particles. 2D video disdrometers (2DVD) have recently proved their capability to measure microphysical parameters of snowfall. The present work aims to classify snowfall according to microphysical properties of single hydrometeors (e.g. shape and fall velocity) measured by means of a 2DVD. The constraints for the shape and velocity parameters, which are used in a decision tree for classification of the 2DVD measurements, are derived from detailed on-site observations combining automatic 2DVD classification with visual inspection. The developed decision tree algorithm subdivides the detected events into three classes of dominating crystal type (single crystals, complex crystals and pellets) and three classes of dominating degree of riming (weak, moderate and strong). The classification results for the crystal type were validated with an independent data set, confirming that the classification is unambiguous. In addition, for three long-term events, good agreement of the classification results with independently measured maximum dimension of snowflakes, snowflake bulk density and surrounding temperature was found. The developed classification algorithm is applicable for wind speeds below 5.0 m s⁻¹ and has the advantage of being easily implemented by other users.
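
    The decision logic amounts to a small set of threshold rules over per-hydrometeor descriptors. A minimal sketch of that structure follows; the threshold values are placeholders and are not the constraints derived in the paper.

    ```python
    # Minimal sketch of a hand-built decision tree over per-hydrometeor shape and
    # fall-velocity parameters. The threshold values are placeholders, NOT the
    # constraints derived from the on-site observations in the paper.
    def classify_hydrometeor(complexity, fall_velocity_ms):
        """Return a coarse crystal-type label from two descriptors."""
        if fall_velocity_ms > 2.0:          # fast, dense particles (placeholder)
            return "pellets"
        if complexity > 1.5:                # highly structured outline (placeholder)
            return "complex crystals"
        return "single crystals"

    events = [(1.2, 0.8), (1.8, 1.1), (1.1, 2.6)]
    print([classify_hydrometeor(c, v) for c, v in events])
    ```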

  7. Classification trees with neural network feature extraction.

    PubMed

    Guo, H; Gelfand, S B

    1992-01-01

    The use of small multilayer nets at the decision nodes of a binary classification tree to extract nonlinear features is proposed. The nets are trained and the tree is grown using a gradient-type learning algorithm in the multiclass case. The method improves on standard classification tree design methods in that it generally produces trees with lower error rates and fewer nodes. It also reduces the problems associated with training large unstructured nets and transfers the problem of selecting the size of the net to the simpler problem of finding a tree of the right size. An efficient tree pruning algorithm is proposed for this purpose. Trees constructed with the method and the CART method are compared on a waveform recognition problem and a handwritten character recognition problem. The approach demonstrates a significant decrease in error rate and tree size. It also yields comparable error rates and shorter training times than a large multilayer net trained with backpropagation on the same problems.

  8. Applying an Ensemble Classification Tree Approach to the Prediction of Completion of a 12-Step Facilitation Intervention with Stimulant Abusers

    PubMed Central

    Doyle, Suzanne R.; Donovan, Dennis M.

    2014-01-01

    Aims: The purpose of this study was to explore the selection of predictor variables in the evaluation of drug treatment completion using an ensemble approach with classification trees. The basic methodology is reviewed and the subagging procedure of random subsampling is applied. Methods: Among 234 individuals with stimulant use disorders randomized to a 12-Step facilitative intervention shown to increase stimulant use abstinence, 67.52% were classified as treatment completers. A total of 122 baseline variables were used to identify factors associated with completion. Findings: The number of types of self-help activity involvement prior to treatment was the predominant predictor. Other effective predictors included better coping self-efficacy for substance use in high-risk situations, more days of prior meeting attendance, greater acceptance of the Disease model, higher confidence for not resuming use following discharge, lower ASI Drug and Alcohol composite scores, negative urine screens for cocaine or marijuana, and fewer employment problems. Conclusions: The application of an ensemble subsampling regression tree method utilizes the fact that classification trees are unstable but, on average, produce an improved prediction of the completion of drug abuse treatment. The results support the notion that there are early indicators of treatment completion that may allow for modification of approaches more tailored to fitting the needs of individuals and potentially provide more successful treatment engagement and improved outcomes. PMID:25134038
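
    Subagging replaces the bootstrap resampling of bagging with random subsamples drawn without replacement and averages the trees' predictions. The sketch below illustrates that idea with scikit-learn trees on synthetic data; it is not the authors' implementation.

    ```python
    # Minimal subagging sketch: classification trees fit on random subsamples
    # drawn WITHOUT replacement, with predictions averaged across trees.
    # Uses scikit-learn for illustration; this is not the authors' code.
    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    def subagging_predict(X_train, y_train, X_new, n_trees=50, frac=0.5, seed=0):
        rng = np.random.default_rng(seed)
        n = len(y_train)
        votes = np.zeros((n_trees, len(X_new)))
        for b in range(n_trees):
            idx = rng.choice(n, size=int(frac * n), replace=False)  # subsample
            tree = DecisionTreeClassifier(random_state=b).fit(X_train[idx], y_train[idx])
            votes[b] = tree.predict(X_new)
        return votes.mean(axis=0)  # averaged class-1 vote, e.g. completion probability

    X = np.random.default_rng(1).normal(size=(200, 5))
    y = (X[:, 0] > 0).astype(int)
    print(subagging_predict(X, y, X[:5]))
    ```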

  9. Harvesting classification trees for drug discovery.

    PubMed

    Yuan, Yan; Chipman, Hugh A; Welch, William J

    2012-12-21

    Millions of compounds are available as potential drug candidates. High throughput screening (HTS) is widely used in drug discovery to assay compounds for a particular biological activity. A common approach is to build a classification model using a smaller sample of assay data to predict the activity of unscreened compounds and hence select further compounds for assay. This improves the efficiency of the search by increasing the proportion of hits found among the assayed compounds. In many assays, the biological activity is dichotomized into a binary indicator variable; the explanatory variables are chemical descriptors capturing compound structure. A tree model is interpretable, which is key, since it is of interest to identify diverse chemical classes among the active compounds to serve as leads for drug optimization. Interpretability of a tree is often reduced, however, by the sheer size of the tree model and the number of variables and rules of the terminal nodes. We develop a "tree harvesting" algorithm to filter out redundant "junk" rules from the tree while retaining its predictive accuracy. This simplification can facilitate the process of uncovering key relations between molecular structure and activity and may clarify rules defining multiple activity mechanisms. Using data from the National Cancer Institute, we illustrate that many of the rules used to build a classification tree may be redundant. Unlike tree pruning, tree harvesting allows variables with junk rules to be removed near the top of the tree. The reduction in complexity of the terminal nodes improves the interpretability of the model. The algorithm also aims to reorganize the tree nodes associated with the interesting "active" class into larger, more coherent groups, thus facilitating identification of the mechanisms for activity.

  10. Mapping trees outside forests using high-resolution aerial imagery: a comparison of pixel- and object-based classification approaches.

    PubMed

    Meneguzzo, Dacia M; Liknes, Greg C; Nelson, Mark D

    2013-08-01

    Discrete trees and small groups of trees in nonforest settings are considered an essential resource around the world and are collectively referred to as trees outside forests (ToF). ToF provide important functions across the landscape, such as protecting soil and water resources, providing wildlife habitat, and improving farmstead energy efficiency and aesthetics. Despite the significance of ToF, forest and other natural resource inventory programs and geospatial land cover datasets that are available at a national scale do not include comprehensive information regarding ToF in the United States. Additional ground-based data collection and acquisition of specialized imagery to inventory these resources are expensive alternatives. As a potential solution, we identified two remote sensing-based approaches that use free high-resolution aerial imagery from the National Agriculture Imagery Program (NAIP) to map all tree cover in an agriculturally dominant landscape. We compared the results obtained using an unsupervised per-pixel classifier (independent component analysis-[ICA]) and an object-based image analysis (OBIA) procedure in Steele County, Minnesota, USA. Three types of accuracy assessments were used to evaluate how each method performed in terms of: (1) producing a county-level estimate of total tree-covered area, (2) correctly locating tree cover on the ground, and (3) how tree cover patch metrics computed from the classified outputs compared to those delineated by a human photo interpreter. Both approaches were found to be viable for mapping tree cover over a broad spatial extent and could serve to supplement ground-based inventory data. The ICA approach produced an estimate of total tree cover more similar to the photo-interpreted result, but the output from the OBIA method was more realistic in terms of describing the actual observed spatial pattern of tree cover. PMID:23255169

  11. DIF Trees: Using Classification Trees to Detect Differential Item Functioning

    ERIC Educational Resources Information Center

    Vaughn, Brandon K.; Wang, Qiu

    2010-01-01

    A nonparametric tree classification procedure is used to detect differential item functioning for items that are dichotomously scored. Classification trees are shown to be an alternative procedure to detect differential item functioning other than the use of traditional Mantel-Haenszel and logistic regression analysis. A nonparametric…

  12. Phylogenetic classification and the universal tree.

    PubMed

    Doolittle, W F

    1999-06-25

    From comparative analyses of the nucleotide sequences of genes encoding ribosomal RNAs and several proteins, molecular phylogeneticists have constructed a "universal tree of life," taking it as the basis for a "natural" hierarchical classification of all living things. Although confidence in some of the tree's early branches has recently been shaken, new approaches could still resolve many methodological uncertainties. More challenging is evidence that most archaeal and bacterial genomes (and the inferred ancestral eukaryotic nuclear genome) contain genes from multiple sources. If "chimerism" or "lateral gene transfer" cannot be dismissed as trivial in extent or limited to special categories of genes, then no hierarchical universal classification can be taken as natural. Molecular phylogeneticists will have failed to find the "true tree," not because their methods are inadequate or because they have chosen the wrong genes, but because the history of life cannot properly be represented as a tree. However, taxonomies based on molecular sequences will remain indispensable, and understanding of the evolutionary process will ultimately be enriched, not impoverished. PMID:10381871

  13. Type I error control for tree classification.

    PubMed

    Jung, Sin-Ho; Chen, Yong; Ahn, Hongshik

    2014-01-01

    Binary tree classification has been useful for classifying a whole population based on the levels of an outcome variable that is associated with chosen predictors. Often we start a classification with a large number of candidate predictors, and each predictor takes a number of different cutoff values. Because of these types of multiplicity, the binary tree classification method is subject to a severely inflated type I error probability. Nonetheless, there have not been many publications addressing this issue. In this paper, we propose a binary tree classification method that controls the probability of accepting a predictor below a certain level, say 5%.

  14. Interactions between factors related to the decision of sex offenders to confess during police interrogation: a classification-tree approach.

    PubMed

    Beauregard, Eric; Deslauriers-Varin, Nadine; St-Yves, Michel

    2010-09-01

    Most studies of confessions have looked at the influence of individual factors, neglecting the potential interactions between these factors and their impact on the decision to confess or not during an interrogation. Classification and regression tree analyses conducted on a sample of 624 convicted sex offenders showed that certain factors related to the offenders (e.g., personality, criminal career), victims (e.g., sex, relationship to offender), and case (e.g., time of day of the crime) were related to the decision to confess or not during the police interrogation. Several interactions were also observed between these factors. Results will be discussed in light of previous findings and interrogation strategies for sex offenders.

  15. Selecting Relevant Descriptors for Classification by Bayesian Estimates: A Comparison with Decision Trees and Support Vector Machines Approaches for Disparate Data Sets.

    PubMed

    Carbon-Mangels, Miriam; Hutter, Michael C

    2011-10-01

    Classification algorithms suffer from the curse of dimensionality, which leads to overfitting, particularly if the problem is over-determined. It is therefore of particular interest to identify the most relevant descriptors to reduce the complexity. We applied Bayesian estimates to model the probability distribution of descriptor values used for binary classification using n-fold cross-validation. As a measure of the discriminative power of the classifiers, the symmetric form of the Kullback-Leibler divergence of their probability distributions was computed. We found that the most relevant descriptors possess a Gaussian-like distribution of their values, show the largest divergences, and therefore appear most often in the cross-validation scenario. The results were compared to those of the LASSO feature selection method applied to multiple decision trees and support vector machine approaches for data sets of substrates and nonsubstrates of three Cytochrome P450 isoenzymes, which comprise strongly unbalanced compound distributions. In contrast to decision trees and support vector machines, the performance of Bayesian estimates is less affected by unbalanced data sets. This strategy reveals those descriptors that allow a simple linear separation of the classes, whereas the superior accuracy of decision trees and support vector machines can be attributed to nonlinear separation, which is in turn more prone to overfitting.

  16. Predicting 'very poor' beach water quality gradings using classification tree.

    PubMed

    Thoe, Wai; Choi, King Wah; Lee, Joseph Hun-wei

    2016-02-01

    A beach water quality prediction system has been developed in Hong Kong using multiple linear regression (MLR) models. However, linear models are found to be weak at capturing the infrequent 'very poor' water quality occasions when Escherichia coli (E. coli) concentration exceeds 610 counts/100 mL. This study uses a classification tree to increase the accuracy in predicting the 'very poor' water quality events at three Hong Kong beaches affected either by non-point source or point source pollution. Binary-output classification trees (to predict whether E. coli concentration exceeds 610 counts/100 mL) are developed over the periods before and after the implementation of the Harbour Area Treatment Scheme, when systematic changes in water quality were observed. Results show that classification trees can capture more 'very poor' events in both periods when compared to the corresponding linear models, with an increase in correct positives by an average of 20%. Classification trees are also developed at two beaches to predict the four-category Beach Water Quality Indices. They perform worse than the binary tree and give excessive false alarms of 'very poor' events. Finally, a combined modelling approach using both MLR model and classification tree is proposed to enhance the beach water quality prediction system for Hong Kong. PMID:26837834
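
    As a generic illustration of a binary-output classification tree for exceedance events, the sketch below predicts whether E. coli exceeds 610 counts/100 mL from synthetic predictors. The feature names and data are hypothetical stand-ins for the hydro-meteorological variables a real beach model would use.

    ```python
    # Sketch of a binary-output classification tree for 'very poor' exceedance
    # (E. coli > 610 counts/100 mL). Features and data are hypothetical stand-ins.
    import numpy as np
    from sklearn.tree import DecisionTreeClassifier, export_text

    rng = np.random.default_rng(0)
    X = rng.normal(size=(300, 3))            # e.g. rainfall, solar radiation, tide
    ecoli = np.exp(5 + 1.5 * X[:, 0] + rng.normal(scale=0.5, size=300))
    y = (ecoli > 610).astype(int)            # 1 = 'very poor' exceedance event

    tree = DecisionTreeClassifier(max_depth=3, class_weight="balanced",
                                  random_state=0).fit(X, y)
    print(export_text(tree, feature_names=["rainfall", "solar", "tide"]))
    ```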

  17. Fast Image Texture Classification Using Decision Trees

    NASA Technical Reports Server (NTRS)

    Thompson, David R.

    2011-01-01

    Texture analysis would permit improved autonomous, onboard science data interpretation for adaptive navigation, sampling, and downlink decisions. These analyses would assist with terrain analysis and instrument placement in both macroscopic and microscopic image data products. Unfortunately, most state-of-the-art texture analysis demands computationally expensive convolutions of filters involving many floating-point operations. This makes them infeasible for radiation-hardened computers and spaceflight hardware. A new method approximates traditional texture classification of each image pixel with a fast decision-tree classifier. The classifier uses image features derived from simple filtering operations involving integer arithmetic. The texture analysis method is therefore amenable to implementation on FPGA (field-programmable gate array) hardware. Image features based on the "integral image" transform produce descriptive and efficient texture descriptors. Training the decision tree on a set of training data yields a classification scheme that produces reasonable approximations of optimal "texton" analysis at a fraction of the computational cost. A decision-tree learning algorithm employing the traditional k-means criterion of inter-cluster variance is used to learn tree structure from training data. The result is an efficient and accurate summary of surface morphology in images. This work is an evolutionary advance that unites several previous algorithms (k-means clustering, integral images, decision trees) and applies them to a new problem domain (morphology analysis for autonomous science during remote exploration). Advantages include order-of-magnitude improvements in runtime, feasibility for FPGA hardware, and significant improvements in texture classification accuracy.
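
    The "integral image" transform can be computed with two cumulative sums, after which any rectangular box sum costs only four array lookups. The sketch below shows that building block in integer arithmetic; it illustrates the transform only, not the flight classifier.

    ```python
    # Sketch of the "integral image" transform and a constant-time box sum,
    # the kind of integer-only feature extraction the method relies on.
    import numpy as np

    def integral_image(img):
        return img.cumsum(axis=0).cumsum(axis=1)

    def box_sum(ii, r0, c0, r1, c1):
        """Sum of img[r0:r1+1, c0:c1+1] from the integral image in four lookups."""
        total = ii[r1, c1]
        if r0 > 0:
            total -= ii[r0 - 1, c1]
        if c0 > 0:
            total -= ii[r1, c0 - 1]
        if r0 > 0 and c0 > 0:
            total += ii[r0 - 1, c0 - 1]
        return total

    img = np.arange(16, dtype=np.int64).reshape(4, 4)
    ii = integral_image(img)
    assert box_sum(ii, 1, 1, 2, 2) == img[1:3, 1:3].sum()
    ```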

  18. Voxel classification based airway tree segmentation

    NASA Astrophysics Data System (ADS)

    Lo, Pechin; de Bruijne, Marleen

    2008-03-01

    This paper presents a voxel classification based method for segmenting the human airway tree in volumetric computed tomography (CT) images. In contrast to standard methods that use only voxel intensities, our method uses a more complex appearance model based on a set of local image appearance features and Kth nearest neighbor (KNN) classification. The optimal set of features for classification is selected automatically from a large set of features describing the local image structure at several scales. The use of multiple features enables the appearance model to differentiate between airway tree voxels and other voxels of similar intensities in the lung, thus making the segmentation robust to pathologies such as emphysema. The classifier is trained on imperfect segmentations that can easily be obtained using region growing with a manual threshold selection. Experiments show that the proposed method results in a more robust segmentation that can grow into the smaller airway branches without leaking into emphysematous areas, and is able to segment many branches that are not present in the training set.
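
    As a rough illustration of the voxel classification step, the sketch below trains a KNN classifier on synthetic local-appearance features and produces a soft airway probability per voxel; the features are invented stand-ins for the multi-scale filter responses selected in the paper.

    ```python
    # Minimal sketch of voxel-wise KNN classification on local appearance features.
    # The features here are synthetic stand-ins for the paper's filter responses.
    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier

    rng = np.random.default_rng(0)
    features = rng.normal(size=(5000, 6))        # one row per voxel
    labels = (features[:, 0] + features[:, 3] > 0).astype(int)  # 1 = airway voxel

    knn = KNeighborsClassifier(n_neighbors=15).fit(features[:4000], labels[:4000])
    posterior = knn.predict_proba(features[4000:])[:, 1]  # soft airway probability
    print("mean predicted airway probability:", posterior.mean())
    ```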

  1. Semi-supervised SVM for individual tree crown species classification

    NASA Astrophysics Data System (ADS)

    Dalponte, Michele; Ene, Liviu Theodor; Marconcini, Mattia; Gobakken, Terje; Næsset, Erik

    2015-12-01

    In this paper a novel semi-supervised SVM classifier is presented, specifically developed for tree species classification at individual tree crown (ITC) level. In ITC tree species classification, all the pixels belonging to an ITC should have the same label. This assumption is used in the learning of the proposed semi-supervised SVM classifier (ITC-S3VM). This method exploits the information contained in the unlabeled ITC samples in order to improve the classification accuracy of a standard SVM. The ITC-S3VM method can be easily implemented using freely available software libraries. The datasets used in this study include hyperspectral imagery and laser scanning data acquired over two boreal forest areas characterized by the presence of three information classes (Pine, Spruce, and Broadleaves). The experimental results quantify the effectiveness of the proposed approach, which provides classification accuracies significantly higher (from 2% to above 27%) than those obtained by the standard supervised SVM and by a state-of-the-art semi-supervised SVM (S3VM). Particularly, by reducing the number of training samples (i.e. from 100% to 25%, and from 100% to 5% for the two datasets, respectively) the proposed method still exhibits results comparable to the ones of a supervised SVM trained with the full available training set. This property of the method makes it particularly suitable for practical forest inventory applications in which collection of in situ information can be very expensive both in terms of cost and time.

  2. Prediction of healthy blood with data mining classification by using Decision Tree, Naive Baysian and SVM approaches

    NASA Astrophysics Data System (ADS)

    Khalilinezhad, Mahdieh; Minaei, Behrooz; Vernazza, Gianni; Dellepiane, Silvana

    2015-03-01

    Data mining (DM) is the process of discovering knowledge from large databases. Applications of data mining in Blood Transfusion Organizations could be useful for improving the performance of the blood donation service. The aim of this research is the prediction of the healthiness of blood donors in a Blood Transfusion Organization (BTO). For this goal, three well-known algorithms, Decision Tree C4.5, Naïve Bayesian classifier, and Support Vector Machine, have been chosen and applied to a real database of 11006 donors. Seven fields (sex, age, job, education, marital status, type of donor, and results of blood tests, i.e. doctors' comments and lab results about healthy or unhealthy blood donors) have been selected as input to these algorithms. The results of the three algorithms have been compared and an error cost analysis has been performed. According to this research and the obtained results, the best algorithm with low error cost and high accuracy is SVM. This research helps the BTO derive a model of blood donors in each area in order to predict whether donated blood is healthy or unhealthy. This research could be useful if used in parallel with laboratory tests to better separate unhealthy blood.

  3. Protein classification based on propagation of unrooted binary trees.

    PubMed

    Kocsor, András; Busa-Fekete, Róbert; Pongor, Sándor

    2008-01-01

    We present two efficient network propagation algorithms that operate on a binary tree, i.e., a sparse-edged substitute of an entire similarity network. TreeProp-N is based on passing increments between nodes while TreeProp-E employs propagation to the edges of the tree. Both algorithms improve protein classification efficiency.

  4. Consensus of classification trees for skin sensitisation hazard prediction.

    PubMed

    Asturiol, D; Casati, S; Worth, A

    2016-10-01

    Since March 2013, it is no longer possible to market in the European Union (EU) cosmetics containing new ingredients tested on animals. Although several in silico alternatives are available and achievements have been made in the development and regulatory adoption of skin sensitisation non-animal tests, there is not yet a generally accepted approach for skin sensitisation assessment that would fully substitute the need for animal testing. The aim of this work was to build a defined approach (i.e. a predictive model based on readouts from various information sources that uses a fixed procedure for generating a prediction) for skin sensitisation hazard prediction (sensitiser/non-sensitiser) using Local Lymph Node Assay (LLNA) results as reference classifications. To derive the model, we built a dataset with high quality data from in chemico (DPRA) and in vitro (KeratinoSens™ and h-CLAT) methods, and it was complemented with predictions from several software packages. The modelling exercise showed that skin sensitisation hazard was better predicted by classification trees based on in silico predictions. The defined approach consists of a consensus of two classification trees that are based on descriptors that account for protein reactivity and structural features. The model showed an accuracy of 0.93, sensitivity of 0.98, and specificity of 0.85 for 269 chemicals. In addition, the defined approach provides a measure of confidence associated to the prediction. PMID:27458072
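
    As a generic illustration of a consensus of two classification trees with an agreement-based confidence measure, the sketch below trains two trees on synthetic descriptors and flags disagreements as low-confidence calls; the descriptors, data, and tie-break rule are placeholders, not the defined approach itself.

    ```python
    # Generic sketch of a two-tree consensus with an agreement-based confidence
    # flag. Descriptors, data, and the tie-break rule are placeholders, not the
    # in chemico / in vitro / in silico inputs of the defined approach.
    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    rng = np.random.default_rng(0)
    X = rng.normal(size=(400, 8))                 # e.g. reactivity + structural descriptors
    y = (X[:, 0] + X[:, 1] > 0).astype(int)       # 1 = sensitiser (synthetic)

    tree_a = DecisionTreeClassifier(max_depth=3, random_state=1).fit(X[:300], y[:300])
    tree_b = DecisionTreeClassifier(max_depth=3, random_state=2).fit(X[:300], y[:300])

    pa, pb = tree_a.predict(X[300:]), tree_b.predict(X[300:])
    consensus = np.where(pa == pb, pa, 1)         # placeholder tie-break rule
    confidence = np.where(pa == pb, "high", "low")
    print(consensus[:10], confidence[:10])
    ```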

  5. Tree classification with fused mobile laser scanning and hyperspectral data.

    PubMed

    Puttonen, Eetu; Jaakkola, Anttoni; Litkey, Paula; Hyyppä, Juha

    2011-01-01

    Mobile Laser Scanning data were collected simultaneously with hyperspectral data using the Finnish Geodetic Institute Sensei system. The data were tested for tree species classification. The test area was an urban garden in the City of Espoo, Finland. Point clouds representing 168 individual tree specimens of 23 tree species were determined manually. The classification of the trees was done using first only the spatial data from point clouds, then with only the spectral data obtained with a spectrometer, and finally with the combined spatial and hyperspectral data from both sensors. Two classification tests were performed: the separation of coniferous and deciduous trees, and the identification of individual tree species. All determined tree specimens were used in distinguishing coniferous and deciduous trees. A subset of 133 trees and 10 tree species was used in the tree species classification. The best classification results for the fused data were 95.8% for the separation of the coniferous and deciduous classes. The best overall tree species classification reached 83.5% accuracy for the best tested fused-data feature combination. The respective results for paired structural features derived from the laser point cloud were 90.5% for the separation of the coniferous and deciduous classes and 65.4% for the species classification. Classification accuracies with paired hyperspectral reflectance value data were 90.5% for the separation of coniferous and deciduous classes and 62.4% for different species. The results are among the first of their kind and show that mobile-collected fused data outperformed single-sensor data in both classification tests, and by a significant margin.

  6. Watershed Merge Tree Classification for Electron Microscopy Image Segmentation

    SciTech Connect

    Liu, Ting; Jurrus, Elizabeth R.; Seyedhosseini, Mojtaba; Ellisman, Mark; Tasdizen, Tolga

    2012-11-11

    Automated segmentation of electron microscopy (EM) images is a challenging problem. In this paper, we present a novel method that utilizes a hierarchical structure and boundary classification for 2D neuron segmentation. With a membrane detection probability map, a watershed merge tree is built for the representation of hierarchical region merging from the watershed algorithm. A boundary classifier is learned with non-local image features to predict each potential merge in the tree, upon which merge decisions are made with consistency constraints in the sense of optimization to acquire the final segmentation. Independent of classifiers and decision strategies, our approach proposes a general framework for efficient hierarchical segmentation with statistical learning. We demonstrate that our method leads to a substantial improvement in segmentation accuracy.

  7. Classification of LISS IV Imagery Using Decision Tree Methods

    NASA Astrophysics Data System (ADS)

    Verma, Amit Kumar; Garg, P. K.; Prasad, K. S. Hari; Dadhwal, V. K.

    2016-06-01

    Image classification is a compulsory step in any remote sensing research. Classification uses the spectral information represented by the digital numbers in one or more spectral bands and attempts to classify each individual pixel based on this spectral information. Crop classification is the main concern of remote sensing applications for developing sustainable agriculture systems. Vegetation indices computed from satellite images give a good indication of the presence of vegetation; they describe the greenness, density and health of vegetation. Texture is also an important characteristic used to identify objects or regions of interest in an image. This paper illustrates the use of the decision tree method to classify the land into crop land and non-crop land and to classify different crops. We evaluate the possibility of crop classification using an integrated approach based on texture properties and different vegetation indices for single-date LISS IV sensor 5.8 m high spatial resolution data. Eleven vegetation indices (NDVI, DVI, GEMI, GNDVI, MSAVI2, NDWI, NG, NR, NNIR, OSAVI and VI green) have been generated using the green, red and NIR bands, and the image is then classified using the decision tree method. The other approach integrates texture features (mean, variance, kurtosis and skewness) with these vegetation indices. A comparison has been made between these two methods. The results indicate that the inclusion of textural features with vegetation indices can be effectively implemented to produce classified maps with 8.33% higher accuracy for Indian satellite IRS-P6, LISS IV sensor images.

  8. Classification Based on Tree-Structured Allocation Rules

    ERIC Educational Resources Information Center

    Vaughn, Brandon K.; Wang, Qui

    2008-01-01

    The authors consider the problem of classifying an unknown observation into 1 of several populations by using tree-structured allocation rules. Although many parametric classification procedures are robust to certain assumption violations, there is need for classification procedures that can be used regardless of the group-conditional…

  9. Urban Tree Classification Using Full-Waveform Airborne Laser Scanning

    NASA Astrophysics Data System (ADS)

    Koma, Zs.; Koenig, K.; Höfle, B.

    2016-06-01

    Vegetation mapping in urban environments plays an important role in biological research and urban management. Airborne laser scanning provides detailed 3D geodata, which makes it possible to classify single trees into different taxa. Until now, research dealing with tree classification has focused on forest environments. This study investigates the object-based classification of urban trees at taxonomic family level, using full-waveform airborne laser scanning data captured in the city centre of Vienna (Austria). The data set is characterised by a variety of taxa, including deciduous trees (beeches, mallows, plane trees and soapberries) and the coniferous pine species. A workflow for tree object classification is presented using geometric and radiometric features. The derived features are related to point density, crown shape and radiometric characteristics. For the derivation of crown features, a prior detection of the crown base is performed. The effects of interfering objects (e.g. fences and cars, which are typical in urban areas) on the feature characteristics and the subsequent classification accuracy are investigated. The applicability of the features is evaluated by Random Forest classification and exploratory analysis. The most reliable classification is achieved by using the combination of geometric and radiometric features, resulting in 87.5% overall accuracy. By using radiometric features only, a reliable classification with an accuracy of 86.3% can be achieved. The influence of interfering objects on feature characteristics is identified, in particular for the radiometric features. The results indicate the potential of using radiometric features in urban tree classification and at the same time show its limitations due to anthropogenic influences.

  10. Tree-structured wavelet transform signature for classification of melanoma

    NASA Astrophysics Data System (ADS)

    Patwardhan, Sachin V.; Dhawan, Atam P.; Relue, Patricia A.

    2002-05-01

    The purpose of this work is to evaluate the use of a wavelet transform based tree structure in classifying skin lesion images into melanoma and dysplastic nevus based on spatial/frequency information. The classification is done using wavelet transform tree structure analysis. Development of the tree structure in the proposed method uses energy ratio thresholds obtained from a statistical analysis of the coefficients in the wavelet domain. The method is used to obtain a tree structure signature of melanoma and dysplastic nevus, which is then used to classify the data set into the two classes. Images are classified by a semantic comparison of the wavelet transform tree structure signatures. Results show that the proposed method is effective and simple for classification based on spatial/frequency information, which also includes textural information.
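
    The tree is grown by decomposing only those subbands whose energy ratio exceeds a threshold. The sketch below shows one such expansion step using the PyWavelets package (assumed to be available); the 0.2 threshold is a placeholder rather than the statistically derived value used in the paper.

    ```python
    # Sketch of one step of an energy-driven wavelet decomposition tree using
    # PyWavelets. The 0.2 energy-ratio threshold is a placeholder, not the
    # statistically derived threshold from the paper.
    import numpy as np
    import pywt

    def expand_node(image, threshold=0.2):
        cA, (cH, cV, cD) = pywt.dwt2(image, "db1")
        subbands = {"LL": cA, "LH": cH, "HL": cV, "HH": cD}
        energies = {k: float(np.sum(v ** 2)) for k, v in subbands.items()}
        total = sum(energies.values())
        # decompose further only the subbands holding a large share of the energy
        return {k: v for k, v in subbands.items() if energies[k] / total > threshold}

    img = np.random.default_rng(0).random((64, 64))
    print(list(expand_node(img).keys()))
    ```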

  11. Decision tree methods: applications for classification and prediction.

    PubMed

    Song, Yan-Yan; Lu, Ying

    2015-04-25

    Decision tree methodology is a commonly used data mining method for establishing classification systems based on multiple covariates or for developing prediction algorithms for a target variable. This method classifies a population into branch-like segments that construct an inverted tree with a root node, internal nodes, and leaf nodes. The algorithm is non-parametric and can efficiently deal with large, complicated datasets without imposing a complicated parametric structure. When the sample size is large enough, study data can be divided into training and validation datasets: the training dataset is used to build a decision tree model and the validation dataset to decide on the appropriate tree size needed to achieve the optimal final model. This paper introduces frequently used algorithms for developing decision trees (including CART, C4.5, CHAID, and QUEST) and describes the SPSS and SAS programs that can be used to visualize tree structure. PMID:26120265
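
    The train/validation workflow described here can be sketched with cost-complexity pruning: grow candidate trees on the training set and keep the pruning level (and hence tree size) that scores best on the validation set. The example below uses scikit-learn on synthetic placeholder data.

    ```python
    # Sketch of the train/validation workflow: grow candidate trees on the
    # training set and pick the pruning level (tree size) that does best on the
    # validation set. Data here are synthetic placeholders.
    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 6))
    y = (X[:, 0] - X[:, 1] > 0).astype(int)
    X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

    path = DecisionTreeClassifier(random_state=0).cost_complexity_pruning_path(X_tr, y_tr)
    scores = [(DecisionTreeClassifier(ccp_alpha=a, random_state=0)
               .fit(X_tr, y_tr).score(X_val, y_val), a) for a in path.ccp_alphas]
    best_score, best_alpha = max(scores)
    print(f"validation accuracy {best_score:.3f} at ccp_alpha={best_alpha:.4f}")
    ```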

  12. Automatic Classification of Trees from Laser Scanning Point Clouds

    NASA Astrophysics Data System (ADS)

    Sirmacek, B.; Lindenbergh, R.

    2015-08-01

    Development of laser scanning technologies has promoted tree monitoring studies to a new level, as laser scanning point clouds enable accurate 3D measurements in a fast and environmentally friendly manner. In this paper, we introduce a probability matrix computation based algorithm for automatically classifying laser scanning point clouds into 'tree' and 'non-tree' classes. Our method uses the 3D coordinates of the laser scanning points as input and generates a new point cloud which holds a label for each point indicating if it belongs to the 'tree' or 'non-tree' class. To do so, a grid surface is assigned to the lowest height level of the point cloud. The grids are filled with probability values which are calculated by checking the point density above the grid. Since the tree trunk locations appear with very high values in the probability matrix, selecting the local maxima of the grid surface helps to detect the tree trunks. Further points are assigned to tree trunks if they appear in the close proximity of trunks. Since heavy mathematical computations (such as point cloud organization, detailed 3D shape detection methods, graph network generation) are not required, the proposed algorithm works very fast compared to existing methods. The tree classification results are found reliable even on point clouds of cities containing many different objects. As the most significant weakness, false detection of light poles, traffic signs and other objects close to trees cannot be prevented. Nevertheless, the experimental results on mobile and airborne laser scanning point clouds indicate the possible usage of the algorithm as an important step for tree growth observation, tree counting and similar applications. While the laser scanning point cloud gives the opportunity to classify even very small trees, the accuracy of the results is reduced in low point density areas farther away from the scanning location. These advantages and disadvantages of two laser scanning point
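
    The probability-matrix idea can be sketched in a few lines: grid the horizontal plane, score each cell by the number of points above it, and keep local maxima as trunk candidates. In the sketch below the grid size and density threshold are arbitrary placeholders, and scipy is assumed to be available.

    ```python
    # Rough sketch of the probability-matrix idea: grid the horizontal plane,
    # score each cell by the density of points above it, and take local maxima
    # as trunk candidates. Grid size and threshold are arbitrary placeholders.
    import numpy as np
    from scipy import ndimage

    def trunk_candidates(points, cell=1.0, min_density=30):
        """points: (N, 3) array of x, y, z laser returns."""
        xy = points[:, :2]
        origin = xy.min(axis=0)
        idx = np.floor((xy - origin) / cell).astype(int)
        grid = np.zeros(idx.max(axis=0) + 1)
        np.add.at(grid, (idx[:, 0], idx[:, 1]), 1)          # point count per cell
        is_peak = (grid == ndimage.maximum_filter(grid, size=3)) & (grid >= min_density)
        return np.argwhere(is_peak) * cell + origin          # cell positions (approx.)

    pts = np.random.default_rng(0).random((2000, 3)) * [40, 40, 15]
    print(trunk_candidates(pts, min_density=4)[:5])
    ```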

  13. Using Classification Trees to Predict Alumni Giving for Higher Education

    ERIC Educational Resources Information Center

    Weerts, David J.; Ronca, Justin M.

    2009-01-01

    As the relative level of public support for higher education declines, colleges and universities aim to maximize alumni-giving to keep their programs competitive. Anchored in a utility maximization framework, this study employs the classification and regression tree methodology to examine characteristics of alumni donors and non-donors at a…

  14. Growth in Mathematics Achievement: Analysis with Classification and Regression Trees

    ERIC Educational Resources Information Center

    Ma, Xin

    2005-01-01

    A recently developed statistical technique, often referred to as classification and regression trees (CART), holds great potential for researchers to discover how student-level (and school-level) characteristics interactively affect growth in mathematics achievement. CART is a host of advanced statistical methods that statistically cluster…

  15. Multiple Spectral-Spatial Classification Approach for Hyperspectral Data

    NASA Technical Reports Server (NTRS)

    Tarabalka, Yuliya; Benediktsson, Jon Atli; Chanussot, Jocelyn; Tilton, James C.

    2010-01-01

    A new multiple classifier approach for spectral-spatial classification of hyperspectral images is proposed. Several classifiers are used independently to classify an image. For every pixel, if all the classifiers have assigned this pixel to the same class, the pixel is kept as a marker, i.e., a seed of the spatial region, with the corresponding class label. We propose to use spectral-spatial classifiers at the preliminary step of the marker selection procedure, each of them combining the results of a pixel-wise classification and a segmentation map. Different segmentation methods based on dissimilar principles lead to different classification results. Furthermore, a minimum spanning forest is built, where each tree is rooted on a classification-driven marker and forms a region in the spectral-spatial classification map. Experimental results are presented for two hyperspectral airborne images. The proposed method significantly improves classification accuracies, when compared to previously proposed classification techniques.
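
    The marker-selection rule itself is simple: a pixel is kept only when every classifier assigns it the same label. A minimal sketch follows, with random label maps standing in for the independent spectral-spatial classification results.

    ```python
    # Minimal sketch of the marker-selection rule: a pixel becomes a marker only
    # when every classifier assigns it the same label. The label maps here are
    # random placeholders standing in for independent spectral-spatial results.
    import numpy as np

    rng = np.random.default_rng(0)
    maps = [rng.integers(0, 4, size=(5, 5)) for _ in range(3)]   # per-classifier labels

    agree = np.all([m == maps[0] for m in maps], axis=0)
    markers = np.where(agree, maps[0], -1)    # -1 marks undecided (non-marker) pixels
    print(markers)
    ```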

  16. Combining QuickBird, LiDAR, and GIS topography indices to identify a single native tree species in a complex landscape using an object-based classification approach

    NASA Astrophysics Data System (ADS)

    Pham, Lien T. H.; Brabyn, Lars; Ashraf, Salman

    2016-08-01

    There are now a wide range of techniques that can be combined for image analysis. These include the use of object-based classifications rather than pixel-based classifiers, the use of LiDAR to determine vegetation height and vertical structure, as well as terrain variables such as topographic wetness index and slope that can be calculated using GIS. This research investigates the benefits of combining these techniques to identify individual tree species. A QuickBird image and low point density LiDAR data for a coastal region in New Zealand were used to examine the possibility of mapping Pohutukawa trees, which are regarded as an iconic tree in New Zealand. The study area included a mix of buildings and vegetation types. After image and LiDAR preparation, single tree objects were identified using a range of techniques including: a threshold of above-ground height to eliminate ground-based objects; Normalised Difference Vegetation Index and the elevation difference between the first and last return of LiDAR data to distinguish vegetation from buildings; geometric information to separate clusters of trees from single trees; and treetop identification and region growing techniques to separate tree clusters into single tree crowns. Important feature variables were identified using Random Forest, and the Support Vector Machine provided the classification. The combined techniques using LiDAR and spectral data produced an overall accuracy of 85.4% (Kappa 80.6%). Classification using just the spectral data produced an overall accuracy of 75.8% (Kappa 67.8%). The research findings demonstrate how combining LiDAR and spectral data improves classification for Pohutukawa trees.

  17. A Section-based Method For Tree Species Classification Using Airborne LiDAR Discrete Points In Urban Areas

    NASA Astrophysics Data System (ADS)

    Chunjing, Y. C.; Hui, T.; Zhongjie, R.; Guikai, B.

    2015-12-01

    As a new approach to forest inventory, LiDAR remote sensing has become an important research topic. LiDAR research initially concentrated on mapping forests at the tree level and identifying important structural parameters, such as tree height, crown size, crown base height, individual tree species, and stem volume. But for virtual city visualization and mapping, traditional methods of tree classification cannot satisfy the more complex conditions. Recently, advanced LiDAR technology has produced full-waveform scanners that provide a higher point density and additional information about the reflecting characteristics of trees. It has subsequently been demonstrated that it is feasible to detect individual overstorey trees in forests and classify species. However, important issues such as the calibration and the decomposition of full-waveform data into a series of Gaussian functions usually require considerable work, and the detection and classification results rely heavily on these prior outcomes. For these reasons, a section-based method for tree species classification using small-footprint, high-sampling-density LiDAR data is proposed in this paper, which can overcome the tree species classification issues in urban areas. More specific objectives are to: (1) use a local maximum height decision and four-direction section certification methods to obtain the precise locations of the trees; (2) develop new LiDAR-derived feature processing techniques for characterizing the section structure of individual tree crowns; (3) investigate several techniques for filtering and analyzing vertical profiles of individual trees to classify the trees, using expert decision rules based on percentile analysis; (4) assess the accuracy of estimating tree species for each tree; and (5) investigate which type of LiDAR data, point frequency or intensity, provides the most accurate estimate of tree species

  18. A representation and classification scheme for tree-like structures in medical images: an application on branching pattern analysis of ductal trees in x-ray galactograms

    NASA Astrophysics Data System (ADS)

    Megalooikonomou, Vasileios; Kontos, Despina; Danglemaier, Joseph; Javadi, Ailar; Bakic, Predrag R.; Maidment, Andrew D. A.

    2006-03-01

    We propose a multi-step approach for representing and classifying tree-like structures in medical images. Examples of such tree-like structures are encountered in the bronchial system, the vessel topology and the breast ductal network. We assume that the tree-like structures are already segmented. To avoid the tree isomorphism problem we obtain the breadth-first canonical form of a tree. Our approach is based on employing tree encoding techniques, such as the depth-first string encoding and the Prüfer encoding, to obtain a symbolic representation. Thus, the problem of classifying trees is reduced to string classification where node labels are the string terms. We employ the tf-idf text mining technique to assign a weight of significance to each string term (i.e., tree node label). We perform similarity searches and k-nearest neighbor classification of the trees using the tf-idf weight vectors and the cosine similarity metric. We applied our approach to the breast ductal network manually extracted from clinical x-ray galactograms. The goal was to characterize the ductal tree-like parenchymal structures in order to distinguish among different groups of women. Our best classification accuracy reached 90% for certain experimental settings (k=4), outperforming by 10% on average a previous state-of-the-art method based on ramification matrices. These results illustrate the effectiveness of the proposed approach in analyzing tree-like patterns in breast images. Developing such automated tools for the analysis of tree-like structures in medical images can potentially provide insight into the relationship between the topology of branching and function or pathology.
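
    The string-based classification step can be sketched directly with standard text-mining tools: each encoded tree is treated as a document of node labels, terms are weighted with tf-idf, and trees are compared by cosine similarity. The tiny encodings below are invented examples, not the galactogram data.

    ```python
    # Sketch of the string-based step: treat each encoded tree as a document of
    # node labels, weight terms with tf-idf, and compare trees by cosine
    # similarity. The encodings below are tiny invented examples.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    tree_strings = [
        "root branch branch leaf leaf leaf",       # hypothetical depth-first encodings
        "root branch leaf leaf",
        "root branch branch branch leaf leaf",
    ]
    vectors = TfidfVectorizer().fit_transform(tree_strings)
    print(cosine_similarity(vectors))              # pairwise tree similarities
    ```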

  19. Tree-based disease classification using protein data.

    PubMed

    Zhu, Hongtu; Yu, Chang-Yung; Zhang, Heping

    2003-09-01

    A reliable and precise classification of diseases is essential for successful diagnosis and treatment. Using mass spectrometry from clinical specimens, scientists may find protein variations among diseases and use this information to improve diagnosis. In this paper, we propose a novel procedure to classify disease status based on protein data from mass spectrometry. Our new tree-based algorithm consists of three steps: projection, selection and classification tree. The projection step aims to project all observations from specimens onto the same bases so that the projected data have fixed coordinates. Thus, for each specimen, we obtain a large vector of 'coefficients' on the same basis. The purpose of the selection step is data reduction, condensing the large vector from the projection step into a much lower-order informative vector. Finally, using these reduced vectors, we apply recursive partitioning to construct an informative classification tree. This method has been successfully applied to protein data provided by the Department of Radiology and Chemistry at Duke University.

  1. Logistic Regression-Based Trichotomous Classification Tree and Its Application in Medical Diagnosis.

    PubMed

    Zhu, Yanke; Fang, Jiqian

    2016-11-01

    The classification tree is a valuable methodology for predictive modeling and data mining. However, existing classification trees ignore the fact that there might be a subset of individuals who cannot be well classified based on the information in the given set of predictor variables and who might be classified with a higher error rate; moreover, most existing classification trees do not use a combination of variables in each step. An algorithm for a logistic regression-based trichotomous classification tree (LRTCT) is proposed that employs a trichotomous tree structure and a linear combination of predictor variables in the recursive partitioning process. Compared with the widely used classification and regression tree in applications to a series of simulated data and 2 real data sets, the LRTCT performed better in several respects and does not require excessively complicated calculations.

  2. Superiority of Classification Tree versus Cluster, Fuzzy and Discriminant Models in a Heartbeat Classification System

    PubMed Central

    Krasteva, Vessela; Jekova, Irena; Leber, Remo; Schmid, Ramun; Abächerli, Roger

    2015-01-01

    This study presents a 2-stage heartbeat classifier of supraventricular (SVB) and ventricular (VB) beats. Stage 1 performs computationally efficient classification of SVB beats, using a simple correlation threshold criterion for finding a close match with a predominant normal (reference) beat template. The non-matched beats are next subjected to measurement of 20 basic features, tracking the beat and reference template morphology and RR-variability, for subsequent refined classification into the SVB or VB class by Stage 2. Four linear classifiers are compared: cluster, fuzzy, linear discriminant analysis (LDA) and classification tree (CT), all subjected to iterative training for selection of the optimal feature space among an extended 210-sized set embodying interactive second-order effects between the 20 independent features. The optimization process minimizes, at equal weight, the false positives in the SVB class and the false negatives in the VB class. Training with the European ST-T, AHA, and MIT-BIH Supraventricular Arrhythmia databases found the best performance settings of all classification models: Cluster (30 features), Fuzzy (72 features), LDA (142 coefficients), CT (221 decision nodes), with the top-3 best scored features: normalized current RR-interval, higher/lower frequency content ratio, beat-to-template correlation. Unbiased test-validation with the MIT-BIH Arrhythmia database ranks the classifiers in descending order of their specificity for the SVB class: CT (99.9%), LDA (99.6%), Cluster (99.5%), Fuzzy (99.4%); sensitivity for ventricular ectopic beats as part of the VB class (commonly reported in published beat-classification studies): CT (96.7%), Fuzzy (94.4%), LDA (94.2%), Cluster (92.4%); positive predictivity: CT (99.2%), Cluster (93.6%), LDA (93.0%), Fuzzy (92.4%). CT has superior accuracy by 0.3–6.8 percentage points, with the added advantage of easy model complexity configuration by pruning the tree, which consists of easily interpretable ‘if-then’ rules. PMID:26461492
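
    The two-stage scheme can be illustrated roughly as follows: a correlation gate against the reference beat template, followed by a classification tree over basic features for the non-matched beats. The threshold, feature set and labels are placeholders, not the study's trained settings.

```python
# Illustrative two-stage scheme: a correlation gate against a reference
# beat template, then a classification tree on morphology/RR features for
# the non-matched beats. Thresholds and features here are placeholders.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def stage1_matches(beat, template, thr=0.98):
    """True if the beat closely matches the predominant (reference) beat."""
    r = np.corrcoef(beat, template)[0, 1]
    return r >= thr

# Stage 2: a tree trained on basic features of the non-matched beats
rng = np.random.default_rng(1)
F_train = rng.normal(size=(200, 20))      # 20 basic features per beat
y_train = rng.integers(0, 2, size=200)    # 0 = SVB, 1 = VB (synthetic)
stage2 = DecisionTreeClassifier(max_depth=5).fit(F_train, y_train)

def classify_beat(beat, template, features):
    if stage1_matches(beat, template):
        return "SVB"
    return "VB" if stage2.predict(features.reshape(1, -1))[0] == 1 else "SVB"
```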

  3. Graduates employment classification using data mining approach

    NASA Astrophysics Data System (ADS)

    Aziz, Mohd Tajul Rizal Ab; Yusof, Yuhanis

    2016-08-01

    Data mining is a platform to extract hidden knowledge from a collection of data. This study investigates a suitable classification model to classify graduate employment for one of the MARA Professional Colleges (KPM) in Malaysia. The aim is to classify graduates as employed, unemployed or pursuing further study. Five data mining algorithms offered in WEKA were used: Naïve Bayes, Logistic regression, Multilayer perceptron, k-nearest neighbor and Decision tree J48. Based on the obtained results, Logistic regression produces the highest classification accuracy, at 92.5%. This result was obtained using 80% of the data for training and 20% for testing. The produced classification model will benefit the management of the college, as it provides insight into the quality of graduates that they produce and how their curriculum can be improved to cater to the needs of industry.

  4. Support-vector-machine tree-based domain knowledge learning toward automated sports video classification

    NASA Astrophysics Data System (ADS)

    Xiao, Guoqiang; Jiang, Yang; Song, Gang; Jiang, Jianmin

    2010-12-01

    We propose a support-vector-machine (SVM) tree to hierarchically learn from domain knowledge represented by low-level features toward automatic classification of sports videos. The proposed SVM tree adopts a binary tree structure to exploit the nature of the SVM's binary classification, where each internal node is a single SVM learning unit and each external node represents a classified output type. Such an SVM tree presents a number of advantages, which include: 1. low computing cost; 2. integrated learning and classification while preserving each individual SVM's learning strength; and 3. flexibility in both structure and learning modules, where different numbers of nodes and features can be added to address specific learning requirements, and various learning models can be added as individual nodes, such as neural networks, AdaBoost, hidden Markov models, dynamic Bayesian networks, etc. Experiments show that the proposed SVM tree achieves good performance in sports video classification.
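
    The binary SVM tree structure described above can be sketched as a recursive node class in which each internal node trains one SVM to split its incoming label set in two. The class-grouping rule (splitting the label list in half) and the synthetic data are simplifications for illustration; they are not the authors' learning procedure.

```python
# Sketch of a binary SVM tree: each internal node is a single SVM that
# splits the incoming label set in two; leaves carry the final class.
# Assumes every class is represented in each split's training subset.
import numpy as np
from sklearn.svm import SVC

class SVMTreeNode:
    def __init__(self, classes):
        self.classes = list(classes)
        self.svm = self.left = self.right = None

    def fit(self, X, y):
        if len(self.classes) == 1:              # external node: one class
            return self
        mid = len(self.classes) // 2
        left_c, right_c = self.classes[:mid], self.classes[mid:]
        side = np.isin(y, right_c).astype(int)  # 0 -> left group, 1 -> right
        self.svm = SVC(kernel="rbf").fit(X, side)
        self.left = SVMTreeNode(left_c).fit(X[side == 0], y[side == 0])
        self.right = SVMTreeNode(right_c).fit(X[side == 1], y[side == 1])
        return self

    def predict_one(self, x):
        if self.svm is None:
            return self.classes[0]
        branch = self.right if self.svm.predict(x.reshape(1, -1))[0] else self.left
        return branch.predict_one(x)

# usage sketch on synthetic 4-class data
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = rng.integers(0, 4, size=200)
root = SVMTreeNode(classes=[0, 1, 2, 3]).fit(X, y)
print(root.predict_one(X[0]))
```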

  5. The process and utility of classification and regression tree methodology in nursing research

    PubMed Central

    Kuhn, Lisa; Page, Karen; Ward, John; Worrall-Carter, Linda

    2014-01-01

    Aim This paper presents a discussion of classification and regression tree analysis and its utility in nursing research. Background Classification and regression tree analysis is an exploratory research method used to illustrate associations between variables not suited to traditional regression analysis. Complex interactions are demonstrated between covariates and variables of interest in inverted tree diagrams. Design Discussion paper. Data sources English language literature was sourced from eBooks, Medline Complete and CINAHL Plus databases, Google and Google Scholar, hard copy research texts and retrieved reference lists for terms including classification and regression tree* and derivatives and recursive partitioning from 1984–2013. Discussion Classification and regression tree analysis is an important method used to identify previously unknown patterns amongst data. Whilst there are several reasons to embrace this method as a means of exploratory quantitative research, issues regarding quality of data as well as the usefulness and validity of the findings should be considered. Implications for Nursing Research Classification and regression tree analysis is a valuable tool to guide nurses to reduce gaps in the application of evidence to practice. With the ever-expanding availability of data, it is important that nurses understand the utility and limitations of the research method. Conclusion Classification and regression tree analysis is an easily interpreted method for modelling interactions between health-related variables that would otherwise remain obscured. Knowledge is presented graphically, providing insightful understanding of complex and hierarchical relationships in an accessible and useful way to nursing and other health professions. PMID:24237048

  6. Real-time classification of humans versus animals using profiling sensors and hidden Markov tree model

    NASA Astrophysics Data System (ADS)

    Hossen, Jakir; Jacobs, Eddie L.; Chari, Srikant

    2015-07-01

    Linear pyroelectric array sensors have enabled useful classifications of objects such as humans and animals to be performed with relatively low-cost hardware in border and perimeter security applications. Ongoing research has sought to improve the performance of these sensors through signal processing algorithms. In the research presented here, we introduce the use of hidden Markov tree (HMT) models for object recognition in images generated by linear pyroelectric sensors. HMTs are trained to statistically model the wavelet features of individual objects through an expectation-maximization learning process. Human versus animal classification for a test object is made by evaluating its wavelet features against the trained HMTs using the maximum-likelihood criterion. The classification performance of this approach is compared to two other techniques: a texture, shape, and spectral component features (TSSF) based classifier and a speeded-up robust features (SURF) classifier. The evaluation indicates that among the three techniques, the wavelet-based HMT model works well, is robust, and has improved classification performance compared to a SURF-based algorithm in equivalent computation time. When compared to the TSSF-based classifier, the HMT model has slightly degraded performance but almost an order of magnitude improvement in computation time, enabling real-time implementation.

  7. Multivariate Approaches to Classification in Extragalactic Astronomy

    NASA Astrophysics Data System (ADS)

    Fraix-Burnet, Didier; Thuillard, Marc; Chattopadhyay, Asis Kumar

    2015-08-01

    Clustering objects into synthetic groups is a natural activity of any science. Astrophysics is not an exception and is now facing a deluge of data. For galaxies, the one-century old Hubble classification and the Hubble tuning fork are still largely in use, together with numerous mono- or bivariate classifications most often made by eye. However, a classification must be driven by the data, and sophisticated multivariate statistical tools are used more and more often. In this paper we review these different approaches in order to situate them in the general context of unsupervised and supervised learning. We insist on the astrophysical outcomes of these studies to show that multivariate analyses provide an obvious path toward a renewal of our classification of galaxies and are invaluable tools to investigate the physics and evolution of galaxies.

  8. Classification of depressive disorders: a multiaxial approach.

    PubMed

    Zung, W W; Mahorney, S L; Davidson, J

    1984-07-01

    Classification of depressive disorders can be performed following the Linnaean binomial nomenclature model by defining depression as a genus and the subtypes as species. A medical classification of depressive disorders would need to be jointly inclusive and mutually exclusive between subtypes, and to provide time of onset, severity of illness, pathological process, prognosis, and treatment indications. Etiologic, phenomenologic (clinical, descriptive), statistical, and biological approaches have been used to classify depression. A convergence of these various systems in different combinations has evolved, and a consensus approach using the multiaxial system, such as DSM-III, has advanced psychiatric nomenclature to provide clinical relevance and heuristic possibilities. PMID:6735996

  9. Biosensor Approach to Psychopathology Classification

    PubMed Central

    Koshelev, Misha; Lohrenz, Terry; Vannucci, Marina; Montague, P. Read

    2010-01-01

    We used a multi-round, two-party exchange game in which a healthy subject played a subject diagnosed with a DSM-IV (Diagnostic and Statistics Manual-IV) disorder, and applied a Bayesian clustering approach to the behavior exhibited by the healthy subject. The goal was to characterize quantitatively the style of play elicited in the healthy subject (the proposer) by their DSM-diagnosed partner (the responder). The approach exploits the dynamics of the behavior elicited in the healthy proposer as a biosensor for cognitive features that characterize the psychopathology group at the other side of the interaction. Using a large cohort of subjects (n = 574), we found statistically significant clustering of proposers' behavior overlapping with a range of DSM-IV disorders including autism spectrum disorder, borderline personality disorder, attention deficit hyperactivity disorder, and major depressive disorder. To further validate these results, we developed a computer agent to replace the human subject in the proposer role (the biosensor) and show that it can also detect these same four DSM-defined disorders. These results suggest that the highly developed social sensitivities that humans bring to a two-party social exchange can be exploited and automated to detect important psychopathologies, using an interpersonal behavioral probe not directly related to the defining diagnostic criteria. PMID:20975934

  10. Investigating multiple data sources for tree species classification in temperate forest and use for single tree delineation

    NASA Astrophysics Data System (ADS)

    Heinzel, Johannes; Koch, Barbara

    2012-08-01

    Despite numerous existing studies on tree species classification, the difficult conditions in dense and mixed temperate forests still pose a challenging task. This study attempts to push beyond existing limitations by investigating comprehensive sets of different types of features derived from multiple data sources. These sets include features from full-waveform LiDAR, LiDAR height metrics, texture, hyperspectral data and colour infrared (CIR) images. Support vector machines (SVM) are used as an appropriate classifier to handle the high-dimensional feature space, and an internal ranking method allows the determination of the most important parameters. In addition, for species discrimination, the focus is put on a scale applicable to single trees. While most experience at these scales derives from boreal forests and is often restricted to two or three species, we concentrate on more complex temperate forests. The four main species pine (Pinus sylvestris), spruce (Picea abies), oak (Quercus petraea) and beech (Fagus sylvatica) are classified with accuracies of 89.7%, 88.7%, 83.1% and 90.7%, respectively. Instead of directly classifying delineated single trees, a raster cell based classification is conducted. This overcomes problems with erroneous polygons of merged tree crowns, which occur frequently within dense deciduous or mixed canopies. Lastly, we further test the possibility of correcting these failures by combining species classification with single tree delineation.

  11. Exploring full-waveform LiDAR parameters for tree species classification

    NASA Astrophysics Data System (ADS)

    Heinzel, Johannes; Koch, Barbara

    2011-02-01

    Precise tree species classification with high density full-waveform LiDAR data is a key research topic for automated forest inventory. Most approaches are confined to geometric features and only a few consider intensity values. Since full-waveform data offer a much larger amount of deducible information, this study explores a high number of parameter and feature combinations. The variables having the highest impact on species differentiation are determined. To handle the large amount of airborne full-waveform data and to extract a comprehensive number of variable combinations, an improved algorithm was developed. The full-waveform point parameters amplitude, width, range-corrected intensity and total number of targets within a beam are transferred into rasters covering a test site of 10 km². It was possible to isolate the three most important variables, based on the intensity, the width and the total number of targets. Up to six tree species were classified with an overall accuracy of 57%; limiting the analysis to the four main species improved accuracy to 78%, and constraining it to just conifers and broadleaved trees allowed 91% to be classified correctly.

  12. Tree species classification in subtropical forests using small-footprint full-waveform LiDAR data

    NASA Astrophysics Data System (ADS)

    Cao, Lin; Coops, Nicholas C.; Innes, John L.; Dai, Jinsong; Ruan, Honghua; She, Guanghui

    2016-07-01

    The accurate classification of tree species is critical for the management of forest ecosystems, particularly subtropical forests, which are highly diverse and complex ecosystems. While airborne Light Detection and Ranging (LiDAR) technology offers significant potential to estimate forest structural attributes, the capacity of this new tool to classify species is less well known. In this research, full-waveform metrics were extracted by a voxel-based composite waveform approach and examined with a Random Forests classifier to discriminate six subtropical tree species (including Masson pine (Pinus massoniana Lamb.), Chinese fir (Cunninghamia lanceolata (Lamb.) Hook.), Slash pine (Pinus elliottii Engelm.), Sawtooth oak (Quercus acutissima Carruth.) and Chinese holly (Ilex chinensis Sims.)) at three levels of discrimination. As part of the analysis, the optimal voxel size for modelling the composite waveforms was investigated, the most important predictor metrics for species classification were assessed and the effect of scan angle on species discrimination was examined. Results demonstrate that all tree species were classified with relatively high accuracy (68.6% for six classes, 75.8% for the four main species and 86.2% for conifers versus broadleaved trees). Full-waveform metrics (based on height of median energy, waveform distance and number of waveform peaks) demonstrated high classification importance and were stable among various voxel sizes. The results also suggest that the voxel-based approach can alleviate some of the issues associated with large scan angles. In summary, the results indicate that full-waveform LiDAR data have significant potential for tree species classification in subtropical forests.
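
    A minimal sketch of the classification stage, assuming the voxel-composited full-waveform metrics have already been extracted, might look like the following; the metric names and data are placeholders.

```python
# Hedged sketch: a Random Forests classifier over voxel-composited
# full-waveform metrics (columns are placeholders for metrics such as
# height of median energy, waveform distance, number of peaks).
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
metric_names = ["HOME", "waveform_distance", "n_peaks", "front_slope"]
X = rng.normal(size=(300, len(metric_names)))   # synthetic tree-level metrics
y = rng.integers(0, 6, size=300)                # six species codes (synthetic)

rf = RandomForestClassifier(n_estimators=500, oob_score=True, random_state=0)
rf.fit(X, y)
print("OOB accuracy:", rf.oob_score_)
for name, imp in sorted(zip(metric_names, rf.feature_importances_),
                        key=lambda t: -t[1]):
    print(f"{name}: {imp:.3f}")                 # metric importance ranking
```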

  13. Multiclass cancer classification by using fuzzy support vector machine and binary decision tree with gene selection.

    PubMed

    Mao, Yong; Zhou, Xiaobo; Pi, Daoying; Sun, Youxian; Wong, Stephen T C

    2005-06-30

    We investigate the problem of multiclass cancer classification with gene selection from gene expression data. Two multiclass classifiers with gene selection are constructed: a fuzzy support vector machine (FSVM) with gene selection and a binary classification tree based on SVM with gene selection. Using the F-test and recursive feature elimination based on SVM (SVM-RFE) as gene selection methods, the binary classification tree based on SVM with the F-test, the binary classification tree based on SVM with SVM-RFE, and the FSVM with SVM-RFE are tested in our experiments. To accelerate computation, preselection of the strongest genes is also used. The proposed techniques are applied to analyze breast cancer data, small round blue-cell tumors, and acute leukemia data. Compared to existing multiclass cancer classifiers and to the SVM-based binary classification trees with F-test or SVM-RFE gene selection mentioned in this paper, the FSVM with SVM-RFE can find the most important genes that affect certain types of cancer, with high recognition accuracy.
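
    The SVM-based recursive feature elimination (SVM-RFE) gene selection step could be sketched as below, with a linear SVM trained on the retained genes afterwards; expression values and labels are synthetic.

```python
# Sketch of SVM-based recursive feature elimination (SVM-RFE) for gene
# pre-selection, followed by a classifier on the retained genes.
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(80, 1000))    # 80 samples x 1000 genes (synthetic)
y = rng.integers(0, 3, size=80)    # three tumour types (synthetic)

# RFE requires a linear SVM so that feature weights can be ranked
selector = RFE(SVC(kernel="linear"), n_features_to_select=50, step=0.1)
selector.fit(X, y)
important_genes = np.flatnonzero(selector.support_)

clf = SVC(kernel="linear").fit(X[:, important_genes], y)
print(clf.score(X[:, important_genes], y))
```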

  14. A statistical approach to root system classification.

    PubMed

    Bodner, Gernot; Leitner, Daniel; Nakhforoosh, Alireza; Sobotik, Monika; Moder, Karl; Kaul, Hans-Peter

    2013-01-01

    Plant root systems have a key role in ecology and agronomy. In spite of the fast increase in root studies, there is still no classification that allows distinguishing among the distinctive characteristics within the diversity of rooting strategies. Our hypothesis is that a multivariate approach for "plant functional type" identification in ecology can be applied to the classification of root systems. The classification method presented is based on a data-defined statistical procedure without a priori decisions on the classifiers. The study demonstrates that principal component based rooting types provide efficient and meaningful multi-trait classifiers. The classification method is exemplified with simulated root architectures and morphological field data. Simulated root architectures showed that morphological attributes with spatial distribution parameters capture the most distinctive features within root system diversity. While developmental type (tap vs. shoot-borne systems) is a strong but coarse classifier, topological traits provide the most detailed differentiation among distinctive groups. The adequacy of commonly available morphological traits for classification is supported by the field data. Rooting types emerging from the measured data were mainly distinguished by diameter/weight- and density-dominated types. Similarity of root systems within distinctive groups was the joint result of phylogenetic relation and environmental as well as human selection pressure. We concluded that the data-defined classification is appropriate for the integration of knowledge obtained with different root measurement methods and at various scales. Currently, root morphology is the most promising basis for classification due to widely used common measurement protocols. To capture details of root diversity, efforts in architectural measurement techniques are essential. PMID:23914200

  15. A statistical approach to root system classification

    PubMed Central

    Bodner, Gernot; Leitner, Daniel; Nakhforoosh, Alireza; Sobotik, Monika; Moder, Karl; Kaul, Hans-Peter

    2013-01-01

    Plant root systems have a key role in ecology and agronomy. In spite of the fast increase in root studies, there is still no classification that allows distinguishing among the distinctive characteristics within the diversity of rooting strategies. Our hypothesis is that a multivariate approach for “plant functional type” identification in ecology can be applied to the classification of root systems. The classification method presented is based on a data-defined statistical procedure without a priori decisions on the classifiers. The study demonstrates that principal component based rooting types provide efficient and meaningful multi-trait classifiers. The classification method is exemplified with simulated root architectures and morphological field data. Simulated root architectures showed that morphological attributes with spatial distribution parameters capture the most distinctive features within root system diversity. While developmental type (tap vs. shoot-borne systems) is a strong but coarse classifier, topological traits provide the most detailed differentiation among distinctive groups. The adequacy of commonly available morphological traits for classification is supported by the field data. Rooting types emerging from the measured data were mainly distinguished by diameter/weight- and density-dominated types. Similarity of root systems within distinctive groups was the joint result of phylogenetic relation and environmental as well as human selection pressure. We concluded that the data-defined classification is appropriate for the integration of knowledge obtained with different root measurement methods and at various scales. Currently, root morphology is the most promising basis for classification due to widely used common measurement protocols. To capture details of root diversity, efforts in architectural measurement techniques are essential. PMID:23914200

  17. The use of airborne hyperspectral data for tree species classification in a species-rich Central European forest area

    NASA Astrophysics Data System (ADS)

    Richter, Ronny; Reu, Björn; Wirth, Christian; Doktor, Daniel; Vohland, Michael

    2016-10-01

    The success of remote sensing approaches to assess tree species diversity in a heterogeneously mixed forest stand depends on the availability of both appropriate data and suitable classification algorithms. To separate the high number of in total ten broadleaf tree species in a small structured floodplain forest, the Leipzig Riverside Forest, we introduce a majority based classification approach for Discriminant Analysis based on Partial Least Squares (PLS-DA), which was tested against Random Forest (RF) and Support Vector Machines (SVM). The classifier performance was tested on different sets of airborne hyperspectral image data (AISA DUAL) that were acquired on single dates in August and September and also stacked to a composite product. Shadowed gaps and shadowed crown parts were eliminated via spectral mixture analysis (SMA) prior to the pixel-based classification. Training and validation sets were defined spectrally with the conditioned Latin hypercube method as a stratified random sampling procedure. In the validation, PLS-DA consistently outperformed the RF and SVM approaches on all datasets. The additional use of spectral variable selection (CARS, "competitive adaptive reweighted sampling") combined with PLS-DA further improved classification accuracies. Up to 78.4% overall accuracy was achieved for the stacked dataset. The image recorded in August provided slightly higher accuracies than the September image, regardless of the applied classifier.
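
    PLS-DA itself can be sketched by regressing one-hot class indicators with partial least squares and taking the argmax of the predictions, as below; this omits the paper's majority-vote extension, shadow masking and CARS variable selection, and the spectra are synthetic.

```python
# Minimal PLS-DA sketch: PLS regression on one-hot class indicators,
# with the predicted class taken as the argmax. Spectra are synthetic
# stand-ins for AISA DUAL pixels.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.preprocessing import LabelBinarizer

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 367))        # pixels x hyperspectral bands
y = rng.integers(0, 10, size=400)      # ten broadleaf species codes

lb = LabelBinarizer()
Y = lb.fit_transform(y)                # one-hot class indicator matrix

pls = PLSRegression(n_components=15).fit(X, Y)
Y_hat = pls.predict(X)
y_pred = lb.classes_[np.argmax(Y_hat, axis=1)]
print("training accuracy:", np.mean(y_pred == y))
```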

  18. Remote sensing image classification method based on evidence theory and decision tree

    NASA Astrophysics Data System (ADS)

    Li, Xuerong; Xing, Qianguo; Kang, Lingyan

    2010-11-01

    Remote sensing image classification is an important and complex problem. Conventional remote sensing image classification methods are mostly based on Bayesian subjective probability theory, which has shortcomings in handling uncertainty. This paper first introduces evidence theory and the decision tree method, and then focuses on the support degree function through which evidence theory is applied to pattern recognition. Combining D-S evidence theory with the decision tree algorithm, a D-S evidence theory decision tree method is proposed in which the support degree function serves as the link. The method is used to classify classes such as water, urban land and green land, with class-specific spectral feature parameters as input values, producing three support-degree classification images. A proper threshold value is then chosen and the corresponding images are binarized. These images are then overlaid according to the classification types to obtain the initial result, followed by an accuracy assessment. If the initial classification accuracy does not meet the requirement, pixels with support degree below the threshold are reclassified until the final classification meets the accuracy requirements. Compared to Bayesian classification, the main advantages of this method are that it can perform reclassification and reach a very high accuracy. The method is finally used to classify the land use of the Yantai Economic and Technological Development Zone into four classes, such as urban land, green land and water, and it effectively supports the classification.
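
    The evidence-combination core of such an approach rests on Dempster's rule; a minimal sketch over three land-cover classes follows, with illustrative mass values rather than values derived from real imagery.

```python
# Sketch of Dempster's rule for combining two mass functions over the
# classes {water, urban, green}; the numbers are illustrative only.
from itertools import product

def combine(m1, m2):
    """Dempster's rule: masses are dicts keyed by frozensets of classes."""
    combined, conflict = {}, 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb                 # mass assigned to conflict
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

m_spectral = {frozenset({"water"}): 0.6, frozenset({"urban", "green"}): 0.4}
m_texture  = {frozenset({"water"}): 0.3, frozenset({"water", "urban"}): 0.7}
print(combine(m_spectral, m_texture))
```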

  19. Tree Species Classification By Multiseasonal High Resolution Satellite Data

    NASA Astrophysics Data System (ADS)

    Elatawneh, Alata; Wallner, Adelheid; Straub, Christoph; Schneider, Thomas; Knoke, Thomas

    2013-12-01

    Accurate forest tree species mapping is a fundamental issue for sustainable forest management and planning. Forest tree species mapping by means of remote sensing data is still a topic to be investigated. The Bavarian state institute of forestry is investigating the potential of using digital aerial images for forest management purposes. However, using aerial images is still cost- and time-consuming, in addition to their acquisition restrictions. New spaceborne sensor generations such as RapidEye, with a very high temporal resolution offering multiseasonal data, have the potential to improve forest tree species mapping. In this study, we investigated the potential of multiseasonal RapidEye data for mapping tree species in a mid-European forest in southern Germany. The RapidEye data of level A3 were collected on ten different dates in the years 2009, 2010 and 2011. For data analysis, a model was developed which combines the Spectral Angle Mapper technique with 10-fold cross-validation. The analysis succeeded in differentiating four tree species: Norway spruce (Picea abies L.), Silver fir (Abies alba Mill.), European beech (Fagus sylvatica) and maple (Acer pseudoplatanus). The model's success was evaluated using digital aerial images acquired in the year 2009 and inventory point records from the 2008/09 inventory. Model results of the multiseasonal RapidEye data analysis achieved an overall accuracy of 76%. However, the success of the model was evaluated only for all the identified species together and not for individual species.
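
    The Spectral Angle Mapper rule underlying the model assigns each pixel to the reference spectrum with the smallest spectral angle; a minimal sketch with synthetic spectra is shown below (the 10-fold cross-validation wrapper is omitted).

```python
# Minimal Spectral Angle Mapper sketch: assign each pixel to the species
# whose reference spectrum makes the smallest angle with it.
import numpy as np

def spectral_angle(pixel, reference):
    cos = np.dot(pixel, reference) / (np.linalg.norm(pixel) * np.linalg.norm(reference))
    return np.arccos(np.clip(cos, -1.0, 1.0))

def sam_classify(pixels, references, class_names):
    angles = np.array([[spectral_angle(p, r) for r in references] for p in pixels])
    return [class_names[i] for i in np.argmin(angles, axis=1)]

# synthetic 5-band reference spectra for the four species in the study
names = ["spruce", "fir", "beech", "maple"]
refs = np.random.default_rng(0).uniform(0.05, 0.6, size=(4, 5))
pixels = refs + np.random.default_rng(1).normal(0, 0.01, size=(4, 5))
print(sam_classify(pixels, refs, names))
```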

  20. A Modified Decision Tree Algorithm Based on Genetic Algorithm for Mobile User Classification Problem

    PubMed Central

    Liu, Dong-sheng; Fan, Shu-jiang

    2014-01-01

    In order to offer mobile customers better service, mobile users should first be classified. To address the limitations of previous classification methods, this paper puts forward a modified decision tree algorithm for mobile user classification, which introduces a genetic algorithm to optimize the results of the decision tree algorithm. We also take context information as classification attributes for the mobile user and classify the context into public and private context classes. We then analyze the processes and operators of the algorithm. Finally, we conduct an experiment on mobile users with the algorithm: we can classify the mobile users into Basic service user, E-service user, Plus service user, and Total service user classes, and we can also derive some rules about the mobile users. Compared to the C4.5 decision tree algorithm and the SVM algorithm, the algorithm proposed in this paper has higher accuracy and greater simplicity. PMID:24688389

  1. A modified decision tree algorithm based on genetic algorithm for mobile user classification problem.

    PubMed

    Liu, Dong-sheng; Fan, Shu-jiang

    2014-01-01

    In order to offer mobile customers better service, mobile users should first be classified. To address the limitations of previous classification methods, this paper puts forward a modified decision tree algorithm for mobile user classification, which introduces a genetic algorithm to optimize the results of the decision tree algorithm. We also take context information as classification attributes for the mobile user and classify the context into public and private context classes. We then analyze the processes and operators of the algorithm. Finally, we conduct an experiment on mobile users with the algorithm: we can classify the mobile users into Basic service user, E-service user, Plus service user, and Total service user classes, and we can also derive some rules about the mobile users. Compared to the C4.5 decision tree algorithm and the SVM algorithm, the algorithm proposed in this paper has higher accuracy and greater simplicity. PMID:24688389

  3. A Systematic Approach to Subgroup Classification in Intellectual Disability

    ERIC Educational Resources Information Center

    Schalock, Robert L.; Luckasson, Ruth

    2015-01-01

    This article describes a systematic approach to subgroup classification based on a classification framework and sequential steps involved in the subgrouping process. The sequential steps are stating the purpose of the classification, identifying the classification elements, using relevant information, and using clearly stated and purposeful…

  5. Flow Analysis: A Novel Approach For Classification.

    PubMed

    Vakh, Christina; Falkova, Marina; Timofeeva, Irina; Moskvin, Alexey; Moskvin, Leonid; Bulatov, Andrey

    2016-09-01

    We suggest a novel approach for classification of flow analysis methods according to the conditions under which the mass transfer processes and chemical reactions take place in the flow mode: dispersion-convection flow methods and forced-convection flow methods. The first group includes continuous flow analysis, flow injection analysis, all injection analysis, sequential injection analysis, sequential injection chromatography, cross injection analysis, multi-commutated flow analysis, multi-syringe flow injection analysis, multi-pumping flow systems, loop flow analysis, and simultaneous injection effective mixing flow analysis. The second group includes segmented flow analysis, zone fluidics, flow batch analysis, sequential injection analysis with a mixing chamber, stepwise injection analysis, and multi-commutated stepwise injection analysis. The offered classification allows systematizing a large number of flow analysis methods. Recent developments and applications of dispersion-convection flow methods and forced-convection flow methods are presented. PMID:26364745

  6. Automatic Approach to VHR Satellite Image Classification

    NASA Astrophysics Data System (ADS)

    Kupidura, P.; Osińska-Skotak, K.; Pluto-Kossakowska, J.

    2016-06-01

    In this paper, we present a proposition for a fully automatic classification of VHR satellite images. Unlike the most widespread approaches (supervised classification, which requires prior definition of class signatures, or unsupervised classification, which must be followed by an interpretation of its results), the proposed method requires no human intervention except for the setting of the initial parameters. The presented approach is based on both spectral and textural analysis of the image and consists of 3 steps. The first step, the analysis of spectral data, relies on NDVI values. Its purpose is to distinguish between basic classes, such as water, vegetation and non-vegetation, which all differ significantly in their spectra and thus can be easily extracted based on spectral analysis. The second step relies on granulometric maps. These are the product of local granulometric analysis of an image and present information on the texture of each pixel's neighbourhood, depending on the texture grain. The purpose of texture analysis is to distinguish between classes that are spectrally similar but of different texture, e.g. bare soil from a built-up area, or low vegetation from a wooded area. Due to the use of granulometric analysis, based on mathematical morphology opening and closing, the results are resistant to the border effect (qualifying borders of objects in an image as spaces of high texture), which affects other methods of texture analysis such as GLCM statistics or fractal analysis. Therefore, the effectiveness of the analysis is relatively high. Several indices based on values of different granulometric maps have been developed to simplify the extraction of classes of different texture. The third and final step of the process relies on a vegetation index based on the near infrared and blue bands. Its purpose is to correct partially misclassified pixels. All the indices used in the classification model developed relate to reflectance values, so the preliminary step
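
    The first, NDVI-based step could be sketched as follows; the thresholds are illustrative, not the calibrated values of the proposed method.

```python
# Sketch of the first (spectral) step: split pixels into water, vegetation
# and non-vegetation from NDVI. Thresholds are illustrative only.
import numpy as np

def ndvi(nir, red):
    return (nir - red) / (nir + red + 1e-9)

def spectral_step(nir, red, water_thr=0.0, veg_thr=0.4):
    index = ndvi(nir, red)
    classes = np.full(index.shape, "non-vegetation", dtype=object)
    classes[index < water_thr] = "water"
    classes[index > veg_thr] = "vegetation"
    return classes

nir = np.array([0.02, 0.45, 0.30])
red = np.array([0.05, 0.10, 0.25])
print(spectral_step(nir, red))   # ['water', 'vegetation', 'non-vegetation']
```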

  7. Automated morphological analysis of bone marrow cells in microscopic images for diagnosis of leukemia: nucleus-plasma separation and cell classification using a hierarchical tree model of hematopoesis

    NASA Astrophysics Data System (ADS)

    Krappe, Sebastian; Wittenberg, Thomas; Haferlach, Torsten; Münzenmayer, Christian

    2016-03-01

    The morphological differentiation of bone marrow is fundamental for the diagnosis of leukemia. Currently, the counting and classification of the different types of bone marrow cells is done manually under the use of bright field microscopy. This is a time-consuming, subjective, tedious and error-prone process. Furthermore, repeated examinations of a slide may yield intra- and inter-observer variances. For that reason a computer assisted diagnosis system for bone marrow differentiation is pursued. In this work we focus (a) on a new method for the separation of nucleus and plasma parts and (b) on a knowledge-based hierarchical tree classifier for the differentiation of bone marrow cells in 16 different classes. Classification trees are easily interpretable and understandable and provide a classification together with an explanation. Using classification trees, expert knowledge (i.e. knowledge about similar classes and cell lines in the tree model of hematopoiesis) is integrated in the structure of the tree. The proposed segmentation method is evaluated with more than 10,000 manually segmented cells. For the evaluation of the proposed hierarchical classifier more than 140,000 automatically segmented bone marrow cells are used. Future automated solutions for the morphological analysis of bone marrow smears could potentially apply such an approach for the pre-classification of bone marrow cells and thereby shortening the examination time.

  8. Using the PDD Behavior Inventory as a Level 2 Screener: A Classification and Regression Trees Analysis

    ERIC Educational Resources Information Center

    Cohen, Ira L.; Liu, Xudong; Hudson, Melissa; Gillis, Jennifer; Cavalari, Rachel N. S.; Romanczyk, Raymond G.; Karmel, Bernard Z.; Gardner, Judith M.

    2016-01-01

    In order to improve discrimination accuracy between Autism Spectrum Disorder (ASD) and similar neurodevelopmental disorders, a data mining procedure, Classification and Regression Trees (CART), was used on a large multi-site sample of PDD Behavior Inventory (PDDBI) forms on children with and without ASD. Discrimination accuracy exceeded 80%,…

  9. Educational level and osteoporosis risk in postmenopausal Moroccan women: a classification tree analysis.

    PubMed

    Allali, Fadoua; Rostom, Samira; Bennani, Loubna; Abouqal, Redouane; Hajjaj-Hassouni, Najia

    2010-11-01

    The objectives of this study are (1) to evaluate whether the prevalence of osteoporosis and peripheral fractures might be influenced by educational level and (2) to develop a simple algorithm using a tree-based approach with education level and other easily collected clinical data that allows clinicians to classify women into varying levels of osteoporosis risk. A total of 356 women with a mean age of 58.9±7.7 years were included in this study. Patients were separated into four groups according to school educational level: group 1, no education (n=98 patients); group 2, elementary level (n=57 patients); group 3, secondary level (n=138 patients); and group 4, university level (n=66 patients). We observed dose-response linear relations between educational level and mean bone mineral density (BMD). The mean BMDs of education group 1 were lower compared with those of group 4 (by 10.39% at the lumbar spine, 10.8% at the trochanter, 16.8% at the wrist, and 8.8% at the femoral neck; p<0.05). Twelve percent of patients had peripheral fractures. The prevalence of peripheral fractures increased with lower educational levels. Logistic regression analysis revealed a significant independent increase in the risk of peripheral fracture in patients with no formal education (odds ratio, 5.68; 95% CI, 1.16-27.64) after adjustment for age, BMI and spine BMD. Using the classification tree, four predictors were identified as the most important determinants of osteoporosis risk: the level of education, physical activity, age>62 years and BMI<30 kg/m². This algorithm correctly classified 74% of the women with osteoporosis. Based on the area under the receiver-operator characteristic curve, the accuracy of the Classification and Regression Tree (CART) model was 0.79. Our findings suggest that a lower level of education was associated with significantly lower BMDs at the lumbar spine and the hip sites, and with a higher prevalence of osteoporosis at these sites in a dose-response manner, even after

  10. A Classification of Recent Widespread Tree Mortality in the Western US

    NASA Astrophysics Data System (ADS)

    Hicke, J. A.; Anderegg, W.; Allen, C. D.; Stephenson, N.

    2015-12-01

    Widespread tree mortality has been documented across the western United States in recent decades. Climate change has been implicated in these events, in particular warming and associated effects on tree stress and biotic disturbance agents. Given projected future warming, the capability of accurately predicting future tree mortality is critical. However, sufficient ecological understanding is needed to do so. Here we describe differences in various mortality types associated with spatial characteristics and climate drivers. We loosely classify mortality types into four categories: 1) widespread but low severity background mortality that has been increasing mainly because of greater stress associated with rising climatic water deficit; 2) tree die-offs that are driven by severe, hotter drought in which biotic agents play minor roles, such as sudden aspen decline; 3) tree die-offs in which hotter droughts combined with outbreaks of biotic agents, often less aggressive bark beetles, to cause mortality, such as piñon pine mortality in the Southwest; and 4) tree die-offs that were initiated or facilitated by droughts but which were associated with aggressive biotic agents that can kill healthy trees at high populations, such as mountain pine beetle outbreaks. An important use of this classification is the different pathways by which climate change can cause tree mortality. For some classes (background and primarily drought-driven mortality), predictions may be sufficiently accurate based on climate (drought) metrics. For classes in which biotic agents play a role, the direct warming effect on insects may occur through mechanisms not related to drought, and therefore predictions may need to include mechanisms other than drought. We note that this is a simplistic classification designed to facilitate understanding of tree mortality, and that overlap occurs among categories.

  11. Semi-automatic approach for music classification

    NASA Astrophysics Data System (ADS)

    Zhang, Tong

    2003-11-01

    Audio categorization is essential when managing a music database, either a professional library or a personal collection. However, a complete automation in categorizing music into proper classes for browsing and searching is not yet supported by today's technology. Also, the issue of music classification is subjective to some extent as each user may have his own criteria for categorizing music. In this paper, we propose the idea of semi-automatic music classification. With this approach, a music browsing system is set up which contains a set of tools for separating music into a number of broad types (e.g. male solo, female solo, string instruments performance, etc.) using existing music analysis methods. With results of the automatic process, the user may further cluster music pieces in the database into finer classes and/or adjust misclassifications manually according to his own preferences and definitions. Such a system may greatly improve the efficiency of music browsing and retrieval, while at the same time guarantee accuracy and user's satisfaction of the results. Since this semi-automatic system has two parts, i.e. the automatic part and the manual part, they are described separately in the paper, with detailed descriptions and examples of each step of the two parts included.

  12. A Nonparametric Approach to Estimate Classification Accuracy and Consistency

    ERIC Educational Resources Information Center

    Lathrop, Quinn N.; Cheng, Ying

    2014-01-01

    When cut scores for classifications occur on the total score scale, popular methods for estimating classification accuracy (CA) and classification consistency (CC) require assumptions about a parametric form of the test scores or about a parametric response model, such as item response theory (IRT). This article develops an approach to estimate CA…

  13. Effects of sample survey design on the accuracy of classification tree models in species distribution models

    USGS Publications Warehouse

    Edwards, T.C.; Cutler, D.R.; Zimmermann, N.E.; Geiser, L.; Moisen, G.G.

    2006-01-01

    We evaluated the effects of probabilistic (hereafter DESIGN) and non-probabilistic (PURPOSIVE) sample surveys on resultant classification tree models for predicting the presence of four lichen species in the Pacific Northwest, USA. Models derived from both survey forms were assessed using an independent data set (EVALUATION). Measures of accuracy as gauged by resubstitution rates were similar for each lichen species irrespective of the underlying sample survey form. Cross-validation estimates of prediction accuracies were lower than resubstitution accuracies for all species and both design types, and in all cases were closer to the true prediction accuracies based on the EVALUATION data set. We argue that greater emphasis should be placed on calculating and reporting cross-validation accuracy rates rather than simple resubstitution accuracy rates. Evaluation of the DESIGN and PURPOSIVE tree models on the EVALUATION data set shows significantly lower prediction accuracy for the PURPOSIVE tree models relative to the DESIGN models, indicating that non-probabilistic sample surveys may generate models with limited predictive capability. These differences were consistent across all four lichen species, with 11 of the 12 possible species and sample survey type comparisons having significantly lower accuracy rates. Some differences in accuracy were as large as 50%. The classification tree structures also differed considerably both among and within the modelled species, depending on the sample survey form. Overlap in the predictor variables selected by the DESIGN and PURPOSIVE tree models ranged from only 20% to 38%, indicating the classification trees fit the two evaluated survey forms on different sets of predictor variables. The magnitude of these differences in predictor variables throws doubt on ecological interpretation derived from prediction models based on non-probabilistic sample surveys. © 2006 Elsevier B.V. All rights reserved.
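
    The contrast the authors draw between resubstitution and cross-validation accuracy can be reproduced with a few lines; the data below are synthetic and the tree settings arbitrary.

```python
# Sketch contrasting resubstitution accuracy with cross-validated accuracy
# for a classification tree, as the abstract recommends reporting.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))            # synthetic plot-level predictors
y = rng.integers(0, 2, size=200)         # species presence/absence

tree = DecisionTreeClassifier(random_state=0).fit(X, y)
resub = tree.score(X, y)                                  # optimistic
cv = cross_val_score(DecisionTreeClassifier(random_state=0), X, y, cv=10).mean()
print(f"resubstitution={resub:.2f}  10-fold CV={cv:.2f}")
```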

  14. The minimum distance approach to classification

    NASA Technical Reports Server (NTRS)

    Wacker, A. G.; Landgrebe, D. A.

    1971-01-01

    The work to advance the state-of-the-art of minimum distance classification is reported. This is accomplished through a combination of theoretical and comprehensive experimental investigations based on multispectral scanner data. A survey of the literature for suitable distance measures was conducted and the results of this survey are presented. It is shown that minimum distance classification, using density estimators and Kullback-Leibler numbers as the distance measure, is equivalent to a form of maximum likelihood sample classification. It is also shown that for the parametric case, minimum distance classification is equivalent to nearest neighbor classification in the parameter space.
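
    A minimal sketch of minimum distance classification, assigning each sample to the nearest class mean, is given below; Euclidean distance is used here, whereas the report also considers other measures such as Kullback-Leibler numbers.

```python
# Minimal sketch of minimum distance classification: assign each sample
# to the class whose mean vector is closest (Euclidean distance here).
import numpy as np

def fit_class_means(X, y):
    return {c: X[y == c].mean(axis=0) for c in np.unique(y)}

def minimum_distance_predict(X, means):
    classes = list(means)
    d = np.stack([np.linalg.norm(X - means[c], axis=1) for c in classes], axis=1)
    return np.array(classes)[np.argmin(d, axis=1)]

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 4)), rng.normal(3, 1, (50, 4))])
y = np.array([0] * 50 + [1] * 50)
means = fit_class_means(X, y)
print((minimum_distance_predict(X, means) == y).mean())
```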

  15. Stroke Damage Detection Using Classification Trees on Electrical Bioimpedance Cerebral Spectroscopy Measurements

    PubMed Central

    Atefi, Seyed Reza; Seoane, Fernando; Thorlin, Thorleif; Lindecrantz, Kaj

    2013-01-01

    After cancer and cardiovascular disease, stroke is the third greatest cause of death worldwide. Given the limitations of the current imaging technologies used for stroke diagnosis, the need for portable non-invasive and less expensive diagnostic tools is crucial. Previous studies have suggested that electrical bioimpedance (EBI) measurements from the head might contain useful clinical information related to changes produced in the cerebral tissue after the onset of stroke. In this study, we recorded 720 EBI Spectroscopy (EBIS) measurements from two different head regions of 18 hemispheres of nine subjects. Three of these subjects had suffered a unilateral haemorrhagic stroke. A number of features based on structural and intrinsic frequency-dependent properties of the cerebral tissue were extracted. These features were then fed into a classification tree. The results show that a full classification of damaged and undamaged cerebral tissue was achieved after three hierarchical classification steps. Lastly, the performance of the classification tree was assessed using Leave-One-Out Cross Validation (LOO-CV). Although the results of this study are limited to a small database and the observations obtained must be verified further with a larger cohort of patients, these findings confirm that EBI measurements contain useful information for assessing the health of brain tissue after stroke and support the hypothesis that classification features based on Cole parameters, spectral information and the geometry of EBIS measurements are useful to differentiate between healthy and stroke-damaged brain tissue. PMID:23966181

  16. Computer-aided diagnosis of Alzheimer's disease using support vector machines and classification trees

    NASA Astrophysics Data System (ADS)

    Salas-Gonzalez, D.; Górriz, J. M.; Ramírez, J.; López, M.; Álvarez, I.; Segovia, F.; Chaves, R.; Puntonet, C. G.

    2010-05-01

    This paper presents a computer-aided diagnosis technique for improving the accuracy of the early diagnosis of Alzheimer-type dementia. The proposed methodology is based on the selection of voxels whose Welch's t-test statistic between the two classes, normal and Alzheimer images, is greater than a given threshold. The mean and standard deviation of the intensity values are calculated for the selected voxels and used as feature vectors for two different classifiers: support vector machines with a linear kernel and classification trees. The proposed methodology reaches greater than 95% accuracy in the classification task.
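
    The described pipeline (Welch's t-test voxel selection, then mean/standard deviation features fed to a linear SVM and a classification tree) can be sketched as follows with synthetic images and an illustrative threshold.

```python
# Sketch of the described pipeline: keep voxels whose Welch t-statistic
# between the normal and Alzheimer groups exceeds a threshold, then feed
# the mean and standard deviation of the selected voxels to the classifiers.
import numpy as np
from scipy.stats import ttest_ind
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 5000))          # 60 images x 5000 voxel intensities
y = rng.integers(0, 2, size=60)          # 0 = normal, 1 = Alzheimer (synthetic)

t, _ = ttest_ind(X[y == 0], X[y == 1], equal_var=False)   # Welch's t-test
selected = np.abs(t) > 2.0                                 # illustrative threshold
features = np.column_stack([X[:, selected].mean(axis=1),
                            X[:, selected].std(axis=1)])   # 2-D feature vectors

print(SVC(kernel="linear").fit(features, y).score(features, y))
print(DecisionTreeClassifier(max_depth=3).fit(features, y).score(features, y))
```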

  17. Automatic lung nodule classification with radiomics approach

    NASA Astrophysics Data System (ADS)

    Ma, Jingchen; Wang, Qian; Ren, Yacheng; Hu, Haibo; Zhao, Jun

    2016-03-01

    Lung cancer is the leading cause of cancer death. Malignant lung nodules have extremely high mortality, while some benign nodules do not need any treatment. Thus, accurate diagnosis of benign versus malignant nodules is necessary. Although an additional invasive biopsy or a second CT scan 3 months later may currently help radiologists make judgments, easier diagnostic approaches are urgently needed. In this paper, we propose a novel CAD method to distinguish benign from malignant lung cancer directly from CT images, which can not only improve the efficiency of tumor diagnosis but also greatly decrease the pain and risk to patients in the biopsy collection process. Briefly, following the state-of-the-art radiomics approach, 583 features were used in the first step to measure the nodules' intensity, shape, heterogeneity and multi-frequency information. Then, with the Random Forest method, we distinguish benign nodules from malignant nodules by analyzing all these features. Our proposed scheme was tested on all 79 CT scans with diagnosis data available in The Cancer Imaging Archive (TCIA), which contain 127 nodules, each annotated by at least one of the four radiologists participating in the project. This method achieved 82.7% accuracy in the classification of malignant primary lung nodules versus benign nodules. We believe it would bring much value to routine lung cancer diagnosis in CT imaging and provide improved decision support at much lower cost.

  18. The PhyloFacts FAT-CAT web server: ortholog identification and function prediction using fast approximate tree classification.

    PubMed

    Afrasiabi, Cyrus; Samad, Bushra; Dineen, David; Meacham, Christopher; Sjölander, Kimmen

    2013-07-01

    The PhyloFacts 'Fast Approximate Tree Classification' (FAT-CAT) web server provides a novel approach to ortholog identification using subtree hidden Markov model-based placement of protein sequences to phylogenomic orthology groups in the PhyloFacts database. Results on a data set of microbial, plant and animal proteins demonstrate FAT-CAT's high precision at separating orthologs and paralogs and robustness to promiscuous domains. We also present results documenting the precision of ortholog identification based on subtree hidden Markov model scoring. The FAT-CAT phylogenetic placement is used to derive a functional annotation for the query, including confidence scores and drill-down capabilities. PhyloFacts' broad taxonomic and functional coverage, with >7.3 M proteins from across the Tree of Life, enables FAT-CAT to predict orthologs and assign function for most sequence inputs. Four pipeline parameter presets are provided to handle different sequence types, including partial sequences and proteins containing promiscuous domains; users can also modify individual parameters. PhyloFacts trees matching the query can be viewed interactively online using the PhyloScope Javascript tree viewer and are hyperlinked to various external databases. The FAT-CAT web server is available at http://phylogenomics.berkeley.edu/phylofacts/fatcat/.

  19. An object-oriented approach for agricultural land classification using RapidEye imagery

    NASA Astrophysics Data System (ADS)

    Sang, H.; Zhai, L.; Zhang, J.; An, F.

    2015-06-01

    With the improvement of remote sensing technology, the spatial, structural and textural information of land covers is presented clearly in high resolution imagery, which enhances the ability of crop mapping. Since the satellite RapidEye was launched in 2009, high resolution multispectral imagery together with a wide red edge band has been utilized in vegetation monitoring. Red edge band related vegetation indices have improved land use classification and vegetation studies. RapidEye high resolution imagery acquired on May 29 and August 9 of 2012 was used in this study to evaluate the potential of the red edge band in agricultural land cover/use mapping using an object-oriented classification approach. A new object-oriented decision tree classifier was introduced in this study to map agricultural lands in the study area. Besides the five bands of the RapidEye image, the vegetation indices derived from the spectral bands and the structural and textural features are utilized as inputs for agricultural land cover/use mapping. The optimization of input features for classification by reducing redundant information improves the mapping precision by over 9% for AdaTree.WL and 5% for SVM; the accuracy is over 90% for both approaches. The time-phase characteristic is very important for different agricultural lands, and it improves the classification accuracy by 7% for AdaTree.WL and 6% for SVM.

  20. Identification and classification of dynamic event tree scenarios via possibilistic clustering: application to a steam generator tube rupture event.

    PubMed

    Mercurio, D; Podofillini, L; Zio, E; Dang, V N

    2009-11-01

    This paper illustrates a method to identify and classify scenarios generated in a dynamic event tree (DET) analysis. Identification and classification are carried out by means of an evolutionary possibilistic fuzzy C-means clustering algorithm, which takes into account not only the final system states but also the timing of the events and the process evolution. An application is presented for the scenarios generated following a steam generator tube rupture in a nuclear power plant. The scenarios are generated by the accident dynamic simulator (ADS), coupled to a RELAP code that simulates the thermal-hydraulic behavior of the plant and to an operators' crew model, which simulates their cognitive and procedure-guided responses. A set of 60 scenarios was generated by the ADS DET tool. The classification approach grouped the 60 scenarios into 4 classes of dominant scenarios, one of which was not anticipated a priori but was "discovered" by the classifier. The proposed approach may be considered a first effort towards applying identification and classification approaches to scenario post-processing for real-scale dynamic safety assessments. PMID:19819366
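
    A minimal sketch of standard fuzzy C-means clustering (not the evolutionary possibilistic variant used in the paper), applied to toy scenario feature vectors that stand in for event timings and final process states:

```python
# Minimal sketch: standard fuzzy C-means clustering of scenario feature vectors.
import numpy as np

def fuzzy_c_means(X, c, m=2.0, max_iter=200, tol=1e-5, seed=0):
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)            # memberships of each scenario sum to 1
    for _ in range(max_iter):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        # standard FCM membership update: u_ik = 1 / sum_j (d_ik / d_ij)^(2/(m-1))
        U_new = 1.0 / ((d ** (2 / (m - 1))) * ((1.0 / d) ** (2 / (m - 1))).sum(axis=1, keepdims=True))
        if np.abs(U_new - U).max() < tol:
            return centers, U_new
        U = U_new
    return centers, U

# toy scenarios: [time of operator action, normalized final pressure]
rng = np.random.default_rng(1)
scenarios = np.vstack([rng.normal(mu, 0.1, (20, 2)) for mu in ([1, 1], [2, 0.5], [3, 0.2])])
centers, U = fuzzy_c_means(scenarios, c=3)
print(np.round(centers, 2))
```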

  1. A SYSTEMATIC APPROACH TO THE CLASSIFICATION OF DISEASES

    PubMed Central

    Murthy, A.R.V.

    1993-01-01

    Ayurvedic texts have adopted multiple approaches to the classification of diseases. Caraka, while choosing a binary classification in Vimana sthana, declares that the classifications may be numerable or innumerable based on the criteria chosen for such classification. He gives the individual full liberty to devise newer classifications, provided the criteria are different. Taking a cue from this statement, an attempt has been made to categorize the diseases mentioned in Ayurvedic texts under different systems, in keeping with current practice in the Western medical sciences. PMID:22556612

  2. Classification of the PALMS single particle mass spectral data from Atlanta by regression tree analysis

    NASA Astrophysics Data System (ADS)

    Middlebrook, A. M.; Murphy, D. M.; Lee, S.; Thomson, D. S.

    2001-12-01

    During the Atlanta Supersites project in August 1999, the PALMS (Particle Analysis by Laser Mass Spectrometry) instrument collected over 500,000 individual particle spectra. The Atlanta data were originally analyzed by examining combinations of peaks and relative peak areas [Lee et al., 2001a,b], and a wide range of particle components such as sulfate, nitrate, mineral species, metals, organic species, and elemental carbon were detected. To further study the dataset, a classification program using regression tree analysis was developed and applied. Spectral data were compressed into a lower-resolution spectrum (every 0.25 mass units) and a list of peak areas (every mass unit). Each spectrum started as a normalized classification vector by itself. If the dot product of two classification vectors was above a certain threshold, they were combined into a new classification. The new classification vector was a normalized running average of the classifications being combined. In subsequent steps, the threshold for combining classifications was continuously lowered until a reasonable number of classifications remained. After the final iteration, each spectrum was compared individually with the entire set of classification vectors. Classifications were also combined manually. The classification results from the Atlanta data are generally consistent with those determined by peak identification. However, the classification program identified specific patterns in the mass spectra that were not found by peak identification and generated new particle types. Furthermore, rare particle types that may affect human health were studied in more detail. A description of the classification program as well as the results for the Atlanta data will be presented. Lee, S.-H., D. M. Murphy, D. S. Thomson, and A. M. Middlebrook, Chemical components of single particles measured with particle analysis by laser mass spectrometry (PALMS) during the Atlanta Supersites Project
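
    The dot-product merging step described above can be sketched as follows; the spectra, thresholds, and stopping rule are illustrative placeholders rather than the PALMS program's actual settings.

```python
# Minimal sketch: merge normalized spectra whose dot product exceeds a (progressively lowered) threshold.
import numpy as np

def cluster_spectra(spectra, thresholds=(0.95, 0.90, 0.85, 0.80)):
    # each spectrum starts as its own class, normalized to unit length
    classes = [s / np.linalg.norm(s) for s in spectra]
    counts = [1] * len(classes)
    for thr in thresholds:               # lower the similarity threshold step by step
        merged = True
        while merged:
            merged = False
            for i in range(len(classes)):
                for j in range(i + 1, len(classes)):
                    if np.dot(classes[i], classes[j]) >= thr:
                        # new class vector = normalized running average of the two classes
                        w_i, w_j = counts[i], counts[j]
                        new = (w_i * classes[i] + w_j * classes[j]) / (w_i + w_j)
                        classes[i] = new / np.linalg.norm(new)
                        counts[i] = w_i + w_j
                        del classes[j]; del counts[j]
                        merged = True
                        break
                if merged:
                    break
    return classes, counts

spectra = np.abs(np.random.default_rng(0).normal(size=(20, 100)))   # toy spectra
classes, counts = cluster_spectra(spectra)
print(len(classes), counts)
```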

  3. Classification tree and minimum-volume ellipsoid analyses of the distribution of ponderosa pine in the western USA

    USGS Publications Warehouse

    Norris, Jodi R.; Jackson, Stephen T.; Betancourt, Julio L.

    2006-01-01

    Aim: Ponderosa pine (Pinus ponderosa Douglas ex Lawson & C. Lawson) is an economically and ecologically important conifer that has a wide geographic range in the western USA, but is mostly absent from the geographic centre of its distribution - the Great Basin and adjoining mountain ranges. Much of its modern range was achieved by migration of geographically distinct Sierra Nevada (P. ponderosa var. ponderosa) and Rocky Mountain (P. ponderosa var. scopulorum) varieties in the last 10,000 years. Previous research has confirmed genetic differences between the two varieties, and measurable genetic exchange occurs where their ranges now overlap in western Montana. A variety of approaches in bioclimatic modelling is required to explore the ecological differences between these varieties and their implications for historical biogeography and impending changes in western landscapes. Location: Western USA. Methods: We used a classification tree analysis and a minimum-volume ellipsoid as models to explain the broad patterns of distribution of ponderosa pine in modern environments using climatic and edaphic variables. Most biogeographical modelling assumes that the target group represents a single, ecologically uniform taxonomic population. Classification tree analysis does not require this assumption because it allows the creation of pathways that predict multiple positive and negative outcomes. Thus, classification tree analysis can be used to test the ecological uniformity of the species. In addition, a multidimensional ellipsoid was constructed to describe the niche of each variety of ponderosa pine, and distances from the niche were calculated and mapped on a 4-km grid for each ecological variable. Results: The resulting classification tree identified three dominant pathways predicting ponderosa pine presence. Two of these three pathways correspond roughly to the distribution of var. ponderosa, and the third pathway generally corresponds to the distribution of var
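
    A minimal sketch of the ellipsoid-style niche description, using scikit-learn's minimum covariance determinant estimator (a close relative of the minimum-volume ellipsoid) on synthetic climate variables; distances from the fitted ellipsoid flag grid cells outside the modelled niche.

```python
# Minimal sketch: robust ellipsoid fit to presence sites and distances for candidate grid cells.
import numpy as np
from sklearn.covariance import MinCovDet

rng = np.random.default_rng(0)
# synthetic [mean temperature, annual precipitation] at presence sites
presence = rng.normal(loc=[15.0, 600.0], scale=[2.0, 80.0], size=(200, 2))
# synthetic candidate map cells to score
grid = rng.normal(loc=[15.0, 600.0], scale=[6.0, 300.0], size=(10, 2))

mcd = MinCovDet(random_state=0).fit(presence)
dist = mcd.mahalanobis(grid)      # squared Mahalanobis distance from the niche centroid
print(np.round(dist, 1))          # large values suggest cells outside the modelled niche
```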

  4. Classification of Tree Species in Overstorey Canopy of Subtropical Forest Using QuickBird Images

    PubMed Central

    Lin, Chinsu; Popescu, Sorin C.; Thomson, Gavin; Tsogt, Khongor; Chang, Chein-I

    2015-01-01

    This paper proposes a supervised classification scheme to identify 40 tree species (2 coniferous, 38 broadleaf) belonging to 22 families and 36 genera in high spatial resolution QuickBird multispectral images (HMS). Overall kappa coefficient (OKC) and species conditional kappa coefficients (SCKC) were used to evaluate classification performance in training samples and estimate accuracy and uncertainty in test samples. Baseline classification performance using HMS images and vegetation index (VI) images was evaluated with OKC values of 0.58 and 0.48 respectively, but performance improved significantly (up to 0.99) when used in combination with an HMS spectral-spatial texture image (SpecTex). One of the 40 species had very high conditional kappa coefficient performance (SCKC ≥ 0.95) using 4-band HMS and 5-band VIs images, but only five species had lower performance (0.68 ≤ SCKC ≤ 0.94) using the SpecTex images. When SpecTex images were combined with a Visible Atmospherically Resistant Index (VARI), there was a significant improvement in performance in the training samples. The same level of improvement could not be replicated in the test samples, indicating that a high degree of uncertainty exists in species classification accuracy, which may be due to individual tree crown density, leaf greenness (inter-canopy gaps), and noise in the background environment (intra-canopy gaps). These factors increase uncertainty in the spectral texture features and therefore represent potential problems when using pixel-based classification techniques for multi-species classification. PMID:25978466

  5. Parameter optimization of image classification techniques to delineate crowns of coppice trees on UltraCam-D aerial imagery in woodlands

    NASA Astrophysics Data System (ADS)

    Erfanifard, Yousef; Stereńczak, Krzysztof; Behnia, Negin

    2014-01-01

    The need to estimate optimal parameters is a drawback of some classification techniques, as parameter choice affects their performance on a given dataset and can reduce classification accuracy. This study aimed to optimize the combination of effective parameters of support vector machine (SVM), artificial neural network (ANN), and object-based image analysis (OBIA) classification techniques using the Taguchi method. The optimized techniques were applied to delineate crowns of Persian oak coppice trees on UltraCam-D very high spatial resolution aerial imagery in the Zagros semiarid woodlands, Iran. The imagery was classified and the maps were assessed with receiver operating characteristic curves and other performance metrics. The results showed that Taguchi is a robust approach for optimizing the combination of effective parameters in these image classification techniques. The area under the curve (AUC) showed that the optimized OBIA could discriminate tree crowns on the imagery well (AUC = 0.897), while SVM and ANN yielded slightly lower AUC values of 0.819 and 0.850, respectively. The accuracy (0.999) and precision (0.999) indices and the specificity (0.999) and sensitivity (0.999) metrics of the optimized OBIA were higher than those of the other techniques. The optimization of effective parameters of image classification techniques by the Taguchi method thus provided encouraging results for discriminating the crowns of Persian oak coppice trees on UltraCam-D aerial imagery in Zagros semiarid woodlands.
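
    A minimal sketch of parameter screening scored by ROC AUC (a plain grid rather than a Taguchi orthogonal array); the pixel features and crown/non-crown labels are synthetic placeholders.

```python
# Minimal sketch: score SVM parameter combinations by ROC AUC on a held-out split.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import roc_auc_score

X, y = make_classification(n_samples=600, n_features=8, random_state=0)   # synthetic pixels
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

results = {}
for C in (0.1, 1.0, 10.0):            # candidate penalty values
    for gamma in (0.01, 0.1, 1.0):    # candidate RBF kernel widths
        clf = SVC(C=C, gamma=gamma).fit(X_tr, y_tr)
        results[(C, gamma)] = roc_auc_score(y_te, clf.decision_function(X_te))

best = max(results, key=results.get)
print("best (C, gamma):", best, "AUC:", round(results[best], 3))
```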

  6. Stratification of the severity of critically ill patients with classification trees

    PubMed Central

    2009-01-01

    Background We developed three classification trees (CTs) based on the CART (Classification and Regression Trees), CHAID (Chi-Square Automatic Interaction Detection) and C4.5 methodologies for calculating the probability of hospital mortality, and compared the results with the APACHE II, SAPS II and MPM II-24 scores and with a model based on multiple logistic regression (LR). Methods Retrospective study of 2864 patients. Random partition (70:30) into a Development Set (DS) n = 1808 and Validation Set (VS) n = 808. Discrimination was compared using the ROC curve (AUC, 95% CI) and the percentage of correct classification (PCC, 95% CI), and calibration using the calibration curve and the standardized mortality ratio (SMR, 95% CI). Results The CTs were produced with different selections of variables and decision rules: CART (5 variables and 8 decision rules), CHAID (7 variables and 15 rules) and C4.5 (6 variables and 10 rules). The common variables were: inotropic therapy, Glasgow, age, (A-a)O2 gradient and antecedent of chronic illness. In the VS, all the models achieved acceptable discrimination with AUC above 0.7. CT: CART (0.75 (0.71-0.81)), CHAID (0.76 (0.72-0.79)) and C4.5 (0.76 (0.73-0.80)). PCC: CART (72 (69-75)), CHAID (72 (69-75)) and C4.5 (76 (73-79)). Calibration (SMR) was better for the CTs: CART (1.04 (0.95-1.31)), CHAID (1.06 (0.97-1.15)) and C4.5 (1.08 (0.98-1.16)). Conclusion The different CT methodologies generate trees with different selections of variables and decision rules. The CTs are easy to interpret and they stratify the risk of hospital mortality. The CTs should be taken into account for classifying the prognosis of critically ill patients. PMID:20003229
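
    A minimal sketch of the two evaluation measures, assuming a fitted model that outputs per-patient mortality probabilities: discrimination via the ROC AUC and calibration via the standardized mortality ratio (SMR), computed as observed deaths divided by expected deaths; the probabilities below are toy values.

```python
# Minimal sketch: AUC for discrimination and SMR for calibration.
import numpy as np
from sklearn.metrics import roc_auc_score

y_true = np.array([0, 0, 1, 0, 1, 1, 0, 0, 1, 0])                       # hospital mortality (toy)
p_pred = np.array([0.1, 0.2, 0.7, 0.3, 0.6, 0.8, 0.2, 0.1, 0.4, 0.3])   # model probabilities (toy)

auc = roc_auc_score(y_true, p_pred)
smr = y_true.sum() / p_pred.sum()   # observed / expected deaths; near 1.0 indicates good calibration
print(f"AUC = {auc:.2f}, SMR = {smr:.2f}")
```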

  7. Comparing Methodologies for Developing an Early Warning System: Classification and Regression Tree Model versus Logistic Regression. REL 2015-077

    ERIC Educational Resources Information Center

    Koon, Sharon; Petscher, Yaacov

    2015-01-01

    The purpose of this report was to explicate the use of logistic regression and classification and regression tree (CART) analysis in the development of early warning systems. It was motivated by state education leaders' interest in maintaining high classification accuracy while simultaneously improving practitioner understanding of the rules…

  8. Classification

    ERIC Educational Resources Information Center

    Clary, Renee; Wandersee, James

    2013-01-01

    In this article, Renee Clary and James Wandersee describe the beginnings of "Classification," which lies at the very heart of science and depends upon pattern recognition. Clary and Wandersee approach patterns by first telling the story of the "Linnaean classification system," introduced by Carl Linnaeus (1707-1778), who is…

  9. PcHD: personalized classification of heartbeat types using a decision tree.

    PubMed

    Park, Juyoung; Kang, Kyungtae

    2014-11-01

    The computer-aided interpretation of electrocardiogram (ECG) signals provides a non-invasive and inexpensive technique for analyzing heart activity under various cardiac conditions. Further, the proliferation of smartphones and wireless networks makes it possible to perform continuous Holter monitoring. However, although considerable attention has been paid to automated detection and classification of heartbeats from ECG data, classifier learning strategies have never been used to deal with individual variations in cardiac activity. In this paper, we propose a novel method for automatic classification of an individual's ECG beats for Holter monitoring. We use the Pan-Tompkins algorithm to accurately extract features such as the QRS complex and P wave, and employ a decision tree to classify each beat in terms of these features. Evaluations conducted against the MIT-BIH arrhythmia database before and after personalization of the decision tree using a patient's own ECG data yield heartbeat classification accuracies of 94.6% and 99%, respectively. These are comparable to results obtained from state-of-the-art schemes, validating the efficacy of our proposed method.
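
    A minimal sketch of the classification stage, not the PcHD pipeline: a decision tree classifying heartbeats from hand-crafted features of the kind produced by a Pan-Tompkins stage. The feature names and values are hypothetical placeholders, not MIT-BIH data.

```python
# Minimal sketch: decision tree over per-beat features (hypothetical values).
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# columns: RR interval (s), QRS width (s), P-wave amplitude (mV)  -- assumed feature set
X = np.array([
    [0.80, 0.08, 0.15],   # normal beat
    [0.82, 0.09, 0.14],   # normal beat
    [0.45, 0.14, 0.00],   # premature ventricular beat (wide QRS, no P wave)
    [0.50, 0.15, 0.01],   # premature ventricular beat
])
y = ["N", "N", "V", "V"]

clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
print(clf.predict([[0.48, 0.13, 0.0]]))   # -> ['V']
```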

  10. Computer-assisted detection of colonic polyps with CT colonography using neural networks and binary classification trees.

    PubMed

    Jerebko, Anna K; Summers, Ronald M; Malley, James D; Franaszek, Marek; Johnson, C Daniel

    2003-01-01

    Detection of colonic polyps in CT colonography is problematic due to complexities of polyp shape and the surface of the normal colon. Published results indicate the feasibility of computer-aided detection of polyps, but better classifiers are needed to improve specificity. In this paper, we compare the classification results of two approaches: neural networks and recursive binary trees. As our starting point we collect surface geometry information from three-dimensional reconstruction of the colon, followed by a filter based on selected variables such as region density, Gaussian and average curvature, and sphericity. The filter returns sites that are candidate polyps, based on earlier work using detection thresholds, to which the neural nets or the binary trees are applied. A data set of 39 polyps from 3 to 25 mm in size was used in our investigation. For both the neural nets and the binary trees, we use tenfold cross-validation to better estimate the true error rates. The backpropagation neural net with one hidden layer trained with the Levenberg-Marquardt algorithm achieved the best results: sensitivity 90% and specificity 95% with 16 false positives per study.

  11. A Novel Modulation Classification Approach Using Gabor Filter Network

    PubMed Central

    Ghauri, Sajjad Ahmed; Qureshi, Ijaz Mansoor; Cheema, Tanveer Ahmed; Malik, Aqdas Naveed

    2014-01-01

    A Gabor filter network based approach is used for feature extraction and classification of digital modulated signals by adaptively tuning the parameters of the Gabor filter network. Modulation classification of digitally modulated signals is done under the influence of additive white Gaussian noise (AWGN). The modulations considered for the classification purpose are PSK 2 to 64, FSK 2 to 64, and QAM 4 to 64. The Gabor filter network has a two-layer structure: the first (input) layer constitutes the adaptive feature extraction part, and the second layer constitutes the signal classification part. The Gabor atom parameters are tuned using the delta rule, and the Gabor filter weights are updated using the least mean squares (LMS) algorithm. The simulation results show that the proposed modulation classification algorithm has high classification accuracy at low signal-to-noise ratio (SNR) on an AWGN channel. PMID:25126603
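
    A minimal sketch of the LMS weight update used for the classification layer, with random vectors standing in for Gabor feature vectors and a toy desired response:

```python
# Minimal sketch: least mean squares (LMS) adaptation of output weights.
import numpy as np

rng = np.random.default_rng(0)
n_features, mu = 16, 0.01          # feature length and LMS step size (assumed values)
w = np.zeros(n_features)           # output-layer weights

for _ in range(1000):
    x = rng.standard_normal(n_features)   # stand-in Gabor feature vector for one received symbol
    d = np.sign(x[0])                     # toy desired response
    e = d - w @ x                         # error between desired and actual output
    w += mu * e * x                       # LMS update: w <- w + mu * e * x
```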

  12. A discrete element modelling approach for block impacts on trees

    NASA Astrophysics Data System (ADS)

    Toe, David; Bourrier, Franck; Olmedo, Ignatio; Berger, Frederic

    2015-04-01

    In the past few years, rockfall models explicitly accounting for block shape, especially those using the Discrete Element Method (DEM), have shown a good ability to predict rockfall trajectories. Integrating forest effects into these models remains challenging. This study aims at using a DEM approach to model impacts of blocks on trees and to identify the key parameters controlling the block kinematics after the impact on a tree. A DEM impact model of a block on a tree was developed and validated using laboratory experiments, and the key parameters were then assessed using a global sensitivity analysis. Modelling the impact of a block on a tree using DEM makes it possible to take into account large displacements, material non-linearities, and contacts between the block and the tree. Tree stems are represented by flexible cylinders modelled as plastic beams sustaining normal, shearing, bending, and twisting loading. Root-soil interactions are modelled using a rotational stiffness acting on the bending moment at the bottom of the tree and a limit bending moment to account for tree overturning. The crown is taken into account using an additional mass distributed uniformly over the upper part of the tree. The block is represented by a sphere, and the contact model between the block and the stem is an elastic frictional model. The DEM model was validated using laboratory impact tests carried out on 41 fresh beech (Fagus sylvatica) stems. Each stem was 1.3 m long with a diameter between 3 and 7 cm. The stems were clamped on a rigid structure and impacted by a 149 kg Charpy pendulum. Finally, an intensive simulation campaign of blocks impacting trees was carried out to identify the input parameters controlling the block kinematics after the impact on a tree. 20 input parameters were considered in the DEM simulation model: 12 related to the tree and 8 to the block. The results highlight that the impact velocity, the stem diameter, and the block volume are the three input

  13. [Identification of the main risk factors for non infectious diseases: method of classification trees].

    PubMed

    Konstantinova, E D; Varaksin, A N; Zhovner, I V

    2013-01-01

    This paper presents the rationale for applying one method of assessing the multi-factor influence of risk factors on population health: the method of classification trees. The method of classification trees is a hierarchical procedure for constructing a decision rule that divides the population into groups with higher and lower morbidity in the coordinates of the risk factors. The main advantage of the method is the possibility of finding the combination of risk factors with the greatest impact on the health of the population, in contrast to common methods that analyze only single-factor effects. Two possible applications of classification trees are presented: 1) finding the combination of environmental risk factors (RF) with the greatest impact on the prevalence of non-infectious diseases in preschool children in Yekaterinburg (environmental risk factors include air and drinking water pollution, the presence of a gas stove in the child's flat, etc.). It is shown that, together with socio-economic risk factors, environmental risk factors increase the prevalence of respiratory diseases in preschool children in Yekaterinburg by a factor of 2.5-4 (depending on the list and number of environmental RF); 2) finding the combination of non-environmental factors that most effectively compensates for the negative effect of environmental pollution on human health. This formulation of the problem arises because environmental pollution factors are usually unmodifiable, while family, behavioral, or social factors can be partially or completely eliminated. Implementation of the recommendations presented in the paper could reduce the incidence of circulatory diseases in preschool children in Yekaterinburg by more than a factor of 2.

  14. Eating Disorder Diagnoses: Empirical Approaches to Classification

    ERIC Educational Resources Information Center

    Wonderlich, Stephen A.; Joiner, Thomas E., Jr.; Keel, Pamela K.; Williamson, Donald A.; Crosby, Ross D.

    2007-01-01

    Decisions about the classification of eating disorders have significant scientific and clinical implications. The eating disorder diagnoses in the Diagnostic and Statistical Manual of Mental Disorders (4th ed.; DSM-IV; American Psychiatric Association, 1994) reflect the collective wisdom of experts in the field but are frequently not supported in…

  15. Binary tree of SVM: a new fast multiclass training and classification algorithm.

    PubMed

    Fei, Ben; Liu, Jinbai

    2006-05-01

    We present a new architecture named Binary Tree of support vector machines (SVM), or BTS, in order to achieve high classification efficiency for multiclass problems. BTS and its enhanced version, c-BTS, decrease the number of binary classifiers to the greatest extent without increasing the complexity of the original problem. In the training phase, BTS has N - 1 binary classifiers in the best situation (N is the number of classes), while it requires log_{4/3}((N + 3)/4) binary tests on average when making a decision. At the same time the upper bound of convergence complexity is determined. The experiments in this paper indicate that, while maintaining comparable accuracy, BTS is much faster to train than other methods. Especially in classification, due to its logarithmic complexity, it is much faster than directed acyclic graph SVM (DAGSVM) and ECOC for problems with a large number of classes.
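
    A minimal sketch of the tree-of-binary-SVMs structure, using a naive split of the class list in half at each node rather than the BTS splitting criterion:

```python
# Minimal sketch: a binary tree of SVMs; each internal node separates two groups of classes.
import numpy as np
from sklearn.svm import SVC
from sklearn.datasets import make_blobs

class SVMTreeNode:
    def __init__(self, classes):
        self.classes = list(classes)
        self.clf = None
        self.left = None
        self.right = None

def build(X, y, classes):
    node = SVMTreeNode(classes)
    if len(classes) == 1:                       # leaf: a single class remains
        return node
    mid = len(classes) // 2
    left_c, right_c = classes[:mid], classes[mid:]
    mask = np.isin(y, classes)
    side = np.isin(y[mask], right_c).astype(int)   # 0 = left group, 1 = right group
    node.clf = SVC(kernel="rbf", gamma="scale").fit(X[mask], side)
    node.left = build(X, y, left_c)
    node.right = build(X, y, right_c)
    return node

def predict_one(node, x):
    while len(node.classes) > 1:                # log-depth descent through the tree
        node = node.right if node.clf.predict(x.reshape(1, -1))[0] == 1 else node.left
    return node.classes[0]

X, y = make_blobs(n_samples=300, centers=6, random_state=0)   # toy 6-class data
root = build(X, y, sorted(set(y)))
print(predict_one(root, X[0]), y[0])
```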

  16. Using the PDD Behavior Inventory as a Level 2 Screener: A Classification and Regression Trees Analysis.

    PubMed

    Cohen, Ira L; Liu, Xudong; Hudson, Melissa; Gillis, Jennifer; Cavalari, Rachel N S; Romanczyk, Raymond G; Karmel, Bernard Z; Gardner, Judith M

    2016-09-01

    In order to improve discrimination accuracy between Autism Spectrum Disorder (ASD) and similar neurodevelopmental disorders, a data mining procedure, Classification and Regression Trees (CART), was used on a large multi-site sample of PDD Behavior Inventory (PDDBI) forms on children with and without ASD. Discrimination accuracy exceeded 80%, generalized to an independent validation set and across age groups and sites, and agreed well with ADOS classifications. Parent PDDBIs yielded better results than teacher PDDBIs, but when CART predictions agreed across informants, sensitivity increased. Results also revealed three subtypes of ASD: minimally verbal, verbal, and atypical; and two relatively common subtypes of non-ASD children: social pragmatic problems and good social skills. These subgroups corresponded to differences in behavior profiles and associated biomedical findings. PMID:27318809

  17. Prediction of protein phosphorylation sites using classification trees and SVM classifier

    NASA Astrophysics Data System (ADS)

    Betkier, Piotr; Szymański, Zbigniew

    2011-10-01

    The paper presents a method for recognizing protein phosphorylation sites. Six classifiers were created to predict whether specified amino acid sequences, represented as 9-character strings, react with given types of kinase enzymes. The method consists of three steps. First, positions in the amino acid sequences that are significant for classification are found using classification trees. Next, the symbols composing the sequences are mapped to the real-number domain using the Gini index method. The last step consists of creating the SVM classifiers as the final prediction models. The paper evaluates the results obtained and describes the methods applied to assess the quality of the classifiers.
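
    A minimal sketch of the tree-then-SVM idea under stated assumptions (toy sequences and one-hot residue encoding instead of the paper's Gini-index mapping): a classification tree ranks the nine positions by importance, and an SVM is trained on the top-ranked positions.

```python
# Minimal sketch: tree-based position ranking followed by an SVM on the selected positions.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC

AMINO = "ACDEFGHIKLMNPQRSTVWY"
rng = np.random.default_rng(0)

def encode(seq):
    # one-hot encode a 9-residue window (9 x 20 features)
    v = np.zeros((9, len(AMINO)))
    for i, ch in enumerate(seq):
        v[i, AMINO.index(ch)] = 1.0
    return v.ravel()

# toy data: positives carry an 'S' at the centre position, negatives are random
pos = ["".join(rng.choice(list(AMINO), 9)) for _ in range(50)]
pos = [s[:4] + "S" + s[5:] for s in pos]
neg = ["".join(rng.choice(list(AMINO), 9)) for _ in range(50)]
X = np.array([encode(s) for s in pos + neg])
y = np.array([1] * 50 + [0] * 50)

tree = DecisionTreeClassifier(random_state=0).fit(X, y)
per_position = tree.feature_importances_.reshape(9, len(AMINO)).sum(axis=1)
top_positions = np.argsort(per_position)[::-1][:3]          # most informative positions
cols = np.concatenate([np.arange(p * 20, p * 20 + 20) for p in top_positions])
svm = SVC(kernel="rbf", gamma="scale").fit(X[:, cols], y)
print("top positions:", top_positions, "train accuracy:", svm.score(X[:, cols], y))
```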

  18. Non-Destructive Classification Approaches for Equilibrated Ordinary Chondrites

    NASA Astrophysics Data System (ADS)

    Righter, K.; Harrington, R.; Schroeder, C.; Morris, R. V.

    2013-09-01

    In order to compare a few non-destructive classification techniques with the standard approaches, we have characterized a group of chondrites from the Larkman Nunatak region using magnetic susceptibility and Mössbauer spectroscopy.

  19. Predictive model of biliocystic communication in liver hydatid cysts using classification and regression tree analysis

    PubMed Central

    2010-01-01

    Background The incidence of liver hydatid cyst (LHC) rupture ranges from 15% to 40% of all cases, and most ruptures involve the bile duct tree. Patients with biliocystic communication (BCC) have specific clinical and therapeutic features. The purpose of this study was to determine which patients with LHC may develop BCC, using classification and regression tree (CART) analysis. Methods A retrospective study of 672 patients with liver hydatid cyst treated at surgery department "A" of Ibn Sina University Hospital, Rabat, Morocco. Fourteen risk factors for BCC occurrence were entered into the CART analysis to build an algorithm that best predicts the occurrence of BCC. Results The incidence of BCC was 24.5%. High-risk subgroups were patients with jaundice and a thick pericyst (risk 73.2%) and patients with a thick pericyst, no jaundice, age 36.5 years or younger, and no past history of LHC (risk 40.5%). The CART model had a sensitivity of 39.6%, a specificity of 93.3%, a positive predictive value of 65.6%, a negative predictive value of 82.6%, and an overall classification accuracy of 80.1%. The discriminating ability of the model was good (82%). Conclusion We developed a simple classification tool to identify LHC patients at high risk of BCC during a routine clinic visit (based only on clinical history and examination followed by ultrasonography). Predictive factors were based on pericyst aspect, jaundice, age, past history of liver hydatidosis, and morphological Gharbi cyst aspect. We believe this classification can be used effectively to direct patients to appropriate medical structures. PMID:20398342
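
    A minimal sketch of the reported performance measures computed from a confusion matrix; the labels and predictions below are toy values, not the study's data.

```python
# Minimal sketch: sensitivity, specificity, PPV, NPV, and accuracy from a confusion matrix.
import numpy as np
from sklearn.metrics import confusion_matrix

y_true = np.array([1, 0, 0, 1, 0, 1, 0, 0, 1, 0])   # 1 = biliocystic communication present (toy)
y_pred = np.array([1, 0, 0, 0, 0, 1, 1, 0, 0, 0])   # model predictions (toy)

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
ppv = tp / (tp + fp)
npv = tn / (tn + fn)
accuracy = (tp + tn) / (tp + tn + fp + fn)
print(sensitivity, specificity, ppv, npv, accuracy)
```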

  20. A class-oriented model for hyperspectral image classification through hierarchy-tree-based selection

    NASA Astrophysics Data System (ADS)

    Tang, Zhongqi; Fu, Guangyuan; Zhao, XiaoLin; Chen, Jin; Zhang, Li

    2016-03-01

    With the development of hyperspectral sensors over the last few decades, hyperspectral images (HSIs) face new challenges in the field of data analysis. For such high-dimensional data, the most challenging issue is to select an effective yet minimal subset from a mass of bands. This paper proposes a class-oriented model that addresses the classification task by incorporating the spectral prior of the target, since different targets have different characteristics in spectral correlation. The model performs feature selection after partitioning the hyperspectral data into groups along the spectral dimension. In the spectral partition step, the raw data are grouped into several subsets using a hierarchy tree structure. In each group, band selection is performed via recursive support vector machine (R-SVM) learning, which reduces the computational cost while preserving classification accuracy. To ensure the robustness of the result, we also present a weight-voting strategy for merging results, in which both spectral independence and classification effectiveness are considered. Extensive experiments show that our model achieves better performance than existing methods in task-dependent classifications, such as target detection and identification.

  1. Internal Carbon Recycling in Trees - New Approach, Findings, and Implications

    NASA Astrophysics Data System (ADS)

    Angert, A.; Hilman, B.

    2012-12-01

    The CO2 emitted by respiration in a tree's woody tissue (stem, branch, or root) is usually assumed to diffuse directly out to the atmosphere. Given that internal CO2 concentrations are one to two orders of magnitude higher than the atmospheric concentration, reuse of this respired carbon can be beneficial to plants. We have developed a new method to track the fraction of respired CO2 not emitted from stems and branches, based on the ratio of the CO2 efflux to the O2 influx. This ratio, which we define as the apparent respiratory quotient (ARQ), is expected to equal 1.0 if carbohydrates are the substrate for respiration and all respired CO2 is directly emitted. Using this approach we recently showed that ~30% of the CO2 respired by Amazon forest tree stems was not directly emitted. In the current study we applied this approach to five tree species growing in a Mediterranean climate and performed seasonal and diurnal ARQ measurements at different heights along the stems and branches. We found different seasonal variations in the ARQ of riparian versus drought-resilient trees. In addition, the ARQ diurnal cycle, together with the measurements at different heights, indicates that a considerable fraction of the CO2 not emitted is recycled within the tree.

  2. The Learning Tree Montessori Child Care: An Approach to Diversity

    ERIC Educational Resources Information Center

    Wick, Laurie

    2006-01-01

    In this article the author describes how she and her partners started The Learning Tree Montessori Child Care, a Montessori program with a different approach in Seattle in 1979. The author also relates that the other area Montessori schools then offered half-day programs, and as a result the children who attended were, for the most part,…

  3. Classification of savanna tree species, in the Greater Kruger National Park region, by integrating hyperspectral and LiDAR data in a Random Forest data mining environment

    NASA Astrophysics Data System (ADS)

    Naidoo, L.; Cho, M. A.; Mathieu, R.; Asner, G.

    2012-04-01

    The accurate classification and mapping of individual trees at species level in the savanna ecosystem can provide numerous benefits for the managerial authorities. Such benefits include the mapping of economically useful tree species, which are a key source of food production and fuel wood for the local communities, and of problematic alien invasive and bush encroaching species, which can threaten the integrity of the environment and the livelihoods of the local communities. Species-level mapping is particularly challenging in African savannas, which are complex, heterogeneous, and open environments with high intra-species spectral variability due to differences in geology, topography, rainfall, herbivory and human impacts within relatively short distances. Savanna vegetation is also highly irregular in canopy and crown shape, height and other structural dimensions, with a combination of open grassland patches and dense woody thicket - a stark contrast to the more homogeneous forest vegetation. This study classified eight common savanna tree species in the Greater Kruger National Park region, South Africa, using a combination of hyperspectral and Light Detection and Ranging (LiDAR)-derived structural parameters, in the form of seven predictor datasets, in an automated Random Forest modelling approach. The most important predictors, which were found to play an important role in the different classification models and contributed to the success of the hybrid dataset model when combined, were species tree height, NDVI, the chlorophyll b wavelength (466 nm), and a selection of raw, continuum-removed and Spectral Angle Mapper (SAM) bands. It was also concluded that the hybrid predictor dataset Random Forest model yielded the highest classification accuracy and prediction success for the eight savanna tree species, with an overall classification accuracy of 87.68% and a KHAT value of 0.843.

  4. Exploiting machine learning algorithms for tree species classification in a semiarid woodland using RapidEye image

    NASA Astrophysics Data System (ADS)

    Adelabu, Samuel; Mutanga, Onisimo; Adam, Elhadi; Cho, Moses Azong

    2013-01-01

    Classification of different tree species in semiarid areas can be challenging as a result of changes in leaf structure and orientation due to soil moisture constraints. Tree species mapping is, however, a key parameter for forest management in semiarid environments. In this study, we examined the suitability of 5-band RapidEye satellite data for the classification of five tree species in mopane woodland of Botswana using machine learning algorithms with limited training samples. We performed classification using random forest (RF) and support vector machines (SVM) based on the EnMAP-Box. The overall accuracies for classifying the five tree species were 88.75% and 85% for SVM and RF, respectively. We also demonstrated that the new red-edge band in the RapidEye sensor has the potential for classifying tree species in semiarid environments when integrated with other standard bands. Similarly, we observed that where there are limited training samples, SVM is preferred over RF. Finally, we demonstrated that the two accuracy measures of quantity and allocation disagreement are simpler and more helpful than the kappa coefficient for the vast majority of remote sensing classification processes. Overall, high species classification accuracy can be achieved using strategically located RapidEye bands integrated with advanced processing algorithms.

  5. The Tree of Life and a New Classification of Bony Fishes

    PubMed Central

    Betancur-R., Ricardo; Broughton, Richard E.; Wiley, Edward O.; Carpenter, Kent; López, J. Andrés; Li, Chenhong; Holcroft, Nancy I.; Arcila, Dahiana; Sanciangco, Millicent; Cureton II, James C; Zhang, Feifei; Buser, Thaddaeus; Campbell, Matthew A.; Ballesteros, Jesus A; Roa-Varon, Adela; Willis, Stuart; Borden, W. Calvin; Rowley, Thaine; Reneau, Paulette C.; Hough, Daniel J.; Lu, Guoqing; Grande, Terry; Arratia, Gloria; Ortí, Guillermo

    2013-01-01

    The tree of life of fishes is in a state of flux because we still lack a comprehensive phylogeny that includes all major groups. The situation is most critical for a large clade of spiny-finned fishes, traditionally referred to as percomorphs, whose uncertain relationships have plagued ichthyologists for over a century. Most of what we know about the higher-level relationships among fish lineages has been based on morphology, but rapid influx of molecular studies is changing many established systematic concepts. We report a comprehensive molecular phylogeny for bony fishes that includes representatives of all major lineages. DNA sequence data for 21 molecular markers (one mitochondrial and 20 nuclear genes) were collected for 1410 bony fish taxa, plus four tetrapod species and two chondrichthyan outgroups (total 1416 terminals). Bony fish diversity is represented by 1093 genera, 369 families, and all traditionally recognized orders. The maximum likelihood tree provides unprecedented resolution and high bootstrap support for most backbone nodes, defining for the first time a global phylogeny of fishes. The general structure of the tree is in agreement with expectations from previous morphological and molecular studies, but significant new clades arise. Most interestingly, the high degree of uncertainty among percomorphs is now resolved into nine well-supported supraordinal groups. The order Perciformes, considered by many a polyphyletic taxonomic waste basket, is defined for the first time as a monophyletic group in the global phylogeny. A new classification that reflects our phylogenetic hypothesis is proposed to facilitate communication about the newly found structure of the tree of life of fishes. Finally, the molecular phylogeny is calibrated using 60 fossil constraints to produce a comprehensive time tree. The new time-calibrated phylogeny will provide the basis for and stimulate new comparative studies to better understand the evolution of the amazing

  6. Application of classification-tree methods to identify nitrate sources in ground water

    USGS Publications Warehouse

    Spruill, T.B.; Showers, W.J.; Howe, S.S.

    2002-01-01

    A study was conducted to determine if nitrate sources in ground water (fertilizer on crops, fertilizer on golf courses, irrigation spray from hog (Sus scrofa) wastes, and leachate from poultry litter and septic systems) could be classified with 80% or greater success. Two statistical classification-tree models were devised from 48 water samples containing nitrate from five source categories. Model 1 was constructed by evaluating 32 variables and selecting four primary predictor variables (δ15N, nitrate to ammonia ratio, sodium to potassium ratio, and zinc) to identify nitrate sources. A δ15N value of nitrate plus potassium 18.2 indicated inorganic or soil organic N. A nitrate to ammonia ratio 575 indicated nitrate from golf courses. A sodium to potassium ratio 3.2 indicated spray or poultry wastes. A value for zinc 2.8 indicated poultry wastes. Model 2 was devised by using all variables except δ15N. This model also included four variables (sodium plus potassium, nitrate to ammonia ratio, calcium to magnesium ratio, and sodium to potassium ratio) to distinguish categories. Both models were able to distinguish all five source categories with better than 80% overall success and with 71 to 100% success in individual categories using the learning samples. Seventeen water samples that were not used in model development were tested using Model 2 for three categories, and all were correctly classified. Classification-tree models show great potential in identifying sources of contamination and variables important in the source-identification process.
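
    A minimal sketch of rule extraction from a classification tree, assuming hypothetical water-chemistry features and synthetic labels; it only illustrates the threshold-style rules described above, not the fitted USGS models.

```python
# Minimal sketch: fit a classification tree and print its decision rules.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(1)
features = ["d15N", "NO3_NH3_ratio", "Na_K_ratio", "Zn"]          # assumed predictor names
sources = ["crop_fertilizer", "golf_fertilizer", "spray", "poultry", "septic"]

X = rng.normal(size=(100, len(features)))      # synthetic water-chemistry measurements
y = rng.choice(sources, size=100)              # synthetic source labels for illustration only

tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
print(export_text(tree, feature_names=features))   # threshold-style rules, one per split
```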

  7. Generation of 2D Land Cover Maps for Urban Areas Using Decision Tree Classification

    NASA Astrophysics Data System (ADS)

    Höhle, J.

    2014-09-01

    A 2D land cover map can automatically and efficiently be generated from high-resolution multispectral aerial images. First, a digital surface model is produced and each cell of the elevation model is then supplemented with attributes. A decision tree classification is applied to extract map objects like buildings, roads, grassland, trees, hedges, and walls from such an "intelligent" point cloud. The decision tree is derived from training areas whose borders are digitized on top of a false-colour orthoimage. The produced 2D land cover map with six classes is then refined using image analysis techniques. The proposed methodology is described step by step. The classification, assessment, and refinement are carried out with the open source software "R"; the dense and accurate digital surface model is generated with the "Match-T DSM" program of the Trimble Company. A practical example of a 2D land cover map generation is carried out. Images of a multispectral medium-format aerial camera covering an urban area in Switzerland are used. The assessment of the produced land cover map is based on class-wise stratified sampling in which reference values of samples are determined by means of stereo-observations of false-colour stereopairs. The stratified statistical assessment of the produced land cover map with six classes and based on 91 points per class reveals a high thematic accuracy for the classes "building" (99 %, 95 % CI: 95 %-100 %) and "road and parking lot" (90 %, 95 % CI: 83 %-95 %). Some other accuracy measures (overall accuracy, kappa value) and their 95 % confidence intervals are derived as well. The proposed methodology has a high potential for automation and fast processing and may be applied to other scenes and sensors.

  8. Classification of oxide glasses: A polarizability approach

    SciTech Connect

    Dimitrov, Vesselin; Komatsu, Takayuki. E-mail: komatsu@chem.nagaokaut.ac.jp

    2005-03-15

    A classification of binary oxide glasses has been proposed, taking into account the values obtained for their refractive-index-based oxide ion polarizability α_O2-(n_0), optical basicity λ(n_0), metallization criterion M(n_0), and interaction parameter A(n_0), together with the ions' effective charges and the O1s and metal binding energies determined by XPS. Four groups of oxide glasses have been established: glasses formed by two glass-forming acidic oxides; glasses formed by a glass-forming acidic oxide and a modifier's basic oxide; glasses formed by a glass-forming acidic oxide and a conditional glass-forming basic oxide; and glasses formed by two basic oxides. The role of electronic ion polarizability in the chemical bonding of oxide glasses has also been estimated. Good agreement has been found with previous results concerning the classification of simple oxides. The results obtained probably provide a good basis for predicting the type of bonding in oxide glasses on the basis of the refractive index, as well as for predicting new nonlinear optical materials.

  9. A kernel autoassociator approach to pattern classification.

    PubMed

    Zhang, Haihong; Huang, Weimin; Huang, Zhiyong; Zhang, Bailing

    2005-06-01

    Autoassociators are a special type of neural networks which, by learning to reproduce a given set of patterns, grasp the underlying concept that is useful for pattern classification. In this paper, we present a novel nonlinear model referred to as kernel autoassociators based on kernel methods. While conventional non-linear autoassociation models emphasize searching for the non-linear representations of input patterns, a kernel autoassociator takes a kernel feature space as the nonlinear manifold, and places emphasis on the reconstruction of input patterns from the kernel feature space. Two methods are proposed to address the reconstruction problem, using linear and multivariate polynomial functions, respectively. We apply the proposed model to novelty detection with or without novelty examples and study it on the promoter detection and sonar target recognition problems. We also apply the model to multiclass classification problems including wine recognition, glass recognition, handwritten digit recognition, and face recognition. The experimental results show that, compared with conventional autoassociators and other recognition systems, kernel autoassociators can provide better or comparable performance for concept learning and recognition in various domains. PMID:15971928

  11. The creation of a digital soil map for Cyprus using decision-tree classification techniques

    NASA Astrophysics Data System (ADS)

    Camera, Corrado; Zomeni, Zomenia; Bruggeman, Adriana; Noller, Joy; Zissimos, Andreas

    2014-05-01

    Considering the increasing threats soils are experiencing, especially in semi-arid Mediterranean environments like Cyprus (erosion, contamination, sealing and salinisation), producing a high-resolution, reliable soil map is essential for further soil conservation studies. This study aims to create a 1:50,000 soil map covering the area under the direct control of the Republic of Cyprus (5,760 km2). The study consists of two major steps. The first is the creation of a raster database of predictive variables selected according to the scorpan formula (McBratney et al., 2003). Of particular interest is the possibility of using, as soil properties, data coming from three older island-wide soil maps and the recently published geochemical atlas of Cyprus (Cohen et al., 2011); ten highly characterizing elements were selected and used as predictors in the present study. For the other factors, the usual variables were used: temperature and aridity index for climate; total loss on ignition and vegetation and forestry type maps for organic matter; the DEM and related relief derivatives (slope, aspect, curvature, landscape units); bedrock, surficial geology and geomorphology (Noller, 2009) for parent material and age; and a sub-watershed map to better constrain location relative to parent-material sources. In the second step, the digital soil map is created using the Random Forests package in R. Random Forests is a decision tree classification technique in which many trees, instead of a single one, are developed and compared to increase the stability and reliability of the prediction. The model is trained and verified on areas where a published 1:25,000 soil map obtained from field work is available, and it is then applied for predictive mapping to the other areas. Preliminary results obtained in a small area of the plain around the city of Lefkosia, where eight different soil classes are present, show that the method performs very well. The Random Forest approach leads to reproduce soil

  12. Morphological and molecular characteristics do not confirm popular classification of the Brazil nut tree in Acre, Brazil.

    PubMed

    Sujii, P S; Fernandes, E T M B; Azevedo, V C R; Ciampi, A Y; Martins, K; de O Wadt, L H

    2013-09-27

    In the State of Acre, the Brazil nut tree, Bertholletia excelsa (Lecythidaceae), is classified by the local population into two types according to morphological characteristics, including color and quality of wood, shape of the trunk and crown, and fruit production. We examined the reliability of this classification by comparing morphological and molecular data of four populations of Brazil nut trees from Vale do Rio Acre in the Brazilian Amazon. For the morphological analysis, we evaluated qualitative and quantitative information of the trees, fruits, and seeds. The molecular analysis was performed using RAPD and ISSR markers, with cluster analysis. Significant differences were found between the two types of Brazil nut trees for the characters diameter at breast height, fruit yield, fruit size, and number of seeds per fruit. Despite the significant correlation between the morphological characteristics and the popular classification, we observed all possible combinations of morphological characteristics in both types of Brazil nut trees. In some individuals, the classification did not correspond to any of the characteristics. The results obtained with molecular markers showed that the two locally classified types of Brazil nut trees did not differ genetically, indicating that there is no consistent separation between them.

  13. Tree Crown Delineation on VHR Aerial Imagery with SVM Classification Technique Optimized by Taguchi Method: A Case Study in Zagros Woodlands

    NASA Astrophysics Data System (ADS)

    Erfanifard, Y.; Behnia, N.; Moosavi, V.

    2013-09-01

    The Support Vector Machine (SVM) is a theoretically superior machine learning methodology that has produced strong results in the classification of remotely sensed datasets. However, determining the optimal values of its parameters remains unclear to many users. This research suggests using the Taguchi method to optimize these parameters. The objective of this study was to detect tree crowns on very high resolution (VHR) aerial imagery of Zagros woodlands with an SVM optimized by the Taguchi method. A 30 ha plot of Persian oak (Quercus persica) coppice trees was selected in the Zagros woodlands, Iran. VHR aerial imagery of the plot with 0.06 m spatial resolution was obtained from the National Geographic Organization (NGO), Iran, to extract the crowns of the Persian oak trees. The SVM parameters were optimized by the Taguchi method, and the imagery was then classified by the SVM with the optimal parameters. The results showed that the Taguchi method is a very useful approach for optimizing the combination of SVM parameters. The SVM detected the tree crowns with a KHAT coefficient of 0.961, indicating strong agreement with the observed samples, and an overall accuracy of 97.7% for the final map. Finally, the authors suggest applying this method to optimize the parameters of classification techniques such as SVM.

  14. Tree-Level Hydrodynamic Approach for Improved Stomatal Conductance Parameterization

    NASA Astrophysics Data System (ADS)

    Mirfenderesgi, G.; Bohrer, G.; Matheny, A. M.; Ivanov, V. Y.

    2014-12-01

    Land-surface models do not mechanistically resolve hydrodynamic processes within the tree. The Finite-Elements Tree-Crown Hydrodynamics model version 2 (FETCH2) is based on the previous FETCH model approach, but with finite-difference numerics and a simplified single-beam conduit system. FETCH2 simulates water flow through the tree as a simplified system of porous-media conduits. It explicitly resolves spatiotemporal hydraulic stresses throughout the tree's vertical extent that cannot be easily represented using other stomatal-conductance models. Empirical equations relate water potential in the stem to stomatal conductance at leaves connected to the stem (through unresolved branches) at that height. While highly simplified, this approach brings some realism to the simulation of stomatal conductance because the stomata respond to stem water potential rather than to an assumed direct relationship with soil moisture, as is currently the case in almost all models. By enabling mechanistic simulation of hydraulic traits, such as xylem conductivity, conductive area per DBH, vertical distribution of leaf area, and maximal and minimal water content in the xylem, and of their effect on the dynamics of water flow in the tree system, FETCH2 enhances our understanding of the role of hydraulic limitations during short-term water stress at an experimental forest plot, where stress leads to tradeoffs between water and light availability for transpiring leaves in forest ecosystems. FETCH2 is particularly suitable for resolving the effects of structural differences between trees, species, and size groups, and the consequences of differences in the hydraulic strategies of different species. We leverage a large dataset of sap flow from 60 trees of 4 species at our experimental plot at the University of Michigan Biological Station. Comparison of the sap flow and transpiration patterns at this site and at an undisturbed control site shows significant differences in hydraulic strategies

  15. A neuro-fuzzy approach in the classification of students' academic performance.

    PubMed

    Do, Quang Hung; Chen, Jeng-Fung

    2013-01-01

    Classifying the student academic performance with high accuracy facilitates admission decisions and enhances educational services at educational institutions. The purpose of this paper is to present a neuro-fuzzy approach for classifying students into different groups. The neuro-fuzzy classifier used previous exam results and other related factors as input variables and labeled students based on their expected academic performance. The results showed that the proposed approach achieved a high accuracy. The results were also compared with those obtained from other well-known classification approaches, including support vector machine, Naive Bayes, neural network, and decision tree approaches. The comparative analysis indicated that the neuro-fuzzy approach performed better than the others. It is expected that this work may be used to support student admission procedures and to strengthen the services of educational institutions.

  16. A Neuro-Fuzzy Approach in the Classification of Students' Academic Performance

    PubMed Central

    2013-01-01

    Classifying the student academic performance with high accuracy facilitates admission decisions and enhances educational services at educational institutions. The purpose of this paper is to present a neuro-fuzzy approach for classifying students into different groups. The neuro-fuzzy classifier used previous exam results and other related factors as input variables and labeled students based on their expected academic performance. The results showed that the proposed approach achieved a high accuracy. The results were also compared with those obtained from other well-known classification approaches, including support vector machine, Naive Bayes, neural network, and decision tree approaches. The comparative analysis indicated that the neuro-fuzzy approach performed better than the others. It is expected that this work may be used to support student admission procedures and to strengthen the services of educational institutions. PMID:24302928

  17. A simulation approach for change-points on phylogenetic trees.

    PubMed

    Persing, Adam; Jasra, Ajay; Beskos, Alexandros; Balding, David; De Iorio, Maria

    2015-01-01

    We observe n sequences at each of m sites and assume that they have evolved from an ancestral sequence that forms the root of a binary tree of known topology and branch lengths, but the sequence states at internal nodes are unknown. The topology of the tree and branch lengths are the same for all sites, but the parameters of the evolutionary model can vary over sites. We assume a piecewise constant model for these parameters, with an unknown number of change-points and hence a transdimensional parameter space over which we seek to perform Bayesian inference. We propose two novel ideas to deal with the computational challenges of such inference. Firstly, we approximate the model based on the time machine principle: the top nodes of the binary tree (near the root) are replaced by an approximation of the true distribution; as more nodes are removed from the top of the tree, the cost of computing the likelihood is reduced linearly in n. The approach introduces a bias, which we investigate empirically. Secondly, we develop a particle marginal Metropolis-Hastings (PMMH) algorithm, that employs a sequential Monte Carlo (SMC) sampler and can use the first idea. Our time-machine PMMH algorithm copes well with one of the bottle-necks of standard computational algorithms: the transdimensional nature of the posterior distribution. The algorithm is implemented on simulated and real data examples, and we empirically demonstrate its potential to outperform competing methods based on approximate Bayesian computation (ABC) techniques. PMID:25506749

  19. A Hybrid Sensing Approach for Pure and Adulterated Honey Classification

    PubMed Central

    Subari, Norazian; Saleh, Junita Mohamad; Shakaff, Ali Yeon Md; Zakaria, Ammar

    2012-01-01

    This paper presents a comparison between data from single modality and fusion methods to classify Tualang honey as pure or adulterated using Linear Discriminant Analysis (LDA) and Principal Component Analysis (PCA) statistical classification approaches. Ten different brands of certified pure Tualang honey were obtained throughout peninsular Malaysia and Sumatera, Indonesia. Various concentrations of two types of sugar solution (beet and cane sugar) were used in this investigation to create honey samples of 20%, 40%, 60% and 80% adulteration concentrations. Honey data extracted from an electronic nose (e-nose) and Fourier Transform Infrared Spectroscopy (FTIR) were gathered, analyzed and compared based on fusion methods. Visual observation of classification plots revealed that the PCA approach was able to distinguish pure and adulterated honey samples better than the LDA technique. Overall, the validated classification results based on FTIR data (88.0%) gave higher classification accuracy than e-nose data (76.5%) using the LDA technique. Honey classification based on normalized low-level and intermediate-level FTIR and e-nose fusion data scored classification accuracies of 92.2% and 88.7%, respectively, using the Stepwise LDA method. The results suggested that pure and adulterated honey samples were better classified using FTIR and e-nose fusion data than single modality data. PMID:23202033
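
    A minimal sketch of the low-level fusion idea described above, assuming invented e-nose and FTIR feature dimensions: each modality is normalized, the feature vectors are concatenated, PCA gives an unsupervised view of class separation, and LDA accuracy is estimated by cross-validation. This is not the authors' pipeline or data.

```python
# Hypothetical sketch of low-level data fusion followed by PCA and LDA,
# loosely following the abstract; feature dimensions and data are invented.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
n = 120
enose = rng.normal(size=(n, 32))      # e-nose sensor responses (assumed 32 channels)
ftir = rng.normal(size=(n, 200))      # FTIR absorbance values (assumed 200 wavenumbers)
y = rng.integers(0, 2, n)             # 0 = pure honey, 1 = adulterated

# Low-level fusion: normalize each modality, then concatenate feature vectors.
fused = np.hstack([StandardScaler().fit_transform(enose),
                   StandardScaler().fit_transform(ftir)])

# Unsupervised view of class separation (a PCA scores plot would be inspected visually).
scores = PCA(n_components=2).fit_transform(fused)

# Supervised classification accuracy with LDA, estimated by cross-validation.
acc = cross_val_score(LinearDiscriminantAnalysis(), fused, y, cv=5).mean()
print("first two PCA score vectors:\n", scores[:2])
print("LDA cross-validated accuracy:", round(acc, 3))
```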

  20. A hybrid sensing approach for pure and adulterated honey classification.

    PubMed

    Subari, Norazian; Mohamad Saleh, Junita; Md Shakaff, Ali Yeon; Zakaria, Ammar

    2012-01-01

    This paper presents a comparison between data from single modality and fusion methods to classify Tualang honey as pure or adulterated using Linear Discriminant Analysis (LDA) and Principal Component Analysis (PCA) statistical classification approaches. Ten different brands of certified pure Tualang honey were obtained throughout peninsular Malaysia and Sumatera, Indonesia. Various concentrations of two types of sugar solution (beet and cane sugar) were used in this investigation to create honey samples of 20%, 40%, 60% and 80% adulteration concentrations. Honey data extracted from an electronic nose (e-nose) and Fourier Transform Infrared Spectroscopy (FTIR) were gathered, analyzed and compared based on fusion methods. Visual observation of classification plots revealed that the PCA approach was able to distinguish pure and adulterated honey samples better than the LDA technique. Overall, the validated classification results based on FTIR data (88.0%) gave higher classification accuracy than e-nose data (76.5%) using the LDA technique. Honey classification based on normalized low-level and intermediate-level FTIR and e-nose fusion data scored classification accuracies of 92.2% and 88.7%, respectively, using the Stepwise LDA method. The results suggested that pure and adulterated honey samples were better classified using FTIR and e-nose fusion data than single modality data. PMID:23202033

  1. An efficiency data envelopment analysis model reinforced by classification and regression tree for hospital performance evaluation.

    PubMed

    Chuang, Chun-Ling; Chang, Peng-Chan; Lin, Rong-Ho

    2011-10-01

    As changes in the medical environment and policies on national health insurance coverage have triggered tremendous impacts on the business performance and financial management of medical institutions, effective management becomes increasingly crucial for hospitals to enhance competitiveness and to strive for sustainable development. The study accordingly aims at evaluating hospital operational efficiency for better resource allocation and cost effectiveness. Several data envelopment analysis (DEA)-based models were first compared, and the DEA-artificial neural network (ANN) model was identified as more capable than the DEA and DEA-assurance region (AR) models of measuring operational efficiency and recognizing the best-performing hospital. The classification and regression tree (CART) efficiency model was then utilized to extract rules for improving resource allocation of medical institutions. PMID:20878210

  2. Genetic Algorithms and Classification Trees in Feature Discovery: Diabetes and the NHANES database

    SciTech Connect

    Heredia-Langner, Alejandro; Jarman, Kristin H.; Amidan, Brett G.; Pounds, Joel G.

    2013-09-01

    This paper presents a feature selection methodology that can be applied to datasets containing a mixture of continuous and categorical variables. Using a Genetic Algorithm (GA), this method explores a dataset and selects a small set of features relevant for the prediction of a binary (1/0) response. Binary classification trees and an objective function based on conditional probabilities are used to measure the fitness of a given subset of features. The method is applied to health data in order to find factors useful for the prediction of diabetes. Results show that our algorithm is capable of narrowing down the set of predictors to around 8 factors that can be validated using reputable medical and public health resources.
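
    A rough sketch of the general scheme, under assumed settings: a genetic algorithm evolves binary feature masks, and each mask's fitness is the cross-validated accuracy of a shallow classification tree (standing in for the paper's conditional-probability objective). The population size, mutation rate and synthetic data set are illustrative choices only.

```python
# Rough sketch of GA-driven feature selection with a decision-tree fitness,
# substituting cross-validated accuracy for the paper's conditional-probability
# objective. Dataset and GA settings are illustrative only.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(42)
X, y = make_classification(n_samples=400, n_features=30, n_informative=6, random_state=42)

def fitness(mask):
    if mask.sum() == 0:
        return 0.0
    tree = DecisionTreeClassifier(max_depth=4, random_state=0)
    return cross_val_score(tree, X[:, mask.astype(bool)], y, cv=3).mean()

pop_size, n_gen, mut_rate = 20, 15, 0.05
pop = rng.integers(0, 2, size=(pop_size, X.shape[1]))   # population of binary feature masks

for _ in range(n_gen):
    scores = np.array([fitness(ind) for ind in pop])
    parents = pop[np.argsort(scores)[-pop_size // 2:]]   # truncation selection
    children = []
    while len(children) < pop_size - len(parents):
        a, b = parents[rng.integers(len(parents), size=2)]
        cut = rng.integers(1, X.shape[1])                # one-point crossover
        child = np.concatenate([a[:cut], b[cut:]])
        flip = rng.random(X.shape[1]) < mut_rate         # bit-flip mutation
        child[flip] = 1 - child[flip]
        children.append(child)
    pop = np.vstack([parents, children])

best = pop[np.argmax([fitness(ind) for ind in pop])]
print("selected features:", np.flatnonzero(best), "fitness:", round(fitness(best), 3))
```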

  3. Classification tree models for predicting distributions of michigan stream fish from landscape variables

    USGS Publications Warehouse

    Steen, P.J.; Zorn, T.G.; Seelbach, P.W.; Schaeffer, J.S.

    2008-01-01

    Traditionally, fish habitat requirements have been described from local-scale environmental variables. However, recent studies have shown that studying landscape-scale processes improves our understanding of what drives species assemblages and distribution patterns across the landscape. Our goal was to learn more about constraints on the distribution of Michigan stream fish by examining landscape-scale habitat variables. We used classification trees and landscape-scale habitat variables to create and validate presence-absence models and relative abundance models for Michigan stream fishes. We developed 93 presence-absence models that on average were 72% correct in making predictions for an independent data set, and we developed 46 relative abundance models that were 76% correct in making predictions for independent data. The models were used to create statewide predictive distribution and abundance maps that have the potential to be used for a variety of conservation and scientific purposes. © Copyright by the American Fisheries Society 2008.

  4. Hong Kong CIE sky classification and prediction by accessible weather data and trees-based methods

    NASA Astrophysics Data System (ADS)

    Lou, S.; Li, D. H. W.; Lam, J. C.

    2016-08-01

    Solar irradiance and daylight illuminance are important for solar energy and daylighting designs. Recently, the International Commission on Illumination (CIE) adopted a range of sky conditions to represent the possible sky distributions, which are crucial to the estimation of solar irradiance and daylight illuminance on vertical building facades. The important issue is whether the sky conditions can be correctly identified from accessible variables. Previously, a number of climatic parameters including sky luminance distributions, vertical solar irradiance and sky illuminance were proposed for the CIE sky classification. However, such data are not always available. This paper proposes an approach based on readily accessible data that have been systematically recorded by the local meteorological station for many years. The performance was evaluated using measured vertical solar irradiance and illuminance. The results show that the proposed approach is reliable for sky classification.

  5. Identification of Sexually Abused Female Adolescents at Risk for Suicidal Ideations: A Classification and Regression Tree Analysis

    ERIC Educational Resources Information Center

    Brabant, Marie-Eve; Hebert, Martine; Chagnon, Francois

    2013-01-01

    This study explored the clinical profiles of 77 female teenager survivors of sexual abuse and examined the association of abuse-related and personal variables with suicidal ideations. Analyses revealed that 64% of participants experienced suicidal ideations. Findings from classification and regression tree analysis indicated that depression,…

  6. Tree species classification in the Southern Sierra Nevada Mountains based on MASTER and LIDAR imagery

    NASA Astrophysics Data System (ADS)

    Gibbons, S.; Grigsby, S.; Ustin, S.

    2013-12-01

    NASA recently collected MASTER (MODIS/ASTER) imagery over the Southern Sierra Nevada Mountains as part of the HyspIRI (Hyperspectral Infrared Imager) preparatory campaign, a location that was chosen for its distinct changes in vegetative species with elevation. Differentiation between functional types based on spectral data has been successful, however, classification between individual species is more difficult to accomplish with only the visible and near infrared portions of the spectrum. I used MASTER imagery in combination with Critical Zone Observatory LIDAR data to map species across both a low and high elevation site in the San Joaquin Experimental Range. While the visible and thermal bands of MASTER images provided an improved classification over shortwave bands, the physical characteristics from the LIDAR data showed the most contrast between the land covers, including tree species. The National Ecological Observation Network (NEON) plans to use LIDAR and spectral data to monitor 20 domains, including the San Joaquin Experimental Range, for the next thirty years. Understanding the current species distributions not only provides insight on the available resources of the area but will also act as a baseline to determine the effects of environmental changes on vegetation using future NEON data.

  7. Object classification in images for Epo doping control based on fuzzy decision trees

    NASA Astrophysics Data System (ADS)

    Bajla, Ivan; Hollander, Igor; Heiss, Dorothea; Granec, Reinhard; Minichmayr, Markus

    2005-02-01

    Erythropoietin (Epo) is a hormone which can be misused as a doping substance. Its detection involves analysis of images containing specific objects (bands), whose position and intensity are critical for doping positivity. Within a research project of the World Anti-Doping Agency (WADA) we are implementing the GASepo software that should serve for Epo testing in doping control laboratories world-wide. For identification of the bands we have developed a segmentation procedure based on a sequence of filters and edge detectors. Whereas all true bands are properly segmented, the procedure generates a relatively high number of false positives (artefacts). To separate these artefacts we suggested a post-segmentation supervised classification using real-valued geometrical measures of objects. The method is based on the ID3 (Ross Quinlan's) rule generation method, where fuzzy representation is used for linking the linguistic terms to quantitative data. The fuzzy modification of the ID3 method provides a framework that generates fuzzy decision trees, as well as fuzzy sets for input data. Using the MLTTM software (Machine Learning Framework) we have generated a set of fuzzy rules explicitly describing bands and artefacts. The method eliminated most of the artefacts. The contribution includes a comparison of the obtained misclassification errors to the errors produced by some other statistical classification methods.

  8. Effect of training characteristics on object classification: An application using Boosted Decision Trees

    NASA Astrophysics Data System (ADS)

    Sevilla-Noarbe, I.; Etayo-Sotos, P.

    2015-06-01

    We present an application of a particular machine-learning method (Boosted Decision Trees, BDTs using AdaBoost) to separate stars and galaxies in photometric images using their catalog characteristics. BDTs are a well-established machine learning technique used for classification purposes. They have been widely used, especially in the field of particle and astroparticle physics, and we use them here in an optical astronomy application. This algorithm is able to improve on simple thresholding cuts on standard separation variables that may be affected by local effects such as blending or badly calculated background levels, or that do not include information from other bands. The improvements are shown using the Sloan Digital Sky Survey Data Release 9, with respect to the type photometric classifier. We obtain an improvement in the impurity of the galaxy sample of a factor 2-4 for this particular dataset, adjusting for the same efficiency of the selection. Another main goal of this study is to verify the effects that different input vectors and training sets have on the classification performance, the results being of wider use to other machine learning techniques.
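
    A hedged, self-contained illustration of the boosted-decision-tree idea on invented catalog-like features (a concentration index and a magnitude), compared against a single threshold cut; it is not the SDSS DR9 analysis. Scikit-learn's default AdaBoost base learner is a depth-one decision tree (a stump).

```python
# Illustrative sketch (not the SDSS pipeline): boosted decision trees versus a
# single-variable threshold cut for star/galaxy separation on synthetic features.
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(3)
n = 4000
concentration = np.concatenate([rng.normal(2.2, 0.5, n // 2),   # galaxies: extended profiles
                                rng.normal(1.1, 0.4, n // 2)])  # stars: point-like profiles
magnitude = rng.uniform(16, 24, n)
X = np.column_stack([concentration, magnitude])
y = np.concatenate([np.zeros(n // 2), np.ones(n // 2)])         # 0 = galaxy, 1 = star

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Boosted decision trees (the default base learner is a depth-1 tree).
bdt = AdaBoostClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

cut_pred = (X_te[:, 0] < 1.6).astype(float)                     # simple threshold cut on one variable
print("threshold cut accuracy:", round(accuracy_score(y_te, cut_pred), 3))
print("boosted tree accuracy: ", round(accuracy_score(y_te, bdt.predict(X_te)), 3))
```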

  9. Annual Crop Type Classification of the U.S. Great Plains for 2000 - 2011: An Application of Classification Tree Modeling using Remote Sensing and Ancillary Environmental Data (Invited)

    NASA Astrophysics Data System (ADS)

    Howard, D. M.; Wylie, B. K.

    2013-12-01

    The purpose of this study was to increase spatial and temporal availability of crop classification data using reliable source data that have the potential of being applied on local, regional, national, and global levels. This study implemented classification tree modeling to map annual crop types throughout the U.S. Great Plains from 2000 - 2011. Classification tree modeling has been shown in numerous studies to be an effective tool for developing classification models. In this study, nearly 18 million crop observation points, derived from annual U.S. Department of Agriculture (USDA) National Agriculture Statistics Service (NASS) Cropland Data Layers (CDLs), were used in the training, development, and validation of a classification tree crop type model (CTM). Each observation point was further defined by weekly Normalized Difference Vegetation Index (NDVI) readings, annual climatic conditions, soil conditions, and a number of other biogeophysical environmental characteristics. The CTM accounted for the most prevalent crop types in the area, including corn, soybeans, winter wheat, spring wheat, cotton, sorghum, and alfalfa. Other crops that did not fit into any of these classes were identified and grouped into a miscellaneous class. An 87% success rate was achieved on the classification of 1.8 million observation points (10% of total observation points) that were withheld from training. The CTM was applied to create annual crop maps of the U.S. Great Plains for 2000 - 2011 at a spatial resolution of 250 meters. Product validation was performed by comparing county acreage derived from the modeled crop maps and county acreage data from the USDA NASS Survey Program for each crop type and each year. Greater than 15,000 county records from 2001 - 2010 were compared with a Pearson's correlation coefficient of r = 0.87.

  10. An approach for quantifying the efficacy of ecological classification schemes as management tools

    NASA Astrophysics Data System (ADS)

    Flanagan, A. M.; Cerrato, R. M.

    2015-10-01

    Rigorous assessments of ecological classification schemes being applied to submerged environments are needed to evaluate their utility as management tools. Verification that a scheme can quantitatively capture habitat and community variation would be of considerable value to individuals responsible for making difficult management decisions relevant to widespread environmental challenges including those in fisheries, preservation or restoration of critical habitats, and climate change. In this paper, an assessment approach that evaluates a scheme by treating it like a quantitative statistical model is presented. It couples two direct gradient, multivariate statistical techniques, multivariate regression trees (MRT) and redundancy analysis (RDA), with a modelling protocol involving model formulation, model selection, parameter estimation, and measurement of precision to produce a very flexible strategy for analyzing structure in ecological data. To illustrate the proposed approach, the assessment focused on benthic infauna and evaluating the Folk grain size classification scheme, along with some alternative grain size models. Analysis of data sets revealed that while it was fairly easy to uncover biotic-environmental relationships that were over-fitted, the community structure inherent in the data tended to be robustly discernible and preserved across all grain size models, but rigidly parameterized models (i.e., a one size fits all approach for grain size characterization with fixed boundaries) were generally ineffective. The proposed approach provided a clear, detailed, and rigorous assessment of Folk and several alternative models and can be used for the quantitative evaluation of existing ecological classification schemes and/or in the development of new schemes.

  11. [It is normal for classification approaches to be diverse].

    PubMed

    Pavlinov, I Ia

    2003-01-01

    It is asserted that the postmodern concept of science, unlike the classical ideal, presumes the necessary existence of various classification approaches (schools) in taxonomy, each corresponding to a particular aspect of consideration of the "taxic reality". These schools are set up by the diversity of initial epistemological and ontological backgrounds, which fix in a certain way a) the fragments of that reality allowable for investigation, and b) the allowable methods of exploration of the fragments so fixed. This makes it possible to define a taxonomic school as a unity of the above backgrounds together with the consideration aspect delimited by them. Two extreme positions of these backgrounds can be recognized in recent taxonomic thought. One follows the scholastic tradition of elaborating a formal and, hence, universal classificatory method ("new typology", numerical phenetics, pattern cladistics). The other asserts the dependence of the classificatory approach on judgments about the nature of taxic reality (natural philosophy, evolutionary schools of taxonomy). Some arguments are put forward in favor of the significant impact of evolutionary thinking on the theory of modern taxonomy. This impact is manifested by the correspondence principle, which makes classificatory algorithms (and hence the resulting classifications) depend on initial assumptions about the causes of taxic diversity. It is asserted that criteria for the "quality" of both classifications proper and classificatory methods can be correctly formulated only within the framework of a particular consideration aspect. For any group of organisms, several particular classifications are entitled to exist, each corresponding to a particular consideration aspect. These classifications cannot be arranged along a "better-worse" scale, as they reflect different fragments of the taxic reality. Their mutual interpretation depends on the degree of compatibility of the background assumptions and of the tasks being resolved. Extensionally

  12. Simulating California Reservoir Operation Using the Classification and Regression Tree Algorithm Combined with a Shuffled Cross-Validation Scheme

    NASA Astrophysics Data System (ADS)

    Yang, T.; Gao, X.; Sorooshian, S.; Li, X.

    2015-12-01

    The controlled outflows from a reservoir or dam are highly dependent on the decisions made by the reservoir operators, rather than on a natural hydrological process. Differences exist between the natural upstream inflows to reservoirs and the controlled outflows from reservoirs that supply the downstream users. With the decision maker's awareness of changing climate, reservoir management requires adaptable means to incorporate more information into decision making, such as the consideration of policy and regulation, environmental constraints, dry/wet conditions, etc. In this paper, a reservoir outflow simulation model is presented, which incorporates one of the well-developed data-mining models (Classification and Regression Tree) to predict the complicated human-controlled reservoir outflows and extract the reservoir operation patterns. A shuffled cross-validation approach is further implemented to improve the model's predictive performance. An application study of nine major reservoirs in California is carried out, and the simulated results from different decision tree approaches, including the original CART and Random Forest, are compared with observations. The statistical measurements show that CART combined with the shuffled cross-validation scheme gives better predictive performance than the other two methods, especially in simulating the peak flows. The results for simulated controlled outflow, storage changes and storage trajectories also show that the proposed model is able to consistently and reasonably predict the operators' reservoir operation decisions. In addition, we found that the operation of Trinity Lake, Oroville Lake and Shasta Lake is greatly influenced by policy and regulation, while low-elevation reservoirs are more sensitive to inflow amount than others.
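
    A minimal sketch of the modeling idea, assuming synthetic inflow, storage and month features: a CART regressor predicts controlled outflow, and a shuffled K-fold cross-validation scheme is used to compare tree depths. This is not the authors' data or calibration procedure.

```python
# Hedged sketch: a CART regressor for controlled outflow with a shuffled
# K-fold cross-validation used to compare tree depths. Inputs are synthetic
# stand-ins for inflow, storage and month; not the authors' data or code.
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import KFold, cross_val_score

rng = np.random.default_rng(7)
n = 1500
inflow = rng.gamma(2.0, 500.0, n)               # daily inflow (assumed units)
storage = rng.uniform(0.2, 1.0, n)              # fraction of capacity
month = rng.integers(1, 13, n)
X = np.column_stack([inflow, storage, month])
outflow = 0.6 * inflow * storage + 200 * (month > 5) + rng.normal(0, 50, n)

cv = KFold(n_splits=5, shuffle=True, random_state=0)   # the "shuffled" scheme
for depth in (3, 6, 9, None):
    tree = DecisionTreeRegressor(max_depth=depth, random_state=0)
    r2 = cross_val_score(tree, X, outflow, cv=cv, scoring="r2").mean()
    print(f"max_depth={depth}: mean CV R^2 = {r2:.3f}")
```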

  13. Identifying tree crown delineation shapes and need for remediation on high resolution imagery using an evidence based approach

    NASA Astrophysics Data System (ADS)

    Leckie, Donald G.; Walsworth, Nicholas; Gougeon, François A.

    2016-04-01

    In order to fully realize the benefits of automated individual tree mapping for tree species, health, forest inventory attribution and forest management decision making, the tree delineations should be as good as possible. The concept of identifying poorly delineated tree crowns and suggesting likely types of remediation was investigated. Delineations (isolations or isols) were classified into shape types reflecting whether they were realistic tree shapes and the likely kind of remediation needed. Shape type was classified by an evidence based rules approach using primitives based on isol size, shape indices, morphology, the presence of local maxima, and matches with template models representing trees of different sizes. A test set containing 50,000 isols based on an automated tree delineation of 40 cm multispectral airborne imagery of a diverse temperate-boreal forest site was used. Isolations representing single trees or several trees were the focus, as opposed to cases where a tree is split into several isols. For eight shape classes from regular through to convolute, shape classification accuracy was in the order of 62%; simplifying to six classes accuracy was 83%. Shape type did give an indication of the type of remediation and there were 6% false alarms (i.e., isols classed as needing remediation but did not). Alternately, there were 5% omissions (i.e., isols of regular shape and not earmarked for remediation that did need remediation). The usefulness of the concept of identifying poor delineations in need of remediation was demonstrated and one suite of methods developed and shown to be effective.

  14. Classification Algorithms for Big Data Analysis, a Map Reduce Approach

    NASA Astrophysics Data System (ADS)

    Ayma, V. A.; Ferreira, R. S.; Happ, P.; Oliveira, D.; Feitosa, R.; Costa, G.; Plaza, A.; Gamba, P.

    2015-03-01

    For many years, the scientific community has been concerned with how to increase the accuracy of different classification methods, and major achievements have been made so far. Besides this issue, the increasing amount of data that is being generated every day by remote sensors raises more challenges to be overcome. In this work, a tool within the scope of InterIMAGE Cloud Platform (ICP), which is an open-source, distributed framework for automatic image interpretation, is presented. The tool, named ICP: Data Mining Package, is able to perform supervised classification procedures on huge amounts of data, usually referred to as big data, on a distributed infrastructure using Hadoop MapReduce. The tool has four classification algorithms implemented, taken from WEKA's machine learning library, namely: Decision Trees, Naïve Bayes, Random Forest and Support Vector Machines (SVM). The results of an experimental analysis using an SVM classifier on data sets of different sizes for different cluster configurations demonstrate the potential of the tool, as well as aspects that affect its performance.

  15. Pattern classification approach to rocket engine diagnostics

    SciTech Connect

    Tulpule, S.

    1989-01-01

    This paper presents a systems-level approach to integrate state-of-the-art rocket engine technology with advanced computational techniques to develop an integrated diagnostic system (IDS) for future rocket propulsion systems. The key feature of this IDS is the use of advanced diagnostic algorithms for failure detection as opposed to the current practice of redline-based failure detection methods. The paper presents a top-down analysis of rocket engine diagnostic requirements, rocket engine operation, applicable diagnostic algorithms, and algorithm design techniques, which serve as a basis for the IDS. The concepts of hierarchical, model-based information processing are described, together with the use of signal processing, pattern recognition, and artificial intelligence techniques which are an integral part of this diagnostic system. 27 refs.

  16. Impact of atmospheric correction and image filtering on hyperspectral classification of tree species using support vector machine

    NASA Astrophysics Data System (ADS)

    Shahriari Nia, Morteza; Wang, Daisy Zhe; Bohlman, Stephanie Ann; Gader, Paul; Graves, Sarah J.; Petrovic, Milenko

    2015-01-01

    Hyperspectral images can be used to identify savannah tree species at the landscape scale, which is a key step in measuring biomass and carbon, and tracking changes in species distributions, including invasive species, in these ecosystems. Before automated species mapping can be performed, image processing and atmospheric correction are often applied, which can potentially affect the performance of classification algorithms. We determine how three processing and correction techniques (atmospheric correction, Gaussian filters, and shade/green vegetation filters) affect the prediction accuracy of classification of tree species at pixel level from airborne visible/infrared imaging spectrometer imagery of longleaf pine savanna in Central Florida, United States. Species classification using fast line-of-sight atmospheric analysis of spectral hypercubes (FLAASH) atmospheric correction outperformed ATCOR in the majority of cases. Green vegetation (normalized difference vegetation index) and shade (near-infrared) filters did not increase classification accuracy when applied to large and continuous patches of specific species. Finally, applying a Gaussian filter reduces interband noise and increases species classification accuracy. Using the optimal preprocessing steps, our classification accuracy of six species classes is about 75%.
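
    A small sketch of the preprocessing step discussed above, under invented assumptions about cube size, labels and filter width: each band of a synthetic hyperspectral cube is smoothed with a Gaussian filter before pixel-level SVM classification, and accuracy is compared with and without smoothing.

```python
# Minimal sketch of the preprocessing idea: smooth each band of a (synthetic)
# hyperspectral cube with a Gaussian filter, then classify labeled pixels with
# an SVM. Cube size, labels and filter width are assumptions.
import numpy as np
from scipy.ndimage import gaussian_filter
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(11)
rows, cols, bands = 60, 60, 50
labels = np.tile((np.arange(cols) >= cols // 2).astype(int), (rows, 1))  # toy two-class map
cube = rng.normal(size=(rows, cols, bands))
cube += labels[..., None] * np.linspace(0.0, 1.0, bands)                 # class-dependent spectral ramp

# Spatial Gaussian smoothing applied band by band (sigma in pixels).
smoothed = np.stack([gaussian_filter(cube[:, :, b], sigma=1.0) for b in range(bands)], axis=-1)

y = labels.reshape(-1)
for name, data in (("raw", cube), ("smoothed", smoothed)):
    X = data.reshape(-1, bands)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)
    svm = SVC(kernel="rbf", gamma="scale").fit(X_tr, y_tr)
    print(name, "pixel accuracy:", round(accuracy_score(y_te, svm.predict(X_te)), 3))
```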

  17. An efficient tree classifier ensemble-based approach for pedestrian detection.

    PubMed

    Xu, Yanwu; Cao, Xianbin; Qiao, Hong

    2011-02-01

    Classification-based pedestrian detection systems (PDSs) are currently a hot research topic in the field of intelligent transportation. A PDS detects pedestrians in real time on moving vehicles. A practical PDS demands not only high detection accuracy but also high detection speed. However, most of the existing classification-based approaches mainly seek for high detection accuracy, while the detection speed is not purposely optimized for practical application. At the same time, the performance, particularly the speed, is primarily tuned based on experiments without theoretical foundations, leading to a long training procedure. This paper starts with measuring and optimizing detection speed, and then a practical classification-based pedestrian detection solution with high detection speed and training speed is described. First, an extended classification/detection speed metric, named feature-per-object (fpo), is proposed to measure the detection speed independently from execution. Then, an fpo minimization model with accuracy constraints is formulated based on a tree classifier ensemble, where the minimum fpo can guarantee the highest detection speed. Finally, the minimization problem is solved efficiently by using nonlinear fitting based on radial basis function neural networks. In addition, the optimal solution is directly used to instruct classifier training; thus, the training speed could be accelerated greatly. Therefore, a rapid and accurate classification-based detection technique is proposed for the PDS. Experimental results on urban traffic videos show that the proposed method has a high detection speed with an acceptable detection rate and a false-alarm rate for onboard detection; moreover, the training procedure is also very fast.

  18. Knowledge-based approach to video content classification

    NASA Astrophysics Data System (ADS)

    Chen, Yu; Wong, Edward K.

    2001-01-01

    A framework for video content classification using a knowledge-based approach is herein proposed. This approach is motivated by the fact that videos are rich in semantic contents, which can best be interpreted and analyzed by human experts. We demonstrate the concept by implementing a prototype video classification system using the rule-based programming language CLIPS 6.05. Knowledge for video classification is encoded as a set of rules in the rule base. The left-hand-sides of rules contain high-level and low-level features, while the right-hand-sides of rules contain intermediate results or conclusions. Our current implementation includes features computed from motion, color, and text extracted from video frames. Our current rule set allows us to classify input video into one of five classes: news, weather reporting, commercial, basketball and football. We use MYCIN's inexact reasoning method for combining evidence and handling the uncertainties in the features and in the classification results. We obtained good results in a preliminary experiment, and it demonstrated the validity of the proposed approach.
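
    The following toy sketch illustrates MYCIN-style certainty-factor combination for rules voting on a video class; the rule conditions, features and certainty values are invented, and it is not the authors' CLIPS rule base.

```python
# Toy illustration of MYCIN-style certainty-factor combination for rules
# voting on a video class; rule features and certainty values are invented.
def combine_cf(cf1, cf2):
    """Combine two certainty factors for the same hypothesis (MYCIN rule)."""
    if cf1 >= 0 and cf2 >= 0:
        return cf1 + cf2 * (1 - cf1)
    if cf1 < 0 and cf2 < 0:
        return cf1 + cf2 * (1 + cf1)
    return (cf1 + cf2) / (1 - min(abs(cf1), abs(cf2)))

# Each rule: (condition over extracted features, target class, certainty factor).
rules = [
    (lambda f: f["text_density"] > 0.3 and f["motion"] < 0.2, "news", 0.7),
    (lambda f: f["dominant_color"] == "green" and f["motion"] > 0.6, "football", 0.8),
    (lambda f: f["motion"] > 0.5, "basketball", 0.4),
    (lambda f: f["shot_length"] < 2.0, "commercial", 0.6),
]

features = {"text_density": 0.1, "motion": 0.8, "dominant_color": "green", "shot_length": 3.5}

belief = {}
for condition, label, cf in rules:
    if condition(features):
        belief[label] = combine_cf(belief.get(label, 0.0), cf)

print(max(belief, key=belief.get), belief)   # class with the highest combined certainty factor
```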

  19. Knowledge-based approach to video content classification

    NASA Astrophysics Data System (ADS)

    Chen, Yu; Wong, Edward K.

    2000-12-01

    A framework for video content classification using a knowledge-based approach is herein proposed. This approach is motivated by the fact that videos are rich in semantic contents, which can best be interpreted and analyzed by human experts. We demonstrate the concept by implementing a prototype video classification system using the rule-based programming language CLIPS 6.05. Knowledge for video classification is encoded as a set of rules in the rule base. The left-hand-sides of rules contain high-level and low-level features, while the right-hand-sides of rules contain intermediate results or conclusions. Our current implementation includes features computed from motion, color, and text extracted from video frames. Our current rule set allows us to classify input video into one of five classes: news, weather reporting, commercial, basketball and football. We use MYCIN's inexact reasoning method for combining evidence and handling the uncertainties in the features and in the classification results. We obtained good results in a preliminary experiment, and it demonstrated the validity of the proposed approach.

  20. An approach for leukemia classification based on cooperative game theory.

    PubMed

    Torkaman, Atefeh; Charkari, Nasrollah Moghaddam; Aghaeipour, Mahnaz

    2011-01-01

    Hematological malignancies are the types of cancer that affect blood, bone marrow and lymph nodes. As these tissues are naturally connected through the immune system, a disease affecting one of them will often affect the others as well. The hematological malignancies include leukemia, lymphoma and multiple myeloma. Among them, leukemia is a serious malignancy that starts in blood tissues, especially the bone marrow, where blood is made. Research shows that leukemia is one of the most common cancers in the world, so an emphasis on diagnostic techniques and the best treatments can provide better prognosis and survival for patients. In this paper, an automatic diagnosis recommender system for classifying leukemia based on cooperative game theory is presented. Throughout this research, we analyze flow cytometry data toward the classification of leukemia into eight classes. We work on a real data set covering different types of leukemia, collected at the Iran Blood Transfusion Organization (IBTO). The data set contains 400 samples taken from human leukemic bone marrow. This study deals with a cooperative game used for classification according to different weights assigned to the markers. The proposed method is versatile, as there are no constraints on what the input or output represent, which means it can be used to classify a population according to their contributions; in other words, it applies equally to other groups of data. The experimental results show a classification accuracy of 93.12%, compared with 90.16% for a decision tree (C4.5). The results demonstrate that the cooperative game approach is very promising for direct classification of leukemia as part of an active medical decision support system for interpretation of flow cytometry readouts. This system could assist clinical hematologists to properly recognize different kinds of leukemia by preparing suggestions, and this could improve the treatment of leukemic
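
    As a hedged illustration of the cooperative-game idea, the sketch below computes exact Shapley values for a handful of hypothetical "markers", taking a coalition's value to be the cross-validated accuracy of a small classifier restricted to those markers. The data, classifier and number of markers are assumptions, not the IBTO flow-cytometry setup.

```python
# Hedged illustration: exact Shapley values over a small set of "markers",
# with a coalition's value defined as the cross-validated accuracy of a
# classifier using only that marker subset. Data and classifier are invented.
from itertools import combinations
from math import factorial
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, n_features=4, n_informative=3,
                           n_redundant=0, random_state=0)
markers = list(range(X.shape[1]))

def value(coalition):
    """Characteristic function: accuracy achievable by a coalition of markers."""
    if not coalition:
        return 0.5                              # chance level for two balanced classes
    clf = DecisionTreeClassifier(max_depth=3, random_state=0)
    return cross_val_score(clf, X[:, list(coalition)], y, cv=3).mean()

n = len(markers)
shapley = {}
for i in markers:
    others = [m for m in markers if m != i]
    phi = 0.0
    for k in range(n):
        for S in combinations(others, k):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            phi += weight * (value(S + (i,)) - value(S))
    shapley[i] = phi

print("Shapley weights per marker:", {m: round(v, 3) for m, v in shapley.items()})
```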

  1. An overview of the phase-modular fault tree approach to phased mission system analysis

    NASA Technical Reports Server (NTRS)

    Meshkat, L.; Xing, L.; Donohue, S. K.; Ou, Y.

    2003-01-01

    In this paper, we look at how fault tree analysis (FTA), a primary means of performing reliability analysis of phased mission systems (PMS), can meet this challenge by presenting an overview of the phase-modular approach to solving fault trees that represent PMS.

  2. An improved methodology for land-cover classification using artificial neural networks and a decision tree classifier

    NASA Astrophysics Data System (ADS)

    Arellano-Neri, Olimpia

    Mapping is essential for the analysis of the land and land-cover dynamics, which influence many environmental processes and properties. When creating land-cover maps it is important to minimize error, since error will propagate into later analyses based upon these land cover maps. The reliability of land cover maps derived from remotely sensed data depends upon an accurate classification. For decades, traditional statistical methods have been applied in land-cover classification with varying degrees of accuracy. One of the most significant developments in the field of land-cover classification using remotely sensed data has been the introduction of Artificial Neural Networks (ANN) procedures. In this research, Artificial Neural Networks were applied to remotely sensed data of the southwestern Ohio region for land-cover classification. Three variants on traditional ANN-based classifiers were explored here: (1) the use of a customized architecture of the neural network in terms of the input layer for each land-cover class, (2) the use of texture analysis to combine spectral information and spatial information which is essential for urban classes, and (3) the use of decision tree (DT) classification to refine the ANN classification and ultimately to achieve a more reliable land-cover thematic map. The objective of this research was to prove that a classification based on Artificial Neural Networks (ANN) and decision tree (DT) would outperform by far the National Land Cover Data (NLCD). The NLCD is a land-cover classification produced by a cooperative effort between the United States Geological Survey (USGS) and the United States Environmental Protection Agency (USEPA). In order to achieve this objective, an accuracy assessment was conducted for both NLCD classification and ANN/DT classification. Error matrices resulting from the accuracy assessments provided overall accuracy, accuracy of each class, omission errors, and commission errors for each classification. The

  3. Weighing risk factors associated with bee colony collapse disorder by classification and regression tree analysis.

    PubMed

    VanEngelsdorp, Dennis; Speybroeck, Niko; Evans, Jay D; Nguyen, Bach Kim; Mullin, Chris; Frazier, Maryann; Frazier, Jim; Cox-Foster, Diana; Chen, Yanping; Tarpy, David R; Haubruge, Eric; Pettis, Jeffrey S; Saegerman, Claude

    2010-10-01

    Colony collapse disorder (CCD), a syndrome whose defining trait is the rapid loss of adult worker honey bees, Apis mellifera L., is thought to be responsible for a minority of the large overwintering losses experienced by U.S. beekeepers since the winter 2006-2007. Using the same data set developed to perform a monofactorial analysis (PloS ONE 4: e6481, 2009), we conducted a classification and regression tree (CART) analysis in an attempt to better understand the relative importance and interrelations among different risk variables in explaining CCD. Fifty-five exploratory variables were used to construct two CART models: one model with and one model without a cost of misclassifying a CCD-diagnosed colony as a non-CCD colony. The resulting model tree that permitted for misclassification had a sensitivity and specificity of 85 and 74%, respectively. Although factors measuring colony stress (e.g., adult bee physiological measures, such as fluctuating asymmetry or mass of head) were important discriminating values, six of the 19 variables having the greatest discriminatory value were pesticide levels in different hive matrices. Notably, coumaphos levels in brood (a miticide commonly used by beekeepers) had the highest discriminatory value and were highest in control (healthy) colonies. Our CART analysis provides evidence that CCD is probably the result of several factors acting in concert, making afflicted colonies more susceptible to disease. This analysis highlights several areas that warrant further attention, including the effect of sublethal pesticide exposure on pathogen prevalence and the role of variability in bee tolerance to pesticides on colony survivorship.
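
    A brief sketch of the cost-sensitive CART idea, on synthetic stand-in variables: misclassifying a CCD colony as healthy is penalized through class weights, and the fitted tree's variable importances give a rough ranking of discriminating factors. The variables, data and 5:1 cost ratio are assumptions.

```python
# Sketch of cost-sensitive CART: penalize misclassifying a CCD-positive colony
# as healthy via class weights, then inspect which variables discriminate.
# Variables and data here are synthetic stand-ins, not the authors' survey data.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix

rng = np.random.default_rng(5)
n = 600
coumaphos = rng.lognormal(1.0, 0.8, n)          # miticide level in brood (assumed)
fluct_asym = rng.normal(0.0, 1.0, n)            # adult-bee fluctuating asymmetry (assumed)
pathogen_load = rng.gamma(2.0, 1.0, n)
X = np.column_stack([coumaphos, fluct_asym, pathogen_load])
ccd = ((pathogen_load + fluct_asym - 0.3 * coumaphos + rng.normal(0, 1, n)) > 2.0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, ccd, test_size=0.3, random_state=0)
for weights in (None, {0: 1, 1: 5}):            # 5:1 cost of missing a CCD colony
    tree = DecisionTreeClassifier(max_depth=4, class_weight=weights, random_state=0)
    tree.fit(X_tr, y_tr)
    tn, fp, fn, tp = confusion_matrix(y_te, tree.predict(X_te)).ravel()
    sens, spec = tp / (tp + fn), tn / (tn + fp)
    print(f"class_weight={weights}: sensitivity={sens:.2f}, specificity={spec:.2f}")
    print("  importances (coumaphos, asymmetry, pathogen):", np.round(tree.feature_importances_, 2))
```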

  4. Aerial Images from AN Uav System: 3d Modeling and Tree Species Classification in a Park Area

    NASA Astrophysics Data System (ADS)

    Gini, R.; Passoni, D.; Pinto, L.; Sona, G.

    2012-07-01

    The use of aerial imagery acquired by Unmanned Aerial Vehicles (UAVs) is scheduled within the FoGLIE project (Fruition of Goods Landscape in Interactive Environment): it starts from the need to enhance the natural, artistic and cultural heritage, to produce a better usability of it by employing audiovisual movable systems of 3D reconstruction and to improve monitoring procedures, by using new media for integrating the fruition phase with the preservation ones. The pilot project focuses on a test area, Parco Adda Nord, which encloses various types of goods (small buildings, agricultural fields and different tree species and bushes). Multispectral high resolution images were taken by two digital compact cameras: a Pentax Optio A40 for RGB photos and a Sigma DP1 modified to acquire the NIR band. Then, some tests were performed in order to analyze the UAV images' quality for both photogrammetric and photo-interpretation purposes, to validate the vector-sensor system, the image block geometry and to study the feasibility of tree species classification. Many pre-signalized Control Points were surveyed through GPS to allow accuracy analysis. Aerial Triangulations (ATs) were carried out with photogrammetric commercial software, Leica Photogrammetry Suite (LPS) and PhotoModeler, with manual or automatic selection of Tie Points, to pick out pros and cons of each package in managing non-conventional aerial imagery as well as the differences in the modeling approach. Further analyses were done on the differences between the EO parameters and the corresponding data coming from the on-board UAV navigation system.

  5. Shape variability and classification of human hair: a worldwide approach.

    PubMed

    De la Mettrie, Roland; Saint-Léger, Didier; Loussouarn, Geneviève; Garcel, Annelise; Porter, Crystal; Langaney, André

    2007-06-01

    Human hair has been commonly classified according to three conventional ethnic human subgroups, that is, African, Asian, and European. Such broad classification hardly accounts for the high complexity of human biological diversity, resulting from both multiple and past or recent mixed origins. The research reported here is intended to develop a more factual and scientific approach based on physical features of human hair. The aim of the study is dual: (1) to define hair types according to specific shape criteria through objective and simple measurements taken on hairs from 1442 subjects from 18 different countries and (2) to define such hair types without referring to human ethnicity. The driving principle is simple: Because hair can be found in many different human subgroups, defining a straight or a curly hair should provide a more objective approach than a debatable ethnicity-based classification. The proposed method is simple to use and requires the measurement of only three easily accessible descriptors of hair shape: curve diameter (CD), curl index (i), and number of waves (w). This method leads to a worldwide coherent classification of hair in eight well-defined categories. The new hair categories, as described, should be more appropriate and more reliable than conventional standards in cosmetic and forensic sciences. Furthermore, the classification can be useful for testing whether hair shape diversity follows the continuous geographic and historical pattern suggested for human genetic variation or presents major discontinuities between some large human subdivisions, as claimed by earlier classical anthropology.

  6. Control of tree water networks: A geometric programming approach

    NASA Astrophysics Data System (ADS)

    Sela Perelman, L.; Amin, S.

    2015-10-01

    This paper presents a modeling and operation approach for tree water supply systems. The network control problem is approximated as a geometric programming (GP) problem. The original nonlinear nonconvex network control problem is transformed into a convex optimization problem. The optimization model can be efficiently solved to optimality using state-of-the-art solvers. Two control schemes are presented: (1) operation of network actuators (pumps and valves) and (2) controlled demand shedding allocation between network consumers with limited resources. The dual of the network control problem is formulated and is used to perform sensitivity analysis with respect to hydraulic constraints. The approach is demonstrated on a small branched-topology network and later extended to a medium-size irrigation network. The results demonstrate an intrinsic trade-off between energy costs and demand shedding policy, providing an efficient decision support tool for active management of water systems.
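
    A toy geometric program in the spirit of the abstract, written in CVXPY's log-log convex (DGP) form: minimize pumping energy for a single pipe subject to a demand constraint and a head requirement. The model, coefficients and constraints are invented for illustration; the paper treats full tree-topology networks and demand shedding.

```python
# Toy geometric program: monomial objective (energy ~ flow * head), with the
# demand and head-loss requirements posed as valid GP constraints. Invented
# single-pipe model, not the paper's network formulation.
import cvxpy as cp

q = cp.Variable(pos=True)   # flow delivered to the consumer
h = cp.Variable(pos=True)   # pump head

demand = 10.0               # required flow (assumed units)
h_static = 5.0              # static lift
k = 0.02                    # friction loss coefficient
c = 1.0                     # energy cost per unit of (q * h)

power = c * q * h                                   # monomial objective
constraints = [
    q >= demand,                                    # meet consumer demand
    h >= h_static + k * q ** 2,                     # head covers static lift plus friction loss
]
prob = cp.Problem(cp.Minimize(power), constraints)
prob.solve(gp=True)                                 # solved as a geometric program
print("optimal flow:", round(float(q.value), 3),
      "head:", round(float(h.value), 3),
      "energy cost:", round(float(prob.value), 3))
```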

  7. A conceptual approach to approximate tree root architecture in infinite slope models

    NASA Astrophysics Data System (ADS)

    Schmaltz, Elmar; Glade, Thomas

    2016-04-01

    paraboloids represent a cordate-root-system with radius r, height h and a constant, species-independent curvature. This procedure simplifies the classification of tree species into the three defined geometric solids. In this study we introduce a conceptual approach to estimate the 2- and 3-dimensional distribution of different tree root systems, and to implement it in a raster environment, as it is used in infinite slope models. Hereto we used the PCRaster extension in a python framework. The results show that root distribution and root growth are spatially reproducible in a simple raster framework. The outputs exhibit significant effects for a synthetically generated slope on local scale for equal time-steps. The preliminary results depict an initial step to develop a vegetation module that can be coupled with hydro-mechanical slope stability models. This approach is expected to yield a valuable contribution to the implementation of vegetation-related properties, in particular effects of root-reinforcement, into physically-based approaches using infinite slope models.

  8. ADHD classification using bag of words approach on network features

    NASA Astrophysics Data System (ADS)

    Solmaz, Berkan; Dey, Soumyabrata; Rao, A. Ravishankar; Shah, Mubarak

    2012-02-01

    Attention Deficit Hyperactivity Disorder (ADHD) is receiving considerable attention mainly because it is one of the most common brain disorders among children and little is known about its cause. In this study, we propose a novel approach for automatic classification of ADHD-conditioned subjects and control subjects using functional Magnetic Resonance Imaging (fMRI) data of resting state brains. For this purpose, we compute the correlation between every possible voxel pair within a subject over the time frame of the experimental protocol. A network of voxels is constructed by representing a high correlation value between any two voxels as an edge. A Bag-of-Words (BoW) approach is used to represent each subject as a histogram of network features, such as the degree of each voxel. The classification is done using a Support Vector Machine (SVM). We also investigate the use of raw intensity values in the time series for each voxel. Here, every subject is represented as a combined histogram of network and raw intensity features. Experimental results verified that the classification accuracy improves when the combined histogram is used. We tested our approach on a highly challenging dataset released by NITRC for the ADHD-200 competition and obtained promising results. The dataset not only has a large size but also includes subjects from different demographic and age groups. To the best of our knowledge, this is the first paper to propose a BoW approach for any functional brain disorder classification, and we believe that this approach will be useful in the analysis of many brain-related conditions.
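
    A hedged sketch of the bag-of-words pipeline on synthetic data: voxel time series are correlated, correlations above a threshold become edges, the per-voxel degree distribution is histogrammed as the subject descriptor, and an SVM is cross-validated. The sizes, threshold and injected group signal are assumptions, not the ADHD-200 processing chain.

```python
# Hedged sketch of the bag-of-words pipeline: correlate voxel time series,
# threshold high correlations into edges, histogram per-voxel degree as the
# subject descriptor, and classify with an SVM. All sizes are assumptions.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(9)
n_subjects, n_voxels, n_timepoints = 40, 120, 150
labels = rng.integers(0, 2, n_subjects)          # 0 = control, 1 = ADHD (synthetic)

def degree_histogram(ts, threshold=0.5, bins=10):
    corr = np.corrcoef(ts)                       # voxel-by-voxel correlation matrix
    np.fill_diagonal(corr, 0.0)
    degrees = (corr > threshold).sum(axis=1)     # edges = correlations above the threshold
    hist, _ = np.histogram(degrees, bins=bins, range=(0, n_voxels))
    return hist / hist.sum()                     # normalized bag-of-words histogram

features = []
for label in labels:
    ts = rng.normal(size=(n_voxels, n_timepoints))
    if label == 1:                               # inject a weak shared signal for the ADHD group
        ts[: n_voxels // 2] += 0.4 * rng.normal(size=(1, n_timepoints))
    features.append(degree_histogram(ts))

acc = cross_val_score(SVC(kernel="linear"), np.array(features), labels, cv=5).mean()
print("cross-validated accuracy:", round(acc, 3))
```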

  9. Unified framework for triaxial accelerometer-based fall event detection and classification using cumulants and hierarchical decision tree classifier

    PubMed Central

    Kambhampati, Satya Samyukta; Singh, Vishal; Ramkumar, Barathram

    2015-01-01

    In this Letter, the authors present a unified framework for fall event detection and classification using the cumulants extracted from the acceleration (ACC) signals acquired using a single waist-mounted triaxial accelerometer. The main objective of this Letter is to find suitable representative cumulants and classifiers in effectively detecting and classifying different types of fall and non-fall events. It was discovered that the first level of the proposed hierarchical decision tree algorithm implements fall detection using fifth-order cumulants and support vector machine (SVM) classifier. In the second level, the fall event classification algorithm uses the fifth-order cumulants and SVM. Finally, human activity classification is performed using the second-order cumulants and SVM. The detection and classification results are compared with those of the decision tree, naive Bayes, multilayer perceptron and SVM classifiers with different types of time-domain features including the second-, third-, fourth- and fifth-order cumulants and the signal magnitude vector and signal magnitude area. The experimental results demonstrate that the second- and fifth-order cumulant features and SVM classifier can achieve optimal detection and classification rates of above 95%, as well as the lowest false alarm rate of 1.03%. PMID:26609414
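
    A rough sketch of the feature side of this pipeline, on simulated acceleration windows: second- to fifth-order cumulants are computed from central moments using the standard relations k2 = m2, k3 = m3, k4 = m4 - 3*m2^2 and k5 = m5 - 10*m3*m2, and an SVM performs the first (fall versus non-fall) level of the hierarchy. The signals and thresholds are invented.

```python
# Rough sketch: second- to fifth-order cumulant features from windowed
# acceleration magnitudes, fed to an SVM for the fall / non-fall level of a
# hierarchical classifier. Signals are simulated, not accelerometer recordings.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def cumulant_features(x):
    xc = x - x.mean()
    m2, m3, m4, m5 = [np.mean(xc ** k) for k in range(2, 6)]   # central moments m2..m5
    return np.array([m2, m3, m4 - 3 * m2 ** 2, m5 - 10 * m3 * m2])

rng = np.random.default_rng(13)
windows, labels = [], []
for _ in range(200):
    is_fall = rng.random() < 0.5
    sig = rng.normal(1.0, 0.05, 256)                   # ~1 g during normal activity
    if is_fall:
        sig[100:110] += rng.uniform(2.0, 4.0)          # short high-impact spike
    windows.append(cumulant_features(sig))
    labels.append(int(is_fall))

X, y = np.array(windows), np.array(labels)
acc = cross_val_score(SVC(kernel="rbf", gamma="scale"), X, y, cv=5).mean()
print("level-1 fall detection accuracy:", round(acc, 3))
# A second level would classify fall type (e.g. forward/backward/sideways)
# using the same cumulant features on the fall windows only.
```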

  10. Local fractal dimension based approaches for colonic polyp classification.

    PubMed

    Häfner, Michael; Tamaki, Toru; Tanaka, Shinji; Uhl, Andreas; Wimmer, Georg; Yoshida, Shigeto

    2015-12-01

    This work introduces texture analysis methods that are based on computing the local fractal dimension (LFD; also called the local density function) and applies them to colonic polyp classification. The methods are tested on 8 HD-endoscopic image databases, where each database is acquired using different imaging modalities (Pentax's i-Scan technology combined with or without staining the mucosa) and on a zoom-endoscopic image database using narrow band imaging. In this paper, we present three novel extensions to an LFD-based approach. These extensions additionally extract shape and/or gradient information of the image to enhance the discriminativity of the original approach. To compare the results of the LFD-based approaches with the results of other approaches, five state-of-the-art approaches for colonic polyp classification are applied to the employed databases. Experiments show that LFD-based approaches are well suited for colonic polyp classification, especially the three proposed extensions. The three proposed extensions are the best performing methods or at least among the best performing methods for each of the employed databases. The methods are additionally tested by means of a public texture image database, the UIUCtex database. With this database, the viewpoint invariance of the methods is assessed, an important feature for the employed endoscopic image databases. Results imply that most of the LFD-based methods are more viewpoint invariant than the other methods. However, the shape, size and orientation adapted LFD approaches (which are especially designed to enhance the viewpoint invariance) are in general not more viewpoint invariant than the other LFD-based approaches.
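
    A hedged sketch of a local-density-function style local fractal dimension: for each pixel, neighbours within radius r whose intensity is close to the centre value are counted, and the slope of log(count) against log(r) is taken as the LFD. The radii, intensity tolerance and toy image are assumptions, not the paper's exact estimator.

```python
# Hedged sketch of a local-density-function style LFD map: count similar-
# intensity neighbours at growing radii and fit the log-log slope per pixel.
import numpy as np

def local_fractal_dimension(img, radii=(1, 2, 3, 4), tol=16):
    h, w = img.shape
    rmax = max(radii)
    padded = np.pad(img.astype(float), rmax, mode="reflect")
    lfd = np.zeros((h, w), dtype=float)
    log_r = np.log(radii)
    for y in range(h):
        for x in range(w):
            centre = padded[y + rmax, x + rmax]
            counts = []
            for r in radii:
                patch = padded[y + rmax - r:y + rmax + r + 1,
                               x + rmax - r:x + rmax + r + 1]
                counts.append(max((np.abs(patch - centre) <= tol).sum(), 1))
            # slope of log(count) versus log(radius) ~ local fractal dimension
            lfd[y, x] = np.polyfit(log_r, np.log(counts), 1)[0]
    return lfd

rng = np.random.default_rng(17)
img = rng.integers(0, 256, size=(32, 32))
lfd_map = local_fractal_dimension(img)
print("LFD range:", round(lfd_map.min(), 2), "to", round(lfd_map.max(), 2))
# Histograms of such LFD maps (per image patch) would form the texture
# descriptor used for classification.
```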

  11. Use of Binary Partition Tree and energy minimization for object-based classification of urban land cover

    NASA Astrophysics Data System (ADS)

    Li, Mengmeng; Bijker, Wietske; Stein, Alfred

    2015-04-01

    Two main challenges are faced when classifying urban land cover from very high resolution satellite images: obtaining an optimal image segmentation and distinguishing buildings from other man-made objects. For optimal segmentation, this work proposes a hierarchical representation of an image by means of a Binary Partition Tree (BPT) and an unsupervised evaluation of image segmentations by energy minimization. For building extraction, we apply fuzzy sets to create a fuzzy landscape of shadows, which in turn involves a two-step procedure. The first step is a preliminary image classification at a fine segmentation level to generate vegetation and shadow information. The second step models the directional relationship between building and shadow objects to extract building information at the optimal segmentation level. We conducted the experiments on two datasets of Pléiades images from Wuhan City, China. To demonstrate its performance, the proposed classification is compared at the optimal segmentation level with Maximum Likelihood Classification and Support Vector Machine classification. The results show that the proposed classification produced the highest overall accuracies and kappa coefficients, and the smallest over-classification and under-classification geometric errors. We conclude first that integrating BPT with energy minimization offers an effective means for image segmentation. Second, we conclude that the directional relationship between building and shadow objects represented by a fuzzy landscape is important for building extraction.

  12. Simulating California reservoir operation using the classification and regression-tree algorithm combined with a shuffled cross-validation scheme

    NASA Astrophysics Data System (ADS)

    Yang, Tiantian; Gao, Xiaogang; Sorooshian, Soroosh; Li, Xin

    2016-03-01

    The controlled outflows from a reservoir or dam are highly dependent on the decisions made by the reservoir operators, rather than on a natural hydrological process. Differences exist between the natural upstream inflows to reservoirs and the controlled outflows from reservoirs that supply the downstream users. With the decision maker's awareness of changing climate, reservoir management requires adaptable means to incorporate more information into decision making, such as water delivery requirements, environmental constraints, dry/wet conditions, etc. In this paper, a robust reservoir outflow simulation model is presented, which incorporates one of the well-developed data-mining models (Classification and Regression Tree) to predict the complicated human-controlled reservoir outflows and extract the reservoir operation patterns. A shuffled cross-validation approach is further implemented to improve CART's predictive performance. An application study of nine major reservoirs in California is carried out. Results produced by the enhanced CART, original CART, and random forest are compared with observations. The statistical measurements show that the enhanced CART and random forest outperform the CART control run in general, and the enhanced CART algorithm gives better predictive performance than random forest in simulating the peak flows. The results also show that the proposed model is able to consistently and reasonably predict the expert release decisions. Experiments indicate that the release operation at Oroville Lake is significantly dominated by the SWP allocation amount, and reservoirs with low elevation are more sensitive to inflow amount than others.

  13. Study and Ranking of Determinants of Taenia solium Infections by Classification Tree Models

    PubMed Central

    Mwape, Kabemba E.; Phiri, Isaac K.; Praet, Nicolas; Dorny, Pierre; Muma, John B.; Zulu, Gideon; Speybroeck, Niko; Gabriël, Sarah

    2015-01-01

    Taenia solium taeniasis/cysticercosis is an important public health problem occurring mainly in developing countries. This work aimed to study the determinants of human T. solium infections in the Eastern province of Zambia and rank them in order of importance. A household (HH)-level questionnaire was administered to 680 HHs from 53 villages in two rural districts and the taeniasis and cysticercosis status determined. A classification tree model (CART) was used to define the relative importance and interactions between different predictor variables in their effect on taeniasis and cysticercosis. The Katete study area had a significantly higher taeniasis and cysticercosis prevalence than the Petauke area. The CART analysis for Katete showed that the most important determinant for cysticercosis infections was the number of HH inhabitants (6 to 10) and for taeniasis was the number of HH inhabitants > 6. The most important determinant in Petauke for cysticercosis was the age of head of household > 32 years and for taeniasis it was age < 55 years. The CART analysis showed that the most important determinant for both taeniasis and cysticercosis infections was the number of HH inhabitants (6 to 10) in Katete district and age in Petauke. The results suggest that control measures should target HHs with a high number of inhabitants and older individuals. PMID:25404073

  14. Study and ranking of determinants of Taenia solium infections by classification tree models.

    PubMed

    Mwape, Kabemba E; Phiri, Isaac K; Praet, Nicolas; Dorny, Pierre; Muma, John B; Zulu, Gideon; Speybroeck, Niko; Gabriël, Sarah

    2015-01-01

    Taenia solium taeniasis/cysticercosis is an important public health problem occurring mainly in developing countries. This work aimed to study the determinants of human T. solium infections in the Eastern province of Zambia and rank them in order of importance. A household (HH)-level questionnaire was administered to 680 HHs from 53 villages in two rural districts and the taeniasis and cysticercosis status determined. A classification tree model (CART) was used to define the relative importance and interactions between different predictor variables in their effect on taeniasis and cysticercosis. The Katete study area had a significantly higher taeniasis and cysticercosis prevalence than the Petauke area. The CART analysis for Katete showed that the most important determinant for cysticercosis infections was the number of HH inhabitants (6 to 10) and for taeniasis was the number of HH inhabitants > 6. The most important determinant in Petauke for cysticercosis was the age of head of household > 32 years and for taeniasis it was age < 55 years. The CART analysis showed that the most important determinant for both taeniasis and cysticercosis infections was the number of HH inhabitants (6 to 10) in Katete district and age in Petauke. The results suggest that control measures should target HHs with a high number of inhabitants and older individuals. PMID:25404073

  16. Prediction of cadmium enrichment in reclaimed coastal soils by classification and regression tree

    NASA Astrophysics Data System (ADS)

    Ru, Feng; Yin, Aijing; Jin, Jiaxin; Zhang, Xiuying; Yang, Xiaohui; Zhang, Ming; Gao, Chao

    2016-08-01

    Reclamation of coastal land is one of the most common ways to obtain land resources in China. However, it has long been acknowledged that artificial interference with coastal land has adverse effects, such as heavy metal contamination. This study aimed to develop a prediction model for cadmium enrichment levels and assess the importance of affecting factors in typical reclaimed land in Eastern China (DFCL: Dafeng Coastal Land). Two hundred and twenty-seven surficial soil/sediment samples were collected and analyzed to identify the enrichment levels of cadmium and the possible affecting factors in soils and sediments. The classification and regression tree (CART) model was applied in this study to predict cadmium enrichment levels. The prediction results showed that the CART model assessed cadmium enrichment levels with an accuracy of 78.0%. The CART model could extract more information on factors affecting the environmental behavior of cadmium than correlation analysis. The integration of correlation analysis and the CART model showed that fertilizer application and organic carbon accumulation were the most important factors affecting soil/sediment cadmium enrichment levels, followed by particle size effects (Al2O3, TFe2O3 and SiO2), contents of Cl and S, surrounding construction areas and reclamation history.

  17. Multicenter study on caries risk assessment in adults using survival Classification and Regression Trees.

    PubMed

    Arino, Masumi; Ito, Ataru; Fujiki, Shozo; Sugiyama, Seiichi; Hayashi, Mikako

    2016-01-01

    Dental caries is an important public health problem worldwide. This study aims to prove how preventive therapies reduce the onset of caries in adult patients, and to identify patients with high or low risk of caries by using Classification and Regression Tree-based survival analysis (survival CART). A clinical data set of 732 patients aged 20 to 64 years in nine Japanese general practices was analyzed with the following parameters: age, DMFT, number of mutans streptococci (SM) and Lactobacilli (LB), secretion rate and buffer capacity of saliva, and compliance with a preventive program. Results showed the incidence of primary carious lesions was affected by SM, LB and compliance with a preventive program; secondary carious lesions were affected by DMFT, SM and LB. Survival CART identified high-risk patients for primary carious lesions according to their poor compliance with a preventive program and SM (≥10^6 CFU/ml), with a hazard ratio of 3.66 (p = 0.0002). In the case of secondary caries, patients with LB (≥10^5 CFU/ml) and DMFT (>15) were identified as high risk, with a hazard ratio of 3.50 (p < 0.0001). We conclude that preventive programs can be effective in limiting the incidence of primary carious lesions. PMID:27381750

  18. Multicenter study on caries risk assessment in adults using survival Classification and Regression Trees

    PubMed Central

    Arino, Masumi; Ito, Ataru; Fujiki, Shozo; Sugiyama, Seiichi; Hayashi, Mikako

    2016-01-01

    Dental caries is an important public health problem worldwide. This study aims to prove how preventive therapies reduce the onset of caries in adult patients, and to identify patients with high or low risk of caries by using Classification and Regression Tree-based survival analysis (survival CART). A clinical data set of 732 patients aged 20 to 64 years in nine Japanese general practices was analyzed with the following parameters: age, DMFT, number of mutans streptococci (SM) and Lactobacilli (LB), secretion rate and buffer capacity of saliva, and compliance with a preventive program. Results showed the incidence of primary carious lesions was affected by SM, LB and compliance with a preventive program; secondary carious lesions were affected by DMFT, SM and LB. Survival CART identified high-risk patients for primary carious lesions according to their poor compliance with a preventive program and SM (≥10^6 CFU/ml), with a hazard ratio of 3.66 (p = 0.0002). In the case of secondary caries, patients with LB (≥10^5 CFU/ml) and DMFT (>15) were identified as high risk, with a hazard ratio of 3.50 (p < 0.0001). We conclude that preventive programs can be effective in limiting the incidence of primary carious lesions. PMID:27381750
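    The survival CART itself is not sketched here; as a hedged illustration of how a hazard ratio for a tree-defined risk group might be quantified, the snippet below fits a Cox proportional hazards model (lifelines) to synthetic follow-up data. The grouping variable, follow-up times, and censoring rule are all invented placeholders, and the Cox model is a named stand-in, not the study's method.

    ```python
    # Hedged sketch: quantifying the hazard ratio of a binary "high-risk" group
    # with a Cox model. This is a stand-in illustration, not the survival CART
    # used in the study; all data below are synthetic placeholders.
    import numpy as np
    import pandas as pd
    from lifelines import CoxPHFitter

    rng = np.random.default_rng(2)
    n = 732
    high_risk = rng.integers(0, 2, n)     # e.g., poor compliance and high SM counts
    baseline = rng.exponential(60.0, n)   # toy caries-free time in months
    time = np.where(high_risk == 1, baseline / 3.0, baseline)
    event = (time < 48.0).astype(int)     # onset observed within follow-up
    time = np.minimum(time, 48.0)         # administrative censoring at 48 months

    df = pd.DataFrame({"time": time, "event": event, "high_risk": high_risk})
    cph = CoxPHFitter().fit(df, duration_col="time", event_col="event")
    cph.print_summary()                   # includes exp(coef), i.e. the hazard ratio
    ```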

  19. Non-Destructive Classification Approaches for Equilibrated Ordinary Chondrites

    NASA Technical Reports Server (NTRS)

    Righter, K.; Harrington, R.; Schroeder, C.; Morris, R. V.

    2013-01-01

    Classification of meteorites is most effectively carried out by petrographic and mineralogic studies of thin sections, but a rapid and accurate classification technique for the many samples collected in dense collection areas (hot and cold deserts) is of great interest. Oil immersion techniques have been used to classify a large proportion of the US Antarctic meteorite collections since the mid-1980s [1]. This approach has allowed rapid characterization of thousands of samples over time, but nonetheless utilizes a piece of the sample that has been ground to grains or a powder. In order to compare a few non-destructive techniques with the standard approaches, we have characterized a group of chondrites from the Larkman Nunatak region using magnetic susceptibility and Moessbauer spectroscopy.

  20. Nearest feature line embedding approach to hyperspectral image classification

    NASA Astrophysics Data System (ADS)

    Chang, Yang-Lang; Liu, Jin-Nan; Han, Chin-Chuan; Chen, Ying-Nong; Hsieh, Tung-Ju; Huang, Bormin

    2012-10-01

    In this paper, a nearest feature line (NFL) embedding transformation is proposed for dimension reduction of hyperspectral images (HSI). Eigenspace projection approaches are generally used for feature extraction of HSI in remote sensing image classification. To improve classification accuracy, high-dimensional feature vectors are reduced to lower dimensionality by an effective projection transformation. In the proposed method, the NFL measurement is embedded into the transformation during the discriminant analysis stage instead of the matching stage. Class separability, neighborhood structure preservation, and the NFL measurement are considered simultaneously to find an effective and discriminating transformation in eigenspace for image classification. The nearest neighbor classifier is used to show the discriminative performance. The proposed NFL embedding transformation is compared with several conventional state-of-the-art algorithms and was evaluated on the AVIRIS data sets of Northwest Tippecanoe County. Experimental results demonstrate that the NFL embedding method is an effective transformation for dimension reduction in land cover classification of earth remote sensing.
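    The embedding transformation itself is not reproduced, but the nearest-feature-line measurement at its core, the distance from a query vector to the line through two same-class feature points, is easy to state; the small numpy sketch below is a generic illustration under that definition, with made-up toy spectra.

    ```python
    # Hedged sketch of the nearest-feature-line (NFL) distance: the distance from
    # a query vector to the line spanned by two same-class feature points.
    # Generic illustration only; the vectors below are made-up toy "spectra".
    import numpy as np

    def nfl_distance(x, xi, xj):
        """Distance from x to the feature line passing through xi and xj."""
        d = xj - xi
        t = np.dot(x - xi, d) / np.dot(d, d)  # position of the projection along the line
        proj = xi + t * d                     # projection point (may extrapolate beyond xi, xj)
        return np.linalg.norm(x - proj)

    x  = np.array([0.30, 0.55, 0.20])  # query pixel feature vector
    xi = np.array([0.25, 0.50, 0.15])  # class prototype 1
    xj = np.array([0.40, 0.70, 0.30])  # class prototype 2
    print(nfl_distance(x, xi, xj))
    ```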

  1. A hybrid ensemble learning approach to star-galaxy classification

    NASA Astrophysics Data System (ADS)

    Kim, Edward J.; Brunner, Robert J.; Carrasco Kind, Matias

    2015-10-01

    There exist a variety of star-galaxy classification techniques, each with their own strengths and weaknesses. In this paper, we present a novel meta-classification framework that combines and fully exploits different techniques to produce a more robust star-galaxy classification. To demonstrate this hybrid, ensemble approach, we combine a purely morphological classifier, a supervised machine learning method based on random forest, an unsupervised machine learning method based on self-organizing maps, and a hierarchical Bayesian template-fitting method. Using data from the CFHTLenS survey (Canada-France-Hawaii Telescope Lensing Survey), we consider different scenarios: when a high-quality training set is available with spectroscopic labels from DEEP2 (Deep Extragalactic Evolutionary Probe Phase 2 ), SDSS (Sloan Digital Sky Survey), VIPERS (VIMOS Public Extragalactic Redshift Survey), and VVDS (VIMOS VLT Deep Survey), and when the demographics of sources in a low-quality training set do not match the demographics of objects in the test data set. We demonstrate that our Bayesian combination technique improves the overall performance over any individual classification method in these scenarios. Thus, strategies that combine the predictions of different classifiers may prove to be optimal in currently ongoing and forthcoming photometric surveys, such as the Dark Energy Survey and the Large Synoptic Survey Telescope.
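    The Bayesian combination used in the paper is not reproduced here; as a hedged, much simpler illustration of the meta-classification idea, the sketch below averages the predicted probabilities of three heterogeneous classifiers with scikit-learn's soft voting. The stand-in classifiers and synthetic features are assumptions of the sketch.

    ```python
    # Hedged sketch: combining heterogeneous classifiers by soft voting (averaged
    # probabilities). A simplification of the paper's Bayesian combination; the
    # estimators and synthetic "star/galaxy" features are placeholders.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier, VotingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.naive_bayes import GaussianNB

    X, y = make_classification(n_samples=2000, n_features=8, random_state=3)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=3)

    ensemble = VotingClassifier(
        estimators=[
            ("morph", LogisticRegression(max_iter=1000)),   # stand-in for a morphological cut
            ("rf", RandomForestClassifier(random_state=3)),
            ("bayes", GaussianNB()),                        # stand-in for a template-fitting method
        ],
        voting="soft",
    )
    print("combined accuracy:", ensemble.fit(X_tr, y_tr).score(X_te, y_te))
    ```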

  2. Improved wetland remote sensing in Yellowstone National Park using classification trees to combine TM imagery and ancillary environmental data

    USGS Publications Warehouse

    Wright, C.; Gallant, A.

    2007-01-01

    The U.S. Fish and Wildlife Service uses the term palustrine wetland to describe vegetated wetlands traditionally identified as marsh, bog, fen, swamp, or wet meadow. Landsat TM imagery was combined with image texture and ancillary environmental data to model probabilities of palustrine wetland occurrence in Yellowstone National Park using classification trees. Model training and test locations were identified from National Wetlands Inventory maps, and classification trees were built for seven years spanning a range of annual precipitation. At a coarse level, palustrine wetland was separated from upland. At a finer level, five palustrine wetland types were discriminated: aquatic bed (PAB), emergent (PEM), forested (PFO), scrub-shrub (PSS), and unconsolidated shore (PUS). TM-derived variables alone were relatively accurate at separating wetland from upland, but model error rates dropped incrementally as image texture, DEM-derived terrain variables, and other ancillary GIS layers were added. For classification trees making use of all available predictors, average overall test error rates were 7.8% for palustrine wetland/upland models and 17.0% for palustrine wetland type models, with consistent accuracies across years. However, models were prone to wetland over-prediction. While the predominant PEM class was classified with omission and commission error rates less than 14%, we had difficulty identifying the PAB and PSS classes. Ancillary vegetation information greatly improved PSS classification and moderately improved PFO discrimination. Association with geothermal areas distinguished PUS wetlands. Wetland over-prediction was exacerbated by class imbalance in likely combination with spatial and spectral limitations of the TM sensor. Wetland probability surfaces may be more informative than hard classification, and appear to respond to climate-driven wetland variability. The developed method is portable, relatively easy to implement, and should be applicable in other

  3. Trees

    ERIC Educational Resources Information Center

    Al-Khaja, Nawal

    2007-01-01

    This is a thematic lesson plan for young learners about palm trees and the importance of taking care of them. The two part lesson teaches listening, reading and speaking skills. The lesson includes parts of a tree; the modal auxiliary, can; dialogues and a role play activity.

  4. Land cover and forest formation distributions for St. Kitts, Nevis, St. Eustatius, Grenada and Barbados from decision tree classification of cloud-cleared satellite imagery

    USGS Publications Warehouse

    Helmer, E.H.; Kennaway, T.A.; Pedreros, D.H.; Clark, M.L.; Marcano-Vega, H.; Tieszen, L.L.; Ruzycki, T.R.; Schill, S.R.; Carrington, C.M.S.

    2008-01-01

    Satellite image-based mapping of tropical forests is vital to conservation planning. Standard methods for automated image classification, however, limit classification detail in complex tropical landscapes. In this study, we test an approach to Landsat image interpretation on four islands of the Lesser Antilles, including Grenada and St. Kitts, Nevis and St. Eustatius, testing a more detailed classification than earlier work in the latter three islands. Secondly, we estimate the extents of land cover and protected forest by formation for five islands and ask how land cover has changed over the second half of the 20th century. The image interpretation approach combines image mosaics and ancillary geographic data, classifying the resulting set of raster data with decision tree software. Cloud-free image mosaics for one or two seasons were created by applying regression tree normalization to scene dates that could fill cloudy areas in a base scene. Such mosaics are also known as cloud-filled, cloud-minimized or cloud-cleared imagery, mosaics, or composites. The approach accurately distinguished several classes that more standard methods would confuse; the seamless mosaics aided reference data collection; and the multiseason imagery allowed us to separate drought deciduous forests and woodlands from semi-deciduous ones. Cultivated land areas declined 60 to 100 percent from about 1945 to 2000 on several islands. Meanwhile, forest cover has increased 50 to 950%. This trend will likely continue where sugar cane cultivation has dominated. Like the island of Puerto Rico, most higher-elevation forest formations are protected in formal or informal reserves. Also similarly, lowland forests, which are drier forest types on these islands, are not well represented in reserves. Former cultivated lands in lowland areas could provide lands for new reserves of drier forest types. The land-use history of these islands may provide insight for planners in countries currently considering

  5. New Approach for Segmentation and Extraction of Single Tree from Point Clouds Data and Aerial Images

    NASA Astrophysics Data System (ADS)

    Homainejad, A. S.

    2016-06-01

    This paper addresses a new approach for reconstructing 3D models of single trees from Airborne Laser Scanner (ALS) data and aerial images. The approach detects and extracts single trees from ALS data and aerial images. Existing approaches are able to provide bulk segmentation of a group of trees, and some methods have focused on detection and extraction of a particular tree from ALS data and images; however, segmentation of a single tree within a group of trees is generally infeasible because the boundary lines between trees are difficult to detect. In this approach, an experimental formula based on tree height was developed and applied to define the boundary lines between trees. As a result, each single tree was segmented and extracted, and a 3D model was then created. Each extracted tree has a unique identifier and attributes. The output has applications in various fields of science and engineering such as forestry, urban planning, and agriculture. In forestry, for example, the result can be used for studies of ecological diversity, biodiversity, and ecosystems.

  6. Using PPI network autocorrelation in hierarchical multi-label classification trees for gene function prediction

    PubMed Central

    2013-01-01

    Background: Ontologies and catalogs of gene functions, such as the Gene Ontology (GO) and MIPS-FUN, assume that functional classes are organized hierarchically, that is, general functions include more specific ones. This has recently motivated the development of several machine learning algorithms for gene function prediction that leverage this hierarchical organization where instances may belong to multiple classes. In addition, it is possible to exploit relationships among examples, since it is plausible that related genes tend to share functional annotations. Although these relationships have been identified and extensively studied in the area of protein-protein interaction (PPI) networks, they have not received much attention in hierarchical and multi-class gene function prediction. Relations between genes introduce autocorrelation in functional annotations and violate the assumption that instances are independently and identically distributed (i.i.d.), which underlies most machine learning algorithms. Although the explicit consideration of these relations brings additional complexity to the learning process, we expect substantial benefits in predictive accuracy of learned classifiers. Results: This article demonstrates the benefits (in terms of predictive accuracy) of considering autocorrelation in multi-class gene function prediction. We develop a tree-based algorithm for considering network autocorrelation in the setting of Hierarchical Multi-label Classification (HMC). We empirically evaluate the proposed algorithm, called NHMC (Network Hierarchical Multi-label Classification), on 12 yeast datasets using each of the MIPS-FUN and GO annotation schemes and exploiting 2 different PPI networks. The results clearly show that taking autocorrelation into account improves the predictive performance of the learned models for predicting gene function. Conclusions: Our newly developed method for HMC takes into account network information in the learning phase: When

  7. A Distributed Artificial Intelligence Approach To Object Identification And Classification

    NASA Astrophysics Data System (ADS)

    Sikka, Digvijay I.; Varshney, Pramod K.; Vannicola, Vincent C.

    1989-09-01

    This paper presents an application of Distributed Artificial Intelligence (DAI) tools to the data fusion and classification problem. Our approach is to use a blackboard for information management and hypotheses formulation. The blackboard is used by the knowledge sources (KSs) for sharing information and posting their hypotheses on, just as experts sitting around a round table would do. The present simulation performs classification of an aircraft (AC), after identifying it by its features, into disjoint sets (object classes) comprising the five commercial ACs: Boeing 747, Boeing 707, DC10, Concord and Boeing 727. A situation data base is characterized by experimental data available from the three levels of expert reasoning. The Ohio State University ElectroScience Laboratory provided this experimental data. To validate the architecture presented, we employ two KSs for modeling the sensors, the aspect angle polarization feature and the ellipticity data. The system has been implemented on a Symbolics 3645, under Genera 7.1, in Common LISP.

  8. Availability and Capacity of Substance Abuse Programs in Correctional Settings: A Classification and Regression Tree Analysis

    PubMed Central

    Kitsantas, Panagiota

    2009-01-01

    Objective to be addressed: The purpose of this study was to investigate the structural and organizational factors that contribute to the availability and increased capacity for substance abuse treatment programs in correctional settings. We used Classification and Regression Tree statistical procedures to identify how multi-level data can explain the variability in availability and capacity of substance abuse treatment programs in jails and probation/parole offices. Methods: The data for this study combined the National Criminal Justice Treatment Practices survey (NCJTP) and the 2000 Census. The NCJTP survey was a nationally representative sample of correctional administrators for jails and probation/parole agencies. The sample size included 295 substance abuse treatment programs that were classified according to the intensity of their services: high, medium, and low. The independent variables included jurisdictional-level structural variables, attributes of the correctional administrators, and program and service delivery characteristics of the correctional agency. Results: The two most important variables in predicting the availability of all three types of services were stronger working relationships with other organizations and the adoption of a standardized substance abuse screening tool by correctional agencies. For high and medium intensive programs, the capacity increased when an organizational learning strategy was used by administrators and the organization used a substance abuse screening tool. Implications on advancing treatment practices in correctional settings are discussed, including further work to test theories on how to better understand access to intensive treatment services. This study presents the first phase of understanding capacity-related issues regarding treatment programs offered in correctional settings. PMID:19395204

  9. Assembling the fungal tree of life: progress, classification, and evolution of subcellular traits.

    PubMed

    Lutzoni, François; Kauff, Frank; Cox, Cymon J; McLaughlin, David; Celio, Gail; Dentinger, Bryn; Padamsee, Mahajabeen; Hibbett, David; James, Timothy Y; Baloch, Elisabeth; Grube, Martin; Reeb, Valérie; Hofstetter, Valérie; Schoch, Conrad; Arnold, A Elizabeth; Miadlikowska, Jolanta; Spatafora, Joseph; Johnson, Desiree; Hambleton, Sarah; Crockett, Michael; Shoemaker, Robert; Sung, Gi-Ho; Lücking, Robert; Lumbsch, Thorsten; O'Donnell, Kerry; Binder, Manfred; Diederich, Paul; Ertz, Damien; Gueidan, Cécile; Hansen, Karen; Harris, Richard C; Hosaka, Kentaro; Lim, Young-Woon; Matheny, Brandon; Nishida, Hiromi; Pfister, Don; Rogers, Jack; Rossman, Amy; Schmitt, Imke; Sipman, Harrie; Stone, Jeffrey; Sugiyama, Junta; Yahr, Rebecca; Vilgalys, Rytas

    2004-10-01

    Based on an overview of progress in molecular systematics of the true fungi (Fungi/Eumycota) since 1990, little overlap was found among single-locus data matrices, which explains why no large-scale multilocus phylogenetic analysis had been undertaken to reveal deep relationships among fungi. As part of the project "Assembling the Fungal Tree of Life" (AFTOL), results of four Bayesian analyses are reported with complementary bootstrap assessment of phylogenetic confidence based on (1) a combined two-locus data set (nucSSU and nucLSU rDNA) with 558 species representing all traditionally recognized fungal phyla (Ascomycota, Basidiomycota, Chytridiomycota, Zygomycota) and the Glomeromycota, (2) a combined three-locus data set (nucSSU, nucLSU, and mitSSU rDNA) with 236 species, (3) a combined three-locus data set (nucSSU, nucLSU rDNA, and RPB2) with 157 species, and (4) a combined four-locus data set (nucSSU, nucLSU, mitSSU rDNA, and RPB2) with 103 species. Because of the lack of complementarity among single-locus data sets, the last three analyses included only members of the Ascomycota and Basidiomycota. The four-locus analysis resolved multiple deep relationships within the Ascomycota and Basidiomycota that were not revealed previously or that received only weak support in previous studies. The impact of this newly discovered phylogenetic structure on supraordinal classifications is discussed. Based on these results and reanalysis of subcellular data, current knowledge of the evolution of septal features of fungal hyphae is synthesized, and a preliminary reassessment of ascomal evolution is presented. Based on previously unpublished data and sequences from GenBank, this study provides a phylogenetic synthesis for the Fungi and a framework for future phylogenetic studies on fungi.

  10. A Transform-Based Feature Extraction Approach for Motor Imagery Tasks Classification

    PubMed Central

    Khorshidtalab, Aida; Mesbah, Mostefa; Salami, Momoh J. E.

    2015-01-01

    In this paper, we present a new motor imagery classification method in the context of electroencephalography (EEG)-based brain–computer interface (BCI). This method uses a signal-dependent orthogonal transform, referred to as linear prediction singular value decomposition (LP-SVD), for feature extraction. The transform defines the mapping as the left singular vectors of the LP coefficient filter impulse response matrix. Using a logistic tree-based model classifier, the extracted features are classified into one of four motor imagery movements. The proposed approach was first benchmarked against two related state-of-the-art feature extraction approaches, namely, discrete cosine transform (DCT) and adaptive autoregressive (AAR)-based methods. By achieving an accuracy of 67.35%, the LP-SVD approach outperformed the other approaches by large margins (25% compared with DCT and 6% compared with AAR-based methods). To further improve the discriminatory capability of the extracted features and reduce the computational complexity, we enlarged the extracted feature subset by incorporating two extra features, namely, the Q- and Hotelling's $T^2$ statistics of the transformed EEG, and introduced a new EEG channel selection method. The performance of the EEG classification based on the expanded feature set and channel selection method was compared with that of a number of the state-of-the-art classification methods previously reported with the BCI IIIa competition data set. Our method came second with an average accuracy of 81.38%. PMID:27170898

  11. "Trees and Things That Live in Trees": Three Children with Special Needs Experience the Project Approach

    ERIC Educational Resources Information Center

    Griebling, Susan; Elgas, Peg; Konerman, Rachel

    2015-01-01

    The authors report on research conducted during a project investigation undertaken with preschool children, ages 3-5. The report focuses on three children with special needs and the positive outcomes for each child as they engaged in the project Trees and Things That Live in Trees. Two of the children were diagnosed with developmental delays, and…

  12. Use of classification trees to apportion single echo detections to species: Application to the pelagic fish community of Lake Superior

    USGS Publications Warehouse

    Yule, Daniel L.; Adams, Jean V.; Hrabik, Thomas R.; Vinson, Mark R.; Woiak, Zebadiah; Ahrenstroff, Tyler D.

    2013-01-01

    Acoustic methods are used to estimate the density of pelagic fish in large lakes with results of midwater trawling used to assign species composition. Apportionment in lakes having mixed species can be challenging because only a small fraction of the water sampled acoustically is sampled with trawl gear. Here we describe a new method where single echo detections (SEDs) are assigned to species based on classification tree models developed from catch data that separate species based on fish size and the spatial habitats they occupy. During the summer of 2011, we conducted a spatially-balanced lake-wide acoustic and midwater trawl survey of Lake Superior. A total of 51 sites in four bathymetric depth strata (0–30 m, 30–100 m, 100–200 m, and >200 m) were sampled. We developed classification tree models for each stratum and found fish length was the most important variable for separating species. To apply these trees to the acoustic data, we needed to identify a target strength to length (TS-to-L) relationship appropriate for all abundant Lake Superior pelagic species. We tested performance of 7 general (i.e., multi-species) relationships derived from three published studies. The best-performing relationship was identified by comparing predicted and observed catch compositions using a second independent Lake Superior data set. Once identified, the relationship was used to predict lengths of SEDs from the lake-wide survey, and the classification tree models were used to assign each SED to a species. Exotic rainbow smelt (Osmerus mordax) were the most common species at bathymetric depths <100 m (384 million; 6.0 kt). Cisco (Coregonus artedi) were widely distributed over all strata with their population estimated at 182 million (44 kt). The apportionment method we describe should be transferable to other large lakes provided fish are not tightly aggregated, and an appropriate TS-to-L relationship for abundant pelagic fish species can be determined.
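    Neither the selected TS-to-L relationship nor the stratum-specific trees are reproduced here; the sketch below only illustrates the shape of the pipeline, converting a single-echo target strength to a length with a generic log-linear relationship and passing the result through a fitted classification tree. The coefficients, features, and training data are hypothetical placeholders.

    ```python
    # Hedged sketch of the apportionment pipeline: TS (dB) -> predicted length ->
    # species assignment by a classification tree. Coefficients and data are
    # hypothetical placeholders, not those selected in the study.
    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    A, B = 19.1, -64.0  # hypothetical coefficients in TS = A * log10(L_cm) + B

    def ts_to_length_cm(ts_db):
        return 10.0 ** ((ts_db - B) / A)

    rng = np.random.default_rng(4)
    # Toy trawl "training" data: [length_cm, depth_m] with two species labels.
    lengths = np.concatenate([rng.normal(9.0, 1.5, 200), rng.normal(25.0, 4.0, 200)])
    depths  = np.concatenate([rng.normal(40.0, 10.0, 200), rng.normal(120.0, 25.0, 200)])
    species = np.repeat(["species_A", "species_B"], 200)
    tree = DecisionTreeClassifier(max_depth=3, random_state=4)
    tree.fit(np.column_stack([lengths, depths]), species)

    sed_ts = np.array([-46.0, -41.5, -38.0])   # example single echo detections (dB)
    sed_depth = np.array([35.0, 60.0, 110.0])  # example detection depths (m)
    sed_len = ts_to_length_cm(sed_ts)
    print(tree.predict(np.column_stack([sed_len, sed_depth])))
    ```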

  13. AutoClass: A Bayesian Approach to Classification

    NASA Technical Reports Server (NTRS)

    Stutz, John; Cheeseman, Peter; Hanson, Robin; Taylor, Will; Lum, Henry, Jr. (Technical Monitor)

    1994-01-01

    We describe a Bayesian approach to the untutored discovery of classes in a set of cases, sometimes called finite mixture separation or clustering. The main difference between clustering and our approach is that we search for the "best" set of class descriptions rather than grouping the cases themselves. We describe our classes in terms of a probability distribution or density function, and the locally maximal posterior probability valued function parameters. We rate our classifications with an approximate joint probability of the data and functional form, marginalizing over the parameters. Approximation is necessitated by the computational complexity of the joint probability. Thus, we marginalize w.r.t. local maxima in the parameter space. We discuss the rationale behind our approach to classification. We give the mathematical development for the basic mixture model and describe the approximations needed for computational tractability. We instantiate the basic model with the discrete Dirichlet distribution and multivariate Gaussian density likelihoods. Then we show some results for both constructed and actual data.
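    AutoClass itself is not sketched here; as a loose, hedged stand-in for the idea of letting the data determine a set of probabilistic class descriptions, the snippet below fits scikit-learn's variational Bayesian Gaussian mixture, which can effectively switch off unneeded components. The synthetic two-dimensional data are an assumption of the sketch.

    ```python
    # Hedged sketch: unsupervised class discovery with a Bayesian Gaussian mixture,
    # a loose stand-in for the AutoClass idea (not the AutoClass algorithm itself).
    import numpy as np
    from sklearn.mixture import BayesianGaussianMixture

    rng = np.random.default_rng(5)
    X = np.vstack([
        rng.normal([0.0, 0.0], 0.5, size=(100, 2)),
        rng.normal([4.0, 4.0], 0.7, size=(100, 2)),
        rng.normal([0.0, 5.0], 0.6, size=(100, 2)),
    ])

    gmm = BayesianGaussianMixture(n_components=10,  # upper bound on the number of classes
                                  weight_concentration_prior=0.01,
                                  random_state=5).fit(X)
    labels = gmm.predict(X)
    print("effective classes found:", len(np.unique(labels)))
    print("mixture weights:", np.round(gmm.weights_, 3))
    ```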

  14. A Dynamic Tree Approach to Environmental Transport on Hillslopes

    NASA Astrophysics Data System (ADS)

    Passalacqua, P.; Zaliapin, I.; Foufoula-Georgiou, E.; Ghil, M.; Dietrich, W. E.

    2010-12-01

    The concept of a dynamic tree was introduced in Zaliapin et al. (2010) as the basis of an extended conceptual framework to study the transport of spatially heterogeneous fluxes as they propagate down a network of a given topology. Here we are interested in extending this framework over the whole basin by incorporating the hillslope paths and their geometry, which are known to differ from those of the river network. Focusing on the fluxes that start at a source, propagate downstream and have constant velocity, we first capture the static structure of the hillslope network by representing it by a tree (static tree). We then describe the transport down the hillslope tree as a particular case of nearest-neighbor hierarchical aggregation, thus obtaining the so-called dynamic tree. The properties of both the dynamic and static trees are analyzed by applying the Horton-Strahler and Tokunaga taxonomies. The results obtained in three hillslope areas of different characteristics, two located in California and one in Oregon, show that both the static and the dynamic tree can be well approximated by Tokunaga self-similar trees (SSTs), in agreement with what was previously obtained for the channelized paths of the river network but with different parameters. The degree of side branching is larger for the static tree than for the dynamic tree. We also observed a phase transition in the dynamics of the three systems which reflects an abrupt emergence of a giant cluster of connected streams.

  15. Full hierarchic versus non-hierarchic classification approaches for mapping sealed surfaces at the rural-urban fringe using high-resolution satellite data.

    PubMed

    De Roeck, Tim; Van de Voorde, Tim; Canters, Frank

    2009-01-01

    Since 2008 more than half of the world population has been living in cities, and urban sprawl is continuing. Because of these developments, the mapping and monitoring of urban environments and their surroundings are becoming increasingly important. In this study two object-oriented approaches for high-resolution mapping of sealed surfaces are compared: a standard non-hierarchic approach and a full hierarchic approach using both multi-layer perceptrons and decision trees as learning algorithms. Both methods outperform the standard nearest neighbour classifier, which is used as a benchmark scenario. For the multi-layer perceptron approach, applying a hierarchic classification strategy substantially increases the accuracy of the classification. For the decision tree approach a one-against-all hierarchic classification strategy does not lead to an improvement of classification accuracy compared to the standard all-against-all approach. Best results are obtained with the hierarchic multi-layer perceptron classification strategy, producing a kappa value of 0.77. A simple shadow reclassification procedure based on characteristics of neighbouring objects further increases the kappa value to 0.84.

  16. Establishing serological classification tree model in rheumatoid arthritis using combination of MALDI-TOF-MS and magnetic beads.

    PubMed

    Yan, Zhang; Chaojun, Hu; Chuiwen, Deng; Xiaomei, Leng; Xin, Zhang; Yongzhe, Li; Fengchun, Zhang

    2015-02-01

    To establish a serological classification tree model for rheumatoid arthritis (RA), protein/peptide profiles of serum were detected by matrix-assisted laser desorption-ionization time-of-flight mass spectrometry (MALDI-TOF-MS) combined with weak cationic exchange (WCX) from Cohort 1, including 65 patients with RA and 41 healthy controls (HC). The samples were randomly divided into a training set and a test set. Twenty-four differentially expressed peaks (P < 0.05) were identified in the training set, and 4 of them, namely m/z 3,939, 5,906, 8,146, and 8,569, were chosen to set up our model. This model exhibited a sensitivity of 100.0% and a specificity of 96.0% for differentiating RA patients from HC. The test set reproduced these high levels of sensitivity and specificity, which were 100.0 and 81.2%, respectively. Cohort 2, which included 228 RA patients, was used to further verify the classification efficiency of this model. Overall, 97.4% of them were classified as RA by this model. In conclusion, MALDI-TOF-MS combined with WCX magnetic beads was a powerful method for constructing a classification tree model for RA, and the model we established was useful in recognizing RA.

  17. Application of object-oriented method for classification of VHR satellite images using rule-based approach and texture measures

    NASA Astrophysics Data System (ADS)

    Lewinski, S.; Bochenek, Z.; Turlej, K.

    2010-01-01

    A new approach for classification of high-resolution satellite images is presented in this article. The approach has been developed at the Institute of Geodesy and Cartography, Warsaw, within the Geoland 2 project - SATChMo Core Mapping Service. The classification algorithm, aimed at recognition of generic land cover categories, has been elaborated using an object-oriented approach. Its functionality was tested on KOMPSAT-2 satellite images, recorded in four multispectral bands (4 m ground resolution) and in panchromatic mode (1 m ground resolution). The structure of the algorithm resembles a decision tree and consists of a sequence of processes. The main assumption of the presented approach is to divide image content into objects characterized by high and low texture measures. The texture measures are generated from a panchromatic image transformed by Sigma filters. Objects belonging to the so-called high-texture group are classified in the first steps; the remaining objects are classified in the following steps. Applying parametric recognition criteria to the first group of objects, four generic land cover classes are identified: forests, sparse woody vegetation, urban/artificial areas and bare ground. Non-classified areas are automatically assigned to the second group of objects, which contains water and agricultural land. In the course of the classification process, a few segmentations are performed, each dedicated to particular land cover categories. Classified objects smaller than 0.25 ha are removed in the process of generalization.

  18. A comprehensive but efficient framework of proposing and validating feature parameters from airborne LiDAR data for tree species classification

    NASA Astrophysics Data System (ADS)

    Lin, Yi; Hyyppä, Juha

    2016-04-01

    Tree species information is crucial for digital forestry, and efficient techniques for classifying tree species are in high demand. To this end, airborne light detection and ranging (LiDAR) has been introduced. However, the literature review suggests that most of the previous airborne LiDAR-based studies were only based on limited kinds of tree signatures. To address this gap, this study proposed developing a novel modular framework for LiDAR-based tree species classification, by deriving feature parameters in a systematic way. Specifically, feature parameters of point-distribution (PD), laser pulse intensity (IN), crown-internal (CI) and tree-external (TE) structures were proposed and derived. Using a support vector machine (SVM) classifier, the classifications were conducted in leave-one-out cross-validation (LOOCV) mode. Based on the samples of four typical boreal tree species, i.e., Picea abies, Pinus sylvestris, Populus tremula and Quercus robur, tests showed that the accuracies of the classifications based on the acquired PD-, IN-, CI- and TE-categorized feature parameters as well as the integration of their individual optimal parameters are 65.00%, 80.00%, 82.50%, 85.00% and 92.50%, respectively. These results indicate that the procedures proposed in this study can be used as a comprehensive but efficient framework of proposing and validating feature parameters from airborne LiDAR data for tree species classification.
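    A hedged sketch of the evaluation protocol described above, an SVM classifier scored under leave-one-out cross-validation, is given below with scikit-learn; the feature values, class means, and sample counts are synthetic placeholders rather than the derived PD/IN/CI/TE parameters.

    ```python
    # Hedged sketch: SVM + leave-one-out cross-validation for tree species
    # classification. Feature values and labels are synthetic placeholders.
    import numpy as np
    from sklearn.model_selection import LeaveOneOut, cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    rng = np.random.default_rng(6)
    n_per_class, n_features = 10, 8  # e.g., a handful of PD/IN/CI/TE-type parameters
    X = np.vstack([rng.normal(mu, 1.0, size=(n_per_class, n_features)) for mu in range(4)])
    y = np.repeat(["Picea abies", "Pinus sylvestris", "Populus tremula", "Quercus robur"],
                  n_per_class)

    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0))
    acc = cross_val_score(clf, X, y, cv=LeaveOneOut()).mean()
    print(f"LOOCV accuracy: {acc:.2%}")
    ```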

  19. A new approach to modeling tree rainfall interception

    NASA Astrophysics Data System (ADS)

    Xiao, Qingfu; McPherson, E. Gregory; Ustin, Susan L.; Grismer, Mark E.

    2000-12-01

    A three-dimensional physically based stochastic model was developed to describe canopy rainfall interception processes at desired spatial and temporal resolutions. Such model development is important to understand these processes because forest canopy interception may exceed 59% of annual precipitation in old growth trees. The model describes the interception process from a single leaf, to a branch segment, and then up to the individual tree level. It takes into account rainfall, meteorology, and canopy architecture factors as explicit variables. Leaf and stem surface roughness, architecture, and geometric shape control both leaf drip and stemflow. Model predictions were evaluated using actual interception data collected for two mature open grown trees, a 9-year-old broadleaf deciduous pear tree (Pyrus calleryana "Bradford" or Callery pear) and an 8-year-old broadleaf evergreen oak tree (Quercus suber or cork oak). When simulating 18 rainfall events for the oak tree and 16 rainfall events for the pear tree, the model overestimated interception loss by 4.5% and 3.0%, respectively, while stemflow was underestimated by 0.8% and 3.3%, and throughfall was underestimated by 3.7% for the oak tree and overestimated by 0.3% for the pear tree. A model sensitivity analysis indicates that canopy surface storage capacity had the greatest influence on interception, and interception losses were sensitive to leaf and stem surface area indices. Among rainfall factors, interception losses relative to gross precipitation were most sensitive to rainfall amount. Rainfall incident angle had a significant effect on total precipitation intercepting the projected surface area. Stemflow was sensitive to stem segment and leaf zenith angle distributions. Enhanced understanding of interception loss dynamics should lead to improved urban forest ecosystem management.

  20. Rule based fuzzy logic approach for classification of fibromyalgia syndrome.

    PubMed

    Arslan, Evren; Yildiz, Sedat; Albayrak, Yalcin; Koklukaya, Etem

    2016-06-01

    Fibromyalgia syndrome (FMS) is a chronic muscle and skeletal system disease observed generally in women, manifesting itself with widespread pain and impairing the individual's quality of life. FMS diagnosis is based on the American College of Rheumatology (ACR) criteria. However, the employability and sufficiency of the ACR criteria have recently been under debate. In this context, several evaluation methods, including clinical evaluation methods, were proposed by researchers. Accordingly, the ACR had to update its criteria, announced in 1990, 2010 and 2011. The proposed rule-based fuzzy logic method aims to evaluate FMS from a different angle as well. This method contains a rule base derived from the 1990 ACR criteria and the individual experiences of specialists. The study was conducted using data collected from 60 inpatients and 30 healthy volunteers. Several tests and a physical examination were administered to the participants. The fuzzy logic rule base was structured using the parameters of tender point count, chronic widespread pain period, pain severity, fatigue severity and sleep disturbance level, which were deemed important in FMS diagnosis. It was observed that the fuzzy predictor was generally 95.56% consistent with at least one of the specialists who were not creators of the fuzzy rule base. Thus, in diagnostic classification, where the severity of FMS was classified as well, consistent findings were obtained from the comparison of the interpretations and experiences of specialists with the fuzzy logic approach. The study proposes a rule base which could eliminate the shortcomings of the 1990 ACR criteria during the FMS evaluation process. Furthermore, the proposed method presents a classification of the severity of the disease, which was not available with the ACR criteria. The study was not limited to disease classification; at the same time the probability of occurrence and severity was classified. In addition, those who were not suffering from FMS were
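    The study's ACR-derived rule base is not reproduced here; the sketch below only illustrates the basic fuzzy machinery, triangular membership functions combined with a min (AND) operator to give a rule firing strength. The membership ranges and the single rule are invented for the sketch.

    ```python
    # Hedged sketch of rule-based fuzzy evaluation: triangular memberships and a
    # min (AND) rule. Ranges and the rule itself are invented, not the ACR-derived
    # rule base described above.
    import numpy as np

    def tri(x, a, b, c):
        """Triangular membership function peaking at b on the support [a, c]."""
        return float(np.clip(min((x - a) / (b - a + 1e-9), (c - x) / (c - b + 1e-9)), 0.0, 1.0))

    def rule_fms_likely(tender_points, pain_months, fatigue_score):
        # IF tender points are "many" AND pain duration is "chronic" AND fatigue is "high"
        mu_many    = tri(tender_points, 8.0, 14.0, 18.0)
        mu_chronic = tri(pain_months,   3.0, 12.0, 36.0)
        mu_high    = tri(fatigue_score, 5.0,  8.0, 10.0)
        return min(mu_many, mu_chronic, mu_high)  # rule firing strength in [0, 1]

    print(rule_fms_likely(tender_points=13, pain_months=10, fatigue_score=7))
    ```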

  2. Active Optical Sensors for Tree Stem Detection and Classification in Nurseries

    PubMed Central

    Garrido, Miguel; Perez-Ruiz, Manuel; Valero, Constantino; Gliever, Chris J.; Hanson, Bradley D.; Slaughter, David C.

    2014-01-01

    Active optical sensing (LIDAR and light curtain transmission) devices mounted on a mobile platform can correctly detect, localize, and classify trees. To conduct an evaluation and comparison of the different sensors, an optical encoder wheel was used for vehicle odometry and provided a measurement of the linear displacement of the prototype vehicle along a row of tree seedlings as a reference for each recorded sensor measurement. The field trials were conducted in a juvenile tree nursery with one-year-old grafted almond trees at Sierra Gold Nurseries, Yuba City, CA, United States. Through these tests and subsequent data processing, each sensor was individually evaluated to characterize its reliability, as well as its advantages and disadvantages for the proposed task. Test results indicated that 95.7% and 99.48% of the trees were successfully detected with the LIDAR and light curtain sensors, respectively. LIDAR correctly classified between live and dead tree states at a 93.75% success rate, compared to 94.16% for the light curtain sensor. These results can help system designers select the most reliable sensor for the accurate detection and localization of each tree in a nursery, which might allow labor-intensive tasks, such as weeding, to be automated without damaging crops. PMID:24949638

  3. Comparison of four approaches to a rock facies classification problem

    USGS Publications Warehouse

    Dubois, M.K.; Bohling, G.C.; Chakrabarti, S.

    2007-01-01

    In this study, seven classifiers based on four different approaches were tested in a rock facies classification problem: classical parametric methods using Bayes' rule, and non-parametric methods using fuzzy logic, k-nearest neighbor, and feed-forward, back-propagating artificial neural networks. Determining the most effective classifier for geologic facies prediction in wells without cores in the Panoma gas field, in Southwest Kansas, was the objective. Study data include 3600 samples with known rock facies class (from core) with each sample having either four or five measured properties (wire-line log curves), and two derived geologic properties (geologic constraining variables). The sample set was divided into two subsets, one for training and one for testing the ability of the trained classifier to correctly assign classes. Artificial neural networks clearly outperformed all other classifiers and are effective tools for this particular classification problem. Classical parametric models were inadequate due to the nature of the predictor variables (high dimensional and not linearly correlated) and the feature space of the classes (overlapping). The other non-parametric methods tested, k-nearest neighbor and fuzzy logic, would need considerable improvement to match the neural network effectiveness, but further work, possibly combining certain aspects of the three non-parametric methods, may be justified.

  4. Reflectance properties of West African savanna trees from ground radiometer measurements. II - Classification of components

    NASA Technical Reports Server (NTRS)

    Hanan, N. P.; Prince, S. D.; Franklin, J.

    1993-01-01

    A pole-mounted radiometer was used to measure the reflectance properties in the red and near-IR of three Sahelian tree species. These properties are classified depending on their location over the canopy. A geometrical description of the patterns of shadow and sunlight on and beneath a model tree when viewed from above is given, and six components are defined. Tree canopies are found to be dark in the red waveband with respect to the soil, but have little or no effect on the near-IR.

  5. A Visual Analytics Approach for Correlation, Classification, and Regression Analysis

    SciTech Connect

    Steed, Chad A; SwanII, J. Edward; Fitzpatrick, Patrick J.; Jankun-Kelly, T.J.

    2013-01-01

    New approaches that combine the strengths of humans and machines are necessary to equip analysts with the proper tools for exploring today's increasingly complex, multivariate data sets. In this paper, a visual data mining framework, called the Multidimensional Data eXplorer (MDX), is described that addresses the challenges of today's data by combining automated statistical analytics with a highly interactive parallel coordinates based canvas. In addition to several intuitive interaction capabilities, this framework offers a rich set of graphical statistical indicators, interactive regression analysis, visual correlation mining, automated axis arrangements and filtering, and data classification techniques. This chapter provides a detailed description of the system as well as a discussion of key design aspects and critical feedback from domain experts.

  6. An Approach for Automatic Classification of Radiology Reports in Spanish.

    PubMed

    Cotik, Viviana; Filippo, Darío; Castaño, José

    2015-01-01

    Automatic detection of relevant terms in medical reports is useful for educational purposes and for clinical research. Natural language processing (NLP) techniques can be applied in order to identify them. In this work we present an approach to classify radiology reports written in Spanish into two sets: the ones that indicate pathological findings and the ones that do not. In addition, the entities corresponding to pathological findings are identified in the reports. We use RadLex, a lexicon of English radiology terms, and NLP techniques to identify the occurrence of pathological findings. Reports are classified using a simple algorithm based on the presence of pathological findings, negation and hedge terms. The implemented algorithms were tested with a test set of 248 reports annotated by an expert, obtaining a best F1 measure of 0.72. The output of the classification task can be used to look for specific occurrences of pathological findings. PMID:26262128
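    The RadLex-based Spanish pipeline is not reproduced; the sketch below only illustrates the report-level rule, flagging a report when a finding term occurs without a negation cue in a short preceding window. The tiny English term and cue lists are illustrative assumptions.

    ```python
    # Hedged sketch: presence-of-finding classification with a simple backward
    # negation window. Term and negation lists are tiny illustrative stand-ins.
    import re

    FINDING_TERMS = {"nodule", "fracture", "effusion"}
    NEGATION_CUES = {"no", "without", "negative"}

    def has_pathological_finding(report, window=5):
        tokens = re.findall(r"[a-záéíóúñ]+", report.lower())
        for i, tok in enumerate(tokens):
            if tok in FINDING_TERMS:
                context = tokens[max(0, i - window):i]          # words preceding the finding
                if not any(cue in context for cue in NEGATION_CUES):
                    return True                                 # un-negated finding found
        return False

    print(has_pathological_finding("No evidence of nodule or effusion."))   # False
    print(has_pathological_finding("Small pleural effusion on the left."))  # True
    ```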

  7. A Visual Analytics Approach for Correlation, Classification, and Regression Analysis

    SciTech Connect

    Steed, Chad A; SwanII, J. Edward; Fitzpatrick, Patrick J.; Jankun-Kelly, T.J.

    2012-02-01

    New approaches that combine the strengths of humans and machines are necessary to equip analysts with the proper tools for exploring today's increasingly complex, multivariate data sets. In this paper, a novel visual data mining framework, called the Multidimensional Data eXplorer (MDX), is described that addresses the challenges of today's data by combining automated statistical analytics with a highly interactive parallel coordinates based canvas. In addition to several intuitive interaction capabilities, this framework offers a rich set of graphical statistical indicators, interactive regression analysis, visual correlation mining, automated axis arrangements and filtering, and data classification techniques. The current work provides a detailed description of the system as well as a discussion of key design aspects and critical feedback from domain experts.

  9. GIS-based groundwater potential mapping using boosted regression tree, classification and regression tree, and random forest machine learning models in Iran.

    PubMed

    Naghibi, Seyed Amir; Pourghasemi, Hamid Reza; Dixon, Barnali

    2016-01-01

    Groundwater is considered one of the most valuable fresh water resources. The main objective of this study was to produce groundwater spring potential maps in the Koohrang Watershed, Chaharmahal-e-Bakhtiari Province, Iran, using three machine learning models: boosted regression tree (BRT), classification and regression tree (CART), and random forest (RF). Thirteen hydrological-geological-physiographical (HGP) factors that influence locations of springs were considered in this research. These factors include slope degree, slope aspect, altitude, topographic wetness index (TWI), slope length (LS), plan curvature, profile curvature, distance to rivers, distance to faults, lithology, land use, drainage density, and fault density. Subsequently, groundwater spring potential was modeled and mapped using the CART, RF, and BRT algorithms. The predicted results from the three models were validated using the receiver operating characteristic (ROC) curve. Of the 864 springs identified, 605 (≈70%) locations were used for spring potential mapping, while the remaining 259 (≈30%) springs were used for model validation. The area under the curve (AUC) for the BRT model was calculated as 0.8103, and for CART and RF the AUC values were 0.7870 and 0.7119, respectively. Therefore, it was concluded that the BRT model produced the best predictions of spring locations, followed by the CART and RF models, respectively. Geospatially integrated BRT, CART, and RF methods proved to be useful in generating the spring potential map (SPM) with reasonable accuracy. PMID:26687087
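    A hedged sketch of the model comparison, fitting boosted trees, a single CART, and a random forest on the same presence data and comparing ROC AUC on a held-out ~30% split, is shown below with scikit-learn; the thirteen synthetic predictors are placeholders for the HGP factors, not the actual GIS layers.

    ```python
    # Hedged sketch: BRT vs. CART vs. RF compared by ROC AUC on a 70/30 split.
    # Predictors are synthetic placeholders, not the Koohrang HGP layers.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
    from sklearn.metrics import roc_auc_score
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=864, n_features=13, n_informative=6, random_state=7)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=7)

    models = {
        "BRT":  GradientBoostingClassifier(random_state=7),
        "CART": DecisionTreeClassifier(max_depth=5, random_state=7),
        "RF":   RandomForestClassifier(n_estimators=300, random_state=7),
    }
    for name, model in models.items():
        proba = model.fit(X_tr, y_tr).predict_proba(X_te)[:, 1]
        print(name, "AUC =", round(roc_auc_score(y_te, proba), 3))
    ```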

  11. An improved classification tree analysis of high cost modules based upon an axiomatic definition of complexity

    NASA Technical Reports Server (NTRS)

    Tian, Jianhui; Porter, Adam; Zelkowitz, Marvin V.

    1992-01-01

    Identification of high cost modules has been viewed as one mechanism to improve overall system reliability, since such modules tend to produce more than their share of problems. A decision tree model was used to identify such modules. In this current paper, a previously developed axiomatic model of program complexity is merged with the previously developed decision tree process for an improvement in the ability to identify such modules. This improvement was tested using data from the NASA Software Engineering Laboratory.

  12. Topological classification of binary trees using the Horton-Strahler index.

    PubMed

    Toroczkai, Zoltán

    2002-01-01

    The Horton-Strahler (HS) index r = max(i, j) + δ(i, j) has been shown to be relevant to a number of physical (such as diffusion-limited aggregation), geological (river networks), biological (pulmonary arteries, blood vessels, various species of trees), and computational (use of registers) applications. Here we revisit the enumeration problem of the HS index on the set of rooted, unlabeled, plane binary trees, and enumerate the same index on the ambilateral set of rooted, plane binary trees of n leaves. The ambilateral set is a set of trees whose elements cannot be obtained from each other via an arbitrary number of reflections with respect to vertical axes passing through any of the nodes on the tree. For the unlabeled set we give an alternate derivation to the existing exact solution. Extending this technique to the ambilateral set, which is described by an infinite series of nonlinear functional equations, we are able to give a double-exponentially converging approximant to the generating functions in a neighborhood of their convergence circle, and derive an explicit asymptotic form for the number of such trees.
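
    The formula above is the usual Horton-Strahler recursion: leaves have index 1, and an internal node takes the larger of its children's indices, plus one if they are equal (δ is the Kronecker delta). A small illustrative implementation, not taken from the paper, on trees encoded as nested tuples:

```python
# Horton-Strahler index of a rooted binary tree: r = max(i, j) + delta(i, j),
# where i and j are the indices of the two subtrees and leaves have index 1.
def horton_strahler(tree):
    if tree is None:                      # a leaf
        return 1
    left, right = tree
    i, j = horton_strahler(left), horton_strahler(right)
    return max(i, j) + (1 if i == j else 0)

balanced = ((None, None), (None, None))      # 4 leaves, fully balanced
caterpillar = (((None, None), None), None)   # 4 leaves, comb-shaped
print(horton_strahler(balanced))     # 3
print(horton_strahler(caterpillar))  # 2
```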

  13. Industrial and occupational ergonomics in the petrochemical process industry: a regression trees approach.

    PubMed

    Bevilacqua, M; Ciarapica, F E; Giacchetta, G

    2008-07-01

    This work is an attempt to apply classification tree methods to data regarding accidents in a medium-sized refinery, so as to identify the important relationships between the variables, which can be considered as decision-making rules when adopting any measures for improvement. The results obtained using the CART (Classification And Regression Trees) method proved to be the most precise and, in general, they are encouraging concerning the use of tree diagrams as preliminary explorative techniques for the assessment of the ergonomic, management and operational parameters which influence high accident risk situations. The Occupational Injury analysis carried out in this paper was planned as a dynamic process and can be repeated systematically. The CART technique, which considers a very wide set of objective and predictive variables, shows new cause-effect correlations in occupational safety which had never been previously described, highlighting possible injury risk groups and supporting decision-making in these areas. The use of classification trees must not, however, be seen as an attempt to supplant other techniques, but as a complementary method which can be integrated into traditional types of analysis.

  14. Classification of tissue pathological state using optical multiparametric monitoring approach

    NASA Astrophysics Data System (ADS)

    Kutai-Asis, Hofit; Kanter, Ido; Barbiro-Michaely, Efrat; Mayevsky, Avraham

    2008-12-01

    In order to diagnose the development of pathophysiological events in the brain, the evaluation of multiparametric data in real time is highly important. The current work presents a new approach that uses cluster analysis to evaluate the relationships between mitochondrial NADH, tissue blood flow and hemoglobin oxygenation under various pathophysiological conditions. The Time-Sharing Fluorometer Reflectometer (TSFR) was used for monitoring of mitochondrial NADH, oxyhemoglobin (HbO2), and microcirculatory blood flow simultaneously at the same location in the rat or gerbil cortex. This allows a more accurate assessment of brain functions in real time and a better understanding of the relationship between tissue oxygen supply and demand. Moreover, in some pathophysiological cases, monitoring of only one or two parameters in the cerebral cortex may be misleading. The classification was based on the data collected in experiments where different pathophysiological conditions, such as anoxia, ischemia, and SD, were used. These three parameters were plotted in three dimensions. The clustering approach results showed similar patterns in each type of treatment. The distribution of data points in space was used to define the spatial behavior of each treatment in order to produce an index for identifying different treatments. In conclusion, our present study offers a new approach to data analysis that can serve as a reliable tool for tissue pathophysiology.
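
    The clustering step described here can be illustrated as follows; k-means, the number of clusters, and the synthetic three-signal data are assumptions made for the sketch, not details from the study.

```python
# Illustrative sketch: cluster 3-D points (NADH, blood flow, HbO2) and compare
# the cluster signatures between treatment groups. Synthetic data only.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Three hypothetical "treatments", each producing a different mean response.
anoxia   = rng.normal([1.6, 0.4, 0.3], 0.1, size=(50, 3))
ischemia = rng.normal([1.4, 0.2, 0.2], 0.1, size=(50, 3))
control  = rng.normal([1.0, 1.0, 1.0], 0.1, size=(50, 3))
X = np.vstack([anoxia, ischemia, control])        # columns: NADH, flow, HbO2

labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
# Cluster centroids give a simple index (signature) per group of data points.
for k in range(3):
    print(f"cluster {k}: centroid = {X[labels == k].mean(axis=0).round(2)}")
```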

  15. A regional classification scheme for estimating reference water quality in streams using land-use-adjusted spatial regression-tree analysis

    USGS Publications Warehouse

    Robertson, D.M.; Saad, D.A.; Heisey, D.M.

    2006-01-01

    Various approaches are used to subdivide large areas into regions containing streams that have similar reference or background water quality and that respond similarly to different factors. For many applications, such as establishing reference conditions, it is preferable to use physical characteristics that are not affected by human activities to delineate these regions. However, most approaches, such as ecoregion classifications, rely on land use to delineate regions or have difficulties compensating for the effects of land use. Land use not only directly affects water quality, but it is often correlated with the factors used to define the regions. In this article, we describe modifications to SPARTA (spatial regression-tree analysis), a relatively new approach applied to water-quality and environmental characteristic data to delineate zones with similar factors affecting water quality. In this modified approach, land-use-adjusted (residualized) water quality and environmental characteristics are computed for each site. Regression-tree analysis is applied to the residualized data to determine the most statistically important environmental characteristics describing the distribution of a specific water-quality constituent. Geographic information for small basins throughout the study area is then used to subdivide the area into relatively homogeneous environmental water-quality zones. For each zone, commonly used approaches are subsequently used to define its reference water quality and how its water quality responds to changes in land use. SPARTA is used to delineate zones of similar reference concentrations of total phosphorus and suspended sediment throughout the upper Midwestern part of the United States. © 2006 Springer Science+Business Media, Inc.
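
    One plausible reading of the land-use adjustment is sketched below: regress the constituent on land-use variables, then fit a regression tree to the residuals using the remaining environmental characteristics. Variable names and data are invented for illustration; this is not the SPARTA implementation.

```python
# Sketch of a land-use-adjusted (residualized) regression-tree step.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(1)
n = 300
land_use = rng.uniform(0, 1, size=(n, 2))        # e.g. % agriculture, % urban
environment = rng.normal(size=(n, 4))            # e.g. soils, slope, climate...
phosphorus = (1.5 * land_use[:, 0] + 0.8 * land_use[:, 1]
              + 0.6 * environment[:, 0] - 0.4 * environment[:, 2]
              + rng.normal(0, 0.1, n))

# 1) remove the land-use signal from the constituent concentration
residuals = phosphorus - LinearRegression().fit(land_use, phosphorus).predict(land_use)

# 2) let a regression tree partition the residualized data on the
#    environmental characteristics, defining candidate homogeneous zones
tree = DecisionTreeRegressor(max_depth=3, min_samples_leaf=20).fit(environment, residuals)
zones = tree.apply(environment)                  # leaf id = zone membership
print("number of zones:", len(np.unique(zones)))
```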

  16. Bacillary dysentery and meteorological factors in northeastern China: a historical review based on classification and regression trees.

    PubMed

    Guan, Peng; Huang, Desheng; Guo, Junqiao; Wang, Ping; Zhou, Baosen

    2008-09-01

    The relationship between the incidence of bacillary dysentery and meteorological factors was investigated. Data on bacillary dysentery incidence in Shenyang from 1990 to 1996 were obtained from Liaoning Provincial Center for Disease Control and Prevention, and meteorological data such as atmospheric pressure, air temperature, precipitation, evaporation, wind speed, and the amount of solar radiation were obtained from Shenyang Meteorological Bureau. Kendall and Spearman correlations were used to analyze the relationship between bacillary dysentery and meteorological factors. The incidence of bacillary dysentery was treated as a response variable, and meteorological factors were treated as predictor variables. The R software (version 2.3.1) was used to fit the classification and regression trees (CART). The model improved the accuracy of the fitting results. The residual sum of squares of the regression tree model was 53.9, while that of the multivariate linear regression model was 107.2. Among all the meteorological indexes, relative humidity, minimum temperature, and pressure one month prior were statistically influential factors in the multivariate regression tree model. CART may be a useful tool for dealing with heterogeneous data, as it can serve as a decision support tool and is notable for its simplicity and ease of use.
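
    The reported comparison (regression-tree versus multivariate linear-regression residual sum of squares) can be reproduced in outline as below; the incidence and weather series are simulated, so the numbers will not match the 53.9 versus 107.2 in the abstract.

```python
# Sketch: compare residual sum of squares of a regression tree and a
# multivariate linear regression on (simulated) incidence vs. weather data.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(2)
n = 84                                     # e.g. monthly records, 7 years
weather = rng.normal(size=(n, 3))          # humidity, min temperature, pressure
incidence = np.where(weather[:, 0] > 0, 5 + 2 * weather[:, 1], 2) \
            + rng.normal(0, 0.5, n)        # a non-linear, threshold-like response

lin = LinearRegression().fit(weather, incidence)
cart = DecisionTreeRegressor(max_depth=3, min_samples_leaf=5).fit(weather, incidence)

sse_lin = np.sum((incidence - lin.predict(weather)) ** 2)
sse_cart = np.sum((incidence - cart.predict(weather)) ** 2)
print(f"linear regression SSE: {sse_lin:.1f}")
print(f"regression tree SSE:   {sse_cart:.1f}")   # typically much smaller here
```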

  17. Application of Classification and Regression Tree (CART) analysis on the microflora of minced meat for classification according to Reg. (EC) 2073/2005.

    PubMed

    Paulsen, P; Smulders, F J M; Tichy, A; Aydin, A; Höck, C

    2011-07-01

    In a retrospective study on the microbiology of minced meat from small food businesses supplying directly to the consumer, the relative contributions of meat supplier, meat species and the outlet where the meat was minced were assessed by "Classification and Regression Tree" (CART) analysis. Samples (n=888) originated from 129 outlets of a single supermarket chain. Sampling units were 4-5 packs (pork, beef, and mixed pork-beef). Total aerobic counts (TACs) were 5.3±1.0 log CFU/g. In 75.6% of samples, E. coli were <1 log CFU/g. The proportions of "unsatisfactory" sample sets [as defined in Reg. (EC) 2073/2005] were 31.3% and 4.5% for TAC and E. coli, respectively. For classification according to TACs, the outlet where meat was minced and the "meat supplier" were the most important predictors. For E. coli, "outlet" was the most important predictor, but the limit of detection of 1 log CFU/g was not discriminative enough to allow further conclusions.

  18. Multinomial tree models for assessing the status of the reference in studies of the accuracy of tools for binary classification

    PubMed Central

    Botella, Juan; Huang, Huiling; Suero, Manuel

    2013-01-01

    Studies that evaluate the accuracy of binary classification tools are needed. Such studies provide 2 × 2 cross-classifications of test outcomes and the categories according to an unquestionable reference (or gold standard). However, sometimes a suboptimal reliability reference is employed. Several methods have been proposed to deal with studies where the observations are cross-classified with an imperfect reference. These methods require that the status of the reference, as a gold standard or as an imperfect reference, is known. In this paper a procedure for determining whether it is appropriate to maintain the assumption that the reference is a gold standard or an imperfect reference, is proposed. This procedure fits two nested multinomial tree models, and assesses and compares their absolute and incremental fit. Its implementation requires the availability of the results of several independent studies. These should be carried out using similar designs to provide frequencies of cross-classification between a test and the reference under investigation. The procedure is applied in two examples with real data. PMID:24106484

  19. Multinomial tree models for assessing the status of the reference in studies of the accuracy of tools for binary classification.

    PubMed

    Botella, Juan; Huang, Huiling; Suero, Manuel

    2013-01-01

    Studies that evaluate the accuracy of binary classification tools are needed. Such studies provide 2 × 2 cross-classifications of test outcomes and the categories according to an unquestionable reference (or gold standard). However, sometimes a suboptimal reliability reference is employed. Several methods have been proposed to deal with studies where the observations are cross-classified with an imperfect reference. These methods require that the status of the reference, as a gold standard or as an imperfect reference, is known. In this paper a procedure for determining whether it is appropriate to maintain the assumption that the reference is a gold standard or an imperfect reference, is proposed. This procedure fits two nested multinomial tree models, and assesses and compares their absolute and incremental fit. Its implementation requires the availability of the results of several independent studies. These should be carried out using similar designs to provide frequencies of cross-classification between a test and the reference under investigation. The procedure is applied in two examples with real data.

  20. Deep water X-mas tree standardization -- Interchangeability approach

    SciTech Connect

    Paula, M.T.R.; Paulo, C.A.S.; Moreira, C.C.

    1995-12-31

    Aiming at the rationalization of subsea operations to make the production of oil and gas more economical and reliable, standardization of subsea equipment interfaces is a tool that can play a very important role. Continuing the program initiated some years ago, Petrobras is now harvesting the results from the first efforts. Diverless guidelineless subsea Christmas trees from four different suppliers have already been manufactured in accordance with the standardized specification. Tests performed this year in Macae (Campos Basin onshore base), in Brazil, confirmed the interchangeability among subsea Christmas trees, tubing hangers, adapter bases and flowline hubs of different manufacturers. This interchangeability, associated with the use of proven techniques, results in operational flexibility, savings in rig time and reduction in production losses during workovers. By now, 33 complete sets of subsea Christmas trees have already been delivered and successfully tested. Another 28 sets are still being manufactured by the four local suppliers. For the next five years, more than a hundred of these trees will be required for the exploration of the new discoveries. This paper describes the standardized equipment, the role of the operator in an integrated way of working with the manufacturers on the standardization activities, the importance of a frank information flow among the involved companies and how a simple manufacturing philosophy, with the use of construction jigs, has proved to work satisfactorily.

  1. A Fault Tree Approach to Analysis of Organizational Communication Systems.

    ERIC Educational Resources Information Center

    Witkin, Belle Ruth; Stephens, Kent G.

    Fault Tree Analysis (FTA) is a method of examining communication in an organization by focusing on: (1) the complex interrelationships in human systems, particularly in communication systems; (2) interactions across subsystems and system boundaries; and (3) the need to select and "prioritize" channels which will eliminate noise in the system and…

  2. Hierarchical Multinomial Processing Tree Models: A Latent-Trait Approach

    ERIC Educational Resources Information Center

    Klauer, Karl Christoph

    2010-01-01

    Multinomial processing tree models are widely used in many areas of psychology. A hierarchical extension of the model class is proposed, using a multivariate normal distribution of person-level parameters with the mean and covariance matrix to be estimated from the data. The hierarchical model allows one to take variability between persons into…

  3. Identification, classification and differential expression of oleosin genes in tung tree (Vernicia fordii)

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Triacylglycerols (TAG) are the major molecules of energy storage in eukaryotes. TAG are packed in subcellular structures called oil bodies or lipid droplets. Oleosins (OLE) are the major proteins in plant oil bodies. Multiple isoforms of OLE are present in plants such as tung tree (Vernicia fordii),...

  4. Identification, Classification and Differential Expression of Oleosin Genes in Tung Tree (Vernicia fordii)

    PubMed Central

    Cao, Heping; Zhang, Lin; Tan, Xiaofeng; Long, Hongxu; Shockey, Jay M.

    2014-01-01

    Triacylglycerols (TAG) are the major molecules of energy storage in eukaryotes. TAG are packed in subcellular structures called oil bodies or lipid droplets. Oleosins (OLE) are the major proteins in plant oil bodies. Multiple isoforms of OLE are present in plants such as tung tree (Vernicia fordii), whose seeds are rich in novel TAG with a wide range of industrial applications. The objectives of this study were to identify OLE genes, classify OLE proteins and analyze OLE gene expression in tung trees. We identified five tung tree OLE genes coding for small hydrophobic proteins. Genome-wide phylogenetic analysis and multiple sequence alignment demonstrated that the five tung OLE genes represented the five OLE subfamilies and all contained the “proline knot” motif (PX5SPX3P) shared among 65 OLE from 19 tree species, including the sequenced genomes of Prunus persica (peach), Populus trichocarpa (poplar), Ricinus communis (castor bean), Theobroma cacao (cacao) and Vitis vinifera (grapevine). Tung OLE1, OLE2 and OLE3 belong to the S type and OLE4 and OLE5 belong to the SM type of Arabidopsis OLE. TaqMan and SYBR Green qPCR methods were used to study the differential expression of OLE genes in tung tree tissues. Expression results demonstrated that 1) All five OLE genes were expressed in developing tung seeds, leaves and flowers; 2) OLE mRNA levels were much higher in seeds than leaves or flowers; 3) OLE1, OLE2 and OLE3 genes were expressed in tung seeds at much higher levels than OLE4 and OLE5 genes; 4) OLE mRNA levels rapidly increased during seed development; and 5) OLE gene expression was well-coordinated with tung oil accumulation in the seeds. These results suggest that tung OLE genes 1–3 probably play major roles in tung oil accumulation and/or oil body development. Therefore, they might be preferred targets for tung oil engineering in transgenic plants. PMID:24516650

  5. A Novel Approach on Designing Augmented Fuzzy Cognitive Maps Using Fuzzified Decision Trees

    NASA Astrophysics Data System (ADS)

    Papageorgiou, Elpiniki I.

    This paper proposes a new methodology for designing Fuzzy Cognitive Maps using crisp decision trees that have been fuzzified. A fuzzy cognitive map is a knowledge-based technique that works as an artificial cognitive network inheriting the main aspects of cognitive maps and artificial neural networks. Decision trees, on the other hand, are well-known intelligent techniques that extract rules from both symbolic and numeric data. Fuzzy theoretical techniques are used to fuzzify crisp decision trees in order to soften the decision boundaries at decision nodes inherent in this type of tree. Comparisons between crisp decision trees and the fuzzified decision trees suggest that the latter are significantly more robust and produce more balanced decision making. The approach proposed in this paper could incorporate any type of fuzzy decision tree. Through this methodology, new linguistic weights were determined in the FCM model, thus producing an augmented FCM tool. The framework consists of a new fuzzy algorithm that generates, from induced fuzzy decision trees, linguistic weights describing the cause-effect relationships among the concepts of the FCM model.

  6. Increased tree establishment in Lithuanian peat bogs--insights from field and remotely sensed approaches.

    PubMed

    Edvardsson, Johannes; Šimanauskienė, Rasa; Taminskas, Julius; Baužienė, Ieva; Stoffel, Markus

    2015-02-01

    Over the past century an ongoing establishment of Scots pine (Pinus sylvestris L.), sometimes at accelerating rates, is noted at three studied Lithuanian peat bogs, namely Kerėplis, Rėkyva and Aukštumala, all representing different degrees of tree coverage and geographic settings. Present establishment rates seem to depend on tree density on the bog surface and are most significant at sparsely covered sites where about three-fourths of the trees have established since the mid-1990s, whereas the initial establishment in general was during the early to mid-19th century. Three methods were used to detect, compare and describe tree establishment: (1) tree counts in small plots, (2) dendrochronological dating of bog pine trees, and (3) interpretation of aerial photographs and historical maps of the study areas. In combination, the different approaches provide complementary information and compensate for each other's drawbacks. Tree counts in plots provided a reasonable overview of age class distributions and enabled capturing of the most recently established trees with ages less than 50 years. The dendrochronological analysis yielded accurate tree ages and a good temporal resolution of long-term changes. Tree establishment and spread interpreted from aerial photographs and historical maps provided a good overview of tree spread and total affected area. It also helped to verify the results obtained with the other methods and to upscale the findings to the entire peat bogs. The ongoing spread of trees in predominantly undisturbed peat bogs is related to warmer and/or drier climatic conditions, and to a minor degree to land-use changes. Our results therefore provide valuable insights into vegetation changes in peat bogs, also with respect to bog response to ongoing and future climatic changes.

  7. Increased tree establishment in Lithuanian peat bogs--insights from field and remotely sensed approaches.

    PubMed

    Edvardsson, Johannes; Šimanauskienė, Rasa; Taminskas, Julius; Baužienė, Ieva; Stoffel, Markus

    2015-02-01

    Over the past century an ongoing establishment of Scots pine (Pinus sylvestris L.), sometimes at accelerating rates, is noted at three studied Lithuanian peat bogs, namely Kerėplis, Rėkyva and Aukštumala, all representing different degrees of tree coverage and geographic settings. Present establishment rates seem to depend on tree density on the bog surface and are most significant at sparsely covered sites where about three-fourths of the trees have established since the mid-1990s, whereas the initial establishment in general was during the early to mid-19th century. Three methods were used to detect, compare and describe tree establishment: (1) tree counts in small plots, (2) dendrochronological dating of bog pine trees, and (3) interpretation of aerial photographs and historical maps of the study areas. In combination, the different approaches provide complementary information and compensate for each other's drawbacks. Tree counts in plots provided a reasonable overview of age class distributions and enabled capturing of the most recently established trees with ages less than 50 years. The dendrochronological analysis yielded accurate tree ages and a good temporal resolution of long-term changes. Tree establishment and spread interpreted from aerial photographs and historical maps provided a good overview of tree spread and total affected area. It also helped to verify the results obtained with the other methods and to upscale the findings to the entire peat bogs. The ongoing spread of trees in predominantly undisturbed peat bogs is related to warmer and/or drier climatic conditions, and to a minor degree to land-use changes. Our results therefore provide valuable insights into vegetation changes in peat bogs, also with respect to bog response to ongoing and future climatic changes. PMID:25310886

  8. Corpus Callosum MR Image Classification

    NASA Astrophysics Data System (ADS)

    Elsayed, A.; Coenen, F.; Jiang, C.; García-Fiñana, M.; Sluming, V.

    An approach to classifying Magnetic Resonance (MR) image data is described. The specific application is the classification of MRI scan data according to the nature of the corpus callosum; however, the approach has more general applicability. A variation of the “spectral segmentation with multi-scale graph decomposition” mechanism is introduced. The result of the segmentation is stored in a quad-tree data structure to which a weighted variation (also developed by the authors) of the gSpan algorithm is applied to identify frequent sub-trees. As a result, the images are expressed as a set of frequent sub-trees. There may be a great many of these, and thus a decision-tree-based feature reduction technique is applied before classification takes place. The results show that the proposed approach performs both efficiently and effectively, obtaining a classification accuracy of over 95% in the case of the given application.
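
    The last step of this pipeline, reducing a very large set of frequent-sub-tree features with a decision tree before classification, might be sketched as follows; the binary occurrence matrix is synthetic and scikit-learn's SelectFromModel is used as a stand-in for the authors' reduction technique.

```python
# Sketch: decision-tree-based feature reduction followed by classification.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)
n_images, n_subtrees = 200, 500
X = rng.integers(0, 2, size=(n_images, n_subtrees)).astype(float)  # sub-tree occurrences
y = (X[:, 0] + X[:, 1] + X[:, 2] > 1).astype(int)                  # labels driven by a few features

selector = SelectFromModel(DecisionTreeClassifier(random_state=0),
                           threshold="mean")       # keep above-average importances
X_reduced = selector.fit_transform(X, y)
print("features kept:", X_reduced.shape[1], "of", n_subtrees)

scores = cross_val_score(SVC(kernel="linear"), X_reduced, y, cv=5)
print("cross-validated accuracy:", scores.mean().round(3))
```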

  9. A Cladistic Approach for the Classification of Oligotrichid Ciliates (Ciliophora: Spirotricha).

    PubMed

    Agatha, Sabine

    2004-01-01

    Currently, gene sequence genealogies of the Oligotrichea Bütschli, 1889 comprise only a few species. Therefore, a cladistic approach, especially to the Oligotrichida, was made, applying Hennig's method and computer programs. Twenty-three characters were selected and discussed, i.e., the morphology of the oral apparatus (five characters), the somatic ciliature (eight characters), special organelles (four characters), and ontogenetic particulars (six characters). Nine of these characters developed convergently twice. Although several new features were included in the analyses, the cladograms match other morphological trees in the monophyly of the Oligotrichea, Halteriia, Oligotrichia, Oligotrichida, and Choreotrichida. The main synapomorphies of the Oligotrichea are the enantiotropic division mode and the de novo-origin of the undulating membranes. Although the sister group relationship of the Halteriia and the Oligotrichia contradicts results obtained by gene sequence analyses, no morphologic, ontogenetic, or ultrastructural features were found that support a branching of Halteria grandinella within the Stichotrichida. The cladistic approaches suggest paraphyly of the family Strombidiidae probably due to the scarce knowledge. A revised classification of the Oligotrichea is suggested, including all sufficiently known families and genera.

  10. A Cladistic Approach for the Classification of Oligotrichid Ciliates (Ciliophora: Spirotricha)

    PubMed Central

    AGATHA, Sabine

    2010-01-01

    Summary Currently, gene sequence genealogies of the Oligotrichea Bütschli, 1889 comprise only a few species. Therefore, a cladistic approach, especially to the Oligotrichida, was made, applying Hennig's method and computer programs. Twenty-three characters were selected and discussed, i.e., the morphology of the oral apparatus (five characters), the somatic ciliature (eight characters), special organelles (four characters), and ontogenetic particulars (six characters). Nine of these characters developed convergently twice. Although several new features were included in the analyses, the cladograms match other morphological trees in the monophyly of the Oligotrichea, Halteriia, Oligotrichia, Oligotrichida, and Choreotrichida. The main synapomorphies of the Oligotrichea are the enantiotropic division mode and the de novo-origin of the undulating membranes. Although the sister group relationship of the Halteriia and the Oligotrichia contradicts results obtained by gene sequence analyses, no morphologic, ontogenetic, or ultrastructural features were found that support a branching of Halteria grandinella within the Stichotrichida. The cladistic approaches suggest paraphyly of the family Strombidiidae probably due to the scarce knowledge. A revised classification of the Oligotrichea is suggested, including all sufficiently known families and genera. PMID:20396404

  11. Evaluation of Current Approaches to Stream Classification and a Heuristic Guide to Developing Classifications of Integrated Aquatic Networks

    NASA Astrophysics Data System (ADS)

    Melles, S. J.; Jones, N. E.; Schmidt, B. J.

    2014-03-01

    Conservation and management of fresh flowing waters involves evaluating and managing effects of cumulative impacts on the aquatic environment from disturbances such as: land use change, point and nonpoint source pollution, the creation of dams and reservoirs, mining, and fishing. To assess effects of these changes on associated biotic communities it is necessary to monitor and report on the status of lotic ecosystems. A variety of stream classification methods are available to assist with these tasks, and such methods attempt to provide a systematic approach to modeling and understanding complex aquatic systems at various spatial and temporal scales. Of the vast number of approaches that exist, it is useful to group them into three main types. The first involves modeling longitudinal species turnover patterns within large drainage basins and relating these patterns to environmental predictors collected at reach and upstream catchment scales; the second uses regionalized hierarchical classification to create multi-scale, spatially homogenous aquatic ecoregions by grouping adjacent catchments together based on environmental similarities; and the third approach groups sites together on the basis of similarities in their environmental conditions both within and between catchments, independent of their geographic location. We review the literature with a focus on more recent classifications to examine the strengths and weaknesses of the different approaches. We identify gaps or problems with the current approaches, and we propose an eight-step heuristic process that may assist with development of more flexible and integrated aquatic classifications based on the current understanding, network thinking, and theoretical underpinnings.

  12. Evaluation of current approaches to stream classification and a heuristic guide to developing classifications of integrated aquatic networks.

    PubMed

    Melles, S J; Jones, N E; Schmidt, B J

    2014-03-01

    Conservation and management of fresh flowing waters involves evaluating and managing effects of cumulative impacts on the aquatic environment from disturbances such as: land use change, point and nonpoint source pollution, the creation of dams and reservoirs, mining, and fishing. To assess effects of these changes on associated biotic communities it is necessary to monitor and report on the status of lotic ecosystems. A variety of stream classification methods are available to assist with these tasks, and such methods attempt to provide a systematic approach to modeling and understanding complex aquatic systems at various spatial and temporal scales. Of the vast number of approaches that exist, it is useful to group them into three main types. The first involves modeling longitudinal species turnover patterns within large drainage basins and relating these patterns to environmental predictors collected at reach and upstream catchment scales; the second uses regionalized hierarchical classification to create multi-scale, spatially homogenous aquatic ecoregions by grouping adjacent catchments together based on environmental similarities; and the third approach groups sites together on the basis of similarities in their environmental conditions both within and between catchments, independent of their geographic location. We review the literature with a focus on more recent classifications to examine the strengths and weaknesses of the different approaches. We identify gaps or problems with the current approaches, and we propose an eight-step heuristic process that may assist with development of more flexible and integrated aquatic classifications based on the current understanding, network thinking, and theoretical underpinnings. PMID:24464177

  13. One or Two Dimensions in Spontaneous Classification: A Simplicity Approach

    ERIC Educational Resources Information Center

    Pothos, Emmanuel M.; Close, James

    2008-01-01

    When participants are asked to spontaneously categorize a set of items, they typically produce unidimensional classifications, i.e., categorize the items on the basis of only one of their dimensions of variation. We examine whether it is possible to predict unidimensional vs. two-dimensional classification on the basis of the abstract stimulus…

  14. Narrowing historical uncertainty: probabilistic classification of ambiguously identified tree species in historical forest survey data

    USGS Publications Warehouse

    Mladenoff, D.J.; Dahir, S.E.; Nordheim, E.V.; Schulte, L.A.; Guntenspergen, G.R.

    2002-01-01

    Historical data have increasingly become appreciated for insight into the past conditions of ecosystems. Uses of such data include assessing the extent of ecosystem change; deriving ecological baselines for management, restoration, and modeling; and assessing the importance of past conditions on the composition and function of current systems. One historical data set of this type is the Public Land Survey (PLS) of the United States General Land Office, which contains data on multiple tree species, sizes, and distances recorded at each survey point, located at half-mile (0.8 km) intervals on a 1-mi (1.6 km) grid. This survey method was begun in the 1790s on US federal lands extending westward from Ohio. Thus, the data have the potential of providing a view of much of the US landscape from the mid-1800s, and they have been used extensively for this purpose. However, historical data sources, such as those describing the species composition of forests, can often be limited in the detail recorded and the reliability of the data, since the information was often not originally recorded for ecological purposes. Forest trees are sometimes recorded ambiguously, using generic or obscure common names. For the PLS data of northern Wisconsin, USA, we developed a method to classify ambiguously identified tree species using logistic regression analysis, using data on trees that were clearly identified to species and a set of independent predictor variables to build the models. The models were first created on partial data sets for each species and then tested for fit against the remaining data. Validations were conducted using repeated, random subsets of the data. Model prediction accuracy ranged from 81% to 96% in differentiating congeneric species among oak, pine, ash, maple, birch, and elm. Major predictor variables were tree size, associated species, landscape classes indicative of soil type, and spatial location within the study region. Results help to clarify ambiguities
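
    A rough sketch of the build-and-validate scheme described here, with invented predictors and a generic logistic regression in place of the authors' species-specific models; accuracy is assessed on repeated random subsets, as in the validation described.

```python
# Sketch: classify ambiguously identified trees with logistic regression,
# validating on repeated random subsets of the clearly identified records.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(4)
n = 1000
# Hypothetical predictors: tree diameter, landscape/soil class, x/y location.
X = np.column_stack([rng.normal(30, 10, n), rng.integers(0, 4, n),
                     rng.uniform(0, 100, n), rng.uniform(0, 100, n)])
species = (X[:, 0] + 5 * (X[:, 1] == 2) + 0.1 * X[:, 2]
           + rng.normal(0, 5, n) > 40).astype(int)   # e.g. two congeneric oaks

accuracies = []
for seed in range(20):                                # repeated random subsets
    X_tr, X_te, y_tr, y_te = train_test_split(X, species, test_size=0.3,
                                              random_state=seed)
    model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    accuracies.append(model.score(X_te, y_te))
print(f"mean validation accuracy: {np.mean(accuracies):.2%}")
```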

  15. Single-cell approaches for molecular classification of endocrine tumors

    PubMed Central

    Koh, James; Allbritton, Nancy L.; Sosa, Julie A.

    2015-01-01

    Purpose of review In this review, we summarize recent developments in single-cell technologies that can be employed for the functional and molecular classification of endocrine cells in normal and neoplastic tissue. Recent findings The emergence of new platforms for the isolation, analysis, and dynamic assessment of individual cell identity and reactive behavior enables experimental deconstruction of intratumoral heterogeneity and other contexts, where variability in cell signaling and biochemical responsiveness inform biological function and clinical presentation. These tools are particularly appropriate for examining and classifying endocrine neoplasias, as the clinical sequelae of these tumors are often driven by disrupted hormonal responsiveness secondary to compromised cell signaling. Single-cell methods allow for multidimensional experimental designs incorporating both spatial and temporal parameters with the capacity to probe dynamic cell signaling behaviors and kinetic response patterns dependent upon sequential agonist challenge. Summary Intratumoral heterogeneity in the provenance, composition, and biological activity of different forms of endocrine neoplasia presents a significant challenge for prognostic assessment. Single-cell technologies provide an array of powerful new approaches uniquely well suited for dissecting complex endocrine tumors. Studies examining the relationship between clinical behavior and tumor compositional variations in cellular activity are now possible, providing new opportunities to deconstruct the underlying mechanisms of endocrine neoplasia. PMID:26632769

  16. An approach for combining multiple descriptors for image classification

    NASA Astrophysics Data System (ADS)

    Tran, Duc Toan; Jansen, Bart; Deklerck, Rudi; Debeir, Olivier

    2015-02-01

    Recently, efficient image descriptors have shown promise for image classification tasks. Moreover, methods based on the combination of multiple image features provide better performance compared to methods based on a single feature. This work presents a simple and efficient approach for combining multiple image descriptors. We first employ a Naive-Bayes Nearest-Neighbor scheme to evaluate four widely used descriptors. For all features, "Image-to-Class" distances are directly computed without descriptor quantization. Since distances measured by different metrics can be of different nature and they may not be on the same numerical scale, a normalization step is essential to transform these distances into a common domain prior to combining them. Our experiments conducted on a challenging database indicate that z-score normalization followed by a simple sum of distances fusion technique can significantly improve the performance compared to applications in which individual features are used. It was also observed that our experimental results on the Caltech 101 dataset outperform other previous results.
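
    The fusion rule described here amounts to a few lines of array arithmetic. The sketch below assumes each descriptor already yields an image-by-class "Image-to-Class" distance matrix (random numbers here) and standardizes each query's distance vector per descriptor before summing; this is one reasonable reading of the z-score step, not the authors' exact implementation.

```python
# Sketch: combine several descriptors by z-scoring their Image-to-Class
# distances and summing them; the predicted class minimises the fused distance.
import numpy as np

rng = np.random.default_rng(5)
n_images, n_classes = 10, 101
# One distance matrix per descriptor (rows: query images, cols: classes),
# deliberately on very different numerical scales.
distances = [rng.gamma(shape=2.0, scale=s, size=(n_images, n_classes))
             for s in (1.0, 10.0, 0.1, 5.0)]

def zscore(d):
    # Normalise each query image's distance vector to zero mean, unit std.
    return (d - d.mean(axis=1, keepdims=True)) / d.std(axis=1, keepdims=True)

fused = sum(zscore(d) for d in distances)             # simple sum fusion
predicted_class = fused.argmin(axis=1)
print(predicted_class)
```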

  17. A Tree-based Approach for Modelling Interception Loss From Evergreen Oak Mediterranean Savannas

    NASA Astrophysics Data System (ADS)

    Pereira, F. L.; Gash, J. H.; David, J. S.; David, T. S.; Monteiro, P. R.; Valente, F.

    2009-05-01

    In sparse forests, trees occur as widely spaced individuals rather than as a continuous forest canopy. Therefore, interception loss for this vegetation type can be more adequately modelled if the overall forest evaporation is derived by scaling up the evaporation from individual trees. Evaporation rate for a single tree can be estimated using a simple Dalton-type diffusion equation for water vapour as long as its surface temperature is known. From theory, this temperature is shown to be dependent upon the available energy and windspeed. However, surface temperature of a fully saturated tree crown, under rainy conditions, will approach the wet bulb temperature as the energy input to the tree reduces to zero. This was experimentally confirmed from measurements of the radiation balance and surface temperature of an isolated tree crown. Thus, evaporation of intercepted rainfall can be estimated using an equation which only requires knowledge of the air dry and wet bulb temperatures and of the bulk tree crown aerodynamic conductance. This was taken as the basis of a new approach for modelling interception loss from savanna-type woodland: first, the aforementioned equation was combined with the Gash's analytical model to estimate interception loss from isolated trees; second, interception loss was scaled up to the entire forest accounting for the canopy cover fraction. This modelling approach was tested using data from two Mediterranean savanna-type oak woodlands in southern Portugal. For both sites, simulated interception loss agreed well with the observations indicating the adequacy of this new methodology for modelling interception loss by isolated trees in savanna-type ecosystems. Furthermore, the proposed approach is physically based and only requires a limited amount of data.
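
    A numerical sketch of the Dalton-type estimate for a saturated crown assumed to sit at the wet-bulb temperature. Tetens' saturation-vapour-pressure formula, the psychrometric constant, air density, specific heat, latent heat and the conductance value are standard textbook assumptions, not values from the paper.

```python
# Sketch: Dalton-type evaporation rate from a saturated tree crown assumed to
# be at the wet-bulb temperature.  All parameter values are illustrative.
import math

RHO_A  = 1.2       # air density, kg m-3
CP     = 1010.0    # specific heat of air, J kg-1 K-1
GAMMA  = 0.066     # psychrometric constant, kPa K-1
LAMBDA = 2.45e6    # latent heat of vaporisation, J kg-1

def e_sat(t_c):
    """Saturation vapour pressure (kPa) from Tetens' formula."""
    return 0.6108 * math.exp(17.27 * t_c / (t_c + 237.3))

def crown_evaporation(t_dry, t_wet, g_a):
    """Evaporation per unit crown area (kg m-2 s-1, i.e. ~mm s-1) for a saturated
    crown at the wet-bulb temperature, with bulk aerodynamic conductance g_a (m s-1)."""
    e_air = e_sat(t_wet) - GAMMA * (t_dry - t_wet)     # psychrometer equation
    vapour_gradient = e_sat(t_wet) - e_air             # surface-to-air vapour gradient
    return RHO_A * CP * g_a * vapour_gradient / (GAMMA * LAMBDA)

# Example: 15 C dry bulb, 13 C wet bulb, g_a = 0.3 m s-1 (illustrative value).
e = crown_evaporation(15.0, 13.0, 0.3)
print(f"evaporation ~ {e * 3600:.2f} mm per hour of saturated conditions")
```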

  18. Impacts of age-dependent tree sensitivity and dating approaches on dendrogeomorphic time series of landslides

    NASA Astrophysics Data System (ADS)

    Šilhán, Karel; Stoffel, Markus

    2015-05-01

    Different approaches and thresholds have been utilized in the past to date landslides with growth ring series of disturbed trees. Past work was mostly based on conifer species because of their well-defined ring boundaries and the easy identification of compression wood after stem tilting. More recently, work has been expanded to include broad-leaved trees, which are thought to produce fewer and less evident reactions after landsliding. This contribution reviews recent progress made in dendrogeomorphic landslide analysis and introduces a new approach in which landslides are dated via ring eccentricity formed after tilting. We compare results of this new and the more conventional approaches. In addition, the paper also addresses tree sensitivity to landslide disturbance as a function of tree age and trunk diameter using 119 common beech (Fagus sylvatica L.) and 39 Crimean pine (Pinus nigra ssp. pallasiana) trees growing on two landslide bodies. The landslide events reconstructed with the classical approach (reaction wood) also appear as events in the eccentricity analysis, but the inclusion of eccentricity clearly allowed for more (162%) landslides to be detected in the tree-ring series. With respect to tree sensitivity, conifers and broad-leaved trees show the strongest reactions to landslides at ages between 40 and 60 years, with a second phase of increased sensitivity in P. nigra at ages of ca. 120-130 years. These phases of highest sensitivities correspond with trunk diameters at breast height of 6-8 and 18-22 cm, respectively (P. nigra). This study thus calls for the inclusion of eccentricity analyses in future landslide reconstructions as well as for the selection of trees belonging to different age and diameter classes to allow for a well-balanced and more complete reconstruction of past events.

  19. The Iqmulus Urban Showcase: Automatic Tree Classification and Identification in Huge Mobile Mapping Point Clouds

    NASA Astrophysics Data System (ADS)

    Böhm, J.; Bredif, M.; Gierlinger, T.; Krämer, M.; Lindenberg, R.; Liu, K.; Michel, F.; Sirmacek, B.

    2016-06-01

    Current 3D data capturing as implemented on for example airborne or mobile laser scanning systems is able to efficiently sample the surface of a city by billions of unselective points during one working day. What is still difficult is to extract and visualize meaningful information hidden in these point clouds with the same efficiency. This is where the FP7 IQmulus project enters the scene. IQmulus is an interactive facility for processing and visualizing big spatial data. In this study the potential of IQmulus is demonstrated on a laser mobile mapping point cloud of 1 billion points sampling ~ 10 km of street environment in Toulouse, France. After the data is uploaded to the IQmulus Hadoop Distributed File System, a workflow is defined by the user consisting of retiling the data followed by a PCA driven local dimensionality analysis, which runs efficiently on the IQmulus cloud facility using a Spark implementation. Points scattering in 3 directions are clustered in the tree class, and are separated next into individual trees. Five hours of processing at the 12 node computing cluster results in the automatic identification of 4000+ urban trees. Visualization of the results in the IQmulus fat client helps users to appreciate the results, and developers to identify remaining flaws in the processing workflow.
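
    The PCA-driven local dimensionality analysis can be illustrated on a toy point cloud: the eigenvalues of each point's neighbourhood covariance are turned into linearity/planarity/scattering scores, and points dominated by scattering go to the tree class. The neighbourhood size and labelling rule below are assumptions for the sketch, not the IQmulus workflow.

```python
# Sketch: per-point PCA dimensionality features for separating 3D-scattered
# (vegetation-like) points from planar structures in a point cloud.
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(6)
ground = np.column_stack([rng.uniform(0, 10, 500), rng.uniform(0, 10, 500),
                          rng.normal(0, 0.02, 500)])          # planar patch
crown = rng.normal([5, 5, 3], 0.8, size=(500, 3))             # 3-D scatter
points = np.vstack([ground, crown])

k = 20
_, idx = NearestNeighbors(n_neighbors=k).fit(points).kneighbors(points)

labels = []
for neighbourhood in points[idx]:                  # (k, 3) block per point
    cov = np.cov(neighbourhood.T)
    l3, l2, l1 = np.sort(np.linalg.eigvalsh(cov))  # l1 >= l2 >= l3
    linearity = (l1 - l2) / l1
    planarity = (l2 - l3) / l1
    scattering = l3 / l1
    labels.append(np.argmax([linearity, planarity, scattering]))
labels = np.array(labels)                          # 2 -> scattered ("tree") points
print("points labelled as scattered:", int((labels == 2).sum()), "of", len(points))
```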

  20. Simple, novel approaches to investigating biophysical characteristics of individual mid-latitude deciduous trees

    NASA Astrophysics Data System (ADS)

    Kalibo, Humphrey Wafula

    Forests play a critical role in the functioning of the biosphere and support the livelihoods of millions of people. With increasing anthropogenic influences and looming effects associated with climatic variability, it is crucial that the research community and policy makers take advantage of the capabilities afforded by remote sensing technologies to generate reliable and timely data to support management decisions. Set in the species-rich woodland of Prairie Pines in Lincoln, Nebraska, this research addresses three distinct objectives that could contribute towards forest research and management. First, three supervised classification algorithms were applied to two hyperspectral AISA-Eagle images to evaluate their capability for spectrally identifying selected tree species. The findings show that each algorithm had low to moderate overall classification accuracies (46%-62%), probably due to mixed pixels resulting from pronounced heterogeneity in tree diversity; however, the algorithms could be a rapid means to assess species composition. The second objective is an investigation into how twelve individual morphologically different deciduous trees transmit incoming photosynthetically active radiation (PAR) over the course of the growing season. It was found that more diffuse light was transmitted than direct light, dictated by seasonality, vegetation fraction (VF), and leaf size. In the final objective, VF derived from upward-looking hemispherical photographs of twelve deciduous tree canopies and eight spectral vegetation indices (VIs) calculated from in situ single leaf-level reflectance data were used to investigate whether the VIs could mimic and estimate the temporal patterns of measured VF of each tree over the growing season. The findings show that all the indices accurately depicted the temporal patterns of the photo-derived VF. NDVI and SAVI had the highest correlations (R 2 > 0.7; RMSE 0.7; E > 0.8) and closely mirrored the temporal patterns of VF for nine
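
    Two of the indices mentioned, NDVI and SAVI, are simple band combinations; the sketch below computes them from hypothetical red and near-infrared reflectances, assuming the commonly used soil-adjustment factor L = 0.5 for SAVI.

```python
# Sketch: NDVI and SAVI from red and near-infrared reflectance values.
import numpy as np

red = np.array([0.08, 0.10, 0.12, 0.20])   # hypothetical leaf-level reflectances
nir = np.array([0.45, 0.50, 0.40, 0.30])

ndvi = (nir - red) / (nir + red)
L = 0.5                                     # commonly used soil-adjustment factor
savi = (1 + L) * (nir - red) / (nir + red + L)

for n, s in zip(ndvi, savi):
    print(f"NDVI = {n:.2f}   SAVI = {s:.2f}")
```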

  1. A chloroplast tree for Viburnum (Adoxaceae) and its implications for phylogenetic classification and character evolution.

    PubMed

    Clement, Wendy L; Arakaki, Mónica; Sweeney, Patrick W; Edwards, Erika J; Donoghue, Michael J

    2014-06-13

    • Premise of the study: Despite recent progress, significant uncertainties remain concerning relationships among early-branching lineages within Viburnum (Adoxaceae), prohibiting a new classification and hindering studies of character evolution and the increasing use of Viburnum in addressing a wide range of ecological and evolutionary questions. We hoped to resolve these issues by sequencing whole plastid genomes for representative species and combining these with molecular data previously obtained from an expanded taxon sample. • Methods: We performed paired-end Illumina sequencing of plastid genomes of 22 Viburnum species and combined these data with a 10-gene data set to infer phylogenetic relationships for 113 species. We used the results to devise a comprehensive phylogenetic classification and to analyze the evolution of eight morphological characters that vary among early-branching lineages. • Key results: With greatly increased levels of confidence in most of the early branches, we propose a phylogenetic classification of Viburnum, providing formal phylogenetic definitions for 30 clades, including 13 with names recognized under the International Code of Nomenclature for Algae, Fungi, and Plants, eight with previously proposed informal names, and nine newly proposed names for major branches. Our parsimony reconstructions of bud structure, leaf margins, inflorescence form, ruminate endosperm, extrafloral nectaries, glandular trichomes, palisade anatomy, and pollen exine showed varying levels of homoplasy, but collectively provided morphological support for some, though not all, of the major clades. • Conclusions: Our study demonstrates the value of next-generation plastid sequencing, the ease of creating a formal phylogenetic classification, and the utility of such a system in describing patterns of character evolution. PMID:24928633

  2. Crop classification in the U.S. Corn Belt using MODIS imagery

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Land cover classification is essential in studies of land cover change, climate, hydrology, carbon sequestration and yield prediction. Land cover classification uses pattern recognition technique that includes supervised / unsupervised approaches and decision tree technique. Land cover maps for re...

  3. Multi-temporal remote sensing image classification - a multi-view approach

    SciTech Connect

    Chandola, Varun; Vatsavai, Raju

    2010-01-01

    Multispectral remote sensing images have been widely used for automated land use and land cover classification tasks. Often, thematic classification is done using a single-date image; however, in many instances a single-date image is not informative enough to distinguish between different land cover types. In this paper we show how one can use multiple images, collected at different times of year (for example, during the crop growing season), to learn a better classifier. We propose two approaches, an ensemble of classifiers approach and a co-training based approach, and show how both of these methods outperform a straightforward stacked vector approach often used in multi-temporal image classification. Additionally, the co-training based method addresses the challenge of limited labeled training data in supervised classification, as this classification scheme utilizes a large number of unlabeled samples (which come for free) in conjunction with a small set of labeled training data.
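
    The contrast between a stacked feature vector and an ensemble of per-date classifiers can be sketched as below; the two "dates" are simulated feature blocks, and averaging per-date class probabilities stands in for the ensemble scheme (the co-training variant is not shown).

```python
# Sketch: multi-temporal classification with (a) a stacked feature vector and
# (b) an ensemble of per-date classifiers whose probabilities are averaged.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Two acquisition dates, 6 spectral bands each (date 2 = date 1 plus noise).
X1, y = make_classification(n_samples=600, n_features=6, n_informative=4, random_state=0)
X2 = X1 + np.random.default_rng(0).normal(0, 1.0, X1.shape)
train, test = train_test_split(np.arange(600), test_size=0.3, random_state=0)

# (a) stacked vector: concatenate both dates into one 12-band feature vector
stacked = np.hstack([X1, X2])
clf_stacked = RandomForestClassifier(random_state=0).fit(stacked[train], y[train])
acc_stacked = accuracy_score(y[test], clf_stacked.predict(stacked[test]))

# (b) ensemble: one classifier per date, average the class probabilities
clf1 = RandomForestClassifier(random_state=0).fit(X1[train], y[train])
clf2 = RandomForestClassifier(random_state=0).fit(X2[train], y[train])
proba = (clf1.predict_proba(X1[test]) + clf2.predict_proba(X2[test])) / 2
acc_ensemble = accuracy_score(y[test], proba.argmax(axis=1))

print(f"stacked vector: {acc_stacked:.3f}   per-date ensemble: {acc_ensemble:.3f}")
```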

  4. Assessing College Student Interest in Math and/or Computer Science in a Cross-National Sample Using Classification and Regression Trees

    ERIC Educational Resources Information Center

    Kitsantas, Anastasia; Kitsantas, Panagiota; Kitsantas, Thomas

    2012-01-01

    The purpose of this exploratory study was to assess the relative importance of a number of variables in predicting students' interest in math and/or computer science. Classification and regression trees (CART) were employed in the analysis of survey data collected from 276 college students enrolled in two U.S. and Greek universities. The…

  5. Tree carbon allocation dynamics determined using a carbon mass balance approach.

    PubMed

    Klein, Tamir; Hoch, Günter

    2015-01-01

    Tree internal carbon (C) fluxes between compound and compartment pools are difficult to measure directly. Here we used a C mass balance approach to decipher these fluxes and provide a full description of tree C allocation dynamics. We collected independent measurements of tree C sinks, source and pools in Pinus halepensis in a semi-arid forest, and converted all fluxes to g C per tree d(-1) . Using this data set, a process flowchart was created to describe and quantify the tree C allocation on diurnal to annual time-scales. The annual C source of 24.5 kg C per tree yr(-1) was balanced by C sinks of 23.5 kg C per tree yr(-1) , which partitioned into 70%, 17% and 13% between respiration, growth, and litter (plus export to soil), respectively. Large imbalances (up to 57 g C per tree d(-1) ) were observed as C excess during the wet season, and as C deficit during the dry season. Concurrent changes in C reserves (starch) were sufficient to buffer these transient C imbalances. The C pool dynamics calculated using the flowchart were in general agreement with the observed pool sizes, providing confidence regarding our estimations of the timing, magnitude, and direction of the internal C fluxes. PMID:25157793

  6. Tree carbon allocation dynamics determined using a carbon mass balance approach.

    PubMed

    Klein, Tamir; Hoch, Günter

    2015-01-01

    Tree internal carbon (C) fluxes between compound and compartment pools are difficult to measure directly. Here we used a C mass balance approach to decipher these fluxes and provide a full description of tree C allocation dynamics. We collected independent measurements of tree C sinks, source and pools in Pinus halepensis in a semi-arid forest, and converted all fluxes to g C per tree d(-1) . Using this data set, a process flowchart was created to describe and quantify the tree C allocation on diurnal to annual time-scales. The annual C source of 24.5 kg C per tree yr(-1) was balanced by C sinks of 23.5 kg C per tree yr(-1) , which partitioned into 70%, 17% and 13% between respiration, growth, and litter (plus export to soil), respectively. Large imbalances (up to 57 g C per tree d(-1) ) were observed as C excess during the wet season, and as C deficit during the dry season. Concurrent changes in C reserves (starch) were sufficient to buffer these transient C imbalances. The C pool dynamics calculated using the flowchart were in general agreement with the observed pool sizes, providing confidence regarding our estimations of the timing, magnitude, and direction of the internal C fluxes.

  7. Assessment on the classification of landslide risk level using Genetic Algorithm of Operation Tree in central Taiwan

    NASA Astrophysics Data System (ADS)

    Wei, Chiang; Yeh, Hui-Chung; Chen, Yen-Chang

    2015-04-01

    This study assessed the classification of landslide areas in the Chen-Yu-Lan River upstream watershed of the National Taiwan University Experimental Forest (NTUEF) after Typhoon Morakot in 2009, using the Genetic Algorithm of Operation Tree (GAOT) with remotely sensed and geological data. Landslides covering 624.5 ha, accounting for 1.9% of the total area, were delineated with thresholds of slope (22°) and area size (1 hectare); 48 landslide sites were located in the upstream Chen-Yu-Lan watershed using FORMOSAT-II satellite imagery, aerial photos and GIS-related coverages. The landslide areas were classified into five risk levels by area, elevation, slope order, aspect, erosion order and geological factor order using the Simplicity Method suggested in the Technical Regulations for Soil and Water Conservation of Taiwan. When all the landslide sites were considered, the classification accuracy using GAOT was 97.9%, superior to the K-means, Ward, Shared Nearest Neighbor, Maximum Likelihood and Bayesian classifiers; when 36 sites were used as training samples and the remaining 12 sites were tested, the accuracy still reached 81.3%. More geological data, anthropogenic influences and hydrological factors may be necessary to clarify the landslide areas, and the results should benefit future correction and management by the authorities.

  8. Visual words based approach for tissue classification in mammograms

    NASA Astrophysics Data System (ADS)

    Diamant, Idit; Goldberger, Jacob; Greenspan, Hayit

    2013-02-01

    The presence of Microcalcifications (MC) is an important indicator for developing breast cancer. Additional indicators for cancer risk exist, such as breast tissue density type. Different methods have been developed for breast tissue classification for use in Computer-aided diagnosis systems. Recently, the visual words (VW) model has been successfully applied for different classification tasks. The goal of our work is to explore VW based methodologies for various mammography classification tasks. We start with the challenge of classifying breast density and then focus on classification of normal tissue versus Microcalcifications. The presented methodology is based on patch-based visual words model which includes building a dictionary for a training set using local descriptors and representing the image using a visual word histogram. Classification is then performed using k-nearest-neighbour (KNN) and Support vector machine (SVM) classifiers. We tested our algorithm on the MIAS and DDSM publicly available datasets. The input is a representative region-of-interest per mammography image, manually selected and labelled by expert. In the tissue density task, classification accuracy reached 85% using KNN and 88% using SVM, which competes with the state-of-the-art results. For MC vs. normal tissue, accuracy reached 95.6% using SVM. Results demonstrate the feasibility to classify breast tissue using our model. Currently, we are improving the results further while also investigating VW capability to classify additional important mammogram classification problems. We expect that the methodology presented will enable high levels of classification, suggesting new means for automated tools for mammography diagnosis support.
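
    A compact sketch of the patch-based visual-words pipeline (local descriptors, k-means dictionary, per-image word histogram, SVM); descriptor dimension, dictionary size and classifier settings are arbitrary illustrative choices, and the descriptors are synthetic rather than extracted from mammograms.

```python
# Sketch: bag-of-visual-words pipeline with a k-means dictionary and an SVM.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(7)
n_images, patches_per_image, descr_dim, n_words = 60, 100, 16, 32

# Synthetic local patch descriptors; class 1 images are shifted slightly.
labels = np.repeat([0, 1], n_images // 2)
descriptors = [rng.normal(0.3 * lab, 1.0, size=(patches_per_image, descr_dim))
               for lab in labels]

# 1) build the dictionary from the pooled descriptors (a real evaluation
#    would fit it on training images only).
dictionary = KMeans(n_clusters=n_words, n_init=10, random_state=0)
dictionary.fit(np.vstack(descriptors))

# 2) represent each image as a normalised visual-word histogram.
def histogram(d):
    words = dictionary.predict(d)
    h = np.bincount(words, minlength=n_words).astype(float)
    return h / h.sum()

X = np.array([histogram(d) for d in descriptors])

# 3) classify the histograms (SVM here; KNN would be analogous).
print(cross_val_score(SVC(kernel="rbf"), X, labels, cv=5).mean())
```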

  9. A Tree-based Approach for Modelling Interception Loss From Evergreen Oak Mediterranean Savannas

    NASA Astrophysics Data System (ADS)

    Pereira, Fernando L.; Gash, John H. C.; David, Jorge S.; David, Teresa S.; Monteiro, Paulo R.; Valente, Fernanda

    2010-05-01

    Evaporation of rainfall intercepted by tree canopies is usually an important part of the overall water balance of forested catchments and there have been many studies dedicated to measuring and modelling rainfall interception loss. These studies have mainly been conducted in dense forests; there have been few studies on the very sparse forests which are common in dry and semi-arid areas. Water resources are scarce in these areas making sparse forests particularly important. Methods for modelling interception loss are thus required to support sustainable water management in those areas. In very sparse forests, trees occur as widely spaced individuals rather than as a continuous forest canopy. We therefore suggest that interception loss for this vegetation type can be more adequately modelled if the overall forest evaporation is derived by scaling up the evaporation from individual trees. The evaporation rate for a single tree can be estimated using a simple Dalton-type diffusion equation for water vapour as long as its surface temperature is known. From theory, this temperature is shown to be dependent upon the available energy and windspeed. However, the surface temperature of a fully saturated tree crown, under rainy conditions, should approach the wet bulb temperature as the radiative energy input to the tree reduces to zero. This was experimentally confirmed from measurements of the radiation balance and surface temperature of an isolated tree crown. Thus, evaporation of intercepted rainfall can be estimated using an equation which only requires knowledge of the air dry and wet bulb temperatures and of the bulk tree-crown aerodynamic conductance. This was taken as the basis of a new approach for modelling interception loss from savanna-type woodland, i.e. by combining the Dalton-type equation with the Gash's analytical model to estimate interception loss from isolated trees. This modelling approach was tested using data from two Mediterranean savanna-type oak

  10. Multi-level basis selection of wavelet packet decomposition tree for heart sound classification.

    PubMed

    Safara, Fatemeh; Doraisamy, Shyamala; Azman, Azreen; Jantan, Azrul; Abdullah Ramaiah, Asri Ranga

    2013-10-01

    The wavelet packet transform decomposes a signal into a set of orthonormal bases (nodes) and provides opportunities to select an appropriate set of these bases for feature extraction. In this paper, multi-level basis selection (MLBS) is proposed to preserve the most informative bases of a wavelet packet decomposition tree by removing less informative bases through three exclusion criteria: frequency range, noise frequency, and energy threshold. MLBS achieved an accuracy of 97.56% for classifying normal heart sound, aortic stenosis, mitral regurgitation, and aortic regurgitation. MLBS is thus a promising basis selection method for signals with a small range of frequencies.
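
    A minimal sketch of the energy-threshold part of such basis selection, using PyWavelets, is given below; the frequency-range and noise-frequency criteria of MLBS, the wavelet family and all thresholds are illustrative assumptions.

        # Energy-based node selection from a wavelet packet tree (PyWavelets).
        # Only one of the three MLBS exclusion criteria is sketched here.
        import numpy as np
        import pywt

        def select_nodes_by_energy(signal, wavelet="db4", level=4, energy_frac=0.05):
            wp = pywt.WaveletPacket(data=signal, wavelet=wavelet, maxlevel=level)
            nodes = wp.get_level(level, order="freq")        # leaf nodes at max level
            energies = np.array([np.sum(np.square(n.data)) for n in nodes])
            keep = energies >= energy_frac * energies.sum()  # drop low-energy bases
            return [n.path for n, k in zip(nodes, keep) if k], energies[keep]

        # kept_paths, features = select_nodes_by_energy(heart_sound, level=5)
        # The retained node energies can then serve as features for a classifier.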

  11. New Approaches to Object Classification in Synoptic Sky Surveys

    SciTech Connect

    Donalek, C.; Mahabal, A.; Djorgovski, S. G.; Marney, S.; Drake, A.; Glikman, E.; Graham, M. J.; Williams, R.

    2008-12-05

    Digital synoptic sky surveys pose several new object classification challenges. In surveys where real-time detection and classification of transient events is a science driver, there is a need for an effective elimination of instrument-related artifacts which can masquerade as transient sources in the detection pipeline, e.g., unremoved large cosmic rays, saturation trails, reflections, crosstalk artifacts, etc. We have implemented such an Artifact Filter, using a supervised neural network, for the real-time processing pipeline in the Palomar-Quest (PQ) survey. After the training phase, for each object it takes as input a set of measured morphological parameters and returns the probability of it being a real object. Despite the relatively low number of training cases for many kinds of artifacts, the overall artifact classification rate is around 90%, with no genuine transients misclassified during our real-time scans. Another question is how to assign an optimal star-galaxy classification in a multi-pass survey, where seeing and other conditions change between different epochs, potentially producing inconsistent classifications for the same object. We have implemented a star/galaxy multipass classifier that makes use of external and a priori knowledge to find the optimal classification from the individually derived ones. Both these techniques can be applied to other, similar surveys and data sets.
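
    The artifact filter described above amounts to a small supervised network that maps measured morphological parameters to a probability of being a real object. A hedged sketch follows; the actual features, architecture and training regime of the PQ pipeline are not given here.

        # Sketch of an artifact filter: morphological parameters in, P(real) out.
        # Features and architecture are illustrative, not those of the PQ pipeline.
        from sklearn.neural_network import MLPClassifier
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import StandardScaler

        def train_artifact_filter(X, y):
            """X: (n_detections, n_morphological_params); y: 1 = real, 0 = artifact."""
            clf = make_pipeline(StandardScaler(),
                                MLPClassifier(hidden_layer_sizes=(16,),
                                              max_iter=2000, random_state=0))
            return clf.fit(X, y)

        # p_real = train_artifact_filter(X_train, y_train).predict_proba(X_new)[:, 1]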

  12. Discrimination and classification of olive tree varieties and cultivation zones by biophenol contents.

    PubMed

    Japón-Lujan, R; Ruiz-Jiménez, J; de Castro, M D Luque

    2006-12-27

    The peak areas from a high-performance liquid chromatography-diode array (HPLC-DAD) analysis of biophenols extracted from olive leaves have been used as chemotaxonomic markers to construct chemometric models in order to discriminate and classify (1) 13 varieties of Olea europaea olive trees, namely, Alameño, Arbequina, Azulillo, Chorna, Hojiblanca, Lechín, Manzanillo, Negrillo, Nevadillo, Ocal, Pierra, Sevillano, and Tempranillo, from the same cultivation zone and (2) Arbequina samples from six different geographical origins, namely, Córdoba, Mallorca (north and south), Ciudad Real, Lleida, and Navarra. Models based on principal component analysis (PCA) and hierarchical cluster analysis (HCA) were used to discriminate between samples as a function of tree variety and cultivation zone, whereas K nearest neighbors (KNN) and soft independent modeling of class analogy (SIMCA) models were generated to classify the validation samples into one of the groups previously established by PCA and HCA. KNN correctly classified 93% and 92% of the samples by variety and cultivation zone, respectively, while the SIMCA models correctly predicted 85% and 92%, respectively.
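
    The exploratory-plus-supervised workflow (PCA/HCA for discrimination, KNN for classification of validation samples) can be sketched as follows; the numbers of components and neighbours are illustrative assumptions.

        # PCA for unsupervised structure, KNN for supervised classification of
        # HPLC-DAD peak areas (component and neighbour counts are illustrative).
        from sklearn.decomposition import PCA
        from sklearn.neighbors import KNeighborsClassifier
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import StandardScaler

        def variety_classifier(X, y, n_components=5, k=3):
            """X: (n_samples, n_peak_areas); y: variety or cultivation-zone label."""
            model = make_pipeline(StandardScaler(),
                                  PCA(n_components=n_components),
                                  KNeighborsClassifier(n_neighbors=k))
            return model.fit(X, y)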

  13. RAVEN. Dynamic Event Tree Approach Level III Milestone

    SciTech Connect

    Alfonsi, Andrea; Rabiti, Cristian; Mandelli, Diego; Cogliati, Joshua; Kinoshita, Robert

    2014-07-01

    Conventional Event-Tree (ET) based methodologies are extensively used as tools to perform reliability and safety assessment of complex and critical engineering systems. One of the disadvantages of these methods is that timing/sequencing of events and system dynamics are not explicitly accounted for in the analysis. In order to overcome these limitations several techniques, also known as Dynamic Probabilistic Risk Assessment (DPRA), have been developed. Monte-Carlo (MC) and Dynamic Event Tree (DET) are two of the most widely used DPRA methodologies to perform safety assessment of Nuclear Power Plants (NPP). In the past two years, the Idaho National Laboratory (INL) has developed its own tool to perform Dynamic PRA: RAVEN (Reactor Analysis and Virtual control ENvironment). RAVEN has been designed to perform two main tasks: 1) control logic driver for the new Thermo-Hydraulic code RELAP-7 and 2) post-processing tool. In the first task, RAVEN acts as a deterministic controller in which the set of control logic laws (user defined) monitors the RELAP-7 simulation and controls the activation of specific systems. Moreover, the control logic infrastructure is used to model stochastic events, such as component failures, and perform uncertainty propagation. Such stochastic modeling is deployed using both MC and DET algorithms. In the second task, RAVEN processes the large amount of data generated by RELAP-7 using data-mining based algorithms. This report focuses on the analysis of dynamic stochastic systems using the newly developed RAVEN DET capability. As an example, a DPRA analysis, using DET, of a simplified pressurized water reactor for a Station Black-Out (SBO) scenario is presented.
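
    The branching idea behind a dynamic event tree can be illustrated with a toy sketch: at each stochastic demand, the simulated branch splits into success and failure children that carry conditional probabilities. This is purely illustrative and is not RAVEN or RELAP-7 code; all names and probabilities are hypothetical.

        # Toy dynamic event tree: split a branch at the next component demand
        # into success/failure children with conditional probabilities.
        from dataclasses import dataclass, field

        @dataclass
        class Branch:
            time: float
            prob: float
            history: list = field(default_factory=list)

        def expand(branch, demands, fail_prob):
            """demands: time-ordered list of (time, component); returns children."""
            for t, comp in demands:
                if t <= branch.time:
                    continue
                return [Branch(t, branch.prob * p,
                               branch.history + [(t, comp, outcome)])
                        for outcome, p in (("ok", 1 - fail_prob), ("fail", fail_prob))]
            return []   # no further demands: the branch terminates

        # root = Branch(0.0, 1.0)
        # children = expand(root, [(10.0, "auxiliary feedwater pump")], 0.01)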

  14. RAVEN: Dynamic Event Tree Approach Level III Milestone

    SciTech Connect

    Andrea Alfonsi; Cristian Rabiti; Diego Mandelli; Joshua Cogliati; Robert Kinoshita

    2013-07-01

    Conventional Event-Tree (ET) based methodologies are extensively used as tools to perform reliability and safety assessment of complex and critical engineering systems. One of the disadvantages of these methods is that timing/sequencing of events and system dynamics are not explicitly accounted for in the analysis. In order to overcome these limitations several techniques, also known as Dynamic Probabilistic Risk Assessment (DPRA), have been developed. Monte-Carlo (MC) and Dynamic Event Tree (DET) are two of the most widely used DPRA methodologies to perform safety assessment of Nuclear Power Plants (NPP). In the past two years, the Idaho National Laboratory (INL) has developed its own tool to perform Dynamic PRA: RAVEN (Reactor Analysis and Virtual control ENvironment). RAVEN has been designed to perform two main tasks: 1) control logic driver for the new Thermo-Hydraulic code RELAP-7 and 2) post-processing tool. In the first task, RAVEN acts as a deterministic controller in which the set of control logic laws (user defined) monitors the RELAP-7 simulation and controls the activation of specific systems. Moreover, the control logic infrastructure is used to model stochastic events, such as component failures, and perform uncertainty propagation. Such stochastic modeling is deployed using both MC and DET algorithms. In the second task, RAVEN processes the large amount of data generated by RELAP-7 using data-mining based algorithms. This report focuses on the analysis of dynamic stochastic systems using the newly developed RAVEN DET capability. As an example, a DPRA analysis, using DET, of a simplified pressurized water reactor for a Station Black-Out (SBO) scenario is presented.

  15. Idiopathic interstitial pneumonias and emphysema: detection and classification using a texture-discriminative approach

    NASA Astrophysics Data System (ADS)

    Fetita, C.; Chang-Chien, K. C.; Brillet, P. Y.; Prêteux, F.; Chang, R. F.

    2012-03-01

    Our study aims at developing a computer-aided diagnosis (CAD) system for fully automatic detection and classification of pathological lung parenchyma patterns in idiopathic interstitial pneumonias (IIP) and emphysema using multi-detector computed tomography (MDCT). The proposed CAD system is based on three-dimensional (3-D) mathematical morphology, texture and fuzzy logic analysis, and can be divided into four stages: (1) a multi-resolution decomposition scheme based on a 3-D morphological filter was exploited to discriminate the lung region patterns at different analysis scales; (2) an additional spatial lung partitioning based on the lung tissue texture was introduced to reinforce the spatial separation between patterns extracted at the same resolution level in the decomposition pyramid; (3) a hierarchic tree structure was exploited to describe the relationship between patterns at different resolution levels, and, for each pattern, six fuzzy membership functions were established for assigning a probability of association with normal tissue or a pathological target; and finally (4) a decision step exploiting the fuzzy-logic assignments selects the target class of each lung pattern among the following categories: normal (N), emphysema (EM), fibrosis/honeycombing (FHC), and ground glass (GDG). According to a preliminary evaluation on an extended database, the proposed method can overcome the drawbacks of a previously developed approach and achieve higher sensitivity and specificity.

  16. Text Categorization Based on K-Nearest Neighbor Approach for Web Site Classification.

    ERIC Educational Resources Information Center

    Kwon, Oh-Woog; Lee, Jong-Hyeok

    2003-01-01

    Discusses text categorization and Web site classification and proposes a three-step classification system that includes the use of Web pages linked with the home page. Highlights include the k-nearest neighbor (k-NN) approach; improving performance with a feature selection method and a term weighting scheme using HTML tags; and similarity…
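
    A minimal k-NN text categorizer over TF-IDF vectors is sketched below; the HTML-tag term weighting and the use of linked pages described in the record are not reproduced, and the parameter choices are assumptions.

        # k-NN text categorization over TF-IDF vectors with cosine distance.
        # The HTML-tag weighting and linked-page features are omitted here.
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.neighbors import KNeighborsClassifier
        from sklearn.pipeline import make_pipeline

        def knn_text_classifier(docs, labels, k=5):
            return make_pipeline(
                TfidfVectorizer(stop_words="english"),
                KNeighborsClassifier(n_neighbors=k, metric="cosine"),
            ).fit(docs, labels)

        # predicted = knn_text_classifier(train_pages, train_categories).predict(new_pages)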

  17. Bayesian Decision Tree for the Classification of the Mode of Motion in Single-Molecule Trajectories

    PubMed Central

    Türkcan, Silvan; Masson, Jean-Baptiste

    2013-01-01

    Membrane proteins move in heterogeneous environments with spatially (and sometimes temporally) varying friction and with biochemical interactions with various partners. It is important to reliably distinguish different modes of motion to improve our knowledge of the membrane architecture and to understand the nature of interactions between membrane proteins and their environments. Here, we present an analysis technique for single-molecule tracking (SMT) trajectories that determines the model of motion that best matches observed trajectories. The method is based on Bayesian inference to calculate the posterior probability of an observed trajectory under a given model. Information criteria, such as the Bayesian information criterion (BIC), the Akaike information criterion (AIC), and the modified AIC (AICc), are used to select the preferred model. The considered group of models includes free Brownian motion and confined motion in 2nd- or 4th-order potentials. We determined the best information criteria for classifying trajectories, tested their limits through simulations matching large sets of experimental conditions, and built a decision tree. This decision tree first uses the BIC to distinguish between free Brownian motion and confined motion. In a second step, it classifies the confining potential further using the AIC. We apply the method to experimental Clostridium perfringens ε-toxin (CPεT) receptor trajectories to show that these receptors are confined by a spring-like potential. An adaptation of this technique was applied on a sliding window in the temporal dimension along the trajectory. We applied this adaptation to experimental CPεT trajectories that lose confinement due to disaggregation of confining domains. This new technique adds another dimension to the discussion of SMT data. The mode of motion of a receptor might hold more biologically relevant information than the diffusion coefficient or domain size and may be a better tool to
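
    For reference, the information criteria named above take their standard forms (k free parameters, n data points, \hat{L} the maximized likelihood); the decision thresholds used in the study are not reproduced here:

        \mathrm{BIC} = k \ln n - 2 \ln \hat{L}, \qquad
        \mathrm{AIC} = 2k - 2 \ln \hat{L}, \qquad
        \mathrm{AICc} = \mathrm{AIC} + \frac{2k(k+1)}{n - k - 1},

    with the lowest criterion value indicating the preferred model of motion.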

  18. Machine Learning Approaches for High-resolution Urban Land Cover Classification: A Comparative Study

    SciTech Connect

    Vatsavai, Raju; Chandola, Varun; Cheriyadat, Anil M; Bright, Eddie A; Bhaduri, Budhendra L; Graesser, Jordan B

    2011-01-01

    The proliferation of several machine learning approaches makes it difficult to identify a suitable classification technique for analyzing high-resolution remote sensing images. In this study, ten classification techniques from five broad machine learning categories were compared. Surprisingly, the performance of simple statistical classifiers such as maximum likelihood and logistic regression is very close to that of more complex and recent techniques. Given that these two classifiers require little input from the user, they should still be considered for most classification tasks. Multiple classifier systems are a good choice if resources permit.

  19. Classification of EEG for Affect Recognition: An Adaptive Approach

    NASA Astrophysics Data System (ADS)

    Alzoubi, Omar; Calvo, Rafael A.; Stevens, Ronald H.

    Research on affective computing is growing rapidly and new applications are being developed more frequently. They use information about the affective/mental states of users to adapt their interfaces or add new functionalities. Face activity, voice, text, physiology and other information about the user are used as input to affect recognition modules, which are built as classification algorithms. Brain EEG signals have rarely been used to build such classifiers due to the lack of a clear theoretical framework. We present here an evaluation of three different classification techniques and their adaptive variations on a 10-class emotion recognition experiment. Our results show that affect recognition from EEG signals might be possible and that an adaptive algorithm improves the performance of the classification task.

  20. Neural network approach to classification of infrasound signals

    NASA Astrophysics Data System (ADS)

    Lee, Dong-Chang

    As part of the International Monitoring System of the Preparatory Commission for the Comprehensive Nuclear-Test-Ban Treaty Organization, the Infrasound Group at the University of Alaska Fairbanks maintains and operates two infrasound stations to monitor global nuclear activity. In addition, the group specializes in detecting and classifying the man-made and naturally produced signals recorded at both stations by computing various characterization parameters (e.g. mean of the cross-correlation maxima, trace velocity, direction of arrival, and planarity values) using an in-house developed weighted least-squares algorithm. Classifying commonly observed low-frequency (0.015--0.1 Hz) signals at our stations, namely mountain-associated waves and high trace-velocity signals, using a traditional approach (e.g. analysis of power spectral density) presents a problem. Such signals can be separated statistically by setting a window on the trace-velocity estimate for each signal type, and the feasibility of this technique is demonstrated by displaying and comparing various summary plots (e.g. universal, seasonal and azimuthal variations) produced by analyzing infrasound data (2004--2007) from the Fairbanks and Antarctic arrays. Such plots, together with magnetic activity information (from the College International Geophysical Observatory located at Fairbanks, Alaska), point to possible physical sources of the two signal types. Throughout this thesis a newly developed robust algorithm (sum of squares of variance ratios) with improved detection quality (under low signal-to-noise ratios) over two well-known detection algorithms (mean of the cross-correlation maxima and Fisher statistics) is investigated for its efficacy as a new detector. A neural network is examined for its ability to automatically classify the two signals described above against clutter (spurious signals with common characteristics). Four identical perceptron networks are trained and validated (with

  1. Genome trees constructed using five different approaches suggest new major bacterial clades

    PubMed Central

    Wolf, Yuri I; Rogozin, Igor B; Grishin, Nick V; Tatusov, Roman L; Koonin, Eugene V

    2001-01-01

    Background The availability of multiple complete genome sequences from diverse taxa prompts the development of new phylogenetic approaches, which attempt to incorporate information derived from comparative analysis of complete gene sets or large subsets thereof. Such attempts are particularly relevant because of the major role of horizontal gene transfer and lineage-specific gene loss, at least in the evolution of prokaryotes. Results Five largely independent approaches were employed to construct trees for completely sequenced bacterial and archaeal genomes: i) presence-absence of genomes in clusters of orthologous genes; ii) conservation of local gene order (gene pairs) among prokaryotic genomes; iii) parameters of identity distribution for probable orthologs; iv) analysis of concatenated alignments of ribosomal proteins; v) comparison of trees constructed for multiple protein families. All constructed trees support the separation of the two primary prokaryotic domains, bacteria and archaea, as well as some terminal bifurcations within the bacterial and archaeal domains. Beyond these obvious groupings, the trees made with different methods appeared to differ substantially in terms of the relative contributions of phylogenetic relationships and similarities in gene repertoires caused by similar life styles and horizontal gene transfer to the tree topology. The trees based on presence-absence of genomes in orthologous clusters and the trees based on conserved gene pairs appear to be strongly affected by gene loss and horizontal gene transfer. The trees based on identity distributions for orthologs and particularly the tree made of concatenated ribosomal protein sequences seemed to carry a stronger phylogenetic signal. The latter tree supported three potential high-level bacterial clades: i) Chlamydia-Spirochetes, ii) Thermotogales-Aquificales (bacterial hyperthermophiles), and iii) Actinomycetes-Deinococcales-Cyanobacteria. The latter group also appeared to join the

  2. Using hydrogeomorphic criteria to classify wetlands on Mt. Desert Island, Maine - approach, classification system, and examples

    USGS Publications Warehouse

    Nielsen, Martha G.; Guntenspergen, Glenn R.; Neckles, Hilary A.

    2005-01-01

    A wetland classification system was designed for Mt. Desert Island, Maine, to help categorize the large number of wetlands (over 1,200 mapped units) as an aid to understanding their hydrologic functions. The classification system, developed by the U.S. Geological Survey (USGS), in cooperation with the National Park Service, uses a modified hydrogeomorphic (HGM) approach, and assigns categories based on position in the landscape, soils and surficial geologic setting, and source of water. A dichotomous key was developed to determine a preliminary HGM classification of wetlands on the island. This key is designed for use with USGS topographic maps and 1:24,000 geographic information system (GIS) coverages as an aid to the classification, but may also be used with field data. Hydrologic data collected from a wetland monitoring study were used to determine whether the preliminary classification of individual wetlands using the HGM approach yielded classes that were consistent with actual hydroperiod data. Preliminary HGM classifications of the 20 wetlands in the monitoring study were consistent with the field hydroperiod data. The modified HGM classification approach appears robust, although the method apparently works somewhat better with undisturbed wetlands than with disturbed wetlands. This wetland classification system could be applied to other hydrogeologically similar areas of northern New England.

  3. Oregon Hydrologic Landscapes: An Approach for Broadscale Hydrologic Classification

    EPA Science Inventory

    Gaged streams represent only a small percentage of watershed hydrologic conditions throughout the United States and the globe, but there is a growing need for hydrologic classification systems that can serve as the foundation for broad-scale assessments of the hydrologic functions of...

  4. Eating Disorder Diagnoses: Empirical Approaches to Classification

    ERIC Educational Resources Information Center

    Wonderlich, Stephen A.; Joiner, Thomas E., Jr.; Keel, Pamela K.; Williamson, Donald A.; Crosby, Ross D.

    2007-01-01

    Decisions about the classification of eating disorders have significant scientific and clinical implications. The eating disorder diagnoses in the Diagnostic and Statistical Manual of Mental Disorders (4th ed.; DSM-IV; American Psychiatric Association, 1994) reflect the collective wisdom of experts in the field but are frequently not supported in…

  5. Comparison of Sub-pixel Classification Approaches for Crop-specific Mapping

    EPA Science Inventory

    The Moderate Resolution Imaging Spectroradiometer (MODIS) data has been increasingly used for crop mapping and other agricultural applications. Phenology-based classification approaches using the NDVI (Normalized Difference Vegetation Index) 16-day composite (250 m) data product...

  6. Data-Driven Multimodal Sleep Apnea Events Detection: Synchrosqueezing Transform Processing and Riemannian Geometry Classification Approaches.

    PubMed

    Rutkowski, Tomasz M

    2016-07-01

    A novel multimodal and bio-inspired approach to biomedical signal processing and classification is presented in the paper. This approach allows for automatic semantic labeling (interpretation) of sleep apnea events based on the proposed data-driven biomedical signal processing and classification. The presented signal processing and classification methods have already been successfully applied to real-time unimodal brainwave (EEG-only) decoding in brain-computer interfaces developed by the author. In the current project, very encouraging results are obtained using multimodal biomedical (brainwave and peripheral physiological) signals in a unified processing approach allowing for automatic semantic data description. The results thus support the hypothesis that the data-driven and bio-inspired signal processing approach is valid for semantic interpretation of medical data based on machine-learning classification of sleep apnea events. PMID:27194241
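
    A simplified sketch of the covariance-plus-minimum-distance-to-mean idea behind Riemannian-geometry classification is given below; for brevity it uses a log-Euclidean metric rather than the full affine-invariant one, and it is not the implementation used in the paper.

        # Covariance-based minimum-distance-to-mean classification of multimodal
        # signal windows, using a log-Euclidean metric as a simple stand-in for
        # a full Riemannian treatment. Illustrative only.
        import numpy as np
        from scipy.linalg import logm

        def cov_logm(window, eps=1e-6):
            """window: (n_channels, n_samples) -> matrix log of its covariance."""
            c = np.cov(window) + eps * np.eye(window.shape[0])
            return logm(c).real

        def fit_class_means(windows, labels):
            logs = np.array([cov_logm(w) for w in windows])
            labels = np.array(labels)
            return {c: logs[labels == c].mean(axis=0) for c in np.unique(labels)}

        def predict(window, class_means):
            l = cov_logm(window)
            return min(class_means, key=lambda c: np.linalg.norm(l - class_means[c]))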

  7. Image classification approach for automatic identification of grassland weeds

    NASA Astrophysics Data System (ADS)

    Gebhardt, Steffen; Kühbauch, Walter

    2006-08-01

    The potential of digital image processing for weed mapping in arable crops has been widely investigated in recent decades. In grassland farming, these techniques have rarely been applied so far. The project presented here focuses on automatic identification of one of the most invasive and persistent grassland weed species, the broad-leaved dock (Rumex obtusifolius L.), in complex mixtures of grass and herbs. A total of 108 RGB images were acquired at near range from a field experiment under constant illumination conditions using a commercial digital camera. The objects of interest were separated from the background by transforming the 24-bit RGB images into 8-bit intensities and then calculating local homogeneity images. These images were binarised by applying a dynamic grey-value threshold. Finally, morphological opening was applied to the binary images. The remaining contiguous regions were considered to be objects. In order to classify these objects into three different weed species, a soil class and a residue class, a total of 17 object features related to shape, color and texture of the weeds were extracted. Using MANOVA, 12 features that contribute to classification were identified. Maximum-likelihood classification was conducted to discriminate the weed species. The total classification rate across all classes ranged from 76% to 83%. The classification of Rumex obtusifolius achieved detection rates between 85% and 93%, with misclassification below 10%. Furthermore, Rumex obtusifolius distribution and density maps were generated based on the classification results and transformation of image coordinates into the Gauss-Krueger system. These promising results show the high potential of image analysis for weed mapping in grassland and for the implementation of site-specific herbicide spraying.
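
    The segmentation chain described (grey-value transformation, thresholding, morphological opening, connected regions) can be sketched with OpenCV as below; the local-homogeneity transform, the exact dynamic threshold and the 17 object features are not reproduced, so the adaptive threshold here is only a stand-in.

        # Sketch of the segmentation chain: grayscale conversion, thresholding,
        # morphological opening, connected components. The local homogeneity
        # transform and the 17 object features are not reproduced.
        import cv2

        def segment_objects(bgr_image):
            """bgr_image: 8-bit image as returned by cv2.imread()."""
            gray = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2GRAY)
            binary = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                                           cv2.THRESH_BINARY, 51, 0)
            kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
            opened = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel)
            n_labels, labels = cv2.connectedComponents(opened)
            return labels, n_labels - 1   # label 0 is the background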

  8. Tree level hydrodynamic approach for resolving aboveground water storage and stomatal conductance and modeling the effects of tree hydraulic strategy

    NASA Astrophysics Data System (ADS)

    Mirfenderesgi, Golnazalsadat; Bohrer, Gil; Matheny, Ashley M.; Fatichi, Simone; Moraes Frasson, Renato Prata; Schäfer, Karina V. R.

    2016-07-01

    The finite difference ecosystem-scale tree crown hydrodynamics model version 2 (FETCH2) is a tree-scale hydrodynamic model of transpiration. The FETCH2 model employs a finite difference numerical methodology and a simplified single-beam conduit system to explicitly resolve xylem water potentials throughout the vertical extent of a tree. Empirical equations relate water potential within the stem to stomatal conductance of the leaves at each height throughout the crown. While highly simplified, this approach brings additional realism to the simulation of transpiration by linking stomatal responses to stem water potential rather than directly to soil moisture, as is currently the case in the majority of land surface models. FETCH2 accounts for plant hydraulic traits, such as the degree of anisohydric/isohydric response of stomata, maximal xylem conductivity, vertical distribution of leaf area, and maximal and minimal xylem water content. We used FETCH2 along with sap flow and eddy covariance data sets collected from a mixed plot of two genera (oak/pine) in Silas Little Experimental Forest, NJ, USA, to conduct an analysis of the intergeneric variation of hydraulic strategies and their effects on diurnal and seasonal transpiration dynamics. We define these strategies through the parameters that describe the genus level transpiration and xylem conductivity responses to changes in stem water potential. Our evaluation revealed that FETCH2 considerably improved the simulation of ecosystem transpiration and latent heat flux in comparison to more conventional models. A virtual experiment showed that the model was able to capture the effect of hydraulic strategies such as isohydric/anisohydric behavior on stomatal conductance under different soil-water availability conditions.

  9. Predictive mapping of soil organic carbon in wet cultivated lands using classification-tree based models: the case study of Denmark.

    PubMed

    Bou Kheir, Rania; Greve, Mogens H; Bøcher, Peder K; Greve, Mette B; Larsen, René; McCloy, Keith

    2010-05-01

    Soil organic carbon (SOC) is one of the most important carbon stocks globally and has large potential to affect global climate. Distribution patterns of SOC in Denmark constitute a nation-wide baseline for studies on soil carbon changes (with respect to the Kyoto protocol). This paper predicts and maps the geographic distribution of SOC across Denmark using remote sensing (RS), geographic information systems (GISs) and decision-tree modeling (un-pruned and pruned classification trees). Seventeen parameters, i.e. parent material, soil type, landscape type, elevation, slope gradient, slope aspect, mean curvature, plan curvature, profile curvature, flow accumulation, specific catchment area, tangent slope, tangent curvature, steady-state wetness index, Normalized Difference Vegetation Index (NDVI), Normalized Difference Wetness Index (NDWI) and Soil Color Index (SCI), were generated to statistically explain SOC field measurements in the area of interest (Denmark). A large number of tree-based classification models (588) were developed using (i) all of the parameters, (ii) all Digital Elevation Model (DEM) parameters only, (iii) the primary DEM parameters only, (iv) the remote sensing (RS) indices only, (v) selected pairs of parameters, (vi) soil type, parent material and landscape type only, and (vii) the parameters having a high impact on SOC distribution in the built pruned trees. The three best classification tree models, with the lowest misclassification error (ME) and the fewest nodes (N), are: (i) the tree (T1) combining all of the parameters (ME=29.5%; N=54); (ii) the tree (T2) based on parent material, soil type and landscape type (ME=31.5%; N=14); and (iii) the tree (T3) constructed using parent material, soil type, landscape type, elevation, tangent slope and SCI (ME=30%; N=39). The SOC maps produced from these trees at 1:50,000 cartographic scale show high agreement, with coincidence values equal to 90.5% (Map T1
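
    The un-pruned versus pruned comparison at the heart of the study can be sketched as follows; scikit-learn's cost-complexity pruning stands in for the pruning actually used, and the covariates and alpha value are placeholders.

        # Un-pruned vs. pruned classification trees on SOC covariates;
        # cost-complexity pruning is a stand-in for the study's pruning method.
        from sklearn.model_selection import cross_val_score
        from sklearn.tree import DecisionTreeClassifier

        def compare_trees(X, y, alpha=0.005):
            """X: covariates (parent material, soil type, DEM derivatives, ...); y: SOC class."""
            for name, tree in (("un-pruned", DecisionTreeClassifier(random_state=0)),
                               ("pruned", DecisionTreeClassifier(ccp_alpha=alpha,
                                                                 random_state=0))):
                me = 1.0 - cross_val_score(tree, X, y, cv=5).mean()
                nodes = tree.fit(X, y).tree_.node_count
                print(f"{name}: misclassification error = {me:.3f}, nodes = {nodes}")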

  10. PoMo: An Allele Frequency-Based Approach for Species Tree Estimation

    PubMed Central

    De Maio, Nicola; Schrempf, Dominik; Kosiol, Carolin

    2015-01-01

    Incomplete lineage sorting can cause incongruencies of the overall species-level phylogenetic tree with the phylogenetic trees for individual genes or genomic segments. If these incongruencies are not accounted for, it is possible to incur several biases in species tree estimation. Here, we present a simple maximum likelihood approach that accounts for ancestral variation and incomplete lineage sorting. We use a POlymorphisms-aware phylogenetic MOdel (PoMo) that we have recently shown to efficiently estimate mutation rates and fixation biases from within and between-species variation data. We extend this model to perform efficient estimation of species trees. We test the performance of PoMo in several different scenarios of incomplete lineage sorting using simulations and compare it with existing methods both in accuracy and computational speed. In contrast to other approaches, our model does not use coalescent theory but is allele frequency based. We show that PoMo is well suited for genome-wide species tree estimation and that on such data it is more accurate than previous approaches. PMID:26209413

  11. Odor-baited trap trees: a new approach to monitoring plum curculio (Coleoptera: Curculionidae).

    PubMed

    Prokopy, Ronald J; Chandler, Bradley W; Dynok, Sara A; Piñero, Jaime C

    2003-06-01

    We compared a trap approach with a trap-tree approach to determine the need and timing of insecticide applications against overwintered adult plum curculios, Conotrachelus nenuphar (Herbst.), in commercial apple orchards in Massachusetts in 2002. All traps and trap trees were baited with benzaldehyde (attractive fruit odor) plus grandisoic acid (attractive pheromone). Sticky clear Plexiglas panel traps placed at orchard borders, designed to intercept adults immigrating from border areas by flight, captured significantly more adults than similarly placed black pyramid traps, which are designed to capture adults immigrating primarily by crawling, or Circle traps wrapped around trunks of perimeter-row trees, which are designed to intercept adults crawling up tree trunks. None of these trap types, however, exhibited amounts of captures that correlated significantly with either weekly or season-long amounts of fresh ovipositional injury to fruit by adults. Hence, none appears to offer high promise as a tool for effectively monitoring the seasonal course of plum curculio injury to apples in commercial orchards in Massachusetts. In contrast, baiting branches of selected perimeter-row trees with benzaldehyde plus grandisoic acid led to significant aggregation (14-15-fold) of ovipositional injury, markedly facilitating monitoring of the seasonal course of injury to apples. A concurrent experiment revealed that addition of other synthetic fruit odor attractants to apple trees baited with benzaldehyde plus grandisoic acid did not enhance aggregation of ovipositional injury above that of this dual combination. We conclude that monitoring apples on odor-baited trap trees for fresh ovipositional injury could be a useful new approach for determining need and timing of insecticide application against plum curculio in commercial orchards.

  12. Two Approaches to Estimation of Classification Accuracy Rate under Item Response Theory

    ERIC Educational Resources Information Center

    Lathrop, Quinn N.; Cheng, Ying

    2013-01-01

    Within the framework of item response theory (IRT), there are two recent lines of work on the estimation of classification accuracy (CA) rate. One approach estimates CA when decisions are made based on total sum scores, the other based on latent trait estimates. The former is referred to as the Lee approach, and the latter, the Rudner approach,…

  13. Neural network approach to classification of traffic flow states

    SciTech Connect

    Yang, H.; Qiao, F.

    1998-11-01

    The classification of traffic flow states in China has traditionally been based on the Highway Capacity Manual, published in the United States. Because traffic conditions are generally different from country to country, though, it is important to develop a practical and useful classification method applicable to Chinese highway traffic. In view of the difficulty and complexity of a mathematical and physical realization, modern pattern recognition methods are considered practical in fulfilling this goal. This study applies a self-organizing neural network pattern recognition method to classify highway traffic states into some distinctive cluster centers. A small scale test with actual data is conducted, and the method is found to be potentially applicable in practice.

  14. An ant colony approach for image texture classification

    NASA Astrophysics Data System (ADS)

    Ye, Zhiwei; Zheng, Zhaobao; Ning, Xiaogang; Yu, Xin

    2005-10-01

    Ant colonies, and more generally social insect societies, are distributed systems that show a highly structured social organization in spite of the simplicity of their individuals. As a result of this swarm intelligence, ant colonies can accomplish complex tasks that far exceed the individual capacities of a single ant. Aerial image texture classification is a well-known, long-standing problem that has not been fully solved. This paper presents an ant colony optimization methodology for image texture classification, which assigns N images to K clusters, with clustering viewed as a combinatorial optimization problem. The algorithm has been tested on real images, and its performance is superior to that of the k-means algorithm. Computational simulations reveal very encouraging results in terms of the quality of the solutions found.

  15. Classification as clustering: a Pareto cooperative-competitive GP approach.

    PubMed

    McIntyre, Andrew R; Heywood, Malcolm I

    2011-01-01

    Intuitively, population-based algorithms such as genetic programming provide a natural environment for supporting solutions that learn to decompose the overall task between multiple individuals, or a team. This work presents a framework for evolving teams without recourse to prespecifying the number of cooperating individuals. To do so, each individual evolves a mapping to a distribution of outcomes that, following clustering, establishes the parameterization of a (Gaussian) local membership function. This gives individuals the opportunity to represent subsets of tasks, where the overall task is that of classification under the supervised learning domain. Thus, rather than each team member representing an entire class, individuals are free to identify unique subsets of the overall classification task. The framework is supported by techniques from evolutionary multiobjective optimization (EMO) and Pareto competitive coevolution. EMO establishes the basis for encouraging individuals to provide accurate yet nonoverlapping behaviors, whereas competitive coevolution provides the mechanism for scaling to potentially large unbalanced datasets. Benchmarking is performed against recent examples of nonlinear SVM classifiers over 12 UCI datasets with between 150 and 200,000 training instances. Solutions from the proposed coevolutionary multiobjective GP framework appear to provide a good balance between classification performance and model complexity, especially as the dataset instance count increases.

  16. A novel approach to neuro-fuzzy classification.

    PubMed

    Ghosh, Ashish; Shankar, B Uma; Meher, Saroj K

    2009-01-01

    A new model for neuro-fuzzy (NF) classification systems is proposed. The motivation is to utilize the feature-wise degree of belonging of patterns to all classes that are obtained through a fuzzification process. A fuzzification process generates a membership matrix having total number of elements equal to the product of the number of features and classes present in the data set. These matrix elements are the input to neural networks. The effectiveness of the proposed model is established with four benchmark data sets (completely labeled) and two remote sensing images (partially labeled). Different performance measures such as misclassification, classification accuracy and kappa index of agreement for completely labeled data sets, and beta index of homogeneity and Davies-Bouldin (DB) index of compactness for remotely sensed images are used for quantitative analysis of results. All these measures supported the superiority of the proposed NF classification model. The proposed model learns well even with a lower percentage of training data that makes the system fast.
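
    The feature-wise membership matrix that feeds the network can be sketched with Gaussian memberships whose class-wise means and spreads are estimated from the training data; the paper's actual membership functions are not specified here, so this formulation is an assumption.

        # Feature-by-class Gaussian memberships (parameters estimated per class),
        # flattened into the membership matrix and fed to a neural network.
        # The exact fuzzification used in the paper is an assumption here.
        import numpy as np
        from sklearn.neural_network import MLPClassifier

        def fit_memberships(X, y):
            y = np.asarray(y)
            classes = np.unique(y)
            mu = np.array([X[y == c].mean(axis=0) for c in classes])
            sd = np.array([X[y == c].std(axis=0) + 1e-6 for c in classes])
            return mu, sd

        def fuzzify(X, mu, sd):
            # (n_samples, n_classes, n_features) memberships, flattened per sample
            m = np.exp(-0.5 * ((X[:, None, :] - mu[None]) / sd[None]) ** 2)
            return m.reshape(len(X), -1)

        def train_neuro_fuzzy(X, y):
            mu, sd = fit_memberships(X, y)
            clf = MLPClassifier(hidden_layer_sizes=(20,), max_iter=2000, random_state=0)
            return clf.fit(fuzzify(X, mu, sd), y), (mu, sd)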

  17. Rapid Erosion Modeling in a Western Kenya Watershed using Visible Near Infrared Reflectance, Classification Tree Analysis and 137Cesium

    PubMed Central

    deGraffenried, Jeff B.; Shepherd, Keith D.

    2010-01-01

    Human-induced soil erosion has severe economic and environmental impacts throughout the world. It is more severe in the tropics than elsewhere and results in diminished food production and security. Kenya has limited arable land and 30 percent of the country experiences severe to very severe human-induced soil degradation. The purpose of this research was to test visible near infrared diffuse reflectance spectroscopy (VNIR) as a tool for rapid assessment and benchmarking of soil condition and erosion severity class. The study was conducted in the Saiwa River watershed in the northern Rift Valley Province of western Kenya, a tropical highland area. Soil 137Cs concentration was measured to validate spectrally derived erosion classes and establish the background levels for different land use types. Results indicate VNIR could be used to accurately evaluate a large and diverse soil data set and predict soil erosion characteristics. Soil condition was spectrally assessed and modeled. Analysis of mean raw spectra indicated significant reflectance differences between soil erosion classes. The largest differences occurred between 1,350 and 1,950 nm, with the largest separation occurring at 1,920 nm. Classification and Regression Tree (CART) analysis indicated that the spectral model had practical predictive success (72%), with a Receiver Operating Characteristic (ROC) value of 0.74. The change in 137Cs concentrations supported the premise that VNIR is an effective tool for rapid screening of soil erosion condition. PMID:27397933

  18. Evaluating Two Approaches to Helping College Students Understand Evolutionary Trees through Diagramming Tasks

    ERIC Educational Resources Information Center

    Perry, Judy; Meir, Eli; Herron, Jon C.; Maruca, Susan; Stal, Derek

    2008-01-01

    To understand evolutionary theory, students must be able to understand and use evolutionary trees and their underlying concepts. Active, hands-on curricula relevant to macroevolution can be challenging to implement across large college-level classes where textbook learning is the norm. We evaluated two approaches to helping students learn…

  19. Modern technology calls for a modern approach to classification of epileptic seizures and the epilepsies.

    PubMed

    Lüders, Hans O; Amina, Shahram; Baumgartner, Christopher; Benbadis, Selim; Bermeo-Ovalle, Adriana; Devereaux, Michael; Diehl, Beate; Edwards, Jonathan; Baca-Vaca, Guadalupe Fernandez; Hamer, Hajo; Ikeda, Akio; Kaiboriboon, Kitti; Kellinghaus, Christoph; Koubeissi, Mohamad; Lardizabal, David; Lhatoo, Samden; Lüders, Jürgen; Mani, Jayanti; Mayor, Luis Carlos; Miller, Jonathan; Noachtar, Soheyl; Pestana, Elia; Rosenow, Felix; Sakamoto, Americo; Shahid, Asim; Steinhoff, Bernhard J; Syed, Tanvir; Tanner, Adriana; Tsuji, Sadatoshi

    2012-03-01

    In the last 10-15 years the ILAE Commission on Classification and Terminology has been presenting proposals to modernize the current ILAE Classification of Epileptic Seizures and Epilepsies. These proposals were discussed extensively in a series of articles published recently in Epilepsia and Epilepsy Currents. There is almost universal consensus that the availability of new diagnostic techniques as also of a modern understanding of epilepsy calls for a complete revision of the Classification of Epileptic Seizures and Epilepsies. Unfortunately, however, the Commission is still not prepared to take a bold step ahead and completely revisit our approach to classification of epileptic seizures and epilepsies. In this manuscript we critically analyze the current proposals of the Commission and make suggestions for a classification system that reflects modern diagnostic techniques and our current understanding of epilepsy. PMID:22332669

  20. A phenotypic approach for IUIS PID classification and diagnosis: guidelines for clinicians at the bedside.

    PubMed

    Bousfiha, Ahmed Aziz; Jeddane, Leïla; Ailal, Fatima; Al Herz, Waleed; Conley, Mary Ellen; Cunningham-Rundles, Charlotte; Etzioni, Amos; Fischer, Alain; Franco, Jose Luis; Geha, Raif S; Hammarström, Lennart; Nonoyama, Shigeaki; Ochs, Hans D; Roifman, Chaim M; Seger, Reinhard; Tang, Mimi L K; Puck, Jennifer M; Chapel, Helen; Notarangelo, Luigi D; Casanova, Jean-Laurent

    2013-08-01

    The number of genetically defined Primary Immunodeficiency Diseases (PID) has increased exponentially, especially in the past decade. The biennial classification published by the IUIS PID expert committee is therefore quickly expanding, providing valuable information regarding the disease-causing genotypes, the immunological anomalies, and the associated clinical features of PIDs. These are grouped in eight, somewhat overlapping, categories of immune dysfunction. However, based on this immunological classification, the diagnosis of a specific PID from the clinician's observation of an individual clinical and/or immunological phenotype remains difficult, especially for non-PID specialists. The purpose of this work is to suggest a phenotypic classification that forms the basis for diagnostic trees, leading the physician to particular groups of PIDs, starting from clinical features and combining routine immunological investigations along the way. We present 8 colored diagnostic figures that correspond to the 8 PID groups in the IUIS Classification, including all the PIDs cited in the 2011 update of the IUIS classification and most of those reported since.

  1. Classification

    NASA Technical Reports Server (NTRS)

    Oza, Nikunj C.

    2011-01-01

    A supervised learning task involves constructing a mapping from input data (normally described by several features) to the appropriate outputs. Within supervised learning, one type of task is a classification learning task, in which each output is one or more classes to which the input belongs. In supervised learning, a set of training examples---examples with known output values---is used by a learning algorithm to generate a model. This model is intended to approximate the mapping between the inputs and outputs. This model can be used to generate predicted outputs for inputs that have not been seen before. For example, we may have data consisting of observations of sunspots. In a classification learning task, our goal may be to learn to classify sunspots into one of several types. Each example may correspond to one candidate sunspot with various measurements or just an image. A learning algorithm would use the supplied examples to generate a model that approximates the mapping between each supplied set of measurements and the type of sunspot. This model can then be used to classify previously unseen sunspots based on the candidate's measurements. This chapter discusses methods to perform machine learning, with examples involving astronomy.
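
    The train-then-predict workflow described here can be shown in a few lines; synthetic data stands in for the sunspot measurements and the classifier choice is arbitrary.

        # Minimal supervised classification workflow: fit a model on labeled
        # examples, then predict labels for unseen inputs. Synthetic data stands
        # in for sunspot measurements; the classifier choice is arbitrary.
        from sklearn.datasets import make_classification
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.metrics import accuracy_score
        from sklearn.model_selection import train_test_split

        X, y = make_classification(n_samples=500, n_features=8, n_informative=5,
                                   n_classes=3, random_state=0)
        X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
        model = RandomForestClassifier(random_state=0).fit(X_train, y_train)
        print("held-out accuracy:", accuracy_score(y_test, model.predict(X_test)))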

  2. A generalized representation-based approach for hyperspectral image classification

    NASA Astrophysics Data System (ADS)

    Li, Jiaojiao; Li, Wei; Du, Qian; Li, Yunsong

    2016-05-01

    Sparse representation-based classification (SRC) has attracted great interest recently for hyperspectral image classification. It is assumed that a testing pixel is a linear combination of atoms of a dictionary. Under this circumstance, the dictionary includes all the training samples. The objective is to find a weight vector that yields a minimum L2 representation error with the constraint that the weight vector is sparse with a minimum L1 norm. The pixel is assigned to the class whose training samples yield the minimum error. In addition, collaborative representation-based classification (CRC) has also been proposed, where the weight vector has a minimum L2 norm. The CRC has a closed-form solution; when using class-specific representation it can yield even better performance than the SRC. Compared to traditional classifiers such as the support vector machine (SVM), SRC and CRC do not have a traditional training-testing fashion as in supervised learning, while their performance is similar to or even better than that of SVM. In this paper, we investigate a generalized representation-based classifier which uses Lq representation error, Lp weight norm, and adaptive regularization. The classification performance of Lq and Lp combinations is evaluated with several real hyperspectral datasets. Based on these experiments, recommendations are provided for practical implementation.
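
    The closed-form CRC solution mentioned above is commonly written as w = (D^T D + \lambda I)^{-1} D^T y, with the pixel assigned to the class of minimum class-specific residual; a minimal numpy sketch under that standard formulation follows (the regularization value is an assumption).

        # Collaborative representation classification: closed-form ridge solution
        # plus class-wise residuals. The lambda value is an illustrative choice.
        import numpy as np

        def crc_predict(D, labels, y, lam=0.01):
            """D: (n_bands, n_train) training pixels as columns; labels: per-column
            class; y: (n_bands,) test pixel."""
            labels = np.asarray(labels)
            w = np.linalg.solve(D.T @ D + lam * np.eye(D.shape[1]), D.T @ y)
            residuals = {c: np.linalg.norm(y - D[:, labels == c] @ w[labels == c])
                         for c in np.unique(labels)}
            return min(residuals, key=residuals.get)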

  3. A novel approach to malignant-benign classification of pulmonary nodules by using ensemble learning classifiers.

    PubMed

    Tartar, A; Akan, A; Kilic, N

    2014-01-01

    Computer-aided detection systems can help radiologists to detect pulmonary nodules at an early stage. In this paper, a novel computer-aided diagnosis (CAD) system is proposed for the classification of pulmonary nodules as malignant or benign. The proposed CAD system, which uses ensemble learning classifiers, provides important support to radiologists in the diagnostic process and achieves high classification performance. The proposed approach with a bagging classifier yields classification sensitivities of 94.7%, 90.0% and 77.8% for the benign, malignant and undetermined classes, respectively (89.5% accuracy). PMID:25571029
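
    A bagging ensemble of the kind described can be sketched in a few lines; the base learner (scikit-learn's default decision tree), the ensemble size and the feature set are assumptions, not the paper's configuration.

        # Bagging ensemble for benign/malignant/undetermined nodule classification.
        # scikit-learn's BaggingClassifier uses a decision tree as its default base
        # learner; the ensemble size is an illustrative choice.
        from sklearn.ensemble import BaggingClassifier

        def train_bagging(X, y):
            """X: nodule features; y in {"benign", "malignant", "undetermined"}."""
            return BaggingClassifier(n_estimators=50, random_state=0).fit(X, y)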

  4. Fuzzy logic approach to extraction of intrathoracic airway trees from three-dimensional CT images

    NASA Astrophysics Data System (ADS)

    Park, Wonkyu; Hoffman, Eric A.; Sonka, Milan

    1996-04-01

    Accurate assessment of intrathoracic airway physiology requires sophisticated imaging and image segmentation of the three-dimensional airway tree structure. We have previously reported a rule-based method for three-dimensional airway tree segmentation from electron beam CT (EBCT) images. Here we report a new approach to airway tree segmentation in which fuzzy logic is used for image interpretation. In canine EBCT images, airways identified by the fuzzy logic method matched 276/337 observer-defined airways (81.9%), while the method failed to detect airways at the remaining 61 observer-determined locations (18.1%). Compared with our former rule-based method, the fuzzy logic method significantly decreased the number of false airways (p less than 0.001).

  5. Seeing the trees yet not missing the forest: an airborne lidar approach

    NASA Astrophysics Data System (ADS)

    Guo, Q.; Li, W.; Flanagan, J.

    2011-12-01

    Light Detection and Ranging (lidar) is an optical remote sensing technology that measures properties of scattered light to find the range and/or other information about a distant object. Due to its ability to generate 3-dimensional data with high spatial resolution and accuracy, lidar technology is being increasingly used in ecology, geography, geology, geomorphology, seismology, remote sensing, and atmospheric physics. In this study, we acquire airborne lidar data for the study of hydrologic, geomorphologic, and geochemical processes at six Critical Zone Observatories: Southern Sierra, Boulder Creek, Shale Hills, Luquillo, Jemez, and Christina River Basin. Each site will have two lidar flights (leaf on/off, or snow on/off). Based on the lidar data, we derive various products, including a high-resolution Digital Elevation Model (DEM), Digital Surface Model (DSM), Canopy Height Model (CHM), canopy cover & closure, tree height, DBH, canopy base height, canopy bulk density, biomass, LAI, etc. A novel approach is also developed to map individual trees based on segmentation of the lidar point clouds, and a virtual forest is simulated using the locations of individual trees as well as tree structure information. The simulated image is then compared to a camera photo taken at the same location. The two images look very similar; moreover, our simulated image provides not only a visually impressive visualization of the landscape but also all the detailed information about individual tree locations and forest structure properties.
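
    The canopy height model listed among the products is conventionally obtained as the difference between the DSM and DEM rasters; a minimal sketch assuming co-registered grids is given below (file names are hypothetical).

        # Canopy Height Model as DSM minus DEM on co-registered rasters.
        # File names are hypothetical placeholders.
        import numpy as np
        import rasterio

        with rasterio.open("dsm.tif") as dsm_src, rasterio.open("dem.tif") as dem_src:
            dsm = dsm_src.read(1).astype("float32")
            dem = dem_src.read(1).astype("float32")
            profile = dsm_src.profile

        chm = np.clip(dsm - dem, 0, None)          # clamp negative heights to zero
        profile.update(dtype="float32", count=1)

        with rasterio.open("chm.tif", "w", **profile) as dst:
            dst.write(chm, 1)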

  6. Bayesian Evidence Framework for Decision Tree Learning

    NASA Astrophysics Data System (ADS)

    Chatpatanasiri, Ratthachat; Kijsirikul, Boonserm

    2005-11-01

    This work is primarily interested in the problem of selecting a single decision (or classification) tree given the observed data. Although a single decision tree has a high risk of overfitting, the induced tree is easily interpreted. Researchers have invented various methods such as tree pruning or tree averaging for preventing the induced tree from overfitting (and from underfitting) the data. In this paper, instead of using those conventional approaches, we apply the Bayesian evidence framework of Gull, Skilling and MacKay to the process of selecting a decision tree. We derive a formal function to measure `the fitness' of each decision tree given a set of observed data. Our method, in fact, is analogous to a well-known Bayesian model selection method for interpolating noisy continuous-valued data. As in regression problems, given reasonable assumptions, this derived score function automatically quantifies the principle of Ockham's razor, and hence reasonably deals with the underfitting-overfitting tradeoff.
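
    In the evidence framework, each candidate tree T is scored by its marginal likelihood; a standard statement of this idea (a sketch, not the paper's exact derivation) is

        P(D \mid T) \;=\; \int P(D \mid \Theta, T)\, P(\Theta \mid T)\, \mathrm{d}\Theta
        \;\approx\; P(D \mid \hat{\Theta}, T)\,
        \underbrace{\frac{\sigma_{\Theta \mid D}}{\sigma_{\Theta}}}_{\text{Ockham factor}},

    where \hat{\Theta} is the most probable parameter setting of tree T; the Ockham factor shrinks the evidence of trees whose extra flexibility the data do not support, which is how the framework balances underfitting against overfitting without explicit pruning.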

  7. Schistosomiasis risk mapping in the state of Minas Gerais, Brazil, using a decision tree approach, remote sensing data and sociological indicators.

    PubMed

    Martins-Bedê, Flávia T; Dutra, Luciano V; Freitas, Corina C; Guimarães, Ricardo J P S; Amaral, Ronaldo S; Drummond, Sandra C; Carvalho, Omar S

    2010-07-01

    Schistosomiasis mansoni is not just a physical disease, but is related to social and behavioural factors as well. Snails of the Biomphalaria genus are an intermediate host for Schistosoma mansoni and infect humans through water. The objective of this study is to classify the risk of schistosomiasis in the state of Minas Gerais (MG). We focus on socioeconomic and demographic features, basic sanitation features, the presence of accumulated water bodies, dense vegetation in the summer and winter seasons, and related terrain characteristics. We draw on the decision tree approach for infection risk modelling and mapping. The robustness of the model was properly verified. The main variables selected by the procedure included the terrain's water accumulation capacity, temperature extremes and the Human Development Index. In addition, the model was used to generate two maps, one with the risk classification for the whole of MG and another with the classification errors. The resulting map was 62.9% accurate.

  8. Hierarchical Object-based Image Analysis approach for classification of sub-meter multispectral imagery in Tanzania

    NASA Astrophysics Data System (ADS)

    Chung, C.; Nagol, J. R.; Tao, X.; Anand, A.; Dempewolf, J.

    2015-12-01

    Increasing agricultural production while at the same time preserving the environment has become a challenging task. There is a need for new approaches to the use of multi-scale and multi-source remote sensing data, as well as ground-based measurements, for mapping and monitoring crop and ecosystem state to support decision making by governmental and non-governmental organizations for sustainable agricultural development. High-resolution sub-meter imagery plays an important role in such an integrative framework of landscape monitoring. It helps link ground-based data to more easily available coarser-resolution data, facilitating calibration and validation of derived remote sensing products. Here we present a hierarchical Object-Based Image Analysis (OBIA) approach to classify sub-meter imagery. The primary reason for choosing OBIA is to accommodate pixel sizes smaller than the object or class of interest. Especially in the non-homogeneous savannah regions of Tanzania, this is an important concern, and the traditional pixel-based spectral signature approach often fails. Ortho-rectified, calibrated, pan-sharpened 0.5-meter-resolution data acquired from DigitalGlobe's WorldView-2 satellite sensor were used for this purpose. Multi-scale hierarchical segmentation was performed using a multi-resolution segmentation approach to facilitate the use of texture, neighborhood context, and the relationship between super- and sub-objects for training and classification. eCognition, a commonly used OBIA software program, was used for this purpose. Both decision tree and random forest approaches to classification were tested. The kappa index of agreement for both algorithms surpassed 85%. The results demonstrate that hierarchical OBIA can effectively and accurately discriminate classes even at the LCCS-3 legend level.

  9. Developmental Structuralist Approach to the Classification of Adaptive and Pathologic Personality Organizations: Infancy and Early Childhood.

    ERIC Educational Resources Information Center

    Greenspan, Stanley I.; Lourie, Reginald S.

    This paper applies a developmental structuralist approach to the classification of adaptive and pathologic personality organizations and behavior in infancy and early childhood, and it discusses implications of this approach for preventive intervention. In general, as development proceeds, the structural capacity of the developing infant and child…

  10. A New Approach in Teaching the Features and Classifications of Invertebrate Animals in Biology Courses

    ERIC Educational Resources Information Center

    Sezek, Fatih

    2013-01-01

    This study examined the effectiveness of a new learning approach in teaching classification of invertebrate animals in biology courses. In this approach, we used an impersonal style: the subject jigsaw, which differs from the other jigsaws in that both course topics and student groups are divided. Students in Jigsaw group were divided into five…

  11. A multi-label approach using binary relevance and decision trees applied to functional genomics.

    PubMed

    Tanaka, Erica Akemi; Nozawa, Sérgio Ricardo; Macedo, Alessandra Alaniz; Baranauskas, José Augusto

    2015-04-01

    Many classification problems, especially in the field of bioinformatics, are associated with more than one class, known as multi-label classification problems. In this study, we propose a new adaptation for the Binary Relevance algorithm taking into account possible relations among labels, focusing on the interpretability of the model, not only on its performance. Experiments were conducted to compare the performance of our approach against others commonly found in the literature and applied to functional genomic datasets. The experimental results show that our proposal has a performance comparable to that of other methods and that, at the same time, it provides an interpretable model from the multi-label problem.
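
    Plain binary relevance, the baseline being adapted here, simply fits one binary classifier per label; a minimal sketch with one decision tree per label follows (the proposed adaptation that exploits relations among labels is not reproduced).

        # Plain binary relevance: one decision tree per label. The paper's
        # label-relation adaptation is not reproduced here.
        import numpy as np
        from sklearn.tree import DecisionTreeClassifier

        def fit_binary_relevance(X, Y):
            """Y: (n_samples, n_labels) binary indicator matrix."""
            return [DecisionTreeClassifier(random_state=0).fit(X, Y[:, j])
                    for j in range(Y.shape[1])]

        def predict_binary_relevance(models, X):
            return np.column_stack([m.predict(X) for m in models])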

  12. Characterizing Vocal Repertoires—Hard vs. Soft Classification Approaches

    PubMed Central

    Wadewitz, Philip; Hammerschmidt, Kurt; Battaglia, Demian; Witt, Annette; Wolf, Fred; Fischer, Julia

    2015-01-01

    To understand the proximate and ultimate causes that shape acoustic communication in animals, objective characterizations of the vocal repertoire of a given species are critical, as they provide the foundation for comparative analyses among individuals, populations and taxa. Progress in this field has been hampered, however, by a lack of standardization in methodology. One problem is that researchers may settle on different variables to characterize the calls, which may affect the classification of calls. More importantly, there is no agreement on how best to characterize the overall structure of the repertoire in terms of the amount of gradation within and between call types. Here, we address these challenges by examining 912 calls recorded from wild chacma baboons (Papio ursinus). We extracted 118 acoustic variables from spectrograms, from which we constructed different sets of acoustic features containing 9, 38, and 118 variables, as well as 19 factors derived from principal component analysis. We compared and validated the resulting classifications of k-means and hierarchical clustering. Datasets with a higher number of acoustic features led to better clustering results than datasets with only a few features. The use of factors in the cluster analysis resulted in an extremely poor resolution of emerging call types. Another important finding is that none of the applied clustering methods gave strong support to a specific cluster solution. Instead, the cluster analysis revealed that within distinct call types, subtypes may exist. Because hard clustering methods are not well suited to capture such gradation within call types, we applied a fuzzy clustering algorithm. We found that this algorithm provides a detailed and quantitative description of the gradation within and between chacma baboon call types. In conclusion, we suggest that fuzzy clustering should be used in future studies to analyze the graded structure of vocal repertoires. Moreover, the use of factor analyses to
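
    The soft-classification idea can be sketched with a small fuzzy c-means implementation in NumPy; this illustrates the kind of algorithm advocated, not the authors' exact procedure, and the call-feature matrix below is synthetic.

      import numpy as np

      def fuzzy_cmeans(X, c, m=2.0, n_iter=100, seed=0):
          # Soft clustering: every call gets a graded membership in every cluster.
          rng = np.random.default_rng(seed)
          U = rng.random((len(X), c))
          U /= U.sum(axis=1, keepdims=True)
          for _ in range(n_iter):
              W = U ** m
              centers = (W.T @ X) / W.sum(axis=0)[:, None]
              d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
              ratio = d[:, :, None] / d[:, None, :]
              U = 1.0 / (ratio ** (2.0 / (m - 1.0))).sum(axis=2)
          return centers, U

      calls = np.random.default_rng(1).normal(size=(120, 9))   # e.g. 9 acoustic features per call
      centers, U = fuzzy_cmeans(calls, c=4)
      print(U[:3].round(2))   # graded membership of the first three calls in each cluster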

  13. Classification of Sherry vinegars by combining multidimensional fluorescence, parafac and different classification approaches.

    PubMed

    Callejón, Raquel M; Amigo, José Manuel; Pairo, Erola; Garmón, Sergio; Ocaña, Juan Antonio; Morales, Maria Lourdes

    2012-01-15

    Sherry vinegar is a much appreciated product of the Jerez-Xérès-Sherry, Manzanilla de Sanlúcar and Vinagre de Jerez Protected Designations of Origin in southwestern Spain. Its complexity and extraordinary organoleptic properties are acquired thanks to the method of production followed, the so-called "criaderas y solera" ageing system. Three qualities of Sherry vinegar are considered according to ageing time in oak barrels: "Vinagre de Jerez" (minimum of 6 months), "Reserva" (at least 2 years) and "Gran Reserva" (at least 10 years). In the last few years, there has been an increasing need to develop rapid, inexpensive and effective analytical methods requiring little sample manipulation for the analysis and characterization of Sherry vinegar. Fluorescence spectroscopy is emerging as a competitive technique for this purpose, since it provides in a few seconds an excitation-emission landscape that may be used as a fingerprint of the vinegar. Multi-way analysis, specifically Parallel Factor Analysis (PARAFAC), is a powerful tool for the simultaneous determination of fluorescent components, because it extracts the most relevant information from the data and allows robust models to be built. Moreover, the information obtained by PARAFAC can be used to build robust and reliable classification and discrimination models (e.g. using Support Vector Machines and Partial Least Squares-Discriminant Analysis models). In this context, the aim of this work was to study the possibilities of multi-way fluorescence linked to PARAFAC and to classify the different Sherry vinegars according to their ageing. The results demonstrated that the proposed analytical and chemometric tools are a powerful combination for extracting relevant chemical information about the vinegars as well as for classifying and discriminating them according to their ageing. PMID:22265526
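
    The PARAFAC-plus-classifier pipeline can be sketched as follows, under the assumption that the tensorly library is available; the synthetic tensor stands in for a stack of excitation-emission matrices, and a simple linear discriminant replaces the SVM/PLS-DA models used in the paper.

      import numpy as np
      import tensorly as tl
      from tensorly.decomposition import parafac
      from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

      rng = np.random.default_rng(0)
      eem = tl.tensor(rng.random((60, 40, 30)))   # samples x excitation x emission (synthetic)
      ages = rng.integers(0, 3, size=60)          # toy labels: 0=Vinagre, 1=Reserva, 2=Gran Reserva

      # Trilinear decomposition; the sample-mode factor matrix gives one score vector per vinegar.
      weights, factors = parafac(eem, rank=3, n_iter_max=200)
      scores = factors[0]                         # shape (60, 3)

      clf = LinearDiscriminantAnalysis().fit(scores, ages)
      print(round(clf.score(scores, ages), 3))    # in-sample accuracy of the toy model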

  14. Automated classification of histopathology images of prostate cancer using a Bag-of-Words approach

    NASA Astrophysics Data System (ADS)

    Sanghavi, Foram M.; Agaian, Sos S.

    2016-05-01

    The goals of this paper are (1) to test the computer-aided classification of prostate cancer histopathology images based on the Bag-of-Words (BoW) approach, (2) to evaluate the performance of the proposed method in classifying grades 3 and 4 using the results of the approach proposed by Khurd et al. in [9], and (3) to classify the different grades of cancer, namely grades 0, 3, 4, and 5, using the proposed approach. The system performance is assessed using 132 prostate cancer histopathology images of different grades. The performance of SURF features is also analyzed by comparing the results with SIFT features using different cluster sizes. The results show 90.15% accuracy in detection of prostate cancer images using SURF features with 75 clusters for k-means clustering. The results showed higher sensitivity for SURF-based BoW classification compared to SIFT-based BoW.
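
    A bag-of-visual-words pipeline of the kind described can be sketched with OpenCV and scikit-learn; ORB descriptors stand in for SURF/SIFT (SURF is patent-encumbered in many OpenCV builds), and the image paths and grade labels passed to the function are hypothetical placeholders.

      import cv2
      import numpy as np
      from sklearn.cluster import KMeans
      from sklearn.svm import SVC

      def bow_histogram(descriptors, vocabulary):
          # Map each local descriptor to its nearest visual word and build a normalized histogram.
          words = vocabulary.predict(descriptors.astype(np.float32))
          hist = np.bincount(words, minlength=vocabulary.n_clusters).astype(float)
          return hist / hist.sum()

      def train_bow_classifier(image_paths, grades, n_words=75):
          # image_paths and grades are supplied by the caller (hypothetical dataset).
          orb = cv2.ORB_create()
          images = [cv2.imread(p, cv2.IMREAD_GRAYSCALE) for p in image_paths]
          descs = [orb.detectAndCompute(img, None)[1] for img in images]
          vocabulary = KMeans(n_clusters=n_words, random_state=0).fit(
              np.vstack(descs).astype(np.float32))
          X = np.array([bow_histogram(d, vocabulary) for d in descs])
          return vocabulary, SVC(kernel="linear").fit(X, grades)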

  15. Object-based approaches to image classification for hyperspatial and hyperspectral data

    NASA Astrophysics Data System (ADS)

    Sridharan, Harini

    classifications as well as detailed urban forest tree species classifications were performed to test the performance of the classifier. The results for the two study areas show that the proposed classifier consistently achieves high accuracies, irrespective of the sensor, and also demonstrates superior performance in comparison to other popular object and pixel-based classifiers.

  16. Newer Classification System for Fissured Tongue: An Epidemiological Approach

    PubMed Central

    Sudarshan, Ramachandran; Sree Vijayabala, G.; Samata, Y.; Ravikiran, A.

    2015-01-01

    Introduction. Fissured tongue is a commonly encountered tongue disorder in dental practice, but there is a lack of data on the different patterns and severity of fissuring and on its association with various systemic disorders and other tongue anomalies. This study attempts to establish a classification system for fissured tongue and to assess its correlation with systemic health and other disorders of the tongue. Materials and Methods. A total of 1000 subjects between the ages of 10 and 80 years were included in the study. Patterns of fissuring, associated systemic diseases, and related tongue anomalies were tabulated. Results. Out of 1000 subjects, 387 presented with fissured tongue. Of these 387 subjects, hypertension was present in 57 cases, 18 subjects had diabetes, and 3 subjects had both hypertension and diabetes. The central longitudinal type was found to be the most common type of tongue fissuring. Conclusion. Fissured tongue has been found to be associated with certain systemic diseases, and further research is required to confirm a positive correlation. If a correlation exists, such disorders could be diagnosed earlier by identifying fissured tongue at an earlier age. PMID:26457087

  17. Statistical methods and neural network approaches for classification of data from multiple sources

    NASA Technical Reports Server (NTRS)

    Benediktsson, Jon Atli; Swain, Philip H.

    1990-01-01

    Statistical methods for classification of data from multiple data sources are investigated and compared to neural network models. A general problem with using conventional multivariate statistical approaches to classify data of multiple types is that a multivariate distribution cannot be assumed for the classes in the data sources. Another common problem with statistical classification methods is that the data sources are not equally reliable. This means that the data sources need to be weighted according to their reliability, but most statistical classification methods do not have a mechanism for this. This research focuses on statistical methods which can overcome these problems: a method of statistical multisource analysis and consensus theory. Reliability measures for weighting the data sources in these methods are suggested and investigated. Secondly, this research focuses on neural network models. The neural networks are distribution free, since no prior knowledge of the statistical distribution of the data is needed. This is an obvious advantage over most statistical classification methods. The neural networks also automatically take care of the problem of how much weight each data source should have. On the other hand, their training process is iterative and can take a very long time. Methods to speed up the training procedure are introduced and investigated. Experimental results of classification using both neural network models and statistical methods are given, and the approaches are compared based on these results.

  18. Marker-Based Hierarchical Segmentation and Classification Approach for Hyperspectral Imagery

    NASA Technical Reports Server (NTRS)

    Tarabalka, Yuliya; Tilton, James C.; Benediktsson, Jon Atli; Chanussot, Jocelyn

    2011-01-01

    The Hierarchical SEGmentation (HSEG) algorithm, which is a combination of hierarchical step-wise optimization and spectral clustering, has given good performance for hyperspectral image analysis. This technique produces at its output a hierarchical set of image segmentations. The automated selection of a single segmentation level is often necessary. We propose and investigate the use of automatically selected markers for this purpose. In this paper, a novel Marker-based HSEG (M-HSEG) method for spectral-spatial classification of hyperspectral images is proposed. First, pixelwise classification is performed and the most reliably classified pixels are selected as markers, with the corresponding class labels. Then, a novel constrained marker-based HSEG algorithm is applied, resulting in a spectral-spatial classification map. The experimental results show that the proposed approach yields accurate segmentation and classification maps, and thus is attractive for hyperspectral image analysis.

  19. Power system distributed on-line fault section estimation using decision tree based neural nets approach

    SciTech Connect

    Yang, H.T.; Chang, W.Y.; Huang, C.L.

    1995-01-01

    This paper proposes a distributed neural nets decision approach to on-line estimation of the fault section of a transmission and distribution (T and D) system. The distributed processing alleviates the burden of communication between the control center and local substations, and increases the reliability and flexibility of the diagnosis system. In addition, by using data-driven decision tree induction and a direct mapping from the decision tree into a neural net, the proposed diagnosis system features parallel processing and easy implementation, overcoming the limitations of an overly large and complex system. The approach has been practically tested on a typical Taiwan Power (Taipower) T and D system. The feasibility of such a diagnosis system is presented.

  20. Tropical forest structure characterization using airborne lidar data: an individual tree level approach

    NASA Astrophysics Data System (ADS)

    Ferraz, A.; Saatchi, S. S.

    2015-12-01

    Fine-scale tropical forest structure characterization has been performed by means of field measurement techniques that record both the species and the diameter at breast height (dbh) for every tree within a given area. Due to dense and complex vegetation, additional important ecological variables (e.g. the tree height and crown size) are usually not measured because they are hardly recognizable from the ground. The poor knowledge of the 3D tropical forest structure has been a major limitation for the understanding of different ecological issues such as the spatial distribution of carbon stocks, regeneration and competition dynamics, and light penetration gradient assessments. Airborne laser scanning (ALS) is an active remote sensing technique that provides georeferenced distance measurements between the aircraft and the surface. It provides an unstructured 3D point cloud that is a high-resolution model of the forest. This study presents the first approach for tropical forest characterization at a fine scale using remote sensing data. The multi-modal lidar point cloud is decomposed into 3D clusters that correspond to single trees by means of a technique called Adaptive Mean Shift Segmentation (AMS3D). The ability of the corresponding individual tree metrics (tree height, crown area and crown volume) to estimate above ground biomass (agb) over the 50 ha CTFS plot on Barro Colorado Island is assessed here. We conclude that our approach is able to map the agb spatial distribution with an error of nearly 12% (RMSE = 28 Mg ha-1) compared with field-based estimates over 1 ha plots.
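
    The crown-segmentation step can be sketched with a fixed-bandwidth mean shift from scikit-learn; the paper's AMS3D uses an adaptive bandwidth, so this is only an illustration of the clustering idea, and the point-cloud file is a hypothetical x/y/z export.

      import numpy as np
      from sklearn.cluster import MeanShift

      points = np.loadtxt("plot_lidar_xyz.txt")   # hypothetical ASCII export: x, y, z per return
      canopy = points[points[:, 2] > 2.0]         # drop ground and understory returns below 2 m

      labels = MeanShift(bandwidth=4.0, bin_seeding=True).fit(canopy).labels_

      # Per-tree metrics of the kind used to model biomass: height and a crude crown-area proxy.
      for tree_id in np.unique(labels):
          crown = canopy[labels == tree_id]
          height = crown[:, 2].max()
          crown_area = (crown[:, 0].max() - crown[:, 0].min()) * (crown[:, 1].max() - crown[:, 1].min())
          print(tree_id, round(height, 1), round(crown_area, 1))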

  1. A first approach to the homogenization of daily data using weather types classifications (HOWCLASS)

    NASA Astrophysics Data System (ADS)

    Garcia-Borés, I.; Aguilar, E.; Rasilla, D.; Rodrigo, F. S.; Fernández-Montes, S.; Luna, M. Y.; Sigró, J.; Brunet, M.

    2009-04-01

    The homogenization of daily data is a difficult task, as it involves adjusting values recorded under very different and specific meteorological situations, and because daily data show larger inter-diurnal and spatial variations than lower resolution data (i.e. monthly, seasonal and annual data). We introduce here a new method (Homogenization of Daily Data Using Weather Types Classifications, HOWCLASS) for the adjustment of climatological elements at daily resolution. We benefit from the intensive research which has been done recently in the fields of homogenization and weather types classifications, especially in the framework of two COST Actions: Action ES0601, Advances in homogenisation methods of climate series: an integrated approach (HOME), and Action 733, Harmonisation and Applications of Weather Types Classifications for European Regions. The basic idea underlying HOWCLASS is the aggregation of daily values into weather types and the calculation of average adjustments for each one of them. The development of HOWCLASS needs to combine 3 basic items: the specific characteristics of the climatological element (temperature, precipitation, etc.); the detection/correction algorithm (SNHT, RhTest, Caussinus-Mestre, MASH, etc.); and the adequate weather types classification (manual classifications, like those derived from the Lamb weather types; automated classifications, based either on different correlation analyses or clustering methods; hybrid classifications) for the geographical domain of the studied time series. Other factors cannot be disregarded, like the metadata availability, the impact of the annual cycle, the network density, etc. As a first step, we start - and present here - with a simple approach using the SDATS (Spanish Daily Temperature Series), the Standard Normal Homogeneity Test and the weather type classification developed by D. Rasilla for the Iberian Peninsula, using the EMSLP pressure data, to correct a selection of

  2. Robust Machine Learning Applied to Astronomical Data Sets. I. Star-Galaxy Classification of the Sloan Digital Sky Survey DR3 Using Decision Trees

    NASA Astrophysics Data System (ADS)

    Ball, Nicholas M.; Brunner, Robert J.; Myers, Adam D.; Tcheng, David

    2006-10-01

    We provide classifications for all 143 million nonrepeat photometric objects in the Third Data Release of the SDSS using decision trees trained on 477,068 objects with SDSS spectroscopic data. We demonstrate that these star/galaxy classifications are expected to be reliable for approximately 22 million objects with r<~20. The general machine learning environment Data-to-Knowledge and supercomputing resources enabled extensive investigation of the decision tree parameter space. This work presents the first public release of objects classified in this way for an entire SDSS data release. The objects are classified as either galaxy, star, or nsng (neither star nor galaxy), with an associated probability for each class. To demonstrate how to effectively make use of these classifications, we perform several important tests. First, we detail selection criteria within the probability space defined by the three classes to extract samples of stars and galaxies to a given completeness and efficiency. Second, we investigate the efficacy of the classifications and the effect of extrapolating from the spectroscopic regime by performing blind tests on objects in the SDSS, 2dFGRS, and 2QZ surveys. Given the photometric limits of our spectroscopic training data, we effectively begin to extrapolate past our star-galaxy training set at r~18. By comparing the number counts of our training sample with the classified sources, however, we find that our efficiencies appear to remain robust to r~20. As a result, we expect our classifications to be accurate for 900,000 galaxies and 6.7 million stars and remain robust via extrapolation for a total of 8.0 million galaxies and 13.9 million stars.

  3. Investigating the limitations of tree species classification using the Combined Cluster and Discriminant Analysis method for low density ALS data from a dense forest region in Aggtelek (Hungary)

    NASA Astrophysics Data System (ADS)

    Koma, Zsófia; Deák, Márton; Kovács, József; Székely, Balázs; Kelemen, Kristóf; Standovár, Tibor

    2016-04-01

    Airborne Laser Scanning (ALS) is a widely used technology for forestry classification applications. However, single tree detection and species classification from low density ALS point clouds are limited in dense forest regions. In this study we investigate the division of a forest into homogeneous groups at stand level. The study area is located in the Aggtelek karst region (Northeast Hungary), with complex relief topography. The ALS dataset contained only 4 discrete echoes (at 2-4 pt/m2 density) from the study area during the leaf-on season. Ground-truth measurements of canopy closure and the proportion of tree species cover are available every 70 meters in 500 square meter circular plots. In the first step, the ALS data were processed and geometrical and intensity-based features were calculated on a 5×5 meter raster grid. The derived features included basic statistics of relative height, canopy RMS, echo ratio, openness, pulse penetration ratio, and basic statistics of the radiometric features. In the second step the data were investigated using Combined Cluster and Discriminant Analysis (CCDA, Kovács et al., 2014). The CCDA method first determines a basic grouping for the multiple circle-shaped sampling locations using hierarchical clustering; then, for the arising grouping possibilities, a core cycle is executed comparing the goodness of the investigated groupings with random ones. Out of these comparisons, difference values arise, yielding information about the optimal grouping among the investigated ones. If sub-groups are then further investigated, one might even find homogeneous groups. We found that the classification of low density ALS data into homogeneous groups is highly dependent on canopy closure and the proportion of the dominant tree species. The presented results show the high potential of CCDA for determining homogeneous, separable groups in LiDAR-based tree species classification. Aggtelek Karst/Slovakian Karst Caves" (HUSK/1101/221/0180, Aggtelek NP

  4. [Computational approaches for identification and classification of transposable elements in eukaryotic genomes].

    PubMed

    Xu, Hong-En; Zhang, Hua-Hao; Han, Min-Jin; Shen, Yi-Hong; Huang, Xian-Zhi; Xiang, Zhong-Huai; Zhang, Ze

    2012-08-01

    Repetitive sequences (repeats) represent a significant fraction of the eukaryotic genomes and can be divided into tandem repeats, segmental duplications, and interspersed repeats on the basis of their sequence characteristics and how they are formed. Most interspersed repeats are derived from transposable elements (TEs). Eukaryotic TEs have been subdivided into two major classes according to the intermediate they use to move. The transposition and amplification of TEs have a great impact on the evolution of genes and the stability of genomes. However, identification and classification of TEs are complex and difficult due to the fact that their structure and classification are complex and diverse compared with those of other types of repeats. Here, we briefly introduced the function and classification of TEs, and summarized three different steps for identification, classification and annotation of TEs in eukaryotic genomes: (1) assembly of a repeat library, (2) repeat correction and classification, and (3) genome annotation. The existing computational approaches for each step were summarized and the advantages and disadvantages of the approaches were also highlighted in this review. To accurately identify, classify, and annotate the TEs in eukaryotic genomes requires combined methods. This review provides useful information for biologists who are not familiar with these approaches to find their way through the forest of programs.

  5. Multi-variate flood damage assessment: a tree-based data-mining approach

    NASA Astrophysics Data System (ADS)

    Merz, B.; Kreibich, H.; Lall, U.

    2013-01-01

    The usual approach for flood damage assessment consists of stage-damage functions which relate the relative or absolute damage for a certain class of objects to the inundation depth. Other characteristics of the flooding situation and of the flooded object are rarely taken into account, although flood damage is influenced by a variety of factors. We apply a group of data-mining techniques, known as tree-structured models, to flood damage assessment. A very comprehensive data set of more than 1000 records of direct building damage of private households in Germany is used. Each record contains details about a large variety of potential damage-influencing characteristics, such as hydrological and hydraulic aspects of the flooding situation, early warning and emergency measures undertaken, state of precaution of the household, building characteristics and socio-economic status of the household. Regression trees and bagging decision trees are used to select the more important damage-influencing variables and to derive multi-variate flood damage models. It is shown that these models outperform existing models, and that tree-structured models are a promising alternative to traditional damage models.
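
    The contrast between a single regression tree and a bagged ensemble can be sketched with scikit-learn; the CSV file and feature names below are hypothetical stand-ins for the damage-influencing variables described, not the authors' exact model.

      import pandas as pd
      from sklearn.ensemble import BaggingRegressor
      from sklearn.model_selection import cross_val_score
      from sklearn.tree import DecisionTreeRegressor

      records = pd.read_csv("flood_damage_records.csv")   # hypothetical table of damage records
      X = records[["water_depth", "flood_duration", "warning_lead_time", "precaution_index"]]
      y = records["relative_building_damage"]

      models = [("single regression tree", DecisionTreeRegressor(max_depth=5, random_state=0)),
                ("bagged regression trees", BaggingRegressor(DecisionTreeRegressor(),
                                                             n_estimators=200, random_state=0))]
      for name, model in models:
          mae = -cross_val_score(model, X, y, cv=5, scoring="neg_mean_absolute_error").mean()
          print(name, "CV mean absolute error =", round(mae, 3))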

  6. Branch-and-bound approach for parsimonious inference of a species tree from a set of gene family trees.

    PubMed

    Doyon, Jean-Philippe; Chauve, Cedric

    2011-01-01

    We describe a Branch-and-Bound algorithm for computing a parsimonious species tree, given a set of gene family trees. Our algorithm can consider three cost measures: number of gene duplications, number of gene losses, and both combined. Moreover, to cope with intrinsic limitations of Branch-and-Bound algorithms for species trees inference regarding the number of taxa that can be considered, our algorithm can naturally take into account predefined relationships between sets of taxa. We test our algorithm on a dataset of eukaryotic gene families spanning 29 taxa.

  7. Voxel-Based Approach for Estimating Urban Tree Volume from Terrestrial Laser Scanning Data

    NASA Astrophysics Data System (ADS)

    Vonderach, C.; Voegtle, T.; Adler, P.

    2012-07-01

    The importance of single trees and the determination of related parameters has been recognized in recent years, e.g. for forest inventories or management. For urban areas an increasing interest in the data acquisition of trees can be observed, concerning aspects like urban climate, CO2 balance, and environmental protection. Urban trees differ significantly from natural systems with regard to site conditions (e.g. technogenic soils, contaminants, lower groundwater level, regular disturbance), climate (increased temperature, reduced humidity) and species composition and arrangement (habitus and health status), and therefore allometric relations cannot be transferred from natural sites to urban areas. To overcome this problem, an extended approach was developed for a fast and non-destructive extraction of branch volume, DBH (diameter at breast height) and height of single trees from point clouds of terrestrial laser scanning (TLS). For data acquisition, the trees were scanned at the highest scan resolution from several (up to five) positions located around the tree. The resulting point clouds (20 to 60 million points) are analysed with an algorithm based on a voxel (volume element) structure, leading to an appropriate data reduction. In a first step, two kinds of noise reduction are carried out: the elimination of isolated voxels as well as of voxels with marginal point density. To obtain correct volume estimates, the voxels inside the stem and branches (interior voxels), which contain no laser points, must be taken into account. For this filling process, an easy and robust approach was developed based on a layer-wise (horizontal layers of the voxel structure) intersection of four orthogonal viewing directions. However, this procedure also generates several erroneous "phantom" voxels, which have to be eliminated. For this purpose the previous approach was extended by a special region growing algorithm. In a final step the volume is determined layer-wise based on the extracted
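
    The voxel idea can be sketched in a few lines of NumPy: bin the TLS point cloud into cubic voxels, discard sparsely occupied voxels as noise, and estimate volume as voxel count times voxel volume. The interior-voxel filling and region growing steps described in the paper are omitted, and the file name and voxel size are assumptions.

      import numpy as np

      points = np.loadtxt("tree_scan_xyz.txt")    # hypothetical merged TLS scan: x, y, z per point
      voxel_size = 0.02                           # 2 cm voxels (assumed)

      idx = np.floor((points - points.min(axis=0)) / voxel_size).astype(int)
      voxels, counts = np.unique(idx, axis=0, return_counts=True)

      occupied = voxels[counts >= 3]              # discard voxels with marginal point density
      shell_volume = len(occupied) * voxel_size ** 3
      print("occupied voxels:", len(occupied), "shell volume (m^3):", round(shell_volume, 3))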

  8. Optimization of a Non-traditional Unsupervised Classification Approach for Land Cover Analysis

    NASA Technical Reports Server (NTRS)

    Boyd, R. K.; Brumfield, J. O.; Campbell, W. J.

    1982-01-01

    The conditions under which a hybrid of clustering and canonical analysis for image classification produces optimum results were analyzed. The approach involves generation of classes by clustering for input to canonical analysis. The importance of the number of clusters input and the effect of other parameters of the clustering algorithm (ISOCLS) were examined. The approach derives its final result by clustering the canonically transformed data; therefore, the importance of the number of clusters requested in this final stage was also examined. The effects of these variables were studied in terms of the average separability (as measured by transformed divergence) of the final clusters, the transformation matrices resulting from different numbers of input classes, and the accuracy of the final classifications. The research was performed with LANDSAT MSS data over the Hazleton/Berwick, Pennsylvania, area. Final classifications were compared pixel by pixel with an existing geographic information system to provide an indication of their accuracy.

  9. On the relation between tree crown morphology and particulate matter deposition on urban tree leaves: A ground-based LiDAR approach

    NASA Astrophysics Data System (ADS)

    Hofman, Jelle; Bartholomeus, Harm; Calders, Kim; Van Wittenberghe, Shari; Wuyts, Karen; Samson, Roeland

    2014-12-01

    Urban dwellers often breathe air that does not meet European and WHO standards. Next to legislative initiatives to lower atmospheric pollutants, much research has been conducted on the potential of urban trees as a mitigation tool for atmospheric particles. While leaf-deposited dust has been shown to vary significantly throughout single tree crowns, this study evaluated the influence of micro-scale tree crown morphology (leaf density) on the amount of leaf-deposited dust. Using a ground-based LiDAR approach, the three-dimensional tree crown morphology was obtained and compared to gravimetric measurements of leaf-deposited dust within three size fractions (>10, 3-10 and 0.2-3 μm). To our knowledge, this is the first application of ground-based LiDAR for comparison with gravimetric results of leaf-deposited particulate matter. Overall, an increasing leaf density appears to reduce leaf deposition of atmospheric particles. This might be explained by a reduced wind velocity, suppressing turbulent deposition of atmospheric particles through impaction. Nevertheless, the effect of tree crown morphology on particulate deposition appears almost negligible (7% AIC decrease) compared to the influence of physical factors like height, azimuth and tree position.

  10. Updating the US Hydrologic Classification: An Approach to Clustering and Stratifying Ecohydrologic Data

    SciTech Connect

    McManamay, Ryan A; Bevelhimer, Mark S; Kao, Shih-Chieh

    2013-01-01

    Hydrologic classifications unveil the structure of relationships among groups of streams with differing stream flow and provide a foundation for drawing inferences about the principles that govern those relationships. Hydrologic classes provide a template to describe ecological patterns, generalize hydrologic responses to disturbance, and stratify research and management needs applicable to ecohydrology. We developed two updated hydrologic classifications for the continental US using two streamflow datasets of varying reference standards. Using only reference-quality gages, we classified 1715 stream gages into 12 classes across the US. By including more streamflow gages (n=2618) in a separate classification, we increased the dimensionality (i.e. classes) and hydrologic distinctiveness within regions at the expense of decreasing the natural flow standards (i.e. reference quality). Greater numbers of classes and higher regional affiliation within our hydrologic classifications compared to the previous US hydrologic classification (Poff, 1996) suggested that the level of hydrologic variation and resolution was not completely represented in smaller sample sizes. Part of the utility of classification systems rests in their ability to classify new objects and stratify analyses. We constructed separate random forests to predict hydrologic class membership based on hydrologic indices or landscape variables. In addition, we provide an approach to assessing potential outliers due to hydrologic alteration based on class assignment. Departures from class membership due to disturbance take into account multiple hydrologic indices simultaneously; thus, classes can be used to determine whether disturbed streams are functioning within the realm of natural hydrology.

  11. Gaussian Kernel Based Classification Approach for Wheat Identification

    NASA Astrophysics Data System (ADS)

    Aggarwal, R.; Kumar, A.; Raju, P. L. N.; Krishna Murthy, Y. V. N.

    2014-11-01

    Agriculture holds a pivotal role in India, which is basically an agrarian economy. Crop type identification is a key issue for monitoring agriculture and is the basis for crop acreage and yield estimation. However, it is very challenging to identify a specific crop using single-date imagery; hence, a multi-temporal analysis approach is highly important for specific crop identification. This research work deals with the implementation of a fuzzy classifier, Possibilistic c-Means (PCM), with and without a kernel-based approach, using temporal Landsat 8 OLI (Operational Land Imager) data for identification of wheat in Radaur City, Haryana. The multi-temporal dataset covers the complete phenological cycle, from seedling to ripening, of wheat crop growth. The experimental results show that the inclusion of a Gaussian kernel with the Euclidean norm (ED norm) in Possibilistic c-Means (KPCM) makes the soft classifier more robust in identification of the wheat crop. Also, identification of all the wheat fields is dependent upon appropriate selection of the temporal dates. The best combination of temporal data corresponds to the tillering, stem extension, heading and ripening stages of the wheat crop. Entropy at wheat testing sites has been used to validate the classified results. The entropy value at testing sites was observed to be low, implying low uncertainty of the existence of any other class and high certainty of the existence of the wheat crop at those sites.

  12. A sampling and classification item selection approach with content balancing.

    PubMed

    Chen, Pei-Hua

    2015-03-01

    Existing automated test assembly methods typically employ constrained combinatorial optimization. Constructing forms sequentially based on an optimization approach usually results in unparallel forms and requires heuristic modifications. Methods based on a random search approach have the major advantage of producing parallel forms sequentially without further adjustment. This study incorporated a flexible content-balancing element into the statistical perspective item selection method of the cell-only method (Chen et al. in Educational and Psychological Measurement, 72(6), 933-953, 2012). The new method was compared with a sequential interitem distance weighted deviation model (IID WDM) (Swanson & Stocking in Applied Psychological Measurement, 17(2), 151-166, 1993), a simultaneous IID WDM, and a big-shadow-test mixed integer programming (BST MIP) method to construct multiple parallel forms based on matching a reference form item-by-item. The results showed that the cell-only method with content balancing and the sequential and simultaneous versions of IID WDM yielded results comparable to those obtained using the BST MIP method. The cell-only method with content balancing is computationally less intensive than the sequential and simultaneous versions of IID WDM. PMID:24610145

  13. A new approach to plane-sweep overlay: topological structuring and line-segment classification

    USGS Publications Warehouse

    van Roessel, Jan W.

    1991-01-01

    An integrated approach to spatial overlay was developed with the objective of creating a single function that can perform most of the tasks now assigned to discrete functions in current systems. Two important components of this system are a unique method for topological structuring, and a method for attribute propagation and line-segment classification. -Author

  14. A novel information transferring approach for the classification of remote sensing images

    NASA Astrophysics Data System (ADS)

    Gao, Jianqiang; Xu, Lizhong; Shen, Jie; Huang, Fengchen; Xu, Feng

    2015-12-01

    Traditional remote sensing image classification methods focused on using a large amount of labeled target data to train an efficient classification model. However, these approaches were generally based on the target data alone, without considering auxiliary data or the additional information such data can provide. If the valuable information from auxiliary data could be successfully transferred to the target data, the performance of the classification model would be improved. In addition, from the perspective of practical application, this valuable information from auxiliary data should be fully used. Therefore, in this paper, based on the transfer learning idea, we propose a novel information transferring approach to improve remote sensing image classification performance. The main rationale of this approach is that, first, the information of the same areas associated with each pixel is modeled as the intra-class set and the information of different areas associated with each pixel is modeled as the inter-class set; then, the texture feature information of each area obtained from the auxiliary data is transferred to the target data set such that the inter-class set is separated and the intra-class set is gathered as far as possible. Experiments show that the proposed approach is effective and feasible.

  15. Wittgenstein's philosophy and a dimensional approach to the classification of mental disorders -- a preliminary scheme.

    PubMed

    Mackinejad, Kioumars; Sharifi, Vandad

    2006-01-01

    In this paper the importance of Wittgenstein's philosophical ideas for the justification of a dimensional approach to the classification of mental disorders is discussed. Some of his basic concepts in his Philosophical Investigations, such as 'family resemblances', 'grammar' and 'language-game' and their relations to the concept of mental disorder are explored.

  16. A Consensus Tree Approach for Reconstructing Human Evolutionary History and Detecting Population Substructure

    NASA Astrophysics Data System (ADS)

    Tsai, Ming-Chi; Blelloch, Guy; Ravi, R.; Schwartz, Russell

    The random accumulation of variations in the human genome over time implicitly encodes a history of how human populations have arisen, dispersed, and intermixed since we emerged as a species. Reconstructing that history is a challenging computational and statistical problem but has important applications both to basic research and to the discovery of genotype-phenotype correlations. In this study, we present a novel approach to inferring human evolutionary history from genetic variation data. Our approach uses the idea of consensus trees, a technique generally used to reconcile species trees from divergent gene trees, adapting it to the problem of finding the robust relationships within a set of intraspecies phylogenies derived from local regions of the genome. We assess the quality of the method on two large-scale genetic variation data sets: the HapMap Phase II and the Human Genome Diversity Project. Qualitative comparison to a consensus model of the evolution of modern human population groups shows that our inferences closely match our best current understanding of human evolutionary history. A further comparison with results of a leading method for the simpler problem of population substructure assignment verifies that our method provides comparable accuracy in identifying meaningful population subgroups in addition to inferring the relationships among them.

  17. Sequence comparison alignment-free approach based on suffix tree and L-words frequency.

    PubMed

    Soares, Inês; Goios, Ana; Amorim, António

    2012-01-01

    The vast majority of methods available for sequence comparison rely on a first sequence alignment step, which requires a number of assumptions on evolutionary history and is sometimes very difficult or impossible to perform due to the abundance of gaps (insertions/deletions). In such cases, an alternative alignment-free method would prove valuable. Our method starts by a computation of a generalized suffix tree of all sequences, which is completed in linear time. Using this tree, the frequency of all possible words with a preset length L (L-words) in each sequence is rapidly calculated. Based on the L-words frequency profile of each sequence, a pairwise standard Euclidean distance is then computed producing a symmetric genetic distance matrix, which can be used to generate a neighbor joining dendrogram or a multidimensional scaling graph. We present an improvement to word counting alignment-free approaches for sequence comparison, by determining a single optimal word length and combining suffix tree structures to the word counting tasks. Our approach is, thus, a fast and simple application that proved to be efficient and powerful when applied to mitochondrial genomes. The algorithm was implemented in Python language and is freely available on the web.
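
    The word-counting core of the method (without the suffix-tree speed-up) can be sketched as follows; the toy sequences are placeholders, and counting is done naively rather than in linear time.

      from itertools import product
      import numpy as np

      def lword_profile(seq, L=3, alphabet="ACGT"):
          # Relative frequency of every possible word of length L in the sequence.
          words = ["".join(w) for w in product(alphabet, repeat=L)]
          counts = np.array([sum(seq[i:i + L] == w for i in range(len(seq) - L + 1))
                             for w in words], dtype=float)
          return counts / counts.sum()

      seqs = {"s1": "ACGTACGTGGCAACGT", "s2": "ACGTTCGTGGAAACGT", "s3": "TTTTGGGGCCCCAAAA"}
      profiles = {name: lword_profile(s) for name, s in seqs.items()}

      names = list(seqs)
      dist = np.array([[np.linalg.norm(profiles[a] - profiles[b]) for b in names] for a in names])
      print(names)
      print(dist.round(3))   # symmetric distance matrix usable for neighbor joining or MDS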

  18. Mapping raised bogs with an iterative one-class classification approach

    NASA Astrophysics Data System (ADS)

    Mack, Benjamin; Roscher, Ribana; Stenzel, Stefanie; Feilhauer, Hannes; Schmidtlein, Sebastian; Waske, Björn

    2016-10-01

    Land use and land cover maps are among the most commonly used remote sensing products. In many applications the user only requires a map of one particular class of interest, e.g. a specific vegetation type or an invasive species. One-class classifiers are appealing alternatives to common supervised classifiers because they can be trained with labeled training data of the class of interest only. However, training an accurate one-class classification (OCC) model is challenging, particularly when facing a large image, a small class and few training samples. To tackle these problems we propose an iterative OCC approach. The presented approach uses a biased Support Vector Machine as the core classifier. In an iterative pre-classification step, a large part of the pixels not belonging to the class of interest is classified. The remaining data are classified by a final classifier with a novel model and threshold selection approach. The specific objective of our study is the classification of raised bogs in a study site in southeast Germany, using multi-seasonal RapidEye data and a small number of training samples. Results demonstrate that the iterative OCC outperforms other state-of-the-art one-class classifiers and approaches for model selection. The study highlights the potential of the proposed approach for an efficient and improved mapping of small classes such as raised bogs. Overall, the proposed method constitutes a feasible and useful modification of a regular one-class classifier.
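
    The one-class setting can be sketched with scikit-learn's OneClassSVM trained on samples of the class of interest only; the paper's core classifier is a biased SVM embedded in an iterative procedure, which this simple stand-in does not reproduce, and the spectra below are synthetic.

      import numpy as np
      from sklearn.svm import OneClassSVM

      rng = np.random.default_rng(0)
      bog_pixels = rng.normal(loc=0.0, scale=0.5, size=(300, 5))      # labeled class of interest (toy spectra)
      scene_pixels = rng.normal(loc=1.5, scale=1.0, size=(5000, 5))   # unlabeled image pixels

      occ = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05).fit(bog_pixels)
      pred = occ.predict(scene_pixels)                                # +1 = class of interest, -1 = other
      print("pixels mapped to the target class:", int((pred == 1).sum()))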

  19. Identifying Risk and Protective Factors in Recidivist Juvenile Offenders: A Decision Tree Approach

    PubMed Central

    Ortega-Campos, Elena; García-García, Juan; Gil-Fenoy, Maria José; Zaldívar-Basurto, Flor

    2016-01-01

    Research on juvenile justice aims to identify profiles of risk and protective factors in juvenile offenders. This paper presents a study of profiles of risk factors that influence young offenders toward committing sanctionable antisocial behavior (S-ASB). Decision tree analysis is used as a multivariate approach to the phenomenon of repeated sanctionable antisocial behavior in juvenile offenders in Spain. The study sample was made up of the set of juveniles who were charged in a court case in the Juvenile Court of Almeria (Spain). The period of study of recidivism was two years from the baseline. The object of study is presented, through the implementation of a decision tree. Two profiles of risk and protective factors are found. Risk factors associated with higher rates of recidivism are antisocial peers, age at baseline S-ASB, problems in school and criminality in family members. PMID:27611313

  20. Identifying Risk and Protective Factors in Recidivist Juvenile Offenders: A Decision Tree Approach.

    PubMed

    Ortega-Campos, Elena; García-García, Juan; Gil-Fenoy, Maria José; Zaldívar-Basurto, Flor

    2016-01-01

    Research on juvenile justice aims to identify profiles of risk and protective factors in juvenile offenders. This paper presents a study of profiles of risk factors that influence young offenders toward committing sanctionable antisocial behavior (S-ASB). Decision tree analysis is used as a multivariate approach to the phenomenon of repeated sanctionable antisocial behavior in juvenile offenders in Spain. The study sample was made up of the set of juveniles who were charged in a court case in the Juvenile Court of Almeria (Spain). The period of study of recidivism was two years from the baseline. The object of study is presented, through the implementation of a decision tree. Two profiles of risk and protective factors are found. Risk factors associated with higher rates of recidivism are antisocial peers, age at baseline S-ASB, problems in school and criminality in family members. PMID:27611313

  1. Sovereign debt crisis in the European Union: A minimum spanning tree approach

    NASA Astrophysics Data System (ADS)

    Dias, João

    2012-03-01

    In the wake of the financial crisis, a sovereign debt crisis has emerged and is severely affecting some countries in the European Union, threatening the viability of the euro and even the EU itself. This paper applies recent developments in econophysics, in particular the minimum spanning tree approach and the associated hierarchical tree, to analyze the asynchronization between the four most affected countries and other, more resilient countries in the euro area. For this purpose, daily government bond yield rates are used, covering the period from April 2007 to October 2010, thus including yield rates before, during and after the financial crisis. The results show an increasing separation of the two groups of euro countries with the deepening of the government bond crisis.
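
    The econophysics pipeline can be sketched with NumPy and SciPy: correlate the yield series, convert correlations to the commonly used distance d = sqrt(2(1 - rho)), and extract the minimum spanning tree. The yield matrix below is synthetic, not the paper's data.

      import numpy as np
      from scipy.sparse.csgraph import minimum_spanning_tree

      rng = np.random.default_rng(0)
      yields = rng.normal(size=(900, 6)).cumsum(axis=0)   # ~900 trading days x 6 countries (toy data)

      rho = np.corrcoef(yields, rowvar=False)
      dist = np.sqrt(2.0 * (1.0 - rho))                   # correlation-based metric distance
      np.fill_diagonal(dist, 0.0)

      mst = minimum_spanning_tree(dist).toarray()
      edges = np.argwhere(mst > 0)                        # (i, j) country pairs linked in the tree
      print(edges)
      print(mst[mst > 0].round(2))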

  2. The practice of classification and the theory of evolution, and what the demise of Charles Darwin's tree of life hypothesis means for both of them

    PubMed Central

    Doolittle, W. Ford

    2009-01-01

    Debates over the status of the tree of life (TOL) often proceed without agreement as to what it is supposed to be: a hierarchical classification scheme, a tracing of genomic and organismal history or a hypothesis about evolutionary processes and the patterns they can generate. I will argue that for Darwin it was a hypothesis, which lateral gene transfer in prokaryotes now shows to be false. I will propose a more general and relaxed evolutionary theory and point out why anti-evolutionists should take no comfort from disproof of the TOL hypothesis. PMID:19571242

  3. The practice of classification and the theory of evolution, and what the demise of Charles Darwin's tree of life hypothesis means for both of them.

    PubMed

    Doolittle, W Ford

    2009-08-12

    Debates over the status of the tree of life (TOL) often proceed without agreement as to what it is supposed to be: a hierarchical classification scheme, a tracing of genomic and organismal history or a hypothesis about evolutionary processes and the patterns they can generate. I will argue that for Darwin it was a hypothesis, which lateral gene transfer in prokaryotes now shows to be false. I will propose a more general and relaxed evolutionary theory and point out why anti-evolutionists should take no comfort from disproof of the TOL hypothesis.

  4. Risk assessment for enterprise resource planning (ERP) system implementations: a fault tree analysis approach

    NASA Astrophysics Data System (ADS)

    Zeng, Yajun; Skibniewski, Miroslaw J.

    2013-08-01

    Enterprise resource planning (ERP) system implementations are often characterised by large capital outlay, long implementation duration, and high risk of failure. In order to avoid ERP implementation failure and realise the benefits of the system, sound risk management is the key. This paper proposes a probabilistic risk assessment approach for ERP system implementation projects based on fault tree analysis, which models the relationship between ERP system components and specific risk factors. Unlike traditional risk management approaches that have mostly focused on meeting project budget and schedule objectives, the proposed approach intends to address the risks that may cause ERP system usage failure. The approach can be used to identify the root causes of ERP system implementation usage failure and quantify the impact of critical component failures or critical risk events in the implementation process.
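
    The quantitative side of fault tree analysis can be sketched by combining basic-event probabilities through AND/OR gates under an independence assumption; the event names, probabilities, and tree structure below are hypothetical and are not taken from the paper.

      def and_gate(*probs):
          # All inputs must fail (independent events).
          p = 1.0
          for q in probs:
              p *= q
          return p

      def or_gate(*probs):
          # At least one input fails (independent events).
          p_none = 1.0
          for q in probs:
              p_none *= 1.0 - q
          return 1.0 - p_none

      p_data_migration_fails = 0.08
      p_training_inadequate = 0.15
      p_customization_defect = 0.10

      # Illustrative top event: usage failure if data migration fails, or if inadequate
      # training coincides with a customization defect.
      p_top = or_gate(p_data_migration_fails, and_gate(p_training_inadequate, p_customization_defect))
      print(round(p_top, 4))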

  5. A hybrid approach of stepwise regression, logistic regression, support vector machine, and decision tree for forecasting fraudulent financial statements.

    PubMed

    Chen, Suduan; Goo, Yeong-Jia James; Shen, Zone-De

    2014-01-01

    As fraudulent financial statements by enterprises become increasingly serious with each passing day, establishing a valid model for forecasting fraudulent financial statements has become an important question for academic research and financial practice. After screening the important variables using stepwise regression, the study applies logistic regression, support vector machine, and decision tree methods to construct classification models for comparison. The study adopts financial and nonfinancial variables to assist in the establishment of the forecasting model. The research objects are companies for which fraudulent or nonfraudulent financial statements occurred between the years 1998 and 2012. The findings are that financial and nonfinancial information can be effectively used to distinguish fraudulent financial statements, and that the decision tree C5.0 has the best classification accuracy, 85.71%.
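
    The hybrid pipeline can be sketched with scikit-learn: select variables first, then compare logistic regression, an SVM, and a decision tree on the reduced feature set. Forward sequential selection stands in for classical stepwise regression, a CART tree stands in for C5.0, and the data are synthetic.

      from sklearn.datasets import make_classification
      from sklearn.feature_selection import SequentialFeatureSelector
      from sklearn.linear_model import LogisticRegression
      from sklearn.model_selection import cross_val_score
      from sklearn.svm import SVC
      from sklearn.tree import DecisionTreeClassifier

      X, y = make_classification(n_samples=400, n_features=20, n_informative=6, random_state=0)

      selector = SequentialFeatureSelector(LogisticRegression(max_iter=1000),
                                           n_features_to_select=6, direction="forward").fit(X, y)
      X_sel = selector.transform(X)

      for name, clf in [("logistic regression", LogisticRegression(max_iter=1000)),
                        ("support vector machine", SVC()),
                        ("decision tree", DecisionTreeClassifier(max_depth=5, random_state=0))]:
          acc = cross_val_score(clf, X_sel, y, cv=5).mean()
          print(name, "CV accuracy =", round(acc, 3))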

  6. A Hybrid Approach of Stepwise Regression, Logistic Regression, Support Vector Machine, and Decision Tree for Forecasting Fraudulent Financial Statements

    PubMed Central

    Goo, Yeong-Jia James; Shen, Zone-De

    2014-01-01

    As fraudulent financial statements by enterprises become increasingly serious with each passing day, establishing a valid model for forecasting fraudulent financial statements has become an important question for academic research and financial practice. After screening the important variables using stepwise regression, the study applies logistic regression, support vector machine, and decision tree methods to construct classification models for comparison. The study adopts financial and nonfinancial variables to assist in the establishment of the forecasting model. The research objects are companies for which fraudulent or nonfraudulent financial statements occurred between the years 1998 and 2012. The findings are that financial and nonfinancial information can be effectively used to distinguish fraudulent financial statements, and that the decision tree C5.0 has the best classification accuracy, 85.71%. PMID:25302338

  7. A hybrid approach of stepwise regression, logistic regression, support vector machine, and decision tree for forecasting fraudulent financial statements.

    PubMed

    Chen, Suduan; Goo, Yeong-Jia James; Shen, Zone-De

    2014-01-01

    As fraudulent financial statements by enterprises become increasingly serious with each passing day, establishing a valid model for forecasting fraudulent financial statements has become an important question for academic research and financial practice. After screening the important variables using stepwise regression, the study applies logistic regression, support vector machine, and decision tree methods to construct classification models for comparison. The study adopts financial and nonfinancial variables to assist in the establishment of the forecasting model. The research objects are companies for which fraudulent or nonfraudulent financial statements occurred between the years 1998 and 2012. The findings are that financial and nonfinancial information can be effectively used to distinguish fraudulent financial statements, and that the decision tree C5.0 has the best classification accuracy, 85.71%. PMID:25302338

  8. Computer-aided diagnosis of interstitial lung disease: a texture feature extraction and classification approach

    NASA Astrophysics Data System (ADS)

    Vargas-Voracek, Rene; McAdams, H. Page; Floyd, Carey E., Jr.

    1998-06-01

    An approach for the classification of normal or abnormal lung parenchyma from selected regions of interest (ROIs) of chest radiographs is presented for computer aided diagnosis of interstitial lung disease (ILD). The proposed approach uses a feed-forward neural network to classify each ROI based on a set of isotropic texture measures obtained from the joint grey level distribution of pairs of pixels separated by a specific distance. Two hundred ROIs, each 64 X 64 pixels in size (11 X 11 mm), were extracted from digitized chest radiographs for testing. Diagnosis performance was evaluated with the leave-one-out method. Classification of independent ROIs achieved a sensitivity of 90% and a specificity of 84% with an area under the receiver operating characteristic curve of 0.85. The diagnosis for each patient was correct for all cases when a `majority vote' criterion for the classification of the corresponding ROIs was applied to issue a normal or ILD patient classification. The proposed approach is a simple, fast, and consistent method for computer aided diagnosis of ILD with a very good performance. Further research will include additional cases, including differential diagnosis among ILD manifestations.

  9. A new approach to the hazard classification of alloys based on transformation/dissolution.

    PubMed

    Skeaff, James M; Hardy, David J; King, Pierrette

    2008-01-01

    Most of the metals produced for commercial application enter into service as alloys, which, together with metals and all other chemicals in commerce, are subject to a hazard identification and classification initiative now being implemented in a number of jurisdictions worldwide, including the European Union Registration, Evaluation, Authorisation and Restriction of Chemicals (REACH) initiative, effective 1 June 2007. This initiative has considerable implications for environmental protection and market access. While a method for the hazard identification and classification of metals is available in the recently developed United Nations (UN) guidance document on the Globally Harmonized System of Hazard Classification and Labelling (GHS), an approach for alloys has yet to be formulated. Within the GHS, a transformation/dissolution protocol (T/DP) for metals and sparingly soluble metal compounds is provided as a standard laboratory method for measuring the rate and extent of the release of metals into aqueous media from metal-bearing substances. By comparison with ecotoxicity reference data, T/D data can be used to derive UN GHS classification proposals. In this study we applied the T/DP for the first time to several economically important metals and alloys: iron powder, nickel powder, copper powder, and the alloys Fe-2Cu-0.6C (copper = 2%, carbon = 0.6%), Fe-2Ni-0.6C, Stainless Steel 304, Monel, brass, Inconel, and nickel-silver. The iron and copper powders and the iron and nickel powders had been sintered to produce the Fe-2Me-0.6C (Me = copper or nickel) alloys, which made them essentially resistant to reaction with the aqueous media, so they would not classify under the GHS, although their component copper and nickel metal powders would. Forming a protective passivating film, chromium in the Stainless Steel 304 and Inconel alloys protected them from reaction with the aqueous media, so that their metal releases were minimal and would not result in GHS classification

  10. A neural-network approach to nonparametric and robust classification procedures.

    PubMed

    Voudouri-Maniati, E; Kurz, L; Kowalski, J M

    1997-01-01

    In this paper algorithms of neural-network type are introduced for solving estimation and classification problems when assumptions about independence, Gaussianity, and stationarity of the observation samples are no longer valid. Specifically, the asymptotic normality of several nonparametric classification tests is demonstrated and their implementation using a neural-network approach is presented. Initially, the neural nets train themselves via learning samples for nominal noise and alternative hypotheses distributions resulting in near optimum performance in a particular stochastic environment. In other than the nominal environments, however, high efficiency is maintained by adapting the optimum nonlinearities to changing conditions during operation via parallel networks, without disturbing the classification process. Furthermore, the superiority in performance of the proposed networks over more traditional neural nets is demonstrated in an application involving pattern recognition.

  11. Stygoregions – a promising approach to a bioregional classification of groundwater systems

    PubMed Central

    Stein, Heide; Griebler, Christian; Berkhoff, Sven; Matzke, Dirk; Fuchs, Andreas; Hahn, Hans Jürgen

    2012-01-01

    Linked to diverse biological processes, groundwater ecosystems deliver essential services to mankind, the most important of which is the provision of drinking water. In contrast to surface waters, ecological aspects of groundwater systems are ignored by the current European Union and national legislation. Groundwater management and protection measures refer exclusively to its good physicochemical and quantitative status. Current initiatives in developing ecologically sound integrative assessment schemes by taking groundwater fauna into account depend on the initial classification of subsurface bioregions. In a large scale survey, the regional and biogeographical distribution patterns of groundwater dwelling invertebrates were examined for many parts of Germany. Following an exploratory approach, our results underline that the distribution patterns of invertebrates in groundwater are not in accordance with any existing bioregional classification system established for surface habitats. In consequence, we propose to develop a new classification scheme for groundwater ecosystems based on stygoregions. PMID:22993698

  12. Stygoregions--a promising approach to a bioregional classification of groundwater systems.

    PubMed

    Stein, Heide; Griebler, Christian; Berkhoff, Sven; Matzke, Dirk; Fuchs, Andreas; Hahn, Hans Jürgen

    2012-01-01

    Linked to diverse biological processes, groundwater ecosystems deliver essential services to mankind, the most important of which is the provision of drinking water. In contrast to surface waters, ecological aspects of groundwater systems are ignored by the current European Union and national legislation. Groundwater management and protection measures refer exclusively to its good physicochemical and quantitative status. Current initiatives in developing ecologically sound integrative assessment schemes by taking groundwater fauna into account depend on the initial classification of subsurface bioregions. In a large-scale survey, the regional and biogeographical distribution patterns of groundwater dwelling invertebrates were examined for many parts of Germany. Following an exploratory approach, our results underline that the distribution patterns of invertebrates in groundwater are not in accordance with any existing bioregional classification system established for surface habitats. In consequence, we propose to develop a new classification scheme for groundwater ecosystems based on stygoregions.

  13. A High Performance Computing Approach to Tree Cover Delineation in 1-m NAIP Imagery using a Probabilistic Learning Framework

    NASA Astrophysics Data System (ADS)

    Basu, S.; Ganguly, S.; Michaelis, A.; Votava, P.; Roy, A.; Mukhopadhyay, S.; Nemani, R. R.

    2015-12-01

    Tree cover delineation is a useful instrument in deriving Above Ground Biomass (AGB) density estimates from Very High Resolution (VHR) airborne imagery data. Numerous algorithms have been designed to address this problem, but most of them do not scale to these datasets, which are of the order of terabytes. In this paper, we present a semi-automated probabilistic framework for the segmentation and classification of 1-m National Agriculture Imagery Program (NAIP) imagery for tree-cover delineation for the whole of the Continental United States, using a High Performance Computing Architecture. Classification is performed using a multi-layer Feedforward Backpropagation Neural Network and segmentation is performed using a Statistical Region Merging algorithm. The results from the classification and segmentation algorithms are then consolidated into a structured prediction framework using a discriminative undirected probabilistic graphical model based on Conditional Random Field, which helps in capturing the higher order contextual dependencies between neighboring pixels. Once the final probability maps are generated, the framework is updated and re-trained by relabeling misclassified image patches. This leads to a significant improvement in the true positive rates and reduction in false positive rates. The tree cover maps were generated for the whole state of California, spanning a total of 11,095 NAIP tiles covering a total geographical area of 163,696 sq. miles. The framework produced true positive rates of around 88% for fragmented forests and 74% for urban tree cover areas, with false positive rates lower than 2% for both landscapes. Comparative studies with the National Land Cover Data (NLCD) algorithm and the LiDAR canopy height model (CHM) showed the effectiveness of our framework for generating accurate high-resolution tree-cover maps.

  14. A High Performance Computing Approach to Tree Cover Delineation in 1-m NAIP Imagery Using a Probabilistic Learning Framework

    NASA Technical Reports Server (NTRS)

    Basu, Saikat; Ganguly, Sangram; Michaelis, Andrew; Votava, Petr; Roy, Anshuman; Mukhopadhyay, Supratik; Nemani, Ramakrishna

    2015-01-01

    Tree cover delineation is a useful instrument in deriving Above Ground Biomass (AGB) density estimates from Very High Resolution (VHR) airborne imagery data. Numerous algorithms have been designed to address this problem, but most of them do not scale to these datasets, which are of the order of terabytes. In this paper, we present a semi-automated probabilistic framework for the segmentation and classification of 1-m National Agriculture Imagery Program (NAIP) imagery for tree-cover delineation for the whole of the Continental United States, using a High Performance Computing Architecture. Classification is performed using a multi-layer Feedforward Backpropagation Neural Network and segmentation is performed using a Statistical Region Merging algorithm. The results from the classification and segmentation algorithms are then consolidated into a structured prediction framework using a discriminative undirected probabilistic graphical model based on Conditional Random Field, which helps in capturing the higher order contextual dependencies between neighboring pixels. Once the final probability maps are generated, the framework is updated and re-trained by relabeling misclassified image patches. This leads to a significant improvement in the true positive rates and reduction in false positive rates. The tree cover maps were generated for the whole state of California, spanning a total of 11,095 NAIP tiles covering a total geographical area of 163,696 sq. miles. The framework produced true positive rates of around 88% for fragmented forests and 74% for urban tree cover areas, with false positive rates lower than 2% for both landscapes. Comparative studies with the National Land Cover Data (NLCD) algorithm and the LiDAR canopy height model (CHM) showed the effectiveness of our framework for generating accurate high-resolution tree-cover maps.

  15. An approach for classification of hydrogeological systems at the regional scale based on groundwater hydrographs

    NASA Astrophysics Data System (ADS)

    Haaf, Ezra; Barthel, Roland

    2016-04-01

    When assessing hydrogeological conditions at the regional scale, the analyst is often confronted with uncertainty of structures, inputs and processes while having to base inference on scarce and patchy data. Haaf and Barthel (2015) proposed a concept for handling this predicament by developing a groundwater systems classification framework, in which information is transferred from similar but well-explored and better understood systems to poorly described ones. The concept is based on the central hypothesis that similar systems react similarly to the same inputs and vice versa. It is conceptually related to PUB (Prediction in Ungauged Basins), where systems and processes are organized by quantitative methods to improve understanding and prediction. Furthermore, it is expected that, using the framework, regional conceptual and numerical models can be checked or enriched by ensemble-generated data from neighborhood-based estimators. In a first step, groundwater hydrographs from a large dataset in Southern Germany are compared in an effort to identify structural similarity in groundwater dynamics. A number of approaches to grouping hydrographs, mostly based on a similarity measure and previously used only in local-scale studies, can be found in the literature. These are tested alongside different global feature extraction techniques. The resulting classifications are then compared to a visual "expert assessment"-based classification which serves as a reference. A ranking of the classification methods is carried out and differences shown. Selected groups from the classifications are related to geological descriptors. Here we present the most promising results from a comparison of classifications based on series correlation, different series distances and series features, such as the coefficients of the discrete Fourier transform and the intrinsic mode functions of empirical mode decomposition. Additionally, we show examples of classes
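
    As a rough illustration of the global feature-extraction route mentioned above, the sketch below derives low-order discrete Fourier transform magnitudes from each hydrograph and groups them with k-means; the synthetic series, the number of retained coefficients and the number of clusters are assumptions for illustration, not values from the study.

      # Sketch: grouping groundwater hydrographs by low-order DFT magnitudes.
      import numpy as np
      from sklearn.cluster import KMeans

      def dft_features(hydrographs, n_coeff=8):
          """hydrographs: (n_series, n_timesteps) array of equally spaced levels."""
          # z-normalise each series so amplitude differences do not dominate
          x = (hydrographs - hydrographs.mean(axis=1, keepdims=True)) \
              / hydrographs.std(axis=1, keepdims=True)
          spectrum = np.fft.rfft(x, axis=1)
          # keep the magnitudes of the first n_coeff non-constant harmonics
          return np.abs(spectrum[:, 1:n_coeff + 1])

      rng = np.random.default_rng(0)
      series = rng.normal(size=(120, 365)).cumsum(axis=1)   # synthetic stand-in data
      features = dft_features(series)
      labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(features)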

  16. A game-theoretic tree matching approach for object detection in high-resolution remotely sensed images

    NASA Astrophysics Data System (ADS)

    Liang, Yilong; Cahill, Nathan D.; Saber, Eli; Messinger, David W.

    2015-10-01

    In this paper, we propose a game-theoretic tree matching algorithm for object detection in high resolution (HR) remotely sensed images, where, given a scene image and an object image, the goal is to determine whether or not the object exists in the scene image. To that effect, tree based representations of the images are obtained using a hierarchical scale space approach. The nodes of the tree denote regions in the image and edges represent the relative containment between different regions. Once we have the tree representations of each image, the task of object detection is reformulated as a tree matching problem. We propose a game-theoretic technique to search for the node correspondences between a pair of trees. This method involves defining a non-cooperative matching game, where strategies denote the possible pairs of matching regions and payoffs determine the compatibilities between these strategies. Trees are matched by finding the evolutionary stable states (ESS) of the game. To validate the effectiveness of the proposed algorithm, we perform experiments on both synthetic and HR remotely sensed images. Our results demonstrate the robustness of the tree representation with respect to different spatial variations of the images, as well as the effectiveness of the proposed game-theoretic tree matching algorithm.

  17. UAV based tree height estimation in apple orchards: potential of multiple approaches

    NASA Astrophysics Data System (ADS)

    Mejia-Aguilar, Abraham; Tomelleri, Enrico; Vilardi, Andrea; Zebisch, Marc

    2015-04-01

    Canopy height, as part of vegetation structure, is important for ecological studies on biomass, matter flows, and meteorology. Canopy growth can be measured using multiple remote sensing techniques. In this study, we first use data generated from an Unmanned Aerial Vehicle (UAV) carrying consumer-grade RGB and modified IR cameras simultaneously, configured in nadir and multi-angle views, to generate 3D models for a Digital Surface Model (DSM) and a Digital Terrain Model (DTM) in order to estimate tree height in apple orchards in South Tyrol, Italy. We evaluate the use of Ground Control Points (GCP) to minimize the error in scale and orientation. Then, we validate and compare the results of our primary data collection with data generated by geolocated field measurements over several selected tree species. Additionally, we compare with the DSM and DTM obtained from a recent 1-meter resolution LiDAR (Light Detection and Ranging) campaign. The main purpose of this study is to contrast multiple estimation approaches and evaluate their utility for the estimation of canopy height, highlighting the use of UAV systems as a fast, reliable and inexpensive technique, especially for small-scale applications. The study is conducted in a homogeneous tree canopy consisting of apple orchards located in Caldaro, South Tyrol, Italy. We end by proposing a low-cost application that combines the DSM from the UAV with the DTM obtained from LiDAR for products that must be updated frequently.
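
    Since the height estimate ultimately rests on differencing the surface and terrain models, the following minimal sketch (with synthetic grids and an assumed search window around each stem) shows the canopy height model step; it is not the authors' processing chain.

      # Canopy height model (CHM) as the per-cell difference DSM - DTM,
      # with tree height read off around a known stem position.
      import numpy as np

      def canopy_height_model(dsm, dtm):
          """Both inputs are co-registered 2-D elevation grids of equal shape."""
          chm = dsm - dtm
          return np.clip(chm, 0, None)        # negative values are treated as noise

      def tree_height(chm, row, col, window=3):
          """Maximum CHM value in a small window around a known stem position."""
          r0, r1 = max(row - window, 0), row + window + 1
          c0, c1 = max(col - window, 0), col + window + 1
          return float(chm[r0:r1, c0:c1].max())

      dsm = np.full((100, 100), 250.0); dsm[40:45, 60:65] += 3.2   # a 3.2 m canopy bump
      dtm = np.full((100, 100), 250.0)
      chm = canopy_height_model(dsm, dtm)
      print(tree_height(chm, 42, 62))          # -> 3.2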

  18. A simple semi-automatic approach for land cover classification from multispectral remote sensing imagery.

    PubMed

    Jiang, Dong; Huang, Yaohuan; Zhuang, Dafang; Zhu, Yunqiang; Xu, Xinliang; Ren, Hongyan

    2012-01-01

    Land cover data represent a fundamental data source for various types of scientific research. The classification of land cover based on satellite data is a challenging task, and an efficient classification method is needed. In this study, an automatic scheme is proposed for the classification of land use using multispectral remote sensing images based on change detection and a semi-supervised classifier. The satellite image can be automatically classified using only the prior land cover map and existing images; therefore human involvement is reduced to a minimum, ensuring the operability of the method. The method was tested in the Qingpu District of Shanghai, China. Using Environment Satellite 1 (HJ-1) images of 2009 with 30 m spatial resolution, the areas were classified into five main types of land cover based on previous land cover data and spectral features. The results agreed well with validation land cover maps, with a Kappa value of 0.79 and statistical area biases in proportion of less than 6%. This study proposed a simple semi-automatic approach for land cover classification using prior maps with satisfactory accuracy, which integrates the accuracy of visual interpretation with the performance of automatic classification methods. The method can be used for land cover mapping in areas lacking ground reference information or for conveniently identifying regions of rapid land cover change (such as rapid urbanization).

  19. A Simple Semi-Automatic Approach for Land Cover Classification from Multispectral Remote Sensing Imagery

    PubMed Central

    Jiang, Dong; Huang, Yaohuan; Zhuang, Dafang; Zhu, Yunqiang; Xu, Xinliang; Ren, Hongyan

    2012-01-01

    Land cover data represent a fundamental data source for various types of scientific research. The classification of land cover based on satellite data is a challenging task, and an efficient classification method is needed. In this study, an automatic scheme is proposed for the classification of land use using multispectral remote sensing images based on change detection and a semi-supervised classifier. The satellite image can be automatically classified using only the prior land cover map and existing images; therefore human involvement is reduced to a minimum, ensuring the operability of the method. The method was tested in the Qingpu District of Shanghai, China. Using Environment Satellite 1 (HJ-1) images of 2009 with 30 m spatial resolution, the areas were classified into five main types of land cover based on previous land cover data and spectral features. The results agreed well with validation land cover maps, with a Kappa value of 0.79 and statistical area biases in proportion of less than 6%. This study proposed a simple semi-automatic approach for land cover classification using prior maps with satisfactory accuracy, which integrates the accuracy of visual interpretation with the performance of automatic classification methods. The method can be used for land cover mapping in areas lacking ground reference information or for conveniently identifying regions of rapid land cover change (such as rapid urbanization). PMID:23049886
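
    The agreement statistic reported in the two records above can be reproduced from a confusion matrix between the classified map and the reference map; the sketch below computes Cohen's kappa on a made-up 5x5 matrix purely for illustration.

      # Cohen's kappa from a confusion matrix (rows: reference, columns: classified).
      import numpy as np

      def cohens_kappa(confusion):
          confusion = np.asarray(confusion, dtype=float)
          n = confusion.sum()
          p_observed = np.trace(confusion) / n
          p_expected = (confusion.sum(axis=0) * confusion.sum(axis=1)).sum() / n**2
          return (p_observed - p_expected) / (1.0 - p_expected)

      conf = np.array([[50,  2,  1,  0,  1],
                       [ 3, 40,  2,  1,  0],
                       [ 1,  2, 45,  3,  0],
                       [ 0,  1,  2, 38,  2],
                       [ 1,  0,  0,  2, 35]])
      print(round(cohens_kappa(conf), 2))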

  20. Geographical characterization of Greek virgin olive oils (cv. Koroneiki) using 1H and 31P NMR fingerprinting with canonical discriminant analysis and classification binary trees.

    PubMed

    Petrakis, Panos V; Agiomyrgianaki, Alexia; Christophoridou, Stella; Spyros, Apostolos; Dais, Photis

    2008-05-14

    This work deals with the prediction of the geographical origin of monovarietal virgin olive oil (cv. Koroneiki) samples from three regions of southern Greece, namely, Peloponnesus, Crete, and Zakynthos, collected in five harvesting years (2001-2006). All samples were chemically analyzed by means of 1H and 31P NMR spectroscopy and characterized according to their content in fatty acids, phenolics, diacylglycerols, total free sterols, free acidity, and iodine number. Biostatistical analysis showed that the fruiting pattern of the olive tree complicates the geographical separation of oil samples and the selection of significant chemical compounds. Accordingly, the inclusion of the harvesting year improved the classification of samples but increased the dimensionality of the data. Discriminant analysis showed that geographical prediction accuracy at the level of three regions is very high (87%) and drops to 74% when we pass to the finer level of six sites (Chania, Sitia, and Heraklion in Crete; Lakonia and Messinia in Peloponnesus; Zakynthos). The use of classification binary trees made it possible to construct a geographical prediction algorithm for unknown samples in a self-improving fashion, which can be readily extended to other varieties and areas.

  1. Ship classification using nonlinear features of radiated sound: an approach based on empirical mode decomposition.

    PubMed

    Bao, Fei; Li, Chen; Wang, Xinlong; Wang, Qingfu; Du, Shuanping

    2010-07-01

    Classification for ship-radiated underwater sound is one of the most important and challenging subjects in underwater acoustical signal processing. An approach to ship classification is proposed in this work based on analysis of ship-radiated acoustical noise in subspaces of intrinsic mode functions attained via the ensemble empirical mode decomposition. It is shown that detection and acquisition of stable and reliable nonlinear features become practically feasible by nonlinear analysis of the time series of individual decomposed components, each of which is simple enough and well represents an oscillatory mode of ship dynamics. Surrogate and nonlinear predictability analysis are conducted to probe and measure the nonlinearity and regularity. The results of both methods, which verify each other, substantiate that ship-radiated noises contain components with deterministic nonlinear features well serving for efficient classification of ships. The approach perhaps opens an alternative avenue in the direction toward object classification and identification. It may also import a new view of signals as complex as ship-radiated sound.
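
    Surrogate analysis, one of the two tests named above, is commonly carried out with phase-randomised surrogates that preserve a signal's linear (spectral) structure while destroying nonlinear structure; the sketch below follows that generic recipe and is not the authors' exact implementation.

      # Phase-randomised surrogates for nonlinearity testing: keep the power
      # spectrum, randomise the phases, then compare a discriminating statistic
      # between the original series and the surrogates.
      import numpy as np

      def phase_randomized_surrogate(x, rng):
          n = len(x)
          spectrum = np.fft.rfft(x)
          phases = rng.uniform(0.0, 2.0 * np.pi, size=spectrum.shape)
          phases[0] = 0.0                      # keep the mean component
          if n % 2 == 0:
              phases[-1] = 0.0                 # keep the Nyquist component real
          surrogate = np.abs(spectrum) * np.exp(1j * phases)
          return np.fft.irfft(surrogate, n)

      rng = np.random.default_rng(1)
      signal = np.sin(np.linspace(0, 40 * np.pi, 4096)) + 0.1 * rng.normal(size=4096)
      surrogates = [phase_randomized_surrogate(signal, rng) for _ in range(19)]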

  2. A Novel Approach to Probabilistic Biomarker-Based Classification Using Functional Near-Infrared Spectroscopy

    PubMed Central

    Hahn, Tim; Marquand, Andre F; Plichta, Michael M; Ehlis, Ann-Christine; Schecklmann, Martin W; Dresler, Thomas; Jarczok, Tomasz A; Eirich, Elisa; Leonhard, Christine; Reif, Andreas; Lesch, Klaus-Peter; Brammer, Michael J; Mourao-Miranda, Janaina; Fallgatter, Andreas J

    2013-01-01

    Pattern recognition approaches to the analysis of neuroimaging data have brought new applications such as the classification of patients and healthy controls within reach. In our view, the reliance on expensive neuroimaging techniques, which are not well tolerated by many patient groups, and the inability of most current biomarker algorithms to accommodate information about prior class frequencies (such as a disorder's prevalence in the general population) are key factors limiting practical application. To overcome both limitations, we propose a probabilistic pattern recognition approach based on cheap and easy-to-use multi-channel functional near-infrared spectroscopy (fNIRS) measurements. First, we show the validity of our method by applying it to data from healthy controls (n = 14), enabling differentiation between the conditions of a visual checkerboard task. Second, we show that high-accuracy single subject classification of patients with schizophrenia (n = 40) and healthy controls (n = 40) is possible based on temporal patterns of fNIRS data measured during a working memory task. For classification, we integrate spatial and temporal information at each channel to estimate overall classification accuracy. This yields an overall accuracy of 76%, which is comparable to the highest ever achieved in biomarker-based classification of patients with schizophrenia. In summary, the proposed algorithm in combination with fNIRS measurements enables the analysis of sub-second, multivariate temporal patterns of BOLD responses and high-accuracy predictions based on low-cost, easy-to-use fNIRS patterns. In addition, our approach can easily compensate for variable class priors, which is highly advantageous in making predictions in a wide range of clinical neuroimaging applications. PMID:22965654
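
    One of the two limitations addressed above is the handling of prior class frequencies such as a disorder's prevalence; a minimal, generic way to fold a known prior into posterior probabilities from a classifier trained on roughly balanced classes is sketched below, with illustrative numbers only.

      # Re-weight posterior probabilities for a new prior (Bayes adjustment).
      import numpy as np

      def adjust_for_priors(posteriors, train_priors, target_priors):
          """posteriors: (n_samples, n_classes) probabilities from the classifier."""
          weights = np.asarray(target_priors) / np.asarray(train_priors)
          adjusted = posteriors * weights            # re-weight by the prior ratio
          return adjusted / adjusted.sum(axis=1, keepdims=True)

      p = np.array([[0.30, 0.70],                    # P(control), P(patient)
                    [0.80, 0.20]])
      print(adjust_for_priors(p, train_priors=[0.5, 0.5], target_priors=[0.99, 0.01]))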

  3. Accurate multi-source forest species mapping using the multiple spectral-spatial classification approach

    NASA Astrophysics Data System (ADS)

    Stavrakoudis, Dimitris; Gitas, Ioannis; Karydas, Christos; Kolokoussis, Polychronis; Karathanassi, Vassilia

    2015-10-01

    This paper proposes an efficient methodology for combining multiple remotely sensed images in order to increase the classification accuracy in complex forest species mapping tasks. The proposed scheme follows a decision fusion approach, whereby each image is first classified separately by means of a pixel-wise Fuzzy-Output Support Vector Machine (FO-SVM) classifier. Subsequently, the multiple results are fused according to the so-called multiple spectral-spatial classifier using the minimum spanning forest (MSSC-MSF) approach, which constitutes an effective post-regularization procedure for enhancing the result of a single pixel-based classification. For this purpose, the original MSSC-MSF has been extended in order to handle multiple classifications. In particular, the fuzzy outputs of the pixel-based classifiers are stacked and used to grow the MSF, whereas the markers are also determined considering both classifications. The proposed methodology has been tested on a challenging forest species mapping task in northern Greece, considering a multispectral (GeoEye) and a hyperspectral (CASI) image. The pixel-wise classifications resulted in overall accuracies (OA) of 68.71% for the GeoEye and 77.95% for the CASI image, and both classification maps are characterized by high levels of speckle noise. Applying the proposed multi-source MSSC-MSF fusion, the OA climbs to 90.86%, which is attributed both to the ability of MSSC-MSF to tackle the salt-and-pepper effect and to the fact that the fusion approach exploits the relative advantages of both information sources.

  4. Time-dependent approach for single trial classification of covert visuospatial attention

    NASA Astrophysics Data System (ADS)

    Tonin, L.; Leeb, R.; Millán, J. del R.

    2012-08-01

    Recently, several studies have started to explore covert visuospatial attention as a control signal for brain-computer interfaces (BCIs). Covert visuospatial attention represents the ability to shift the focus of attention from one point in space to another without overt eye movements. Nevertheless, the full potential and possible applications of this paradigm remain relatively unexplored. Voluntary covert visuospatial attention might allow a more natural and intuitive interaction with real environments, as neither stimulation nor gazing is required. In order to identify brain correlates of covert visuospatial attention, classical approaches usually rely on the whole α-band over long time intervals. In this work, we propose a more detailed analysis in the frequency and time domains to enhance classification performance. In particular, we investigate the contribution of α sub-bands and the role of time intervals in carrying information about visual attention. Previous neurophysiological studies have already highlighted the role of temporal dynamics in attention mechanisms. However, these important aspects are not yet exploited in BCI. In this work, we studied different methods that explicitly cope with the natural brain dynamics during visuospatial attention tasks in order to enhance BCI robustness and classification performance. Results with ten healthy subjects demonstrate that our approach identifies spectro-temporal patterns that outperform the state-of-the-art classification method. On average, our time-dependent classification reaches an area under the ROC (receiver operating characteristic) curve (AUC) of 0.74 ± 0.03, an increase of 12.3% with respect to standard methods (0.65 ± 0.4). In addition, the proposed approach allows faster classification (<1 s instead of 3 s) without compromising performance. Finally, our analysis highlights the fact that discriminant patterns are not stable for the whole trial period but are changing over short time

  5. Application Of Decision Tree Approach To Student Selection Model- A Case Study

    NASA Astrophysics Data System (ADS)

    Harwati; Sudiya, Amby

    2016-01-01

    The main purpose of the institution is to provide quality education to its students and to improve the quality of managerial decisions. One way to improve the quality of students is to make the selection of new students more selective. This research takes as its case the selection of new students at the Islamic University of Indonesia, Yogyakarta, Indonesia. One of the university's admission routes is administrative selection based on prospective students' high school records, without a written test. Currently, this kind of selection has no standard model or criteria. Selection is done only by comparing candidates' application files, so subjective assessment is likely because of the lack of standard criteria that can differentiate the quality of students. By applying data mining classification techniques, a selection model for new students can be built that includes standardized criteria such as the area of origin, the status of the school, the average grade, and so on. These criteria are determined using rules derived from classifying the academic achievement (GPA) of students who entered the university through the same route in previous years. The decision tree method with the C4.5 algorithm is used here. The results show that students given priority for admission are those who meet the following criteria: they come from the island of Java, attended a public school, majored in science, have an average grade above 75, and earned at least one achievement during high school.
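
    C4.5 splits on information gain; as a readily available stand-in, the sketch below grows a scikit-learn decision tree with the entropy criterion (a CART implementation, so not identical to C4.5) on a toy encoding of the admission criteria named above. The feature names and records are hypothetical.

      # Hedged sketch of the selection model on made-up past-cohort records.
      import pandas as pd
      from sklearn.tree import DecisionTreeClassifier, export_text

      records = pd.DataFrame({
          "from_java":       [1, 1, 0, 1, 0, 0, 1, 0],
          "public_school":   [1, 0, 1, 1, 0, 1, 0, 0],
          "science_major":   [1, 1, 0, 1, 1, 0, 0, 0],
          "avg_grade":       [82, 76, 74, 88, 70, 69, 78, 65],
          "has_achievement": [1, 0, 0, 1, 1, 0, 0, 0],
          "good_gpa_later":  [1, 1, 0, 1, 0, 0, 1, 0],   # label from past cohorts
      })
      X, y = records.drop(columns="good_gpa_later"), records["good_gpa_later"]

      tree = DecisionTreeClassifier(criterion="entropy", max_depth=3, random_state=0)
      tree.fit(X, y)
      print(export_text(tree, feature_names=list(X.columns)))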

  6. Decision-tree-model identification of nitrate pollution activities in groundwater: A combination of a dual isotope approach and chemical ions.

    PubMed

    Xue, Dongmei; Pang, Fengmei; Meng, Fanqiao; Wang, Zhongliang; Wu, Wenliang

    2015-09-01

    To develop management practices for agricultural crops to protect against NO3(-) contamination in groundwater, dominant pollution activities require reliable classification. In this study, we (1) classified potential NO3(-) pollution activities via an unsupervised learning algorithm based on δ(15)N- and δ(18)O-NO3(-) and physico-chemical properties of groundwater at 55 sampling locations; and (2) determined which water quality parameters could be used to identify the sources of NO3(-) contamination via a decision tree model. When a combination of δ(15)N-, δ(18)O-NO3(-) and physico-chemical properties of groundwater was used as input for the k-means clustering algorithm, it allowed a reliable clustering of the 55 sampling locations into 4 corresponding agricultural activities: well irrigated agriculture (28 sampling locations), sewage irrigated agriculture (16 sampling locations), a combination of sewage irrigated agriculture, farm and industry (5 sampling locations) and a combination of well irrigated agriculture and farm (6 sampling locations). A decision tree model with 97.5% classification success was developed based on the SO4(2-) and Cl(-) variables. The NO3(-), δ(15)N- and δ(18)O-NO3(-) variables demonstrated limitations in developing a decision tree model, as multiple N sources and fractionation processes both made it difficult to discriminate NO3(-) concentrations and isotopic values. Although only SO4(2-) and Cl(-) were selected as important discriminating variables, concentration data alone could not identify the specific NO3(-) sources responsible for groundwater contamination; this conclusion follows from the comprehensive analysis. To further reduce NO3(-) contamination, an integrated approach should be set up by combining N and O isotopes of NO3(-) with land uses and physico-chemical properties, especially in areas with complex agricultural activities.

  7. Decision-tree-model identification of nitrate pollution activities in groundwater: A combination of a dual isotope approach and chemical ions.

    PubMed

    Xue, Dongmei; Pang, Fengmei; Meng, Fanqiao; Wang, Zhongliang; Wu, Wenliang

    2015-09-01

    To develop management practices for agricultural crops to protect against NO3(-) contamination in groundwater, dominant pollution activities require reliable classification. In this study, we (1) classified potential NO3(-) pollution activities via an unsupervised learning algorithm based on δ(15)N- and δ(18)O-NO3(-) and physico-chemical properties of groundwater at 55 sampling locations; and (2) determined which water quality parameters could be used to identify the sources of NO3(-) contamination via a decision tree model. When a combination of δ(15)N-, δ(18)O-NO3(-) and physico-chemical properties of groundwater was used as input for the k-means clustering algorithm, it allowed a reliable clustering of the 55 sampling locations into 4 corresponding agricultural activities: well irrigated agriculture (28 sampling locations), sewage irrigated agriculture (16 sampling locations), a combination of sewage irrigated agriculture, farm and industry (5 sampling locations) and a combination of well irrigated agriculture and farm (6 sampling locations). A decision tree model with 97.5% classification success was developed based on the SO4(2-) and Cl(-) variables. The NO3(-), δ(15)N- and δ(18)O-NO3(-) variables demonstrated limitations in developing a decision tree model, as multiple N sources and fractionation processes both made it difficult to discriminate NO3(-) concentrations and isotopic values. Although only SO4(2-) and Cl(-) were selected as important discriminating variables, concentration data alone could not identify the specific NO3(-) sources responsible for groundwater contamination; this conclusion follows from the comprehensive analysis. To further reduce NO3(-) contamination, an integrated approach should be set up by combining N and O isotopes of NO3(-) with land uses and physico-chemical properties, especially in areas with complex agricultural activities. PMID:26231989
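
    The two-step pipeline described in the two records above can be sketched as k-means clustering on standardised isotopic and physico-chemical variables, followed by a decision tree trained on SO4(2-) and Cl(-) alone to reproduce the resulting groups; the data and column ordering below are synthetic placeholders, not the study's measurements.

      # Sketch: unsupervised grouping, then a two-variable decision tree.
      import numpy as np
      from sklearn.preprocessing import StandardScaler
      from sklearn.cluster import KMeans
      from sklearn.tree import DecisionTreeClassifier

      rng = np.random.default_rng(2)
      n = 55
      samples = np.column_stack([
          rng.normal(8, 3, n),     # d15N-NO3
          rng.normal(2, 2, n),     # d18O-NO3
          rng.normal(60, 25, n),   # SO4
          rng.normal(40, 20, n),   # Cl
          rng.normal(35, 15, n),   # NO3
      ])

      clusters = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(
          StandardScaler().fit_transform(samples))

      so4_cl = samples[:, 2:4]
      tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(so4_cl, clusters)
      print("training accuracy:", tree.score(so4_cl, clusters))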

  8. Bag-of-features approach for improvement of lung tissue classification in diffuse lung disease

    NASA Astrophysics Data System (ADS)

    Kato, Noriji; Fukui, Motofumi; Isozaki, Takashi

    2009-02-01

    Many automated techniques have been proposed to classify diffuse lung disease patterns. Most of the techniques utilize texture analysis approaches with second and higher order statistics, and show successful classification result among various lung tissue patterns. However, the approaches do not work well for the patterns with inhomogeneous texture distribution within a region of interest (ROI), such as reticular and honeycombing patterns, because the statistics can only capture averaged feature over the ROI. In this work, we have introduced the bag-of-features approach to overcome this difficulty. In the approach, texture images are represented as histograms or distributions of a few basic primitives, which are obtained by clustering local image features. The intensity descriptor and the Scale Invariant Feature Transformation (SIFT) descriptor are utilized to extract the local features, which have significant discriminatory power due to their specificity to a particular image class. In contrast, the drawback of the local features is lack of invariance under translation and rotation. We improved the invariance by sampling many local regions so that the distribution of the local features is unchanged. We evaluated the performance of our system in the classification task with 5 image classes (ground glass, reticular, honeycombing, emphysema, and normal) using 1109 ROIs from 211 patients. Our system achieved high classification accuracy of 92.8%, which is superior to that of the conventional system with the gray level co-occurrence matrix (GLCM) feature especially for inhomogeneous texture patterns.
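
    The bag-of-features pipeline above can be sketched compactly: sample local patches from each ROI, build a codebook by k-means, represent each ROI as a histogram of codeword occurrences, and classify the histograms. In the sketch below raw patch intensities stand in for the intensity/SIFT descriptors, and the images are synthetic.

      # Bag-of-features ROI classification on synthetic texture images.
      import numpy as np
      from sklearn.cluster import KMeans
      from sklearn.svm import SVC

      def sample_patches(roi, size=8, n=100, rng=None):
          rng = rng or np.random.default_rng(0)
          rows = rng.integers(0, roi.shape[0] - size, n)
          cols = rng.integers(0, roi.shape[1] - size, n)
          return np.array([roi[r:r + size, c:c + size].ravel() for r, c in zip(rows, cols)])

      def bof_histogram(roi, codebook, size=8, n=100):
          words = codebook.predict(sample_patches(roi, size, n))
          hist = np.bincount(words, minlength=codebook.n_clusters).astype(float)
          return hist / hist.sum()

      rng = np.random.default_rng(3)
      rois = [rng.normal(loc=cls, size=(64, 64)) for cls in (0, 1, 2) for _ in range(20)]
      labels = np.repeat([0, 1, 2], 20)

      codebook = KMeans(n_clusters=30, n_init=10, random_state=0)
      codebook.fit(np.vstack([sample_patches(r) for r in rois]))
      X = np.array([bof_histogram(r, codebook) for r in rois])
      clf = SVC(kernel="rbf").fit(X, labels)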

  9. A robust approach for tree segmentation in deciduous forests using small-footprint airborne LiDAR data

    NASA Astrophysics Data System (ADS)

    Hamraz, Hamid; Contreras, Marco A.; Zhang, Jun

    2016-10-01

    This paper presents a non-parametric approach for segmenting trees from airborne LiDAR data in deciduous forests. Based on the LiDAR point cloud, the approach collects crown information such as steepness and height on-the-fly to delineate crown boundaries, and most importantly, does not require a priori assumptions of crown shape and size. The approach segments trees iteratively starting from the tallest within a given area to the smallest until all trees have been segmented. To evaluate its performance, the approach was applied to the University of Kentucky Robinson Forest, a deciduous closed-canopy forest with complex terrain and vegetation conditions. The approach identified 94% of dominant and co-dominant trees with a false detection rate of 13%. About 62% of intermediate, overtopped, and dead trees were also detected with a false detection rate of 15%. The overall segmentation accuracy was 77%. Correlations of the segmentation scores of the proposed approach with local terrain and stand metrics were not significant, which likely indicates the robustness of the approach, as results are not sensitive to differences in terrain and stand structure.

  10. Factors Associated with Caregiver Stability in Permanent Placements: A Classification Tree Approach

    ERIC Educational Resources Information Center

    Proctor, Laura J.; Van Dusen Randazzo, Katherine; Litrownik, Alan J.; Newton, Rae R.; Davis, Inger P.; Villodas, Miguel

    2011-01-01

    Objective: Identify individual and environmental variables associated with caregiver stability and instability for children in diverse permanent placement types (i.e., reunification, adoption, and long-term foster care/guardianship with relatives or non-relatives), following 5 or more months in out-of-home care prior to age 4 due to substantiated…

  11. Which sociodemographic factors are important on smoking behaviour of high school students? The contribution of classification and regression tree methodology in a broad epidemiological survey

    PubMed Central

    Özge, C; Toros, F; Bayramkaya, E; Çamdeviren, H; Şaşmaz, T

    2006-01-01

    Background The purpose of this study is to evaluate the most important sociodemographic factors affecting the smoking status of high school students using a broad randomised epidemiological survey. Methods Using an in-class, self-administered questionnaire about their sociodemographic variables and smoking behaviour, a representative sample of 3304 students in the preparatory, 9th, 10th, and 11th grades from 22 randomly selected schools in Mersin was evaluated, and discriminative factors were determined using appropriate statistics. In addition to binary logistic regression analysis, the study evaluated the combined effects of these factors using classification and regression tree methodology, as a new statistical method. Results The data showed that 38% of the students reported lifetime smoking and 16.9% reported current smoking, with a male predominance and increasing prevalence with age. Secondhand smoke exposure was reported at a frequency of 74.3%, with fathers predominating (56.6%). The factors significantly associated with current smoking in these age groups were increased household size, late birth rank, certain school types, low academic performance, increased secondhand smoke exposure, and stress (especially reported as separation from a close friend or violence at home). Classification and regression tree methodology showed the importance of some neglected sociodemographic factors with a good classification capacity. Conclusions It was concluded that smoking, closely related to sociocultural factors, is a common problem in this young population, generating an important academic and social burden in youth life; with increasing data about this behaviour and the use of new statistical methods, effective coping strategies can be composed. PMID:16891446

  12. [Proposals for social class classification based on the Spanish National Classification of Occupations 2011 using neo-Weberian and neo-Marxist approaches].

    PubMed

    Domingo-Salvany, Antònia; Bacigalupe, Amaia; Carrasco, José Miguel; Espelt, Albert; Ferrando, Josep; Borrell, Carme

    2013-01-01

    In Spain, the new National Classification of Occupations (Clasificación Nacional de Ocupaciones [CNO-2011]) is substantially different to the 1994 edition, and requires adaptation of occupational social classes for use in studies of health inequalities. This article presents two proposals to measure social class: the new classification of occupational social class (CSO-SEE12), based on the CNO-2011 and a neo-Weberian perspective, and a social class classification based on a neo-Marxist approach. The CSO-SEE12 is the result of a detailed review of the CNO-2011 codes. In contrast, the neo-Marxist classification is derived from variables related to capital and organizational and skill assets. The proposed CSO-SEE12 consists of seven classes that can be grouped into a smaller number of categories according to study needs. The neo-Marxist classification consists of 12 categories in which home owners are divided into three categories based on capital goods and employed persons are grouped into nine categories composed of organizational and skill assets. These proposals are complemented by a proposed classification of educational level that integrates the various curricula in Spain and provides correspondences with the International Standard Classification of Education.

  13. [Proposals for social class classification based on the Spanish National Classification of Occupations 2011 using neo-Weberian and neo-Marxist approaches].

    PubMed

    Domingo-Salvany, Antònia; Bacigalupe, Amaia; Carrasco, José Miguel; Espelt, Albert; Ferrando, Josep; Borrell, Carme

    2013-01-01

    In Spain, the new National Classification of Occupations (Clasificación Nacional de Ocupaciones [CNO-2011]) is substantially different to the 1994 edition, and requires adaptation of occupational social classes for use in studies of health inequalities. This article presents two proposals to measure social class: the new classification of occupational social class (CSO-SEE12), based on the CNO-2011 and a neo-Weberian perspective, and a social class classification based on a neo-Marxist approach. The CSO-SEE12 is the result of a detailed review of the CNO-2011 codes. In contrast, the neo-Marxist classification is derived from variables related to capital and organizational and skill assets. The proposed CSO-SEE12 consists of seven classes that can be grouped into a smaller number of categories according to study needs. The neo-Marxist classification consists of 12 categories in which home owners are divided into three categories based on capital goods and employed persons are grouped into nine categories composed of organizational and skill assets. These proposals are complemented by a proposed classification of educational level that integrates the various curricula in Spain and provides correspondences with the International Standard Classification of Education. PMID:23394892

  14. Image-Based Airborne Sensors: A Combined Approach for Spectral Signatures Classification through Deterministic Simulated Annealing

    PubMed Central

    Guijarro, María; Pajares, Gonzalo; Herrera, P. Javier

    2009-01-01

    The increasing technology of high-resolution image airborne sensors, including those on board Unmanned Aerial Vehicles, demands automatic solutions for processing, either on-line or off-line, the huge amounts of image data sensed during the flights. The classification of natural spectral signatures in images is one potential application. The current trend in classification is oriented towards the combination of simple classifiers. In this paper we propose a combined strategy based on the Deterministic Simulated Annealing (DSA) framework. The simple classifiers used are the well tested supervised parametric Bayesian estimator and the Fuzzy Clustering. The DSA is an optimization approach, which minimizes an energy function. The main contribution of DSA is its ability to avoid local minima during the optimization process thanks to the annealing scheme. It outperforms simple classifiers used for the combination and some combined strategies, including a scheme based on the fuzzy cognitive maps and an optimization approach based on the Hopfield neural network paradigm. PMID:22399989

  15. A new computer approach to mixed feature classification for forestry application

    NASA Technical Reports Server (NTRS)

    Kan, E. P.

    1976-01-01

    A computer approach for mapping mixed forest features (i.e., types, classes) from computer classification maps is discussed. Mixed features such as mixed softwood/hardwood stands are treated as admixtures of softwood and hardwood areas. Large-area mixed features are identified and small-area features neglected when the nominal size of a mixed feature can be specified. The computer program merges small isolated areas into surrounding areas by the iterative manipulation of the postprocessing algorithm that eliminates small connected sets. For a forestry application, computer-classified LANDSAT multispectral scanner data of the Sam Houston National Forest were used to demonstrate the proposed approach. The technique was successful in cleaning the salt-and-pepper appearance of multiclass classification maps and in mapping admixtures of softwood areas and hardwood areas. However, the computer-mapped mixed areas matched very poorly with the ground truth because of inadequate resolution and inappropriate definition of mixed features.
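
    The postprocessing idea, eliminating small connected sets by merging them into the surrounding class, can be sketched as follows; the size threshold and the majority-of-neighbours rule are illustrative assumptions rather than the program's actual parameters.

      # Merge connected components smaller than a nominal size into the
      # surrounding class, cleaning the salt-and-pepper appearance.
      import numpy as np
      from scipy import ndimage

      def merge_small_regions(class_map, min_size):
          out = class_map.copy()
          for cls in np.unique(class_map):
              labeled, n_comp = ndimage.label(out == cls)
              sizes = ndimage.sum(out == cls, labeled, index=np.arange(1, n_comp + 1))
              for comp_id in np.flatnonzero(sizes < min_size) + 1:
                  comp = labeled == comp_id
                  ring = ndimage.binary_dilation(comp) & ~comp   # surrounding pixels
                  if ring.any():
                      values, counts = np.unique(out[ring], return_counts=True)
                      out[comp] = values[np.argmax(counts)]      # majority neighbour class
          return out

      demo = np.zeros((20, 20), dtype=int)
      demo[5:15, 5:15] = 1
      demo[9, 9] = 2                     # a one-pixel "speckle"
      cleaned = merge_small_regions(demo, min_size=4)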

  16. What determines tree mortality in dry environments? A multi-perspective approach.

    PubMed

    Dorman, Michael; Svoray, Tal; Perevolotsky, Avi; Moshe, Yitzhak; Sarris, Dimitrios

    2015-06-01

    dendrochronological and remotely sensed performance indicators, in contrast to potential bias when using a single approach. For example, dendrochronological data suggested highly resilient tree growth, since it was based only on the "surviving" portion of the population, thus failing to identify past demographic changes evident through remote sensing. We therefore suggest that evaluation of forest resilience should be based on several metrics, each suited for detecting transitions at a different level of organization.

  17. What determines tree mortality in dry environments? A multi-perspective approach.

    PubMed

    Dorman, Michael; Svoray, Tal; Perevolotsky, Avi; Moshe, Yitzhak; Sarris, Dimitrios

    2015-06-01

    dendrochronological and remotely sensed performance indicators, in contrast to potential bias when using a single approach. For example, dendrochronological data suggested highly resilient tree growth, since it was based only on the "surviving" portion of the population, thus failing to identify past demographic changes evident through remote sensing. We therefore suggest that evaluation of forest resilience should be based on several metrics, each suited for detecting transitions at a different level of organization. PMID:26465042

  18. Automatic Training Sample Selection for a Multi-Evidence Based Crop Classification Approach

    NASA Astrophysics Data System (ADS)

    Chellasamy, M.; Ferre, P. A. Ty; Humlekrog Greve, M.

    2014-09-01

    An approach to using available agricultural parcel information to automatically select training samples for crop classification is investigated. Previous research addressed the multi-evidence crop classification approach using an ensemble classifier. This first produced confidence measures using three Multi-Layer Perceptron (MLP) neural networks trained separately with spectral, texture and vegetation indices; classification labels were then assigned based on Endorsement Theory. The present study proposes an approach to feed this ensemble classifier with automatically selected training samples. The available vector data representing crop boundaries with corresponding crop codes are used as a source for training samples. These vector data are created by farmers to support subsidy claims and are, therefore, prone to errors such as mislabeling of crop codes and boundary digitization errors. The proposed approach is named ECRA (Ensemble based Cluster Refinement Approach). ECRA first automatically removes mislabeled samples and then selects the refined training samples in an iterative training-reclassification scheme. Mislabel removal is based on the expectation that mislabels in each class will be far from the cluster centroid. However, this must be a soft constraint, especially when working with a hypothesis space that does not contain a good approximation of the target classes. Difficulty in finding a good approximation often exists either due to less informative data or a large hypothesis space. Thus, this approach uses the spectral, texture and indices domains in an ensemble framework to iteratively remove the mislabeled pixels from the crop clusters declared by the farmers. Once the clusters are refined, the selected border samples are used for final learning and the unknown samples are classified using the multi-evidence approach. The study is implemented with WorldView-2 multispectral imagery acquired for a study area containing 10 crop classes. The proposed
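
    The mislabel-removal step can be sketched as flagging, within each declared crop class, the samples that lie far from the class centroid in feature space; the z-score cut-off used below is an assumption for illustration, not the paper's criterion.

      # Flag samples whose distance from their class centroid is anomalously large.
      import numpy as np

      def flag_suspect_samples(features, labels, z_cutoff=2.5):
          suspect = np.zeros(len(labels), dtype=bool)
          for cls in np.unique(labels):
              idx = np.flatnonzero(labels == cls)
              centroid = features[idx].mean(axis=0)
              dist = np.linalg.norm(features[idx] - centroid, axis=1)
              z = (dist - dist.mean()) / dist.std()
              suspect[idx[z > z_cutoff]] = True
          return suspect

      rng = np.random.default_rng(4)
      feats = np.vstack([rng.normal(0, 1, (50, 6)), rng.normal(5, 1, (50, 6))])
      labs = np.repeat([0, 1], 50)
      labs[3] = 1                                     # simulate a mislabelled parcel
      print(np.flatnonzero(flag_suspect_samples(feats, labs)))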

  19. A Comparative Assessment of the Influences of Human Impacts on Soil Cd Concentrations Based on Stepwise Linear Regression, Classification and Regression Tree, and Random Forest Models

    PubMed Central

    Qiu, Lefeng; Wang, Kai; Long, Wenli; Wang, Ke; Hu, Wei; Amable, Gabriel S.

    2016-01-01

    Soil cadmium (Cd) contamination has attracted a great deal of attention because of its detrimental effects on animals and humans. This study aimed to develop and compare the performances of stepwise linear regression (SLR), classification and regression tree (CART) and random forest (RF) models in the prediction and mapping of the spatial distribution of soil Cd and to identify likely sources of Cd accumulation in Fuyang County, eastern China. Soil Cd data from 276 topsoil (0–20 cm) samples were collected and randomly divided into calibration (222 samples) and validation datasets (54 samples). Auxiliary data, including detailed land use information, soil organic matter, soil pH, and topographic data, were incorporated into the models to simulate the soil Cd concentrations and further identify the main factors influencing soil Cd variation. The predictive models for soil Cd concentration exhibited acceptable overall accuracies (72.22% for SLR, 70.37% for CART, and 75.93% for RF). The SLR model exhibited the largest predicted deviation, with a mean error (ME) of 0.074 mg/kg, a mean absolute error (MAE) of 0.160 mg/kg, and a root mean squared error (RMSE) of 0.274 mg/kg, and the RF model produced the results closest to the observed values, with an ME of 0.002 mg/kg, an MAE of 0.132 mg/kg, and an RMSE of 0.198 mg/kg. The RF model also exhibited the greatest R2 value (0.772). The CART model predictions closely followed, with ME, MAE, RMSE, and R2 values of 0.013 mg/kg, 0.154 mg/kg, 0.230 mg/kg and 0.644, respectively. The three prediction maps generally exhibited similar and realistic spatial patterns of soil Cd contamination. The heavily Cd-affected areas were primarily located in the alluvial valley plain of the Fuchun River and its tributaries because of the dramatic industrialization and urbanization processes that have occurred there. The most important variable for explaining high levels of soil Cd accumulation was the presence of metal smelting industries. The
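
    The model comparison can be sketched on synthetic data as follows; plain ordinary least squares stands in for the stepwise selection (an assumption), alongside a CART regressor and a random forest, each scored with the ME, MAE and RMSE used above.

      # Compare three regressors on a held-out set with ME, MAE and RMSE.
      import numpy as np
      from sklearn.linear_model import LinearRegression
      from sklearn.tree import DecisionTreeRegressor
      from sklearn.ensemble import RandomForestRegressor
      from sklearn.model_selection import train_test_split

      rng = np.random.default_rng(5)
      X = rng.normal(size=(276, 6))                       # stand-in covariates
      y = 0.3 * X[:, 0] + 0.2 * X[:, 1] ** 2 + rng.normal(scale=0.1, size=276)
      X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=54, random_state=0)

      models = {
          "SLR (plain OLS)": LinearRegression(),
          "CART": DecisionTreeRegressor(max_depth=5, random_state=0),
          "RF": RandomForestRegressor(n_estimators=200, random_state=0),
      }
      for name, model in models.items():
          pred = model.fit(X_tr, y_tr).predict(X_te)
          err = pred - y_te
          print(name, "ME=%.3f MAE=%.3f RMSE=%.3f"
                % (err.mean(), np.abs(err).mean(), np.sqrt((err ** 2).mean())))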

  20. A bag of cells approach for antinuclear antibodies HEp-2 image classification.

    PubMed

    Wiliem, Arnold; Hobson, Peter; Minchin, Rodney F; Lovell, Brian C

    2015-06-01

    The antinuclear antibody (ANA) test via indirect immunofluorescence applied on Human Epithelial type 2 (HEp-2) cells is a pathology test commonly used to identify connective tissue diseases (CTDs). Despite its effectiveness, the test is still considered labor intensive and time consuming. Applying image-based computer aided diagnosis (CAD) systems is one of the possible ways to address these issues. Ideally, a CAD system should be able to classify ANA HEp-2 images taken by a camera fitted to a fluorescence microscope. Unfortunately, most prior works have primarily focused on the HEp-2 cell image classification problem, which is one of the early essential steps in the system pipeline. In this work we directly tackle the specimen image classification problem. We aim to develop a system that can be easily scaled and has competitive accuracy. ANA HEp-2 images, or ANA images, are generally comprised of a number of cells. Patterns exhibited by the cells are then used to make inference on the ANA image pattern. To that end, we adapted a popular approach for general image classification problems, namely the bag of visual words approach. Each specimen is considered as a visual document containing visual vocabularies represented by its cells. A specimen image is then represented by a histogram of visual vocabulary occurrences. We name this the Bag of Cells approach. We studied the performance of the proposed approach on a set of images taken from 262 ANA positive patient sera. The results show the proposed approach has competitive performance compared to recent state-of-the-art approaches. Our proposal can also be extended to other tests that involve examining patterns of human cells to make inferences.

  1. Functional classification of CATH superfamilies: a domain-based approach for protein function annotation

    PubMed Central

    Das, Sayoni; Lee, David; Sillitoe, Ian; Dawson, Natalie L.; Lees, Jonathan G.; Orengo, Christine A.

    2015-01-01

    Motivation: Computational approaches that can predict protein functions are essential to bridge the widening function annotation gap especially since <1.0% of all proteins in UniProtKB have been experimentally characterized. We present a domain-based method for protein function classification and prediction of functional sites that exploits functional sub-classification of CATH superfamilies. The superfamilies are sub-classified into functional families (FunFams) using a hierarchical clustering algorithm supervised by a new classification method, FunFHMMer. Results: FunFHMMer generates more functionally coherent groupings of protein sequences than other domain-based protein classifications. This has been validated using known functional information. The conserved positions predicted by the FunFams are also found to be enriched in known functional residues. Moreover, the functional annotations provided by the FunFams are found to be more precise than other domain-based resources. FunFHMMer currently identifies 110 439 FunFams in 2735 superfamilies which can be used to functionally annotate > 16 million domain sequences. Availability and implementation: All FunFam annotation data are made available through the CATH webpages (http://www.cathdb.info). The FunFHMMer webserver (http://www.cathdb.info/search/by_funfhmmer) allows users to submit query sequences for assignment to a CATH FunFam. Contact: sayoni.das.12@ucl.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26139634

  2. Multi-Stage Approach to Travel-Mode Segmentation and Classification of GPS Traces

    NASA Astrophysics Data System (ADS)

    Zhang, L.; Dalyot, S.; Eggert, D.; Sester, M.

    2011-08-01

    This paper presents a multi-stage approach toward the robust classification of travel-modes from GPS traces. Because GPS traces are often composed of more than one travel-mode, they are segmented into sub-traces, each characterized by an individual travel-mode. This is done by identifying stops and thereby delimiting individual movement segments. In the first stage of classification, three main travel-mode classes are identified: pedestrian, bicycle, and motorized vehicles; this is achieved on the identified segments using speed-, acceleration- and heading-related parameters. Segments are then linked up to form sub-traces of an individual travel-mode. After the first stage, the motorized vehicles class is broken down further into cars, buses, trams and trains based on sub-traces of individual travel-mode, using the Support Vector Machine (SVM) method. This paper presents a qualitative classification of travel-modes, thus introducing new robust and precise capabilities for the problem at hand.
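
    The two-stage idea can be sketched as a rule-based first stage that separates pedestrian, bicycle and motorised segments from speed and acceleration statistics, followed by an SVM over per-segment features of the motorised sub-traces; all thresholds and features below are illustrative assumptions, not the paper's calibrated values.

      # Two-stage travel-mode classification sketch.
      import numpy as np
      from sklearn.svm import SVC

      def first_stage(median_speed_kmh, max_accel_ms2):
          if median_speed_kmh < 7:
              return "pedestrian"
          if median_speed_kmh < 25 and max_accel_ms2 < 1.5:
              return "bicycle"
          return "motorized"

      # Second stage: SVM over per-segment features of motorised sub-traces
      # (e.g. speed percentiles, stop frequency, heading-change rate).
      rng = np.random.default_rng(6)
      X_motor = rng.normal(size=(200, 4))
      y_motor = rng.integers(0, 4, 200)        # 0=car, 1=bus, 2=tram, 3=train (toy labels)
      svm = SVC(kernel="rbf").fit(X_motor, y_motor)

      print(first_stage(median_speed_kmh=4.2, max_accel_ms2=0.8))   # -> pedestrian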

  3. A Comparison of Computer-Based Classification Testing Approaches Using Mixed-Format Tests with the Generalized Partial Credit Model

    ERIC Educational Resources Information Center

    Kim, Jiseon

    2010-01-01

    Classification testing has been widely used to make categorical decisions by determining whether an examinee has a certain degree of ability required by established standards. As computer technologies have developed, classification testing has become more computerized. Several approaches have been proposed and investigated in the context of…

  4. Comparison of Standard and Novel Signal Analysis Approaches to Obstructive Sleep Apnea Classification

    PubMed Central

    Roebuck, Aoife; Clifford, Gari D.

    2015-01-01

    Obstructive sleep apnea (OSA) is a disorder characterized by repeated pauses in breathing during sleep, which leads to deoxygenation and voiced chokes at the end of each episode. OSA is associated with daytime sleepiness and an increased risk of serious conditions such as cardiovascular disease, diabetes, and stroke. Between 2 and 7% of the adult population globally has OSA, but it is estimated that up to 90% of those are undiagnosed and untreated. Diagnosis of OSA requires expensive and cumbersome screening. Audio offers a potential non-contact alternative, particularly with the ubiquity of excellent signal processing on every phone. Previous studies have focused on the classification of snoring and apneic chokes. However, such approaches require accurate identification of events. This leads to limited accuracy and small study populations. In this work, we propose an alternative approach which uses multiscale entropy (MSE) coefficients presented to a classifier to identify disorder in vocal patterns indicative of sleep apnea. A database of 858 patients was used, the largest reported in this domain. Apneic choke, snore, and noise events encoded with speech analysis features were input into a linear classifier. Coefficients of MSE derived from the first 4 h of each recording were used to train and test a random forest to classify patients as apneic or not. Standard speech analysis approaches for event classification achieved an out-of-sample accuracy (Ac) of 76.9% with a sensitivity (Se) of 29.2% and a specificity (Sp) of 88.7% but high variance. For OSA severity classification, MSE provided an out-of-sample Ac of 79.9%, Se of 66.0%, and Sp of 88.8%. Including demographic information improved the MSE-based classification performance to Ac = 80.5%, Se = 69.2%, and Sp = 87.9%. These results indicate that audio recordings could be used in screening for OSA, but are generally under-sensitive. PMID:26380256
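
    Multiscale entropy itself follows a standard recipe: coarse-grain the series at increasing scales and compute sample entropy at each scale. The sketch below uses the common defaults m = 2 and r = 0.2 times the standard deviation, which may differ from the parameterisation used in the study.

      # Multiscale entropy: coarse-graining plus a naive O(N^2) sample entropy.
      import numpy as np

      def sample_entropy(x, m=2, r=0.2):
          x = np.asarray(x, dtype=float)
          tol = r * x.std()

          def count_matches(length):
              templates = np.array([x[i:i + length] for i in range(len(x) - length)])
              dist = np.max(np.abs(templates[:, None] - templates[None, :]), axis=2)
              return (dist <= tol).sum() - len(templates)   # exclude self-matches

          b, a = count_matches(m), count_matches(m + 1)
          return -np.log(a / b) if a > 0 and b > 0 else np.inf

      def multiscale_entropy(x, max_scale=10, m=2, r=0.2):
          x = np.asarray(x, dtype=float)
          mse = []
          for scale in range(1, max_scale + 1):
              n = (len(x) // scale) * scale
              coarse = x[:n].reshape(-1, scale).mean(axis=1)    # coarse-graining
              mse.append(sample_entropy(coarse, m, r))
          return np.array(mse)

      rng = np.random.default_rng(7)
      print(multiscale_entropy(rng.normal(size=1000), max_scale=5))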

  5. Novel consensus approaches to the reliable ranking of features for seabed imagery classification.

    PubMed

    Harrison, Richard; Birchall, Roger; Mann, Dave; Wang, Wenjia

    2012-12-01

    Feature saliency estimation and feature selection are important tasks in machine learning applications. Filters, such as distance measures, are commonly used as an efficient means of estimating the saliency of individual features. However, feature rankings derived from different distance measures are frequently inconsistent. This can present reliability issues when the rankings are used for feature selection. Two novel consensus approaches to creating a more robust ranking are presented in this paper. Our experimental results show that the consensus approaches can improve reliability over a range of feature parameterizations and various seabed texture classification tasks in sidescan sonar mosaic imagery.
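
    A minimal consensus scheme ranks the features under several filters and averages the ranks Borda-style; the three filters below are generic illustrative choices and may differ from the paper's consensus approaches.

      # Consensus feature ranking by averaging per-filter ranks.
      import numpy as np
      from scipy.stats import rankdata
      from sklearn.feature_selection import f_classif, mutual_info_classif

      rng = np.random.default_rng(8)
      X = rng.normal(size=(300, 10))
      y = (X[:, 0] + 0.5 * X[:, 3] + rng.normal(scale=0.5, size=300) > 0).astype(int)

      scores = [
          f_classif(X, y)[0],                                       # ANOVA F statistic
          mutual_info_classif(X, y, random_state=0),                # mutual information
          np.abs(X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0)),  # mean difference
      ]
      # higher score = better, so rank descending and average the ranks
      consensus = np.mean([rankdata(-s) for s in scores], axis=0)
      print("features ordered by consensus rank:", np.argsort(consensus))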

  6. Detection of fallen trees in ALS point clouds using a Normalized Cut approach trained by simulation

    NASA Astrophysics Data System (ADS)

    Polewski, Przemyslaw; Yao, Wei; Heurich, Marco; Krzystek, Peter; Stilla, Uwe

    2015-07-01

    Downed dead wood is regarded as an important part of forest ecosystems from an ecological perspective, which drives the need for investigating its spatial distribution. Based on several studies, Airborne Laser Scanning (ALS) has proven to be a valuable remote sensing technique for obtaining such information. This paper describes a unified approach to the detection of fallen trees from ALS point clouds based on merging short segments into whole stems using the Normalized Cut algorithm. We introduce a new method of defining the segment similarity function for the clustering procedure, where the attribute weights are learned from labeled data. Based on a relationship between Normalized Cut's similarity function and a class of regression models, we show how to learn the similarity function by training a classifier. Furthermore, we propose using an appearance-based stopping criterion for the graph cut algorithm as an alternative to the standard Normalized Cut threshold approach. We set up a virtual fallen tree generation scheme to simulate complex forest scenarios with multiple overlapping fallen stems. This simulated data is then used as a basis to learn both the similarity function and the stopping criterion for Normalized Cut. We evaluate our approach on 5 plots from the strictly protected mixed mountain forest within the Bavarian Forest National Park using reference data obtained via a manual field inventory. The experimental results show that our method is able to detect up to 90% of fallen stems in plots having 30-40% overstory cover with a correctness exceeding 80%, even in quite complex forest scenes. Moreover, the performance for feature weights trained on simulated data is competitive with the case when the weights are calculated using a grid search on the test data, which indicates that the learned similarity function and stopping criterion can generalize well on new plots.

  7. An Introduction to Recursive Partitioning: Rationale, Application, and Characteristics of Classification and Regression Trees, Bagging, and Random Forests

    ERIC Educational Resources Information Center

    Strobl, Carolin; Malley, James; Tutz, Gerhard

    2009-01-01

    Recursive partitioning methods have become popular and widely used tools for nonparametric regression and classification in many scientific fields. Especially random forests, which can deal with large numbers of predictor variables even in the presence of complex interactions, have been applied successfully in genetics, clinical medicine, and…

  8. Insights into geomorphic and vegetation spatial patterns within dynamic river floodplains using soft classification approaches

    NASA Astrophysics Data System (ADS)

    Guneralp, I.; Filippi, A. M.; Guneralp, B.; You, M.

    2014-12-01

    Lowland rivers in broad alluvial floodplains create one of the most dynamic landscapes, governed by multiple, and commonly nonlinear, interactions among geomorphic, hydrologic, and ecologic processes. Fluvial landforms and land-cover patches composing the floodplains of lowland rivers vary in their shapes and sizes because of variations in vegetation biomass, topography, and soil composition (e.g., of abandoned meanders versus accreting bars) across space. Such floodplain heterogeneity, in turn, influences future river-channel evolution by creating variability in channel-migration rates. In this study, using Landsat 5 Thematic Mapper data and alternative image-classification approaches, we investigate geomorphic and vegetation spatial patterns along a large, dynamic tropical river. Specifically, we examine the spatial relations between river-channel planform and fluvial-landform and land-cover patterns across the floodplain. We classify the images using both hard and soft classification algorithms. We characterize the structure of geomorphic landform and vegetation components of the floodplain by computing a range of class-level landscape metrics based on the classified images. Results indicate that comparable classification accuracies are accrued for the inherently hard and (hardened) soft classification images, ranging from 89.8% to 91.8% overall accuracy. However, soft classification images provide unique information regarding spatially-varying similarities and differences in water-column properties of oxbow lakes and the main river channel. Proximity analyses, where buffer zones along the river with distances corresponding to 5, 10, and 20 river-channel widths are constructed, reveal that the average size of forest patches first increases with distance from the river banks, but patches become sparse beyond a distance of 10 channel widths from the river.
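
    For readers unfamiliar with "hardened" soft classification, the following minimal sketch (an assumed example, not the study's code) converts per-pixel class membership grades into a hard map by taking the class of maximum membership, the form typically used for accuracy assessment.

        import numpy as np

        # Hypothetical soft-classification output: membership grades for 4 classes
        # (e.g., water, bare bar, forest, oxbow lake) on a 100 x 100 pixel scene.
        rng = np.random.default_rng(0)
        memberships = rng.dirichlet(alpha=[1, 1, 1, 1], size=(100, 100))

        # Harden: assign each pixel to its highest-membership class.
        hard_map = np.argmax(memberships, axis=-1)

        # Simple overall accuracy against hypothetical reference pixels.
        ref_rows, ref_cols = rng.integers(0, 100, 50), rng.integers(0, 100, 50)
        ref_labels = rng.integers(0, 4, 50)
        overall_accuracy = np.mean(hard_map[ref_rows, ref_cols] == ref_labels)
        print(f"Overall accuracy: {overall_accuracy:.1%}")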

  9. A Unified Experimental Approach for Estimation of Irrigationwater and Nitrate Leaching in Tree Crops

    NASA Astrophysics Data System (ADS)

    Hopmans, J. W.; Kandelous, M. M.; Moradi, A. B.

    2014-12-01

    Groundwater quality is particularly vulnerable in irrigated agricultural lands in California and many other (semi-)arid regions of the world. The routine application of nitrogen fertilizers with irrigation water in California is likely responsible for the high nitrate concentrations in groundwater underlying much of its main agricultural areas. To optimize irrigation/fertigation practices, it is essential that irrigation and fertilizers are applied at the optimal concentration, place, and time to ensure maximum root uptake and minimize leaching losses to the groundwater. The applied irrigation water and dissolved fertilizer, as well as root growth and associated nitrate and water uptake, interact with soil properties and fertilizer source(s) in a complex manner that cannot easily be resolved. Coupled experimental-modeling studies are therefore required to unravel the relevant complexities that result from typical field-wide spatial variations of soil texture and layering across farmer-managed fields. We present experimental approaches across a network of tree crop orchards in the San Joaquin Valley that provide the necessary data on soil moisture, water potential, and nitrate concentration to evaluate and optimize irrigation water management practices. Specifically, deep tensiometers were used to monitor continuous in-situ soil water potential gradients, which are used to compute leaching fluxes of water and nitrate at both the individual-tree and field scales.

  10. A new approach for clustered MCs classification with sparse features learning and TWSVM.

    PubMed

    Zhang, Xin-Sheng

    2014-01-01

    In digital mammograms, an early sign of breast cancer is the existence of microcalcification clusters (MCs), which are very important to early breast cancer detection. In this paper, a new approach is proposed to classify and detect MCs. We formulate this classification problem as sparse-feature-learning-based classification, in which test samples are represented in terms of a set of training samples, also known as a "vocabulary" of visual parts. A visual information-rich vocabulary of training samples is manually built up from a set of samples that include MC parts and non-MC parts. With the prior ground truth of MCs in mammograms, the sparse features are learned by an l_p-regularized least-squares approach solved with an interior-point method. We then design a sparse-feature-learning-based MC classification algorithm using twin support vector machines (TWSVMs). To investigate its performance, the proposed method is applied to the DDSM dataset and compared with support vector machines (SVMs) on the same dataset. Experiments show that the proposed method is more efficient than, or performs better than, state-of-the-art methods. PMID:24764773
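
    A loose sketch of the sparse-representation idea is given below, with scikit-learn's Lasso (an l1-regularised least-squares solver) standing in for the paper's l_p-regularised interior-point method and a standard linear SVM standing in for the TWSVM; the vocabulary and patches are synthetic placeholders.

        import numpy as np
        from sklearn.linear_model import Lasso
        from sklearn.svm import LinearSVC

        rng = np.random.default_rng(0)

        # Hypothetical vocabulary: columns are vectorised visual parts (MC / non-MC).
        D = rng.normal(size=(256, 120))
        # Hypothetical labelled patches to classify (0 = non-MC, 1 = MC).
        train_patches = rng.normal(size=(80, 256))
        train_labels = rng.integers(0, 2, size=80)

        def sparse_code(sample, dictionary, alpha=0.05):
            """l1-regularised least-squares coefficients of `sample` over the vocabulary."""
            return Lasso(alpha=alpha, max_iter=5000).fit(dictionary, sample).coef_

        # Encode each patch as its sparse coefficient vector, then train a linear SVM
        # on those features (the paper uses a twin SVM instead).
        X_sparse = np.array([sparse_code(p, D) for p in train_patches])
        clf = LinearSVC().fit(X_sparse, train_labels)
        print(clf.predict(sparse_code(rng.normal(size=256), D).reshape(1, -1)))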

  11. Extended Gabor approach applied to classification of emphysematous patterns in computed tomography

    PubMed Central

    Escalante-Ramírez, Boris; Cristóbal, Gabriel; Estépar, Raúl San José

    2014-01-01

    Chronic obstructive pulmonary disease (COPD) is a progressive and irreversible lung condition typically related to emphysema. It hinders airflow through the airways and causes the alveolar sacs to lose their elasticity. Findings of COPD may be manifested in a variety of computed tomography (CT) studies. Nevertheless, visual assessment of CT images is time-consuming and depends on trained observers. Hence, a reliable computer-aided diagnosis system would be useful to reduce time and inter-evaluator variability. In this paper, we propose a new emphysema classification framework based on complex Gabor filters and local binary patterns. This approach simultaneously encodes global characteristics and local information to describe emphysema morphology in CT images. Kernel Fisher analysis was used to reduce dimensionality and to find the most discriminant nonlinear boundaries among classes. Finally, classification was performed using the k-nearest neighbor classifier. The results show the effectiveness of our approach for quantifying lesions due to emphysema and that the combination of descriptors yields better classification performance. PMID:24496558
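
    The descriptor pipeline can be approximated as in the sketch below, which combines Gabor filter responses and local binary patterns from scikit-image and substitutes kernel PCA for kernel Fisher analysis before a k-NN classifier; all parameters and data are illustrative, not the authors' configuration.

        import numpy as np
        from skimage.filters import gabor
        from skimage.feature import local_binary_pattern
        from sklearn.decomposition import KernelPCA
        from sklearn.neighbors import KNeighborsClassifier

        def texture_features(patch, frequencies=(0.1, 0.25), n_theta=4):
            """Gabor magnitude statistics plus a uniform-LBP histogram for one patch."""
            feats = []
            for f in frequencies:
                for k in range(n_theta):
                    real, imag = gabor(patch, frequency=f, theta=k * np.pi / n_theta)
                    mag = np.hypot(real, imag)
                    feats += [mag.mean(), mag.std()]
            img8 = (patch * 255).astype(np.uint8)          # LBP expects integer images
            lbp = local_binary_pattern(img8, P=8, R=1, method="uniform")
            hist, _ = np.histogram(lbp, bins=10, range=(0, 10), density=True)
            return np.concatenate([feats, hist])

        # Hypothetical labelled lung patches (e.g., normal vs emphysematous texture).
        rng = np.random.default_rng(0)
        patches = rng.random((40, 32, 32))
        labels = rng.integers(0, 2, 40)

        X = np.array([texture_features(p) for p in patches])
        X_low = KernelPCA(n_components=5, kernel="rbf").fit_transform(X)  # stand-in for KFA
        print(KNeighborsClassifier(n_neighbors=3).fit(X_low, labels).score(X_low, labels))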

  12. A water balance approach for reconstructing streamflow using tree-ring proxy records

    NASA Astrophysics Data System (ADS)

    Saito, Laurel; Biondi, Franco; Devkota, Rajan; Vittori, Jasmine; Salas, Jose D.

    2015-10-01

    Tree-ring data have been used to augment limited instrumental records of climate and provide a longer view of past variability, thus improving assessments of future scenarios. For streamflow reconstructions, traditional regression-based approaches cannot examine factors that may alter streamflow independently of climate, such as changes in land use or land cover. In this study, seasonal water balance models were used as a mechanistic approach to reconstruct streamflow with proxy inputs of precipitation and air temperature. We examined a Thornthwaite water balance model modified to have seasonal components and a simple water balance model with a snow component. These two models were calibrated with a shuffled complex evolution approach using PRISM and proxy seasonal temperature and precipitation to reconstruct streamflow for the upper reaches of the West Walker River basin at Coleville, CA. Overall, the modified Thornthwaite model performed best during calibration, with R2 values of 0.96 and 0.80 using PRISM and proxy inputs, respectively. The modified Thornthwaite model was then used to reconstruct streamflow during AD 1500-1980 for the West Walker River basin. The reconstruction included similar wet and dry episodes as other regression-based records for the Great Basin, and provided estimates of actual evapotranspiration and of April 1 snow water equivalence. Given its limited input requirements, this approach is suitable in areas where sparse instrumental data are available to improve proxy-based streamflow reconstructions and to explore non-climatic reasons for streamflow variability during the reconstruction period.
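
    To make the mechanistic idea concrete, here is a toy seasonal water balance (soil storage, snow accumulation and melt, and runoff as storage surplus); it is a didactic sketch, not the calibrated Thornthwaite or snow models used in the study, and all parameter values are placeholders.

        def seasonal_water_balance(precip, pet, capacity=150.0, melt_frac=0.5,
                                   snow_temp=0.0, temps=None):
            """Toy seasonal water balance: soil storage fills to `capacity`, excess
            becomes runoff; precipitation falls as snow below `snow_temp` and a
            fraction of the snowpack melts each season."""
            storage, snowpack, runoff = capacity / 2.0, 0.0, []
            temps = temps if temps is not None else [5.0] * len(precip)
            for p, e, t in zip(precip, pet, temps):
                if t <= snow_temp:                 # accumulate snow instead of rain
                    snowpack += p
                    p = 0.0
                melt = melt_frac * snowpack        # release part of the snowpack
                snowpack -= melt
                storage += p + melt - e            # add water, remove evapotranspiration
                storage = max(storage, 0.0)
                q = max(storage - capacity, 0.0)   # surplus leaves the soil as runoff
                storage -= q
                runoff.append(q)
            return runoff

        # Four seasons (winter, spring, summer, autumn) of hypothetical inputs (mm, degC).
        print(seasonal_water_balance(precip=[220, 150, 30, 90],
                                     pet=[15, 60, 140, 50],
                                     temps=[-2, 8, 20, 10]))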

  13. An object-based approach to hierarchical classification of the Earth's topography from SRTM data

    NASA Astrophysics Data System (ADS)

    Eisank, C.; Dragut, L.

    2012-04-01

    Digital classification of the Earth's surface has significantly benefited from the availability of global DEMs and recent advances in image processing techniques. One such innovative approach is object-based analysis, which integrates multi-scale segmentation and rule-based classification. Since the classification is based on spatially configured objects and no longer solely on thematically defined cells, the resulting landforms or landform types are represented in a more realistic way. However, up to now, the object-based approach has not been adopted for broad-scale topographic modelling. Existing global to almost-global terrain classification systems have been implemented on per-cell schemes, accepting disadvantages such as the speckled character of outputs and the non-consideration of space. We introduce the first object-based method to automatically classify the Earth's surface as represented by the SRTM into a three-level hierarchy of topographic regions. The new method relies on the concept of decomposing land-surface complexity into ever more homogeneous domains. The SRTM elevation layer is automatically segmented and classified at three levels that represent domains of complexity by using self-adaptive, data-driven techniques. For each domain, scales in the data are detected with the help of local variance and segmentation is performed at these recognised scales. Objects resulting from segmentation are partitioned into sub-domains based on thresholds given by the mean values of elevation and of the standard deviation of elevation, respectively. Results resemble patterns of existing global and regional classifications, displaying a level of detail close to manually drawn maps. Statistical evaluation indicates that most of the classes satisfy the regionalisation requirements of maximising internal homogeneity while minimising external homogeneity. Most objects have boundaries matching natural discontinuities at the regional level. The method is simple and fully automated.
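
    The object-partitioning step described above (splitting segmented objects into sub-domains using thresholds set to the means of the object-level elevation statistics) can be illustrated with the following sketch on synthetic object statistics; the four sub-domain names are invented for the example.

        import numpy as np

        rng = np.random.default_rng(0)

        # Hypothetical segmentation output: per-object mean elevation and
        # standard deviation of elevation (e.g., from SRTM cells inside each object).
        mean_elev = rng.uniform(0, 3000, size=200)
        std_elev = rng.uniform(0, 400, size=200)

        # Thresholds: the global means of the object-level statistics.
        t_mean, t_std = mean_elev.mean(), std_elev.mean()

        # Partition objects into four sub-domains (low/high elevation x smooth/rough).
        labels = np.where(mean_elev < t_mean,
                          np.where(std_elev < t_std, "low-smooth", "low-rough"),
                          np.where(std_elev < t_std, "high-smooth", "high-rough"))
        print(dict(zip(*np.unique(labels, return_counts=True))))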

  14. A Lagrangian Relax-and-Cut Approach for the Bounded Diameter Minimum Spanning Tree Problem

    NASA Astrophysics Data System (ADS)

    Raidl, Günther R.; Gruber, Martin

    2008-09-01

    We consider the problem of finding for a given weighted graph a minimum cost spanning tree whose diameter does not exceed a specified upper bound. This problem is NP-hard and has several applications, e.g. when designing communication networks and quality of service is of concern. We model the problem as an integer linear program (ILP) using so-called jump inequalities. Since the number of these constraints grows exponentially with the problem size, solving this ILP directly is not feasible. Instead, we relax the jump constraints in a Lagrangian fashion and apply a cutting plane algorithm to separate violated inequalities. This relax-and-cut approach yields relatively tight lower bounds especially for larger problem instances on which exact techniques are not applicable. High quality feasible solutions, i.e. upper bounds, are obtained by a repair heuristic in combination with a powerful variable neighborhood descent strategy.

  15. Semi-automatic classification of glaciovolcanic landforms: An object-based mapping approach based on geomorphometry

    NASA Astrophysics Data System (ADS)

    Pedersen, G. B. M.

    2016-02-01

    A new object-oriented approach is developed to classify glaciovolcanic landforms (Procedure A) and their landform element boundaries (Procedure B). It utilizes the principle that glaciovolcanic edifices are geomorphometrically distinct from lava shields and plains (Pedersen and Grosse, 2014), and the approach is tested on data from Reykjanes Peninsula, Iceland. The outlined procedures utilize slope and profile curvature attribute maps (20 m/pixel), and the classified results are evaluated quantitatively through error matrix maps (Procedure A) and visual inspection (Procedure B). In Procedure A, the highest obtained accuracy is 94.1%, but even simple mapping procedures provide good results (> 90% accuracy). Successful classification of glaciovolcanic landform element boundaries (Procedure B) is also achieved, and this technique has the potential to delineate the transition from intraglacial to subaerial volcanic activity in orthographic view. This object-oriented approach based on geomorphometry overcomes issues with vegetation cover, which has typically been problematic for classification schemes utilizing spectral data. Furthermore, it handles complex edifice outlines well and is easily incorporated into a GIS environment, where results can be edited or fused with other mapping results. The approach outlined here is designed to map glaciovolcanic edifices within the Icelandic neovolcanic zone but may also be applied to similar subaerial or submarine volcanic settings, where steep volcanic edifices are surrounded by flat plains.
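
    As a generic geomorphometric illustration (not the author's workflow), slope and a simple curvature proxy can be derived from a gridded DEM by finite differences, as sketched below with a synthetic 20 m/pixel dome-shaped edifice.

        import numpy as np

        def slope_and_curvature(dem, cell_size=20.0):
            """Slope (degrees) and a simple curvature proxy (Laplacian of elevation)
            from a regular-grid DEM using finite differences."""
            dz_dy, dz_dx = np.gradient(dem, cell_size)
            slope = np.degrees(np.arctan(np.hypot(dz_dx, dz_dy)))
            d2z_dy2 = np.gradient(dz_dy, cell_size, axis=0)
            d2z_dx2 = np.gradient(dz_dx, cell_size, axis=1)
            curvature = d2z_dx2 + d2z_dy2
            return slope, curvature

        # Hypothetical 20 m/pixel DEM containing a single dome-shaped edifice.
        y, x = np.mgrid[0:100, 0:100] * 20.0
        dem = 500.0 * np.exp(-(((x - 1000) ** 2 + (y - 1000) ** 2) / 2e5))
        slope, curv = slope_and_curvature(dem)
        print(slope.max(), curv.min())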

  16. Hydrometeor classification through statistical clustering of polarimetric radar measurements: a semi-supervised approach

    NASA Astrophysics Data System (ADS)

    Besic, Nikola; Ventura, Jordi Figueras i.; Grazioli, Jacopo; Gabella, Marco; Germann, Urs; Berne, Alexis

    2016-09-01

    Polarimetric radar-based hydrometeor classification is the procedure of identifying different types of hydrometeors by exploiting polarimetric radar observations. The main drawback of the existing supervised classification methods, mostly based on fuzzy logic, is a significant dependency on a presumed electromagnetic behaviour of different hydrometeor types. Namely, the results of the classification largely rely upon the quality of scattering simulations. The unsupervised approach, on the other hand, lacks the constraints related to hydrometeor microphysics. The idea of the proposed method is to compensate for these drawbacks by combining the two approaches in a way that microphysical hypotheses can, to a degree, adjust the content of the classes obtained statistically from the observations. This is done by means of an iterative approach, performed offline, which, in a statistical framework, examines clustered representative polarimetric observations by comparing them to the presumed polarimetric properties of each hydrometeor class. In the case of non-identification, a routine also alters the content of clusters by encouraging further statistical clustering. By merging all identified clusters, the multi-dimensional polarimetric signatures of various hydrometeor types are obtained for each of the studied representative datasets, i.e. for each radar system of interest. These are depicted by sets of centroids which are then employed in operational labelling of different hydrometeors. The method has been applied to three C-band datasets, each acquired by a different operational radar from the MeteoSwiss Rad4Alp network, as well as to two X-band datasets acquired by two research mobile radars. The results are discussed through a comparative analysis which includes a corresponding supervised and unsupervised approach, emphasising the operational potential of the proposed method.

  17. Operational optimization of irrigation scheduling for citrus trees using an ensemble based data assimilation approach

    NASA Astrophysics Data System (ADS)

    Hendricks Franssen, H.; Han, X.; Martinez, F.; Jimenez, M.; Manzano, J.; Chanzy, A.; Vereecken, H.

    2013-12-01

    Data assimilation (DA) techniques, like the local ensemble transform Kalman filter (LETKF), not only offer the opportunity to update model predictions by assimilating new measurement data in real time, but also provide an improved basis for real-time (DA-based) control. This study focuses on the optimization of real-time irrigation scheduling for fields of citrus trees near Picassent (Spain). For three selected fields the irrigation was optimized with DA-based control, and for other fields irrigation was optimized on the basis of a more traditional approach where reference evapotranspiration for citrus trees was estimated using the FAO method. The performance of the two methods is compared for the year 2013. The DA-based real-time control approach is based on ensemble predictions of soil moisture profiles, using the Community Land Model (CLM). The uncertainty in the model predictions is introduced by feeding the model with weather predictions from an ensemble prediction system (EPS) and uncertain soil hydraulic parameters. The model predictions are updated daily by assimilating soil moisture data measured by capacitance probes. The measurement data are assimilated with the help of the LETKF. The irrigation need was calculated for each of the ensemble members, averaged, and logistic constraints (hydraulics, energy costs) were taken into account for the final assignment of irrigation in space and time. For the operational scheduling based on this approach, only model states, and no model parameters, were updated. Other, non-operational simulation experiments for the same period were carried out where (1) neither ensemble weather forecast nor DA were used (open loop), (2) only the ensemble weather forecast was used, (3) only DA was used, (4) soil hydraulic parameters were also updated in the data assimilation, and (5) both soil hydraulic and plant-specific parameters were updated. The FAO-based and DA-based real-time irrigation control are compared in terms of soil moisture

  18. A novel approach to ECG classification based upon two-layered HMMs in body sensor networks.

    PubMed

    Liang, Wei; Zhang, Yinlong; Tan, Jindong; Li, Yang

    2014-03-27

    This paper presents a novel approach to ECG signal filtering and classification. Unlike traditional techniques, which aim at collecting and processing ECG signals with the patient lying still in a hospital bed, the proposed algorithm is intentionally designed for monitoring and classifying the patient's ECG signals in a free-living environment. The patients are equipped with wearable ambulatory devices throughout the day, which facilitates real-time heart attack detection. In ECG preprocessing, an integral-coefficient-band-stop (ICBS) filter is applied, which omits time-consuming floating-point computations. In addition, two-layered Hidden Markov Models (HMMs) are applied to achieve ECG feature extraction and classification. The periodic ECG waveforms are segmented into ISO intervals, P subwave, QRS complex, and T subwave in the first HMM layer, where an expert-annotation-assisted Baum-Welch algorithm is utilized in HMM modeling. The corresponding interval features are then selected and applied to categorize the ECG as normal or abnormal (PVC, APC) in the second HMM layer. To verify the effectiveness of the algorithm for abnormal signal detection, we developed an ECG body sensor network (BSN) platform, whereby real-time ECG signals are collected, transmitted, and displayed, and the corresponding classification outcomes are deduced and shown on the BSN screen.

  19. Cloud field classification based upon high spatial resolution textural features. II - Simplified vector approaches

    NASA Technical Reports Server (NTRS)

    Chen, D. W.; Sengupta, S. K.; Welch, R. M.

    1989-01-01

    This paper compares the results of cloud-field classification derived from two simplified vector approaches, the Sum and Difference Histogram (SADH) and the Gray Level Difference Vector (GLDV), with the results produced by the Gray Level Cooccurrence Matrix (GLCM) approach described by Welch et al. (1988). It is shown that the SADH method produces accuracies equivalent to those obtained using the GLCM method, while the GLDV method fails to resolve error clusters. Compared to the GLCM method, the SADH method leads to a 31 percent saving in run time and a 50 percent saving in storage requirements, while the GLDV approach leads to a 40 percent saving in run time and an 87 percent saving in storage requirements.
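
    The construction of sum and difference histograms can be sketched as follows; the displacement, quantisation level, and summary statistics are illustrative choices, not those of the cited study.

        import numpy as np

        def sadh_features(img, dx=1, dy=0, levels=64):
            """Sum and difference histogram (SADH) texture statistics for one pixel
            displacement (dx, dy): quantise, form pairwise sums/differences, and
            summarise both histograms with mean, energy and entropy."""
            q = np.floor(img.astype(float) / img.max() * (levels - 1)).astype(int)
            a = q[:q.shape[0] - dy, :q.shape[1] - dx]
            b = q[dy:, dx:]
            hs = np.bincount((a + b).ravel(), minlength=2 * levels - 1).astype(float)
            hd = np.bincount((a - b + levels - 1).ravel(), minlength=2 * levels - 1).astype(float)
            hs, hd = hs / hs.sum(), hd / hd.sum()

            def stats(h, support):
                nz = h > 0
                return [np.sum(support * h), np.sum(h ** 2), -np.sum(h[nz] * np.log(h[nz]))]

            return np.array(stats(hs, np.arange(2 * levels - 1)) +
                            stats(hd, np.arange(2 * levels - 1) - (levels - 1)))

        rng = np.random.default_rng(0)
        cloud_patch = rng.random((64, 64))
        print(sadh_features(cloud_patch).round(3))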

  1. A multi-label, semi-supervised classification approach applied to personality prediction in social media.

    PubMed

    Lima, Ana Carolina E S; de Castro, Leandro Nunes

    2014-10-01

    Social media allow web users to create and share content pertaining to different subjects, exposing their activities, opinions, feelings and thoughts. In this context, online social media have attracted the interest of data scientists seeking to understand behaviours and trends, whilst collecting statistics for social sites. One potential application for these data is personality prediction, which aims to understand a user's behaviour within social media. Traditional personality prediction relies on users' profiles, their status updates, the messages they post, etc. Here, a personality prediction system for social media data is introduced that differs from most approaches in the literature, in that it works with groups of texts, instead of single texts, and does not take users' profiles into account. Also, the proposed approach extracts meta-attributes from texts and does not work directly with the content of the messages. The set of possible personality traits is taken from the Big Five model and allows the problem to be characterised as a multi-label classification task. The problem is then transformed into a set of five binary classification problems and solved by means of a semi-supervised learning approach, due to the difficulty in annotating the massive amounts of data generated in social media. In our implementation, the proposed system was trained with three well-known machine-learning algorithms, namely a Naïve Bayes classifier, a Support Vector Machine, and a Multilayer Perceptron neural network. The system was applied to predict personality from tweets taken from three datasets available in the literature, and resulted in approximately 83% prediction accuracy, with some of the personality traits presenting better individual classification rates than others. PMID:24969690
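
    A hedged sketch of the problem transformation is shown below: the multi-label Big Five task is split into five binary problems (binary relevance) and each is solved semi-supervised; scikit-learn's SelfTrainingClassifier with a Naive Bayes base estimator stands in for the paper's classifiers, and the meta-attribute features are placeholders.

        import numpy as np
        from sklearn.naive_bayes import GaussianNB
        from sklearn.semi_supervised import SelfTrainingClassifier

        rng = np.random.default_rng(0)

        # Hypothetical meta-attribute vectors for 500 groups of texts, with only the
        # first 100 labelled; -1 marks unlabelled rows, as scikit-learn expects.
        X = rng.normal(size=(500, 12))
        traits = ["openness", "conscientiousness", "extraversion",
                  "agreeableness", "neuroticism"]
        Y = rng.integers(0, 2, size=(500, len(traits)))

        models = {}
        for j, trait in enumerate(traits):         # binary relevance: one task per trait
            y = Y[:, j].copy()
            y[100:] = -1                           # hide most labels (semi-supervised)
            models[trait] = SelfTrainingClassifier(GaussianNB()).fit(X, y)

        new_group = rng.normal(size=(1, 12))
        print({t: int(m.predict(new_group)[0]) for t, m in models.items()})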

  2. Proposition of novel classification approach and features for improved real-time arrhythmia monitoring.

    PubMed

    Kim, Yoon Jae; Heo, Jeong; Park, Kwang Suk; Kim, Sungwan

    2016-08-01

    Arrhythmia refers to a group of conditions in which the heartbeat is irregular, fast, or slow due to abnormal electrical activity in the heart. Some types of arrhythmia such as ventricular fibrillation may result in cardiac arrest or death. Thus, arrhythmia detection becomes an important issue, and various studies have been conducted. Additionally, an arrhythmia detection algorithm for portable devices such as mobile phones has recently been developed because of increasing interest in e-health care. This paper proposes a novel classification approach and features, which are validated for improved real-time arrhythmia monitoring. The classification approach that was employed for arrhythmia detection is based on the concept of ensemble learning and the Taguchi method and has the advantage of being accurate and computationally efficient. The electrocardiography (ECG) data for arrhythmia detection were obtained from the MIT-BIH Arrhythmia Database (n=48). A novel feature, namely the heart rate variability calculated from 5-s segments of ECG, which had not been considered previously, was used. The novel classification approach and feature demonstrated arrhythmia detection accuracy of 89.13%. When the same data were classified using the conventional support vector machine (SVM), the obtained accuracy was 91.69%, 88.14%, and 88.74% for Gaussian, linear, and polynomial kernels, respectively. In terms of computation time, the proposed classifier was 5821.7 times faster than the conventional SVM. In conclusion, the proposed classifier and feature showed performance comparable to those of previous studies, while the computational complexity and update interval were greatly reduced. PMID:27318329
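
    A minimal sketch of this kind of HRV feature (not the paper's exact definition) is shown below: detect R peaks in a 5-s ECG segment and take the standard deviation of the RR intervals; the synthetic ECG and the peak-detection settings are assumptions.

        import numpy as np
        from scipy.signal import find_peaks

        fs = 360.0                                   # MIT-BIH sampling rate (Hz)
        t = np.arange(0, 5.0, 1.0 / fs)              # one 5-s segment

        # Crude synthetic ECG: narrow Gaussian "R peaks" at slightly jittered times.
        rng = np.random.default_rng(0)
        beat_times = np.cumsum(rng.normal(0.8, 0.05, size=8))
        ecg = sum(np.exp(-((t - bt) ** 2) / (2 * 0.01 ** 2)) for bt in beat_times)
        ecg += 0.05 * rng.normal(size=t.size)

        # Detect R peaks, compute RR intervals, and take their standard deviation
        # (SDNN) as the heart-rate-variability feature for this segment.
        peaks, _ = find_peaks(ecg, height=0.5, distance=int(0.4 * fs))
        rr = np.diff(peaks) / fs                     # RR intervals in seconds
        hrv_sdnn = np.std(rr)
        print(f"Detected {len(peaks)} beats, SDNN = {hrv_sdnn * 1000:.1f} ms")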

  3. A tri-fold hybrid classification approach for diagnostics with unexampled faulty states

    NASA Astrophysics Data System (ADS)

    Tamilselvan, Prasanna; Wang, Pingfeng

    2015-01-01

    System health diagnostics provides diversified benefits such as improved safety, improved reliability and reduced costs for the operation and maintenance of engineered systems. Successful health diagnostics requires knowledge of system failures. However, with increasing system complexity, it is extraordinarily difficult to have a well-tested system so that all potential faulty states can be realized and studied at the product testing stage. Thus, real-time health diagnostics requires automatic detection of unexampled system faulty states based upon sensory data to avoid sudden catastrophic system failures. This paper presents a tri-fold hybrid classification (THC) approach for structural health diagnosis with unexampled health states (UHS), which comprises preliminary UHS identification using a new thresholded Mahalanobis distance (TMD) classifier, UHS diagnostics using a two-class support vector machine (SVM) classifier, and exampled health states diagnostics using a multi-class SVM classifier. The proposed THC approach, which takes advantage of both TMD- and SVM-based classification techniques, is able to identify and isolate the unexampled faulty states through interactively detecting the deviation of sensory data from the exampled health states and forming new ones autonomously. The proposed THC approach is further extended to a generic framework for health diagnostics problems with unexampled faulty states and demonstrated with health diagnostics case studies for power transformers and rolling bearings.
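
    The preliminary unexampled-state check can be sketched as follows: a sample is flagged as unexampled if its Mahalanobis distance to every exampled health state exceeds a threshold; the threshold value and sensory data here are illustrative, not those of the case studies.

        import numpy as np

        rng = np.random.default_rng(0)

        # Hypothetical sensory training data for two exampled health states.
        states = {
            "healthy": rng.normal(loc=0.0, scale=1.0, size=(200, 4)),
            "fault_A": rng.normal(loc=4.0, scale=1.0, size=(200, 4)),
        }

        def mahalanobis(x, data):
            """Mahalanobis distance from sample x to the distribution of `data`."""
            mu = data.mean(axis=0)
            cov_inv = np.linalg.inv(np.cov(data, rowvar=False))
            d = x - mu
            return float(np.sqrt(d @ cov_inv @ d))

        def classify(x, threshold=4.0):
            """Return the nearest exampled state, or 'unexampled' if all distances
            exceed the threshold (the TMD step of the tri-fold approach)."""
            dists = {name: mahalanobis(x, data) for name, data in states.items()}
            best = min(dists, key=dists.get)
            return "unexampled" if dists[best] > threshold else best

        print(classify(rng.normal(0.0, 1.0, size=4)))          # likely "healthy"
        print(classify(np.array([12.0, -9.0, 10.0, -11.0])))   # likely "unexampled"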

  4. [The establishment, development and application of classification approach of freshwater phytoplankton based on the functional group: a review].

    PubMed

    Yang, Wen; Zhu, Jin-Yong; Lu, Kai-Hong; Wan, Li; Mao, Xiao-Hua

    2014-06-01

    Appropriate schemes for the classification of freshwater phytoplankton are prerequisites and important tools for revealing phytoplankton succession and studying freshwater ecosystems. An alternative approach, the functional group of freshwater phytoplankton, has been proposed and developed due to the deficiencies of Linnaean and molecular identification in ecological applications. The functional group of phytoplankton is a classification scheme based on autoecology. In this review, the theoretical basis and classification criteria of the functional group (FG), morpho-functional group (MFG) and morphology-based functional group (MBFG) schemes are summarized, as well as their merits and demerits. FG is considered the optimal classification approach for aquatic ecology research and aquatic environment evaluation. The application status of FG is introduced, and the evaluation standards and problems of two FG-based approaches to assessing water quality, the Q and QR index methods, are briefly discussed.

  5. A multidimensional signal processing approach for classification of microwave measurements with application to stroke type diagnosis.

    PubMed

    Mesri, Hamed Yousefi; Najafabadi, Masoud Khazaeli; McKelvey, Tomas

    2011-01-01

    A multidimensional signal processing method is described for detection of bleeding stroke based on microwave measurements from an antenna array placed around the head of the patient. The method is data driven and the algorithm uses samples from a healthy control group to calculate the feature used for classification. The feature is derived using a tensor approach and the higher order singular value decomposition is a key component. A leave-one-out validation method is used to evaluate the properties of the method using clinical data.

  6. Membrane positioning for high- and low-resolution protein structures through a binary classification approach.

    PubMed

    Postic, Guillaume; Ghouzam, Yassine; Guiraud, Vincent; Gelly, Jean-Christophe

    2016-03-01

    The critical importance of algorithms for orienting proteins in the lipid bilayer stems from the extreme difficulty in obtaining experimental data about the membrane boundaries. Here, we present a computational method for positioning protein structures in the membrane, based on the sole alpha carbon coordinates and, therefore, compatible with both high and low structural resolutions. Our algorithm follows a new and simple approach, by treating the membrane assignment problem as a binary classification. Compared with the state-of-the-art algorithms, our method achieves similar accuracy, while being faster. Finally, our open-source software is also capable of processing coarse-grained models of protein structures. PMID:26685702

  7. Exploring tree species signature using waveform LiDAR data

    NASA Astrophysics Data System (ADS)

    Zhou, T.; Popescu, S. C.; Krause, K.

    2015-12-01

    Successful classification of tree species with waveform LiDAR data would be of considerable value for estimating biomass stocks and changes in forests. Current approaches emphasize converting the full waveform data into discrete points to obtain a larger number of parameters and identify tree species using several discrete-point variables. However, this ignores intensity values and waveform shapes, which convey important structural characteristics. The overall goal of this study was to employ the intensity and waveform shape of individual trees as the waveform signature to detect tree species. The data were acquired by the National Ecological Observatory Network (NEON) within a 250 m × 250 m study area located in the San Joaquin Experimental Range. Specific objectives were to: (1) segment individual trees using a smoothed canopy height model (CHM) derived from discrete LiDAR points; (2) link waveform LiDAR with the individual tree boundaries above to derive sample signatures of three tree species and use these signatures to discriminate tree species over a large area; and (3) compare tree species detection results from discrete LiDAR data and waveform LiDAR data. An overall individual tree segmentation accuracy of more than 80% was obtained. The preliminary results show that, compared with the discrete LiDAR data, the waveform LiDAR signature has a higher potential for accurate tree species classification.

  8. On the Biogeography of Centipeda: A Species-Tree Diffusion Approach

    PubMed Central

    Nylinder, Stephan; Lemey, Philippe; De Bruyn, Mark; Suchard, Marc A.; Pfeil, Bernard E.; Walsh, Neville; Anderberg, Arne A.

    2014-01-01

    Reconstructing the biogeographic history of groups present in continuous arid landscapes is challenging due to the difficulties in defining discrete areas for analyses, and even more so when species largely overlap both in terms of geography and habitat preference. In this study, we use a novel approach to estimate ancestral areas for the small plant genus Centipeda. We apply continuous diffusion of geography by a relaxed random walk where each species is sampled from its extant distribution on an empirical distribution of time-calibrated species-trees. Using a distribution of previously published substitution rates of the internal transcribed spacer (ITS) for Asteraceae, we show how the evolution of Centipeda correlates with the temporal increase of aridity in the arid zone since the Pliocene. Geographic estimates of ancestral species show a consistent pattern of speciation of early lineages in the Lake Eyre region, with a division into more northerly and southerly groups since ∼840 ka. Summarizing the geographic slices of species-trees at the time of the latest speciation event (∼20 ka) indicates no presence of the genus in Australia west of the combined desert belt of the Nullarbor Plain, the Great Victoria Desert, the Gibson Desert, and the Great Sandy Desert, or beyond the main continental shelf of Australia. The result indicates that all western occurrences of the genus are a result of recent dispersal rather than ancient vicariance. This study contributes to our understanding of the spatiotemporal processes shaping the flora of the arid zone, and offers a significant improvement in inference of ancestral areas for any organismal group distributed where it remains difficult to describe geography in terms of discrete areas. PMID:24335493

  9. An inverse modeling approach for tree-ring-based climate reconstructions under changing atmospheric CO2 concentrations

    NASA Astrophysics Data System (ADS)

    Boucher, É.; Guiot, J.; Hatté, C.; Daux, V.; Danis, P.-A.; Dussouillez, P.

    2013-11-01

    Over the last decades, dendroclimatologists have relied upon linear transfer functions to reconstruct historical climate. Transfer functions need to be calibrated using recent data from periods where CO2 concentrations reached unprecedented levels (near 400 ppm). Based on these transfer functions, dendroclimatologists must then reconstruct a different past, a past where CO2 concentrations were much below 300 ppm. However, relying upon transfer functions calibrated in this way may introduce an unanticipated bias in the reconstruction of past climate, particularly if CO2 levels have had a noticeable fertilizing effect since the beginning of the industrial era. As an alternative to the transfer function approach, we run the MAIDENiso ecophysiological model in an inverse mode to link together climatic variables, atmospheric CO2 concentrations and tree growth parameters. Our approach endeavors to find the optimal combination of meteorological conditions that best simulate observed tree ring patterns. We test our approach in the Fontainebleau forest (France). By comparing two different CO2 scenarios, we present evidence that increasing CO2 concentrations have had a slight, yet significant, effect on reconstruction results. We demonstrate that higher CO2 concentrations augment the efficiency of water use by trees, therefore favoring the reconstruction of a warmer and drier climate. Under elevated CO2 concentrations, trees close their stomata and need less water to produce the same amount of wood. Inverse process-based modeling represents a powerful alternative to the transfer function technique, especially for the study of divergent tree-ring-to-climate relationships. The approach has several advantages, most notably its ability to distinguish between climatic effects and CO2 imprints on tree growth. Therefore our method produces reconstructions that are less biased by anthropogenic greenhouse gas emissions and that are based on sound ecophysiological knowledge.

  10. Semi-Automated Approach for Mapping Urban Trees from Integrated Aerial LiDAR Point Cloud and Digital Imagery Datasets

    NASA Astrophysics Data System (ADS)

    Dogon-Yaro, M. A.; Kumar, P.; Rahman, A. Abdul; Buyuksalih, G.

    2016-09-01

    Mapping of trees plays an important role in modern urban spatial data management, as many benefits and applications derive from these detailed, up-to-date data sources. Timely and accurate acquisition of information on the condition of urban trees serves as a tool for decision makers to better appreciate urban ecosystems and their numerous values, which are critical to building strategies for sustainable development. The conventional techniques used for extracting trees include ground surveying and interpretation of aerial photography. However, these techniques are associated with constraints, such as labour-intensive field work and high financial cost, which can be overcome by means of integrated LiDAR and digital image datasets. In contrast to most studies on tree extraction, which focus on purely forested areas, this study concentrates on urban areas, which have a high structural complexity with a multitude of different objects. This paper presents a workflow for a semi-automated approach to extracting urban trees from the integrated processing of airborne LiDAR point cloud and multispectral digital image datasets over the city of Istanbul, Turkey. The paper shows that the integrated datasets are a suitable technology and a viable source of information for urban tree management. In conclusion, the extracted information provides a snapshot of the location, composition and extent of trees in the study area, useful to city planners and other decision makers to understand how much canopy cover exists, to identify new planting, removal, or reforestation opportunities, and to determine which locations have the greatest need or potential to maximize the return on investment. It can also help track trends or changes to the urban trees over time and inform future management decisions.

  11. SDM: a fast distance-based approach for (super) tree building in phylogenomics.

    PubMed

    Criscuolo, Alexis; Berry, Vincent; Douzery, Emmanuel J P; Gascuel, Olivier

    2006-10-01

    Phylogenomic studies aim to build phylogenies from large sets of homologous genes. Such "genome-sized" data require fast methods, because of the typically large numbers of taxa examined. In this framework, distance-based methods are useful for exploratory studies and building a starting tree to be refined by a more powerful maximum likelihood (ML) approach. However, estimating evolutionary distances directly from concatenated genes gives poor topological signal as genes evolve at different rates. We propose a novel method, named super distance matrix (SDM), which follows the same line as average consensus supertree (ACS; Lapointe and Cucumel, 1997) and combines the evolutionary distances obtained from each gene into a single distance supermatrix to be analyzed using a standard distance-based algorithm. SDM deforms the source matrices, without modifying their topological message, to bring them as close as possible to each other; these deformed matrices are then averaged to obtain the distance supermatrix. We show that this problem is equivalent to the minimization of a least-squares criterion subject to linear constraints. This problem has a unique solution which is obtained by resolving a linear system. As this system is sparse, its practical resolution requires O(n^a k^a) time, where n is the number of taxa, k the number of matrices, and a < 2, which allows the distance supermatrix to be quickly obtained. Several uses of SDM are proposed, from fast exploratory studies to more accurate approaches requiring heavier computing time. Using simulations, we show that SDM is a relevant alternative to the standard matrix representation with parsimony (MRP) method, notably when the taxa sets of the different genes have low overlap. We also show that SDM can be used to build an excellent starting tree for an ML approach, which both reduces the computing time and increases the topological accuracy. We use SDM to analyze the data set of Gatesy et al. (2002, Syst. Biol. 51: 652

  12. Complementing boosted regression trees models of SOC stocks distributions with geostatistical approaches

    NASA Astrophysics Data System (ADS)

    martin, manuel; Lacarce, Eva; Meersmans, Jeroen; Orton, Thomas; Saby, Nicolas; Paroissien, Jean-Baptiste; Jolivet, Claudy; Boulonne, Line; Arrouays, Dominique

    2013-04-01

    Soil organic carbon (SOC) plays a major role in the global carbon budget. It can act as a source or a sink of atmospheric carbon, thereby possibly influencing the course of climate change. Improving the tools that model the spatial distributions of SOC stocks at national scales is a priority, both for monitoring changes in SOC and as an input for global carbon cycle studies. In this paper, we first considered several increasingly complex boosted regression trees (BRT), a convenient and efficient multiple regression model from the statistical learning field. We then considered a robust geostatistical approach coupled to the BRT models. The different approaches were tested on the dataset from the French Soil Monitoring Network, with a consistent cross-validation procedure. We showed that the BRT models, given their ease of use and predictive performance, could be preferred to geostatistical models for SOC mapping at the national scale and, if possible, combined with geostatistical models. This conclusion is valid provided that care is exercised in model fitting and validation, that the dataset does not allow local spatial autocorrelation to be modelled, as is the case for many national systematic sampling schemes, and that good-quality data on the SOC drivers included in the models are available.
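
    A minimal boosted-regression-trees sketch using scikit-learn's GradientBoostingRegressor with cross-validation is given below; the covariates, synthetic SOC response, and hyperparameters are placeholders rather than the models fitted to the French monitoring network.

        import numpy as np
        from sklearn.ensemble import GradientBoostingRegressor
        from sklearn.model_selection import cross_val_score

        rng = np.random.default_rng(0)

        # Hypothetical covariates for monitoring sites: clay content, precipitation,
        # temperature, land-use code and an NPP proxy, plus a synthetic SOC response.
        n = 400
        X = np.column_stack([
            rng.uniform(5, 60, n),        # clay (%)
            rng.uniform(500, 1500, n),    # precipitation (mm)
            rng.uniform(5, 15, n),        # mean temperature (degC)
            rng.integers(0, 4, n),        # land-use class
            rng.uniform(200, 900, n),     # NPP proxy
        ])
        soc = 20 + 0.4 * X[:, 0] + 0.01 * X[:, 1] - 1.2 * X[:, 2] + rng.normal(0, 5, n)

        brt = GradientBoostingRegressor(n_estimators=500, learning_rate=0.05,
                                        max_depth=3, subsample=0.8, random_state=0)
        scores = cross_val_score(brt, X, soc, cv=5, scoring="r2")
        print("Cross-validated R2: %.2f +/- %.2f" % (scores.mean(), scores.std()))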

  13. Early and Mid-Holocene Climate Variability - A Multi-Proxy Approach from Multi-Millennial Tree Ring Records

    NASA Astrophysics Data System (ADS)

    Ziehmer, Malin Michelle; Nicolussi, Kurt; Schlüchter, Christian; Leuenberger, Markus

    2016-04-01

    Most reconstructions of Holocene climate variability in the Alps are based on low-frequency archives such as glacier and tree line fluctuations. However, recent finds of wood remains in glacier forefields in the Alps reveal a unique high-frequency archive allowing climate reconstruction over the entire Holocene. The evolution of Holocene climate can be reconstructed using a multi-proxy approach that combines tree ring width and multiple stable isotope chronologies, establishing highly resolved stable isotope records from calendar-dated wood covering the past 9000 years b2k. Therefore, we collected samples in the Alps covering a large SW-NE transect, primarily in glacier forefields but also in peat bogs and small lakes. The multiple sample locations allow the analysis of climatic conditions along a climatic gradient characterized by the change from an Atlantic to a more continental climate. Subsequently, tree ring widths are measured and samples are calendrically dated by means of tree ring analysis. Due to the large number of samples required for stable isotope analysis (> 8000 samples to cover the entire Holocene while guaranteeing a replication of 4 samples per 5-year time unit), dated wood samples are separated into 5-year tree ring blocks. These blocks are sliced, and the cellulose is extracted following a standardized procedure and crushed by ultrasonic homogenization. In order to establish multi-proxy records, the stable isotopes of carbon, oxygen and hydrogen are simultaneously measured. Both the 5-year tree ring width and the multiple stable isotope series offer new insights into the Early and Mid-Holocene climate and its variability in the Alps. The stable isotope records reveal interesting low-frequency variability. However, they also display expected offsets caused by the measurement of individual trees, revealing effects of sampling site, tree species and growth trend. These effects offer an additional insight into the tree growth and stand behavior of single trees.

  14. Neuropsychological assessment of individuals with brain tumor: comparison of approaches used in the classification of impairment.

    PubMed

    Dwan, Toni Maree; Ownsworth, Tamara; Chambers, Suzanne; Walker, David G; Shum, David H K

    2015-01-01

    Approaches to classifying neuropsychological impairment after brain tumor vary according to testing level (individual tests, domains, or global index) and source of reference (i.e., norms, controls, and pre-morbid functioning). This study aimed to compare rates of impairment according to different classification approaches. Participants were 44 individuals (57% female) with a primary brain tumor diagnosis (mean age = 45.6 years) and 44 matched control participants (59% female, mean age = 44.5 years). All participants completed a test battery that assesses pre-morbid IQ (Wechsler adult reading test), attention/processing speed (digit span, trail making test A), memory (Hopkins verbal learning test-revised, Rey-Osterrieth complex figure-recall), and executive function (trail making test B, Rey-Osterrieth complex figure copy, controlled oral word association test). Results indicated that across the different sources of reference, 86-93% of participants were classified as impaired at a test-specific level, 61-73% were classified as impaired at a domain-specific level, and 32-50% were classified as impaired at a global level. Rates of impairment did not significantly differ according to source of reference (p > 0.05); however, at the individual participant level, classification based on estimated pre-morbid IQ was often inconsistent with classification based on the norms or controls. Participants with brain tumor performed significantly poorer than matched controls on tests of neuropsychological functioning, including executive function (p = 0.001) and memory (p < 0.001), but not attention/processing speed (p > 0.05). These results highlight the need to examine individuals' performance across a multi-faceted neuropsychological test battery to avoid over- or under-estimation of impairment.

  15. Neuropsychological Assessment of Individuals with Brain Tumor: Comparison of Approaches Used in the Classification of Impairment

    PubMed Central

    Dwan, Toni Maree; Ownsworth, Tamara; Chambers, Suzanne; Walker, David G.; Shum, David H. K.

    2015-01-01

    Approaches to classifying neuropsychological impairment after brain tumor vary according to testing level (individual tests, domains, or global index) and source of reference (i.e., norms, controls, and pre-morbid functioning). This study aimed to compare rates of impairment according to different classification approaches. Participants were 44 individuals (57% female) with a primary brain tumor diagnosis (mean age = 45.6 years) and 44 matched control participants (59% female, mean age = 44.5 years). All participants completed a test battery that assesses pre-morbid IQ (Wechsler adult reading test), attention/processing speed (digit span, trail making test A), memory (Hopkins verbal learning test-revised, Rey–Osterrieth complex figure-recall), and executive function (trail making test B, Rey–Osterrieth complex figure copy, controlled oral word association test). Results indicated that across the different sources of reference, 86–93% of participants were classified as impaired at a test-specific level, 61–73% were classified as impaired at a domain-specific level, and 32–50% were classified as impaired at a global level. Rates of impairment did not significantly differ according to source of reference (p > 0.05); however, at the individual participant level, classification based on estimated pre-morbid IQ was often inconsistent with classification based on the norms or controls. Participants with brain tumor performed significantly poorer than matched controls on tests of neuropsychological functioning, including executive function (p = 0.001) and memory (p < 0.001), but not attention/processing speed (p > 0.05). These results highlight the need to examine individuals’ performance across a multi-faceted neuropsychological test battery to avoid over- or under-estimation of impairment. PMID:25815271

  16. A graph-theoretic approach for classification and structure prediction of transmembrane β-barrel proteins

    PubMed Central

    2012-01-01

    Background Transmembrane β-barrel proteins are a special class of transmembrane proteins which play several key roles in the human body and in disease. Due to experimental difficulties, the number of transmembrane β-barrel proteins with known structures is very small. Over the years, a number of learning-based methods have been introduced for recognition and structure prediction of transmembrane β-barrel proteins. Most of these methods emphasize homology search rather than any biological or chemical basis. Results We present a novel graph-theoretic model for classification and structure prediction of transmembrane β-barrel proteins. This model folds proteins based on energy minimization rather than a homology search, avoiding any assumption on the availability of a training dataset. The ab initio model presented in this paper is the first method to allow for permutations in the structure of transmembrane proteins and provides more structural information than any known algorithm. The model is also able to recognize β-barrels by assessing the pseudo free energy. We assess the structure prediction on 41 proteins gathered from existing databases on experimentally validated transmembrane β-barrel proteins. We show that our approach is quite accurate with over 90% F-score on strands and over 74% F-score on residues. The results are comparable to other algorithms, suggesting that our pseudo-energy model is close to the actual physical model. We test our classification approach and show that it is able to reject α-helical bundles with 100% accuracy and β-barrel lipocalins with 97% accuracy. Conclusions We show that it is possible to design models for classification and structure prediction for transmembrane β-barrel proteins which do not depend essentially on training sets but on combinatorial properties of the structures to be proved. These models are fairly accurate, robust and can be run very efficiently on PC-like computers. Such models are useful for the genome

  17. Multicasting in Wireless Communications (Ad-Hoc Networks): Comparison against a Tree-Based Approach

    NASA Astrophysics Data System (ADS)

    Rizos, G. E.; Vasiliadis, D. C.

    2007-12-01

    We examine on-demand multicasting in ad hoc networks. The Core Assisted Mesh Protocol (CAMP) is a well-known protocol for multicast routing in ad-hoc networks, generalizing the notion of core-based trees employed for internet multicasting into multicast meshes that have much richer connectivity than trees. On the other hand, wireless tree-based multicast routing protocols use much simpler structures for determining route paths, using only parent-child relationships. In this work, we compare the performance of the CAMP protocol against the performance of wireless tree-based multicast routing protocols, in terms of two important factors, namely packet delay and ratio of dropped packets.

  18. Detection and classification of interstitial lung diseases and emphysema using a joint morphological-fuzzy approach

    NASA Astrophysics Data System (ADS)

    Chang Chien, Kuang-Che; Fetita, Catalin; Brillet, Pierre-Yves; Prêteux, Françoise; Chang, Ruey-Feng

    2009-02-01

    Multi-detector computed tomography (MDCT) has high accuracy and specificity in volumetrically capturing serial images of the lung. This increases the capability of computerized classification of lung tissue in medical research. This paper proposes a three-dimensional (3D) automated approach based on mathematical morphology and fuzzy logic for quantifying and classifying interstitial lung diseases (ILDs) and emphysema. The proposed methodology is composed of several stages: (1) an image multi-resolution decomposition scheme based on a 3D morphological filter is used to detect and analyze the different density patterns of the lung texture. Then, (2) for each pattern in the multi-resolution decomposition, six features are computed, for which fuzzy membership functions define a probability of association with a pathology class. Finally, (3) for each pathology class, the probabilities are combined according to the weight assigned to each membership function, and two threshold values are used to decide the final class of the pattern. The proposed approach was tested on 10 MDCT cases and the classification accuracy was: emphysema: 95%, fibrosis/honeycombing: 84% and ground glass: 97%.

  20. Texture classification of anatomical structures in CT using a context-free machine learning approach

    NASA Astrophysics Data System (ADS)

    Jiménez del Toro, Oscar A.; Foncubierta-Rodríguez, Antonio; Depeursinge, Adrien; Müller, Henning

    2015-03-01

    Medical images contain a large amount of visual information about structures and anomalies in the human body. To make sense of this information, human interpretation is often essential. On the other hand, computer-based approaches can exploit information contained in the images by numerically measuring and quantifying specific visual features. Annotation of organs and other anatomical regions is an important step before computing numerical features on medical images. In this paper, a texture-based organ classification algorithm is presented, which can be used to reduce the time required for annotating medical images. The texture of organs is analyzed using a combination of state-of-the-art techniques: the Riesz transform and a bag of meaningful visual words. The effect of a meaningfulness transformation in the visual word space yields two important advantages that can be seen in the results. The number of descriptors is reduced to 10% of the original size, while classification accuracy is improved by up to 25% with respect to the baseline approach.
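
    The bag-of-visual-words step can be sketched as below: cluster local descriptors into a vocabulary with k-means and describe each region by its histogram of word assignments; the random descriptors stand in for Riesz-transform responses, and the vocabulary size is an arbitrary choice.

        import numpy as np
        from sklearn.cluster import KMeans

        rng = np.random.default_rng(0)

        # Hypothetical local texture descriptors extracted from many CT patches
        # (in the paper these would come from Riesz-transform responses).
        train_descriptors = rng.normal(size=(2000, 16))

        # Build the visual vocabulary by clustering descriptors into "words".
        vocab = KMeans(n_clusters=32, n_init=10, random_state=0).fit(train_descriptors)

        def bag_of_words(descriptors, vocab):
            """Normalised histogram of visual-word assignments for one region/organ."""
            words = vocab.predict(descriptors)
            hist = np.bincount(words, minlength=vocab.n_clusters).astype(float)
            return hist / hist.sum()

        organ_descriptors = rng.normal(size=(150, 16))
        print(bag_of_words(organ_descriptors, vocab)[:8])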

  1. Model-based approach to the detection and classification of mines in sidescan sonar.

    PubMed

    Reed, Scott; Petillot, Yvan; Bell, Judith

    2004-01-10

    This paper presents a model-based approach to mine detection and classification by use of sidescan sonar. Advances in autonomous underwater vehicle technology have increased the interest in automatic target recognition systems in an effort to automate a process that is currently carried out by a human operator. Current automated systems generally require training and thus produce poor results when the test data set is different from the training set. This has led to research into unsupervised systems, which are able to cope with the large variability in conditions and terrains seen in sidescan imagery. The system presented in this paper first detects possible minelike objects using a Markov random field model, which operates well on noisy images, such as sidescan, and allows a priori information to be included through the use of priors. The highlight and shadow regions of the object are then extracted with a cooperating statistical snake, which assumes these regions are statistically separate from the background. Finally, a classification decision is made using Dempster-Shafer theory, where the extracted features are compared with synthetic realizations generated with a sidescan sonar simulator model. Results for the entire process are shown on real sidescan sonar data. Similarities between the sidescan sonar and synthetic aperture radar (SAR) imaging processes ensure that the approach outlined here could be applied to SAR image analysis. PMID:14735943

  2. Semantic classification of urban buildings combining VHR image and GIS data: An improved random forest approach

    NASA Astrophysics Data System (ADS)

    Du, Shihong; Zhang, Fangli; Zhang, Xiuyuan

    2015-07-01

    While most existing studies have focused on extracting geometric information on buildings, only a few have concentrated on semantic information. The lack of semantic information means that many demands related to resolving environmental and social issues cannot be satisfied. This study presents an approach to semantically classify buildings into much finer categories than those of existing studies by learning a random forest (RF) classifier from a large number of imbalanced samples with high-dimensional features. First, a two-level segmentation mechanism combining GIS and VHR image produces single image objects at a large scale and intra-object components at a small scale. Second, a semi-supervised method chooses a large number of unbiased samples by considering the spatial proximity and intra-cluster similarity of buildings. Third, two important improvements are made to the RF classifier: a voting-distribution ranked rule for reducing the influence of imbalanced samples on classification accuracy, and a feature importance measurement for evaluating each feature's contribution to the recognition of each category. Fourth, the semantic classification of urban buildings is practically conducted in Beijing, and the results demonstrate that the proposed approach is effective and accurate. The seven categories used in the study are finer than those in existing work and more helpful for studying many environmental and social problems.
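
    A minimal scikit-learn sketch of the two random forest ingredients mentioned above follows. The paper's voting-distribution ranked rule is not reproduced, so standard class weighting stands in for handling imbalance, and the synthetic data stand in for object/component features.

```python
# Random forest with class weighting (a stand-in for the paper's
# voting-distribution ranked rule) and per-feature importance scores.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Imbalanced synthetic data standing in for building features.
X, y = make_classification(n_samples=3000, n_features=40, n_informative=12,
                           n_classes=4, weights=[0.6, 0.25, 0.1, 0.05],
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y,
                                                    random_state=0)

rf = RandomForestClassifier(n_estimators=300, class_weight="balanced",
                            random_state=0)
rf.fit(X_train, y_train)
print("accuracy:", rf.score(X_test, y_test))

# Rank features by their contribution to the forest's splits.
ranking = np.argsort(rf.feature_importances_)[::-1]
print("top 5 features:", ranking[:5])
```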

  3. Applying Robust Directional Similarity based Clustering approach RDSC to classification of gene expression data.

    PubMed

    Li, H X; Wang, Shitong; Xiu, Yu

    2006-06-01

    Although the classification of gene expression data from cDNA microarrays has been extensively studied, a robust clustering method that can estimate an appropriate number of clusters and is insensitive to its initialization has not yet been developed. In this work, a novel robust clustering approach, RDSC, based on a new directional similarity measure is presented. This new approach, which integrates the Directional Similarity based Clustering Algorithm (DSC) with the Agglomerative Hierarchical Clustering Algorithm (AHC), exhibits robustness to initialization and the capability to determine an appropriate number of clusters. RDSC has been successfully applied to both artificial and benchmark gene expression datasets. Our experimental results demonstrate its distinct superiority over the conventional K-means method and the two typical directional clustering algorithms SPKmeans and moVMF.

  4. A case-comparison study of automatic document classification utilizing both serial and parallel approaches

    NASA Astrophysics Data System (ADS)

    Wilges, B.; Bastos, R. C.; Mateus, G. P.; Dantas, M. A. R.

    2014-10-01

    A well-known problem faced by any organization nowadays is the high volume of available data and the processing required to transform this volume into differential information. In this study, a case-comparison study of an automatic document classification (ADC) approach is presented, utilizing both serial and parallel paradigms. The serial approach was implemented by adopting the RapidMiner software tool, which is recognized as the world-leading open-source system for data mining. On the other hand, considering the MapReduce programming model, the Hadoop software environment has been used. The main goal of this case-comparison study is to exploit differences between these two paradigms, especially when large volumes of data such as Web text documents are utilized to build a category database. In the literature, many studies point out that distributed processing of unstructured documents using Hadoop has been yielding efficient results. Results from our research indicate a threshold to such efficiency.

  5. Land cover classification of Landsat 8 satellite data based on Fuzzy Logic approach

    NASA Astrophysics Data System (ADS)

    Taufik, Afirah; Sakinah Syed Ahmad, Sharifah

    2016-06-01

    The aim of this paper is to propose a method to classify the land covers of a satellite image based on a fuzzy rule-based system approach. The study uses bands in Landsat 8 and other indices, such as the Normalized Difference Water Index (NDWI), Normalized Difference Built-up Index (NDBI) and Normalized Difference Vegetation Index (NDVI), as input for the fuzzy inference system. The three selected indices represent our three main classes: water, built-up land, and vegetation. The combination of the original multispectral bands and the selected indices provides more information about the image. The fuzzy membership parameters are selected using a supervised method known as ANFIS (adaptive neuro-fuzzy inference system) training. The fuzzy system is tested for classification on a land cover image that covers the Klang Valley area. The results showed that the fuzzy system approach is effective and can be explored and implemented for other areas of Landsat data.
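
    The index computation and a hand-written fuzzy labelling rule can be sketched as follows; the band arrays and membership breakpoints are purely illustrative, and the paper tunes its memberships with ANFIS rather than fixing them by hand.

```python
# Compute NDWI, NDBI and NDVI from Landsat 8-style band arrays and apply
# simple fuzzy memberships to label each pixel as water, built-up land or
# vegetation. Bands and breakpoints are hypothetical.
import numpy as np

def fuzzy_high(x, lo=0.0, hi=0.4):
    """Linear membership rising from 0 at `lo` to 1 at `hi`."""
    return np.clip((x - lo) / (hi - lo), 0.0, 1.0)

# Hypothetical surface-reflectance bands (green, red, NIR, SWIR1).
rng = np.random.default_rng(1)
green, red, nir, swir1 = (rng.uniform(0.02, 0.6, size=(100, 100))
                          for _ in range(4))

eps = 1e-9
ndwi = (green - nir) / (green + nir + eps)
ndvi = (nir - red) / (nir + red + eps)
ndbi = (swir1 - nir) / (swir1 + nir + eps)

memberships = np.stack([fuzzy_high(ndwi),    # water
                        fuzzy_high(ndbi),    # built-up land
                        fuzzy_high(ndvi)])   # vegetation
label = memberships.argmax(axis=0)           # 0=water, 1=built-up, 2=vegetation
print(np.bincount(label.ravel(), minlength=3))
```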

  6. A probabilistic approach to segmentation and classification of neoplasia in uterine cervix images using color and geometric features

    NASA Astrophysics Data System (ADS)

    Srinivasan, Yeshwanth; Hernes, Dana; Tulpule, Bhakti; Yang, Shuyu; Guo, Jiangling; Mitra, Sunanda; Yagneswaran, Sriraja; Nutter, Brian; Jeronimo, Jose; Phillips, Benny; Long, Rodney; Ferris, Daron

    2005-04-01

    Automated segmentation and classification of diagnostic markers in medical imagery are challenging tasks. Numerous algorithms for segmentation and classification based on statistical approaches of varying complexity are found in the literature. However, the design of an efficient and automated algorithm for precise classification of desired diagnostic markers is extremely image-specific. The National Library of Medicine (NLM), in collaboration with the National Cancer Institute (NCI), is creating an archive of 60,000 digitized color images of the uterine cervix. NLM is developing tools for the analysis and dissemination of these images over the Web for the study of visual features correlated with precancerous neoplasia and cancer. To enable indexing of images of the cervix, it is essential to develop algorithms for the segmentation of regions of interest, such as acetowhitened regions, and automatic identification and classification of regions exhibiting mosaicism and punctation. Success of such algorithms depends, primarily, on the selection of relevant features representing the region of interest. We present color and geometric features based statistical classification and segmentation algorithms yielding excellent identification of the regions of interest. The distinct classification of the mosaic regions from the non-mosaic ones has been obtained by clustering multiple geometric and color features of the segmented sections using various morphological and statistical approaches. Such automated classification methodologies will facilitate content-based image retrieval from the digital archive of uterine cervix and have the potential of developing an image based screening tool for cervical cancer.

  7. A novel semi-supervised hyperspectral image classification approach based on spatial neighborhood information and classifier combination

    NASA Astrophysics Data System (ADS)

    Tan, Kun; Hu, Jun; Li, Jun; Du, Peijun

    2015-07-01

    In the process of semi-supervised hyperspectral image classification, the spatial neighborhood information of training samples is widely applied to solve the small sample size problem. However, the neighborhood information of unlabeled samples is usually ignored. In this paper, we propose a new algorithm for semi-supervised hyperspectral image classification in which spatial neighborhood information is combined with the classifiers to enhance their ability to determine the class labels of the selected unlabeled samples. There are two key points in this algorithm: (1) it is considered that the correct label should appear in the spatial neighborhood of an unlabeled sample; (2) a combination of classifiers obtains better results. Two classifiers, multinomial logistic regression (MLR) and k-nearest neighbor (KNN), are combined in this way to further improve the performance. The performance of the proposed approach was assessed with two real hyperspectral data sets, and the obtained results indicate that the proposed approach is effective for hyperspectral classification.
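
    A minimal self-training sketch in the spirit of this approach: unlabeled samples are added to the training set only when MLR and KNN agree with high confidence. The spatial neighborhood constraint is omitted, and the data are synthetic stand-ins for hyperspectral pixels.

```python
# Semi-supervised self-training with agreement between multinomial logistic
# regression and KNN; confidence threshold and round count are arbitrary.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=2000, n_features=50, n_informative=15,
                           n_classes=5, random_state=0)
rng = np.random.default_rng(0)
labeled = rng.choice(len(X), size=50, replace=False)          # tiny labeled set
unlabeled = np.setdiff1d(np.arange(len(X)), labeled)

X_lab, y_lab = X[labeled], y[labeled]
for _ in range(5):                                            # self-training rounds
    if len(unlabeled) == 0:
        break
    mlr = LogisticRegression(max_iter=2000).fit(X_lab, y_lab)
    knn = KNeighborsClassifier(n_neighbors=5).fit(X_lab, y_lab)
    proba = mlr.predict_proba(X[unlabeled])
    mlr_pred = mlr.classes_[proba.argmax(axis=1)]
    knn_pred = knn.predict(X[unlabeled])
    confident = (proba.max(axis=1) > 0.9) & (mlr_pred == knn_pred)
    if not confident.any():
        break
    X_lab = np.vstack([X_lab, X[unlabeled][confident]])
    y_lab = np.concatenate([y_lab, mlr_pred[confident]])
    unlabeled = unlabeled[~confident]

print("final training set size:", len(y_lab))
```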

  8. Immunotyping of Human Immunodeficiency Virus Type 1 (HIV): an Approach to Immunologic Classification of HIV

    PubMed Central

    Zolla-Pazner, Susan; Gorny, Miroslaw K.; Nyambi, Phillipe N.; VanCott, Thomas C.; Nádas, Arthur

    1999-01-01

    Because immunologic classification of human immunodeficiency virus type 1 (HIV) might be more relevant than genotypic classification for designing polyvalent vaccines, studies were undertaken to determine whether immunologically defined groups of HIV (“immunotypes”) could be identified. For these experiments, the V3 region of the 120-kDa envelope glycoprotein (gp120) was chosen for study. Although antibodies (Abs) to V3 may not play a major protective role in preventing HIV infection, identification of a limited number of immunologically defined structures in this extremely variable region would set a precedent supporting the hypothesis that, despite its diversity, the HIV family, like the V3 region, might be divisible into immunotypes. Consequently, the immunochemical reactivities of 1,176 combinations of human anti-V3 monoclonal Abs (MAbs) and V3 peptides, derived from viruses of several clades, were studied. Extensive cross-clade reactivity was observed. The patterns of reactivities of 21 MAbs with 50 peptides from clades A through H were then analyzed by a multivariate statistical technique. To test the validity of the mathematical approach, a cluster analysis of the 21 MAbs was performed. Five groups were identified, and these MAb clusters corresponded to classifications of these same MAbs based on the epitopes which they recognize. The concordance between the MAb clusters identified by mathematical analysis and by their specificities supports the validity of the mathematical approach. Therefore, the same mathematical technique was used to identify clusters within the 50 peptides. Seven groups of peptides, each containing peptides from more than one clade, were defined. Inspection of the amino acid sequences of the peptides in each of the mathematically defined peptide clusters revealed unique “signature sequences” that suggest structural motifs characteristic of each V3-based immunotype. The results suggest that cluster analysis of immunologic data

  9. Regression-Based Approach For Feature Selection In Classification Issues. Application To Breast Cancer Detection And Recurrence

    NASA Astrophysics Data System (ADS)

    Belciug, Smaranda; Serbanescu, Mircea-Sebastian

    2015-09-01

    Feature selection is considered a key factor in classification/decision problems. It is currently used in designing intelligent decision systems to choose the features that allow the best performance. This paper proposes a regression-based approach to select the most important predictors in order to significantly increase the classification performance. Application to breast cancer detection and recurrence using publicly available datasets proved the efficiency of this technique.
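
    A minimal sketch of regression-based feature selection, assuming an L1-penalized logistic regression as the selector and the public Wisconsin breast cancer data bundled with scikit-learn as a stand-in for the datasets referenced in the abstract.

```python
# Regression-based feature selection: keep predictors with non-zero L1
# coefficients, then compare a downstream classifier with and without the
# selection step.
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

selector = SelectFromModel(
    LogisticRegression(penalty="l1", solver="liblinear", C=0.1))
with_selection = make_pipeline(StandardScaler(), selector,
                               LogisticRegression(max_iter=2000))
without_selection = make_pipeline(StandardScaler(),
                                  LogisticRegression(max_iter=2000))

print("with selection:   ", cross_val_score(with_selection, X, y, cv=5).mean())
print("without selection:", cross_val_score(without_selection, X, y, cv=5).mean())
```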

  10. Evaluating an ensemble classification approach for crop diversity verification in Danish greening subsidy control

    NASA Astrophysics Data System (ADS)

    Chellasamy, Menaka; Ferré, Ty Paul Andrew; Greve, Mogens Humlekrog

    2016-07-01

    Beginning in 2015, Danish farmers are obliged to meet specific crop diversification rules based on total land area and number of crops cultivated to be eligible for new greening subsidies. Hence, there is a need for the Danish government to extend their subsidy control system to verify farmers' declarations to warrant greening payments under the new crop diversification rules. Remote Sensing (RS) technology has been used since 1992 to control farmers' subsidies in Denmark. However, a proper RS-based approach is yet to be finalised to validate new crop diversity requirements designed for assessing compliance under the recent subsidy scheme (2014-2020). This study uses an ensemble classification approach (proposed by the authors in previous studies) for validating the crop diversity requirements of the new rules. The approach uses a neural network ensemble classification system with bi-temporal (spring and early summer) WorldView-2 imagery (WV2) and includes the following steps: (1) automatic computation of pixel-based prediction probabilities using multiple neural networks; (2) quantification of the classification uncertainty using Endorsement Theory (ET); (3) discrimination of crop pixels and validation of the crop diversification rules at farm level; and (4) identification of farmers who are violating the requirements for greening subsidies. The prediction probabilities are computed by a neural network ensemble supplied with training samples selected automatically using farmers' declared parcels (field vectors containing crop information and the field boundary of each crop). Crop discrimination is performed by considering a set of conclusions derived from individual neural networks based on ET. Verification of the diversification rules is performed by incorporating pixel-based classification uncertainty or confidence intervals with the class labels at the farmer level. The proposed approach was tested with WV2 imagery acquired in 2011 for a study area in Vennebjerg

  11. Effects of a Peer Assessment System Based on a Grid-Based Knowledge Classification Approach on Computer Skills Training

    ERIC Educational Resources Information Center

    Hsu, Ting-Chia

    2016-01-01

    In this study, a peer assessment system using the grid-based knowledge classification approach was developed to improve students' performance during computer skills training. To evaluate the effectiveness of the proposed approach, an experiment was conducted in a computer skills certification course. The participants were divided into three…

  12. Diagnostic classification of specific phobia subtypes using structural MRI data: a machine-learning approach.

    PubMed

    Lueken, Ulrike; Hilbert, Kevin; Wittchen, Hans-Ulrich; Reif, Andreas; Hahn, Tim

    2015-01-01

    While neuroimaging research has advanced our knowledge about fear circuitry dysfunctions in anxiety disorders, findings based on diagnostic groups do not translate into diagnostic value for the individual patient. Machine-learning generates predictive information that can be used for single subject classification. We applied Gaussian process classifiers to a sample of patients with specific phobia as a model disorder for pathological forms of anxiety to test for classification based on structural MRI data. Gray (GM) and white matter (WM) volumetric data were analyzed in 33 snake phobics (SP; animal subtype), 26 dental phobics (DP; blood-injection-injury subtype) and 37 healthy controls (HC). Results showed good accuracy rates for GM and WM data in predicting phobia subtypes (GM: 62 % phobics vs. HC, 86 % DP vs. HC, 89 % SP vs. HC, 89 % DP vs. SP; WM: 88 % phobics vs. HC, 89 % DP vs. HC, 79 % SP vs. HC, 79 % DP vs. HC). Regarding GM, classification improved when considering the subtype compared to overall phobia status. The discriminatory brain pattern was not solely based on fear circuitry structures but included widespread cortico-subcortical networks. Results demonstrate that multivariate pattern recognition represents a promising approach for the development of neuroimaging-based diagnostic markers that could support clinical decisions. Regarding the increasing number of fMRI studies on anxiety disorders, researchers are encouraged to use functional and structural data not only for studying phenotype characteristics on a group level, but also to evaluate their incremental value for diagnostic or prognostic purposes.
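
    A minimal sketch of Gaussian process classification of two groups from high-dimensional volumetric features, using scikit-learn; the random data stand in for gray/white matter volumes and the RBF kernel is a common default, not necessarily the kernel used in the study.

```python
# Gaussian process classifier separating two synthetic groups of
# high-dimensional "volumetric" feature vectors.
import numpy as np
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_per_group, n_voxels = 30, 200
controls = rng.normal(0.0, 1.0, size=(n_per_group, n_voxels))
patients = rng.normal(0.3, 1.0, size=(n_per_group, n_voxels))   # shifted group
X = np.vstack([controls, patients])
y = np.array([0] * n_per_group + [1] * n_per_group)

gpc = GaussianProcessClassifier(kernel=1.0 * RBF(length_scale=10.0),
                                random_state=0)
print("cross-validated accuracy:", cross_val_score(gpc, X, y, cv=5).mean())
```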

  13. From expert judgement to supervised classification: a new approach to assess ecological status in lowland streams.

    PubMed

    Baattrup-Pedersen, Annette; Larsen, Søren E; Riis, Tenna

    2013-03-01

    The EC Water Framework Directive (WFD) clearly states that undisturbed reference states of aquatic ecosystems should be used to set standards for restoration. Across Europe defining biological reference status and setting boundaries for ecological status classes continues to represent a major challenge. In the present study we investigate if a paradigm exists among experts that can guide the development of assessment systems based on the normative definitions of ecological status classes of the WFD. Our main questions were: 1) Will experts from species abundance data and typology descriptors independently arrive at similar assessments of ecological status, and 2) Can the expert interpretation of ecological status be transferred into a statistical model allowing for a standardization of assessments from plant assemblages in lowland streams? We used a large dataset covering 1244 randomly distributed stream sites in Denmark and asked a group of experts to independently classify the sites using the WFD's normative definitions of ecological status. According to the combined expert group, no Danish stream sites belonged to the undisturbed reference state. For the remaining ecological status classes we found good concordance in the classification made by the five experts. From this we infer that a common paradigm does exist, which may guide the development of assessment methods for aquatic plants in lowland streams. We also found that the common view of the experts could be transferred into a supervised classification model that can serve as a classification tool for aquatic plant assemblages in lowland streams. We conclude that the combined use of experts and advanced multivariate statistics can provide a useful approach in the development of systems for assessment of ecological status in water types, where a reference network cannot be established.

  14. Classification of non native tree species in Adda Park (Italy) through multispectral and multitemporal surveys from UAV

    NASA Astrophysics Data System (ADS)

    Pinto, Livio; Sona, Giovanna; Biffi, Andrea; Dosso, Paolo; Passoni, Daniele; Baracani, Matteo

    2014-05-01

    The first set of flights, planned for July, was realized over a longer period, from 09/07/2013 to 28/08/2013, due to weather conditions and technical reasons. In any case, the vegetation characteristics remained unchanged. The second set of flights, in autumn, was completed in a shorter period, during the days of 16, 17 and 18 October 2013, thus obtaining even better homogeneity of the vegetation conditions. Image and data processing are based on standard classification techniques, both pixel- and object-based, applied simultaneously to multispectral and multitemporal data, with the aim of producing a thematic map of the species of interest. The classification accuracies will be computed on the basis of ground truth comparison, to study possible misclassification among species.

  15. Selection bias in species distribution models: An econometric approach on forest trees based on structural modeling

    NASA Astrophysics Data System (ADS)

    Martin-StPaul, N. K.; Ay, J. S.; Guillemot, J.; Doyen, L.; Leadley, P.

    2014-12-01

    Species distribution models (SDMs) are widely used to study and predict the outcome of global changes on species. In human-dominated ecosystems the presence of a given species is the result of both its ecological suitability and the human footprint on nature, such as land use choices. Land use choices may thus be responsible for a selection bias in the presence/absence data used in SDM calibration. We present a structural modelling approach (i.e. based on structural equation modelling) that accounts for this selection bias. The new structural species distribution model (SSDM) simultaneously estimates land use choices and species responses to bioclimatic variables. A land use equation based on an econometric model of landowner choices was joined to an equation of species response to bioclimatic variables. The SSDM allows the residuals of both equations to be dependent, taking into account the possibility of shared omitted variables and measurement errors. We provide a general description of the statistical theory and a set of applications on forest trees over France using databases of climate and forest inventory at different spatial resolutions (from 2 km to 8 km). We also compared the outputs of the SSDM with outputs of a classical SDM (i.e. Biomod ensemble modelling) in terms of bioclimatic response curves and potential distributions under current climate and climate change scenarios. The shapes of the bioclimatic response curves and the modelled species distribution maps differed markedly between the SSDM and classical SDMs, with contrasted patterns according to species and spatial resolutions. The magnitude and directions of these differences depended on the correlations between the errors from both equations and were highest for higher spatial resolutions. A first conclusion is that the use of classical SDMs can potentially lead to strong misestimation of the modelled actual and future probability of presence. Beyond this selection bias, the SSDM we propose represents

  16. Machine Learning Based Classification of Microsatellite Variation: An Effective Approach for Phylogeographic Characterization of Olive Populations.

    PubMed

    Torkzaban, Bahareh; Kayvanjoo, Amir Hossein; Ardalan, Arman; Mousavi, Soraya; Mariotti, Roberto; Baldoni, Luciana; Ebrahimie, Esmaeil; Ebrahimi, Mansour; Hosseini-Mazinani, Mehdi

    2015-01-01

    Finding efficient analytical techniques is increasingly becoming a bottleneck for making effective use of large biological datasets. Machine learning offers a novel and powerful tool to advance classification and modeling solutions in molecular biology. However, these methods have been less frequently used with empirical population genetics data. In this study, we developed a new combined data analysis approach, applying machine learning algorithms to microsatellite marker data from our previous studies of olive populations. Herein, 267 olive accessions of various origins, including 21 reference cultivars, 132 local ecotypes, and 37 wild olive specimens from the Iranian plateau, together with 77 of the most represented Mediterranean varieties, were investigated using a finely selected panel of 11 microsatellite markers. We organized data in two '4-targeted' and '16-targeted' experiments. A strategy of assaying different machine-based analyses (i.e. data cleaning, feature selection, and machine learning classification) was devised to identify the most informative loci and the most diagnostic alleles to represent the population and the geography of each olive accession. These analyses revealed the microsatellite markers with the highest differentiating capacity and demonstrated the efficiency of our method for clustering olive accessions according to their regions of origin. A highlight of this study was the discovery of the best combination of markers for differentiating populations via machine learning models, which can be exploited to distinguish among other biological populations.

  17. Land cover data from Landsat single-date archive imagery: an integrated classification approach

    NASA Astrophysics Data System (ADS)

    Bajocco, Sofia; Ceccarelli, Tomaso; Rinaldo, Simone; De Angelis, Antonella; Salvati, Luca; Perini, Luigi

    2012-10-01

    The analysis of land cover dynamics provides insight into many environmental problems. However, there are few data sources which can be used to derive consistent time series, remote sensing being one of the most valuable ones. Due to their multi-temporal and spatial coverage needs, such analyses are usually based on large land cover datasets, which require automated, objective and repeatable procedures. The USGS Landsat archives provide free access to multispectral, high-resolution remotely sensed data starting from the mid-eighties; in many cases, however, only single-date images are available. This paper suggests an objective approach for generating land cover information from 30 m resolution, single-date Landsat archive satellite imagery. A procedure was developed integrating pixel-based and object-oriented classifiers, which consists of the following basic steps: i) pre-processing of the satellite image, including radiance and reflectance calibration, texture analysis and derivation of vegetation indices, ii) segmentation of the pre-processed image, iii) its classification integrating both radiometric and textural properties. The integrated procedure was tested for an area in Sardinia Region, Italy, and compared with a purely pixel-based one. Results demonstrated that a better overall accuracy, evaluated against the available land cover cartography, was obtained with the integrated (86%) compared to the pixel-based classification (68%) at the first CORINE Land Cover level. The proposed methodology needs to be further tested for evaluating its transferability in time (constructing comparable land cover time series) and space (for covering larger areas).

  18. TransportTP: A two-phase classification approach for membrane transporter prediction and characterization

    PubMed Central

    2009-01-01

    Background Membrane transporters play crucial roles in living cells. Experimental characterization of transporters is costly and time-consuming. Current computational methods for transporter characterization still require extensive curation efforts, especially for eukaryotic organisms. We developed a novel genome-scale transporter prediction and characterization system called TransportTP that combined homology-based and machine learning methods in a two-phase classification approach. First, traditional homology methods were employed to predict novel transporters based on sequence similarity to known classified proteins in the Transporter Classification Database (TCDB). Second, machine learning methods were used to integrate a variety of features to refine the initial predictions. A set of rules based on transporter features was developed by machine learning using well-curated proteomes as guides. Results In a cross-validation using the yeast proteome for training and the proteomes of ten other organisms for testing, TransportTP achieved an equivalent recall and precision of 81.8%, based on TransportDB, a manually annotated transporter database. In an independent test using the Arabidopsis proteome for training and four recently sequenced plant proteomes for testing, it achieved a recall of 74.6% and a precision of 73.4%, according to our manual curation. Conclusions TransportTP is the most effective tool for eukaryotic transporter characterization up to date. PMID:20003433

  19. A High Throughput Ambient Mass Spectrometric Approach to Species Identification and Classification from Chemical Fingerprint Signatures

    PubMed Central

    Musah, Rabi A.; Espinoza, Edgard O.; Cody, Robert B.; Lesiak, Ashton D.; Christensen, Earl D.; Moore, Hannah E.; Maleknia, Simin; Drijfhout, Falko P.

    2015-01-01

    A high throughput method for species identification and classification through chemometric processing of direct analysis in real time (DART) mass spectrometry-derived fingerprint signatures has been developed. The method entails introduction of samples to the open air space between the DART ion source and the mass spectrometer inlet, with the entire observed mass spectral fingerprint subjected to unsupervised hierarchical clustering processing. A range of both polar and non-polar chemotypes are instantaneously detected. The result is identification and species level classification based on the entire DART-MS spectrum. Here, we illustrate how the method can be used to: (1) distinguish between endangered woods regulated by the Convention for the International Trade of Endangered Flora and Fauna (CITES) treaty; (2) assess the origin and by extension the properties of biodiesel feedstocks; (3) determine insect species from analysis of puparial casings; (4) distinguish between psychoactive plants products; and (5) differentiate between Eucalyptus species. An advantage of the hierarchical clustering approach to processing of the DART-MS derived fingerprint is that it shows both similarities and differences between species based on their chemotypes. Furthermore, full knowledge of the identities of the constituents contained within the small molecule profile of analyzed samples is not required. PMID:26156000
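
    A minimal sketch of the unsupervised hierarchical clustering step, assuming each sample is represented as a binned intensity vector; the spectra below are synthetic stand-ins for DART-MS fingerprints.

```python
# Hierarchical (Ward) clustering of binned mass-spectral fingerprints.
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

rng = np.random.default_rng(0)
n_bins = 500
# Three "species", each a noisy copy of its own reference fingerprint.
references = rng.uniform(0, 1, size=(3, n_bins))
spectra = np.vstack([ref + rng.normal(0, 0.05, size=(10, n_bins))
                     for ref in references])

Z = linkage(spectra, method="ward")          # hierarchical clustering tree
labels = fcluster(Z, t=3, criterion="maxclust")
print(labels)                                # samples of one species cluster together
```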

  20. High Throughput Ambient Mass Spectrometric Approach to Species Identification and Classification from Chemical Fingerprint Signatures

    SciTech Connect

    Musah, Rabi A.; Espinoza, Edgard O.; Cody, Robert B.; Lesiak, Ashton D.; Christensen, Earl D.; Moore, Hannah E.; Maleknia, Simin; Drijfhout, Falko P.

    2015-07-09

    A high throughput method for species identification and classification through chemometric processing of direct analysis in real time (DART) mass spectrometry-derived fingerprint signatures has been developed. The method entails introduction of samples to the open air space between the DART ion source and the mass spectrometer inlet, with the entire observed mass spectral fingerprint subjected to unsupervised hierarchical clustering processing. Moreover, a range of both polar and non-polar chemotypes are instantaneously detected. The result is identification and species level classification based on the entire DART-MS spectrum. In this paper, we illustrate how the method can be used to: (1) distinguish between endangered woods regulated by the Convention for the International Trade of Endangered Flora and Fauna (CITES) treaty; (2) assess the origin and by extension the properties of biodiesel feedstocks; (3) determine insect species from analysis of puparial casings; (4) distinguish between psychoactive plants products; and (5) differentiate between Eucalyptus species. An advantage of the hierarchical clustering approach to processing of the DART-MS derived fingerprint is that it shows both similarities and differences between species based on their chemotypes. Furthermore, full knowledge of the identities of the constituents contained within the small molecule profile of analyzed samples is not required.

  1. High Throughput Ambient Mass Spectrometric Approach to Species Identification and Classification from Chemical Fingerprint Signatures

    DOE PAGES

    Musah, Rabi A.; Espinoza, Edgard O.; Cody, Robert B.; Lesiak, Ashton D.; Christensen, Earl D.; Moore, Hannah E.; Maleknia, Simin; Drijfhout, Falko P.

    2015-07-09

    A high throughput method for species identification and classification through chemometric processing of direct analysis in real time (DART) mass spectrometry-derived fingerprint signatures has been developed. The method entails introduction of samples to the open air space between the DART ion source and the mass spectrometer inlet, with the entire observed mass spectral fingerprint subjected to unsupervised hierarchical clustering processing. Moreover, a range of both polar and non-polar chemotypes are instantaneously detected. The result is identification and species level classification based on the entire DART-MS spectrum. In this paper, we illustrate how the method can be used to: (1) distinguish between endangered woods regulated by the Convention for the International Trade of Endangered Flora and Fauna (CITES) treaty; (2) assess the origin and by extension the properties of biodiesel feedstocks; (3) determine insect species from analysis of puparial casings; (4) distinguish between psychoactive plant products; and (5) differentiate between Eucalyptus species. An advantage of the hierarchical clustering approach to processing of the DART-MS derived fingerprint is that it shows both similarities and differences between species based on their chemotypes. Furthermore, full knowledge of the identities of the constituents contained within the small molecule profile of analyzed samples is not required.

  2. A High Throughput Ambient Mass Spectrometric Approach to Species Identification and Classification from Chemical Fingerprint Signatures.

    PubMed

    Musah, Rabi A; Espinoza, Edgard O; Cody, Robert B; Lesiak, Ashton D; Christensen, Earl D; Moore, Hannah E; Maleknia, Simin; Drijfhout, Falko P

    2015-07-09

    A high throughput method for species identification and classification through chemometric processing of direct analysis in real time (DART) mass spectrometry-derived fingerprint signatures has been developed. The method entails introduction of samples to the open air space between the DART ion source and the mass spectrometer inlet, with the entire observed mass spectral fingerprint subjected to unsupervised hierarchical clustering processing. A range of both polar and non-polar chemotypes are instantaneously detected. The result is identification and species level classification based on the entire DART-MS spectrum. Here, we illustrate how the method can be used to: (1) distinguish between endangered woods regulated by the Convention for the International Trade of Endangered Flora and Fauna (CITES) treaty; (2) assess the origin and by extension the properties of biodiesel feedstocks; (3) determine insect species from analysis of puparial casings; (4) distinguish between psychoactive plants products; and (5) differentiate between Eucalyptus species. An advantage of the hierarchical clustering approach to processing of the DART-MS derived fingerprint is that it shows both similarities and differences between species based on their chemotypes. Furthermore, full knowledge of the identities of the constituents contained within the small molecule profile of analyzed samples is not required.

  3. A machine learning approach for classification of anatomical coverage in CT

    NASA Astrophysics Data System (ADS)

    Wang, Xiaoyong; Lo, Pechin; Ramakrishna, Bharath; Goldin, Johnathan; Brown, Matthew

    2016-03-01

    Automatic classification of anatomical coverage of medical images is critical for big data mining and as a pre-processing step to automatically trigger specific computer aided diagnosis systems. The traditional way to identify scans through DICOM headers has various limitations due to manual entry of series descriptions and non-standardized naming conventions. In this study, we present a machine learning approach where multiple binary classifiers were used to classify different anatomical coverages of CT scans. A one-vs-rest strategy was applied. For a given training set, a template scan was selected from the positive samples and all other scans were registered to it. Each registered scan was then evenly split into k × k × k non-overlapping blocks and for each block the mean intensity was computed. This resulted in a 1 × k³ feature vector for each scan. The feature vectors were then used to train an SVM-based classifier. In this feasibility study, four classifiers were built to identify anatomic coverages of brain, chest, abdomen-pelvis, and chest-abdomen-pelvis CT scans. Each classifier was trained and tested using a set of 300 scans from different subjects, composed of 150 positive samples and 150 negative samples. Area under the ROC curve (AUC) of the testing set was measured to evaluate the performance in a two-fold cross validation setting. Our results showed good classification performance with an average AUC of 0.96.
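
    The block-mean feature construction and a binary SVM can be sketched as follows; random volumes stand in for registered CT scans, and the brighter central region in the "positive" scans is an artificial signal so the example separates.

```python
# Split each volume into k x k x k non-overlapping blocks, use block means as
# a 1 x k**3 feature vector, and train a binary SVM for one anatomical coverage.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

def block_mean_features(volume, k=8):
    """Return the mean intensity of each of the k*k*k blocks of a volume."""
    z, y, x = volume.shape
    bz, by, bx = z // k, y // k, x // k
    trimmed = volume[:bz * k, :by * k, :bx * k]
    blocks = trimmed.reshape(k, bz, k, by, k, bx)
    return blocks.mean(axis=(1, 3, 5)).ravel()           # length k**3

rng = np.random.default_rng(0)

def fake_scan(positive):
    vol = rng.normal(0, 1, size=(64, 64, 64))
    if positive:
        vol[16:48, 16:48, 16:48] += 1.0                  # artificial signal
    return vol

X = np.array([block_mean_features(fake_scan(i < 150)) for i in range(300)])
y = np.array([1] * 150 + [0] * 150)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y,
                                                    random_state=0)

clf = SVC(kernel="rbf").fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```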

  4. A multimodal temporal panorama approach for moving vehicle detection, reconstruction, and classification

    NASA Astrophysics Data System (ADS)

    Wang, Tao; Zhu, Zhigang

    2012-06-01

    Moving vehicle detection and classification using multimodal data is a challenging task in data collection, audio-visual alignment, data labeling and feature selection under uncontrolled environments with occlusions, motion blurs, varying image resolutions and perspective distortions. In this work, we propose an effective multimodal temporal panorama (MTP) approach for the task using a novel long-range audio-visual sensing system. A new audio-visual vehicle (AVV) dataset for moving vehicle detection and classification is created, which features automatic vehicle detection and audio-visual alignment, accurate vehicle extraction and reconstruction, and efficient data labeling. In particular, vehicles' visual images are reconstructed once detected in order to remove most of the occlusions, motion blurs, and variations of perspective views. Multimodal audio-visual features are extracted, including global geometric features (aspect ratios, profiles), local structure features (HOGs), as well as various audio features (MFCCs, etc.). Using radial-based SVMs, the effectiveness of the integration of these multimodal features is thoroughly and systematically studied. The concept of the MTP is not limited to visual, motion and audio modalities; it could also be applicable to other sensing modalities that can obtain data in the temporal domain.

  5. A High Throughput Ambient Mass Spectrometric Approach to Species Identification and Classification from Chemical Fingerprint Signatures

    NASA Astrophysics Data System (ADS)

    Musah, Rabi A.; Espinoza, Edgard O.; Cody, Robert B.; Lesiak, Ashton D.; Christensen, Earl D.; Moore, Hannah E.; Maleknia, Simin; Drijfhout, Falko P.

    2015-07-01

    A high throughput method for species identification and classification through chemometric processing of direct analysis in real time (DART) mass spectrometry-derived fingerprint signatures has been developed. The method entails introduction of samples to the open air space between the DART ion source and the mass spectrometer inlet, with the entire observed mass spectral fingerprint subjected to unsupervised hierarchical clustering processing. A range of both polar and non-polar chemotypes are instantaneously detected. The result is identification and species level classification based on the entire DART-MS spectrum. Here, we illustrate how the method can be used to: (1) distinguish between endangered woods regulated by the Convention for the International Trade of Endangered Flora and Fauna (CITES) treaty; (2) assess the origin and by extension the properties of biodiesel feedstocks; (3) determine insect species from analysis of puparial casings; (4) distinguish between psychoactive plants products; and (5) differentiate between Eucalyptus species. An advantage of the hierarchical clustering approach to processing of the DART-MS derived fingerprint is that it shows both similarities and differences between species based on their chemotypes. Furthermore, full knowledge of the identities of the constituents contained within the small molecule profile of analyzed samples is not required.

  6. An effective band selection approach for classification in remote sensing imagery

    NASA Astrophysics Data System (ADS)

    Cukur, Hüseyin; Binol, Hamidullah; Uslu, Faruk S.; Bal, Abdullah

    2015-10-01

    Hyperspectral imagery (HSI) is a special imaging form characterized by high spectral resolution, with up to hundreds of very narrow and contiguous bands ranging from the visible to the infrared region. Since HSI contains more distinctive features than conventional images, its processing cost is very high. Dimensionality reduction has therefore become significant for classification performance. In this study, dimension reduction has been achieved via a VNS-based band selection method on hyperspectral images. This method is based on a systematic change of the neighborhood used in the search space. In order to improve the band selection performance, we apply a clustering technique based on mutual information (MI) before applying VNS. The combined technique is called MI-VNS. A Support Vector Machine (SVM) has been used as a classifier to evaluate the performance of the proposed band selection technique. The experimental results show that the MI-VNS approach increases classification performance and decreases computational time compared to using no band selection or conventional VNS.
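
    A minimal sketch of mutual-information-driven band selection feeding an SVM; the VNS search itself is not reproduced, a simple MI ranking stands in for it, and the data are synthetic stand-ins for hyperspectral pixels.

```python
# Rank bands by mutual information with the class labels, keep the top-ranked
# subset, and compare SVM accuracy against using all bands.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=1500, n_features=120, n_informative=15,
                           n_classes=5, random_state=0)

mi = mutual_info_classif(X, y, random_state=0)
selected = np.argsort(mi)[::-1][:20]                 # keep the 20 best bands

print("all bands:     ", cross_val_score(SVC(), X, y, cv=5).mean())
print("selected bands:", cross_val_score(SVC(), X[:, selected], y, cv=5).mean())
```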

  7. Fractal geometry-based classification approach for the recognition of lung cancer cells

    NASA Astrophysics Data System (ADS)

    Xia, Deshen; Gao, Wenqing; Li, Hua

    1994-05-01

    This paper describes a new fractal-geometry-based classification approach for the recognition of lung cancer cells, intended for use in health screening for lung cancer. Because cancer cells grow much faster and more irregularly than normal cells do, the shape of a segmented cancer cell is very irregular and can be considered a shape without a characteristic length. We use the texture energy intensity Rn for fractal preprocessing to segment the cells from the image and to calculate the fractal dimension value for extracting the fractal features, so that we obtain the shape characteristics of the different cancer cells and of normal cells, respectively. Fractal geometry gives us a correct description of cancer-cell shapes. Through this method, good recognition of adenoma, squamous and small cancer cells can be obtained.
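
    The fractal dimension estimate at the core of such an approach can be sketched with a box-counting computation; the binary mask below is a random stand-in for a segmented cell region, and the scale choices are arbitrary.

```python
# Box-counting estimate of the fractal dimension of a binary mask: count
# occupied boxes at several scales and fit log(count) against log(1/size).
import numpy as np

def box_counting_dimension(mask, sizes=(2, 4, 8, 16, 32)):
    counts = []
    for s in sizes:
        h, w = mask.shape
        trimmed = mask[:h - h % s, :w - w % s]
        blocks = trimmed.reshape(h // s, s, w // s, s)
        counts.append((blocks.sum(axis=(1, 3)) > 0).sum())   # occupied boxes
    # Slope of log(count) vs log(1/size) estimates the fractal dimension.
    slope, _ = np.polyfit(np.log(1.0 / np.array(sizes)), np.log(counts), 1)
    return slope

rng = np.random.default_rng(0)
mask = rng.random((256, 256)) > 0.7                  # hypothetical binary mask
print("estimated fractal dimension:", round(box_counting_dimension(mask), 2))
```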

  8. ISOLATING CONTENT AND METADATA FROM WEBLOGS USING CLASSIFICATION AND RULE-BASED APPROACHES

    SciTech Connect

    Marshall, Eric J.; Bell, Eric B.

    2011-09-04

    The emergence and increasing prevalence of social media, such as internet forums, weblogs (blogs), wikis, etc., has created a new opportunity to measure public opinion, attitude, and social structures. A major challenge in leveraging this information is isolating the content and metadata in weblogs, as there is no standard, universally supported, machine-readable format for presenting this information. We present two algorithms for isolating this information. The first uses web block classification, where each node in the Document Object Model (DOM) for a page is classified according to one of several pre-defined attributes from a common blog schema. The second uses a set of heuristics to select web blocks. These algorithms perform at a level suitable for initial use, validating this approach for isolating content and metadata from blogs. The resultant data serves as a starting point for analytical work on the content and substance of collections of weblog pages.

  9. Schizophrenia Detection and Classification by Advanced Analysis of EEG Recordings Using a Single Electrode Approach

    PubMed Central

    Dvey-Aharon, Zack; Fogelson, Noa; Peled, Avi; Intrator, Nathan

    2015-01-01

    Electroencephalographic (EEG) analysis has emerged as a powerful tool for brain state interpretation and diagnosis, but not for the diagnosis of mental disorders; this may be explained by its low spatial resolution or depth sensitivity. This paper concerns the diagnosis of schizophrenia using EEG, which currently suffers from several cardinal problems: it heavily depends on assumptions, conditions and prior knowledge regarding the patient. Additionally, the diagnostic experiments take hours, and the accuracy of the analysis is low or unreliable. This article presents the “TFFO” (Time-Frequency transformation followed by Feature-Optimization), a novel approach for schizophrenia detection showing great success in classification accuracy with no false positives. The methodology is designed for single electrode recording, and it attempts to make the data acquisition process feasible and quick for most patients. PMID:25837521

  10. Classification and surgical approaches for transnasal endoscopic skull base chordoma resection: a 6-year experience with 161 cases.

    PubMed

    Gui, Songbai; Zong, Xuyi; Wang, Xinsheng; Li, Chuzhong; Zhao, Peng; Cao, Lei; Zhang, Yazhuo

    2016-04-01

    The aim of this study is to retrospectively analyze 161 cases of surgically treated skull base chordoma, so as to summarize the clinical classification of this tumor and the surgical approaches for its treatment via transnasal endoscopic surgery. Between August 2007 and October 2013, a total of 161 patients (92 males and 69 females) undergoing surgical treatment of skull base chordoma were evaluated with regard to the clinical classification, surgical approach, and surgical efficacy. The tumor was located in the midline region of the skull base in 134 cases, and in the midline and paramedian regions in 27 cases (extensive type). Resection was performed via the transnasal endoscopic approach in 124 cases (77%), via the open cranial base approach in 11 cases (6.8%), and via staged resection combined with the transnasal endoscopic approach and open cranial base approach in 26 cases (16.2%). Total resection was achieved in 38 cases (23.6%); subtotal resection, 86 cases (53.4%); partial resection of 80-95%, 29 cases (18%); and partial resection <80%, 8 cases (5%). The clinical classification method used in this study seems suitable for selection of transnasal endoscopic surgical approach which may improve the resection degree and surgical efficacy of skull base chordoma. Gross total resection of skull base chordoma via endoscopic endonasal surgery (with addition of an open approach as needed) is a safe and viable alternative to the traditional open approach.

  11. Sensitivity of Bovine Tuberculosis Surveillance in Wildlife in France: A Scenario Tree Approach.

    PubMed

    Rivière, Julie; Le Strat, Yann; Dufour, Barbara; Hendrikx, Pascal

    2015-01-01

    Bovine tuberculosis (bTB) is a common disease in cattle and wildlife, with an impact on animal and human health, and economic implications. Infected wild animals have been detected in some European countries, and bTB reservoirs in wildlife have been identified, potentially hindering the eradication of bTB from cattle populations. However, the surveillance of bTB in wildlife involves several practical difficulties and is not currently covered by EU legislation. We report here the first assessment of the sensitivity of the bTB surveillance system for free-ranging wildlife launched in France in 2011 (the Sylvatub system), based on scenario tree modelling. Three surveillance system components were identified: (i) passive scanning surveillance for hunted wild boar, red deer and roe deer, based on carcass examination, (ii) passive surveillance on animals found dead, moribund or with abnormal behaviour, for wild boar, red deer, roe deer and badger and (iii) active surveillance for wild boar and badger. The application of these three surveillance system components depends on the geographic risk of bTB infection in wildlife, which in turn depends on the prevalence of bTB in cattle. We estimated the effectiveness of the three components of the Sylvatub surveillance system quantitatively, for each species separately. Active surveillance and passive scanning surveillance by carcass examination were the approaches most likely to detect at least one infected animal in a population with a given design prevalence, regardless of the local risk level and species considered. The awareness of hunters, which depends on their training and the geographic risk, was found to affect surveillance sensitivity. The results obtained are relevant for hunters and veterinary authorities wishing to determine the actual efficacy of wildlife bTB surveillance as a function of geographic area and species, and could provide support for decision-making processes concerning the enhancement of surveillance
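
    The scenario-tree calculation behind such sensitivity estimates can be sketched in a few lines: the component sensitivity is the probability of detecting at least one infected animal given a design prevalence and the detection probabilities along the tree branches. All numbers below are illustrative, not the Sylvatub estimates.

```python
# Scenario-tree component sensitivity with illustrative branch probabilities.
def component_sensitivity(design_prevalence, branch_probs, n_sampled):
    """Probability of detecting at least one infected animal in the component."""
    p_pathway = 1.0
    for p in branch_probs:
        p_pathway *= p                       # detection pathway through the tree
    p_detect_one = design_prevalence * p_pathway
    return 1.0 - (1.0 - p_detect_one) ** n_sampled

# Hypothetical branches for passive scanning surveillance of hunted wild boar:
# animal is hunted, carcass is examined, lesion is noticed, sample is confirmed.
branches = [0.6, 0.8, 0.4, 0.9]
print(component_sensitivity(design_prevalence=0.01, branch_probs=branches,
                            n_sampled=2000))
```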

  13. Classification of boreal forest by satellite and inventory data using neural network approach

    NASA Astrophysics Data System (ADS)

    Romanov, A. A.

    2012-12-01

    The main objective of this research was to develop a methodology for boreal (Siberian taiga) land cover classification with a high level of accuracy. The study area covers several parts of Central Siberia along the Yenisei River (60-62 degrees north latitude): the right bank includes mixed forest and dark taiga, the left bank pine forests; these were taken as highly heterogeneous but statistically comparable surfaces in terms of spectral characteristics. Two main types of data were used: time series of medium spatial resolution satellite images (Landsat 5, 7 and SPOT 4) and inventory datasets from fieldwork (used for preparing training sample sets). The field data collection method included a short botanical description (type/species of vegetation, density, crown compactness, individual height and representative max/min diameters of each type, surface altitude of the plot); the geometric characteristics of each training sample unit corresponded to the spatial resolution of the satellite images and were geo-referenced (datasets were prepared both for preliminary processing and for verification). The network of test plots was planned as irregular and determined by a landscape-oriented approach. The thematic data processing focused on the use of neural networks (including fuzzy logic); the field study results were therefore converted into input parameters describing the type/species of vegetation cover of each unit and its degree of variability. The proposed approach processes the time series separately for each image, mainly for verification: acquisition parameters (time, albedo) are taken into consideration and used to assess the quality of the mapping. The input variables for the networks were the sensor bands, surface altitude, solar angles and land surface temperature (for a few experiments); attention was also given to the formation of the class formula on the basis of statistical pre-processing of results of

  14. Clinical features of organophosphate poisoning: A review of different classification systems and approaches

    PubMed Central

    Peter, John Victor; Sudarsan, Thomas Isiah; Moran, John L.

    2014-01-01

    Purpose: The typical toxidrome in organophosphate (OP) poisoning comprises the Salivation, Lacrimation, Urination, Defecation, Gastric cramps, Emesis (SLUDGE) symptoms. However, several other manifestations are described. We review the spectrum of symptoms and signs in OP poisoning as well as the different approaches to clinical features in these patients. Materials and Methods: Articles were obtained by electronic search of PubMed® between 1966 and April 2014 using the search terms organophosphorus compounds or phosphoric acid esters AND poison or poisoning AND manifestations. Results: Of the 5026 articles on OP poisoning, 2584 articles pertained to human poisoning; 452 articles focusing on clinical manifestations in human OP poisoning were retrieved for detailed evaluation. In addition to the traditional approach of grouping symptoms and signs of OP poisoning as peripheral (muscarinic, nicotinic) and central nervous system receptor stimulation, symptoms were alternatively approached using a time-based classification. In this, symptom onset was categorized as acute (within 24 h), delayed (24 h to 2 weeks) or late (beyond 2 weeks). Although most symptoms occur within minutes or hours following acute exposure, delayed-onset symptoms, occurring after a period of minimal or mild symptoms, may impact treatment and the timing of discharge following acute exposure. Symptoms and signs were also viewed as organ-specific, i.e. as cardiovascular, respiratory or neurological manifestations. An organ-specific approach enables focused management of individual organ dysfunction that may vary with different OP compounds. Conclusions: Different approaches to the symptoms and signs in OP poisoning may better our understanding of the underlying mechanism that in turn may assist with the management of acutely poisoned patients. PMID:25425841

  15. A support vector machine approach for classification of welding defects from ultrasonic signals

    NASA Astrophysics Data System (ADS)

    Chen, Yuan; Ma, Hong-Wei; Zhang, Guang-Ming

    2014-07-01

    Defect classification is an important issue in ultrasonic non-destructive evaluation. A layered multi-class support vector machine (LMSVM) classification system, which combines multiple SVM classifiers through a layered architecture, is proposed in this paper. The proposed LMSVM classification system is applied to the classification of welding defects from ultrasonic test signals. The measured ultrasonic defect echo signals are first decomposed into wavelet coefficients by the wavelet packet transform. The energies of the wavelet coefficients in different frequency channels are used to construct the feature vectors. The bees algorithm (BA) is then used for feature selection and SVM parameter optimisation for the LMSVM classification system. The BA-based feature selection optimises the energy feature vectors. The optimised feature vectors are input to the LMSVM classification system for training and testing. Experimental results of classifying welding defects demonstrate that the proposed technique is highly robust, precise and reliable for ultrasonic defect classification.
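
    A minimal sketch of the feature pipeline: per-band wavelet energies feed an SVM. A plain multilevel DWT (pywt.wavedec) stands in for the wavelet packet transform, the bees-algorithm optimisation and the layered SVM architecture are not reproduced, and the echo signals are synthetic.

```python
# Wavelet-energy features from simulated ultrasonic echoes, classified by SVM.
import numpy as np
import pywt
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

def band_energies(signal, wavelet="db4", level=4):
    """Energy of each wavelet sub-band of a 1D signal."""
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    return np.array([np.sum(c ** 2) for c in coeffs])

rng = np.random.default_rng(0)

def fake_echo(defect_class, n=1024):
    t = np.arange(n)
    freq = 0.05 + 0.03 * defect_class                   # class-dependent content
    return np.sin(2 * np.pi * freq * t) * np.exp(-t / 400) + rng.normal(0, 0.3, n)

X = np.array([band_energies(fake_echo(c)) for c in (0, 1, 2) for _ in range(60)])
y = np.repeat([0, 1, 2], 60)
print("cross-validated accuracy:", cross_val_score(SVC(), X, y, cv=5).mean())
```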

  16. Assessment of the classification abilities of the CNS multi-parametric optimization approach by the method of logistic regression.

    PubMed

    Raevsky, O A; Polianczyk, D E; Mukhametov, A; Grigorev, V Y

    2016-08-01

    Assessment of "CNS drugs/CNS candidates" classification abilities of the multi-parametric optimization (CNS MPO) approach was performed by logistic regression. It was found that five of the six separately used physical-chemical properties (topological polar surface area, number of hydrogen-bonded donor atoms, basicity, lipophilicity of the compound in neutral form and at pH = 7.4) provided recognition accuracy below 60%. Only the descriptor of molecular weight (MW) could correctly classify two-thirds of the studied compounds. Aggregation of all six properties in the MPOscore did not improve the classification, which was worse than the classification using only MW. The results of our study demonstrate the imperfection of the CNS MPO approach; in its current form it is not very useful for computer design of new, effective CNS drugs. PMID:27477321
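
    The fragment below is a minimal, hypothetical-data sketch of the assessment idea: fit a logistic regression of the CNS/non-CNS label on a single descriptor such as molecular weight and report the recognition accuracy; the distributions used here are invented for illustration only.

        import numpy as np
        from sklearn.linear_model import LogisticRegression
        from sklearn.metrics import accuracy_score

        rng = np.random.default_rng(2)
        mw = np.concatenate([rng.normal(310, 60, 300),    # hypothetical CNS drugs
                             rng.normal(420, 90, 300)])   # hypothetical non-CNS candidates
        y = np.concatenate([np.ones(300), np.zeros(300)])

        model = LogisticRegression().fit(mw.reshape(-1, 1), y)
        print("recognition accuracy:", accuracy_score(y, model.predict(mw.reshape(-1, 1))))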

  17. A dual neural network ensemble approach for multiclass brain tumor classification.

    PubMed

    Sachdeva, Jainy; Kumar, Vinod; Gupta, Indra; Khandelwal, Niranjan; Ahuja, Chirag Kamal

    2012-11-01

    The present study is conducted to develop an interactive computer aided diagnosis (CAD) system for assisting radiologists in multiclass classification of brain tumors. In this paper, primary brain tumors such as astrocytoma, glioblastoma multiforme, childhood tumor-medulloblastoma, meningioma and secondary tumor-metastases along with normal regions are classified by a dual level neural network ensemble. Two hundred eighteen texture and intensity features are extracted from 856 segmented regions of interest (SROIs) and are taken as input. PCA is used for reduction of dimensionality of the feature space. The study is performed on a diversified dataset of 428 post contrast T1-weighted magnetic resonance images of 55 patients. Two sets of experiments are performed. In the first experiment, random selection is used which may allow SROIs from the same patient having similar characteristics to appear in both training and testing simultaneously. In the second experiment, not even a single SROI from the same patient is common during training and testing. In the first experiment, it is observed that the dual level neural network ensemble has enhanced the overall accuracy to 95.85% compared with 91.97% for a single-level artificial neural network. The proposed method delivers high accuracy for each class. The accuracy obtained for each class is: astrocytoma 96.29%, glioblastoma multiforme 96.15%, childhood tumor-medulloblastoma 90%, meningioma 93.00%, secondary tumor-metastases 96.67% and normal regions 97.41%. This study reveals that the dual level neural network ensemble provides better results than the single level artificial neural network. In the second experiment, an overall classification accuracy of 90.4% was achieved. The generalization ability of this approach can be tested by analyzing larger datasets. The extensive training will also further improve the performance of the proposed dual network ensemble. Quantitative results obtained from the proposed method will assist the
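
    The second experiment hinges on keeping all regions from one patient on the same side of the split. The sketch below shows one way to do that in Python with a group-aware split, PCA reduction and a simple network; the feature matrix, labels and patient IDs are random placeholders, and the classifier is a plain MLP rather than the paper's dual-level ensemble.

        import numpy as np
        from sklearn.decomposition import PCA
        from sklearn.model_selection import GroupShuffleSplit
        from sklearn.neural_network import MLPClassifier
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import StandardScaler

        rng = np.random.default_rng(3)
        X = rng.random((856, 218))            # 218 texture/intensity features per SROI
        y = rng.integers(0, 6, 856)           # six tumour/normal classes
        patient = rng.integers(0, 55, 856)    # patient ID of each SROI

        train_idx, test_idx = next(GroupShuffleSplit(test_size=0.3, random_state=0)
                                   .split(X, y, groups=patient))
        clf = make_pipeline(StandardScaler(), PCA(n_components=30),
                            MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0))
        clf.fit(X[train_idx], y[train_idx])
        print("patient-disjoint accuracy:", clf.score(X[test_idx], y[test_idx]))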

  18. Automatic approach to solve the morphological galaxy classification problem using the sparse representation technique and dictionary learning

    NASA Astrophysics Data System (ADS)

    Diaz-Hernandez, R.; Ortiz-Esquivel, A.; Peregrina-Barreto, H.; Altamirano-Robles, L.; Gonzalez-Bernal, J.

    2016-06-01

    The observation of celestial objects in the sky is a practice that helps astronomers to understand the way in which the Universe is structured. However, due to the large number of objects observed with modern telescopes, analyzing them by hand is a difficult task. An important part in galaxy research is the morphological structure classification based on the Hubble sequence. In this research, we present an approach to solve the morphological galaxy classification problem in an automatic way by using the Sparse Representation technique and dictionary learning with K-SVD. For the tests in this work, we use a database of galaxies extracted from the Principal Galaxy Catalog (PGC) and the APM Equatorial Catalogue of Galaxies obtaining a total of 2403 useful galaxies. In order to represent each galaxy frame, we propose to calculate a set of 20 features such as Hu's invariant moments, galaxy nucleus eccentricity, Gabor galaxy ratio and some other features commonly used in galaxy classification. A stage of feature relevance analysis was performed using Relief-f in order to determine which are the best parameters for the classification tests using 2, 3, 4, 5, 6 and 7 galaxy classes, forming signal vectors of different lengths from the most important features. For the classification task, we use a 20-random cross-validation technique to evaluate classification accuracy with all signal sets achieving a score of 82.27 % for 2 galaxy classes and up to 44.27 % for 7 galaxy classes.
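
    A minimal sketch of sparse-representation classification is given below: one dictionary is learned per class and a sample is assigned to the class whose dictionary reconstructs it with the smallest residual. It uses scikit-learn's dictionary learner as a stand-in for K-SVD, and the 20-dimensional feature vectors are random placeholders.

        import numpy as np
        from sklearn.decomposition import DictionaryLearning

        rng = np.random.default_rng(4)
        n_feat = 20                                   # e.g. Hu moments, eccentricity, Gabor ratio, ...
        X_train = {c: rng.random((120, n_feat)) + c for c in range(3)}   # 3 hypothetical classes

        dicts = {c: DictionaryLearning(n_components=10, transform_algorithm="omp",
                                       transform_n_nonzero_coefs=5, max_iter=50,
                                       random_state=0).fit(Xc)
                 for c, Xc in X_train.items()}

        def classify(x):
            # assign x to the class whose dictionary gives the smallest reconstruction residual
            residuals = {c: np.linalg.norm(x - dl.transform(x.reshape(1, -1)) @ dl.components_)
                         for c, dl in dicts.items()}
            return min(residuals, key=residuals.get)

        print("predicted class:", classify(rng.random(n_feat) + 2))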

  19. An Abstract Description Approach to the Discovery and Classification of Bioinformatics Web Sources

    SciTech Connect

    Rocco, D; Critchlow, T J

    2003-05-01

    The World Wide Web provides an incredible resource to genomics researchers in the form of dynamic data sources--e.g. BLAST sequence homology search interfaces. The growth rate of these sources outpaces the speed at which they can be manually classified, meaning that the available data is not being utilized to its full potential. Existing research has not addressed the problems of automatically locating, classifying, and integrating classes of bioinformatics data sources. This paper presents an overview of a system for finding classes of bioinformatics data sources and integrating them behind a unified interface. We examine an approach to classifying these sources automatically that relies on an abstract description format: the service class description. This format allows a domain expert to describe the important features of an entire class of services without tying that description to any particular Web source. We present the features of this description format in the context of BLAST sources to show how the service class description relates to Web sources that are being described. We then show how a service class description can be used to classify an arbitrary Web source to determine if that source is an instance of the described service. To validate the effectiveness of this approach, we have constructed a prototype that can correctly classify approximately two-thirds of the BLAST sources we tested. We then examine these results, consider the factors that affect correct automatic classification, and discuss future work.

  20. An Effective Big Data Supervised Imbalanced Classification Approach for Ortholog Detection in Related Yeast Species.

    PubMed

    Galpert, Deborah; Del Río, Sara; Herrera, Francisco; Ancede-Gallardo, Evys; Antunes, Agostinho; Agüero-Chapin, Guillermin

    2015-01-01

    Orthology detection requires more effective scaling algorithms. In this paper, a set of gene pair features based on similarity measures (alignment scores, sequence length, gene membership to conserved regions, and physicochemical profiles) are combined in a supervised pairwise ortholog detection approach to improve effectiveness considering low ortholog ratios in relation to the possible pairwise comparison between two genomes. In this scenario, big data supervised classifiers managing imbalance between ortholog and nonortholog pair classes allow for an effective scaling solution built from two genomes and extended to other genome pairs. The supervised approach was compared with RBH, RSD, and OMA algorithms by using the following yeast genome pairs: Saccharomyces cerevisiae-Kluyveromyces lactis, Saccharomyces cerevisiae-Candida glabrata, and Saccharomyces cerevisiae-Schizosaccharomyces pombe as benchmark datasets. Because of the large amount of imbalanced data, the building and testing of the supervised model were only possible by using big data supervised classifiers managing imbalance. Evaluation metrics taking low ortholog ratios into account were applied. From the effectiveness perspective, MapReduce Random Oversampling combined with Spark SVM outperformed RBH, RSD, and OMA, probably because of the consideration of gene pair features beyond alignment similarities combined with the advances in big data supervised classification. PMID:26605337
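
    As a minimal single-machine sketch of the imbalance treatment (standing in for the MapReduce/Spark pipeline, with fabricated gene-pair features), the fragment below randomly oversamples the rare ortholog class before fitting a linear SVM and then reports minority-class recall.

        import numpy as np
        from sklearn.svm import LinearSVC
        from sklearn.metrics import recall_score

        rng = np.random.default_rng(5)
        X_min = rng.normal(1.0, 1.0, (50, 4))      # ortholog pairs (rare class)
        X_maj = rng.normal(0.0, 1.0, (5000, 4))    # non-ortholog pairs
        X = np.vstack([X_min, X_maj])
        y = np.array([1] * 50 + [0] * 5000)

        # random oversampling: resample the minority class with replacement up to majority size
        idx = rng.choice(np.where(y == 1)[0], size=int((y == 0).sum()), replace=True)
        X_bal = np.vstack([X[y == 0], X[idx]])
        y_bal = np.array([0] * int((y == 0).sum()) + [1] * len(idx))

        clf = LinearSVC(max_iter=5000).fit(X_bal, y_bal)
        print("minority-class recall:", recall_score(y, clf.predict(X), pos_label=1))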

  1. Assessment of Sampling Approaches for Remote Sensing Image Classification in the Iranian Playa Margins

    NASA Astrophysics Data System (ADS)

    Kazem Alavipanah, Seyed

    There are some problems in soil salinity studies based upon remotely sensed data: 1) the spectral world is full of ambiguity and therefore soil reflectance cannot be attributed to a single soil property such as salinity, 2) soil surface conditions as a function of time and space are a complex phenomenon, 3) vegetation with a dynamic biological nature may create some problems in the study of soil salinity. Due to these problems the first question which may arise is how to overcome or minimise these problems. In this study we hypothesised that different sources of data, a well-established sampling plan and an optimum approach could be useful. In order to choose representative training sites in the Iranian playa margins, to define the spectral and informational classes and to overcome some problems encountered in the variation within the field, the following attempts were made: 1) Principal Component Analysis (PCA) in order: a) to determine the most important variables, b) to understand the Landsat satellite images and the most informative components, 2) the photomorphic unit (PMU) consideration and interpretation; 3) study of salt accumulation and salt distribution in the soil profile, 4) use of several forms of field data, such as geologic, geomorphologic and soil information; 6) confirmation of field data and land cover types with farmers and the members of the team. The results led us to suitable approaches with high and acceptable image classification accuracy and image interpretation. KEY WORDS: Photomorphic Unit, Principal Component Analysis, Soil Salinity, Field Work, Remote Sensing

  2. An Effective Big Data Supervised Imbalanced Classification Approach for Ortholog Detection in Related Yeast Species

    PubMed Central

    Galpert, Deborah; del Río, Sara; Herrera, Francisco; Ancede-Gallardo, Evys; Antunes, Agostinho; Agüero-Chapin, Guillermin

    2015-01-01

    Orthology detection requires more effective scaling algorithms. In this paper, a set of gene pair features based on similarity measures (alignment scores, sequence length, gene membership to conserved regions, and physicochemical profiles) are combined in a supervised pairwise ortholog detection approach to improve effectiveness considering low ortholog ratios in relation to the possible pairwise comparison between two genomes. In this scenario, big data supervised classifiers managing imbalance between ortholog and nonortholog pair classes allow for an effective scaling solution built from two genomes and extended to other genome pairs. The supervised approach was compared with RBH, RSD, and OMA algorithms by using the following yeast genome pairs: Saccharomyces cerevisiae-Kluyveromyces lactis, Saccharomyces cerevisiae-Candida glabrata, and Saccharomyces cerevisiae-Schizosaccharomyces pombe as benchmark datasets. Because of the large amount of imbalanced data, the building and testing of the supervised model were only possible by using big data supervised classifiers managing imbalance. Evaluation metrics taking low ortholog ratios into account were applied. From the effectiveness perspective, MapReduce Random Oversampling combined with Spark SVM outperformed RBH, RSD, and OMA, probably because of the consideration of gene pair features beyond alignment similarities combined with the advances in big data supervised classification. PMID:26605337

  3. An Effective Big Data Supervised Imbalanced Classification Approach for Ortholog Detection in Related Yeast Species.

    PubMed

    Galpert, Deborah; Del Río, Sara; Herrera, Francisco; Ancede-Gallardo, Evys; Antunes, Agostinho; Agüero-Chapin, Guillermin

    2015-01-01

    Orthology detection requires more effective scaling algorithms. In this paper, a set of gene pair features based on similarity measures (alignment scores, sequence length, gene membership to conserved regions, and physicochemical profiles) are combined in a supervised pairwise ortholog detection approach to improve effectiveness considering low ortholog ratios in relation to the possible pairwise comparison between two genomes. In this scenario, big data supervised classifiers managing imbalance between ortholog and nonortholog pair classes allow for an effective scaling solution built from two genomes and extended to other genome pairs. The supervised approach was compared with RBH, RSD, and OMA algorithms by using the following yeast genome pairs: Saccharomyces cerevisiae-Kluyveromyces lactis, Saccharomyces cerevisiae-Candida glabrata, and Saccharomyces cerevisiae-Schizosaccharomyces pombe as benchmark datasets. Because of the large amount of imbalanced data, the building and testing of the supervised model were only possible by using big data supervised classifiers managing imbalance. Evaluation metrics taking low ortholog ratios into account were applied. From the effectiveness perspective, MapReduce Random Oversampling combined with Spark SVM outperformed RBH, RSD, and OMA, probably because of the consideration of gene pair features beyond alignment similarities combined with the advances in big data supervised classification.

  4. Single event and TREE latchup mitigation for a star tracker sensor: An innovative approach to system level latchup mitigation

    SciTech Connect

    Kimbrough, J.R.; Colella, N.J.; Davis, R.W.; Bruener, D.B.; Coakley, P.G.; Lutjens, S.W.; Mallon, C.E.

    1994-08-01

    Electronic packages designed for spacecraft should be fault-tolerant and operate without ground control intervention through extremes in the space radiation environment. If designed for military use, the electronics must survive and function in a nuclear radiation environment. This paper presents an innovative "blink" approach rather than the typical "operate through" approach to achieve system level latchup mitigation on a prototype star tracker camera. Included are circuit designs, flash x-ray test data, and heavy ion data demonstrating latchup mitigation protecting micro-electronics from current latchup and burnout due to Single Event Latchup (SEL) and Transient Radiation Effects on Electronics (TREE).

  5. Buildings classification from airborne LiDAR point clouds through OBIA and ontology driven approach

    NASA Astrophysics Data System (ADS)

    Tomljenovic, Ivan; Belgiu, Mariana; Lampoltshammer, Thomas J.

    2013-04-01

    In the last years, airborne Light Detection and Ranging (LiDAR) data proved to be a valuable information resource for a vast number of applications ranging from land cover mapping to individual surface feature extraction from complex urban environments. To extract information from LiDAR data, users apply prior knowledge. Unfortunately, there is no consistent initiative for structuring this knowledge into data models that can be shared and reused across different applications and domains. The absence of such models poses great challenges to data interpretation, data fusion and integration as well as information transferability. The intention of this work is to describe the design, development and deployment of an ontology-based system to classify buildings from airborne LiDAR data. The novelty of this approach consists of the development of a domain ontology that specifies explicitly the knowledge used to extract features from airborne LiDAR data. The overall goal of this approach is to investigate the possibility for classification of features of interest from LiDAR data by means of domain ontology. The proposed workflow is applied to the building extraction process for the region of "Biberach an der Riss" in South Germany. Strip-adjusted and georeferenced airborne LiDAR data is processed based on geometrical and radiometric signatures stored within the point cloud. Region-growing segmentation algorithms are applied and segmented regions are exported to the GeoJSON format. Subsequently, the data is imported into the ontology-based reasoning process used to automatically classify exported features of interest. Based on the ontology it becomes possible to define domain concepts, associated properties and relations. As a consequence, the resulting specific body of knowledge restricts possible interpretation variants. Moreover, ontologies are machinable and thus it is possible to run reasoning on top of them. Available reasoners (FACT++, JESS, Pellet) are used to check

  6. Molecular Property eXplorer: a novel approach to visualizing SAR using tree-maps and heatmaps.

    PubMed

    Kibbey, Christopher; Calvet, Alain

    2005-01-01

    The tremendous increase in chemical structure and biological activity data brought about through combinatorial chemistry and high-throughput screening technologies has created the need for sophisticated graphical tools for visualizing and exploring structure-activity data. Visualization plays an important role in exploring and understanding relationships within such multidimensional data sets. Many chemoinformatics software applications apply standard clustering techniques to organize structure-activity data, but they differ significantly in their approaches to visualizing clustered data. Molecular Property eXplorer (MPX) is unique in its presentation of clustered data in the form of heatmaps and tree-maps. MPX employs agglomerative hierarchical clustering to organize data on the basis of the similarity between 2D chemical structures or similarity across a predefined profile of biological assay values. Visualization of hierarchical clusters as tree-maps and heatmaps provides simultaneous representation of cluster members along with their associated assay values. Tree-maps convey both the spatial relationship among cluster members and the value of a single property (activity) associated with each member. Heatmaps provide visualization of the cluster members across an activity profile. Unlike a tree-map, however, a heatmap does not convey the spatial relationship between cluster members. MPX seamlessly integrates tree-maps and heatmaps to represent multidimensional structure-activity data in a visually intuitive manner. In addition, MPX provides tools for clustering data on the basis of chemical structure or activity profile, displaying 2D chemical structures, and querying the data over a specified activity range or set of chemical structure criteria (e.g., Tanimoto similarity, substructure match, and "R-group" analysis).
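
    The fragment below is a small illustration of the clustering step only (not MPX itself): agglomerative hierarchical clustering of compounds from Tanimoto distances between binary structure fingerprints; the fingerprints are random bit vectors used purely as placeholders.

        import numpy as np
        from scipy.cluster.hierarchy import fcluster, linkage
        from scipy.spatial.distance import pdist

        rng = np.random.default_rng(6)
        fps = rng.integers(0, 2, (30, 64))            # hypothetical binary structure fingerprints

        def tanimoto_distance(a, b):
            a, b = np.asarray(a, bool), np.asarray(b, bool)
            union = np.logical_or(a, b).sum()
            return 1.0 - (np.logical_and(a, b).sum() / union if union else 1.0)

        dist = pdist(fps, metric=tanimoto_distance)   # condensed pairwise distance matrix
        tree = linkage(dist, method="average")        # agglomerative hierarchical clustering
        clusters = fcluster(tree, t=5, criterion="maxclust")
        print("cluster assignment of the 30 compounds:", clusters)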

  7. A discriminative model-constrained EM approach to 3D MRI brain tissue classification and intensity non-uniformity correction

    NASA Astrophysics Data System (ADS)

    Wels, Michael; Zheng, Yefeng; Huber, Martin; Hornegger, Joachim; Comaniciu, Dorin

    2011-06-01

    We describe a fully automated method for tissue classification, which is the segmentation into cerebral gray matter (GM), cerebral white matter (WM), and cerebral spinal fluid (CSF), and intensity non-uniformity (INU) correction in brain magnetic resonance imaging (MRI) volumes. It combines supervised MRI modality-specific discriminative modeling and unsupervised statistical expectation maximization (EM) segmentation into an integrated Bayesian framework. While both the parametric observation models and the non-parametrically modeled INUs are estimated via EM during segmentation itself, a Markov random field (MRF) prior model regularizes segmentation and parameter estimation. Firstly, the regularization takes into account knowledge about spatial and appearance-related homogeneity of segments in terms of pairwise clique potentials of adjacent voxels. Secondly and more importantly, patient-specific knowledge about the global spatial distribution of brain tissue is incorporated into the segmentation process via unary clique potentials. They are based on a strong discriminative model provided by a probabilistic boosting tree (PBT) for classifying image voxels. It relies on the surrounding context and alignment-based features derived from a probabilistic anatomical atlas. The context considered is encoded by 3D Haar-like features of reduced INU sensitivity. Alignment is carried out fully automatically by means of an affine registration algorithm minimizing cross-correlation. Both types of features do not immediately use the observed intensities provided by the MRI modality but instead rely on specifically transformed features, which are less sensitive to MRI artifacts. Detailed quantitative evaluations on standard phantom scans and standard real-world data show the accuracy and robustness of the proposed method. They also demonstrate relative superiority in comparison to other state-of-the-art approaches to this kind of computational task: our method achieves average
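
    As a stripped-down illustration of the unsupervised core only, the sketch below runs EM on a three-component Gaussian mixture over synthetic voxel intensities; the paper's method additionally couples this with a PBT discriminative prior, MRF regularization and INU estimation, none of which are reproduced here.

        import numpy as np
        from sklearn.mixture import GaussianMixture

        rng = np.random.default_rng(7)
        # synthetic 1-D intensity samples standing in for CSF, GM and WM voxels
        intensities = np.concatenate([rng.normal(30, 5, 4000),
                                      rng.normal(80, 7, 6000),
                                      rng.normal(120, 6, 5000)]).reshape(-1, 1)

        gmm = GaussianMixture(n_components=3, covariance_type="full", random_state=0)
        labels = gmm.fit_predict(intensities)         # EM estimates means, variances and weights
        print("estimated class means:", np.sort(gmm.means_.ravel()))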

  8. Contrasting regional and national mechanisms for predicting elevated arsenic in private wells across the United States using classification and regression trees.

    PubMed

    Frederick, Logan; VanDerslice, James; Taddie, Marissa; Malecki, Kristen; Gregg, Josh; Faust, Nicholas; Johnson, William P

    2016-03-15

    Arsenic contamination in groundwater is a public health and environmental concern in the United States (U.S.) particularly where monitoring is not required under the Safe Water Drinking Act. Previous studies suggest the influence of regional mechanisms for arsenic mobilization into groundwater; however, no study has examined how influencing parameters change at a continental scale spanning multiple regions. We herein examine covariates for groundwater in the western, central and eastern U.S. regions representing mechanisms associated with arsenic concentrations exceeding the U.S. Environmental Protection Agency maximum contamination level (MCL) of 10 parts per billion (ppb). Statistically significant covariates were identified via classification and regression tree (CART) analysis, and included hydrometeorological and groundwater chemical parameters. The CART analyses were performed at two scales: national and regional; for which three physiographic regions located in the western (Payette Section and the Snake River Plain), central (Osage Plains of the Central Lowlands), and eastern (Embayed Section of the Coastal Plains) U.S. were examined. Validity of each of the three regional CART models was indicated by values >85% for the area under the receiver-operating characteristic curve. Aridity (precipitation minus potential evapotranspiration) was identified as the primary covariate associated with elevated arsenic at the national scale. At the regional scale, aridity and pH were the major covariates in the arid to semi-arid (western) region; whereas dissolved iron (taken to represent chemically reducing conditions) and pH were major covariates in the temperate (eastern) region, although additional important covariates emerged, including elevated phosphate. Analysis in the central U.S. region indicated that elevated arsenic concentrations were driven by a mixture of those observed in the western and eastern regions.
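
    A minimal CART sketch in the spirit of the analysis above is shown below, with entirely fabricated well records: a shallow decision tree predicts whether arsenic exceeds the 10 ppb MCL from aridity, pH and dissolved iron, and is scored by the area under the ROC curve.

        import numpy as np
        from sklearn.tree import DecisionTreeClassifier
        from sklearn.model_selection import train_test_split
        from sklearn.metrics import roc_auc_score

        rng = np.random.default_rng(8)
        n = 1500
        aridity = rng.normal(0, 1, n)            # precipitation minus potential evapotranspiration
        ph = rng.normal(7.2, 0.6, n)
        iron = rng.lognormal(0, 1, n)            # dissolved Fe as a redox indicator
        X = np.column_stack([aridity, ph, iron])
        # fabricated rule: drier, higher-pH or more reducing wells exceed the MCL more often
        p = 1 / (1 + np.exp(-(-1.5 - 1.2 * aridity + 0.8 * (ph - 7) + 0.4 * np.log(iron))))
        y = rng.random(n) < p

        X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
        cart = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X_tr, y_tr)
        print("ROC AUC:", roc_auc_score(y_te, cart.predict_proba(X_te)[:, 1]))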

  9. Contrasting regional and national mechanisms for predicting elevated arsenic in private wells across the United States using classification and regression trees.

    PubMed

    Frederick, Logan; VanDerslice, James; Taddie, Marissa; Malecki, Kristen; Gregg, Josh; Faust, Nicholas; Johnson, William P

    2016-03-15

    Arsenic contamination in groundwater is a public health and environmental concern in the United States (U.S.) particularly where monitoring is not required under the Safe Water Drinking Act. Previous studies suggest the influence of regional mechanisms for arsenic mobilization into groundwater; however, no study has examined how influencing parameters change at a continental scale spanning multiple regions. We herein examine covariates for groundwater in the western, central and eastern U.S. regions representing mechanisms associated with arsenic concentrations exceeding the U.S. Environmental Protection Agency maximum contamination level (MCL) of 10 parts per billion (ppb). Statistically significant covariates were identified via classification and regression tree (CART) analysis, and included hydrometeorological and groundwater chemical parameters. The CART analyses were performed at two scales: national and regional; for which three physiographic regions located in the western (Payette Section and the Snake River Plain), central (Osage Plains of the Central Lowlands), and eastern (Embayed Section of the Coastal Plains) U.S. were examined. Validity of each of the three regional CART models was indicated by values >85% for the area under the receiver-operating characteristic curve. Aridity (precipitation minus potential evapotranspiration) was identified as the primary covariate associated with elevated arsenic at the national scale. At the regional scale, aridity and pH were the major covariates in the arid to semi-arid (western) region; whereas dissolved iron (taken to represent chemically reducing conditions) and pH were major covariates in the temperate (eastern) region, although additional important covariates emerged, including elevated phosphate. Analysis in the central U.S. region indicated that elevated arsenic concentrations were driven by a mixture of those observed in the western and eastern regions. PMID:26803265

  10. Toward the improvement of trail classification in national parks using the recreation opportunity spectrum approach.

    PubMed

    Oishi, Yoshitaka

    2013-06-01

    Trail settings in national parks are essential management tools for improving both ecological conservation efforts and the quality of visitor experiences. This study proposes a plan for the appropriate maintenance of trails in Chubusangaku National Park, Japan, based on the recreation opportunity spectrum (ROS) approach. First, we distributed 452 questionnaires to determine park visitors' preferences for setting a trail (response rate = 68 %). Respondents' preferences were then evaluated according to the following seven parameters: access, remoteness, naturalness, facilities and site management, social encounters, visitor impact, and visitor management. Using nonmetric multidimensional scaling and cluster analysis, the visitors were classified into seven groups. Last, we classified the actual trails according to the visitor questionnaire criteria to examine the discrepancy between visitors' preferences and actual trail settings. The actual trail classification indicated that while most developed trails were located in accessible places, primitive trails were located in remote areas. However, interestingly, two visitor groups seemed to prefer a well-conserved natural environment and, simultaneously, easily accessible trails. This finding does not correspond to a premise of the ROS approach, which supposes that primitive trails should be located in remote areas without ready access. Based on this study's results, we propose that creating trails that afford visitors the opportunity to experience a well-conserved natural environment in accessible areas is a useful means of providing visitors with diverse recreation opportunities. The process of data collection and analysis in this study can be one approach to produce ROS maps for providing visitors with recreational opportunities of greater diversity and higher quality.

  11. Adhesive restorations in the posterior area with subgingival cervical margins: new classification and differentiated treatment approach.

    PubMed

    Veneziani, Marco

    2010-01-01

    The aim of this article is to analyze some of the issues related to the adhesive restoration of teeth with deep cervical and/or subgingival margins in the posterior area. Three different problems tend to occur during restoration: loss of dental substance, detection of subgingival cervical margins, and dentin sealing of the cervical margins. These conditions, together with the presence of medium/large-sized cavities associated with cuspal involvement and absence of cervical enamel, are indications for indirect adhesive restorations. Subgingival margins are associated with biological and technical problems such as difficulty in isolating the working field with a dental dam, adhesion procedures, impression taking, and final positioning of the restoration itself. A new classification is suggested based on two clinical parameters: 1) a technical-operative parameter (possibility of correct isolation through the dental dam) and 2) a biological parameter (depending on the biologic width). Three different clinical situations and three different therapeutic approaches are identified (1st, 2nd, and 3rd, respectively): coronal relocation of the margin, surgical exposure of the margin, and clinical crown lengthening. The latter is associated with three further operative sequences: immediate, early, or delayed impression taking. The different therapeutic options are described and illustrated by several clinical cases. The surgical-restorative approach, whereby surgery is strictly associated with buildup, onlay preparation, and impression taking, is particularly interesting. The restoration is cemented after only 1 week. This approach makes it possible to speed up the therapy by eliminating the intermediate phases associated with positioning the provisional restorations, and with fast and efficient healing of the soft marginal tissue. PMID:20305873

  12. Phylogeny and Classification of the Trapdoor Spider Genus Myrmekiaphila: An Integrative Approach to Evaluating Taxonomic Hypotheses

    PubMed Central

    Bailey, Ashley L.; Brewer, Michael S.; Hendrixson, Brent E.; Bond, Jason E.

    2010-01-01

    Background Revised by Bond and Platnick in 2007, the trapdoor spider genus Myrmekiaphila comprises 11 species. Species delimitation and placement within one of three species groups was based on modifications of the male copulatory device. Because a phylogeny of the group was not available these species groups might not represent monophyletic lineages; species definitions likewise were untested hypotheses. The purpose of this study is to reconstruct the phylogeny of Myrmekiaphila species using molecular data to formally test the delimitation of species and species-groups. We seek to refine a set of established systematic hypotheses by integrating across molecular and morphological data sets. Methods and Findings Phylogenetic analyses comprising Bayesian searches were conducted for a mtDNA matrix composed of contiguous 12S rRNA, tRNA-val, and 16S rRNA genes and a nuclear DNA matrix comprising the glutamyl and prolyl tRNA synthetase gene each consisting of 1348 and 481 bp, respectively. Separate analyses of the mitochondrial and nuclear genome data and a concatenated data set yield M. torreya and M. millerae paraphyletic with respect to M. coreyi and M. howelli and polyphyletic fluviatilis and foliata species groups. Conclusions Despite the perception that molecular data present a solution to a crisis in taxonomy, studies like this demonstrate the efficacy of an approach that considers data from multiple sources. A DNA barcoding approach during the species discovery process would fail to recognize at least two species (M. coreyi and M. howelli) whereas a combined approach more accurately assesses species diversity and illuminates speciation pattern and process. Concomitantly these data also demonstrate that morphological characters likewise fail in their ability to recover monophyletic species groups and result in an unnatural classification. Optimizations of these characters demonstrate a pattern of “Dollo evolution” wherein a complex character evolves only once

  13. Area Estimation for Winter Wheat over the North China Plain Using a Sub-Pixel Classification Approach

    NASA Astrophysics Data System (ADS)

    van Hoolst, Roel; Dong, Qinghan; Eerens, Herman; Bydekrke, Lieven; Kerdiles, Herve

    2013-01-01

    This study examined the potential of sub-pixel classification for regional crop area estimation. The approach uses a neural network, trained on a high resolution crop map, to estimate sub-pixel crop area fractions using time series of S10 NDVI-composites of the 1 km resolution sensor SPOT-VEGETATION. The classification of the high resolution imagery such as LANDSAT TM was used to train the network. The application of such a trained network on an extended spatial area and temporal period has been studied, focusing especially on planting area of winter wheat on the North China Plain for the period 2005-2009.
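
    The sketch below illustrates the sub-pixel idea with synthetic data: a small network regresses the winter-wheat area fraction of a coarse pixel from its NDVI time series, with the "true" fractions playing the role of a high-resolution crop map; every array here is a placeholder, not SPOT-VEGETATION or Landsat data.

        import numpy as np
        from sklearn.neural_network import MLPRegressor
        from sklearn.model_selection import train_test_split

        rng = np.random.default_rng(9)
        n_pixels, n_decades = 3000, 36                       # one season of 10-day composites
        ndvi = rng.random((n_pixels, n_decades))
        # fabricated target: fraction loosely tied to mid-season greenness, clipped to [0, 1]
        frac = np.clip(ndvi[:, 15:25].mean(axis=1) * 1.5 - 0.3, 0, 1)

        X_tr, X_te, y_tr, y_te = train_test_split(ndvi, frac, random_state=0)
        net = MLPRegressor(hidden_layer_sizes=(20,), max_iter=2000, random_state=0).fit(X_tr, y_tr)
        pred = np.clip(net.predict(X_te), 0, 1)              # fractions must stay within [0, 1]
        print("mean absolute error of the wheat fraction:", np.abs(pred - y_te).mean())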

  14. Comments on "A modified reachability tree approach to analysis of unbounded Petri nets".

    PubMed

    Ru, Yu; Wu, Weimin; Hadjicostis, Christoforos N

    2006-10-01

    The above paper introduced the construction of a modified reachability tree (MRT) for (unbounded) Petri nets and its application to reachability, liveness, and deadlock analysis. This note shows via a counterexample that some of the MRT properties claimed in the above paper are incorrect.

  15. A Fault Tree Approach to Analysis of Behavioral Systems: An Overview.

    ERIC Educational Resources Information Center

    Stephens, Kent G.

    Developed at Brigham Young University, Fault Tree Analysis (FTA) is a technique for enhancing the probability of success in any system by analyzing the most likely modes of failure that could occur. It provides a logical, step-by-step description of possible failure events within a system and their interaction--the combinations of potential…

  16. Detection of dispersed radio pulses: a machine learning approach to candidate identification and classification

    NASA Astrophysics Data System (ADS)

    Devine, Thomas Ryan; Goseva-Popstojanova, Katerina; McLaughlin, Maura

    2016-06-01

    Searching for extraterrestrial, transient signals in astronomical data sets is an active area of current research. However, machine learning techniques are lacking in the literature concerning single-pulse detection. This paper presents a new, two-stage approach for identifying and classifying dispersed pulse groups (DPGs) in single-pulse search output. The first stage identified DPGs and extracted features to characterize them using a new peak identification algorithm which tracks sloping tendencies around local maxima in plots of signal-to-noise ratio versus dispersion measure. The second stage used supervised machine learning to classify DPGs. We created four benchmark data sets: one unbalanced and three balanced versions using three different imbalance treatments. We empirically evaluated 48 classifiers by training and testing binary and multiclass versions of six machine learning algorithms on each of the four benchmark versions. While each classifier had advantages and disadvantages, all classifiers with imbalance treatments had higher recall values than those with unbalanced data, regardless of the machine learning algorithm used. Based on the benchmarking results, we selected a subset of classifiers to classify the full, unlabelled data set of over 1.5 million DPGs identified in 42 405 observations made by the Green Bank Telescope. Overall, the classifiers using a multiclass ensemble tree learner in combination with two oversampling imbalance treatments were the most efficient; they identified additional known pulsars not in the benchmark data set and provided six potential discoveries, with significantly fewer false positives than the other classifiers.
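
    The first stage can be pictured with the toy Python fragment below, which flags local maxima in a synthetic signal-to-noise versus dispersion-measure curve as candidate dispersed pulse groups; the injected pulses, thresholds and prominence values are invented, and the paper's own peak tracker is more elaborate.

        import numpy as np
        from scipy.signal import find_peaks

        rng = np.random.default_rng(10)
        dm = np.linspace(0, 500, 1000)                       # trial dispersion measures
        snr = rng.normal(1.0, 0.3, dm.size)                  # noise floor
        snr += 8 * np.exp(-0.5 * ((dm - 120) / 4) ** 2)      # injected pulse near DM = 120
        snr += 5 * np.exp(-0.5 * ((dm - 350) / 6) ** 2)      # weaker pulse near DM = 350

        peaks, props = find_peaks(snr, height=5.0, prominence=3.0)
        for i in peaks:
            print(f"candidate DPG at DM ~ {dm[i]:.1f} with S/N {snr[i]:.1f}")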

  17. The suitability of the dual isotope approach (δ13C and δ18O) in tree ring studies

    NASA Astrophysics Data System (ADS)

    Siegwolf, Rolf; Saurer, Matthias

    2016-04-01

    The use of stable isotopes, complementary to tree ring width data, in tree ring research has proven to be a powerful tool in studying the impact of environmental parameters on tree physiology and growth. These three proxies are thus instrumental for climate reconstruction and improve the understanding of underlying causes of growth changes. In various cases, however, their use suggests non-plausible interpretations. Often the use of one isotope alone does not allow the detection of such "erroneous isotope responses". A careful analysis of these deviating results shows that either the carbon isotope discrimination concept (Farquhar et al. 1982) is no longer valid or the assumptions for the leaf water enrichment model (Cernusak et al., 2003) are violated and thus both fractionation models are not applicable. In this presentation we discuss such cases when the known fractionation concepts fail and do not allow a correct interpretation of the isotope data. With the help of the dual isotope approach (Scheidegger et al., 2000) it is demonstrated how to detect and uncover the causes for such anomalous isotope data. The fractionation concepts and their combinations against the background of CO2 and H2O gas exchange are briefly explained and the specific use of the dual isotope approach for tree ring data analyses and interpretations is demonstrated. References: Cernusak, L. A., Arthur, D. J., Pate, J. S. and Farquhar, G. D.: Water relations link carbon and oxygen isotope discrimination to phloem sap sugar concentration in Eucalyptus globules, Plant Physiol., 131, 1544-1554, 2003. Farquhar, G. D., O'Leary, M. H. and Berry, J. A.: On the relationship between carbon isotope discrimination and the intercellular carbon dioxide concentration in leaves, Aust. J. Plant Physiol., 9, 121-137, 1982. Scheidegger, Y., Saurer, M., Bahn, M. and Siegwolf, R.: Linking stable oxygen and carbon isotopes with stomatal conductance and photosynthetic capacity: A conceptual model

  18. Alternative standardization approaches to improving streamflow reconstructions with ring-width indices of riparian trees

    USGS Publications Warehouse

    Meko, David M; Friedman, Jonathan M.; Touchan, Ramzi; Edmondson, Jesse R.; Griffin, Eleanor R.; Scott, Julian A.

    2015-01-01

    Old, multi-aged populations of riparian trees provide an opportunity to improve reconstructions of streamflow. Here, ring widths of 394 plains cottonwood (Populus deltoides ssp. monilifera) trees in the North Unit of Theodore Roosevelt National Park, North Dakota, are used to reconstruct streamflow along the Little Missouri River (LMR), North Dakota, US. Different versions of the cottonwood chronology are developed by (1) age-curve standardization (ACS), using age-stratified samples and a single estimated curve of ring width against estimated ring age, and (2) time-curve standardization (TCS), using a subset of longer ring-width series individually detrended with cubic smoothing splines of width against year. The cottonwood chronologies are combined with the first principal component of four upland conifer chronologies developed by conventional methods to investigate the possible value of riparian tree-ring chronologies for streamflow reconstruction of the LMR. Regression modeling indicates that the statistical signal for flow is stronger in the riparian cottonwood than in the upland chronologies. The flow signal from cottonwood complements rather than repeats the signal from upland conifers and is especially strong in young trees (e.g. 5–35 years). Reconstructions using a combination of cottonwoods and upland conifers are found to explain more than 50% of the variance of LMR flow over a 1935–1990 calibration period and to yield a reconstruction of flow back to 1658. The low-frequency component of reconstructed flow is sensitive to the choice of standardization method for the cottonwood. In contrast to the TCS version, the ACS reconstruction features persistent low flows in the 19th century. Results demonstrate the value to streamflow reconstruction of riparian cottonwood and suggest that more studies are needed to exploit the low-frequency streamflow signal in densely sampled age-stratified stands of riparian trees.
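
    The TCS step can be sketched as below: a cubic smoothing spline of ring width against calendar year is fitted to one series and the widths are divided by the fitted curve to give a dimensionless index; the series and the smoothing parameter are synthetic, and production standardization software handles many more details.

        import numpy as np
        from scipy.interpolate import UnivariateSpline

        rng = np.random.default_rng(11)
        years = np.arange(1900, 2001)
        width = 2.0 * np.exp(-0.01 * (years - 1900)) + rng.normal(0, 0.15, years.size)  # trend + noise

        spline = UnivariateSpline(years, width, k=3, s=len(years) * 0.02)  # s controls smoothness
        index = width / spline(years)                                      # ring-width index
        print("mean of the standardized index (should be near 1):", round(float(index.mean()), 3))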

  19. Machine learning in soil classification.

    PubMed

    Bhattacharya, B; Solomatine, D P

    2006-03-01

    In a number of engineering problems, e.g. in geotechnics, petroleum engineering, etc., intervals of measured series data (signals) are to be attributed a class while maintaining the constraint of contiguity, and standard classification methods could be inadequate. Classification in this case needs involvement of an expert who observes the magnitude and trends of the signals in addition to any a priori information that might be available. In this paper, an approach for automating this classification procedure is presented. Firstly, a segmentation algorithm is developed and applied to segment the measured signals. Secondly, the salient features of these segments are extracted using the boundary energy method. Classifiers are then built, based on the measured data and the extracted features, to assign classes to the segments; they employ Decision Trees, ANN and Support Vector Machines. The methodology was tested in classifying sub-surface soil using measured data from Cone Penetration Testing and satisfactory results were obtained. PMID:16530382

  20. A Philosophical Approach to Describing Science Content: An Example From Geologic Classification.

    ERIC Educational Resources Information Center

    Finley, Fred N.

    1981-01-01

    Examines how research of philosophers of science may be useful to science education researchers and curriculum developers in the development of descriptions of science content related to classification schemes. Provides examples of concept analysis of two igneous rock classification schemes. (DS)

  1. Building and Solving Odd-One-Out Classification Problems: A Systematic Approach

    ERIC Educational Resources Information Center

    Ruiz, Philippe E.

    2011-01-01

    Classification problems ("find the odd-one-out") are frequently used as tests of inductive reasoning to evaluate human or animal intelligence. This paper introduces a systematic method for building the set of all possible classification problems, followed by a simple algorithm for solving the problems of the R-ASCM, a psychometric test derived…

  2. Comparison of two approaches for the classification of 16S rRNA gene sequences.

    PubMed

    Chatellier, Sonia; Mugnier, Nathalie; Allard, Françoise; Bonnaud, Bertrand; Collin, Valérie; van Belkum, Alex; Veyrieras, Jean-Baptiste; Emler, Stefan

    2014-10-01

    The use of 16S rRNA gene sequences for microbial identification in clinical microbiology is accepted widely, and requires databases and algorithms. We compared a new research database containing curated 16S rRNA gene sequences in combination with the lca (lowest common ancestor) algorithm (RDB-LCA) to a commercially available 16S rDNA Centroid approach. We used 1025 bacterial isolates characterized by biochemistry, matrix-assisted laser desorption/ionization time-of-flight MS and 16S rDNA sequencing. Nearly 80 % of isolates were identified unambiguously at the species level by both classification platforms used. The remaining isolates were mostly identified correctly at the genus level due to the limited resolution of 16S rDNA sequencing. Discrepancies between both 16S rDNA platforms were due to differences in database content and the algorithm used, and could amount to up to 10.5 %. Up to 1.4 % of the analyses were found to be inconclusive. It is important to realize that despite the overall good performance of the pipelines for analysis, some inconclusive results remain that require additional in-depth analysis performed using supplementary methods.
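
    The lca idea can be shown with a few lines of Python on a toy taxonomy: given the lineages of the reference hits for a query sequence, the classifier reports the deepest taxon on which all hits agree; the lineages here are illustrative, not entries from either database in the study.

        def lca(lineages):
            """Return the deepest taxon shared by every hit's root-to-leaf lineage."""
            common = []
            for ranks in zip(*lineages):
                if len(set(ranks)) != 1:
                    break
                common.append(ranks[0])
            return common[-1] if common else "unclassified"

        hits = [
            ["Bacteria", "Firmicutes", "Bacilli", "Lactobacillales", "Streptococcaceae", "Streptococcus"],
            ["Bacteria", "Firmicutes", "Bacilli", "Lactobacillales", "Streptococcaceae", "Lactococcus"],
        ]
        print(lca(hits))   # hits agree only down to the family Streptococcaceae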

  3. Moving beyond the Galloway diagrams for delta classification: A graph-theoretic approach.

    NASA Astrophysics Data System (ADS)

    Tejedor, Alejandro; Longjas, Anthony; Caldwell, Rebecca; Edmonds, Douglas; Zaliapin, Ilya; Foufoula-Georgiou, Efi

    2016-04-01

    Delta channel networks self-organize into a variety of stunning and complex patterns in response to different forcings (e.g., river, tides and waves) and the physical properties of their sediment (e.g., particle size, cohesiveness). Understanding and quantifying properties of these patterns is an essential step in solving the inverse problem of inferring process from form. A recently introduced framework based on spectral graph theory allows us to assess delta channel network complexity from a topologic (channel connectivity) and dynamic (flux exchange) perspective [Tejedor et al., 2015a,b]. We demonstrate the potential of this framework, together with numerical and experimental deltas, wherein different delta properties can be varied individually, to replace the qualitative approach still in use today [Galloway, 1975; Orton and Reading, 1993]. Specifically, in this work we have examined the effect of sediment parameters (grain size, cohesiveness) on the channel structure of river-dominated deltas generated by a morphodynamic model (Delft3D). Our analysis shows that deltas with coarser incoming sediment are more complex topologically (increased number of looped pathways) but simpler dynamically (reduced flux exchange between subnetworks). We capitalize on the combined approach of controlled simulation (with known drivers) and quantitative comparison by positioning field and simulated deltas in the so-called TopoDynamic space to open up a path to provide valuable information towards a refined classification and inference scheme of delta morphology. Furthermore, numerical deltas allow us to explore the delta channel structure not only in a spatially explicit manner but also temporally, since the complete temporal record of delta evolution is available.

  4. A divide and conquer approach for imbalanced multi-class classification and its application to medical decision making.

    PubMed

    Li, Hu

    2016-03-01

    Many real-world datasets contain more than two categories, and the number of instances in each category differs greatly. In medical diagnostic data, for example, there may be several types of cancer, each with only tens of instances, but many more normal instances. Similarly, there may be very few abnormal samples in a pharmaceutical test, yet these may cause great harm. Classification of such data is often summarized as imbalanced multi-class classification. Most existing research studies multi-class classification and imbalanced data classification separately; few study them in combination, in particular for medical diagnosis data. In the context of medical diagnosis and pharmaceutical testing, in this paper we propose a divide and conquer approach to partition multi-class data and a self-adaptive data resampling method for imbalanced data. The proposed methods are tested on 23 UCI datasets in medical, pharmaceutical and other fields. Experimental results show that the proposed methods outperform other compared methods, in particular on the medical and pharmaceutical datasets. PMID:27113314
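
    One way to picture the divide-and-conquer plus resampling idea is the sketch below: the multi-class task is split into one-vs-rest subproblems and the minority side of each subproblem is oversampled before training; the data, class ratios and the use of a decision tree as base learner are all assumptions made for illustration.

        import numpy as np
        from sklearn.tree import DecisionTreeClassifier

        rng = np.random.default_rng(12)
        X = rng.random((1000, 8))
        y = rng.choice([0, 1, 2], size=1000, p=[0.9, 0.07, 0.03])   # highly imbalanced classes

        binary_models = {}
        for c in np.unique(y):
            yc = (y == c).astype(int)
            pos, neg = np.where(yc == 1)[0], np.where(yc == 0)[0]
            small, large = (pos, neg) if len(pos) < len(neg) else (neg, pos)
            resampled = rng.choice(small, size=len(large), replace=True)   # balance the subproblem
            idx = np.concatenate([large, resampled])
            binary_models[c] = DecisionTreeClassifier(random_state=0).fit(X[idx], yc[idx])

        def predict(x):
            scores = {c: m.predict_proba(x.reshape(1, -1))[0, 1] for c, m in binary_models.items()}
            return max(scores, key=scores.get)

        print("predicted class of one sample:", predict(X[0]))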

  5. Multi-locus tree and species tree approaches toward resolving a complex clade of downy mildews (Straminipila, Oomycota), including pathogens of beet and spinach.

    PubMed

    Choi, Young-Joon; Klosterman, Steven J; Kummer, Volker; Voglmayr, Hermann; Shin, Hyeon-Dong; Thines, Marco

    2015-05-01

    Accurate species determination of plant pathogens is a prerequisite for their control and quarantine, and further for assessing their potential threat to crops. The family Peronosporaceae (Straminipila; Oomycota) consists of obligate biotrophic pathogens that cause downy mildew disease on angiosperms, including a large number of cultivated plants. In the largest downy mildew genus Peronospora, a phylogenetically complex clade includes the economically important downy mildew pathogens of spinach and beet, as well as the type species of the genus Peronospora. To resolve this complex clade at the species level and to infer evolutionary relationships among them, we used multi-locus phylogenetic analysis and species tree estimation. Both approaches discriminated all nine currently accepted species and revealed four previously unrecognized lineages, which are specific to a host genus or species. This is in line with a narrow species concept, i.e. that a downy mildew species is associated with only a particular host plant genus or species. Instead of applying the dubious name Peronospora farinosa, which has been proposed for formal rejection, our results provide strong evidence that Peronospora schachtii is an independent species from lineages on Atriplex and apparently occurs exclusively on Beta vulgaris. The members of the clade investigated, the Peronospora rumicis clade, associate with three different host plant families, Amaranthaceae, Caryophyllaceae, and Polygonaceae, suggesting that they may have speciated following at least two recent inter-family host shifts, rather than contemporary cospeciation with the host plants. PMID:25772799

  6. Data mining approach identifies research priorities and data requirements for resolving the red algal tree of life

    PubMed Central

    2010-01-01

    Background The assembly of the tree of life has seen significant progress in recent years but algae and protists have been largely overlooked in this effort. Many groups of algae and protists have ancient roots and it is unclear how much data will be required to resolve their phylogenetic relationships for incorporation in the tree of life. The red algae, a group of primary photosynthetic eukaryotes more than a billion years old, provide the earliest fossil evidence for eukaryotic multicellularity and sexual reproduction. Despite this evolutionary significance, their phylogenetic relationships are understudied. This study aims to infer a comprehensive red algal tree of life at the family level from a supermatrix containing data mined from GenBank. We aim to locate remaining regions of low support in the topology, evaluate their causes and estimate the amount of data required to resolve them. Results Phylogenetic analysis of a supermatrix of 14 loci and 98 red algal families yielded the most complete red algal tree of life to date. Visualization of statistical support showed the presence of five poorly supported regions. Causes for low support were identified with statistics about the age of the region, data availability and node density, showing that poor support has different origins in different parts of the tree. Parametric simulation experiments yielded optimistic estimates of how much data will be needed to resolve the poorly supported regions (ca. 10^3 to ca. 10^4 nucleotides for the different regions). Nonparametric simulations gave a markedly more pessimistic image, some regions requiring more than 2.8 x 10^5 nucleotides or not achieving the desired level of support at all. The discrepancies between parametric and nonparametric simulations are discussed in light of our dataset and known attributes of both approaches. Conclusions Our study takes the red algae one step closer to meaningful inclusion in the tree of life. In addition to the recovery of stable

  7. Tropical dendrochemistry: A novel approach for reconstructing seasonally-resolved growth rates from ringless tropical trees

    NASA Astrophysics Data System (ADS)

    Poussart, P. M.; Myneni, S. C.

    2005-12-01

    Although tropical forests play an active role in the global carbon cycle and are host to a variety of pristine paleoclimate archives, they remain poorly characterized as compared to other ecosystems on the planet. In particular, dating and reconstructing the growth rate history of tropical trees remains a challenge and continues to delay research efforts towards understanding tropical forest dynamics. Traditional dendrochronological techniques have found limited applications in the tropics because temperature seasonality is often too small to initiate the production of visible annual growth rings. Dendrometers, cambium scarring methods and sub-annual records of oxygen and carbon isotopes from tree cellulose may be used to estimate growth rate histories when growth rings are absent. However, dendrometer records rarely extend beyond the past couple of decades and the generation of seasonally-resolved isotopic records remains labour intensive, currently prohibiting the level of record replication necessary for statistical analysis. Here, we present evidence that Ca may also be used as a proxy for dating and reconstructing growth rates of trees lacking visible growth rings. Using the Brookhaven National Lab Synchrotron, we recover a radial record of cyclic variations in Ca from a Miliusa velutina tree from northern Thailand. We determine that the Ca cycles are seasonal based on a comparison between radiocarbon age estimates and a trace element age model, which agree within 2 years over the period of 1955 to 2000. The amplitude of the Ca annual cycle is significantly correlated with growth rate estimates, which are also correlated to the amount of dry season rainfall. The measurements at the Synchrotron are fast, non-destructive and require little sample preparation. Application of this technique in the tropics holds the potential to resolve longstanding questions about tropical forest dynamics and interannual to decadal changes in the carbon cycle.

  8. Decision tree approach to evaluating inactive uranium processing sites for liner requirements

    SciTech Connect

    Relyea, J.F.

    1983-03-01

    Recently, concern has been expressed about potential toxic effects of both radon emission and release of toxic elements in leachate from inactive uranium mill tailings piles. Remedial action may be required to meet disposal standards set by the states and the US Environmental Protection Agency (EPA). In some cases, a possible disposal option is the exhumation and reburial (either on site or at a new location) of tailings and reliance on engineered barriers to satisfy the objectives established for remedial actions. Liners under disposal pits are the major engineered barrier for preventing contaminant release to ground and surface water. The purpose of this report is to provide a logical sequence of action, in the form of a decision tree, which could be followed to show whether a selected tailings disposal design meets the objectives for subsurface contaminant release without a liner. This information can be used to determine the need and type of liner for sites exhibiting a potential groundwater problem. The decision tree is based on the capability of hydrologic and mass transport models to predict the movement of water and contaminants with time. The types of modeling capabilities and data needed for those models are described, and the steps required to predict water and contaminant movement are discussed. A demonstration of the decision tree procedure is given to aid the reader in evaluating the need for, and adequacy of, a liner.

  9. Nitrogen isotopes in Tree-Rings - An approach combining soil biogeochemistry and isotopic long series with statistical modeling

    NASA Astrophysics Data System (ADS)

    Savard, Martine M.; Bégin, Christian; Paré, David; Marion, Joëlle; Laganière, Jérôme; Séguin, Armand; Stefani, Franck; Smirnoff, Anna

    2016-04-01

    Monitoring atmospheric emissions from industrial centers in North America generally started less than 25 years ago. To compensate for the lack of monitoring, previous investigations have interpreted tree-ring N changes using the known chronology of human activities, without facing the challenge of separating climatic effects from potential anthropogenic impacts. Here we document such an attempt conducted in the oil sands (OS) mining region of Northeastern Alberta, Canada. The reactive nitrogen (Nr)-emitting oil extraction operations began in 1967, but air quality measurements were only initiated in 1997. To investigate if the beginning and intensification of OS operations induced changes in the forest N-cycle, we sampled white spruce (Picea glauca (Moench) Voss) stands located at various distances from the main mining area, and receiving low, but different N deposition. Our approach combines soil biogeochemical and metagenomic characterization with long, well dated, tree-ring isotopic series. To objectively delineate the natural N isotopic behaviour in trees, we have characterized tree-ring N isotope (15N/14N) ratios between 1880 and 2009, used statistical analyses of the isotopic values and local climatic parameters of the pre-mining period to calibrate response functions and project the isotopic responses to climate during the extraction period. During that period, the measured series depart negatively from the projected natural trends. In addition, these long-term negative isotopic trends are better reproduced by multiple-regression models combining climatic parameters with the proxy for regional mining Nr emissions. These negative isotopic trends point towards changes in the forest soil biogeochemical N cycle. The biogeochemical data and ultimate soil mechanisms responsible for such changes will be discussed during the presentation.

  10. Downscaling Transpiration from the Field to the Tree Scale using the Neural Network Approach

    NASA Astrophysics Data System (ADS)

    Hopmans, J. W.

    2015-12-01

    Estimating actual evapotranspiration (ETa) spatial variability in orchards is key when trying to quantify water (and associated nutrients) leaching, both with the mass balance and inverse modeling methods. ETa measurements, however, generally occur at larger scales (e.g. Eddy-covariance method) or have a limited quantitative accuracy. In this study, we propose to establish a statistical relation between field ETa and field averaged variables known to be closely related to it, such as stem water potential (WP), soil water storage (WS) and ETc. To that end, we use four years of soil and almond tree water status data to train artificial neural networks (ANNs) predicting field scale ETa and downscale the relation to the individual tree scale. ANNs composed of only two neurons in a hidden layer (11 parameters in total) proved to be the most accurate (overall RMSE = 0.0246 mm/h, R2 = 0.944), seemingly because adding more neurons generated overfitting of noise in the training dataset. According to the optimized weights in the best ANNs, the first hidden neuron could be considered in charge of relaying the ETc information while the other one would deal with the water stress response to stem WP, soil WS, and ETc. As individual trees had specific signatures for combinations of these variables, variability was generated in their ETa responses. The relative canopy cover was the main source of variability of ETa while stem WP was the most influential factor for the ETa / ETc ratio. Trees on the drip-irrigated side of the orchard appeared to be less affected by low estimated soil WS in the root zone than those on the fanjet micro-sprinkler side, possibly due to a combination of (i) more substantial root biomass increasing the plant hydraulic conductance, (ii) bias in the soil WS estimation due to soil moisture heterogeneity on the drip-side, and (iii) access to deeper water resources. Tree scale ETa responses are in good agreement with soil-plant water relations reported in the literature, and
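
    A minimal sketch of a network with the size reported above (three inputs, one hidden layer of two neurons, one output, i.e. 11 weights and biases), fitted with scikit-learn on synthetic data; the input ranges, units and the synthetic ETa relation are assumptions for illustration only.

      # Sketch: small ANN mapping field-averaged stem WP, soil WS and ETc to ETa.
      import numpy as np
      from sklearn.neural_network import MLPRegressor

      rng = np.random.default_rng(1)
      n = 500
      stem_wp = rng.uniform(-2.5, -0.5, n)     # stem water potential (MPa)
      soil_ws = rng.uniform(0.1, 0.4, n)       # soil water storage (m3/m3)
      etc = rng.uniform(0.0, 0.8, n)           # crop evapotranspiration ETc (mm/h)
      # Synthetic "true" ETa: ETc scaled down under water stress.
      eta = etc * (1 / (1 + np.exp(-4 * (stem_wp + 1.2)))) * (soil_ws / 0.4)

      X = np.column_stack([stem_wp, soil_ws, etc])
      ann = MLPRegressor(hidden_layer_sizes=(2,), activation="tanh",
                         max_iter=5000, random_state=0).fit(X, eta)

      n_params = sum(w.size for w in ann.coefs_) + sum(b.size for b in ann.intercepts_)
      print("parameters:", n_params)           # 3*2 + 2 + 2*1 + 1 = 11
      print("training R^2:", round(ann.score(X, eta), 3))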

  11. Linking Tree Growth Response to Measured Microclimate - A Field Based Approach

    NASA Astrophysics Data System (ADS)

    Martin, J. T.; Hoylman, Z. H.; Looker, N. T.; Jencso, K. G.; Hu, J.

    2015-12-01

    The general relationship between climate and tree growth is a well-established and important tenet shaping both paleo and future perspectives of forest ecosystem growth dynamics. Across much of the American west, water limits growth via physiological mechanisms that tie regional and local climatic conditions to forest productivity in a relatively predictable way, and these growth responses are clearly evident in tree ring records. However, within the annual cycle of a forest landscape, water availability varies across both time and space, and interacts with other potentially growth limiting factors such as temperature, light, and nutrients. In addition, tree growth responses may lag climate drivers and may vary in terms of where in a tree carbon is allocated. As such, determining when and where water actually limits forest growth in real time can be a significant challenge. Despite these challenges, we present data suggestive of real-time growth limitation driven by soil moisture supply and atmospheric water demand reflected in high-frequency field measurements of stem radii and cell structure across ecological gradients. The experiment was conducted at the Lubrecht Experimental Forest in western Montana where, over two years, we observed intra-annual growth rates of four dominant conifer species: Douglas fir, Ponderosa Pine, Engelmann Spruce and Western Larch using point dendrometers and microcores. In all four species studied, compensatory use of stored water (inferred from stem water deficit) appears to exhibit a threshold relationship with a critical balance point between water supply and demand. The occurrence of this point in time coincided with a decrease in stem growth rates, and while the timing varied by up to one month across topographic and elevational gradients, the onset date of growth limitation was a reliable predictor of overall annual growth. Our findings support previous model-based observations of nonlinearity in the relationship between

  12. Machine Learning Approaches for Integrating Clinical and Imaging Features in LLD Classification and Response Prediction

    PubMed Central

    Patel, Meenal J.; Andreescu, Carmen; Price, Julie C.; Edelman, Kathryn L.; Reynolds, Charles F.; Aizenstein, Howard J.

    2015-01-01

    Objective Currently, depression diagnosis relies primarily on behavioral symptoms and signs, and treatment is guided by trial and error instead of evaluating associated underlying brain characteristics. Unlike past studies, we attempted to estimate accurate prediction models for late-life depression diagnosis and treatment response using multiple machine learning methods with inputs of multi-modal imaging and non-imaging whole brain and network-based features. Methods Late-life depression patients (medicated post-recruitment) [n=33] and elderly non-depressed individuals [n=35] were recruited. Their demographics and cognitive ability scores were recorded, and brain characteristics were acquired using multi-modal magnetic resonance imaging pre-treatment. Linear and nonlinear learning methods were tested for estimating accurate prediction models. Results A learning method called alternating decision trees estimated the most accurate prediction models for late-life depression diagnosis (87.27% accuracy) and treatment response (89.47% accuracy). The diagnosis model included measures of age, mini-mental state examination score, and structural imaging (e.g. whole brain atrophy and global white matter hyperintensity burden). The treatment response model included measures of structural and functional connectivity. Conclusions Combinations of multi-modal imaging and/or non-imaging measures may help better predict late-life depression diagnosis and treatment response. As a preliminary observation, we speculate the results may also suggest that different underlying brain characteristics defined by multi-modal imaging measures—rather than region-based differences—are associated with depression versus depression recovery, since to our knowledge this is the first depression study to accurately predict both using the same approach. These findings may help better understand late-life depression and identify preliminary steps towards personalized late-life depression treatment
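
    Alternating decision trees are not available in scikit-learn, so the sketch below substitutes a closely related boosted decision-tree classifier purely to illustrate the general workflow (mixed clinical and imaging features in, cross-validated diagnostic accuracy out). The feature names, sample sizes and labels are synthetic stand-ins, not the study's data.

      # Sketch: classify depressed vs non-depressed subjects from mixed
      # clinical + imaging features with a boosted decision-tree ensemble.
      import numpy as np
      from sklearn.ensemble import GradientBoostingClassifier
      from sklearn.model_selection import cross_val_score

      rng = np.random.default_rng(2)
      n = 68                                   # comparable to 33 + 35 subjects
      age = rng.normal(70, 6, n)
      mmse = rng.normal(28, 1.5, n)
      atrophy = rng.normal(0.0, 1.0, n)        # whole-brain atrophy (z-score)
      wmh = rng.normal(0.0, 1.0, n)            # white matter hyperintensity burden
      X = np.column_stack([age, mmse, atrophy, wmh])
      y = (atrophy + 0.8 * wmh + rng.normal(0, 1, n) > 0).astype(int)  # synthetic label

      clf = GradientBoostingClassifier(random_state=0)
      acc = cross_val_score(clf, X, y, cv=5).mean()
      print("cross-validated accuracy:", round(acc, 3))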

  13. Aspen Trees.

    ERIC Educational Resources Information Center

    Canfield, Elaine

    2002-01-01

    Describes a fifth-grade art activity that offers a new approach to creating pictures of Aspen trees. Explains that the students learned about art concepts, such as line and balance, in this lesson. Discusses the process in detail for creating the pictures. (CMK)

  14. A wrapper-based approach for feature selection and classification of major depressive disorder-bipolar disorders.

    PubMed

    Tekin Erguzel, Turker; Tas, Cumhur; Cebi, Merve

    2015-09-01

    Feature selection (FS) and classification are consecutive artificial intelligence (AI) methods used in data analysis, pattern classification, data mining and medical informatics. Besides promising studies in the application of AI methods to health informatics, working with more informative features is crucial in order to contribute to early diagnosis. Bipolar disorder (BD) is one of the most prevalent psychiatric disorders, and its depressive episodes are often misdiagnosed as major depressive disorder (MDD), leading to suboptimal therapy and poor outcomes. Therefore, discriminating MDD and BD at earlier stages of illness could help to facilitate efficient and specific treatment. In this study, a novel, nature-inspired FS algorithm based on standard Ant Colony Optimization (ACO), called improved ACO (IACO), was used to reduce the number of features by removing irrelevant and redundant data. The selected features were then fed into a support vector machine (SVM), a powerful mathematical tool for data classification, regression, function estimation and modeling processes, in order to classify MDD and BD subjects. The proposed method used coherence values, a promising quantitative electroencephalography (EEG) biomarker, calculated from the alpha, theta and delta frequency bands. The performance of the novel IACO-SVM approach showed that it is possible to discriminate 46 BD and 55 MDD subjects using 22 of 48 features with 80.19% overall classification accuracy. The performance of the IACO algorithm was also compared to the performance of standard ACO, genetic algorithm (GA) and particle swarm optimization (PSO) algorithms in terms of their classification accuracy and number of selected features. In order to provide an almost unbiased estimate of classification error, the validation process was performed using a nested cross-validation (CV) procedure. PMID:26164033
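
    The ACO-based selector itself is not sketched here; below is a minimal stand-in for the surrounding pipeline the abstract describes (select a subset of EEG coherence features, classify with an SVM, and estimate error with nested cross-validation), using a simple univariate selector in place of IACO and synthetic data with the same sample and feature counts.

      # Sketch: feature selection + SVM evaluated with nested cross-validation.
      import numpy as np
      from sklearn.feature_selection import SelectKBest, f_classif
      from sklearn.model_selection import GridSearchCV, cross_val_score
      from sklearn.pipeline import Pipeline
      from sklearn.svm import SVC

      rng = np.random.default_rng(3)
      X = rng.normal(size=(101, 48))           # 46 BD + 55 MDD subjects, 48 features
      y = np.r_[np.zeros(46), np.ones(55)]
      X[y == 1, :10] += 0.6                    # make a few features informative

      pipe = Pipeline([("select", SelectKBest(f_classif, k=22)),  # stand-in for IACO
                       ("svm", SVC(kernel="rbf"))])
      inner = GridSearchCV(pipe, {"svm__C": [0.1, 1, 10]}, cv=5)  # inner CV: tuning
      outer_acc = cross_val_score(inner, X, y, cv=10)             # outer CV: error
      print("nested-CV accuracy:", round(outer_acc.mean(), 3))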

  16. Tree Testing of Hierarchical Menu Structures for Health Applications

    PubMed Central

    Le, Thai; Chaudhuri, Shomir; Chung, Jane; Thompson, Hilaire J; Demiris, George

    2014-01-01

    To address the need for greater evidence-based evaluation of Health Information Technology (HIT) systems, we introduce a method of usability testing termed tree testing. In a tree test, participants are presented with an abstract hierarchical tree of the system taxonomy and asked to navigate through the tree in completing representative tasks. We apply tree testing to a commercially available health application, demonstrating a use case and providing a comparison with more traditional in-person usability testing methods. Online tree tests (N=54) and in-person usability tests (N=15) were conducted from August to September 2013. Tree testing provided a method to quantitatively evaluate the information structure of a system using various navigational metrics including completion time, task accuracy, and path length. The results of the analyses compared favorably to the results seen from the traditional usability test. Tree testing provides a flexible, evidence-based approach for researchers to evaluate the information structure of HITs. In addition, remote tree testing provides a quick, flexible, and high-volume method of acquiring feedback in a structured format that allows for quantitative comparisons. With the diverse nature and often large quantities of health information available, addressing issues of terminology and concept classifications during the early development process of a health information system will improve navigation through the system and save future resources. Tree testing is a usability method that can be used to quickly and easily assess the information hierarchy of health information systems. PMID:24582924
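
    A minimal sketch of how tree-test navigation logs can be scored, assuming a toy menu taxonomy and a hypothetical log format (one visited path per participant); the metrics follow the abstract (task accuracy and path length relative to the shortest route), while the menu labels are made up.

      # Sketch: score tree-test tasks by accuracy and path length.
      taxonomy = {"Home": {"Health Records": {"Medications": {}, "Lab Results": {}},
                           "Messages": {}, "Settings": {"Profile": {}}}}

      def depth_of(node, tree, depth=1):
          """Number of nodes on the path from the root to `node` (None if absent)."""
          for name, children in tree.items():
              if name == node:
                  return depth
              found = depth_of(node, children, depth + 1)
              if found is not None:
                  return found
          return None

      # Hypothetical navigation logs for one task whose correct target is "Lab Results".
      logs = [["Home", "Health Records", "Lab Results"],
              ["Home", "Messages", "Home", "Health Records", "Lab Results"],
              ["Home", "Settings", "Profile"]]
      target = "Lab Results"
      shortest = depth_of(target, taxonomy)        # 3 nodes on the direct route

      success = [path[-1] == target for path in logs]
      lengths = [len(path) for path in logs]
      print("task accuracy:", sum(success) / len(logs))
      print("mean successful path length:",
            sum(l for l, s in zip(lengths, success) if s) / sum(success),
            "vs shortest route:", shortest)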

  17. New approach for phylogenetic tree recovery based on genome-scale metabolic networks.

    PubMed

    Gamermann, Daniel; Montagud, Arnaud; Conejero, J Alberto; Urchueguía, Javier F; de Córdoba, Pedro Fernández

    2014-07-01

    A wide range of applications and research has been done with genome-scale metabolic models. In this work, we describe an innovative methodology for comparing metabolic networks constructed from genome-scale metabolic models and how to apply this comparison in order to infer evolutionary distances between different organisms. Our methodology allows a quantification of the metabolic differences between different species from a broad range of families and even kingdoms. This quantification is then applied in order to reconstruct phylogenetic trees for sets of various organisms.
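
    The paper's own network-comparison metric is not reproduced here; the sketch below shows one simple way to turn genome-scale metabolic models into a tree, assuming each organism is summarized by its set of reaction identifiers and using Jaccard distances with hierarchical clustering as an illustrative stand-in. The reaction sets and organism names are invented.

      # Sketch: pairwise metabolic-network distances -> hierarchical tree.
      import numpy as np
      from scipy.cluster.hierarchy import linkage, dendrogram
      from scipy.spatial.distance import squareform

      # Hypothetical reaction sets extracted from genome-scale models.
      reactions = {"org_A": {"PGI", "PFK", "FBA", "TPI", "GAPD"},
                   "org_B": {"PGI", "PFK", "FBA", "TPI", "ENO"},
                   "org_C": {"PGI", "CS", "ACONT", "ICDH", "ENO"}}

      names = sorted(reactions)
      n = len(names)
      dist = np.zeros((n, n))
      for i in range(n):
          for j in range(i + 1, n):
              a, b = reactions[names[i]], reactions[names[j]]
              jaccard = 1 - len(a & b) / len(a | b)      # metabolic dissimilarity
              dist[i, j] = dist[j, i] = jaccard

      tree = linkage(squareform(dist), method="average")  # UPGMA-style clustering
      print(dendrogram(tree, labels=names, no_plot=True)["ivl"])  # leaf order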

  18. Morpho-geometrical approach for 3D segmentation of pulmonary vascular tree in multi-slice CT

    NASA Astrophysics Data System (ADS)

    Fetita, Catalin; Brillet, Pierre-Yves; Prêteux, Françoise J.

    2009-02-01

    The analysis of pulmonary vessels provides better insight into lung physiopathology and offers the basis for a functional investigation of the respiratory system. In order to be performed in clinical routine, such analysis has to be compatible with the general protocol for thorax imaging based on multi-slice CT (MSCT), which does not involve the use of contrast agent for vessel enhancement. Despite the fact that a visual assessment of the pulmonary vascular tree is facilitated by the natural contrast existing between vessels and lung parenchyma, a quantitative analysis becomes quickly tedious due to the high spatial density and subdivision complexity of these anatomical structures. In this paper, we develop an automated 3D approach for the segmentation of the pulmonary vessels in MSCT allowing further quantification facilities for the lung function. The proposed approach combines mathematical morphology and discrete geometry operators in order to reach distal small-caliber blood vessels and to preserve the border with the wall of the bronchial tree, which features identical intensity values. In this respect, the pulmonary field is first roughly segmented using thresholding, and the trachea and the main bronchi removed. The lung shape is then regularized by morphological alternate filtering and the high opacities (vessels, bronchi, and other possible pathologic features) selected. After the attenuation of the bronchus wall for large and medium airways, the set of vessel candidates is obtained by morphological grayscale reconstruction and binarization. The residual bronchus wall components are then removed by means of a geometrical shape filtering which includes skeletonization and cylindrical shape estimation. The morphology of the reconstructed pulmonary vessels can be visually investigated with volume rendering, by associating a specific color code with the local vessel caliber. The complement set of the vascular tree among the high intensity structures in
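
    A heavily simplified 2-D sketch of three of the morphological steps named above (selection of high-opacity structures by grayscale reconstruction, binarization, and skeletonization), using scikit-image on a synthetic slice. The published method works in 3-D and adds lung-field extraction, bronchus-wall attenuation and cylindrical shape filtering, none of which are reproduced here; the intensity values and thresholds are illustrative.

      # Sketch: crude vessel-candidate extraction on one synthetic CT slice.
      import numpy as np
      from skimage.morphology import reconstruction, skeletonize, remove_small_objects

      rng = np.random.default_rng(4)
      slice_hu = rng.normal(-850, 30, (128, 128))          # parenchyma-like background
      slice_hu[60:64, :] = rng.normal(-100, 20, (4, 128))  # bright, vessel-like band

      # Grayscale reconstruction by dilation selects bright structures: the seed is
      # the image lowered by a contrast offset, the mask is the image itself.
      seed = slice_hu - 300
      recon = reconstruction(seed, slice_hu, method="dilation")
      candidates = (slice_hu - recon) > 150                # high-opacity residue

      candidates = remove_small_objects(candidates, min_size=20)
      centerlines = skeletonize(candidates)                # 1-pixel-wide centerlines
      print("candidate pixels:", int(candidates.sum()),
            "| centerline pixels:", int(centerlines.sum()))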

  19. Theoretical, statistical, and practical perspectives on pattern-based classification approaches to the analysis of functional neuroimaging data.

    PubMed

    O'Toole, Alice J; Jiang, Fang; Abdi, Hervé; Pénard, Nils; Dunlop, Joseph P; Parent, Marc A

    2007-11-01

    The goal of pattern-based classification of functional neuroimaging data is to link individual brain activation patterns to the experimental conditions experienced during the scans. These "brain-reading" analyses advance functional neuroimaging on three fronts. From a technical standpoint, pattern-based classifiers overcome fatal flaws in the status quo inferential and exploratory multivariate approaches by combining pattern-based analyses with a direct link to experimental variables. In theoretical terms, the results that emerge from pattern-based classifiers can offer insight into the nature of neural representations. This shifts the emphasis in functional neuroimaging studies away from localizing brain activity toward understanding how patterns of brain activity encode information. From a practical point of view, pattern-based classifiers are already well established and understood in many areas of cognitive science. These tools are familiar to many researchers and provide a quantitatively sound and qualitatively satisfying answer to most questions addressed in functional neuroimaging studies. Here, we examine the theoretical, statistical, and practical underpinnings of pattern-based classification approaches to functional neuroimaging analyses. Pattern-based classification analyses are well positioned to become the standard approach to analyzing functional neuroimaging data.

  20. Physiotherapy movement based classification approaches to low back pain: comparison of subgroups through review and developer/expert survey

    PubMed Central

    2012-01-01

    Background Several classification schemes, each with its own philosophy and categorizing method, subgroup low back pain (LBP) patients with the intent to guide treatment. Physiotherapy derived schemes usually have a movement impairment focus, but the extent to which other biological, psychological, and social factors of pain are encompassed requires exploration. Furthermore, within the prevailing 'biological' domain, the overlap of subgrouping strategies within the orthopaedic examination remains unexplored. The aim of this study was "to review and clarify through developer/expert survey, the theoretical basis and content of physical movement classification schemes, determine their relative reliability and similarities/differences, and to consider the extent of incorporation of the bio-psycho-social framework within the schemes". Methods A database search for relevant articles related to LBP and subgrouping or classification was conducted. Five dominant movement-based schemes were identified: Mechanical Diagnosis and Treatment (MDT), Treatment Based Classification (TBC), Pathoanatomic Based Classification (PBC), Movement System Impairment Classification (MSI), and O'Sullivan Classification System (OCS) schemes. Data were extracted and a survey sent to the classification scheme developers/experts to clarify operational criteria, reliability, decision-making, and converging/diverging elements between schemes. Survey results were integrated into the review and approval obtained for accuracy. Results Considerable diversity exists between schemes in how movement informs subgrouping and in the consideration of broader neurosensory, cognitive, emotional, and behavioural dimensions of LBP. Despite differences in assessment philosophy, a common element lies in their objective to identify a movement pattern related to a pain reduction strategy. Two dominant movement paradigms emerge: (i) loading strategies (MDT, TBC, PBC) aimed at eliciting a phenomenon of centralisation of

  1. Quantifying ozone uptake at the canopy level of spruce, pine and larch trees at the alpine timberline: an approach based on sap flow measurement.

    PubMed

    Wieser, G; Matyssek, R; Köstner, B; Oberhuber, W

    2003-01-01

    Micro-climatic and ambient ozone data were combined with measurements of sap flow through tree trunks in order to estimate whole-tree ozone uptake of adult Norway spruce (Picea abies), cembran pine (Pinus cembra), and European larch (Larix decidua) trees. Sap flow was monitored by means of the heat balance approach in two trees of each species during the growing season of 1998. In trees making up the stand canopy, the ozone uptake by evergreen foliage was significantly higher than that by deciduous foliage, when scaled to the ground area. However, if expressed per unit of whole-tree foliage area, ozone flux through the stomata into the needle mesophyll was 1.09, 1.18 and 1.40 nmol m(-2) s(-1) in Picea abies, Pinus cembra and Larix decidua, respectively. These fluxes are consistent with findings from measurements of needle gas exchange, published for the same species at the study site. It is concluded that the sap flow-based approach offers an inexpensive, spatially and temporally integrating way for estimating ozone uptake at the whole-tree and stand level, intrinsically covering the effect of boundary layers on ozone flux.
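
    A simplified, assumption-laden sketch of the flux calculation implied above: canopy conductance is approximated from sap-flow-derived transpiration and vapour pressure deficit (G ~ E*P/VPD, which presumes a well-coupled canopy), and stomatal ozone flux is taken as 0.61*G*[O3] using an approximate O3/H2O diffusivity ratio. All numbers are illustrative, not the study's measurements.

      # Sketch: hourly stomatal ozone uptake from sap-flow-derived transpiration.
      M_H2O = 0.018            # kg per mol of water
      D_RATIO = 0.61           # approximate D_O3 / D_H2O diffusivity ratio

      def ozone_uptake(sap_flow_kg_h, leaf_area_m2, vpd_kpa, pressure_kpa, o3_ppb):
          """Return stomatal O3 flux in nmol per m2 of foliage per second."""
          e_mol = sap_flow_kg_h / M_H2O / 3600 / leaf_area_m2   # mol H2O m-2 s-1
          g_w = e_mol * pressure_kpa / vpd_kpa                  # mol m-2 s-1 (H2O)
          g_o3 = D_RATIO * g_w                                  # mol m-2 s-1 (O3)
          return g_o3 * o3_ppb                                  # nmol m-2 s-1

      # Illustrative mid-day hour for a single tree.
      print(round(ozone_uptake(sap_flow_kg_h=4.0, leaf_area_m2=90.0,
                               vpd_kpa=1.2, pressure_kpa=80.0, o3_ppb=45.0), 2))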

  2. Automated Classification of Radiology Reports for Acute Lung Injury: Comparison of Keyword and Machine Learning Based Natural Language Processing Approaches

    PubMed Central

    Solti, Imre; Cooke, Colin R.; Xia, Fei; Wurfel, Mark M.

    2010-01-01

    This paper compares the performance of keyword and machine learning-based chest x-ray report classification for Acute Lung Injury (ALI). ALI mortality is approximately 30 percent. High mortality is, in part, a consequence of delayed manual chest x-ray classification. An automated system could reduce the time to recognize ALI and lead to reductions in mortality. For our study, 96 and 857 chest x-ray reports in two corpora were labeled by domain experts for ALI. We developed a keyword and a Maximum Entropy-based classification system. Word unigrams and character n-grams provided the features for the machine learning system. The Maximum Entropy algorithm with character 6-grams achieved the highest performance (Recall=0.91, Precision=0.90 and F-measure=0.91) on the 857-report corpus. This study has shown that for the classification of ALI chest x-ray reports, the machine learning approach is superior to the keyword-based system and achieves results comparable to the highest-performing physician annotators. PMID:21152268
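
    A minimal sketch of the machine-learning arm described above, assuming a handful of made-up report snippets and labels; logistic regression stands in for the maximum-entropy classifier (the two are equivalent for this kind of discriminative text classification) and character 6-grams are used as features, mirroring the best configuration reported.

      # Sketch: character 6-gram features + logistic regression (MaxEnt) for reports.
      from sklearn.feature_extraction.text import CountVectorizer
      from sklearn.linear_model import LogisticRegression
      from sklearn.pipeline import make_pipeline

      reports = ["bilateral airspace opacities consistent with acute lung injury",
                 "diffuse bilateral infiltrates, no cardiomegaly",
                 "clear lungs, no acute cardiopulmonary process",
                 "no focal consolidation or pleural effusion"]
      labels = [1, 1, 0, 0]                    # 1 = ALI, 0 = not ALI (made up)

      clf = make_pipeline(
          CountVectorizer(analyzer="char_wb", ngram_range=(6, 6)),  # character 6-grams
          LogisticRegression(max_iter=1000))                        # MaxEnt classifier
      clf.fit(reports, labels)

      print(clf.predict(["new bilateral opacities suggestive of lung injury"]))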

  3. An approach for automated fault diagnosis based on a fuzzy decision tree and boundary analysis of a reconstructed phase space.

    PubMed

    Aydin, Ilhan; Karakose, Mehmet; Akin, Erhan

    2014-03-01

    Although reconstructed phase space is one of the most powerful methods for analyzing a time series, it can fail in fault diagnosis of an induction motor when the appropriate pre-processing is not performed. Therefore, a new feature extraction method based on boundary analysis in phase space is proposed for the diagnosis of induction motor faults. The proposed approach requires the measurement of one phase current signal to construct the phase space representation. Each phase space is converted into an image, and the boundary of each image is extracted by a boundary detection algorithm. A fuzzy decision tree has been designed to detect broken rotor bars and broken connector faults. The results indicate that the proposed approach has a higher recognition rate than other methods on the same dataset. PMID:24296116
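
    A minimal sketch of the first two steps the abstract names (time-delay reconstruction of a phase space from a single phase-current signal, then conversion to an image whose boundary can be extracted); the fuzzy decision tree itself is not sketched, and the delay, image size and thresholds are illustrative choices.

      # Sketch: phase-space reconstruction of a current signal and rasterization.
      import numpy as np

      fs, f = 5000, 50                         # sampling rate (Hz), supply frequency
      t = np.arange(0, 0.2, 1 / fs)
      current = np.sin(2 * np.pi * f * t) + 0.05 * np.random.default_rng(5).normal(size=t.size)

      tau = int(fs / f / 4)                    # quarter-period delay (illustrative)
      x, y = current[:-tau], current[tau:]     # 2-D time-delay embedding

      # Rasterize the trajectory into a binary image; its outer boundary is the
      # feature source for downstream (here omitted) fault classification.
      bins = 64
      img, _, _ = np.histogram2d(x, y, bins=bins, range=[[-1.5, 1.5], [-1.5, 1.5]])
      binary = img > 0
      boundary = binary & ~(np.roll(binary, 1, 0) & np.roll(binary, -1, 0)
                            & np.roll(binary, 1, 1) & np.roll(binary, -1, 1))
      print("trajectory pixels:", int(binary.sum()),
            "| boundary pixels:", int(boundary.sum()))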

  4. Bayesian Ensemble Trees (BET) for Clustering and Prediction in Heterogeneous Data

    PubMed Central

    Duan, Leo L.; Clancy, John P.; Szczesniak, Rhonda D.

    2016-01-01

    We propose a novel “tree-averaging” model that utilizes an ensemble of classification and regression trees (CART). Each constituent tree is estimated with a subset of similar data. We treat this grouping of subsets as Bayesian Ensemble Trees (BET) and model them as a Dirichlet process. We show that BET determines the optimal number of trees by adapting to the data heterogeneity. Compared with other ensemble methods, BET requires far fewer trees and shows equivalent prediction accuracy using weighted averaging. Moreover, each tree in BET provides a variable selection criterion and interpretation for its subset. We developed an efficient estimation procedure with improved estimation strategies in both CART and mixture models. We demonstrate these advantages of BET with simulations and illustrate the approach with a real-world data example involving regression of lung function measurements obtained from patients with cystic fibrosis. Supplemental materials are available online. PMID:27524872
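
    The Dirichlet-process machinery of BET is not reproduced here; the sketch below only illustrates the underlying idea of averaging trees fitted to subsets of similar data, using k-means to form the subsets and equal-weight averaging of per-subset CART predictions on synthetic data. Sizes, depths and the synthetic response are illustrative.

      # Sketch: fit one CART per data subset and average their predictions.
      import numpy as np
      from sklearn.cluster import KMeans
      from sklearn.tree import DecisionTreeRegressor

      rng = np.random.default_rng(6)
      X = rng.normal(size=(300, 3))
      y = np.where(X[:, 0] > 0, 2.0, -1.0) * X[:, 1] + rng.normal(0, 0.3, 300)

      groups = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
      trees = [DecisionTreeRegressor(max_depth=3, random_state=0)
               .fit(X[groups == g], y[groups == g]) for g in range(3)]

      X_new = rng.normal(size=(5, 3))
      avg_pred = np.mean([t.predict(X_new) for t in trees], axis=0)  # tree averaging
      print(np.round(avg_pred, 2))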

  5. Computer-aided differential diagnosis of pulmonary nodules based on a hybrid classification approach

    NASA Astrophysics Data System (ADS)

    Kawata, Yoshiki; Niki, Noboru; Omatsu, Hironobu; Kusumoto, Masahiko; Kakinuma, Ryutaro; Mori, Kiyoshi; Nishiyama, Hiroyuki; Eguchi, Kenji; Kaneko, Masahiro; Moriyama, Noriyuki

    2001-07-01

    We are developing computerized feature extraction and classification methods to analyze malignant and benign pulmonary nodules in 3D thoracic CT images. Internal structure features were derived from CT density and 3D curvatures to characterize the inhomogeneity of the CT density distribution inside the nodule. In the classification step, we combined an unsupervised k-means clustering (KMC) procedure and a supervised linear discriminant (LD) classifier. The KMC procedure classified the sample nodules into two classes by using the mean CT density values for two different regions, such as a core region and the complement of the core region, in the 3D nodule image. The LD classifier was designed for each class by using internal structure features. The forward stepwise procedure was used to select the best feature subset from multi-dimensional feature spaces. The discriminant scores output from the classifier were analyzed by the receiver operating characteristic (ROC) method and the classification accuracy was quantified by the area, Ax, under the ROC curve. We analyzed a data set of 248 pulmonary nodules in this study. The hybrid classifier was more effective than the LD classifier alone in distinguishing malignant and benign nodules. The improvement was statistically significant in comparison with classification by the LD classifier alone. The results of this study indicate the potential of combining the KMC procedure and the LD classifier for computer-aided classification of pulmonary nodules.
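
    A minimal sketch of the hybrid scheme described above on synthetic nodule features: k-means first splits the cases into two classes using mean CT density of a core and non-core region, then a linear discriminant classifier is scored within each class with ROC analysis. The feature values, sizes and labels are synthetic stand-ins, not the study's data.

      # Sketch: unsupervised k-means grouping followed by per-group LDA scoring.
      import numpy as np
      from sklearn.cluster import KMeans
      from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
      from sklearn.metrics import roc_auc_score

      rng = np.random.default_rng(7)
      n = 248                                            # nodules, as in the study
      core_hu = rng.normal(40, 25, n)                    # mean CT density, core region
      rim_hu = rng.normal(-10, 25, n)                    # mean CT density, complement
      curvature = rng.normal(0, 1, n)                    # 3-D curvature feature
      malignant = (0.03 * core_hu + 0.8 * curvature + rng.normal(0, 1, n) > 1).astype(int)

      groups = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(
          np.column_stack([core_hu, rim_hu]))            # unsupervised KMC step

      X = np.column_stack([core_hu, rim_hu, curvature])
      for g in (0, 1):                                   # supervised LD step per class
          idx = groups == g
          lda = LinearDiscriminantAnalysis().fit(X[idx], malignant[idx])
          scores = lda.decision_function(X[idx])
          print(f"group {g}: ROC area = {roc_auc_score(malignant[idx], scores):.2f}")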

  6. Artificial Root Exudate System (ARES): a field approach to simulate tree root exudation in soils

    NASA Astrophysics Data System (ADS)

    Lopez-Sangil, Luis; Estradera-Gumbau, Eduard; George, Charles; Sayer, Emma

    2016-04-01

    The exudation of labile solutes by fine roots represents an important strategy for plants to promote soil nutrient availability in terrestrial ecosystems. Compounds exuded by roots (mainly sugars, carboxylic and amino acids) provide energy to soil microbes, thus priming the mineralization of soil organic matter (SOM) and the consequent release of inorganic nutrients into the rhizosphere. Studies in several forest ecosystems suggest that tree root exudates represent 1 to 10% of the total photoassimilated C, with exudation rates increasing markedly under elevated CO2 scenarios. Despite their importance in ecosystem functioning, we know little about how tree root exudation affects soil carbon dynamics in situ. This is mainly because there has been no viable method to experimentally control inputs of root exudates at field scale. Here, we present a method to apply artificial root exudates below the soil surface in small field plots. The artificial root exudate system (ARES) consists of a water container with a mixture of labile carbon solutes (mimicking tree root exudate rates and composition), which feeds a system of drip-tips covering an area of 1 m2. The tips are evenly distributed every 20 cm and inserted 4 cm into the soil with minimal disturbance. The system is regulated by a mechanical timer, such that artificial root exudate solution can be applied at frequent, regular daily intervals. We tested ARES from April to September 2015 (growing season) within a leaf-litter manipulation experiment ongoing in temperate deciduous woodland in the UK. Soil respiration was measured monthly, and soil samples were taken at the end of the growing season for PLFA, enzymatic activity and nutrient analyses. First results show a very rapid mineralization of the root exudate compounds and, interestingly, long-term increases in SOM respiration, with negligible effects on soil moisture levels. Large positive priming effects (2.5-fold increase in soil respiration during the growing

  7. Cuckoo search optimisation for feature selection in cancer classification: a new approach.

    PubMed

    Gunavathi, C; Premalatha, K

    2015-01-01

    The Cuckoo Search (CS) optimisation algorithm is used for feature selection in cancer classification using microarray gene expression data. Since gene expression data have thousands of genes and a small number of samples, feature selection methods can be used for the selection of informative genes to improve the classification accuracy. Initially, the genes are ranked based on T-statistics, Signal-to-Noise Ratio (SNR) and F-statistics values. The CS is used to find the informative genes from the top-m ranked genes. The classification accuracy of the k-Nearest Neighbour (kNN) technique is used as the fitness function for CS. The proposed method was evaluated and analysed on ten different cancer gene expression datasets. The results show that the CS gives 100% average accuracy for the DLBCL Harvard, Lung Michigan, Ovarian Cancer, AML-ALL and Lung Harvard2 datasets and that it outperforms the existing techniques on the DLBCL outcome and prostate datasets. PMID:26547979
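
    The Levy-flight mechanics of Cuckoo Search are not reproduced here; the sketch below keeps the surrounding structure the abstract describes (rank genes by a univariate statistic, then search for a small informative subset using cross-validated kNN accuracy as the fitness), with a simple random subset search standing in for CS. The data are synthetic and the subset size is an illustrative choice.

      # Sketch: rank genes by t-statistic, then search subsets with kNN fitness.
      import numpy as np
      from scipy.stats import ttest_ind
      from sklearn.model_selection import cross_val_score
      from sklearn.neighbors import KNeighborsClassifier

      rng = np.random.default_rng(8)
      X = rng.normal(size=(60, 2000))          # 60 samples, 2000 genes (synthetic)
      y = np.r_[np.zeros(30), np.ones(30)].astype(int)
      X[y == 1, :15] += 1.0                    # a few truly informative genes

      t, _ = ttest_ind(X[y == 0], X[y == 1], axis=0)
      top_m = np.argsort(-np.abs(t))[:100]     # top-m ranked genes

      def fitness(genes):
          knn = KNeighborsClassifier(n_neighbors=3)
          return cross_val_score(knn, X[:, genes], y, cv=5).mean()

      best, best_fit = None, -1.0
      for _ in range(50):                      # random moves standing in for CS
          cand = rng.choice(top_m, size=10, replace=False)
          score = fitness(cand)
          if score > best_fit:
              best, best_fit = cand, score
      print("best subset accuracy:", round(best_fit, 3))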

  9. An approach to classification and capacitance expressions in electrochemical capacitors technology.

    PubMed

    Roldán, Silvia; Barreda, Daniel; Granda, Marcos; Menéndez, Rosa; Santamaría, Ricardo; Blanco, Clara

    2015-01-14

    The proliferation of novel types and designs of electrochemical capacitors makes it necessary to obtain a better understanding of the behavior of these systems together with a more systematic classification of them. In this study a rational classification of supercapacitors based on the charge storage mechanism and the active material of each electrode is proposed. The internationally accepted terminology - the terms symmetric, asymmetric and hybrid - is also clarified in an attempt to standardize the current definitions and facilitate the systematic classification of each device. Additionally, the selection of suitable mathematical expressions to calculate the capacitance of each kind of system is rationalized throughout the discussion taking into account the behavioral characteristics of each electrode. An examination of the potential evolution profile of each electrode during the galvanostatic cycling of the supercapacitor is presented as a key tool for understanding the fundamental behavior of these devices.
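
    The paper's own expressions are not reproduced here; the sketch below only works through the textbook relations commonly used with galvanostatic data (cell capacitance C = I/(dV/dt); for two electrodes in series 1/C_cell = 1/C+ + 1/C-, so an ideal symmetric device has C_electrode = 2*C_cell and a specific capacitance of 4*C_cell per total active mass). The numerical values are illustrative.

      # Sketch: textbook capacitance relations for a two-electrode supercapacitor.
      def cell_capacitance(current_a, dv_dt_v_s):
          """C_cell = I / (dV/dt) from a galvanostatic discharge slope (farads)."""
          return current_a / dv_dt_v_s

      def symmetric_electrode_values(c_cell_f, total_mass_g):
          """For identical electrodes in series: C_el = 2*C_cell, C_sp = 4*C_cell/m."""
          c_electrode = 2 * c_cell_f
          c_specific = 4 * c_cell_f / total_mass_g     # F per gram of active material
          return c_electrode, c_specific

      c_cell = cell_capacitance(current_a=0.010, dv_dt_v_s=0.005)    # 2.0 F
      print(symmetric_electrode_values(c_cell, total_mass_g=0.020))  # (4.0, 400.0)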

  10. Biomedical literature classification using encyclopedic knowledge: a Wikipedia-based bag-of-concepts approach

    PubMed Central

    Pérez Rodríguez, Roberto; Anido Rifón, Luis E.

    2015-01-01

    Automatic classification of text documents into a set of categories has many applications. Among those applications, the automatic classification of biomedical literature stands out as an important application for automatic document classification strategies. Biomedical staff and researchers have to deal with a lot of literature in their daily activities, so a system that allows them to access documents of interest in a simple and effective way would be useful; thus, these documents need to be sorted according to some criteria—that is to say, they have to be classified. Documents to classify are usually represented following the bag-of-words (BoW) paradigm. Features are words in the text—thus suffering from synonymy and polysemy—and their weights are just based on their frequency of occurrence. This paper presents an empirical study of the efficiency of a classifier that leverages encyclopedic background knowledge—concretely Wikipedia—in order to create bag-of-concepts (BoC) representations of documents, understanding concept as “unit of meaning”, and thus tackling synonymy and polysemy. In addition, the weighting of concepts is based on their semantic relevance in the text. For the evaluation of the proposal, empirical experiments have been conducted with one of the commonly used corpora for evaluating classification and retrieval of biomedical information, OHSUMED, and also with a purpose-built corpus of MEDLINE biomedical abstracts, UVigoMED. Results obtained show that the Wikipedia-based bag-of-concepts representation outperforms the classical bag-of-words representation by up to 157% in the single-label classification problem and up to 100% in the multi-label problem for the OHSUMED corpus, and by up to 122% in the single-label classification problem and up to 155% in the multi-label problem for the UVigoMED corpus. PMID:26468436
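
    A minimal sketch of the bag-of-words versus bag-of-concepts contrast described above, assuming a tiny hand-made synonym-to-concept mapping in place of Wikipedia-derived knowledge; the semantic-relevance weighting used in the paper is not reproduced, and the matching is deliberately naive.

      # Sketch: bag-of-words vs bag-of-concepts document representations.
      from collections import Counter

      # Hypothetical concept dictionary (stand-in for Wikipedia-derived knowledge):
      # different surface forms map to one "unit of meaning".
      concept_of = {"heart attack": "myocardial infarction",
                    "myocardial infarction": "myocardial infarction",
                    "mi": "myocardial infarction",
                    "aspirin": "acetylsalicylic acid",
                    "acetylsalicylic acid": "acetylsalicylic acid"}

      doc = "aspirin after a heart attack: acetylsalicylic acid reduces repeat MI risk"

      bow = Counter(doc.lower().replace(":", "").split())      # words as features

      boc = Counter()                                          # concepts as features
      text = doc.lower()
      for phrase, concept in concept_of.items():
          boc[concept] += text.count(phrase)                   # naive substring match

      print("bag of words   :", dict(bow))
      print("bag of concepts:", dict(boc))                     # synonyms are merged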

  11. A practical approach to accurate classification and staging of mycosis fungoides and Sézary syndrome.

    PubMed

    Thomas, Bjorn Rhys; Whittaker, Sean

    2012-12-01

    Cutaneous T-cell lymphomas are rare, distinct forms of non-Hodgkin's lymphoma, of which mycosis fungoides (MF) and Sézary syndrome (SS) are two of the most common. Careful, clear classification and staging of these lymphomas allow dermatologists to commence appropriate therapy and enable correct prognostic stratification for affected patients. Of note, patients with more advanced disease will require multi-disciplinary input in determining specialist therapy. The literature has been summarized into an outline for the classification and staging of MF and SS with the aim of providing clinical dermatologists with a concise review.

  12. An approach toward a combined scheme for the petrographic classification of fly ash: Revision and clarification

    USGS Publications Warehouse

    Hower, J.C.; Suarez-Ruiz, I.; Mastalerz, Maria

    2005-01-01

    Hower and Mastalerz's classification scheme for fly ash is modified to make it more widely acceptable. First, proper consideration is given to the potential role of anthracite in the development of isotropic and anisotropic chars. Second, the role of low-reflectance inertinite in producing vesicular chars is noted. It is shown that noncoal components in the fuel can produce chars that stretch the limits of the classification. With care, it is possible to classify certain biomass chars as being distinct from coal-derived chars.

  13. A Characteristics-Based Approach to Radioactive Waste Classification in Advanced Nuclear Fuel Cycles

    NASA Astrophysics Data System (ADS)

    Djokic, Denia

    The radioactive waste classification system currently used in the United States primarily relies on a source-based framework. This has led to numerous issues, such as wastes that are not categorized by their intrinsic risk, or wastes that do not fall under a category within the framework and therefore are without a legal imperative for responsible management. Furthermore, in the possible case that advanced fuel cycles were to be deployed in the United States, the shortcomings of the source-based classification system would be exacerbated: advanced fuel cycles implement processes such as the separation of used nuclear fuel, which introduce new waste streams of varying characteristics. To be able to manage and dispose of these potential new wastes properly, development of a classification system that would assign an appropriate level of management to each type of waste based on its physical properties is imperative. This dissertation explores how characteristics from wastes generated from potential future nuclear fuel cycles could be coupled with a characteristics-based classification framework. A static mass flow model developed under the Department of Energy's Fuel Cycle Research & Development program, called the Fuel-cycle Integration and Tradeoffs (FIT) model, was used to calculate the composition of waste streams resulting from different nuclear fuel cycle choices: two modified open fuel cycle cases (recycle in MOX reactor) and two different continuous-recycle fast reactor cases (oxide and metal fuel fast reactors). This analysis focuses on the impact of waste heat load on waste classification practices, although future work could involve coupling waste heat load with metrics of radiotoxicity and longevity. The value of separation of heat-generating fission products and actinides in different fuel cycles and how it could inform long- and short-term disposal management is discussed. It is shown that the benefits of reducing the short-term fission

  14. Ecological status classification of the Taizi River Basin, China: a comparison of integrated risk assessment approaches.

    PubMed

    Fan, Juntao; Semenzin, Elena; Meng, Wei; Giubilato, Elisa; Zhang, Yuan; Critto, Andrea; Zabeo, Alex; Zhou, Yun; Ding, Sen; Wan, Jun; He, Mengchang; Lin, Chunye

    2015-10-01

    Integrated risk assessment approaches allow a sound evaluation of the ecological status of river basins and provide knowledge about the likely causes of impairment, useful for informing and supporting the decision-making process. In this paper, the integrated risk assessment (IRA) methodology developed in the EU MODELKEY project (and implemented in the MODELKEY Decision Support System) is applied to the Taizi River (China), in order to assess its Ecological and Chemical Status according to EU Water Framework Directive (WFD) requirements. The available dataset is derived from an extensive survey carried out in 2009 and 2010 across the Taizi River catchment, including the monitoring of physico-chemical (i.e. DO, EC, NH3-N, chemical oxygen demand (COD), 5-day biological oxygen demand (BOD5) and TP), chemical (i.e. polycyclic aromatic hydrocarbons (PAHs) and metals), biological (i.e. macroinvertebrates, fish, and algae), and hydromorphological parameters (i.e. water quantity, channel change and morphology diversity). The results show a negative trend in the ecological status from the highland to the lowland of the Taizi River Basin. Organic pollution from agriculture and domestic sources (i.e. COD and BOD5), an unstable hydrological regime (i.e. water quantity shortage) and chemical pollutants from industry (i.e. PAHs and metals) are found to be the main stressors impacting the ecological status of the Taizi River Basin. The comparison between the results of the IRA methodology and those of a previous study (Leigh et al. 2012) indicates that the selection of indicators and integrating methodologies can have a relevant impact on the classification of the ecological status. The IRA methodology, which integrates information from the five lines of evidence (i.e., biology, physico-chemistry, chemistry, ecotoxicology and hydromorphology) required by the WFD, allows better identification of the biological communities that are potentially at risk and the stressors that are most