Science.gov

Sample records for decision tree induction

  1. Finding the right decision tree's induction strategy for a hard real world problem.

    PubMed

    Zorman, M; Podgorelec, V; Kokol, P; Peterson, M; Sprogar, M; Ojstersek, M

    2001-09-01

    Decision trees have already been used successfully in medicine, but as in traditional statistics, some hard real-world problems cannot be solved successfully using the traditional way of induction. In our experiments we tested various methods for building univariate decision trees in order to find the best induction strategy. On a hard real-world problem of the Orthopaedic fracture data with 2637 cases, described by 23 attributes and a decision with three possible values, we built decision trees with four classical approaches, one hybrid approach where we combined neural networks and decision trees, and with an evolutionary approach. The results show that all approaches had problems with either accuracy, sensitivity, or decision tree size. The comparison shows that the best compromise when building decision trees for hard real-world problems is the evolutionary approach. PMID:11518670

  2. Automatic design of decision-tree induction algorithms tailored to flexible-receptor docking data

    PubMed Central

    2012-01-01

    Background: This paper addresses the prediction of the free energy of binding of a drug candidate with enzyme InhA associated with Mycobacterium tuberculosis. This problem is found within rational drug design, where interactions between drug candidates and target proteins are verified through molecular docking simulations. In this application, it is important not only to correctly predict the free energy of binding, but also to provide a comprehensible model that could be validated by a domain specialist. Decision-tree induction algorithms have been successfully used in drug-design related applications, especially considering that decision trees are simple to understand, interpret, and validate. There are several decision-tree induction algorithms available for general use, but each one has a bias that makes it more suitable for a particular data distribution. In this article, we propose and investigate the automatic design of decision-tree induction algorithms tailored to particular drug-enzyme binding data sets. We investigate the performance of our new method for evaluating binding conformations of different drug candidates to InhA, and we analyze our findings with respect to decision tree accuracy, comprehensibility, and biological relevance. Results: The empirical analysis indicates that our method is capable of automatically generating decision-tree induction algorithms that significantly outperform the traditional C4.5 algorithm with respect to both accuracy and comprehensibility. In addition, we provide the biological interpretation of the rules generated by our approach, reinforcing the importance of comprehensible predictive models in this particular bioinformatics application. Conclusions: We conclude that automatically designing a decision-tree algorithm tailored to molecular docking data is a promising alternative for the prediction of the free energy from the binding of a drug candidate with a flexible receptor. PMID:23171000

  3. Decision Tree based Prediction and Rule Induction for Groundwater Trichloroethene (TCE) Pollution Vulnerability

    NASA Astrophysics Data System (ADS)

    Park, J.; Yoo, K.

    2013-12-01

    For groundwater resource conservation, it is important to accurately assess groundwater pollution sensitivity or vulnerability. In this work, we attempted to use a data mining approach to assess groundwater pollution vulnerability in a TCE (trichloroethylene) contaminated Korean industrial site. The conventional DRASTIC method failed to describe the TCE sensitivity data, showing a poor correlation with hydrogeological properties. Among the different data mining methods, such as Artificial Neural Network (ANN), Multiple Logistic Regression (MLR), Case Base Reasoning (CBR), and Decision Tree (DT), the accuracy and consistency of the Decision Tree (DT) were the best. According to the subsequent tree analyses with the optimal DT model, the failure of the conventional DRASTIC method to fit the TCE sensitivity data may be due to the use of inaccurate weight values of hydrogeological parameters for the study site. These findings provide a proof of concept that a DT-based data mining approach can be used for prediction and rule induction of groundwater TCE sensitivity without pre-existing information on weights of hydrogeological properties.

  4. Lazy decision trees

    SciTech Connect

    Friedman, J.H.; Yun, Yeogirl; Kohavi, R.

    1996-12-31

    Lazy learning algorithms, exemplified by nearest-neighbor algorithms, do not induce a concise hypothesis from a given training set; the inductive process is delayed until a test instance is given. Algorithms for constructing decision trees, such as C4.5, ID3, and CART, create a single "best" decision tree during the training phase, and this tree is then used to classify test instances. The tests at the nodes of the constructed tree are good on average, but there may be better tests for classifying a specific instance. We propose a lazy decision tree algorithm, LazyDT, that conceptually constructs the "best" decision tree for each test instance. In practice, only a path needs to be constructed, and a caching scheme makes the algorithm fast. The algorithm is robust with respect to missing values without resorting to the complicated methods usually seen in induction of decision trees. Experiments on real and artificial problems are presented.

  5. Decision-Tree-based data mining and rule induction for predicting and mapping soil bacterial diversity.

    PubMed

    Kim, Kangsuk; Yoo, Keunje; Ki, Dongwon; Son, Il Suh; Oh, Kyong Joo; Park, Joonhong

    2011-07-01

    Soil microbial ecology plays a significant role in global ecosystems. Nevertheless, methods of model prediction and mapping have yet to be established for soil microbial ecology. The present study was undertaken to develop an artificial-intelligence- and geographical information system (GIS)-integrated framework for predicting and mapping soil bacterial diversity using pre-existing environmental geospatial database information, and to further evaluate the applicability of soil bacterial diversity mapping for planning construction of eco-friendly roads. Using stratified random sampling, soil bacterial diversity was measured in 196 soil samples in a forest area where construction of an eco-friendly road was planned. Model accuracy, coherence analyses, and tree analysis were systematically performed, and a four-class discretized decision tree (DT) with ordinary pair-wise partitioning (OPP) was selected as the optimal model among the five DT model variants tested. GIS-based simulations of the optimal DT model with varying weights assigned to soil ecological quality showed that the inclusion of soil ecology in environmental components, which are considered in environmental impact assessment, significantly affects the spatial distributions of overall environmental quality values as well as the determination of an environmentally optimized road route. This work suggests a guideline for using systematic accuracy, coherence, and tree analyses in selecting an optimal DT model from multiple candidate model variants, and demonstrates the applicability of the OPP-improved DT integrated with GIS in rule induction for mapping bacterial diversity. These findings also have implications for the significance of soil microbial ecology in environmental impact assessment and eco-friendly construction planning. PMID:21072585

  6. Decision-tree and rule-induction approach to integration of remotely sensed and GIS data in mapping vegetation in disturbed or hilly environments

    NASA Astrophysics Data System (ADS)

    Lees, Brian G.; Ritman, Kim

    1991-11-01

    The integration of Landsat TM and environmental GIS data sets using artificial intelligence rule-induction and decision-tree analysis is shown to facilitate the production of vegetation maps with both floristic and structural information. This technique is particularly suited to vegetation mapping in disturbed or hilly environments that are unsuited to either conventional remote sensing methods or GIS modeling using environmental data bases.

  7. Human decision error (HUMDEE) trees

    SciTech Connect

    Ostrom, L.T.

    1993-08-01

    Graphical presentations of human actions in incident and accident sequences have been used for many years. However, for the most part, human decision making has been underrepresented in these trees. This paper presents a method of incorporating the human decision process into graphical presentations of incident/accident sequences. This presentation is in the form of logic trees. These trees are called Human Decision Error Trees or HUMDEE for short. The primary benefit of HUMDEE trees is that they graphically illustrate what else the individuals involved in the event could have done to prevent either the initiation or continuation of the event. HUMDEE trees also present the alternate paths available at the operator decision points in the incident/accident sequence. This is different from the Technique for Human Error Rate Prediction (THERP) event trees. There are many uses of these trees. They can be used for incident/accident investigations to show what other courses of action were available and for training operators. The trees also have a consequence component, so that not only the decision but also the consequence of that decision can be explored.

  8. Decision-Tree Program

    NASA Technical Reports Server (NTRS)

    Buntine, Wray

    1994-01-01

    IND computer program introduces Bayesian and minimum-message-length (MML) methods and more-sophisticated methods of searching in growing trees. Produces more-accurate class-probability estimates important in applications like diagnosis. Provides range of features and styles with convenience for casual user, fine-tuning for advanced user or for those interested in research. Consists of four basic kinds of routines: data-manipulation, tree-generation, tree-testing, and tree-display. Written in C language.

  9. Creating ensembles of decision trees through sampling

    SciTech Connect

    Kamath, C; Cantu-Paz, E

    2001-02-02

    Recent work in classification indicates that significant improvements in accuracy can be obtained by growing an ensemble of classifiers and having them vote for the most popular class. This paper focuses on ensembles of decision trees that are created with a randomized procedure based on sampling. Randomization can be introduced by using random samples of the training data (as in bagging or arcing) and running a conventional tree-building algorithm, or by randomizing the induction algorithm itself. The objective of this paper is to describe our first experiences with a novel randomized tree induction method that uses a subset of samples at a node to determine the split. Our empirical results show that ensembles generated using this approach yield results that are competitive in accuracy and superior in computational cost.
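
The idea of choosing a split from only a subset of the samples at a node can be sketched as follows. This is a minimal illustration, not the authors' implementation: the Gini criterion, the midpoint thresholds, and the names `gini` and `sampled_best_split` are assumptions made for the example.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def sampled_best_split(rows, labels, sample_size, rng):
    """Pick (impurity, feature, threshold) using a random subset of the node's instances.

    Hypothetical helper: the criterion and threshold scheme are illustrative choices.
    """
    idx = list(range(len(rows)))
    if len(idx) > sample_size:
        idx = rng.sample(idx, sample_size)  # evaluate splits on the sample only
    best = None
    for f in range(len(rows[0])):
        values = sorted({rows[i][f] for i in idx})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [labels[i] for i in idx if rows[i][f] <= t]
            right = [labels[i] for i in idx if rows[i][f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(idx)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best

rows = [[1.0, 7.0], [2.0, 3.0], [3.0, 8.0], [8.0, 2.0], [9.0, 9.0], [10.0, 1.0]]
labels = ["a", "a", "a", "b", "b", "b"]
split = sampled_best_split(rows, labels, sample_size=6, rng=random.Random(0))
print(split)
```

Because the cost of evaluating candidate splits grows with the number of instances examined, a smaller `sample_size` trades some split quality for speed, and the randomness also diversifies the trees in an ensemble.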

  10. Creating Ensembles of Decision Trees Through Sampling

    SciTech Connect

    Kamath, C; Cantu-Paz, E

    2001-07-26

    Recent work in classification indicates that significant improvements in accuracy can be obtained by growing an ensemble of classifiers and having them vote for the most popular class. This paper focuses on ensembles of decision trees that are created with a randomized procedure based on sampling. Randomization can be introduced by using random samples of the training data (as in bagging or boosting) and running a conventional tree-building algorithm, or by randomizing the induction algorithm itself. The objective of this paper is to describe the first experiences with a novel randomized tree induction method that uses a sub-sample of instances at a node to determine the split. The empirical results show that ensembles generated using this approach yield results that are competitive in accuracy and superior in computational cost to boosting and bagging.

  11. Decision tree modeling using R.

    PubMed

    Zhang, Zhongheng

    2016-08-01

    In the machine learning field, decision tree learners are powerful and easy to interpret. A decision tree employs a recursive binary partitioning algorithm that splits the sample on the partitioning variable with the strongest association with the response variable. The process continues until some stopping criteria are met. In the example I focus on the conditional inference tree, which incorporates tree-structured regression models into conditional inference procedures. Because a single tree is sensitive to small changes in the training data, the random forests procedure is introduced to address this problem. The sources of diversity for random forests come from the random sampling and the restricted set of input variables to be selected. Finally, I introduce R functions to perform model-based recursive partitioning. This method incorporates recursive partitioning into conventional parametric model building. PMID:27570769
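
Recursive binary partitioning with stopping criteria can be sketched in a few lines. The example below is written in Python rather than R and uses Gini impurity to choose splits instead of the association tests of conditional inference trees; `grow`, `predict`, `max_depth`, and `min_size` are hypothetical names chosen for illustration.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def grow(rows, labels, depth=0, max_depth=3, min_size=2):
    """Recursively split the sample until a stopping criterion is met."""
    majority = Counter(labels).most_common(1)[0][0]
    # Stopping criteria: depth limit, too few cases, or a pure node.
    if depth >= max_depth or len(labels) < min_size or len(set(labels)) == 1:
        return majority
    best = None
    for f in range(len(rows[0])):
        # Skipping the largest value keeps both sides of every split non-empty.
        for t in sorted({r[f] for r in rows})[:-1]:
            left = [i for i, r in enumerate(rows) if r[f] <= t]
            right = [i for i, r in enumerate(rows) if r[f] > t]
            score = (len(left) * gini([labels[i] for i in left])
                     + len(right) * gini([labels[i] for i in right]))
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:  # no feature has two distinct values
        return majority
    _, f, t, left, right = best
    return {
        "feature": f,
        "threshold": t,
        "left": grow([rows[i] for i in left], [labels[i] for i in left],
                     depth + 1, max_depth, min_size),
        "right": grow([rows[i] for i in right], [labels[i] for i in right],
                      depth + 1, max_depth, min_size),
    }

def predict(tree, row):
    """Follow the splits from the root until a leaf label is reached."""
    while isinstance(tree, dict):
        tree = tree["left"] if row[tree["feature"]] <= tree["threshold"] else tree["right"]
    return tree

rows = [[1], [2], [3], [4]]
labels = ["lo", "lo", "hi", "hi"]
tree = grow(rows, labels)
print(tree, predict(tree, [1.5]))
```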

  12. Decision tree modeling using R

    PubMed Central

    2016-01-01

    In the machine learning field, decision tree learners are powerful and easy to interpret. A decision tree employs a recursive binary partitioning algorithm that splits the sample on the partitioning variable with the strongest association with the response variable. The process continues until some stopping criteria are met. In the example I focus on the conditional inference tree, which incorporates tree-structured regression models into conditional inference procedures. Because a single tree is sensitive to small changes in the training data, the random forests procedure is introduced to address this problem. The sources of diversity for random forests come from the random sampling and the restricted set of input variables to be selected. Finally, I introduce R functions to perform model-based recursive partitioning. This method incorporates recursive partitioning into conventional parametric model building. PMID:27570769

  13. Evolutionary induction of sparse neural trees

    PubMed

    Zhang; Ohm; Muhlenbein

    1997-01-01

    This paper is concerned with the automatic induction of parsimonious neural networks. In contrast to other program induction situations, network induction entails parametric learning as well as structural adaptation. We present a novel representation scheme called neural trees that allows efficient learning of both network architectures and parameters by genetic search. A hybrid evolutionary method is developed for neural tree induction that combines genetic programming and the breeder genetic algorithm under the unified framework of the minimum description length principle. The method is successfully applied to the induction of higher order neural trees while still keeping the resulting structures sparse to ensure good generalization performance. Empirical results are provided on two chaotic time series prediction problems of practical interest. PMID:10021759

  14. Visualization method and tool for interactive learning of large decision trees

    NASA Astrophysics Data System (ADS)

    Nguyen, Trong Dung; Ho, TuBao

    2002-03-01

    When learning from large datasets, decision tree induction programs often produce very large trees. How to visualize trees efficiently during the learning process, particularly large trees, is still an open question and requires efficient tools. This paper presents a visualization method and tool for interactive learning of large decision trees that includes a new visualization technique called T2.5D (which stands for Trees 2.5 Dimensions). After a brief discussion of requirements for tree visualizers and related work, the paper focuses on techniques for two issues: (1) how to visualize large decision trees efficiently; and (2) how to visualize decision trees in the learning process.

  15. Bayesian Evidence Framework for Decision Tree Learning

    NASA Astrophysics Data System (ADS)

    Chatpatanasiri, Ratthachat; Kijsirikul, Boonserm

    2005-11-01

    This work is primarily interested in the problem of selecting a single decision (or classification) tree given the observed data. Although a single decision tree has a high risk of overfitting, the induced tree is easily interpreted. Researchers have invented various methods, such as tree pruning or tree averaging, for preventing the induced tree from overfitting (and from underfitting) the data. In this paper, instead of using those conventional approaches, we apply the Bayesian evidence framework of Gull, Skilling and MacKay to the process of selecting a decision tree. We derive a formal function to measure "the fitness" of each decision tree given a set of observed data. Our method, in fact, is analogous to a well-known Bayesian model selection method for interpolating noisy continuous-value data. As in regression problems, given reasonable assumptions, this derived score function automatically quantifies the principle of Ockham's razor, and hence reasonably deals with the underfitting-overfitting tradeoff.

  16. Creating ensembles of decision trees through sampling

    DOEpatents

    Kamath, Chandrika; Cantu-Paz, Erick

    2005-08-30

    A system for decision tree ensembles that includes a module to read the data, a module to sort the data, a module to evaluate a potential split of the data according to some criterion using a random sample of the data, a module to split the data, and a module to combine multiple decision trees in ensembles. The decision tree method is based on statistical sampling techniques and includes the steps of reading the data; sorting the data; evaluating a potential split according to some criterion using a random sample of the data, splitting the data, and combining multiple decision trees in ensembles.

  17. Weighted Hybrid Decision Tree Model for Random Forest Classifier

    NASA Astrophysics Data System (ADS)

    Kulkarni, Vrushali Y.; Sinha, Pradeep K.; Petare, Manisha C.

    2016-06-01

    Random Forest is an ensemble, supervised machine learning algorithm. An ensemble generates many classifiers and combines their results by majority voting. Random forest uses a decision tree as the base classifier. In decision tree induction, an attribute split/evaluation measure is used to decide the best split at each node of the decision tree. The generalization error of a forest of tree classifiers depends on the strength of the individual trees in the forest and the correlation among them. The work presented in this paper is related to attribute split measures and is a two-step process. In the first step, a theoretical study of five selected split measures is done and a comparison matrix is generated to understand the pros and cons of each measure. These theoretical results are verified by empirical analysis, in which a random forest is generated using each of the five selected split measures, one at a time (i.e., a random forest using information gain, a random forest using gain ratio, etc.). In the next step, based on this theoretical and empirical analysis, a new hybrid decision tree model for the random forest classifier is proposed. In this model, the individual decision trees in the Random Forest are generated using different split measures. This model is augmented by weighted voting based on the strength of each individual tree. The new approach has shown a notable increase in the accuracy of random forest.
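
Two of the split measures named above, information gain and gain ratio, together with a weighted vote, can be sketched as follows. This is an illustrative sketch only: the abstract does not say how tree strength is turned into a weight, so the weights passed to the hypothetical `weighted_vote` helper are assumed to be something like per-tree validation accuracies.

```python
import math
from collections import Counter

def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(items)
    return -sum((c / n) * math.log2(c / n) for c in Counter(items).values())

def information_gain(values, labels):
    """Reduction in label entropy after splitting on a categorical attribute."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def gain_ratio(values, labels):
    """Information gain normalized by the entropy of the attribute itself."""
    split_info = entropy(values)
    if split_info == 0:  # attribute takes a single value, so no split is possible
        return 0.0
    return information_gain(values, labels) / split_info

def weighted_vote(predictions, weights):
    """Return the class with the largest total weight (weights are assumed inputs)."""
    totals = {}
    for p, w in zip(predictions, weights):
        totals[p] = totals.get(p, 0.0) + w
    return max(totals, key=totals.get)

attribute = ["x", "x", "y", "y"]
target = ["p", "p", "n", "n"]
print(information_gain(attribute, target), gain_ratio(attribute, target))
print(weighted_vote(["a", "b", "b"], [0.9, 0.3, 0.4]))
```

In the last call a plain majority vote would choose "b", while the weighted vote chooses "a" because the single tree voting for it carries more weight than the other two combined.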

  18. From Family Trees to Decision Trees.

    ERIC Educational Resources Information Center

    Trobian, Helen R.

    This paper is a preliminary inquiry by a non-mathematician into graphic methods of sequential planning and ways in which hierarchical analysis and tree structures can be helpful in developing interest in the use of mathematical modeling in the search for creative solutions to real-life problems. Highlights include a discussion of hierarchical…

  19. Decision Tree Technique for Particle Identification

    SciTech Connect

    Quiller, Ryan

    2003-09-05

    Particle identification based on measurements such as the Cerenkov angle, momentum, and the rate of energy loss per unit distance (-dE/dx) is fundamental to the BaBar detector for particle physics experiments. It is particularly important to separate the charged forms of kaons and pions. Currently, the Neural Net, an algorithm based on mapping input variables to an output variable using hidden variables as intermediaries, is one of the primary tools used for identification. In this study, a decision tree classification technique implemented in the computer program, CART, was investigated and compared to the Neural Net over the range of momenta, 0.25 GeV/c to 5.0 GeV/c. For a given subinterval of momentum, three decision trees were made using different sets of input variables. The sensitivity and specificity were calculated for varying kaon acceptance thresholds. This data was used to plot Receiver Operating Characteristic curves (ROC curves) to compare the performance of the classification methods. Also, input variables used in constructing the decision trees were analyzed. It was found that the Neural Net was a significant contributor to decision trees using dE/dx and the Cerenkov angle as inputs. Furthermore, the Neural Net had poorer performance than the decision tree technique, but tended to improve decision tree performance when used as an input variable. These results suggest that the decision tree technique using Neural Net input may possibly increase accuracy of particle identification in BaBar.
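
Computing sensitivity and specificity over a range of acceptance thresholds, as needed for a ROC curve, can be sketched like this. The scores and labels are made-up toy values, not BaBar data, and `roc_points` is a hypothetical helper that assumes both classes are present.

```python
def roc_points(scores, is_kaon, thresholds):
    """Return (threshold, sensitivity, specificity) for each acceptance threshold.

    A track is accepted as a kaon when its classifier score is >= the threshold.
    Assumes at least one true kaon and one true non-kaon in the input.
    """
    points = []
    for t in thresholds:
        tp = sum(1 for s, k in zip(scores, is_kaon) if s >= t and k)
        fn = sum(1 for s, k in zip(scores, is_kaon) if s < t and k)
        tn = sum(1 for s, k in zip(scores, is_kaon) if s < t and not k)
        fp = sum(1 for s, k in zip(scores, is_kaon) if s >= t and not k)
        sensitivity = tp / (tp + fn)  # fraction of kaons accepted
        specificity = tn / (tn + fp)  # fraction of non-kaons rejected
        points.append((t, sensitivity, specificity))
    return points

scores = [0.9, 0.8, 0.4, 0.7, 0.2, 0.1]            # toy classifier outputs
is_kaon = [True, True, True, False, False, False]  # toy truth labels
points = roc_points(scores, is_kaon, [0.3, 0.5, 0.75])
for t, sens, spec in points:
    print(t, sens, spec)
```

Plotting sensitivity against 1 - specificity for each threshold gives the ROC curve used to compare classifiers.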

  20. Classification based on full decision trees

    NASA Astrophysics Data System (ADS)

    Genrikhov, I. E.; Djukova, E. V.

    2012-04-01

    The ideas underlying a series of the authors' studies dealing with the design of classification algorithms based on full decision trees are further developed. It is shown that the decision tree construction under consideration takes into account all the features satisfying a branching criterion. Full decision trees with an entropy branching criterion are studied as applied to precedent-based pattern recognition problems with real-valued data. Recognition procedures are constructed for solving problems with incomplete data (gaps in the feature descriptions of the objects) in the case when the learning objects are nonuniformly distributed over the classes. The authors' basic results previously obtained in this area are overviewed.

  1. The decision tree approach to classification

    NASA Technical Reports Server (NTRS)

    Wu, C.; Landgrebe, D. A.; Swain, P. H.

    1975-01-01

    A class of multistage decision tree classifiers is proposed and studied relative to the classification of multispectral remotely sensed data. The decision tree classifiers are shown to have the potential for improving both the classification accuracy and the computation efficiency. Dimensionality in pattern recognition is discussed and two theorems on the lower bound of logic computation for multiclass classification are derived. The automatic or optimization approach is emphasized. Experimental results on real data are reported, which clearly demonstrate the usefulness of decision tree classifiers.

  2. Comprehensive Decision Tree Models in Bioinformatics

    PubMed Central

    Stiglic, Gregor; Kocbek, Simon; Pernek, Igor; Kokol, Peter

    2012-01-01

    Purpose: Classification is an important and widely used machine learning technique in bioinformatics. Researchers and other end-users of machine learning software often prefer to work with comprehensible models where knowledge extraction and explanation of reasoning behind the classification model are possible. Methods: This paper presents an extension to an existing machine learning environment and a study on visual tuning of decision tree classifiers. The motivation for this research comes from the need to build effective and easily interpretable decision tree models by the so-called one-button data mining approach, where no parameter tuning is needed. To avoid bias in classification, no classification performance measure is used during the tuning of the model, which is constrained exclusively by the dimensions of the produced decision tree. Results: The proposed visual tuning of decision trees was evaluated on 40 datasets containing classical machine learning problems and 31 datasets from the field of bioinformatics. Although we did not expect significant differences in classification performance, the results demonstrate a significant increase of accuracy in less complex visually tuned decision trees. In contrast to classical machine learning benchmarking datasets, we observe higher accuracy gains in bioinformatics datasets. Additionally, a user study was carried out to confirm the assumption that the tree tuning times are significantly lower for the proposed method in comparison to manual tuning of the decision tree. Conclusions: The empirical results demonstrate that by building simple models constrained by predefined visual boundaries, one not only achieves good comprehensibility, but also very good classification performance that does not differ from usually more complex models built using default settings of the classical decision tree algorithm. In addition, our study demonstrates the suitability of visually tuned decision trees for datasets with binary class

  3. A survey of decision tree classifier methodology

    NASA Technical Reports Server (NTRS)

    Safavian, S. Rasoul; Landgrebe, David

    1990-01-01

    Decision Tree Classifiers (DTC's) are used successfully in many diverse areas such as radar signal classification, character recognition, remote sensing, medical diagnosis, expert systems, and speech recognition. Perhaps the most important feature of DTC's is their capability to break down a complex decision-making process into a collection of simpler decisions, thus providing a solution which is often easier to interpret. A survey of current methods is presented for DTC designs and the various existing issues. After considering potential advantages of DTC's over single-stage classifiers, subjects of tree structure design, feature selection at each internal node, and decision and search strategies are discussed.

  4. A survey of decision tree classifier methodology

    NASA Technical Reports Server (NTRS)

    Safavian, S. R.; Landgrebe, David

    1991-01-01

    Decision tree classifiers (DTCs) are used successfully in many diverse areas such as radar signal classification, character recognition, remote sensing, medical diagnosis, expert systems, and speech recognition. Perhaps the most important feature of DTCs is their capability to break down a complex decision-making process into a collection of simpler decisions, thus providing a solution which is often easier to interpret. A survey of current methods is presented for DTC designs and the various existing issues. After considering potential advantages of DTCs over single-stage classifiers, subjects of tree structure design, feature selection at each internal node, and decision and search strategies are discussed.

  5. Parallel object-oriented decision tree system

    DOEpatents

    Kamath, Chandrika; Cantu-Paz, Erick

    2006-02-28

    A data mining decision tree system that uncovers patterns, associations, anomalies, and other statistically significant structures in data by reading and displaying data files, extracting relevant features for each of the objects, and using a method of recognizing patterns among the objects based upon object features through a decision tree that reads the data, sorts the data if necessary, determines the best manner to split the data into subsets according to some criterion, and splits the data.

  6. Speeding up Boosting decision trees training

    NASA Astrophysics Data System (ADS)

    Zheng, Chao; Wei, Zhenzhong

    2015-10-01

    Boosted decision trees are fast at test time, but their training process is too slow to meet the requirements of applications with real-time learning. To overcome this drawback, we propose a fast decision tree training method that prunes non-effective features in advance, and based on this method we design a fast Boosting decision trees training algorithm. First, we analyze the structure of each decision tree node and prove by derivation that the classification error of each node has a bound. Then, by using the error bound to prune non-effective features at an early stage, we greatly accelerate the decision tree training process without affecting the training results at all. Finally, the accelerated decision tree training method is integrated into the general Boosting process, forming a fast Boosting decision trees training algorithm. This algorithm is not a new variant of Boosting; on the contrary, it should be used in conjunction with existing Boosting algorithms to achieve more training acceleration. To test the algorithm's speedup and its performance when combined with other acceleration algorithms, the original AdaBoost and two typical acceleration algorithms, LazyBoost and StochasticBoost, were each combined with this algorithm to form three fast versions, and their classification performance was tested using the Lsis face database, which contains 12788 images. Experimental results reveal that this fast algorithm can achieve more than double training speedup without affecting the results of the trained classifier, and can be combined with other acceleration algorithms. Key words: Boosting algorithm, decision trees, classifier training, preliminary classification error, face detection

  7. Decision Tree Approach for Soil Liquefaction Assessment

    PubMed Central

    Gandomi, Amir H.; Fridline, Mark M.; Roke, David A.

    2013-01-01

    In the current study, the performances of some decision tree (DT) techniques are evaluated for postearthquake soil liquefaction assessment. A database containing 620 records of seismic parameters and soil properties is used in this study. Three decision tree techniques are used here in two different ways, considering statistical and engineering points of view, to develop decision rules. The DT results are compared to the logistic regression (LR) model. The results of this study indicate that the DTs not only successfully predict liquefaction but they can also outperform the LR model. The best DT models are interpreted and evaluated based on an engineering point of view. PMID:24489498

  8. Using Evolutionary Algorithms to Induce Oblique Decision Trees

    SciTech Connect

    Cantu-Paz, E.; Kamath, C.

    2000-01-21

    This paper illustrates the application of evolutionary algorithms (EAs) to the problem of oblique decision tree induction. The objectives are to demonstrate that EAs can find classifiers whose accuracy is competitive with other oblique tree construction methods, and that this can be accomplished in a shorter time. Experiments were performed with a (1+1) evolutionary strategy and a simple genetic algorithm on public domain and artificial data sets. The empirical results suggest that the EAs quickly find competitive classifiers, and that EAs scale up better than traditional methods to the dimensionality of the domain and the number of training instances.
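
A (1+1) evolution strategy searching for one oblique (hyperplane) split can be sketched as below. This is a simplified illustration under stated assumptions, not the paper's procedure: fitness here is the raw misclassification count for 0/1 labels, the mutation step `sigma` is fixed, and `split_errors` and `one_plus_one_es` are hypothetical names.

```python
import random

def split_errors(weights, rows, labels):
    """Count misclassifications of the oblique test w.x + b > 0 (labels are 0 or 1)."""
    *w, b = weights
    errors = 0
    for x, y in zip(rows, labels):
        predicted = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
        errors += predicted != y
    return errors

def one_plus_one_es(rows, labels, generations=500, sigma=0.5, seed=0):
    """One parent, one mutated child per generation; the child replaces the parent if no worse."""
    rng = random.Random(seed)
    parent = [rng.gauss(0, 1) for _ in range(len(rows[0]) + 1)]  # weights plus bias
    parent_err = split_errors(parent, rows, labels)
    for _ in range(generations):
        child = [g + rng.gauss(0, sigma) for g in parent]  # Gaussian mutation
        child_err = split_errors(child, rows, labels)
        if child_err <= parent_err:
            parent, parent_err = child, child_err
    return parent, parent_err

rows = [[0.0, 0.0], [1.0, 0.2], [0.2, 1.0], [2.0, 2.0], [3.0, 2.5], [2.5, 3.0]]
labels = [0, 0, 0, 1, 1, 1]
best, best_err = one_plus_one_es(rows, labels)
print(best, best_err)
```

Accepting children that tie with the parent lets the search drift across plateaus of equal error, which matters because the misclassification count changes only in discrete steps.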

  9. Fast Image Texture Classification Using Decision Trees

    NASA Technical Reports Server (NTRS)

    Thompson, David R.

    2011-01-01

    Texture analysis would permit improved autonomous, onboard science data interpretation for adaptive navigation, sampling, and downlink decisions. These analyses would assist with terrain analysis and instrument placement in both macroscopic and microscopic image data products. Unfortunately, most state-of-the-art texture analysis demands computationally expensive convolutions of filters involving many floating-point operations. This makes them infeasible for radiation-hardened computers and spaceflight hardware. A new method approximates traditional texture classification of each image pixel with a fast decision-tree classifier. The classifier uses image features derived from simple filtering operations involving integer arithmetic. The texture analysis method is therefore amenable to implementation on FPGA (field-programmable gate array) hardware. Image features based on the "integral image" transform produce descriptive and efficient texture descriptors. Training the decision tree on a set of training data yields a classification scheme that produces reasonable approximations of optimal "texton" analysis at a fraction of the computational cost. A decision-tree learning algorithm employing the traditional k-means criterion of inter-cluster variance is used to learn tree structure from training data. The result is an efficient and accurate summary of surface morphology in images. This work is an evolutionary advance that unites several previous algorithms (k-means clustering, integral images, decision trees) and applies them to a new problem domain (morphology analysis for autonomous science during remote exploration). Advantages include order-of-magnitude improvements in runtime, feasibility for FPGA hardware, and significant improvements in texture classification accuracy.
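
The "integral image" transform mentioned above is what makes box-filter features cheap in integer arithmetic. The following is a generic sketch of the transform, not the flight software; `integral_image` and `box_sum` are names chosen for the example.

```python
def integral_image(img):
    """Return a (h+1) x (w+1) table where ii[r][c] is the sum of img[:r][:c]."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        row_sum = 0
        for c in range(w):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii

def box_sum(ii, top, left, bottom, right):
    """Sum of pixels in rows top..bottom-1 and columns left..right-1, using four lookups."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]

img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
ii = integral_image(img)
print(box_sum(ii, 0, 0, 3, 3), box_sum(ii, 1, 1, 3, 3))
```

After one pass to build the table, the sum over any rectangle costs three additions or subtractions regardless of its size, with no floating-point operations.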

  10. Algorithms for optimal dyadic decision trees

    SciTech Connect

    Hush, Don; Porter, Reid

    2009-01-01

    A new algorithm for constructing optimal dyadic decision trees was recently introduced, analyzed, and shown to be very effective for low dimensional data sets. This paper enhances and extends this algorithm by: introducing an adaptive grid search for the regularization parameter that guarantees optimal solutions for all relevant tree sizes, revising the core tree-building algorithm so that its run time is substantially smaller for most regularization parameter values on the grid, and incorporating new data structures and data pre-processing steps that provide significant run time enhancement in practice.

  11. Decision Tree Modeling for Ranking Data

    NASA Astrophysics Data System (ADS)

    Yu, Philip L. H.; Wan, Wai Ming; Lee, Paul H.

    Ranking/preference data arises from many applications in marketing, psychology, and politics. We establish a new decision tree model for the analysis of ranking data by adopting the concept of classification and regression tree. The existing splitting criteria are modified in a way that allows them to precisely measure the impurity of a set of ranking data. Two types of impurity measures for ranking data are introduced, namely g-wise and top-k measures. Theoretical results show that the new measures exhibit properties of impurity functions. In model assessment, the area under the ROC curve (AUC) is applied to evaluate the tree performance. Experiments are carried out to investigate the predictive performance of the tree model for complete and partially ranked data and promising results are obtained. Finally, a real-world application of the proposed methodology to analyze a set of political rankings data is presented.

  12. IND - THE IND DECISION TREE PACKAGE

    NASA Technical Reports Server (NTRS)

    Buntine, W.

    1994-01-01

    A common approach to supervised classification and prediction in artificial intelligence and statistical pattern recognition is the use of decision trees. A tree is "grown" from data using a recursive partitioning algorithm to create a tree which has good prediction of classes on new data. Standard algorithms are CART (by Breiman Friedman, Olshen and Stone) and ID3 and its successor C4 (by Quinlan). As well as reimplementing parts of these algorithms and offering experimental control suites, IND also introduces Bayesian and MML methods and more sophisticated search in growing trees. These produce more accurate class probability estimates that are important in applications like diagnosis. IND is applicable to most data sets consisting of independent instances, each described by a fixed length vector of attribute values. An attribute value may be a number, one of a set of attribute specific symbols, or it may be omitted. One of the attributes is delegated the "target" and IND grows trees to predict the target. Prediction can then be done on new data or the decision tree printed out for inspection. IND provides a range of features and styles with convenience for the casual user as well as fine-tuning for the advanced user or those interested in research. IND can be operated in a CART-like mode (but without regression trees, surrogate splits or multivariate splits), and in a mode like the early version of C4. Advanced features allow more extensive search, interactive control and display of tree growing, and Bayesian and MML algorithms for tree pruning and smoothing. These often produce more accurate class probability estimates at the leaves. IND also comes with a comprehensive experimental control suite. IND consists of four basic kinds of routines: data manipulation routines, tree generation routines, tree testing routines, and tree display routines. The data manipulation routines are used to partition a single large data set into smaller training and test sets. 
The

  13. Two Trees: Migrating Fault Trees to Decision Trees for Real Time Fault Detection on International Space Station

    NASA Technical Reports Server (NTRS)

    Lee, Charles; Alena, Richard L.; Robinson, Peter

    2004-01-01

    Starting from an example of ISS fault trees, we present a method for converting fault trees to decision trees. The method shows that visualizing the root cause of a fault becomes easier and that tree manipulation becomes more programmatic through available decision tree programs. The decision tree visualization used for diagnostics is straightforward and easy to understand. For ISS real-time fault diagnostics, the status of the systems can be shown by passing the signals through the trees and seeing where they stop. Another advantage of decision trees is that they can learn fault patterns and predict future faults from historical data. The learning is not limited to static data sets but can also be online: by accumulating real-time data sets, the decision trees can acquire and store fault patterns and recognize them when they recur.

  14. AncesTrees: ancestry estimation with randomized decision trees.

    PubMed

    Navega, David; Coelho, Catarina; Vicente, Ricardo; Ferreira, Maria Teresa; Wasterlain, Sofia; Cunha, Eugénia

    2015-09-01

    In forensic anthropology, ancestry estimation is essential in establishing the individual biological profile. The aim of this study is to present a new program--AncesTrees--developed for assessing ancestry based on metric analysis. AncesTrees relies on a machine learning ensemble algorithm, random forest, to classify the human skull. In the ensemble learning paradigm, several models are generated and jointly used to arrive at the final decision. The random forest algorithm creates ensembles of decision tree classifiers, a non-linear and non-parametric classification technique. The database used in AncesTrees is composed of 23 craniometric variables from 1,734 individuals, representative of six major ancestral groups and selected from the Howells' craniometric series. The program was tested on 128 adult crania from the following collections: the African slaves' skeletal collection of Valle da Gafaria; the Medical School Skull Collection and the Identified Skeletal Collection of 21st Century, both curated at the University of Coimbra. The first step of the test analysis was to perform ancestry estimation including all the ancestral groups of the database. The second stage of our test analysis was to conduct ancestry estimation including only the European and the African ancestral groups. In the first test analysis, 75% of the individuals of African ancestry and 79.2% of the individuals of European ancestry were correctly identified. The model involving only African and European ancestral groups had a better performance: 93.8% of all individuals were correctly classified. The obtained results show that AncesTrees can be a valuable tool in forensic anthropology. PMID:25053239
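
    The ensemble idea described in this record, where several trees are used jointly to reach a final decision, can be sketched as a plain majority vote. This is only an illustration of the voting step, not the AncesTrees program: the three stump "trees", the feature names x1 and x2, and the class labels A and B are all hypothetical.

```python
from collections import Counter


def majority_vote(trees, sample):
    """Return the label predicted by most trees for one sample."""
    votes = Counter(tree(sample) for tree in trees)
    return votes.most_common(1)[0][0]


# Hypothetical one-split trees over made-up measurements x1 and x2.
trees = [
    lambda s: "A" if s["x1"] > 3.0 else "B",
    lambda s: "A" if s["x2"] > 2.0 else "B",
    lambda s: "A" if s["x1"] + s["x2"] > 4.0 else "B",
]

sample = {"x1": 5.0, "x2": 1.0}
print(majority_vote(trees, sample))  # two of three trees vote "A"
```

    A full random forest additionally trains each tree on a bootstrap sample with random feature subsets; that training step is omitted here.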

  15. An Application of Decision Tree Based on ID3

    NASA Astrophysics Data System (ADS)

    Xiaohu, Wang; Lele, Wang; Nianfeng, Li

    This article deals with the application of the classical ID3 decision tree algorithm from data mining to the data of a certain site. It builds a decision tree based on information gain and thereby produces some useful rules about purchasing behavior. It also shows that decision trees have broad prospects for application in on-site sales.
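
    The information-gain criterion that ID3 uses to choose a split can be sketched as follows. The purchase records and the attribute names (member, weekend, buys) are made up for illustration and do not come from the article.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy of the target minus its weighted entropy after splitting."""
    base = entropy([row[target] for row in rows])
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


# Hypothetical purchase records.
rows = [
    {"member": "yes", "weekend": "yes", "buys": "yes"},
    {"member": "yes", "weekend": "no", "buys": "yes"},
    {"member": "no", "weekend": "yes", "buys": "no"},
    {"member": "no", "weekend": "no", "buys": "no"},
]

# ID3 splits on the attribute with the highest gain.
best = max(["member", "weekend"],
           key=lambda a: information_gain(rows, a, "buys"))
print(best)  # "member" separates the classes perfectly in this toy data
```

    ID3 applies this choice recursively to each resulting subset until the subsets are pure or no attributes remain.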

  16. CUDT: A CUDA Based Decision Tree Algorithm

    PubMed Central

    Sheu, Ruey-Kai; Chiu, Chun-Chieh

    2014-01-01

    The decision tree is one of the well-known classification methods in data mining. Many studies have focused on improving the performance of decision trees. However, those algorithms are developed and run on traditional distributed systems. Without the help of new technology, latency cannot be improved when processing the huge volumes of data generated by ubiquitous sensing nodes. In order to improve data processing latency in huge data mining, in this paper, we design and implement a new parallelized decision tree algorithm on CUDA (compute unified device architecture), which is a GPGPU solution provided by NVIDIA. In the proposed system, the CPU is responsible for flow control while the GPU is responsible for computation. We have conducted many experiments to evaluate the system performance of CUDT and compared it with a traditional CPU version. The results show that CUDT is 5∼55 times faster than Weka-j48 and 18 times faster than SPRINT for large data sets. PMID:25140346

  17. Quantum Decision Trees and Semidefinite Programming.

    SciTech Connect

    Barnum, Howard; Saks, M.; Szegedy, M.

    2001-01-01

    We reformulate the notion of quantum query complexity in terms of inequalities and equations for a set of positive matrices, which we view as a quantum analogue of a decision tree. Using the new formulation we show that: 1. every quantum query algorithm needs to use at most n quantum bits in addition to the query register. 2. For any function f there is an algorithm that runs in polynomial time in terms of the truth table of f and (for ε > 0) computes the ε-error quantum decision tree complexity of f. 3. Using the dual of our system we can treat lower bound methods on a uniform platform, which paves the way to their future comparison. In particular we describe Ambainis's bound in our framework. 4. The output condition on quantum algorithms used by Ambainis and others is not sufficient for an algorithm to compute a function with ε-bounded error: we show the existence of algorithms whose final entanglement matrix satisfies the condition, but for which the value of f cannot be determined from a quantum measurement on the accessible part of the computer.

  18. Identification of metabolic syndrome using decision tree analysis.

    PubMed

    Worachartcheewan, Apilak; Nantasenamat, Chanin; Isarankura-Na-Ayudhya, Chartchalerm; Pidetcha, Phannee; Prachayasittikul, Virapong

    2010-10-01

    This study employs a decision tree as a decision support system for rapid and automated identification of individuals with metabolic syndrome (MS) among a Thai population. Results demonstrated strong predictivity of the decision tree in classification of individuals with and without MS, displaying an overall accuracy in excess of 99%. PMID:20619912

  19. Safety validation of decision trees for hepatocellular carcinoma

    PubMed Central

    Wang, Xian-Qiang; Liu, Zhe; Lv, Wen-Ping; Luo, Ying; Yang, Guang-Yun; Li, Chong-Hui; Meng, Xiang-Fei; Liu, Yang; Xu, Ke-Sen; Dong, Jia-Hong

    2015-01-01

    AIM: To evaluate a different decision tree for safe liver resection and verify its efficiency. METHODS: A total of 2457 patients underwent hepatic resection between January 2004 and December 2010 at the Chinese PLA General Hospital, and 634 hepatocellular carcinoma (HCC) patients were eligible for the final analyses. Post-hepatectomy liver failure (PHLF) was identified by the association of prothrombin time < 50% and serum bilirubin > 50 μmol/L (the “50-50” criteria), which were assessed at day 5 postoperatively or later. The Swiss-Clavien decision tree, Tokyo University-Makuuchi decision tree, and Chinese consensus decision tree were adopted to divide patients into two groups based on those decision trees in sequence, and the PHLF rates were recorded. RESULTS: The overall mortality and PHLF rate were 0.16% and 3.0%. A total of 19 patients experienced PHLF. The numbers of patients to whom the Swiss-Clavien, Tokyo University-Makuuchi, and Chinese consensus decision trees were applied were 581, 573, and 622, and the PHLF rates were 2.75%, 2.62%, and 2.73%, respectively. Significantly more cases satisfied the Chinese consensus decision tree than the Swiss-Clavien decision tree and Tokyo University-Makuuchi decision tree (P < 0.01,P < 0.01); nevertheless, the latter two shared no difference (P = 0.147). The PHLF rate exhibited no significant difference with respect to the three decision trees. CONCLUSION: The Chinese consensus decision tree expands the indications for hepatic resection for HCC patients and does not increase the PHLF rate compared to the Swiss-Clavien and Tokyo University-Makuuchi decision trees. It would be a safe and effective algorithm for hepatectomy in patients with hepatocellular carcinoma. PMID:26309366

  20. Self-Adaptive Induction of Regression Trees.

    PubMed

    Fidalgo-Merino, Raúl; Núñez, Marlon

    2011-08-01

    A new algorithm for incremental construction of binary regression trees is presented. This algorithm, called SAIRT, adapts the induced model when facing data streams involving unknown dynamics, like gradual and abrupt function drift, changes in certain regions of the function, noise, and virtual drift. It also handles both symbolic and numeric attributes. The proposed algorithm can automatically adapt its internal parameters and model structure to obtain new patterns, depending on the current dynamics of the data stream. SAIRT can monitor the usefulness of nodes and can forget examples from selected regions, storing the remaining ones in local windows associated to the leaves of the tree. On these conditions, current regression methods need a careful configuration depending on the dynamics of the problem. Experimentation suggests that the proposed algorithm obtains better results than current algorithms when dealing with data streams that involve changes with different speeds, noise levels, sampling distribution of examples, and partial or complete changes of the underlying function. PMID:21263164

  1. 15 CFR Supplement 1 to Part 732 - Decision Tree

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... 15 Commerce and Foreign Trade 2 2011-01-01 2011-01-01 false Decision Tree 1 Supplement 1 to Part 732 Commerce and Foreign Trade Regulations Relating to Commerce and Foreign Trade (Continued) BUREAU... THE EAR Pt. 732, Supp. 1 Supplement 1 to Part 732—Decision Tree ER06FE04.000...

  2. 15 CFR Supplement No 1 to Part 732 - Decision Tree

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... 15 Commerce and Foreign Trade 2 2013-01-01 2013-01-01 false Decision Tree No Supplement No 1 to Part 732 Commerce and Foreign Trade Regulations Relating to Commerce and Foreign Trade (Continued... THE EAR Pt. 732, Supp. 1 Supplement No 1 to Part 732—Decision Tree ER06FE04.000...

  3. 15 CFR Supplement No 1 to Part 732 - Decision Tree

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... 15 Commerce and Foreign Trade 2 2014-01-01 2014-01-01 false Decision Tree No Supplement No 1 to Part 732 Commerce and Foreign Trade Regulations Relating to Commerce and Foreign Trade (Continued... THE EAR Pt. 732, Supp. 1 Supplement No 1 to Part 732—Decision Tree ER06FE04.000...

  4. 15 CFR Supplement 1 to Part 732 - Decision Tree

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... 15 Commerce and Foreign Trade 2 2010-01-01 2010-01-01 false Decision Tree 1 Supplement 1 to Part 732 Commerce and Foreign Trade Regulations Relating to Commerce and Foreign Trade (Continued) BUREAU... THE EAR Pt. 732, Supp. 1 Supplement 1 to Part 732—Decision Tree ER06FE04.000...

  5. 15 CFR Supplement 1 to Part 732 - Decision Tree

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... 15 Commerce and Foreign Trade 2 2012-01-01 2012-01-01 false Decision Tree 1 Supplement 1 to Part 732 Commerce and Foreign Trade Regulations Relating to Commerce and Foreign Trade (Continued) BUREAU... THE EAR Pt. 732, Supp. 1 Supplement 1 to Part 732—Decision Tree ER06FE04.000...

  6. Decision-Tree Formulation With Order-1 Lateral Execution

    NASA Technical Reports Server (NTRS)

    James, Mark

    2007-01-01

    A compact symbolic formulation enables mapping of an arbitrarily complex decision tree of a certain type into a highly computationally efficient multidimensional software object. The type of decision trees to which this formulation applies is that known in the art as the Boolean class of balanced decision trees. Parallel lateral slices of an object created by means of this formulation can be executed in constant time, which is considerably less time than would otherwise be required. Decision trees of various forms are incorporated into almost all large software systems. A decision tree is a way of hierarchically solving a problem, proceeding through a set of true/false responses to a conclusion. By definition, a decision tree has a tree-like structure, wherein each internal node denotes a test on an attribute, each branch from an internal node represents an outcome of a test, and leaf nodes represent classes or class distributions that, in turn, represent possible conclusions. The drawback of decision trees is that execution of them can be computationally expensive (and, hence, time-consuming) because each non-leaf node must be examined to determine whether to progress deeper into a tree structure or to examine an alternative. The present formulation was conceived as an efficient means of representing a decision tree and executing it in as little time as possible. The formulation involves the use of a set of symbolic algorithms to transform a decision tree into a multi-dimensional object, the rank of which equals the number of lateral non-leaf nodes. The tree can then be executed in constant time by means of an order-one table lookup. The sequence of operations performed by the algorithms is summarized as follows: 1. Determination of whether the tree under consideration can be encoded by means of this formulation. 2. Extraction of decision variables. 3. Symbolic optimization of the decision tree to minimize its form. 4. 
Expansion and transformation of all nested conjunctive
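
    The general idea of trading a node-by-node walk for a single table lookup can be illustrated with a minimal sketch. It is not the symbolic formulation described in this record: it simply enumerates every combination of Boolean decision variables for a small balanced tree and stores the conclusions in a dictionary. The tree, the variable names a and b, and the leaf labels are hypothetical.

```python
from itertools import product

# Hypothetical balanced Boolean tree. Internal nodes are tuples of
# (variable, branch_if_false, branch_if_true); leaves are strings.
tree = ("a",
        ("b", "reject", "review"),
        ("b", "review", "accept"))


def walk(node, values):
    """Evaluate the tree the ordinary way, one test per level."""
    while isinstance(node, tuple):
        variable, if_false, if_true = node
        node = if_true if values[variable] else if_false
    return node


def compile_table(tree, variables):
    """Precompute the conclusion for every combination of variables."""
    table = {}
    for bits in product((False, True), repeat=len(variables)):
        table[bits] = walk(tree, dict(zip(variables, bits)))
    return table


variables = ("a", "b")
table = compile_table(tree, variables)

# A single lookup now replaces the walk.
print(table[(True, False)])  # same conclusion as walk(tree, {"a": True, "b": False})
```

    The table has 2^d entries for d decision variables, so this naive version pays in memory for what it saves in evaluation time.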

  7. Computational study of developing high-quality decision trees

    NASA Astrophysics Data System (ADS)

    Fu, Zhiwei

    2002-03-01

    Recently, decision tree algorithms have been widely used in dealing with data mining problems to find out valuable rules and patterns. However, scalability, accuracy and efficiency are significant concerns regarding how to effectively deal with large and complex data sets in the implementation. In this paper, we propose an innovative machine learning approach (we call our approach GAIT), combining genetic algorithm, statistical sampling, and decision tree, to develop intelligent decision trees that can alleviate some of these problems. We design our computational experiments and run GAIT on three different data sets (namely Socio-Olympic data, Westinghouse data, and FAA data) to test its performance against standard decision tree algorithm, neural network classifier, and statistical discriminant technique, respectively. The computational results show that our approach outperforms standard decision tree algorithm profoundly at lower sampling levels, and achieves significantly better results with less effort than both neural network and discriminant classifiers.

  8. An automated approach to the design of decision tree classifiers

    NASA Technical Reports Server (NTRS)

    Argentiero, P.; Chin, R.; Beaudet, P.

    1982-01-01

    An automated technique is presented for designing effective decision tree classifiers predicated only on a priori class statistics. The procedure relies on linear feature extractions and Bayes table look-up decision rules. Associated error matrices are computed and utilized to provide an optimal design of the decision tree at each so-called 'node'. A by-product of this procedure is a simple algorithm for computing the global probability of correct classification assuming the statistical independence of the decision rules. Attention is given to a more precise definition of decision tree classification, the mathematical details on the technique for automated decision tree design, and an example of a simple application of the procedure using class statistics acquired from an actual Landsat scene.

  9. Decision tree methods: applications for classification and prediction.

    PubMed

    Song, Yan-Yan; Lu, Ying

    2015-04-25

    Decision tree methodology is a commonly used data mining method for establishing classification systems based on multiple covariates or for developing prediction algorithms for a target variable. This method classifies a population into branch-like segments that construct an inverted tree with a root node, internal nodes, and leaf nodes. The algorithm is non-parametric and can efficiently deal with large, complicated datasets without imposing a complicated parametric structure. When the sample size is large enough, study data can be divided into training and validation datasets. The training dataset is then used to build a decision tree model, and the validation dataset is used to decide on the appropriate tree size needed to achieve the optimal final model. This paper introduces frequently used algorithms for developing decision trees (including CART, C4.5, CHAID, and QUEST) and describes the SPSS and SAS programs that can be used to visualize tree structure. PMID:26120265

  10. Decision tree based transient stability method -- A case study

    SciTech Connect

    Wehenkel, L.; Pavella, M. (Inst. Montefiore); Euxibie, E.; Heilbronn, B. (Direction des Etudes et Recherches)

    1994-02-01

    The decision tree transient stability method is revisited via a case study carried out on the French EHV power system. In short, the method consists of building off-line decision trees that can subsequently assess the system's transient behavior in terms of its precontingency parameters (or "attributes") likely to drive the stability phenomena. This case study aims at investigating practical feasibility aspects and features of the trees, at enhancing their reliability to the extent possible, and at generalizing them. Feasibility aspects encompass data base generation, candidate attributes, stability classes; tree features concern in particular complexity in terms of their size and interpretability capabilities, robustness with respect to both their building and use. Reliability is enhanced by defining and exploiting pragmatic quality measures. Generalization concerns multicontingency, instead of single-contingency trees. The results obtained show real promise for the method to meet practical needs of electric power utilities.

  11. RNA search with decision trees and partial covariance models.

    PubMed

    Smith, Jennifer A

    2009-01-01

    The use of partial covariance models to search for RNA family members in genomic sequence databases is explored. The partial models are formed from contiguous subranges of the overall RNA family multiple alignment columns. A binary decision-tree framework is presented for choosing the order to apply the partial models and the score thresholds on which to make the decisions. The decision trees are chosen to minimize computation time subject to the constraint that all of the training sequences are passed to the full covariance model for final evaluation. Computational intelligence methods are suggested to select the decision tree since the tree can be quite complex and there is no obvious method to build the tree in these cases. Experimental results from seven RNA families show execution times of 0.066-0.268 relative to using the full covariance model alone. Tests on the full sets of known sequences for each family show that at least 95 percent of these sequences are found for two families and 100 percent for five others. Since the full covariance model is run on all sequences accepted by the partial model decision tree, the false alarm rate is at least as low as that of the full model alone. PMID:19644178

  12. Learning accurate very fast decision trees from uncertain data streams

    NASA Astrophysics Data System (ADS)

    Liang, Chunquan; Zhang, Yang; Shi, Peng; Hu, Zhengguo

    2015-12-01

    Most existing works on data stream classification assume the streaming data is precise and definite. Such assumption, however, does not always hold in practice, since data uncertainty is ubiquitous in data stream applications due to imprecise measurement, missing values, privacy protection, etc. The goal of this paper is to learn accurate decision tree models from uncertain data streams for classification analysis. On the basis of very fast decision tree (VFDT) algorithms, we proposed an algorithm for constructing an uncertain VFDT tree with classifiers at tree leaves (uVFDTc). The uVFDTc algorithm can exploit uncertain information effectively and efficiently in both the learning and the classification phases. In the learning phase, it uses Hoeffding bound theory to learn from uncertain data streams and yield fast and reasonable decision trees. In the classification phase, at tree leaves it uses uncertain naive Bayes (UNB) classifiers to improve the classification performance. Experimental results on both synthetic and real-life datasets demonstrate the strong ability of uVFDTc to classify uncertain data streams. The use of UNB at tree leaves has improved the performance of uVFDTc, especially the any-time property, the benefit of exploiting uncertain information, and the robustness against uncertainty.
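
    The Hoeffding bound on which VFDT-style learners rely can be sketched as follows. This shows the standard split test for precise data, not the uncertain-data variant proposed in this record. The parameter names value_range, delta, and n are assumptions chosen for illustration: the range of the split measure, the allowed error probability, and the number of examples seen at the leaf.

```python
import math


def hoeffding_bound(value_range, delta, n):
    """Epsilon such that, with probability 1 - delta, the true mean lies
    within epsilon of the mean observed over n independent examples."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain, second_gain, value_range, delta, n):
    """Split once the lead of the best attribute exceeds the bound."""
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)


# With few examples the bound is wide, so the leaf keeps waiting.
print(should_split(0.30, 0.10, value_range=1.0, delta=1e-7, n=50))
# With more examples the bound shrinks and the same lead is enough.
print(should_split(0.30, 0.10, value_range=1.0, delta=1e-7, n=1000))
```

    Because the bound shrinks as n grows, a leaf can commit to a split after seeing only as many stream examples as the observed lead requires.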

  13. An expert-guided decision tree construction strategy: an application in knowledge discovery with medical databases.

    PubMed Central

    Tsai, Y. S.; King, P. H.; Higgins, M. S.; Pierce, D.; Patel, N. P.

    1997-01-01

    With the steady growth in electronic patient records and clinical medical informatics systems, the data collected for routine clinical use have been accumulating at a dramatic rate. Inter-disciplinary research that provides a new generation of computation tools in knowledge discovery and data management is in great demand. In this study, an expert-guided decision tree construction strategy is proposed to offer a user-oriented knowledge discovery environment. The strategy allows experts, based on their expertise and/or preference, to override the inductive decision tree construction process. Moreover, by reviewing decision paths, experts could focus on subsets of data that may be clues to new findings, or simply contaminated cases. PMID:9357618

  14. EEG feature selection method based on decision tree.

    PubMed

    Duan, Lijuan; Ge, Hui; Ma, Wei; Miao, Jun

    2015-01-01

    This paper aims to solve the automated feature selection problem in brain computer interface (BCI). In order to automate the feature selection process, we proposed a novel EEG feature selection method based on decision tree (DT). During the electroencephalogram (EEG) signal processing, a feature extraction method based on principal component analysis (PCA) was used, and the selection process based on decision tree was performed by searching the feature space and automatically selecting optimal features. Considering that EEG signals are a series of non-linear signals, a generalized linear classifier named support vector machine (SVM) was chosen. In order to test the validity of the proposed method, we applied the EEG feature selection method based on decision tree to BCI Competition II datasets Ia, and the experiment showed encouraging results. PMID:26405856

  15. Automatic sleep staging using state machine-controlled decision trees.

    PubMed

    Imtiaz, Syed Anas; Rodriguez-Villegas, Esther

    2015-01-01

    Automatic sleep staging from a reduced number of channels is desirable to save time, reduce costs and make sleep monitoring more accessible by providing home-based polysomnography. This paper introduces a novel algorithm for automatic scoring of sleep stages using a combination of small decision trees driven by a state machine. The algorithm uses two channels of EEG for feature extraction and has a state machine that selects a suitable decision tree for classification based on the prevailing sleep stage. Its performance has been evaluated using the complete dataset of 61 recordings from PhysioNet Sleep EDF Expanded database achieving an overall accuracy of 82% and 79% on training and test sets respectively. The algorithm has been developed with a very small number of decision tree nodes that are active at any given time making it suitable for use in resource-constrained wearable systems. PMID:26736278

  16. An automated approach to the design of decision tree classifiers

    NASA Technical Reports Server (NTRS)

    Argentiero, P.; Chin, P.; Beaudet, P.

    1980-01-01

    The classification of large dimensional data sets arising from the merging of remote sensing data with more traditional forms of ancillary data is considered. Decision tree classification, a popular approach to the problem, is characterized by the property that samples are subjected to a sequence of decision rules before they are assigned to a unique class. An automated technique for effective decision tree design which relies only on a priori statistics is presented. This procedure utilizes a set of two dimensional canonical transforms and Bayes table look-up decision rules. An optimal design at each node is derived based on the associated decision table. A procedure for computing the global probability of correct classification is also provided. An example is given in which class statistics obtained from an actual LANDSAT scene are used as input to the program. The resulting decision tree design has an associated probability of correct classification of .76 compared to the theoretically optimum .79 probability of correct classification associated with a full dimensional Bayes classifier. Recommendations for future research are included.

  17. Reconciliation of Decision-Making Heuristics Based on Decision Trees Topologies and Incomplete Fuzzy Probabilities Sets

    PubMed Central

    Doubravsky, Karel; Dohnal, Mirko

    2015-01-01

    Complex decision making tasks of different natures, e.g. economics, safety engineering, ecology and biology, are based on vague, sparse, partially inconsistent and subjective knowledge. Moreover, decision making economists / engineers are usually not willing to invest too much time into the study of complex formal theories. They require decisions which can be (re)checked by human-like common sense reasoning. One important problem related to realistic decision making tasks is the incompleteness of the data sets required by the chosen decision making algorithm. This paper presents a relatively simple algorithm by which some missing III (input information items) can be generated using mainly decision tree topologies and integrated into incomplete data sets. The algorithm is based on easy-to-understand heuristics, e.g. a longer decision tree sub-path is less probable. This heuristic can solve decision problems under total ignorance, i.e. the decision tree topology is the only information available. In practice, however, isolated information items, e.g. some vaguely known probabilities (e.g. fuzzy probabilities), are usually available. It means that a realistic problem is analysed under partial ignorance. The proposed algorithm reconciles topology related heuristics and additional fuzzy sets using fuzzy linear programming. The case study, represented by a tree with six lotteries and one fuzzy probability, is presented in detail. PMID:26158662

  18. Reconciliation of Decision-Making Heuristics Based on Decision Trees Topologies and Incomplete Fuzzy Probabilities Sets.

    PubMed

    Doubravsky, Karel; Dohnal, Mirko

    2015-01-01

    Complex decision making tasks of different natures, e.g. economics, safety engineering, ecology and biology, are based on vague, sparse, partially inconsistent and subjective knowledge. Moreover, decision making economists / engineers are usually not willing to invest too much time into the study of complex formal theories. They require decisions which can be (re)checked by human-like common sense reasoning. One important problem related to realistic decision making tasks is the incompleteness of the data sets required by the chosen decision making algorithm. This paper presents a relatively simple algorithm by which some missing III (input information items) can be generated using mainly decision tree topologies and integrated into incomplete data sets. The algorithm is based on easy-to-understand heuristics, e.g. a longer decision tree sub-path is less probable. This heuristic can solve decision problems under total ignorance, i.e. the decision tree topology is the only information available. In practice, however, isolated information items, e.g. some vaguely known probabilities (e.g. fuzzy probabilities), are usually available. It means that a realistic problem is analysed under partial ignorance. The proposed algorithm reconciles topology related heuristics and additional fuzzy sets using fuzzy linear programming. The case study, represented by a tree with six lotteries and one fuzzy probability, is presented in detail. PMID:26158662

  19. Evaluation of Decision Trees for Cloud Detection from AVHRR Data

    NASA Technical Reports Server (NTRS)

    Shiffman, Smadar; Nemani, Ramakrishna

    2005-01-01

    Automated cloud detection and tracking is an important step in assessing changes in radiation budgets associated with global climate change via remote sensing. Data products based on satellite imagery are available to the scientific community for studying trends in the Earth's atmosphere. The data products include pixel-based cloud masks that assign cloud-cover classifications to pixels. Many cloud-mask algorithms have the form of decision trees. The decision trees employ sequential tests that scientists designed based on empirical astrophysics studies and simulations. Limitations of existing cloud masks restrict our ability to accurately track changes in cloud patterns over time. In a previous study we compared automatically learned decision trees to cloud masks included in Advanced Very High Resolution Radiometer (AVHRR) data products from the year 2000. In this paper we report the replication of the study for five-year data, and for a gold standard based on surface observations performed by scientists at weather stations in the British Islands. For our sample data, the accuracy of automatically learned decision trees was greater than the accuracy of the cloud masks (p < 0.001).

  20. Three-dimensional object recognition using similar triangles and decision trees

    NASA Technical Reports Server (NTRS)

    Spirkovska, Lilly

    1993-01-01

    A system, TRIDEC, that is capable of distinguishing between a set of objects despite changes in the objects' positions in the input field, their size, or their rotational orientation in 3D space is described. TRIDEC combines very simple yet effective features with the classification capabilities of inductive decision tree methods. The feature vector is a list of all similar triangles defined by connecting all combinations of three pixels in a coarse coded 127 x 127 pixel input field. The classification is accomplished by building a decision tree using the information provided from a limited number of translated, scaled, and rotated samples. Simulation results are presented which show that TRIDEC achieves 94 percent recognition accuracy in the 2D invariant object recognition domain and 98 percent recognition accuracy in the 3D invariant object recognition domain after training on only a small sample of transformed views of the objects.
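
    The invariance described above can be illustrated with a minimal sketch, not the TRIDEC implementation itself: the interior angles of a triangle do not change when its three vertices are translated, uniformly scaled, or rotated in the plane, so a count of triangle shapes is the same for a transformed view. The function names, the 10-degree angle bins, and the use of a dictionary of counts as the feature vector are assumptions made for illustration only.

```python
import math
from itertools import combinations


def triangle_shape(p, q, r):
    """Return the two smallest interior angles (degrees) of triangle pqr,
    or None when the three points are collinear or coincide."""
    cross = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    if math.isclose(cross, 0.0, abs_tol=1e-9):
        return None
    a, b, c = math.dist(q, r), math.dist(p, r), math.dist(p, q)

    def angle(opposite, s1, s2):
        # Law of cosines; clamp to guard against rounding just outside [-1, 1].
        cos_value = (s1 * s1 + s2 * s2 - opposite * opposite) / (2 * s1 * s2)
        return math.degrees(math.acos(max(-1.0, min(1.0, cos_value))))

    angles = sorted((angle(a, b, c), angle(b, a, c), angle(c, a, b)))
    return angles[0], angles[1]  # the third angle is determined by these two


def similar_triangle_features(points, bin_deg=10):
    """Count triangles per similarity class (hypothetical binning by angle)."""
    counts = {}
    for p, q, r in combinations(points, 3):
        shape = triangle_shape(p, q, r)
        if shape is None:
            continue
        key = (int(shape[0] // bin_deg), int(shape[1] // bin_deg))
        counts[key] = counts.get(key, 0) + 1
    return counts


shape = [(0, 0), (4, 0), (0, 3), (5, 5)]
# Rotate by 90 degrees, scale by 2, and translate: the features are unchanged.
moved = [(-2 * y + 7, 2 * x - 1) for x, y in shape]
print(similar_triangle_features(shape) == similar_triangle_features(moved))
```

    A decision tree learner would then be trained on such feature vectors; that step is omitted here.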

  1. The Decision-Identification Tree: A New NEPA Scoping Tool.

    PubMed

    Eccleston

    2000-10-01

    No single methodology has been universally accepted for determining the appropriate scope of analysis for an environmental impact statement (EIS). Most typically, the scope of analysis is determined by first identifying actions and facilities that need to be analyzed. Once the scope of actions and facilities is identified, the scope of impacts is determined. Yet agencies sometimes complete an EIS only to discover that the analysis does not adequately support decisions that need to be made. Such discrepancies can often be traced to disconnects between scoping, the subsequent analysis, and the final decision-making process that follows. A new and markedly different approach, decision-based scoping, provides an effective methodology for improving the EIS scoping process. Decision-based scoping, in conjunction with a new tool, the decision-identification tree (DIT), places emphasis on first identifying the potential decisions that may eventually need to be made. The DIT provides a methodology for mapping alternative courses of action as a function of fundamental decision points. Once these decision points have been correctly identified, the range of actions, alternatives, and impacts can be more accurately assessed; this approach can improve the effectiveness of EIS planning, while reducing the risk of future disconnects between the EIS analysis and reaching a final decision. This approach also has applications in other planning disciplines beyond that of the EIS. PMID:10954809

  2. Improving ensemble decision tree performance using Adaboost and Bagging

    NASA Astrophysics Data System (ADS)

    Hasan, Md. Rajib; Siraj, Fadzilah; Sainin, Mohd Shamrie

    2015-12-01

    Ensemble classifier systems are considered among the most promising approaches to medical data classification, and the performance of a decision tree classifier can be increased by ensemble methods, which have been shown to perform better than single classifiers. However, in an ensemble setting the performance depends on the selection of a suitable base classifier. This research employed two prominent ensemble methods, namely Adaboost and Bagging, with independently selected base classifiers such as Random Forest, Random Tree, j48, j48grafts and Logistic Model Regression (LMT). The empirical study shows that performance varies when different base classifiers are selected, and in some cases overfitting was also noted. The evidence shows that ensemble decision tree classifiers using Adaboost and Bagging improve performance on the selected medical data sets.

  3. Decision trees for denoising in H.264/AVC video sequences

    NASA Astrophysics Data System (ADS)

    Huchet, G.; Chouinard, J.-Y.; Wang, D.; Vincent, A.

    2008-01-01

    All existing video coding standards are based on block-wise motion compensation and block-wise DCT. At high levels of quantization, block-wise motion compensation and transform produce blocking artifacts in the decoded video, a form of distortion to which the human visual system is very sensitive. The latest video coding standard, H.264/AVC, introduces a deblocking filter to reduce the blocking artifacts. However, there is still visible distortion after the filtering when compared to the original video. In this paper, we propose a non-conventional filter to further reduce the distortion and to improve the decoded picture quality. Unlike conventional filters, the proposed filter is based on a machine learning algorithm (decision tree). The decision trees are used to classify the filter's inputs and select the best filter coefficients for the inputs. Experimental results with 4 × 4 DCT indicate that using the filter holds promise in improving the quality of H.264/AVC video sequences.

  4. Application of decision tree algorithm for identification of rock forming minerals using energy dispersive spectrometry

    NASA Astrophysics Data System (ADS)

    Akkaş, Efe; Çubukçu, H. Evren; Artuner, Harun

    2014-05-01

    Rapid and automated mineral identification is compulsory in certain applications concerning natural rocks. Among all microscopic and spectrometric methods, energy dispersive X-ray spectrometers (EDS) integrated with scanning electron microscopes produce rapid information with reliable chemical data. Although obtaining elemental data with EDS analyses is fast and easy with the help of improving technology, it is rather challenging to perform accurate and rapid identification considering the large quantity of minerals in a rock sample, with dimensions ranging from nanometers to centimeters. Furthermore, the physical properties of the specimen (roughness, thickness, electrical conductivity, position in the instrument, etc.) and the incident electron beam (accelerating voltage, beam current, spot size, etc.) control the characteristic X-rays produced, which in turn affect the elemental analyses. In order to minimize the effects of these physical constraints and develop an automated mineral identification system, a rule induction paradigm has been applied to energy dispersive spectral data. Decision tree classifiers divide training data sets into subclasses using generated rules or decisions, and thereby produce a classification or recognition associated with these data sets. A number of thin sections prepared from rock samples with suitable mineralogy have been investigated, and a preliminary set of 12 distinct mineral groups (olivine, orthopyroxene, clinopyroxene, apatite, amphibole, plagioclase, K-feldspar, zircon, magnetite, titanomagnetite, biotite, quartz), comprised mostly of silicates and oxides, has been selected. Energy dispersive spectral data for each group, consisting of 240 reference and 200 test analyses, have been acquired under various non-standard physical and electrical conditions. The reference X-ray data have been used to assign the spectral distribution of elements to the specified mineral groups. Consequently, the test data have been analyzed using…

  5. The xeroderma pigmentosum pathway: decision tree analysis of DNA quality.

    PubMed

    Naegeli, Hanspeter; Sugasawa, Kaoru

    2011-07-15

    The nucleotide excision repair (NER) system is a fundamental cellular stress response that uses only a handful of DNA binding factors, mutated in the cancer-prone syndrome xeroderma pigmentosum (XP), to detect an astounding diversity of bulky base lesions, including those induced by ultraviolet light, electrophilic chemicals, oxygen radicals and further genetic insults. Several of these XP proteins are characterized by a mediocre preference for damaged substrates over the native double helix but, intriguingly, none of them recognizes injured bases with sufficient selectivity to account for the very high precision of bulky lesion excision. Instead, substrate versatility as well as damage specificity and strand selectivity are achieved by a multistage quality control strategy whereby different subunits of the XP pathway, in succession, interrogate the DNA double helix for a distinct abnormality in its structural or dynamic parameters. Through this step-by-step filtering procedure, the XP proteins operate like a systematic decision making tool, generally known as decision tree analysis, to sort out rare damaged bases embedded in a vast excess of native DNA. The present review is focused on the mechanisms by which multiple XP subunits of the NER pathway contribute to the proposed decision tree analysis of DNA quality in eukaryotic cells. PMID:21684221

  6. Toward the Decision Tree for Inferring Requirements Maturation Types

    NASA Astrophysics Data System (ADS)

    Nakatani, Takako; Kondo, Narihito; Shirogane, Junko; Kaiya, Haruhiko; Hori, Shozo; Katamine, Keiichi

    Requirements are elicited step by step during the requirements engineering (RE) process. However, some types of requirements are elicited completely only after the scheduled requirements elicitation process is finished. Such a situation is regarded as a problematic situation. In our study, the difficulties of eliciting various kinds of requirements are observed by components. We refer to the components as observation targets (OTs) and introduce the term “requirements maturation,” which means when and how requirements are elicited completely in the project. Requirements maturation is discussed for physical and logical OTs. OTs viewed from a logical viewpoint are called logical OTs, e.g. quality requirements. The requirements of physical OTs, e.g. modules, components, subsystems, etc., include functional and non-functional requirements. They are influenced by their requesters' environmental changes, as well as developers' technical changes. In order to infer the requirements maturation period of each OT, we need to know how much these factors influence the OTs' requirements maturation. Based on the observation of actual past projects, we defined the PRINCE (Pre Requirements Intelligence Net Consideration and Evaluation) model. It aims to guide developers in their observation of the requirements maturation of OTs. We quantitatively analyzed the actual cases with their requirements elicitation process and extracted essential factors that influence the requirements maturation. The results of interviews with project managers were analyzed with WEKA, a data mining system, from which the decision tree was derived. This paper introduces the PRINCE model and the category of logical OTs to be observed. The decision tree that helps developers infer the maturation type of an OT is also described. We evaluate the tree through real projects and discuss its ability to infer the requirements maturation types.

  7. Using decision trees to understand structure in missing data

    PubMed Central

    Tierney, Nicholas J; Harden, Fiona A; Harden, Maurice J; Mengersen, Kerrie L

    2015-01-01

    Objectives Demonstrate the application of decision trees—classification and regression trees (CARTs), and their cousins, boosted regression trees (BRTs)—to understand structure in missing data. Setting Data taken from employees at 3 different industrial sites in Australia. Participants 7915 observations were included. Materials and methods The approach was evaluated using an occupational health data set comprising results of questionnaires, medical tests and environmental monitoring. Statistical methods included standard statistical tests and the ‘rpart’ and ‘gbm’ packages for CART and BRT analyses, respectively, from the statistical software ‘R’. A simulation study was conducted to explore the capability of decision tree models in describing data with missingness artificially introduced. Results CART and BRT models were effective in highlighting a missingness structure in the data, related to the type of data (medical or environmental), the site in which it was collected, the number of visits, and the presence of extreme values. The simulation study revealed that CART models were able to identify variables and values responsible for inducing missingness. There was greater variation in variable importance for unstructured as compared to structured missingness. Discussion Both CART and BRT models were effective in describing structural missingness in data. CART models may be preferred over BRT models for exploratory analysis of missing data, and selecting variables important for predicting missingness. BRT models can show how values of other variables influence missingness, which may prove useful for researchers. Conclusions Researchers are encouraged to use CART and BRT models to explore and understand missing data. PMID:26124509

  8. DECISION TREE CLASSIFIERS FOR STAR/GALAXY SEPARATION

    SciTech Connect

    Vasconcellos, E. C.; Ruiz, R. S. R.; De Carvalho, R. R.; Capelato, H. V.; Gal, R. R.; LaBarbera, F. L.; Frago Campos Velho, H.; Trevisan, M.

    2011-06-15

    We study the star/galaxy classification efficiency of 13 different decision tree algorithms applied to photometric objects in the Sloan Digital Sky Survey Data Release Seven (SDSS-DR7). Each algorithm is defined by a set of parameters which, when varied, produce different final classification trees. We extensively explore the parameter space of each algorithm, using the set of 884,126 SDSS objects with spectroscopic data as the training set. The efficiency of star-galaxy separation is measured using the completeness function. We find that the Functional Tree algorithm (FT) yields the best results as measured by the mean completeness in two magnitude intervals: 14 ≤ r ≤ 21 (85.2%) and r ≥ 19 (82.1%). We compare the performance of the tree generated with the optimal FT configuration to the classifications provided by the SDSS parametric classifier, 2DPHOT, and Ball et al. We find that our FT classifier is comparable to or better in completeness over the full magnitude range 15 ≤ r ≤ 21, with much lower contamination than all but the Ball et al. classifier. At the faintest magnitudes (r > 19), our classifier is the only one that maintains high completeness (>80%) while simultaneously achieving low contamination (≈2.5%). We also examine the SDSS parametric classifier (psfMag - modelMag) to see if the dividing line between stars and galaxies can be adjusted to improve the classifier. We find that currently stars in close pairs are often misclassified as galaxies, and suggest a new cut to improve the classifier. Finally, we apply our FT classifier to separate stars from galaxies in the full set of 69,545,326 SDSS photometric objects in the magnitude range 14 ≤ r ≤ 21.
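
    The two figures of merit quoted above can be computed as in the following minimal sketch. It assumes the commonly used definitions, which the abstract itself does not spell out: completeness as the fraction of true galaxies that are classified as galaxies, and contamination as the fraction of objects classified as galaxies that are in fact stars. The function name and label strings are hypothetical.

```python
def completeness_and_contamination(true_labels, predicted_labels, target="galaxy"):
    """Return (completeness, contamination) for the `target` class.

    Assumed definitions (not taken from the abstract):
      completeness  = TP / (TP + FN)
      contamination = FP / (TP + FP)
    """
    tp = fp = fn = 0
    for truth, predicted in zip(true_labels, predicted_labels):
        if predicted == target and truth == target:
            tp += 1
        elif predicted == target:
            fp += 1
        elif truth == target:
            fn += 1
    completeness = tp / (tp + fn) if (tp + fn) else float("nan")
    contamination = fp / (tp + fp) if (tp + fp) else float("nan")
    return completeness, contamination


truth = ["galaxy", "galaxy", "galaxy", "galaxy", "star", "star"]
predicted = ["galaxy", "galaxy", "galaxy", "star", "galaxy", "star"]
print(completeness_and_contamination(truth, predicted))
```

    In practice these quantities would be evaluated per magnitude bin, since the abstract reports them as functions of r.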

  9. A Novel Approach on Designing Augmented Fuzzy Cognitive Maps Using Fuzzified Decision Trees

    NASA Astrophysics Data System (ADS)

    Papageorgiou, Elpiniki I.

    This paper proposes a new methodology for designing Fuzzy Cognitive Maps (FCMs) using crisp decision trees that have been fuzzified. A fuzzy cognitive map is a knowledge-based technique that works as an artificial cognitive network inheriting the main aspects of cognitive maps and artificial neural networks. Decision trees, on the other hand, are well-known intelligent techniques that extract rules from both symbolic and numeric data. Fuzzy theoretical techniques are used to fuzzify crisp decision trees in order to soften the decision boundaries at decision nodes inherent in this type of tree. Comparisons between crisp decision trees and the fuzzified decision trees suggest that the latter are significantly more robust and produce more balanced decision making. The approach proposed in this paper could incorporate any type of fuzzy decision tree. Through this methodology, new linguistic weights were determined in the FCM model, thus producing an augmented FCM tool. The framework consists of a new fuzzy algorithm that generates, from induced fuzzy decision trees, linguistic weights describing the cause-effect relationships among the concepts of the FCM model.

  10. Ethical decision-making made easier. The use of decision trees in case management.

    PubMed

    Storl, H; DuBois, B; Seline, J

    1999-01-01

    Case managers have never before faced the multitude of difficult ethical dilemmas that now confront them daily. Legal, medical, social, and ethical considerations often fly in the face of previously reliable intuitions. The importance and urgency of facing these dilemmas head-on has resulted in clear calls for action. What are the appropriate legal, ethical, and professional parameters for effective decision making? Are normatively sensitive, but also practically sensible protocols possible? In an effort to address these concerns, Alternatives for the Older Adult, Inc., Rock Island, Illinois established an ethics committee to look into possible means of resolving or dissolving commonly occurring dilemmas. As a result of year-long deliberations, the committee formulated a decision-making strategy whose central apparatus is the decision tree--a flowchart of reasonable decisions and their consequent implications. In this article, we explore the development of this approach as well as the theory that underlies it. PMID:10695172

  11. Using histograms to introduce randomization in the generation of ensembles of decision trees

    DOEpatents

    Kamath, Chandrika; Cantu-Paz, Erick; Littau, David

    2005-02-22

    A system for decision tree ensembles that includes a module to read the data, a module to create a histogram, a module to evaluate a potential split according to some criterion using the histogram, a module to select a split point randomly in an interval around the best split, a module to split the data, and a module to combine multiple decision trees in ensembles. The decision tree method includes the steps of reading the data; creating a histogram; evaluating a potential split according to some criterion using the histogram, selecting a split point randomly in an interval around the best split, splitting the data, and combining multiple decision trees in ensembles.
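
    The split-selection steps listed above can be sketched for a single numeric feature as follows. This is a minimal illustration under stated assumptions, not the patented method: the Gini impurity criterion, equal-width bins, and an interval of one bin width centred on the best bin edge are all choices made here for illustration, and the function names are hypothetical.

```python
import random


def gini(counts):
    """Gini impurity of a {class: count} dictionary."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


def histogram_split(values, labels, n_bins=8, rng=None):
    """Return (best_edge, randomized_split) for one numeric feature."""
    rng = rng or random.Random()
    lo, hi = min(values), max(values)
    if lo == hi:
        raise ValueError("feature is constant; no split possible")
    width = (hi - lo) / n_bins

    # Create the histogram: per-bin class counts.
    bins = [{} for _ in range(n_bins)]
    for v, y in zip(values, labels):
        i = min(int((v - lo) / width), n_bins - 1)
        bins[i][y] = bins[i].get(y, 0) + 1
    total = {}
    for b in bins:
        for y, c in b.items():
            total[y] = total.get(y, 0) + c

    # Evaluate each interior bin edge using only the histogram.
    n = len(values)
    left = {}
    best_edge, best_score = None, float("inf")
    for i in range(n_bins - 1):
        for y, c in bins[i].items():
            left[y] = left.get(y, 0) + c
        right = {y: total[y] - left.get(y, 0) for y in total}
        n_left = sum(left.values())
        score = (n_left * gini(left) + (n - n_left) * gini(right)) / n
        if score < best_score:
            best_edge, best_score = lo + (i + 1) * width, score

    # Pick the actual split at random in an interval around the best edge.
    split = rng.uniform(best_edge - width / 2, best_edge + width / 2)
    return best_edge, split


values = list(range(10))
labels = ["a"] * 5 + ["b"] * 5
print(histogram_split(values, labels, n_bins=9, rng=random.Random(0)))
```

    Repeating this with different random draws yields different trees from the same data, which is what makes the resulting trees useful as ensemble members.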

  12. Branch induction in spur-type Delicious apple nursery trees

    SciTech Connect

    Popenoe, J.

    1987-01-01

    Long sylleptic shoots produced on apple trees in the nursery result in increased early yields once the trees are planted in the orchard. Spur-type Delicious trees do not naturally produce branches in the nursery. To achieve branched spur-type Delicious trees, applications of combinations of growth regulators benzyladenine (BA) and gibberellic acid 4 + 7 (GA) and leaf removal (LR) techniques were tested. Spacings of 15, 25, 35, and 45 cm and MM.106, M.7, M.26 and seedling rootstocks were tested for their effects on branching. Carbon partitioning changes caused by these treatments were evaluated by dry weight analysis and for benzyladenine, leaf removal and tipping treatments by ¹⁴C-photoassimilate labelling. Possible involvement of root-produced cytokinins was examined by ¹⁴C-benzyladenine labeling through the xylem and by analyzing relationships between root mass and branching characteristics. Although partitioning of ¹⁴C-photoassimilate was increased to the top of the plant by BA sprays, and to the bottom of the plant by LR and tipping for up to six days after treatment, final plant weights were not different. No relationship between branching and root mass or ¹⁴C-benzyladenine mobilization was found. This evidence indicates branched trees possessed no greater dry weight than unbranched trees, only a redistribution of the dry weight into a form more suited to early fruit production in high density planting systems.

  13. Prediction model based on decision tree analysis for laccase mediators.

    PubMed

    Medina, Fabiola; Aguila, Sergio; Baratto, Maria Camilla; Martorana, Andrea; Basosi, Riccardo; Alderete, Joel B; Vazquez-Duhalt, Rafael

    2013-01-10

    A Structure Activity Relationship (SAR) study for laccase mediator systems was performed in order to correctly classify different natural phenolic mediators. Decision tree (DT) classification models with a set of five quantum-chemical calculated molecular descriptors were used. These descriptors included redox potential (ɛ°), ionization energy (E(i)), pK(a), enthalpy of formation of radical (Δ(f)H), and OH bond dissociation energy (D(O-H)). The rationale for selecting these descriptors is derived from the laccase-mediator mechanism. To validate the DT predictions, the kinetic constants of different compounds as laccase substrates, their ability for pesticide transformation as laccase-mediators, and radical stability were experimentally determined using Coriolopsis gallica laccase and the pesticide dichlorophen. The prediction capability of the DT model based on three proposed descriptors showed a complete agreement with the obtained experimental results. PMID:23199741

  14. Using Decision Trees for Comparing Pattern Recognition Feature Sets

    SciTech Connect

    Proctor, D D

    2005-08-18

    Determination of the best set of features has been acknowledged as one of the most difficult tasks in the pattern recognition process. In this report, significance tests on the sort-ordered, sample-size normalized vote distribution of an ensemble of decision trees are introduced as a method of evaluating the relative quality of feature sets. Alternative functional forms for feature sets are also examined. Associated standard deviations provide the means to evaluate the effect of the number of folds, the number of classifiers per fold, and the sample size on the resulting classifications. The method is applied to a problem for which a significant portion of the training set cannot be classified unambiguously.

  15. Classification of Liss IV Imagery Using Decision Tree Methods

    NASA Astrophysics Data System (ADS)

    Verma, Amit Kumar; Garg, P. K.; Prasad, K. S. Hari; Dadhwal, V. K.

    2016-06-01

    Image classification is a compulsory step in any remote sensing research. Classification uses the spectral information represented by the digital numbers in one or more spectral bands and attempts to classify each individual pixel based on this spectral information. Crop classification is the main concern of remote sensing applications for developing sustainable agriculture systems. Vegetation indices computed from satellite images give a good indication of the presence of vegetation; such an index describes the greenness, density and health of vegetation. Texture is also an important characteristic, which is used to identify objects or regions of interest in an image. This paper illustrates the use of the decision tree method to classify land into crop land and non-crop land and to classify different crops. We evaluate the possibility of crop classification using an integrated approach based on texture properties combined with different vegetation indices for single-date LISS IV sensor data with a high spatial resolution of 5.8 meters. Eleven vegetation indices (NDVI, DVI, GEMI, GNDVI, MSAVI2, NDWI, NG, NR, NNIR, OSAVI and VI green) have been generated using the green, red and NIR bands, and the image is then classified using the decision tree method. The other approach integrates texture features (mean, variance, kurtosis and skewness) with these vegetation indices. A comparison has been made between these two methods. The results indicate that the inclusion of textural features with vegetation indices can be effectively implemented to produce classified maps with 8.33% higher accuracy for Indian satellite IRS-P6, LISS IV sensor images.

  16. The value of decision tree analysis in planning anaesthetic care in obstetrics.

    PubMed

    Bamber, J H; Evans, S A

    2016-08-01

    The use of decision tree analysis is discussed in the context of the anaesthetic and obstetric management of a young pregnant woman with joint hypermobility syndrome with a history of insensitivity to local anaesthesia and a previous difficult intubation due to a tongue tumour. The multidisciplinary clinical decision process resulted in the woman being delivered without complication by elective caesarean section under general anaesthesia after an awake fibreoptic intubation. The decision process used is reviewed and compared retrospectively to a decision tree analytical approach. The benefits and limitations of using decision tree analysis are reviewed and its application in obstetric anaesthesia is discussed. PMID:27026589

  17. Decision-Tree Models of Categorization Response Times, Choice Proportions, and Typicality Judgments

    ERIC Educational Resources Information Center

    Lafond, Daniel; Lacouture, Yves; Cohen, Andrew L.

    2009-01-01

    The authors present 3 decision-tree models of categorization adapted from T. Trabasso, H. Rollins, and E. Shaughnessy (1971) and use them to provide a quantitative account of categorization response times, choice proportions, and typicality judgments at the individual-participant level. In Experiment 1, the decision-tree models were fit to…

  18. An Improved Decision Tree for Predicting a Major Product in Competing Reactions

    ERIC Educational Resources Information Center

    Graham, Kate J.

    2014-01-01

    When organic chemistry students encounter competing reactions, they are often overwhelmed by the task of evaluating multiple factors that affect the outcome of a reaction. The use of a decision tree is a useful tool to teach students to evaluate a complex situation and propose a likely outcome. Specifically, a decision tree can help students…

  19. Decision trees and decision committee applied to star/galaxy separation problem

    NASA Astrophysics Data System (ADS)

    Vasconcellos, Eduardo Charles

    Vasconcellos et al [1] study the efficiency of 13 different decision tree algorithms applied to photometric data in the Sloan Digital Sky Survey Data Release Seven (SDSS-DR7) to perform star/galaxy separation. Each algorithm is defined by a set of parameters which, when varied, produce different final classification trees. In that work we extensively explore the parameter space of each algorithm, using the set of 884,126 SDSS objects with spectroscopic data as the training set. We find that the Functional Tree algorithm (FT) yields the best results as measured by the mean completeness function (galaxy true positive rate) in two magnitude intervals: 14 <= r <= 21 (85.2%) and r >= 19 (82.1%). We compare FT classification to the SDSS parametric, 2DPHOT and Ball et al (2006) classifications. At the faintest magnitudes (r > 19), our classifier is the only one that maintains high completeness (>80%) while simultaneously achieving low contamination (≈2.5%). We also examine the SDSS parametric classifier (psfMag - modelMag) to see if the dividing line between stars and galaxies can be adjusted to improve the classifier. We find that currently stars in close pairs are often misclassified as galaxies, and suggest a new cut to improve the classifier. Finally, we apply our FT classifier to separate stars from galaxies in the full set of 69,545,326 SDSS photometric objects in the magnitude range 14 <= r <= 21. We now study the performance of a decision committee composed of FT classifiers. We will train six FT classifiers with randomly selected objects from the same 884,126 SDSS-DR7 objects with spectroscopic data that we used before. Both the decision committee and our previous single FT classifier will be applied to the new objects from SDSS data releases eight, nine and ten. Finally, we will compare the performance of both methods on this new data set. [1] Vasconcellos, E. C.; de Carvalho, R. R.; Gal, R. R.; LaBarbera, F. L.; Capelato, H. V.; Fraga Campos Velho, H.; Trevisan, M.; Ruiz, R. S. R

  20. The decision tree classifier - Design and potential. [for Landsat-1 data

    NASA Technical Reports Server (NTRS)

    Hauska, H.; Swain, P. H.

    1975-01-01

    A new classifier has been developed for the computerized analysis of remote sensor data. The decision tree classifier is essentially a maximum likelihood classifier using multistage decision logic. It is characterized by the fact that an unknown sample can be classified into a class using one or several decision functions in a successive manner. The classifier is applied to the analysis of data sensed by Landsat-1 over Kenosha Pass, Colorado. The classifier is illustrated by a tree diagram which for processing purposes is encoded as a string of symbols such that there is a unique one-to-one relationship between string and decision tree.
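
    The idea of encoding a tree diagram as a string in a one-to-one manner can be sketched as below. The abstract does not give the actual symbol scheme, so the parenthesized prefix notation, the tuple representation of nodes, and the names used here are assumptions for illustration. Node and class names are assumed to contain no spaces or parentheses.

```python
def encode(tree):
    """Encode a binary decision tree as a string.

    A leaf is a class-name string; an internal node is a tuple
    (decision_function_name, subtree_if_true, subtree_if_false).
    """
    if isinstance(tree, str):
        return tree
    test, if_true, if_false = tree
    return f"({test} {encode(if_true)} {encode(if_false)})"


def decode(text):
    """Rebuild the tree from the string produced by encode()."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def parse():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token != "(":
            return token  # a leaf (class name)
        test = tokens[pos]
        pos += 1
        if_true = parse()
        if_false = parse()
        if tokens[pos] != ")":
            raise ValueError("malformed tree string")
        pos += 1
        return (test, if_true, if_false)

    tree = parse()
    if pos != len(tokens):
        raise ValueError("trailing symbols after tree")
    return tree


tree = ("f1", ("f2", "water", "forest"), "snow")
encoded = encode(tree)
print(encoded)
print(decode(encoded) == tree)
```

    Because decoding the encoded string returns the original tree, each tree of this form maps to exactly one string and back.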

  1. ArborZ: Photometric Redshifts Using Boosted Decision Trees

    NASA Astrophysics Data System (ADS)

    Gerdes, David W.; Sypniewski, Adam J.; McKay, Timothy A.; Hao, Jiangang; Weis, Matthew R.; Wechsler, Risa H.; Busha, Michael T.

    2010-06-01

    Precision photometric redshifts will be essential for extracting cosmological parameters from the next generation of wide-area imaging surveys. In this paper, we introduce a photometric redshift algorithm, ArborZ, based on the machine-learning technique of boosted decision trees. We study the algorithm using galaxies from the Sloan Digital Sky Survey (SDSS) and from mock catalogs intended to simulate both the SDSS and the upcoming Dark Energy Survey. We show that it improves upon the performance of existing algorithms. Moreover, the method naturally leads to the reconstruction of a full probability density function (PDF) for the photometric redshift of each galaxy, not merely a single "best estimate" and error, and also provides a photo-z quality figure of merit for each galaxy that can be used to reject outliers. We show that the stacked PDFs yield a more accurate reconstruction of the redshift distribution N(z). We discuss limitations of the current algorithm and ideas for future work.
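
    The difference between stacking per-galaxy PDFs and histogramming single best estimates can be shown with a small sketch. It does not reproduce how ArborZ derives its PDFs from boosted decision trees; it only assumes that each galaxy already has a discrete PDF over a common set of redshift bins, and the function names and toy numbers are hypothetical.

```python
def stacked_nz(pdfs):
    """Estimate N(z) by summing per-galaxy PDFs bin by bin."""
    n_bins = len(pdfs[0])
    stacked = [0.0] * n_bins
    for pdf in pdfs:
        if len(pdf) != n_bins:
            raise ValueError("all PDFs must use the same redshift bins")
        for i, p in enumerate(pdf):
            stacked[i] += p
    return stacked


def point_estimate_nz(pdfs):
    """Estimate N(z) by counting each galaxy once, in its most probable bin."""
    counts = [0] * len(pdfs[0])
    for pdf in pdfs:
        counts[pdf.index(max(pdf))] += 1
    return counts


# Three toy galaxies, each with a PDF over three redshift bins.
pdfs = [
    [0.5, 0.5, 0.0],
    [0.0, 0.5, 0.5],
    [0.25, 0.75, 0.0],
]
print(stacked_nz(pdfs))
print(point_estimate_nz(pdfs))
```

    The stacked estimate keeps the probability that each galaxy lies outside its most likely bin, which the point-estimate histogram discards.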

  2. Classification of dopamine, serotonin, and dual antagonists by decision trees.

    PubMed

    Kim, Hye-Jung; Choo, Hyunah; Cho, Yong Seo; Koh, Hun Yeong; No, Kyoung Tai; Pae, Ae Nim

    2006-04-15

    Dopamine antagonists (DA), serotonin antagonists (SA), and serotonin-dopamine dual antagonists (Dual) are being used as antipsychotics. Many dopamine and serotonin antagonists show non-selective binding affinity for these two receptors because the antagonists share structurally common features originating from conserved residues of the binding site of the aminergic receptor family. Therefore, classification of dopamine and serotonin antagonists by their own receptors can be useful in designing selective antagonists for individual therapy of psychotic disorders. A data set containing 1135 dopamine antagonists (D2, D3, and D4), 1251 serotonin antagonists (5-HT1A, 5-HT2A, and 5-HT2C), and 386 serotonin-dopamine dual antagonists was collected from the MDDR database. Cerius2 descriptors were employed to develop a classification model for the 2772 compounds with antipsychotic activity. LDA (linear discriminant analysis), SIMCA (soft independent modeling of class analogy), RP (recursive partitioning), and ANN (artificial neural network) algorithms successfully classified the active class of each compound at an average of 73.6% and predicted it at an average of 69.8%. The decision trees from RP, the best model, were generated to identify and interpret those descriptors that discriminate the active classes more easily. These classification models could be used as a virtual screening tool to predict the active class of new candidates. PMID:16387502

  3. ArborZ: PHOTOMETRIC REDSHIFTS USING BOOSTED DECISION TREES

    SciTech Connect

    Gerdes, David W.; Sypniewski, Adam J.; McKay, Timothy A.; Hao, Jiangang; Weis, Matthew R.; Wechsler, Risa H.; Busha, Michael T.

    2010-06-01

    Precision photometric redshifts will be essential for extracting cosmological parameters from the next generation of wide-area imaging surveys. In this paper, we introduce a photometric redshift algorithm, ArborZ, based on the machine-learning technique of boosted decision trees. We study the algorithm using galaxies from the Sloan Digital Sky Survey (SDSS) and from mock catalogs intended to simulate both the SDSS and the upcoming Dark Energy Survey. We show that it improves upon the performance of existing algorithms. Moreover, the method naturally leads to the reconstruction of a full probability density function (PDF) for the photometric redshift of each galaxy, not merely a single 'best estimate' and error, and also provides a photo-z quality figure of merit for each galaxy that can be used to reject outliers. We show that the stacked PDFs yield a more accurate reconstruction of the redshift distribution N(z). We discuss limitations of the current algorithm and ideas for future work.
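    The stacking step mentioned in this abstract can be illustrated with a minimal sketch. It assumes each galaxy's photo-z PDF is already given as a list of probabilities on a shared grid of redshift bins; the function names and the toy PDFs are hypothetical and are not taken from ArborZ itself.

```python
def normalize(pdf):
    """Scale a per-galaxy PDF so that its bins sum to 1."""
    total = sum(pdf)
    return [p / total for p in pdf]


def stack_pdfs(pdfs):
    """Sum normalized per-galaxy PDFs bin by bin to estimate N(z)."""
    stacked = [0.0] * len(pdfs[0])
    for pdf in pdfs:
        for i, p in enumerate(normalize(pdf)):
            stacked[i] += p
    return stacked


def point_estimate_histogram(pdfs):
    """For comparison: count each galaxy only in its most probable bin."""
    counts = [0] * len(pdfs[0])
    for pdf in pdfs:
        counts[pdf.index(max(pdf))] += 1
    return counts


# Hypothetical PDFs for three galaxies over four redshift bins.
pdfs = [
    [0.1, 0.6, 0.3, 0.0],
    [0.0, 0.5, 0.5, 0.0],
    [0.0, 0.2, 0.3, 0.5],
]
n_of_z = stack_pdfs(pdfs)
best_bins = point_estimate_histogram(pdfs)
```

    The stacked estimate spreads each galaxy over every bin it might occupy, whereas the point-estimate histogram discards that uncertainty.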

  4. Learning from examples - Generation and evaluation of decision trees for software resource analysis

    NASA Technical Reports Server (NTRS)

    Selby, Richard W.; Porter, Adam A.

    1988-01-01

    A general solution method for the automatic generation of decision (or classification) trees is investigated. The approach is to provide insights through in-depth empirical characterization and evaluation of decision trees for software resource data analysis. The trees identify classes of objects (software modules) that had high development effort. Sixteen software systems ranging from 3,000 to 112,000 source lines were selected for analysis from a NASA production environment. The collection and analysis of 74 attributes (or metrics), for over 4,700 objects, captured information about the development effort, faults, changes, design style, and implementation style. A total of 9,600 decision trees were automatically generated and evaluated. The trees correctly identified 79.3 percent of the software modules that had high development effort or faults, and the trees generated from the best parameter combinations correctly identified 88.4 percent of the modules on the average.

  5. Using Decision Trees to Detect and Isolate Simulated Leaks in the J-2X Rocket Engine

    NASA Technical Reports Server (NTRS)

    Schwabacher, Mark A.; Aguilar, Robert; Figueroa, Fernando F.

    2009-01-01

    The goal of this work was to use data-driven methods to automatically detect and isolate faults in the J-2X rocket engine. It was decided to use decision trees, since they tend to be easier to interpret than other data-driven methods. The decision tree algorithm automatically "learns" a decision tree by performing a search through the space of possible decision trees to find one that fits the training data. The particular decision tree algorithm used is known as C4.5. Simulated J-2X data from a high-fidelity simulator developed at Pratt & Whitney Rocketdyne and known as the Detailed Real-Time Model (DRTM) was used to "train" and test the decision tree. Fifty-six DRTM simulations were performed for this purpose, with different leak sizes, different leak locations, and different times of leak onset. To make the simulations as realistic as possible, they included simulated sensor noise, and included a gradual degradation in both fuel and oxidizer turbine efficiency. A decision tree was trained using 11 of these simulations, and tested using the remaining 45 simulations. In the training phase, the C4.5 algorithm was provided with labeled examples of data from nominal operation and data including leaks in each leak location. From the data, it "learned" a decision tree that can classify unseen data as having no leak or having a leak in one of the five leak locations. In the test phase, the decision tree produced very low false alarm rates and low missed detection rates on the unseen data. It had very good fault isolation rates for three of the five simulated leak locations, but it tended to confuse the remaining two locations, perhaps because a large leak at one of these two locations can look very similar to a small leak at the other location.
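    The three figures of merit named in this abstract (false alarm rate, missed detection rate, fault isolation rate) can be computed from labeled test data roughly as follows. This is a sketch under assumptions: the label strings and the exact definitions of the rates are illustrative and are not taken from the report.

```python
def detection_rates(true_labels, predicted_labels, nominal="no_leak"):
    """Return (false_alarm_rate, missed_detection_rate, isolation_rate).

    false alarm:      nominal sample classified as any leak
    missed detection: leak sample classified as nominal
    isolation:        among detected leaks, the fraction assigned
                      to the correct leak location
    """
    nominal_total = leak_total = 0
    false_alarms = missed = detected = isolated = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == nominal:
            nominal_total += 1
            if pred != nominal:
                false_alarms += 1
        else:
            leak_total += 1
            if pred == nominal:
                missed += 1
            else:
                detected += 1
                if pred == truth:
                    isolated += 1
    return (
        false_alarms / nominal_total if nominal_total else 0.0,
        missed / leak_total if leak_total else 0.0,
        isolated / detected if detected else 0.0,
    )


# Hypothetical test labels, not data from the J-2X simulations.
truth = ["no_leak", "no_leak", "no_leak", "no_leak",
         "leak_A", "leak_A", "leak_B", "leak_B"]
pred = ["no_leak", "no_leak", "no_leak", "leak_A",
        "leak_A", "no_leak", "leak_A", "leak_B"]
false_alarm_rate, missed_rate, isolation_rate = detection_rates(truth, pred)
```

    Confusing one leak location with another, as the abstract reports for two of the five locations, lowers the isolation rate without counting as a missed detection.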

  6. Improved Frame Mode Selection for AMR-WB+ Based on Decision Tree

    NASA Astrophysics Data System (ADS)

    Kim, Jong Kyu; Kim, Nam Soo

    In this letter, we propose a coding mode selection method for the AMR-WB+ audio coder based on a decision tree. In order to reduce computation while maintaining good performance, a decision tree classifier is adopted with the closed-loop mode selection results as the target classification labels. The size of the decision tree is controlled by pruning, so the proposed method does not increase the memory requirement significantly. Through an evaluation test on a database covering both speech and music materials, the proposed method is found to achieve a much better mode selection accuracy compared with the open-loop mode selection module in the AMR-WB+.

  7. Creating ensembles of oblique decision trees with evolutionary algorithms and sampling

    DOEpatents

    Cantu-Paz, Erick; Kamath, Chandrika

    2006-06-13

    A decision tree system that is part of a parallel object-oriented pattern recognition system, which in turn is part of an object oriented data mining system. A decision tree process includes the step of reading the data. If necessary, the data is sorted. A potential split of the data is evaluated according to some criterion. An initial split of the data is determined. The final split of the data is determined using evolutionary algorithms and statistical sampling techniques. The data is split. Multiple decision trees are combined in ensembles.
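    The early steps listed in this abstract (sort the data, then evaluate a potential split according to some criterion) might look like the sketch below for a single numeric attribute. Gini impurity is assumed as the criterion purely for illustration; the patent abstract does not name one, and the evolutionary refinement of the final split is not shown.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 for a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split(values, labels):
    """Return (threshold, weighted_gini) of the best test value <= threshold."""
    pairs = sorted(zip(values, labels))  # sort once by attribute value
    n = len(pairs)
    best = (None, float("inf"))
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between equal values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical attribute values and class labels.
values = [2.0, 3.5, 1.0, 4.0, 3.0, 0.5]
labels = ["a", "b", "a", "b", "b", "a"]
threshold, score = best_split(values, labels)
```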

  8. Classification of Bent-Double Galaxies: Experiences with Ensembles of Decision Trees

    SciTech Connect

    Kamath, C; Cantu-Paz, E

    2002-01-08

    In earlier work, we have described our experiences with the use of decision tree classifiers to identify radio-emitting galaxies with a bent-double morphology in the FIRST astronomical survey. We now extend this work to include ensembles of decision tree classifiers, including two algorithms developed by us. These algorithms randomize the decision at each node of the tree, and because they consider fewer candidate splitting points, are faster than other methods for creating ensembles. The experiments presented in this paper with our astronomy data show that our algorithms are competitive in accuracy, but faster than other ensemble techniques such as Boosting, Bagging, and Arcx4 with different split criteria.

  9. Supervised hashing using graph cuts and boosted decision trees.

    PubMed

    Lin, Guosheng; Shen, Chunhua; Hengel, Anton van den

    2015-11-01

    To build large-scale query-by-example image retrieval systems, embedding image features into a binary Hamming space provides great benefits. Supervised hashing aims to map the original features to compact binary codes that are able to preserve label based similarity in the binary Hamming space. Most existing approaches apply a single form of hash function, and an optimization process which is typically deeply coupled to this specific form. This tight coupling restricts the flexibility of those methods, and can result in complex optimization problems that are difficult to solve. In this work we proffer a flexible yet simple framework that is able to accommodate different types of loss functions and hash functions. The proposed framework allows a number of existing approaches to hashing to be placed in context, and simplifies the development of new problem-specific hashing methods. Our framework decomposes the hashing learning problem into two steps: binary code (hash bit) learning and hash function learning. The first step can typically be formulated as binary quadratic problems, and the second step can be accomplished by training a standard binary classifier. For solving large-scale binary code inference, we show how it is possible to ensure that the binary quadratic problems are submodular such that efficient graph cut methods may be used. To achieve efficiency as well as efficacy on large-scale high-dimensional data, we propose to use boosted decision trees as the hash functions, which are nonlinear, highly descriptive, and are very fast to train and evaluate. Experiments demonstrate that the proposed method significantly outperforms most state-of-the-art methods, especially on high-dimensional data. PMID:26440270
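    Once images are mapped to compact binary codes, retrieval reduces to ranking by Hamming distance. The sketch below shows only that lookup step, with hypothetical 8-bit codes stored as Python integers; it does not implement the code-learning or hash-function-learning steps described in the abstract.

```python
def hamming(a, b):
    """Number of bit positions in which two integer codes differ."""
    return bin(a ^ b).count("1")


def nearest(query, database, k=2):
    """Indices of the k database codes closest to the query.

    Ties are broken by database index so the result is deterministic.
    """
    order = sorted(range(len(database)),
                   key=lambda i: (hamming(query, database[i]), i))
    return order[:k]


# Hypothetical 8-bit hash codes.
codes = [0b10110010, 0b10110011, 0b01001101, 0b10100010]
query = 0b10110000
top = nearest(query, codes, k=2)
```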

  10. Application of preprocessing filtering on Decision Tree C4.5 and rough set theory

    NASA Astrophysics Data System (ADS)

    Chan, Joseph C. C.; Lin, Tsau Y.

    2001-03-01

    This paper compares two artificial intelligence methods: the Decision Tree C4.5 and Rough Set Theory on the stock market data. The Decision Tree C4.5 is reviewed with the Rough Set Theory. An enhanced window application is developed to facilitate the pre-processing filtering by introducing the feature (attribute) transformations, which allows users to input formulas and create new attributes. Also, the application produces three varieties of data set with delaying, averaging, and summation. The results prove the improvement of pre-processing by applying feature (attribute) transformations on Decision Tree C4.5. Moreover, the comparison between Decision Tree C4.5 and Rough Set Theory is based on the clarity, automation, accuracy, dimensionality, raw data, and speed, which is supported by the rules sets generated by both algorithms on three different sets of data.

  11. A Decision Tree Approach to the Interpretation of Multivariate Statistical Techniques.

    ERIC Educational Resources Information Center

    Fok, Lillian Y.; And Others

    1995-01-01

    Discusses the nature, power, and limitations of four multivariate techniques: factor analysis, multiple analysis of variance, multiple regression, and multiple discriminant analysis. Shows how decision trees assist in interpreting results. (SK)

  12. An approach for automated fault diagnosis based on a fuzzy decision tree and boundary analysis of a reconstructed phase space.

    PubMed

    Aydin, Ilhan; Karakose, Mehmet; Akin, Erhan

    2014-03-01

    Although reconstructed phase space is one of the most powerful methods for analyzing a time series, it can fail in fault diagnosis of an induction motor when the appropriate pre-processing is not performed. Therefore, a new feature extraction method based on boundary analysis in phase space is proposed for diagnosis of induction motor faults. The proposed approach requires the measurement of one phase current signal to construct the phase space representation. Each phase space is converted into an image, and the boundary of each image is extracted by a boundary detection algorithm. A fuzzy decision tree has been designed to detect broken rotor bars and broken connector faults. The results indicate that the proposed approach has a higher recognition rate than other methods on the same dataset. PMID:24296116

  13. Aneurysmal subarachnoid hemorrhage prognostic decision-making algorithm using classification and regression tree analysis

    PubMed Central

    Lo, Benjamin W. Y.; Fukuda, Hitoshi; Angle, Mark; Teitelbaum, Jeanne; Macdonald, R. Loch; Farrokhyar, Forough; Thabane, Lehana; Levine, Mitchell A. H.

    2016-01-01

    Background: Classification and regression tree analysis involves the creation of a decision tree by recursive partitioning of a dataset into more homogeneous subgroups. Thus far, there is scarce literature on using this technique to create clinical prediction tools for aneurysmal subarachnoid hemorrhage (SAH). Methods: The classification and regression tree analysis technique was applied to the multicenter Tirilazad database (3551 patients) in order to create the decision-making algorithm. In order to elucidate prognostic subgroups in aneurysmal SAH, neurologic, systemic, and demographic factors were taken into account. The dependent variable used for analysis was the dichotomized Glasgow Outcome Score at 3 months. Results: Classification and regression tree analysis revealed seven prognostic subgroups. Neurological grade, occurrence of post-admission stroke, occurrence of post-admission fever, and age represented the explanatory nodes of this decision tree. Split sample validation revealed classification accuracy of 79% for the training dataset and 77% for the testing dataset. In addition, the occurrence of fever at 1-week post-aneurysmal SAH is associated with increased odds of post-admission stroke (odds ratio: 1.83, 95% confidence interval: 1.56–2.45, P < 0.01). Conclusions: A clinically useful classification tree was generated, which serves as a prediction tool to guide bedside prognostication and clinical treatment decision making. This prognostic decision-making algorithm also shed light on the complex interactions between a number of risk factors in determining outcome after aneurysmal SAH. PMID:27512607

  14. Application of decision tree model for the ground subsidence hazard mapping near abandoned underground coal mines.

    PubMed

    Lee, Saro; Park, Inhye

    2013-09-30

    Subsidence of ground caused by underground mines poses hazards to human life and property. This study analyzed ground-subsidence hazard using factors that can affect ground subsidence and a decision tree approach in a geographic information system (GIS). The study area was Taebaek, Gangwon-do, Korea, where many abandoned underground coal mines exist. Spatial data, topography, geology, and various ground-engineering data for the subsidence area were collected and compiled in a database for mapping ground-subsidence hazard (GSH). The subsidence area was randomly split 50/50 for training and validation of the models. A data-mining classification technique was applied to the GSH mapping, and decision trees were constructed using the chi-squared automatic interaction detector (CHAID) and the quick, unbiased, and efficient statistical tree (QUEST) algorithms. The frequency ratio model was also applied to the GSH mapping for comparison with a probabilistic model. The resulting GSH maps were validated using area-under-the-curve (AUC) analysis with the subsidence area data that had not been used for training the model. The highest accuracy was achieved by the decision tree model using the CHAID algorithm (94.01%), compared with the QUEST algorithm (90.37%) and the frequency ratio model (86.70%). These accuracies are higher than previously reported results for decision trees. Decision tree methods can therefore be used efficiently for GSH analysis and might be widely used for prediction of various spatial events. PMID:23702378
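    The validation step can be illustrated with a small AUC computation. The sketch assumes a ROC-style AUC, i.e. the probability that a randomly chosen subsidence cell receives a higher hazard score than a randomly chosen non-subsidence cell; the study may have used a different curve, and the scores and labels here are hypothetical.

```python
def auc(scores, labels):
    """ROC-style AUC by pairwise comparison; ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical hazard scores for six validation cells (1 = subsidence).
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]
area = auc(scores, labels)
```

    The pairwise form is quadratic in the number of cells, so a rank-based formulation would be preferable for a full raster map.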

  15. Combining evolutionary algorithms with oblique decision trees to detect bent double galaxies

    SciTech Connect

    Cantu-Paz, E; Kamath, C

    2000-06-22

    Decision trees have long been popular in classification as they use simple and easy-to-understand tests at each node. Most variants of decision trees test a single attribute at a node, leading to axis-parallel trees, where the test results in a hyperplane which is parallel to one of the dimensions in the attribute space. These trees can be rather large and inaccurate in cases where the concept to be learnt is best approximated by oblique hyperplanes. In such cases, it may be more appropriate to use an oblique decision tree, where the decision at each node is a linear combination of the attributes. Oblique decision trees have not gained wide popularity in part due to the complexity of constructing good oblique splits and the tendency of existing splitting algorithms to get stuck in local minima. Several alternatives have been proposed to handle these problems including randomization in conjunction with deterministic hill climbing and the use of simulated annealing. In this paper, they use evolutionary algorithms (EAs) to determine the split. EAs are well suited for this problem because of their global search properties, their tolerance to noisy fitness evaluations, and their scalability to large dimensional search spaces. They demonstrate the technique on a practical problem from astronomy, namely, the classification of galaxies with a bent-double morphology, and describe their experiences with several split evaluation criteria.
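    The difference between an axis-parallel test and an oblique test, and the idea of searching for the oblique split by evolution, can be sketched as follows. The search shown is a simple (1+1) evolution strategy chosen for brevity; it is an assumption for illustration and not the evolutionary algorithm used by the authors, and the data set is hypothetical.

```python
import random


def oblique_side(point, weights, bias):
    """True if the point lies on the positive side of the hyperplane."""
    return sum(w * x for w, x in zip(weights, point)) + bias > 0


def split_errors(points, labels, weights, bias):
    """Number of points whose side disagrees with their boolean label."""
    return sum(oblique_side(p, weights, bias) != y
               for p, y in zip(points, labels))


def evolve_split(points, labels, weights, bias,
                 generations=200, sigma=0.3, seed=0):
    """(1+1) evolution strategy: mutate, keep the child if it is no worse."""
    rng = random.Random(seed)
    best_w, best_b = list(weights), bias
    best_err = split_errors(points, labels, best_w, best_b)
    for _ in range(generations):
        cand_w = [w + rng.gauss(0, sigma) for w in best_w]
        cand_b = best_b + rng.gauss(0, sigma)
        err = split_errors(points, labels, cand_w, cand_b)
        if err <= best_err:
            best_w, best_b, best_err = cand_w, cand_b, err
    return best_w, best_b, best_err


# Hypothetical 2-D data whose true boundary is the oblique line x + y = 1.
points = [(0.2, 0.9), (0.9, 0.2), (0.0, 0.8), (0.8, 0.0),
          (0.1, 0.2), (0.9, 0.8), (0.4, 0.4), (0.7, 0.6)]
labels = [True, True, False, False, False, True, False, True]

axis_parallel_errors = split_errors(points, labels, [1.0, 0.0], -0.5)  # x > 0.5
oblique_errors = split_errors(points, labels, [1.0, 1.0], -1.0)        # x + y > 1
_, _, evolved_errors = evolve_split(points, labels, [1.0, 0.0], -0.5)
```

    An axis-parallel test is the special case in which all weights but one are zero, which is why the search can start from the best single-attribute split.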

  16. Production of diagnostic rules from a neurotologic database with decision trees.

    PubMed

    Kentala, E; Viikki, K; Pyykkö, I; Juhola, M

    2000-02-01

    A decision tree is an artificial intelligence program that is adaptive and is closely related to a neural network, but can handle missing or nondecisive data in decision-making. Data on patients with Meniere's disease, vestibular schwannoma, traumatic vertigo, sudden deafness, benign paroxysmal positional vertigo, and vestibular neuritis were retrieved from the database of the otoneurologic expert system ONE for the development and testing of the accuracy of decision trees in the diagnostic workup. Decision trees were constructed separately for each disease. The accuracies of the best decision trees were 94%, 95%, 99%, 99%, 100%, and 100% for the respective diseases. The most important questions concerned the presence of vertigo, hearing loss, and tinnitus; duration of vertigo; frequency of vertigo attacks; severity of rotational vertigo; onset and type of hearing loss; and occurrence of head injury in relation to the timing of onset of vertigo. Meniere's disease was the most difficult to classify correctly. The validity and structure of the decision trees are easily comprehended and can be used outside the expert system. PMID:10685569

  17. Use of a decision tree to select the mud system for the Oso field, Nigeria

    SciTech Connect

    Dear, S.F. III; Beasley, R.D.; Barr, K.P.

    1995-10-01

    Far too often, the basis for selection of a mud system is the "latest, greatest" technology or personal preference rather than sound cost-effective analysis. The use of risk-vs.-cost decision analysis improves mud selection and makes it a proper business decision. Several mud systems usually are available to drill a well and, with good decision analysis, the cost-effectiveness of each alternative becomes apparent. This paper describes how the drilling team used structured decision analysis to evaluate and select the best mud system for the project. First, Monte Carlo simulations forecast the range of possible results with each alternative. The simulations provide most-likely values for the variables in the decision tree, including reasonable ranges for sensitivity analyses. This paper presents and discusses the simulations, the decision tree, and the sensitivity analyses.
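    A risk-vs.-cost decision tree of this kind can be reduced to a toy example: each alternative is a chance node with a base cost and a probability of an additional trouble cost, and the decision node picks the lowest expected cost. All names, costs, and probabilities below are hypothetical and are not values from the Oso field study.

```python
import random


def expected_cost(base_cost, trouble_prob, trouble_cost):
    """Closed-form expected value of one chance node."""
    return base_cost + trouble_prob * trouble_cost


def monte_carlo_mean(params, n, seed):
    """Estimate the same expected cost by simulating n wells."""
    base_cost, trouble_prob, trouble_cost = params
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        total += base_cost
        if rng.random() < trouble_prob:
            total += trouble_cost
    return total / n


def choose(alternatives):
    """Decision node: pick the alternative with the lowest expected cost."""
    return min(alternatives, key=lambda name: expected_cost(*alternatives[name]))


# Hypothetical alternatives: (base cost, probability of trouble, trouble cost).
alternatives = {
    "mud_A": (100.0, 0.4, 200.0),  # cheaper mud, higher risk
    "mud_B": (150.0, 0.1, 200.0),  # costlier mud, lower risk
}
best = choose(alternatives)
simulated_a = monte_carlo_mean(alternatives["mud_A"], 20000, seed=1)
```

    Rerunning `choose` while varying one probability or cost over a range is a simple form of the sensitivity analysis the abstract mentions.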

  18. Combining evolutionary algorithms with oblique decision trees to detect bent-double galaxies

    NASA Astrophysics Data System (ADS)

    Cantu-Paz, Erick; Kamath, Chandrika

    2000-10-01

    Decision trees have long been popular in classification as they use simple and easy-to-understand tests at each node. Most variants of decision trees test a single attribute at a node, leading to axis-parallel trees, where the test results in a hyperplane which is parallel to one of the dimensions in the attribute space. These trees can be rather large and inaccurate in cases where the concept to be learned is best approximated by oblique hyperplanes. In such cases, it may be more appropriate to use an oblique decision tree, where the decision at each node is a linear combination of the attributes. Oblique decision trees have not gained wide popularity in part due to the complexity of constructing good oblique splits and the tendency of existing splitting algorithms to get stuck in local minima. Several alternatives have been proposed to handle these problems including randomization in conjunction with deterministic hill-climbing and the use of simulated annealing. In this paper, we use evolutionary algorithms (EAs) to determine the split. EAs are well suited for this problem because of their global search properties, their tolerance to noisy fitness evaluations, and their scalability to large dimensional search spaces. We demonstrate our technique on a synthetic data set, and then we apply it to a practical problem from astronomy, namely, the classification of galaxies with a bent-double morphology. In addition, we describe our experiences with several split evaluation criteria. Our results suggest that, in some cases, the evolutionary approach is faster and more accurate than existing oblique decision tree algorithms. However, for our astronomical data, the accuracy is not significantly different from that of the axis-parallel trees.

  19. Using decision trees to characterize verbal communication during change and stuck episodes in the therapeutic process

    PubMed Central

    Masías, Víctor H.; Krause, Mariane; Valdés, Nelson; Pérez, J. C.; Laengle, Sigifredo

    2015-01-01

    Methods are needed for creating models to characterize verbal communication between therapists and their patients that are suitable for teaching purposes without losing analytical potential. A technique meeting these twin requirements is proposed that uses decision trees to identify both change and stuck episodes in therapist-patient communication. Three decision tree algorithms (C4.5, NBTree, and REPTree) are applied to the problem of characterizing verbal responses into change and stuck episodes in the therapeutic process. The data for the problem is derived from a corpus of 8 successful individual therapy sessions with 1760 speaking turns in a psychodynamic context. The decision tree model that performed best was generated by the C4.5 algorithm. It delivered 15 rules characterizing the verbal communication in the two types of episodes. Decision trees are a promising technique for analyzing verbal communication during significant therapy events and have much potential for use in teaching practice on changes in therapeutic communication. The development of pedagogical methods using decision trees can support the transmission of academic knowledge to therapeutic practice. PMID:25914657

  20. Pruning a decision tree for selecting computer-related assistive devices for people with disabilities.

    PubMed

    Chi, Chia-Fen; Tseng, Li-Kai; Jang, Yuh

    2012-07-01

    Many disabled individuals lack extensive knowledge about assistive technology, which could help them use computers. In 1997, Denis Anson developed a decision tree of 49 evaluative questions designed to evaluate the functional capabilities of the disabled user and choose an appropriate combination of assistive devices, from a selection of 26, that enable the individual to use a computer. In general, occupational therapists guide the disabled users through this process. They often have to go over repetitive questions in order to find an appropriate device. A disabled user may require an alphanumeric entry device, a pointing device, an output device, a performance enhancement device, or some combination of these. Therefore, the current research eliminates redundant questions and divides Anson's decision tree into multiple independent subtrees to meet the actual demand of computer users with disabilities. The modified decision tree was tested by six disabled users to prove it can determine a complete set of assistive devices with a smaller number of evaluative questions. The means to insert new categories of computer-related assistive devices was included to ensure the decision tree can be expanded and updated. The current decision tree can help the disabled users and assistive technology practitioners to find appropriate computer-related assistive devices that meet with clients' individual needs in an efficient manner. PMID:22552588

  1. Cloud detection based on decision tree over Tibetan Plateau with MODIS data

    NASA Astrophysics Data System (ADS)

    Xu, Lina; Niu, Ruiqing; Fang, Shenghui; Dong, Yanfang

    2013-10-01

    Snow cover area is a critical parameter in the Earth's hydrologic cycle and a key factor in the effects of climate change. A major obstacle in mapping snow cover is the presence of clouds. Clouds are usually easy to find in satellite images because they are bright and white at visible wavelengths, but this is not the case when there is snow or ice in the background, because snow and clouds have a similar spectral appearance. Many cloud detection methods are built on decision trees designed from empirical studies and simulations. In this paper, classification trees were used to build the decision tree. Then, using a large number of repeated scenes of the same area, each cloud pixel can be replaced by its real surface type, such as snow, vegetation, or water. The effect of clouds can be distinguished in the short-wave infrared. The results show that most of the cloud coverage was removed. A validation was carried out for all subsequent steps. It led to the removal of all remaining cloud cover. The results show that the decision tree method performed satisfactorily.

  2. Cloud Detection Based on Decision Tree Over Tibetan Plateau with MODIS Data

    NASA Astrophysics Data System (ADS)

    Xu, L.; Fang, S.; Niu, R.; Li, J.

    2012-07-01

    Snow cover area is a critical parameter in the Earth's hydrologic cycle and a key factor in the effects of climate change. A major obstacle in mapping snow cover is the presence of clouds. Clouds are usually easy to find in satellite images because they are bright and white at visible wavelengths, but this is not the case when there is snow or ice in the background, because snow and clouds have a similar spectral appearance. Many cloud detection methods are built on decision trees designed from empirical studies and simulations. In this paper, classification trees were used to build the decision tree. Then, using a large number of repeated scenes of the same area, each cloud pixel can be replaced by its real surface type, such as snow, vegetation, or water. The effect of clouds can be distinguished in the short-wave infrared. The results show that most of the cloud coverage was removed. A validation was carried out for all subsequent steps. It led to the removal of all remaining cloud cover. The results show that the decision tree method performed satisfactorily.

  3. Outsourcing the Portal: Another Branch in the Decision Tree.

    ERIC Educational Resources Information Center

    McMahon, Tim

    2000-01-01

    Discussion of the management of information resources in organizations focuses on the use of portal technologies to update intranet capabilities. Considers application outsourcing decisions, reviews benefits (including reducing costs) as well as concerns, and describes application service providers (ASPs). (LRW)

  4. A modified decision tree algorithm based on genetic algorithm for mobile user classification problem.

    PubMed

    Liu, Dong-sheng; Fan, Shu-jiang

    2014-01-01

    In order to offer mobile customers better service, mobile users should first be classified. To address the limitations of previous classification methods, this paper puts forward a modified decision tree algorithm for mobile user classification, which introduces a genetic algorithm to optimize the results of the decision tree algorithm. We also take context information as a classification attribute for the mobile user, and we classify the context into public context and private context classes. Then we analyze the processes and operators of the algorithm. Finally, we run an experiment on mobile users with the algorithm; we can classify mobile users into Basic service user, E-service user, Plus service user, and Total service user classes, and we can also derive some rules about mobile users. Compared to the C4.5 decision tree algorithm and the SVM algorithm, the algorithm proposed in this paper has higher accuracy and greater simplicity. PMID:24688389

  5. A Modified Decision Tree Algorithm Based on Genetic Algorithm for Mobile User Classification Problem

    PubMed Central

    Liu, Dong-sheng; Fan, Shu-jiang

    2014-01-01

    In order to offer mobile customers better service, mobile users should first be classified. To address the limitations of previous classification methods, this paper puts forward a modified decision tree algorithm for mobile user classification, which introduces a genetic algorithm to optimize the results of the decision tree algorithm. We also take context information as a classification attribute for the mobile user, and we classify the context into public context and private context classes. Then we analyze the processes and operators of the algorithm. Finally, we run an experiment on mobile users with the algorithm; we can classify mobile users into Basic service user, E-service user, Plus service user, and Total service user classes, and we can also derive some rules about mobile users. Compared to the C4.5 decision tree algorithm and the SVM algorithm, the algorithm proposed in this paper has higher accuracy and greater simplicity. PMID:24688389

  6. Post-event human decision errors: operator action tree/time reliability correlation

    SciTech Connect

    Hall, R E; Fragola, J; Wreathall, J

    1982-11-01

    This report documents an interim framework for the quantification of the probability of errors of decision on the part of nuclear power plant operators after the initiation of an accident. The framework can easily be incorporated into an event tree/fault tree analysis. The method presented consists of a structure called the operator action tree and a time reliability correlation which assumes the time available for making a decision to be the dominating factor in situations requiring cognitive human response. This limited approach decreases the magnitude and complexity of the decision modeling task. Specifically, in the past, some human performance models have attempted prediction by trying to emulate sequences of human actions, or by identifying and modeling the information processing approach applicable to the task. The model developed here is directed at describing the statistical performance of a representative group of hypothetical individuals responding to generalized situations.

  7. The decision - identification tree: A new EIS scoping tool

    SciTech Connect

    Eccleston, C.H.

    1997-04-02

    No single methodology has been developed or universally accepted for determining the scope of an Environmental Impact Statement (EIS). Most typically, the scope is determined by first identifying actions and facilities to be analyzed. Yet, agencies sometimes complete an EIS, only to discover that the scope does not adequately address decisions that need to be made. Such discrepancies can often be traced to disconnects between the scoping process and the actual decision making that follows. A new tool, for use in a value engineering setting, provides an effective methodology for improving the EIS scoping process. Application of this tool is not limited to National Environmental Policy Act (NEPA) scoping efforts. This tool, could in fact, be used to map potential decision points for a range of diverse planning applications and exercises.

  8. Induction of somatic embryogenesis in gum arabic tree [Acacia senegal (L.) Willd].

    PubMed

    Rathore, Jitendra Singh; Rai, Manoj K; Shekhawat, N S

    2012-10-01

    Factors affecting somatic embryogenesis from immature cotyledons of the gum arabic tree [Acacia senegal (L.) Willd.] were investigated. Induction of somatic embryogenesis was influenced by plant growth regulator concentrations and the addition of amino acids to the medium. The best induction of somatic embryogenesis was obtained on MS medium supplemented with 0.45 μM 2,4-D, 2.32 μM Kin and 15 mM L-glutamine. L-glutamine plays a significant role in the maturation of somatic embryos, and most embryos attained maturity only on medium containing L-glutamine (15 mM). The maximum germination of somatic embryos (75.0 ± 2.5%) was recorded on medium containing 0.22 μM BAP. PMID:24082503

  9. Decision tree for the binding of dipeptides to the thermally fluctuating surface of cathepsin K

    NASA Astrophysics Data System (ADS)

    Nishiyama, Katsuhiko

    2016-03-01

    The behavior of 15 dipeptides on thermally fluctuating cathepsin K was investigated by molecular dynamics and docking simulations. Four dipeptides were distributed on sites near the active center, and the variations were small. Eleven dipeptides were distributed on sites far from the active center, and the variations were large for nine dipeptides and very large for the other two. The decision tree was constructed using genetic programming, and it accurately classified the 15 dipeptides. The decision tree would accurately estimate the behavior of various peptides, and should significantly contribute to the design of useful peptides.

  10. Fuzzy decision trees for planning and autonomous control of a coordinated team of UAVs

    NASA Astrophysics Data System (ADS)

    Smith, James F., III; Nguyen, ThanhVu H.

    2007-04-01

    A fuzzy logic resource manager that enables a collection of unmanned aerial vehicles (UAVs) to automatically cooperate to make meteorological measurements will be discussed. Once in flight no human intervention is required. Planning and real-time control algorithms determine the optimal trajectory and points each UAV will sample, while taking into account the UAVs' risk, risk tolerance, reliability, mission priority, fuel limitations, mission cost, and related uncertainties. The control algorithm permits newly obtained information about weather and other events to be introduced to allow the UAVs to be more effective. The approach is illustrated by a discussion of the fuzzy decision tree for UAV path assignment and related simulation. The different fuzzy membership functions on the tree are described in mathematical detail. The different methods by which this tree is obtained are summarized including a method based on using a genetic program as a data mining function. A second fuzzy decision tree that allows the UAVs to automatically collaborate without human intervention is discussed. This tree permits three different types of collaborative behavior between the UAVs. Simulations illustrating how the tree allows the different types of collaboration to be automated are provided. Simulations also show the ability of the control algorithm to allow UAVs to effectively cooperate to increase the UAV team's likelihood of success.

  11. Test Reviews: Euler, B. L. (2007). "Emotional Disturbance Decision Tree". Lutz, FL: Psychological Assessment Resources

    ERIC Educational Resources Information Center

    Tansy, Michael

    2009-01-01

    The Emotional Disturbance Decision Tree (EDDT) is a teacher-completed norm-referenced rating scale published by Psychological Assessment Resources, Inc., in Lutz, Florida. The 156-item EDDT was developed for use as part of a broader assessment process to screen and assist in the identification of 5- to 18-year-old children for the special…

  12. What Satisfies Students?: Mining Student-Opinion Data with Regression and Decision Tree Analysis

    ERIC Educational Resources Information Center

    Thomas, Emily H.; Galambos, Nora

    2004-01-01

    To investigate how students' characteristics and experiences affect satisfaction, this study uses regression and decision tree analysis with the CHAID algorithm to analyze student-opinion data. A data mining approach identifies the specific aspects of students' university experience that most influence three measures of general satisfaction. The…

  13. Classification and concentration estimation of explosive precursors using nanowires sensor array and decision tree learning

    NASA Astrophysics Data System (ADS)

    Cho, Junghwan; Li, Xiaopeng; Gu, Zhiyong; Kurup, Pradeep

    2011-09-01

    This paper aims to classify and estimate concentrations of explosive precursors using a nanowire sensor array and decision tree learning algorithm. The nanowire sensor array consists of tin oxide sensors with four different additives, platinum (Pt), copper (Cu), indium (In), and nickel (Ni). The nanowire sensor array was tested using the vapors from four explosive precursors, acetone, nitrobenzene, nitrotoluene, and octane, with 10 different concentration levels each. A pattern recognition technique based on decision tree learning was applied to classify the explosive precursors and estimate their concentration. Classification and regression tree (CART) analysis was used for classification. The CART was also utilized for the purpose of structure identification in Sugeno fuzzy inference system (FIS) for estimating the concentration of the precursors. Two CARTs were trained and their testing results were investigated.

  14. Data-Mining-Based Coronary Heart Disease Risk Prediction Model Using Fuzzy Logic and Decision Tree

    PubMed Central

    Kim, Jaekwon; Lee, Jongsik

    2015-01-01

    Objectives The importance of the prediction of coronary heart disease (CHD) has been recognized in Korea; however, few studies have been conducted in this area. Therefore, it is necessary to develop a method for the prediction and classification of CHD in Koreans. Methods A model for CHD prediction must be designed according to rule-based guidelines. In this study, a fuzzy logic and decision tree (classification and regression tree [CART])-driven CHD prediction model was developed for Koreans. Datasets derived from the Korean National Health and Nutrition Examination Survey VI (KNHANES-VI) were utilized to generate the proposed model. Results The rules were generated using a decision tree technique, and fuzzy logic was applied to overcome problems associated with uncertainty in CHD prediction. Conclusions The accuracy and receiver operating characteristic (ROC) curve values of the proposed system were 69.51% and 0.594, proving that the proposed methods were more efficient than other models. PMID:26279953

  15. Time-lapse electromagnetic induction surveys under olive tree canopies reveal soil moisture dynamics and controls

    NASA Astrophysics Data System (ADS)

    Martínez, Gonzalo; Giraldez Cervera, Juan Vicente; Vanderlinden, Karl

    2015-04-01

    Soil moisture (θ) is a critical variable that exerts an important control on plant status and development. Soil sampling, neutron attenuation and electromagnetic methods such as TDR or FDR have been used widely to measure θ and provide point data at a possible range of temporal resolutions. However, these methods require either destructive sampling or permanently installed devices with often limiting measurement depths, or are extremely time-consuming. Moreover, the small support of such measurements compromises their value in heterogeneous soils. To overcome such limitations electromagnetic induction (EMI) can be tested to monitor θ at different spatial and temporal scales. This work investigates the potential of EMI to characterize the spatio-temporal variability of soil moisture from apparent electrical conductivity (ECa) under the canopy of individual olive trees. During one year we measured θ with a frequency of 5 min and ECa on an approximately weekly basis along transects from the tree trunk towards the inter-row area. CS-616 soil moisture sensors were horizontally installed in the walls of a trench at depths of 0.1, 0.2, 0.4, 0.6 and 0.8 m at five locations along the transect, with a separation of 0.8 m. The Dualem-21S sensor was used to measure weekly the ECa at 0.2 m increments, from the tree trunk to a distance of 4.4 m. The results showed similar drying and wetting patterns for θ and ECa. Both variables showed a decreasing pattern from the tree trunk towards the drip line, followed by a sharp increment and constant values towards the center of the inter-row space. This pattern reflects clearly the influence of root-zone water uptake under the tree canopy and higher θ values in the inter-row area where root-water uptake is smaller. Time-lapse ECa data responded to evaporation and infiltration fluxes with the highest sensitivity for the 1 and 1.5 m ECa signals, as compared to the 0.5 and 3.0 m signals. Overall these preliminary results revealed the

  16. Minimizing the cost of translocation failure with decision-tree models that predict species' behavioral response in translocation sites.

    PubMed

    Ebrahimi, Mehregan; Ebrahimie, Esmaeil; Bull, C Michael

    2015-08-01

    The high number of failures is one reason why translocation is often not recommended. Considering how behavior changes during translocations may improve translocation success. To derive decision-tree models for species' translocation, we used data on the short-term responses of an endangered Australian skink in 5 simulated translocations with different release conditions. We used 4 different decision-tree algorithms (decision tree, decision-tree parallel, decision stump, and random forest) with 4 different criteria (gain ratio, information gain, gini index, and accuracy) to investigate how environmental and behavioral parameters may affect the success of a translocation. We assumed behavioral changes that increased dispersal away from a release site would reduce translocation success. The trees became more complex when we included all behavioral parameters as attributes, but these trees yielded more detailed information about why and how dispersal occurred. According to these complex trees, there were positive associations between some behavioral parameters, such as fight and dispersal, that showed there was a higher chance, for example, of dispersal among lizards that fought than among those that did not fight. Decision trees based on parameters related to release conditions were easier to understand and could be used by managers to make translocation decisions under different circumstances. PMID:25737134
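The split criteria named in the abstract above (information gain, gain ratio, gini index) can be illustrated with a minimal Python sketch. The label values ("stay", "disperse") and the example split are hypothetical and are not taken from the study; the formulas are the textbook definitions, not necessarily the exact variants implemented in the software the authors used.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def split_scores(parent, children):
    """Score one candidate split of `parent` into the non-empty `children` lists."""
    n = len(parent)
    weights = [len(child) / n for child in children]
    info_gain = entropy(parent) - sum(
        w * entropy(child) for w, child in zip(weights, children)
    )
    # Split information penalises splits into many small branches.
    split_info = -sum(w * math.log2(w) for w in weights if w > 0)
    gain_ratio = info_gain / split_info if split_info > 0 else 0.0
    gini_decrease = gini(parent) - sum(
        w * gini(child) for w, child in zip(weights, children)
    )
    return {
        "information_gain": info_gain,
        "gain_ratio": gain_ratio,
        "gini_decrease": gini_decrease,
    }


# Hypothetical outcome labels for eight released animals, split on one attribute.
parent = ["stay"] * 4 + ["disperse"] * 4
children = [["stay"] * 3 + ["disperse"], ["disperse"] * 3 + ["stay"]]
print(split_scores(parent, children))
```

A tree learner would compute such scores for every candidate attribute at a node and split on the best one; which criterion is used can change which attribute wins.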

  17. Tools of the Future: How Decision Tree Analysis Will Impact Mission Planning

    NASA Technical Reports Server (NTRS)

    Otterstatter, Matthew R.

    2005-01-01

    The universe is infinitely complex; however, the human mind has a finite capacity. The multitude of possible variables, metrics, and procedures in mission planning are far too many to address exhaustively. This is unfortunate because, in general, considering more possibilities leads to more accurate and more powerful results. To compensate, we can get more insightful results by employing our greatest tool, the computer. The power of the computer will be utilized through a technology that considers every possibility, decision tree analysis. Although decision trees have been used in many other fields, this is innovative for space mission planning. Because this is a new strategy, no existing software is able to completely accommodate all of the requirements. This was determined through extensive research and testing of current technologies. It was necessary to create original software, for which a short-term model was finished this summer. The model was built into Microsoft Excel to take advantage of the familiar graphical interface for user input, computation, and viewing output. Macros were written to automate the process of tree construction, optimization, and presentation. The results are useful and promising. If this tool is successfully implemented in mission planning, our reliance on old-fashioned heuristics, an error-prone shortcut for handling complexity, will be reduced. The computer algorithms involved in decision trees will revolutionize mission planning. The planning will be faster and smarter, leading to optimized missions with the potential for more valuable data.

  18. Using attribute behavior diversity to build accurate decision tree committees for microarray data.

    PubMed

    Han, Qian; Dong, Guozhu

    2012-08-01

    DNA microarrays (gene chips), frequently used in biological and medical studies, measure the expressions of thousands of genes per sample. Using microarray data to build accurate classifiers for diseases is an important task. This paper introduces an algorithm, called Committee of Decision Trees by Attribute Behavior Diversity (CABD), to build highly accurate ensembles of decision trees for such data. Since a committee's accuracy is greatly influenced by the diversity among its member classifiers, CABD uses two new ideas to "optimize" that diversity, namely (1) the concept of attribute behavior-based similarity between attributes, and (2) the concept of attribute usage diversity among trees. The ideas are effective for microarray data, since such data have many features and behavior similarity between genes can be high. Experiments on microarray data for six cancers show that CABD outperforms previous ensemble methods significantly and outperforms SVM, and show that the diversified features used by CABD's decision tree committee can be used to improve performance of other classifiers such as SVM. CABD has potential for other high-dimensional data, and its ideas may apply to ensembles of other classifier types. PMID:22809418

  19. Validating a decision tree for serious infection: diagnostic accuracy in acutely ill children in ambulatory care

    PubMed Central

    Verbakel, Jan Y; Lemiengre, Marieke B; De Burghgraeve, Tine; De Sutter, An; Aertgeerts, Bert; Bullens, Dominique M A; Shinkins, Bethany; Van den Bruel, Ann; Buntinx, Frank

    2015-01-01

    Objective Acute infection is the most common presentation of children in primary care with only few having a serious infection (eg, sepsis, meningitis, pneumonia). To avoid complications or death, early recognition and adequate referral are essential. Clinical prediction rules have the potential to improve diagnostic decision-making for rare but serious conditions. In this study, we aimed to validate a recently developed decision tree in a new but similar population. Design Diagnostic accuracy study validating a clinical prediction rule. Setting and participants Acutely ill children presenting to ambulatory care in Flanders, Belgium, consisting of general practice and paediatric assessment in outpatient clinics or the emergency department. Intervention Physicians were asked to score the decision tree in every child. Primary outcome measures The outcome of interest was hospital admission for at least 24 h with a serious infection within 5 days after initial presentation. We report the diagnostic accuracy of the decision tree in sensitivity, specificity, likelihood ratios and predictive values. Results In total, 8962 acute illness episodes were included, of which 283 led to admission to hospital with a serious infection. Sensitivity of the decision tree was 100% (95% CI 71.5% to 100%) at a specificity of 83.6% (95% CI 82.3% to 84.9%) in the general practitioner setting with 17% of children testing positive. In the paediatric outpatient and emergency department setting, sensitivities were below 92%, with specificities below 44.8%. Conclusions In an independent validation cohort, this clinical prediction rule has been shown to be extremely sensitive to identify children at risk of hospital admission for a serious infection in general practice, making it suitable for ruling out. Trial registration number NCT02024282. PMID:26254472
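The accuracy measures listed in the abstract above (sensitivity, specificity, likelihood ratios, predictive values) all derive from a 2x2 table of test result against outcome. The following Python sketch shows the standard definitions; the counts in the example call are hypothetical and are not the study's data.

```python
def diagnostic_accuracy(tp, fp, fn, tn):
    """Standard diagnostic accuracy measures from a 2x2 table.

    tp/fn: diseased cases flagged / missed by the rule
    fp/tn: non-diseased cases flagged / cleared by the rule
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # Likelihood ratios; guard against division by zero at perfect values.
    lr_positive = sensitivity / (1 - specificity) if specificity < 1 else float("inf")
    lr_negative = (1 - sensitivity) / specificity if specificity > 0 else float("inf")
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_positive": lr_positive,
        "lr_negative": lr_negative,
        "ppv": tp / (tp + fp),  # positive predictive value
        "npv": tn / (tn + fn),  # negative predictive value
    }


# Hypothetical counts, for illustration only.
print(diagnostic_accuracy(tp=9, fp=20, fn=1, tn=70))
```

When no diseased case is missed (fn = 0), sensitivity is 1 and the negative likelihood ratio is 0, which is the property that makes a rule useful for ruling out.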

  20. Using Boosted Decision Trees to Separate Signal and Background in B to XsGamma Decays

    SciTech Connect

    Barber, James; /Massachusetts U., Amherst /SLAC

    2006-09-27

    The measurement of the branching fraction of the flavor changing neutral current B → X_s γ transition can be used to expose physics outside the Standard Model. In order to make a precise measurement of this inclusive branching fraction, it is necessary to be able to effectively separate signal and background in the data. In order to achieve better separation, an algorithm based on Boosted Decision Trees (BDTs) is implemented. Using Monte Carlo simulated events, "forests" of trees were trained and tested with different sets of parameters. This parameter space was studied with the goal of maximizing the figure of merit, Q, the measure of separation quality used in this analysis. It is found that the use of 1000 trees, with 100 values tested for each variable at each node, and 50 events required for a node to continue separating give the highest figure of merit, Q = 18.37.

  1. Decision tree approach for classification of remotely sensed satellite data using open source support

    NASA Astrophysics Data System (ADS)

    Sharma, Richa; Ghosh, Aniruddha; Joshi, P. K.

    2013-10-01

    In this study, an attempt has been made to develop a decision tree classification (DTC) algorithm for classification of remotely sensed satellite data (Landsat TM) using open source support. The decision tree is constructed by recursively partitioning the spectral distribution of the training dataset using WEKA, open source data mining software. The classified image is compared with the image classified using classical ISODATA clustering and Maximum Likelihood Classifier (MLC) algorithms. The classification result based on the DTC method provided better visual depiction than results produced by ISODATA clustering or by MLC algorithms. The overall accuracy was found to be 90% (kappa = 0.88) using the DTC, 76.67% (kappa = 0.72) using the Maximum Likelihood and 57.5% (kappa = 0.49) using ISODATA clustering method. Based on the overall accuracy and kappa statistics, DTC was found to be a more preferred classification approach than the others.
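Overall accuracy and the kappa statistic, the two figures reported in the abstract above, are both computed from a confusion matrix of reference classes against classified classes. A minimal Python sketch follows; the matrix used is hypothetical and does not reproduce the study's results.

```python
def accuracy_and_kappa(matrix):
    """Overall accuracy and Cohen's kappa from a square confusion matrix.

    matrix[i][j] = number of samples of reference class i assigned to class j.
    """
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical two-class confusion matrix, for illustration only.
accuracy, kappa = accuracy_and_kappa([[45, 5], [5, 45]])
print(accuracy, kappa)
```

Kappa is lower than overall accuracy because it discounts the agreement that the class proportions alone would produce by chance.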

  2. Circum-Arctic petroleum systems identified using decision-tree chemometrics

    USGS Publications Warehouse

    Peters, K.E.; Ramos, L.S.; Zumberge, J.E.; Valin, Z.C.; Scotese, C.R.; Gautier, D.L.

    2007-01-01

    Source- and age-related biomarker and isotopic data were measured for more than 1000 crude oil samples from wells and seeps collected above approximately 55°N latitude. A unique, multitiered chemometric (multivariate statistical) decision tree was created that allowed automated classification of 31 genetically distinct circum-Arctic oil families based on a training set of 622 oil samples. The method, which we call decision-tree chemometrics, uses principal components analysis and multiple tiers of K-nearest neighbor and SIMCA (soft independent modeling of class analogy) models to classify and assign confidence limits for newly acquired oil samples and source rock extracts. Geochemical data for each oil sample were also used to infer the age, lithology, organic matter input, depositional environment, and identity of its source rock. These results demonstrate the value of large petroleum databases where all samples were analyzed using the same procedures and instrumentation. Copyright © 2007. The American Association of Petroleum Geologists. All rights reserved.

  3. Identifying Risk and Protective Factors in Recidivist Juvenile Offenders: A Decision Tree Approach.

    PubMed

    Ortega-Campos, Elena; García-García, Juan; Gil-Fenoy, Maria José; Zaldívar-Basurto, Flor

    2016-01-01

    Research on juvenile justice aims to identify profiles of risk and protective factors in juvenile offenders. This paper presents a study of profiles of risk factors that influence young offenders toward committing sanctionable antisocial behavior (S-ASB). Decision tree analysis is used as a multivariate approach to the phenomenon of repeated sanctionable antisocial behavior in juvenile offenders in Spain. The study sample was made up of the set of juveniles who were charged in a court case in the Juvenile Court of Almeria (Spain). The period of study of recidivism was two years from the baseline. The object of study is presented, through the implementation of a decision tree. Two profiles of risk and protective factors are found. Risk factors associated with higher rates of recidivism are antisocial peers, age at baseline S-ASB, problems in school and criminality in family members. PMID:27611313

  4. Using Decision Trees for Estimating Mode Choice of Trips in Buca-Izmir

    NASA Astrophysics Data System (ADS)

    Oral, L. O.; Tecim, V.

    2013-05-01

    Decision makers develop transportation plans and models for providing sustainable transport systems in urban areas. Mode Choice is one of the stages in transportation modelling. Data mining techniques can discover factors affecting the mode choice. These techniques can be applied with knowledge process approach. In this study a data mining process model is applied to determine the factors affecting the mode choice with decision trees techniques by considering individual trip behaviours from household survey data collected within Izmir Transportation Master Plan. From this perspective transport mode choice problem is solved on a case in district of Buca-Izmir, Turkey with CRISP-DM knowledge process model.

  5. Office of Legacy Management Decision Tree for Solar Photovoltaic Projects - 13317

    SciTech Connect

    Elmer, John; Butherus, Michael; Barr, Deborah L.

    2013-07-01

    To support consideration of renewable energy power development as a land reuse option, the DOE Office of Legacy Management (LM) and the National Renewable Energy Laboratory (NREL) established a partnership to conduct an assessment of wind and solar renewable energy resources on LM lands. From a solar capacity perspective, the larger sites in the western United States present opportunities for constructing solar photovoltaic (PV) projects. A detailed analysis and preliminary plan was developed for three large sites in New Mexico, assessing the costs, the conceptual layout of a PV system, and the electric utility interconnection process. As a result of the study, a 1,214-hectare (3,000-acre) site near Grants, New Mexico, was chosen for further study. The state incentives, utility connection process, and transmission line capacity were key factors in assessing the feasibility of the project. LM's Durango, Colorado, Disposal Site was also chosen for consideration because the uranium mill tailings disposal cell is on a hillside facing south, transmission lines cross the property, and the community was very supportive of the project. LM worked with the regulators to demonstrate that the disposal cell's long-term performance would not be impacted by the installation of a PV solar system. A number of LM-unique issues were resolved in making the site available for a private party to lease a portion of the site for a solar PV project. A lease was awarded in September 2012. Using a solar decision tree that was developed and launched by the EPA and NREL, LM has modified and expanded the decision tree structure to address the unique aspects and challenges faced by LM on its multiple sites. The LM solar decision tree covers factors such as land ownership, usable acreage, financial viability of the project, stakeholder involvement, and transmission line capacity. As additional sites are transferred to LM in the future, the decision tree will assist in determining whether a solar

  6. MODIS Snow Cover Mapping Decision Tree Technique: Snow and Cloud Discrimination

    NASA Technical Reports Server (NTRS)

    Riggs, George A.; Hall, Dorothy K.

    2010-01-01

    Accurate mapping of snow cover continues to challenge cryospheric scientists and modelers. The Moderate-Resolution Imaging Spectroradiometer (MODIS) snow data products have been used since 2000 by many investigators to map and monitor snow cover extent for various applications. Users have reported on the utility of the products and also on problems encountered. Three problems or hindrances in the use of the MODIS snow data products that have been reported in the literature are: cloud obscuration, snow/cloud confusion, and snow omission errors in thin or sparse snow cover conditions. Implementation of the MODIS snow algorithm in a decision tree technique using surface reflectance input to mitigate those problems is being investigated. The objective of this work is to use a decision tree structure for the snow algorithm. This should alleviate snow/cloud confusion and omission errors and provide a snow map with classes that convey information on how snow was detected, e.g. snow under clear sky, snow under cloud, to enable users' flexibility in interpreting and deriving a snow map. Results of a snow cover decision tree algorithm are compared to the standard MODIS snow map and found to exhibit improved ability to alleviate snow/cloud confusion in some situations allowing up to about 5% increase in mapped snow cover extent, thus accuracy, in some scenes.

  7. Data mining for multiagent rules, strategies, and fuzzy decision tree structure

    NASA Astrophysics Data System (ADS)

    Smith, James F., III; Rhyne, Robert D., II; Fisher, Kristin

    2002-03-01

    A fuzzy logic based resource manager (RM) has been developed that automatically allocates electronic attack resources in real-time over many dissimilar platforms. Two different data mining algorithms have been developed to determine rules, strategies, and fuzzy decision tree structure. The first data mining algorithm uses a genetic algorithm as a data mining function and is called from an electronic game. The game allows a human expert to play against the resource manager in a simulated battlespace with each of the defending platforms being exclusively directed by the fuzzy resource manager and the attacking platforms being controlled by the human expert or operating autonomously under their own logic. This approach automates the data mining problem. The game automatically creates a database reflecting the domain expert's knowledge. It calls a data mining function, a genetic algorithm, for data mining of the database as required and allows easy evaluation of the information mined in the second step. The criterion for re-optimization is discussed as well as experimental results. Then a second data mining algorithm that uses a genetic program as a data mining function is introduced to automatically discover fuzzy decision tree structures. Finally, a fuzzy decision tree generated through this process is discussed.

  8. Flood-type classification in mountainous catchments using crisp and fuzzy decision trees

    NASA Astrophysics Data System (ADS)

    Sikorska, Anna E.; Viviroli, Daniel; Seibert, Jan

    2015-10-01

    Floods are governed by largely varying processes and thus exhibit various behaviors. Classification of flood events into flood types and the determination of their respective frequency is therefore important for a better understanding and prediction of floods. This study presents a flood classification for identifying flood patterns at a catchment scale by means of a fuzzy decision tree. Hence, events are represented as a spectrum of six main possible flood types that are attributed with their degree of acceptance. Considered types are flash, short rainfall, long rainfall, snow-melt, rainfall on snow and, in high alpine catchments, glacier-melt floods. The fuzzy decision tree also makes it possible to acknowledge the uncertainty present in the identification of flood processes and thus allows for more reliable flood class estimates than using a crisp decision tree, which identifies one flood type per event. Based on the data set in nine Swiss mountainous catchments, it was demonstrated that this approach is less sensitive to uncertainties in the classification attributes than the classical crisp approach. These results show that the fuzzy approach bears additional potential for analyses of flood patterns at a catchment scale and thereby it provides more realistic representation of flood processes.
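The contrast drawn in the abstract above, between a fuzzy tree that assigns each event a degree of acceptance per flood type and a crisp tree that returns a single type, can be sketched in Python as follows. Everything specific here is an assumption made for illustration: the two attributes, the thresholds, the linear ramp membership function, and the reduction to three flood types are hypothetical and are not the authors' tree.

```python
def ramp(x, low, high):
    """Degree (0 to 1) to which x counts as 'high'; linear between low and high."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)


def fuzzy_flood_types(rain_duration_h, snowmelt_share):
    """Membership degree per flood type for one event (hypothetical two-node tree)."""
    melt = ramp(snowmelt_share, 0.2, 0.6)          # root node: is melt dominant?
    long_rain = ramp(rain_duration_h, 12.0, 36.0)  # second node: is the rain long?
    # Each leaf's degree is the product of the branch degrees on its path.
    return {
        "snow-melt": melt,
        "long rainfall": (1 - melt) * long_rain,
        "short rainfall": (1 - melt) * (1 - long_rain),
    }


def crisp_flood_type(memberships):
    """Crisp reading, approximated here by the single type with the largest degree."""
    return max(memberships, key=memberships.get)


degrees = fuzzy_flood_types(rain_duration_h=24.0, snowmelt_share=0.4)
print(degrees, crisp_flood_type(degrees))
```

For an event near the thresholds, the fuzzy output spreads its degrees across several types, whereas the crisp label hides how close the decision was; a small change in an uncertain attribute can flip the crisp label while only shifting the fuzzy degrees slightly.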

  9. Building Decision Trees for Characteristic Ellipsoid Method to Monitor Power System Transient Behaviors

    SciTech Connect

    Ma, Jian; Diao, Ruisheng; Makarov, Yuri V.; Etingov, Pavel V.; Zhou, Ning; Dagle, Jeffery E.

    2010-12-01

    The characteristic ellipsoid is a new method to monitor the dynamics of power systems. Decision trees (DTs) play an important role in applying the characteristic ellipsoid method to system operation and analysis. This paper presents the idea and initial results of building DTs for detecting transient dynamic events using the characteristic ellipsoid method. The objective is to determine fault types, fault locations and clearance time in the system using decision trees based on ellipsoids of system transient responses. The New England 10-machine 39-bus system is used for running dynamic simulations to generate a sufficiently large number of transient events in different system configurations. Comprehensive transient simulations considering three fault types, two fault clearance times and different fault locations were conducted in the study. Bus voltage magnitudes and monitored reactive and active power flows are recorded as the phasor measurements to calculate characteristic ellipsoids whose volume, eccentricity, center and projection of the longest axis are used as indices to build decision trees. The DT performances are tested and compared by considering different sets of PMU locations. The proposed method demonstrates that the characteristic ellipsoid method is a very efficient and promising tool to monitor power system dynamic behaviors.

  10. Gene selection for cancer identification: a decision tree model empowered by particle swarm optimization algorithm

    PubMed Central

    2014-01-01

    Background In the application of microarray data, how to select a small number of informative genes from thousands of genes that may contribute to the occurrence of cancers is an important issue. Many researchers use various computational intelligence methods to analyze gene expression data. Results To achieve efficient gene selection from thousands of candidate genes that can contribute in identifying cancers, this study aims at developing a novel method utilizing particle swarm optimization combined with a decision tree as the classifier. This study also compares the performance of our proposed method with other well-known benchmark classification methods (support vector machine, self-organizing map, back propagation neural network, C4.5 decision tree, Naive Bayes, CART decision tree, and artificial immune recognition system) and conducts experiments on 11 gene expression cancer datasets. Conclusion Based on statistical analysis, our proposed method outperforms other popular classifiers for all test datasets, and is comparable to SVM for certain specific datasets. Further, the housekeeping genes with various expression patterns and tissue-specific genes are identified. These genes provide a high discrimination power on cancer classification. PMID:24555567

  11. A decision tree for selecting the most cost-effective waste disposal strategy in foodservice operations.

    PubMed

    Wie, Seunghee; Shanklin, Carol W; Lee, Kyung-Eun

    2003-04-01

    The purposes of this study were to determine costs of disposal strategies for wastes generated in foodservice operations and to develop a decision tree to determine the most cost-effective disposal strategy for foodservice operations. Four cases, including the central food processing center (CFPC) in a school district, a continuing-care retirement center (CCRC), a university dining center (UDC), and a commercial chain restaurant (CCR), were studied to determine the most cost-effective disposal strategy. Annual costs for the current and projected strategies were determined for each case. Results of waste characterization studies and stopwatch studies, interviews with foodservice directors, and water flow and electrical requirements from manufacturers' specifications were used to determine cost incurred. The annual percentage increases for labor, fees, and services were used to reflect an inflated economic condition for the ensuing 10 years of the study period. The Net Present Worth method was used to compare costs of strategies, and the multiparameter sensitivity analysis was conducted to examine the tolerance of the chosen strategy. The most cost-effective strategy differed among foodservice operations because of the composition of food and packaging wastes, the quantity of recyclable materials, the waste-hauling charges, labor costs, start-up costs, and inflation rate. For example, the use of a garbage disposal for food waste and landfills and recycling for packaging waste were the most cost-effective strategies for the CCRC. A decision tree was developed to illustrate the decision-making process that occurs when conducting cost analysis and subsequent decisions. Dietetics practitioners can use the decision tree when evaluating the results of the cost analysis. PMID:12669011
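The cost comparison described in the abstract above, with annually increasing costs over a 10-year period compared by the Net Present Worth method, can be sketched in Python. The discount rate, the cost figures, and the strategy names in the example are assumptions for illustration; the abstract does not report them.

```python
def net_present_worth(first_year_cost, annual_increase, discount_rate,
                      years=10, start_up_cost=0.0):
    """Present worth of a cost stream that grows by `annual_increase` each year.

    Costs are assumed to fall at the end of each year and are discounted
    back to today at `discount_rate` (an assumed parameter).
    """
    total = start_up_cost
    for year in range(1, years + 1):
        cost = first_year_cost * (1 + annual_increase) ** (year - 1)
        total += cost / (1 + discount_rate) ** year
    return total


# Hypothetical strategies: one with low start-up cost and high running cost,
# one with high start-up cost and low running cost.
strategies = {
    "landfill only": net_present_worth(12000.0, 0.04, 0.06),
    "disposal + recycling": net_present_worth(8000.0, 0.04, 0.06, start_up_cost=15000.0),
}
cheapest = min(strategies, key=strategies.get)
print(strategies, cheapest)
```

Because the ranking depends on the assumed rates and start-up costs, rerunning the comparison over a range of those inputs is a simple form of the sensitivity analysis the abstract mentions.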

  12. Bonsai Trees in Your Head: How the Pavlovian System Sculpts Goal-Directed Choices by Pruning Decision Trees

    PubMed Central

    O'Nions, Elizabeth; Sheridan, Luke; Dayan, Peter; Roiser, Jonathan P.

    2012-01-01

    When planning a series of actions, it is usually infeasible to consider all potential future sequences; instead, one must prune the decision tree. Provably optimal pruning is, however, still computationally ruinous and the specific approximations humans employ remain unknown. We designed a new sequential reinforcement-based task and showed that human subjects adopted a simple pruning strategy: during mental evaluation of a sequence of choices, they curtailed any further evaluation of a sequence as soon as they encountered a large loss. This pruning strategy was Pavlovian: it was reflexively evoked by large losses and persisted even when overwhelmingly counterproductive. It was also evident above and beyond loss aversion. We found that the tendency towards Pavlovian pruning was selectively predicted by the degree to which subjects exhibited sub-clinical mood disturbance, in accordance with theories that ascribe Pavlovian behavioural inhibition, via serotonin, a role in mood disorders. We conclude that Pavlovian behavioural inhibition shapes highly flexible, goal-directed choices in a manner that may be important for theories of decision-making in mood disorders. PMID:22412360
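
    The pruning strategy described above can be illustrated with a small tree search. This is a sketch under stated assumptions, not the authors' computational model: the tree layout, the reward values, the threshold, and the name `best_value` are hypothetical, and a pruned branch is simply valued at its immediate loss with nothing after it evaluated.

```python
def best_value(tree, state, depth, prune_below=None):
    """Best cumulative reward reachable from `state` within `depth` moves.

    `tree` maps a state to a list of (reward, next_state) moves.
    If `prune_below` is set, evaluation of a branch stops as soon as a move
    with a reward at or below that threshold is encountered.
    """
    if depth == 0 or not tree.get(state):
        return 0.0
    values = []
    for reward, next_state in tree[state]:
        if prune_below is not None and reward <= prune_below:
            # Large loss: curtail evaluation, ignoring anything that follows.
            values.append(float(reward))
        else:
            values.append(reward + best_value(tree, next_state, depth - 1, prune_below))
    return max(values)


# Hypothetical task: a large loss (-70) hides the most rewarding sequence.
tree = {
    "A": [(-70, "B"), (20, "C")],
    "B": [(140, "D")],
    "C": [(20, "D")],
}
full = best_value(tree, "A", depth=2)
pruned = best_value(tree, "A", depth=2, prune_below=-70)
print(full, pruned)
```

    In this toy tree the pruned search settles for the lower-valued sequence, which mirrors the abstract's point that the strategy can persist even when it is counterproductive.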

  13. Socioeconomic determinants of menarche in rural Polish girls using the decision trees method.

    PubMed

    Matusik, Stanisław; Laska-Mierzejewska, Teresa; Chrzanowska, Maria

    2011-05-01

    The aim of this study was to assess the usefulness of the decision trees method as a research method of multidimensional associations between menarche and socioeconomic variables. The article is based on data collected from the rural area of Choszczno in the West Pomerania district of Poland between 1987 and 2001. Girls were asked about the appearance of first menstruation (a yes/no method). The average menarchal age was estimated by the probit analysis method, using second-degree polynomials. The socioeconomic status of the girls' families was determined using five qualitative variables: fathers' and mothers' educational level, source of income, household appliances and the number of children in a family. For classification based on the five socioeconomic variables, CART (Classification and Regression Trees), one of the most effective algorithms, was used. In 2001 the menarchal age in 66% of examined girls was properly classified, while a higher efficiency of 70% was obtained for girls examined in 1987. The decision trees method enabled the definition of the hierarchy of socioeconomic variables influencing girls' biological development level. The strongest discriminatory power was attributed to the number of children in a family, and the mother's and then father's educational level. Using this method it is possible to detect differences in strength of socioeconomic variables associated with girls' pubescence before 1987 and after 2001 during the transformation of the economic and political systems in Poland. However, the decision trees method is infrequently applied in social sciences and constitutes a novelty; this article proves its usefulness in examining relations between biological processes and a population's living conditions. PMID:21211091

  14. Using decision trees to manage hospital readmission risk for acute myocardial infarction, heart failure, and pneumonia.

    PubMed

    Hilbert, John P; Zasadil, Scott; Keyser, Donna J; Peele, Pamela B

    2014-12-01

    To improve healthcare quality and reduce costs, the Affordable Care Act places hospitals at financial risk for excessive readmissions associated with acute myocardial infarction (AMI), heart failure (HF), and pneumonia (PN). Although predictive analytics is increasingly looked to as a means for measuring, comparing, and managing this risk, many modeling tools require data inputs that are not readily available and/or additional resources to yield actionable information. This article demonstrates how hospitals and clinicians can use their own structured discharge data to create decision trees that produce highly transparent, clinically relevant decision rules for better managing readmission risk associated with AMI, HF, and PN. For illustrative purposes, basic decision trees are trained and tested using publicly available data from the California State Inpatient Databases and an open-source statistical package. As expected, these simple models perform less well than other more sophisticated tools, with areas under the receiver operating characteristic (ROC) curve (or AUC) of 0.612, 0.583, and 0.650, respectively, but achieve a lift of 1.5 or greater for higher-risk patients with any of the three conditions. More importantly, they are shown to offer substantial advantages in terms of transparency and interpretability, comprehensiveness, and adaptability. By enabling hospitals and clinicians to identify important factors associated with readmissions, target subgroups of patients at both high and low risk, and design and implement interventions that are appropriate to the risk levels observed, decision trees serve as an ideal application for addressing the challenge of reducing hospital readmissions. PMID:25160603
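
    The two evaluation measures reported above, AUC and lift, can be computed from predicted risk scores as sketched below. The scores and labels are made-up illustrative values, not data from the study, and the function names and the choice of the top 25% of patients for lift are assumptions.

```python
def auc(scores, labels):
    """Probability that a random positive case outscores a random negative one.

    Ties count as half a win; this equals the area under the ROC curve.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def lift(scores, labels, top_fraction):
    """Event rate among the highest-scored fraction divided by the overall rate."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    k = max(1, round(top_fraction * len(ranked)))
    top_rate = sum(y for _, y in ranked[:k]) / k
    base_rate = sum(labels) / len(labels)
    return top_rate / base_rate


# Hypothetical predicted readmission risks and observed outcomes (1 = readmitted).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 0, 1, 0, 0, 1, 0, 0]
print(round(auc(scores, labels), 3), round(lift(scores, labels, 0.25), 3))
```

    A lift above 1 means the highest-risk group identified by the model is readmitted more often than the population as a whole, which is why a modest AUC can still support targeting interventions.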

  15. Cloud Detection from Satellite Imagery: A Comparison of Expert-Generated and Automatically-Generated Decision Trees

    NASA Technical Reports Server (NTRS)

    Shiffman, Smadar

    2004-01-01

    Automated cloud detection and tracking is an important step in assessing global climate change via remote sensing. Cloud masks, which indicate whether individual pixels depict clouds, are included in many of the data products that are based on data acquired on board earth satellites. Many cloud-mask algorithms have the form of decision trees, which employ sequential tests that scientists designed based on empirical astrophysics studies and astrophysics simulations. Limitations of existing cloud masks restrict our ability to accurately track changes in cloud patterns over time. In this study we explored the potential benefits of automatically-learned decision trees for detecting clouds from images acquired using the Advanced Very High Resolution Radiometer (AVHRR) instrument on board the NOAA-14 weather satellite of the National Oceanic and Atmospheric Administration. We constructed three decision trees for a sample of 8km-daily AVHRR data from 2000 using a decision-tree learning procedure provided within MATLAB(R), and compared the accuracy of the decision trees to the accuracy of the cloud mask. We used ground observations collected by the National Aeronautics and Space Administration Clouds and the Earth's Radiant Energy Systems S'COOL project as the gold standard. For the sample data, the accuracy of automatically learned decision trees was greater than the accuracy of the cloud masks included in the AVHRR data product.

  16. Supervised learning with decision tree-based methods in computational and systems biology.

    PubMed

    Geurts, Pierre; Irrthum, Alexandre; Wehenkel, Louis

    2009-12-01

    At the intersection between artificial intelligence and statistics, supervised learning allows algorithms to automatically build predictive models from just observations of a system. During the last twenty years, supervised learning has been a tool of choice to analyze the always increasing and complexifying data generated in the context of molecular biology, with successful applications in genome annotation, function prediction, or biomarker discovery. Among supervised learning methods, decision tree-based methods stand out as non parametric methods that have the unique feature of combining interpretability, efficiency, and, when used in ensembles of trees, excellent accuracy. The goal of this paper is to provide an accessible and comprehensive introduction to this class of methods. The first part of the review is devoted to an intuitive but complete description of decision tree-based methods and a discussion of their strengths and limitations with respect to other supervised learning methods. The second part of the review provides a survey of their applications in the context of computational and systems biology. PMID:20023720

  17. Return or relocate? An inductive analysis of decision-making in a disaster.

    PubMed

    Henry, Jacques

    2013-04-01

    This paper proposes an inductive analysis of the decision as to whether to return or to relocate by persons in the State of Louisiana, United States, who evacuated after Hurricanes Katrina and Rita in August and September 2005, respectively. Drawing on interviews with evacuees in these events and extensive fieldwork in the impacted area, the paper seeks to identify the folk dimensions of the decision-making process, assess their arrangements, and situate the process in the larger context of risk and resilience in an advanced society. It suggests that, despite the material and emotional upheaval experienced by affected persons, the decision-making process is a rational endeavour combining a definite set of tightly interconnected factors, involving material dimensions and substantive values that can act in concert or in conflict. In addition, it indicates that there are significant variations by geographic areas, homeownership, and kind of decision. Some theoretical implications, practical measures, and suggestions for future research are examined. PMID:23278427

  18. Application of Decision Tree Algorithm for classification and identification of natural minerals using SEM-EDS

    NASA Astrophysics Data System (ADS)

    Akkaş, Efe; Akin, Lutfiye; Evren Çubukçu, H.; Artuner, Harun

    2015-07-01

    A mineral is a natural, homogeneous solid with a definite chemical composition and a highly ordered atomic arrangement. Recently, fast and accurate mineral identification/classification has become a necessity. Energy Dispersive X-ray Spectrometers (EDS) integrated with Scanning Electron Microscopes (SEM) are used to obtain rapid and reliable elemental analysis or chemical characterization of a solid. However, mineral identification is challenging since there is a wide range of spectral data for natural minerals. The more mineralogical data are acquired, the more time the classification procedures require. Moreover, the instrumental conditions applied on a SEM-EDS differ for various applications, affecting the produced X-ray patterns even for the same mineral. This study aims to test whether the C5.0 Decision Tree is a rapid and reliable algorithm for classification and identification of various natural magmatic minerals. Ten distinct mineral groups (olivine, orthopyroxene, clinopyroxene, apatite, amphibole, plagioclase, K-feldspar, zircon, magnetite, biotite) from different igneous rocks have been analyzed on SEM-EDS. 4601 elemental X-ray intensity data have been collected under various instrumental conditions. 2400 elemental data have been used for training and the remaining 2201 data have been used to test mineral identification. The vast majority of the test data have been classified accurately. Additionally, high accuracy has been reached on minerals with similar chemical composition, such as olivine ((Mg,Fe)2[SiO4]) and orthopyroxene ((Mg,Fe)2[Si2O6]). Furthermore, two members of the amphibole group (magnesiohastingsite, tschermakite) and two of the clinopyroxene group (diopside, hedenbergite) have been accurately identified by the Decision Tree Algorithm. These results demonstrate that the C5.0 Decision Tree Algorithm is an efficient method for mineral group classification and the identification of mineral members.

  19. Decision tree-based learning to predict patient controlled analgesia consumption and readjustment

    PubMed Central

    2012-01-01

    Background Appropriate postoperative pain management contributes to earlier mobilization, shorter hospitalization, and reduced cost. The undertreatment of pain may impede short-term recovery and have a detrimental long-term effect on health. This study focuses on Patient Controlled Analgesia (PCA), which is a delivery system for pain medication. This study proposes and demonstrates how to use machine learning and data mining techniques to predict analgesic requirements and PCA readjustment. Methods The sample in this study included 1099 patients. Every patient was described by 280 attributes, including the class attribute. In addition to commonly studied demographic and physiological factors, this study emphasizes attributes related to PCA. We used decision tree-based learning algorithms to predict analgesic consumption and PCA control readjustment based on the first few hours of PCA medications. We also developed a nearest neighbor-based data cleaning method to alleviate the class-imbalance problem in PCA setting readjustment prediction. Results The prediction accuracies of total analgesic consumption (continuous dose and PCA dose) and PCA analgesic requirement (PCA dose only) by an ensemble of decision trees were 80.9% and 73.1%, respectively. Decision tree-based learning outperformed Artificial Neural Network, Support Vector Machine, Random Forest, Rotation Forest, and Naïve Bayesian classifiers in analgesic consumption prediction. The proposed data cleaning method improved the performance of every learning method in this study of PCA setting readjustment prediction. Comparative analysis identified the informative attributes from the data mining models and compared them with the correlates of analgesic requirement reported in previous works. Conclusion This study presents a real-world application of data mining to anesthesiology. Unlike previous research, this study considers a wider variety of predictive factors, including PCA demands over time. We analyzed

  20. Improving medical diagnosis reliability using Boosted C5.0 decision tree empowered by Particle Swarm Optimization.

    PubMed

    Pashaei, Elnaz; Ozen, Mustafa; Aydin, Nizamettin

    2015-08-01

    Improving the accuracy of supervised classification algorithms in biomedical applications is an active area of research. In this study, we improve the performance of Particle Swarm Optimization (PSO) combined with the C4.5 decision tree (PSO+C4.5) classifier by applying the Boosted C5.0 decision tree as the fitness function. To evaluate the effectiveness of our proposed method, it is applied to 1 microarray dataset and 5 different medical data sets obtained from UCI machine learning databases. Moreover, the results of the PSO + Boosted C5.0 implementation are compared to eight well-known benchmark classification methods (PSO+C4.5, support vector machine under the kernel of Radial Basis Function, Classification And Regression Tree (CART), C4.5 decision tree, C5.0 decision tree, Boosted C5.0 decision tree, Naive Bayes and Weighted K-Nearest neighbor). A repeated five-fold cross-validation method was used to assess the performance of the classifiers. Experimental results show that our proposed method not only improves the performance of PSO+C4.5 but also obtains higher classification accuracy compared to the other classification methods. PMID:26737960

  1. Decision Tree Classifier for Classification of Plant and Animal Micro RNA's

    NASA Astrophysics Data System (ADS)

    Pant, Bhasker; Pant, Kumud; Pardasani, K. R.

    Gene expression is regulated by miRNAs, or micro RNAs, which can be 21-23 nucleotides in length. They are non-coding RNAs which control gene expression either by translation repression or mRNA degradation. Plants and animals both contain miRNAs, which have been classified by wet lab techniques. These techniques are highly expensive, labour intensive and time consuming. Hence faster and more economical computational approaches are needed. In view of the above, a machine learning model has been developed for classification of plant and animal miRNAs using a decision tree classifier. The model has been tested on available data and gives results with 91% accuracy.

  2. Improvement and analysis of ID3 algorithm in decision-making tree

    NASA Astrophysics Data System (ADS)

    Xie, Xiao-Lan; Long, Zhen; Liao, Wen-Qi

    2015-12-01

    The cooperative system under development needs spatial analysis and related data mining technology to detect subject conflicts and redundancy, and ID3 is an important data mining algorithm. Because the logarithmic part of the traditional ID3 decision tree algorithm is rather complicated to compute, this paper derives a new formula for information gain by optimizing that logarithmic part. Experimental comparison and theoretical analysis show that the improved ID3 algorithm (IID3) has higher computational efficiency and accuracy and is therefore worth adopting.
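
    For context, the traditional information gain that ID3 uses to choose a splitting attribute can be sketched as below. This shows only the standard logarithm-based formula; the paper's optimized formula is not given in the abstract and is not reproduced here. The toy rows, labels, and attribute names are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Reduction in entropy obtained by splitting on `attribute`."""
    n = len(labels)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    # Weighted entropy of the subsets produced by the split.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical training data: four examples, two candidate attributes.
rows = [
    {"outlook": "sunny", "windy": True},
    {"outlook": "sunny", "windy": False},
    {"outlook": "rain", "windy": True},
    {"outlook": "rain", "windy": False},
]
labels = ["no", "no", "yes", "yes"]
best = max(["outlook", "windy"], key=lambda a: information_gain(rows, labels, a))
print(best)
```

    ID3 repeats this selection recursively on each subset, so the logarithm is evaluated many times during tree construction; that repeated cost is the part the abstract says the improved algorithm targets.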

  3. A comparison of student academic achievement using decision trees techniques: Reflection from University Malaysia Perlis

    NASA Astrophysics Data System (ADS)

    Aziz, Fatihah; Jusoh, Abd Wahab; Abu, Mohd Syafarudy

    2015-05-01

    A decision tree is one of the techniques in data mining for prediction. Using this method, hidden information can be extracted from abundant data and interpreted as useful knowledge. In this paper, the academic performance of students from 2002 to 2012 is examined for two faculties, the Faculty of Manufacturing Engineering and the Faculty of Microelectronic Engineering, at University Malaysia Perlis (UniMAP). The objectives of this study are to determine and compare the factors that affect students' academic achievement in the two faculties. The prediction results show that five attributes influence the students' academic performance.

  4. Are decision trees a feasible knowledge representation to guide extraction of critical information from randomized controlled trial reports?

    PubMed Central

    Chung, Grace Y; Coiera, Enrico

    2008-01-01

    Background This paper proposes the use of decision trees as the basis for automatically extracting information from published randomized controlled trial (RCT) reports. An exploratory analysis of RCT abstracts is undertaken to investigate the feasibility of using decision trees as a semantic structure. Quality-of-paper measures are also examined. Methods A subset of 455 abstracts (randomly selected from a set of 7620 retrieved from Medline from 1998 – 2006) are examined for the quality of RCT reporting, the identifiability of RCTs from abstracts, and the completeness and complexity of RCT abstracts with respect to key decision tree elements. Abstracts were manually assigned to 6 sub-groups distinguishing whether they were primary RCTs versus other design types. For primary RCT studies, we analyzed and annotated the reporting of intervention comparison, population assignment and outcome values. To measure completeness, the frequencies by which complete intervention, population and outcome information are reported in abstracts were measured. A qualitative examination of the reporting language was conducted. Results Decision tree elements are manually identifiable in the majority of primary RCT abstracts. 73.8% of a random subset was primary studies with a single population assigned to two or more interventions. 68% of these primary RCT abstracts were structured. 63% contained pharmaceutical interventions. 84% reported the total number of study subjects. In a subset of 21 abstracts examined, 71% reported numerical outcome values. Conclusion The manual identifiability of decision tree elements in the abstract suggests that decision trees could be a suitable construct to guide machine summarisation of RCTs. The presence of decision tree elements could also act as an indicator for RCT report quality in terms of completeness and uniformity. PMID:18957129

  5. Integrating Decision Tree and Hidden Markov Model (HMM) for Subtype Prediction of Human Influenza A Virus

    NASA Astrophysics Data System (ADS)

    Attaluri, Pavan K.; Chen, Zhengxin; Weerakoon, Aruna M.; Lu, Guoqing

    Multiple criteria decision making (MCDM) has significant impact in bioinformatics. In the research reported here, we explore the integration of decision tree (DT) and Hidden Markov Model (HMM) for subtype prediction of human influenza A virus. Infection with influenza viruses continues to be an important public health problem. Viral strains of subtypes H3N2 and H1N1 circulate in humans at least twice annually. Subtype detection depends mainly on the antigenic assay, which is time-consuming and not fully accurate. We have developed a Web system for accurate subtype detection of human influenza virus sequences. The preliminary experiment showed that this system is easy to use and powerful in identifying human influenza subtypes. Our next step is to examine the informative positions at the protein level and extend its current functionality to detect more subtypes. The web functions can be accessed at http://glee.ist.unomaha.edu/.

  6. Trees

    ERIC Educational Resources Information Center

    Al-Khaja, Nawal

    2007-01-01

    This is a thematic lesson plan for young learners about palm trees and the importance of taking care of them. The two part lesson teaches listening, reading and speaking skills. The lesson includes parts of a tree; the modal auxiliary, can; dialogues and a role play activity.

  7. Genetic algorithm-based neural fuzzy decision tree for mixed scheduling in ATM networks.

    PubMed

    Lin, Chin-Teng; Chung, I-Fang; Pu, Her-Chang; Lee, Tsern-Huei; Chang, Jyh-Yeong

    2002-01-01

    Future broadband integrated services networks based on asynchronous transfer mode (ATM) technology are expected to support multiple types of multimedia information with diverse statistical characteristics and quality of service (QoS) requirements. To meet these requirements, efficient scheduling methods are important for traffic control in ATM networks. Among general scheduling schemes, the rate monotonic algorithm is simple enough to be used in high-speed networks, but does not attain the high system utilization of the deadline driven algorithm. However, the deadline driven scheme is computationally complex and hard to implement in hardware. The mixed scheduling algorithm is a combination of the rate monotonic algorithm and the deadline driven algorithm; thus it can provide most of the benefits of these two algorithms. In this paper, we use the mixed scheduling algorithm to achieve high system utilization under the hardware constraint. Because there is no analytic method for schedulability testing of mixed scheduling, we propose a genetic algorithm-based neural fuzzy decision tree (GANFDT) to realize it in a real-time environment. The GANFDT combines a GA and a neural fuzzy network into a binary classification tree. This approach also exploits the power of the classification tree. Simulation results show that the GANFDT provides an efficient way of carrying out mixed scheduling in ATM networks. PMID:18244889
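
    The utilization gap between the two base algorithms mentioned above can be illustrated with the classical schedulability tests for periodic tasks. This is a sketch of the textbook Liu and Layland results, assuming independent, preemptible tasks whose deadlines equal their periods; it is not the GANFDT method, and the task set below is hypothetical.

```python
def utilization(tasks):
    """Total utilization of tasks given as (computation_time, period) pairs."""
    return sum(c / p for c, p in tasks)


def rate_monotonic_bound(n):
    """Liu and Layland utilization bound for n tasks under rate monotonic priority."""
    return n * (2 ** (1 / n) - 1)


def passes_rate_monotonic_test(tasks):
    """Sufficient (not necessary) schedulability test for rate monotonic scheduling."""
    return utilization(tasks) <= rate_monotonic_bound(len(tasks))


def passes_deadline_driven_test(tasks):
    """Schedulability test for deadline driven (earliest deadline first) scheduling."""
    return utilization(tasks) <= 1.0


# Hypothetical task set: (computation_time, period).
tasks = [(1, 4), (2, 6), (3, 10)]
print(round(utilization(tasks), 3), round(rate_monotonic_bound(len(tasks)), 3))
print(passes_rate_monotonic_test(tasks), passes_deadline_driven_test(tasks))
```

    The example task set passes the deadline driven test but exceeds the rate monotonic bound. As the abstract notes, no comparable analytic test exists for mixed scheduling, which is the gap the proposed classifier is meant to fill.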

  8. Decision-tree analysis of factors influencing rainfall-related building structure and content damage

    NASA Astrophysics Data System (ADS)

    Spekkers, M. H.; Kok, M.; Clemens, F. H. L. R.; ten Veldhuis, J. A. E.

    2014-09-01

    Flood-damage prediction models are essential building blocks in flood risk assessments. So far, little research has been dedicated to damage from small-scale urban floods caused by heavy rainfall, while there is a need for reliable damage models for this flood type among insurers and water authorities. The aim of this paper is to investigate a wide range of damage-influencing factors and their relationships with rainfall-related damage, using decision-tree analysis. For this, district-aggregated claim data from private property insurance companies in the Netherlands were analysed, for the period 1998-2011. The databases include claims of water-related damage (for example, damages related to rainwater intrusion through roofs and pluvial flood water entering buildings at ground floor). Response variables being modelled are average claim size and claim frequency, per district, per day. The set of predictors includes rainfall-related variables derived from weather radar images, topographic variables from a digital terrain model, building-related variables and socioeconomic indicators of households. Analyses were made separately for property and content damage claim data. Results of decision-tree analysis show that claim frequency is most strongly associated with maximum hourly rainfall intensity, followed by real estate value, ground floor area, household income, season (property data only), building age (property data only), fraction of homeowners (content data only), and fraction of low-rise buildings (content data only). It was not possible to develop statistically acceptable trees for average claim size. It is recommended to investigate explanations for the failure to derive models. These require the inclusion of other explanatory factors that were not used in the present study, an investigation of the variability in average claim size at different spatial scales, and the collection of more detailed insurance data that allows one to distinguish between the

  9. Decision tree analysis of factors influencing rainfall-related building damage

    NASA Astrophysics Data System (ADS)

    Spekkers, M. H.; Kok, M.; Clemens, F. H. L. R.; ten Veldhuis, J. A. E.

    2014-04-01

    Flood damage prediction models are essential building blocks in flood risk assessments. Little research has been dedicated so far to damage of small-scale urban floods caused by heavy rainfall, while there is a need for reliable damage models for this flood type among insurers and water authorities. The aim of this paper is to investigate a wide range of damage-influencing factors and their relationships with rainfall-related damage, using decision tree analysis. For this, district-aggregated claim data from private property insurance companies in the Netherlands were analysed, for the period of 1998-2011. The databases include claims of water-related damage, for example, damages related to rainwater intrusion through roofs and pluvial flood water entering buildings at ground floor. Response variables being modelled are average claim size and claim frequency, per district per day. The set of predictors includes rainfall-related variables derived from weather radar images, topographic variables from a digital terrain model, building-related variables and socioeconomic indicators of households. Analyses were made separately for property and content damage claim data. Results of decision tree analysis show that claim frequency is most strongly associated with maximum hourly rainfall intensity, followed by real estate value, ground floor area, household income, season (property data only), building age (property data only), ownership structure (content data only) and fraction of low-rise buildings (content data only). It was not possible to develop statistically acceptable trees for average claim size, which suggests that variability in average claim size is related to explanatory variables that cannot be defined at the district scale. Cross-validation results show that decision trees were able to predict 22-26% of variance in claim frequency, which is considerably better compared to results from global multiple regression models (11-18% of variance explained). Still, a

  10. Prediction of microRNA target genes using an efficient genetic algorithm-based decision tree

    PubMed Central

    Rabiee-Ghahfarrokhi, Behzad; Rafiei, Fariba; Niknafs, Ali Akbar; Zamani, Behzad

    2015-01-01

    MicroRNAs (miRNAs) are small, non-coding RNA molecules that regulate gene expression in almost all plants and animals. They play an important role in key processes, such as proliferation, apoptosis, and pathogen–host interactions. Nevertheless, the mechanisms by which miRNAs act are not fully understood. The first step toward unraveling the function of a particular miRNA is the identification of its direct targets. This step has proven to be quite challenging in animals, primarily because of incomplete complementarity between miRNAs and target mRNAs. In recent years, the use of machine-learning techniques has greatly improved the prediction of miRNA targets, avoiding the need for costly and time-consuming experiments to identify miRNA targets. Among the most important machine-learning algorithms are decision trees, which classify data based on extracted rules. In the present work, we used a genetic algorithm in combination with the C4.5 decision tree for prediction of miRNA targets. We applied our proposed method to validated human datasets. We achieved nearly 93.9% classification accuracy, which could be related to the selection of the best rules. PMID:26649272

  11. Using decision-tree classifier systems to extract knowledge from databases

    NASA Technical Reports Server (NTRS)

    St. Clair, D. C.; Sabharwal, C. L.; Hacke, Keith; Bond, W. E.

    1990-01-01

    One difficulty in applying artificial intelligence techniques to the solution of real world problems is that the development and maintenance of many AI systems, such as those used in diagnostics, require large amounts of human resources. At the same time, databases frequently exist which contain information about the process(es) of interest. Recently, efforts to reduce development and maintenance costs of AI systems have focused on using machine learning techniques to extract knowledge from existing databases. Research is described in the area of knowledge extraction using a class of machine learning techniques called decision-tree classifier systems. Results of this research suggest ways of performing knowledge extraction which may be applied in numerous situations. In addition, a measurement called the concept strength metric (CSM) is described which can be used to determine how well the resulting decision tree can differentiate between the concepts it has learned. The CSM can be used to determine whether or not additional knowledge needs to be extracted from the database. An experiment involving real world data is presented to illustrate the concepts described.

  12. Snow event classification with a 2D video disdrometer - A decision tree approach

    NASA Astrophysics Data System (ADS)

    Bernauer, F.; Hürkamp, K.; Rühm, W.; Tschiersch, J.

    2016-05-01

    Snowfall classification according to crystal type or degree of riming of the snowflakes is important for many atmospheric processes, e.g. wet deposition of aerosol particles. 2D video disdrometers (2DVD) have recently proved their capability to measure microphysical parameters of snowfall. The present work has the aim of classifying snowfall according to microphysical properties of single hydrometeors (e.g. shape and fall velocity) measured by means of a 2DVD. The constraints for the shape and velocity parameters which are used in a decision tree for classification of the 2DVD measurements are derived from detailed on-site observations, combining automatic 2DVD classification with visual inspection. The developed decision tree algorithm subdivides the detected events into three classes of dominating crystal type (single crystals, complex crystals and pellets) and three classes of dominating degree of riming (weak, moderate and strong). The classification results for the crystal type were validated with an independent data set proving the unambiguousness of the classification. In addition, for three long-term events, good agreement of the classification results with independently measured maximum dimension of snowflakes, snowflake bulk density and surrounding temperature was found. The developed classification algorithm is applicable for wind speeds below 5.0 m s⁻¹ and has the advantage of being easily implemented by other users.

  13. Block-Based Connected-Component Labeling Algorithm Using Binary Decision Trees

    PubMed Central

    Chang, Wan-Yu; Chiu, Chung-Cheng; Yang, Jia-Horng

    2015-01-01

    In this paper, we propose a fast labeling algorithm based on block-based concepts. Because the number of memory access points directly affects the time consumption of the labeling algorithms, the aim of the proposed algorithm is to minimize neighborhood operations. Our algorithm utilizes a block-based view and correlates a raster scan to select the necessary pixels generated by a block-based scan mask. We analyze the advantages of a sequential raster scan for the block-based scan mask, and integrate the block-connected relationships using two different procedures with binary decision trees to reduce unnecessary memory access. This greatly simplifies the pixel locations of the block-based scan mask. Furthermore, our algorithm significantly reduces the number of leaf nodes and depth levels required in the binary decision tree. We analyze the labeling performance of the proposed algorithm alongside that of other labeling algorithms using high-resolution images and foreground images. The experimental results from synthetic and real image datasets demonstrate that the proposed algorithm is faster than other methods. PMID:26393597

  14. Type 2 Diabetes Mellitus Screening and Risk Factors Using Decision Tree: Results of Data Mining

    PubMed Central

    Habibi, Shafi; Ahmadi, Maryam; Alizadeh, Somayeh

    2015-01-01

    Objectives: The aim of this study was to examine a predictive model using features related to the diabetes type 2 risk factors. Methods: The data were obtained from a database in a diabetes control system in Tabriz, Iran. The data included all people referred for diabetes screening between 2009 and 2011. The features considered as “Inputs” were: age, sex, systolic and diastolic blood pressure, family history of diabetes, and body mass index (BMI). Moreover, we used diagnosis as “Class”. We applied the “Decision Tree” technique and “J48” algorithm in the WEKA (3.6.10 version) software to develop the model. Results: After data preprocessing and preparation, we used 22,398 records for data mining. The model precision to identify patients was 0.717. The age factor was placed in the root node of the tree as a result of higher information gain. The ROC curve indicates the model function in identification of patients and those individuals who are healthy. The curve indicates high capability of the model, especially in identification of the healthy persons. Conclusions: We developed a model using the decision tree for screening T2DM which did not require laboratory tests for T2DM diagnosis. PMID:26156928
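
    The abstract notes that age became the root node because it had the highest information gain. As a minimal sketch of that criterion (not the study's WEKA configuration), the Python below computes entropy-based information gain for categorical attributes and picks the best one as the root. The attribute names, values, and the four toy records are hypothetical and are not taken from the study data.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting on one categorical attribute."""
    total = len(labels)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    # Weighted average entropy of the subsets produced by the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy records, for illustration only (not the study data).
rows = [
    {"age_group": "under_45", "sex": "F"},
    {"age_group": "under_45", "sex": "M"},
    {"age_group": "45_plus", "sex": "F"},
    {"age_group": "45_plus", "sex": "M"},
]
labels = ["healthy", "healthy", "diabetic", "diabetic"]

gains = {a: information_gain(rows, labels, a) for a in ("age_group", "sex")}
root = max(gains, key=gains.get)  # attribute with the highest gain
```

    In this toy data the age attribute separates the classes perfectly while sex does not, so age is chosen as the root. Continuous inputs such as blood pressure or BMI would first need a threshold search, which the sketch omits.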

  15. Prediction of Antimicrobial Activity of Synthetic Peptides by a Decision Tree Model

    PubMed Central

    Lira, Felipe; Perez, Pedro S.; Baranauskas, José A.

    2013-01-01

    Antimicrobial resistance is a persistent problem in the public health sphere. However, recent attempts to find effective substitutes to combat infections have been directed at identifying natural antimicrobial peptides in order to circumvent resistance to commercial antibiotics. This study describes the development of synthetic peptides with antimicrobial activity, created in silico by site-directed mutation modeling using wild-type peptides as scaffolds for these mutations. Fragments of antimicrobial peptides were used for modeling with molecular modeling computational tools. To analyze these peptides, a decision tree model, which indicated the action range of peptides on the types of microorganisms on which they can exercise biological activity, was created. The decision tree model was processed using physicochemical properties from known antimicrobial peptides available at the Antimicrobial Peptide Database (APD). The two most promising peptides were synthesized, and antimicrobial assays showed inhibitory activity against Gram-positive and Gram-negative bacteria. Colossomin C and colossomin D were the most inhibitory peptides at 5 μg/ml against Staphylococcus aureus and Escherichia coli. The methods described in this work and the results obtained are useful for the identification and development of new compounds with antimicrobial activity through the use of computational tools. PMID:23455341

  16. Analysis of acid rain patterns in northeastern China using a decision tree method

    NASA Astrophysics Data System (ADS)

    Zhang, Xiuying; Jiang, Hong; Jin, Jiaxin; Xu, Xiaohua; Zhang, Qingxin

    2012-01-01

    Acid rain is a major regional-scale environmental problem in China. To control acid rain pollution and to protect the ecological environment, it is urgent to document acid rain patterns in various regions of China. Taking Liaoning Province as the study area, the present work focused on the spatial and temporal variations of acid rains in northeastern China. It presents a means for predicting the occurrence of acid rain using geographic position, terrain characteristics, routinely monitored meteorological factors and column concentrations of atmospheric SO₂ and NO₂. The analysis applies a decision tree approach to the foregoing observation data. Results showed that: (1) acid rain occurred at 17 stations among the 81 monitoring stations in Liaoning Province, with the frequency of acid rain from 0 to 84.38%; (2) summer had the most acid rain occurrences followed by spring and autumn, and the winter had the least; (3) the total accuracy for the simulation of precipitation pH (pH ≤ 4.5, 4.5 < pH ≤ 5.6, and pH > 5.6) was 98.04% using the decision tree method known as C5. The simulation results also indicated that the distance to coastline, elevation, wind direction, wind speed, rainfall amount, atmospheric pressure, and the precursors of acid rain all have a strong influence on the occurrence of acid rains in northeastern China.

  17. Cardiovascular Dysautonomias Diagnosis Using Crisp and Fuzzy Decision Tree: A Comparative Study.

    PubMed

    Kadi, Ilham; Idri, Ali

    2016-01-01

    Decision trees (DTs) are one of the most popular techniques for learning classification systems, especially when it comes to learning from discrete examples. In the real world, much data occurs in fuzzy form. Hence a DT must be able to deal with such fuzzy data. In fact, integrating fuzzy logic when dealing with imprecise and uncertain data reduces uncertainty and makes it possible to model fine knowledge details. In this paper, a fuzzy decision tree (FDT) algorithm was applied to a dataset extracted from the ANS (Autonomic Nervous System) unit of the Moroccan university hospital Avicenne. This unit specializes in performing several dynamic tests to diagnose patients with autonomic disorders and to suggest the appropriate treatment. A set of fuzzy classifiers was generated using FID 3.4. The error rates of the generated FDTs were calculated to measure their performance. Moreover, a comparison of the error rates obtained using crisp DTs and FDTs was carried out and showed that the results of FDTs were better than those obtained using crisp DTs. PMID:27139378

  18. Evolution of Decision Rules Used for IT Portfolio Management: An Inductive Approach

    NASA Astrophysics Data System (ADS)

    Karhade, Prasanna P.; Shaw, Michael J.; Subramanyam, Ramanath

    IT portfolio management and the related planning decisions for IT-dependent initiatives are critical to organizational performance. Building on the logic of appropriateness theoretical framework, we define an important characteristic of decision rules used during IT portfolio planning; rule appropriateness with regards to the risk-taking criterion. We propose that rule appropriateness will be an important factor explaining the evolution of rules over time. Using an inductive learning methodology, we analyze a unique dataset of actual IT portfolio planning decisions spanning two consecutive years within one organization. We present systematic comparative analysis of the evolution of rules used in planning over two years to validate our research proposition. We find that rules that were inappropriate in the first year are being redefined to design appropriate rules for use in the second year. Our work provides empirical evidence demonstrating organizational learning and improvements in IT portfolio planning capabilities.

  19. Generation of 2D Land Cover Maps for Urban Areas Using Decision Tree Classification

    NASA Astrophysics Data System (ADS)

    Höhle, J.

    2014-09-01

    A 2D land cover map can automatically and efficiently be generated from high-resolution multispectral aerial images. First, a digital surface model is produced and each cell of the elevation model is then supplemented with attributes. A decision tree classification is applied to extract map objects like buildings, roads, grassland, trees, hedges, and walls from such an "intelligent" point cloud. The decision tree is derived from training areas whose borders are digitized on top of a false-colour orthoimage. The produced 2D land cover map with six classes is then refined by using image analysis techniques. The proposed methodology is described step by step. The classification, assessment, and refinement are carried out by the open source software "R"; the generation of the dense and accurate digital surface model by the "Match-T DSM" program of the Trimble Company. A practical example of a 2D land cover map generation is carried out. Images of a multispectral medium-format aerial camera covering an urban area in Switzerland are used. The assessment of the produced land cover map is based on class-wise stratified sampling where reference values of samples are determined by means of stereo-observations of false-colour stereopairs. The stratified statistical assessment of the produced land cover map with six classes and based on 91 points per class reveals a high thematic accuracy for classes "building" (99 %, 95 % CI: 95 %-100 %) and "road and parking lot" (90 %, 95 % CI: 83 %-95 %). Some other accuracy measures (overall accuracy, kappa value) and their 95 % confidence intervals are derived as well. The proposed methodology has a high potential for automation and fast processing and may be applied to other scenes and sensors.

  20. Determinants of farmers' tree planting investment decision as a degraded landscape management strategy in the central highlands of Ethiopia

    NASA Astrophysics Data System (ADS)

    Gessesse, B.; Bewket, W.; Bräuning, A.

    2015-11-01

    Land degradation due to lack of sustainable land management practices is one of the critical challenges in many developing countries including Ethiopia. This study explores the major determinants of farm level tree planting decision as a land management strategy in a typical farming and degraded landscape of the Modjo watershed, Ethiopia. The main data were generated from household surveys and analysed using descriptive statistics and a binary logistic regression model. The model significantly predicted farmers' tree planting decision (Chi-square = 37.29, df = 15, P<0.001). Besides, the computed significant value of the model suggests that all the considered predictor variables jointly influenced the farmers' decision to plant trees as a land management strategy. In this regard, the findings of the study show that local land-users' willingness to adopt tree growing decision is a function of a wide range of biophysical, institutional, socioeconomic and household level factors. However, the likelihood of household size, productive labour force availability, the disparity of schooling age, level of perception of the process of deforestation and the current land tenure system have a positive and significant influence on tree growing investment decisions in the study watershed. Eventually, the processes of land use conversion and land degradation are serious, which in turn have had adverse effects on agricultural productivity, local food security and poverty trap nexus. Hence, devising sustainable and integrated land management policy options and implementing them would enhance ecological restoration and livelihood sustainability in the study watershed.

  1. Determinants of farmers' tree-planting investment decisions as a degraded landscape management strategy in the central highlands of Ethiopia

    NASA Astrophysics Data System (ADS)

    Gessesse, Berhan; Bewket, Woldeamlak; Bräuning, Achim

    2016-04-01

    Land degradation due to lack of sustainable land management practices is one of the critical challenges in many developing countries including Ethiopia. This study explored the major determinants of farm-level tree-planting decisions as a land management strategy in a typical farming and degraded landscape of the Modjo watershed, Ethiopia. The main data were generated from household surveys and analysed using descriptive statistics and a binary logistic regression model. The model significantly predicted farmers' tree-planting decisions (χ2 = 37.29, df = 15, P < 0.001). Besides, the computed significant value of the model revealed that all the considered predictor variables jointly influenced the farmers' decisions to plant trees as a land management strategy. The findings of the study demonstrated that the adoption of tree-growing decisions by local land users was a function of a wide range of biophysical, institutional, socioeconomic and household-level factors. In this regard, the likelihood of household size, productive labour force availability, the disparity of schooling age, level of perception of the process of deforestation and the current land tenure system had a critical influence on tree-growing investment decisions in the study watershed. Eventually, the processes of land-use conversion and land degradation were serious, which in turn have had adverse effects on agricultural productivity, local food security and poverty trap nexus. Hence, the study recommended that devising and implementing sustainable land management policy options would enhance ecological restoration and livelihood sustainability in the study watershed.

  2. Multi-output decision trees for lesion segmentation in multiple sclerosis

    NASA Astrophysics Data System (ADS)

    Jog, Amod; Carass, Aaron; Pham, Dzung L.; Prince, Jerry L.

    2015-03-01

    Multiple Sclerosis (MS) is a disease of the central nervous system in which the protective myelin sheath of the neurons is damaged. MS leads to the formation of lesions, predominantly in the white matter of the brain and the spinal cord. The number and volume of lesions visible in magnetic resonance (MR) imaging (MRI) are important criteria for diagnosing and tracking the progression of MS. Locating and delineating lesions manually requires the tedious and expensive efforts of highly trained raters. In this paper, we propose an automated algorithm to segment lesions in MR images using multi-output decision trees. We evaluated our algorithm on the publicly available MICCAI 2008 MS Lesion Segmentation Challenge training dataset of 20 subjects, and showed improved results in comparison to state-of-the-art methods. We also evaluated our algorithm on an in-house dataset of 49 subjects with a true positive rate of 0.41 and a positive predictive value 0.36.

  3. Identification of Water Bodies in a Landsat 8 OLI Image Using a J48 Decision Tree

    PubMed Central

    Acharya, Tri Dev; Lee, Dong Ha; Yang, In Tae; Lee, Jae Kang

    2016-01-01

    Water bodies are essential to humans and other forms of life. Identification of water bodies can be useful in various ways, including estimation of water availability, demarcation of flooded regions, change detection, and so on. In past decades, Landsat satellite sensors have been used for land use classification and water body identification. Due to the introduction of a New Operational Land Imager (OLI) sensor on Landsat 8 with a high spectral resolution and improved signal-to-noise ratio, the quality of imagery sensed by Landsat 8 has improved, enabling better characterization of land cover and increased data size. Therefore, it is necessary to explore the most appropriate and practical water identification methods that take advantage of the improved image quality and use the fewest inputs based on the original OLI bands. The objective of the study is to explore the potential of a J48 decision tree (JDT) in identifying water bodies using reflectance bands from Landsat 8 OLI imagery. J48 is an open-source decision tree. The test site for the study is in the Northern Han River Basin, which is located in Gangwon province, Korea. Training data with individual bands were used to develop the JDT model and later applied to the whole study area. The performance of the model was statistically analysed using the kappa statistic and area under the curve (AUC). The results were compared with five other known water identification methods using a confusion matrix and related statistics. Almost all the methods showed high accuracy, and the JDT was successfully applied to the OLI image using only four bands, where the new additional deep blue band of OLI was found to have the third highest information gain. Thus, the JDT can be a good method for water body identification based on images with improved resolution and increased size. PMID:27420067
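
    The abstract evaluates the classifier with the kappa statistic and a confusion matrix. A minimal sketch of how Cohen's kappa follows from a confusion matrix is given below; the 2x2 counts (water vs. non-water) are made up for illustration and are not results from the study.

```python
def cohens_kappa(matrix):
    """Cohen's kappa for a square confusion matrix.

    Rows are reference classes, columns are predicted classes.
    """
    total = sum(sum(row) for row in matrix)
    # Observed agreement: share of samples on the diagonal.
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical counts, for illustration only: [water, non-water].
confusion = [[45, 5],
             [10, 40]]
kappa = cohens_kappa(confusion)
```

    Kappa corrects raw accuracy for agreement expected by chance, which matters when one class (for example, non-water pixels) dominates the scene.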

  4. Bayesian decision tree for the classification of the mode of motion in single-molecule trajectories.

    PubMed

    Türkcan, Silvan; Masson, Jean-Baptiste

    2013-01-01

    Membrane proteins move in heterogeneous environments with spatially (sometimes temporally) varying friction and with biochemical interactions with various partners. It is important to reliably distinguish different modes of motion to improve our knowledge of the membrane architecture and to understand the nature of interactions between membrane proteins and their environments. Here, we present an analysis technique for single molecule tracking (SMT) trajectories that can determine the preferred model of motion that best matches observed trajectories. The method is based on Bayesian inference to calculate the posteriori probability of an observed trajectory according to a certain model. Information theory criteria, such as the Bayesian information criterion (BIC), the Akaike information criterion (AIC), and modified AIC (AICc), are used to select the preferred model. The considered group of models includes free Brownian motion, and confined motion in 2nd or 4th order potentials. We determine the best information criteria for classifying trajectories. We tested its limits through simulations matching large sets of experimental conditions and we built a decision tree. This decision tree first uses the BIC to distinguish between free Brownian motion and confined motion. In a second step, it classifies the confining potential further using the AIC. We apply the method to experimental Clostridium perfringens ε-toxin (CPεT) receptor trajectories to show that these receptors are confined by a spring-like potential. An adaptation of this technique was applied on a sliding window in the temporal dimension along the trajectory. We applied this adaptation to experimental CPεT trajectories that lose confinement due to disaggregation of confining domains. This new technique adds another dimension to the discussion of SMT data. The mode of motion of a receptor might hold more biologically relevant information than the diffusion
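
    The two-step tree described in this abstract (BIC first, then AIC) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the model names, the log-likelihood values, and the parameter counts are hypothetical, and the rule used in step 1 (compare free Brownian motion against the better of the two confined models) is one plausible reading of the abstract. The definitions of BIC and AIC themselves are the standard ones.

```python
import math


def bic(log_likelihood, n_params, n_points):
    """Bayesian information criterion; lower values are preferred."""
    return n_params * math.log(n_points) - 2.0 * log_likelihood


def aic(log_likelihood, n_params):
    """Akaike information criterion; lower values are preferred."""
    return 2.0 * n_params - 2.0 * log_likelihood


CONFINED = ("second_order", "fourth_order")


def classify_motion(fits, n_points):
    """Two-step model choice for one trajectory.

    fits maps a model name to (maximised log-likelihood, free parameters).
    """
    # Step 1 (assumed rule): BIC separates free from confined motion.
    bics = {name: bic(ll, k, n_points) for name, (ll, k) in fits.items()}
    if bics["brownian"] <= min(bics[name] for name in CONFINED):
        return "brownian"
    # Step 2: AIC chooses between the two confining potentials.
    aics = {name: aic(*fits[name]) for name in CONFINED}
    return min(aics, key=aics.get)


# Hypothetical fit results for a 500-point trajectory (illustration only).
fits = {
    "brownian": (-520.0, 1),
    "second_order": (-480.0, 3),
    "fourth_order": (-483.0, 3),
}
mode = classify_motion(fits, n_points=500)
```

    With these made-up numbers the confined models beat free Brownian motion on BIC, and the second-order (spring-like) potential then wins on AIC.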

  5. Identification of Water Bodies in a Landsat 8 OLI Image Using a J48 Decision Tree.

    PubMed

    Acharya, Tri Dev; Lee, Dong Ha; Yang, In Tae; Lee, Jae Kang

    2016-01-01

    Water bodies are essential to humans and other forms of life. Identification of water bodies can be useful in various ways, including estimation of water availability, demarcation of flooded regions, change detection, and so on. In past decades, Landsat satellite sensors have been used for land use classification and water body identification. Due to the introduction of a New Operational Land Imager (OLI) sensor on Landsat 8 with a high spectral resolution and improved signal-to-noise ratio, the quality of imagery sensed by Landsat 8 has improved, enabling better characterization of land cover and increased data size. Therefore, it is necessary to explore the most appropriate and practical water identification methods that take advantage of the improved image quality and use the fewest inputs based on the original OLI bands. The objective of the study is to explore the potential of a J48 decision tree (JDT) in identifying water bodies using reflectance bands from Landsat 8 OLI imagery. J48 is an open-source decision tree. The test site for the study is in the Northern Han River Basin, which is located in Gangwon province, Korea. Training data with individual bands were used to develop the JDT model and later applied to the whole study area. The performance of the model was statistically analysed using the kappa statistic and area under the curve (AUC). The results were compared with five other known water identification methods using a confusion matrix and related statistics. Almost all the methods showed high accuracy, and the JDT was successfully applied to the OLI image using only four bands, where the new additional deep blue band of OLI was found to have the third highest information gain. Thus, the JDT can be a good method for water body identification based on images with improved resolution and increased size. PMID:27420067

  6. A data mining approach to optimize pellets manufacturing process based on a decision tree algorithm.

    PubMed

    Ronowicz, Joanna; Thommes, Markus; Kleinebudde, Peter; Krysiński, Jerzy

    2015-06-20

    The present study is focused on the thorough analysis of cause-effect relationships between pellet formulation characteristics (pellet composition as well as process parameters) and the selected quality attribute of the final product. Pellet quality was expressed as shape, measured by the aspect ratio. A data matrix for chemometric analysis consisted of 224 pellet formulations performed by means of eight different active pharmaceutical ingredients and several various excipients, using different extrusion/spheronization process conditions. The data set contained 14 input variables (both formulation and process variables) and one output variable (pellet aspect ratio). A tree regression algorithm consistent with the Quality by Design concept was applied to obtain deeper understanding and knowledge of formulation and process parameters affecting the final pellet sphericity. A clear, interpretable set of decision rules was generated. The spheronization speed, spheronization time, number of holes and water content of extrudate have been recognized as the key factors influencing pellet aspect ratio. The most spherical pellets were achieved by using a large number of holes during extrusion, a high spheronizer speed and longer time of spheronization. The described data mining approach enhances knowledge about the pelletization process and simultaneously facilitates searching for the optimal process conditions which are necessary to achieve ideal spherical pellets, resulting in good flow characteristics. This data mining approach can be taken into consideration by industrial formulation scientists to support rational decision making in the field of pellets technology. PMID:25835791

  7. What Satisfies Students? Mining Student-Opinion Data with Regression and Decision-Tree Analysis. AIR 2002 Forum Paper.

    ERIC Educational Resources Information Center

    Thomas, Emily H.; Galambos, Nora

    To investigate how students' characteristics and experiences affect satisfaction, this study used regression and decision-tree analysis with the CHAID algorithm to analyze student opinion data from a sample of 1,783 college students. A data-mining approach identifies the specific aspects of students' university experience that most influence three…

  8. A Decision-Tree-Oriented Guidance Mechanism for Conducting Nature Science Observation Activities in a Context-Aware Ubiquitous Learning

    ERIC Educational Resources Information Center

    Hwang, Gwo-Jen; Chu, Hui-Chun; Shih, Ju-Ling; Huang, Shu-Hsien; Tsai, Chin-Chung

    2010-01-01

    A context-aware ubiquitous learning environment is an authentic learning environment with personalized digital supports. While showing the potential of applying such a learning environment, researchers have also indicated the challenges of providing adaptive and dynamic support to individual students. In this paper, a decision-tree-oriented…

  9. Model-independent evaluation of tumor markers and a logistic-tree approach to diagnostic decision support.

    PubMed

    Ni, Weizeng; Huang, Samuel H; Su, Qiang; Shi, Jinghua

    2014-01-01

    Sensitivity and specificity of using individual tumor markers hardly meet the clinical requirement. This challenge gave rise to many efforts, e.g., combining multiple tumor markers and employing machine learning algorithms. However, results from different studies are often inconsistent, which is partially attributed to the use of different evaluation criteria. Also, the wide use of model-dependent validation leads to a high possibility of data overfitting when complex models are used for diagnosis. We propose two model-independent criteria, namely, area under the curve (AUC) and Relief, to evaluate the diagnostic values of individual and multiple tumor markers, respectively. For diagnostic decision support, we propose the use of logistic-tree, which combines decision tree and logistic regression. Application on a colorectal cancer dataset shows that the proposed evaluation criteria produce results that are consistent with current knowledge. Furthermore, the simple and highly interpretable logistic-tree has diagnostic performance that is competitive with other complex models. PMID:25516124

  10. Proposal of a Clinical Decision Tree Algorithm Using Factors Associated with Severe Dengue Infection

    PubMed Central

    Hussin, Narwani; Cheah, Wee Kooi; Ng, Kee Sing; Muninathan, Prema

    2016-01-01

    Background WHO’s new classification in 2009: dengue with or without warning signs and severe dengue, has necessitated large numbers of admissions to hospitals of dengue patients which in turn has been imposing a huge economical and physical burden on many hospitals around the globe, particularly South East Asia and Malaysia where the disease has seen a rapid surge in numbers in recent years. Lack of a simple tool to differentiate mild from life threatening infection has led to unnecessary hospitalization of dengue patients. Methods We conducted a single-centre, retrospective study involving serologically confirmed dengue fever patients, admitted in a single ward, in Hospital Kuala Lumpur, Malaysia. Data was collected for 4 months from February to May 2014. Socio demography, co-morbidity, days of illness before admission, symptoms, warning signs, vital signs and laboratory results were all recorded. Descriptive statistics were tabulated and simple and multiple logistic regression analysis was done to determine significant risk factors associated with severe dengue. Results 657 patients with confirmed dengue were analysed, of which 59 (9.0%) had severe dengue. Overall, the commonest warning signs were vomiting (36.1%) and abdominal pain (32.1%). Previous co-morbidity, vomiting, diarrhoea, pleural effusion, low systolic blood pressure, high haematocrit, low albumin and high urea were found to be significant risk factors for severe dengue using simple logistic regression. However, the significant risk factors for severe dengue with multiple logistic regression were only vomiting, pleural effusion, and low systolic blood pressure. Using those 3 risk factors, we plotted an algorithm for predicting severe dengue. When compared to the classification of severe dengue based on the WHO criteria, the decision tree algorithm had a sensitivity of 0.81, specificity of 0.54, positive predictive value of 0.16 and negative predictive value of 0.96. Conclusion The decision tree algorithm proposed

  11. Application Of Decision Tree Approach To Student Selection Model- A Case Study

    NASA Astrophysics Data System (ADS)

    Harwati; Sudiya, Amby

    2016-01-01

    The main purpose of the institution is to provide quality education to the students and to improve the quality of managerial decisions. One way to improve the quality of students is to make the selection of new students more selective. This research takes as its case the selection of new students at the Islamic University of Indonesia, Yogyakarta, Indonesia. One of the university's selection routes is administrative filtering based on the records of prospective students at high school, without a written test. Currently, this kind of selection does not yet have a standard model and criteria. Selection is done only by comparing candidates' application files, so subjective assessment is very likely because there are no standard criteria that can differentiate the quality of one student from another. By applying data mining classification techniques, a selection model for new students can be built that includes standard criteria such as area of origin, school status, average grade and so on. These criteria are determined using rules derived from classifying the academic achievement (GPA) of students who entered the university through the same route in previous years. The decision tree method with the C4.5 algorithm is used here. The results show that the students given priority for admission are those who meet the following criteria: they come from the island of Java, attended a public school, majored in science, have an average grade above 75, and have at least one achievement during their high school studies.

  12. A Low Complexity System Based on Multiple Weighted Decision Trees for Indoor Localization

    PubMed Central

    Sánchez-Rodríguez, David; Hernández-Morera, Pablo; Quinteiro, José Ma.; Alonso-González, Itziar

    2015-01-01

    Indoor position estimation has become an attractive research topic due to growing interest in location-aware services. Nevertheless, satisfying solutions have not been found with the considerations of both accuracy and system complexity. From the perspective of lightweight mobile devices, they are extremely important characteristics, because both the processor power and energy availability are limited. Hence, an indoor localization system with high computational complexity can cause complete battery drain within a few hours. In our research, we use a data mining technique named boosting to develop a localization system based on multiple weighted decision trees to predict the device location, since it has high accuracy and low computational complexity. The localization system is built using a dataset from sensor fusion, which combines the strength of radio signals from different wireless local area network access points and device orientation information from a digital compass built-in mobile device, so that extra sensors are unnecessary. Experimental results indicate that the proposed system leads to substantial improvements on computational complexity over the widely-used traditional fingerprinting methods, and it has a better accuracy than they have. PMID:26110413
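
    The abstract describes boosting as producing multiple weighted decision trees whose predictions are combined. The sketch below illustrates only the combination step, assuming an AdaBoost-style weight of 0.5·ln((1 − error)/error) per tree; the abstract does not state which boosting variant or weighting formula was used, and the error rates, location labels, and per-tree predictions here are hypothetical.

```python
import math
from collections import defaultdict


def tree_weight(error_rate):
    """AdaBoost-style weight (assumed): lower error gives a larger vote."""
    return 0.5 * math.log((1.0 - error_rate) / error_rate)


def weighted_vote(predictions, weights):
    """Return the label with the largest total weight across all trees."""
    scores = defaultdict(float)
    for label, weight in zip(predictions, weights):
        scores[label] += weight
    return max(scores, key=scores.get)


# Hypothetical training error of three trees and their predicted locations.
errors = [0.10, 0.30, 0.35]
weights = [tree_weight(e) for e in errors]
predictions = ["room_A", "room_B", "room_B"]

location = weighted_vote(predictions, weights)
```

    Here the single accurate tree outvotes the two weaker ones, which is the point of weighting: a plain majority vote would have chosen room_B. Evaluating a handful of small trees and summing weights is cheap, which is consistent with the low computational cost the abstract emphasises.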

  13. Interactive change detection based on dissimilarity image and decision tree classification

    NASA Astrophysics Data System (ADS)

    Wang, Yan; Crouzil, Alain; Puel, Jean-Baptiste

    2015-02-01

    Our study mainly focuses on detecting changed regions in two images of the same scene taken by digital cameras at different times. The images taken by digital cameras generally provide less information than multi-channel remote sensing images. Moreover, application-dependent insignificant changes, such as shadows or clouds, may cause the failure of the classical methods based on image differences. The machine learning approach seems to be promising, but the lack of a sufficient volume of training data for photographic landscape observatories rules out many methods. We therefore investigate the interactive learning approach in this work and provide a discriminative model that is a 16-dimensional feature space comprising the textural appearance and contextual information. Dissimilarity measures in different neighborhood sizes are used to detect the difference within the neighborhood of an image pair. To detect changes between two images, the user designates change and non-change samples (pixel sets) in the images using a selection tool. This data is used to train a classifier using a decision tree training method, which is then applied to all the other pixels of the image pair. The experiments have demonstrated the potential of the proposed approach.

  14. Tailored approach in inguinal hernia repair - decision tree based on the guidelines.

    PubMed

    Köckerling, Ferdinand; Schug-Pass, Christine

    2014-01-01

    The endoscopic procedures TEP and TAPP and the open techniques Lichtenstein, Plug and Patch, and PHS currently represent the gold standard in inguinal hernia repair recommended in the guidelines of the European Hernia Society, the International Endohernia Society, and the European Association of Endoscopic Surgery. Eighty-two percent of experienced hernia surgeons use the "tailored approach," the differentiated use of the several inguinal hernia repair techniques depending on the findings of the patient, trying to minimize the risks. The following differential therapeutic situations must be distinguished in inguinal hernia repair: unilateral in men, unilateral in women, bilateral, scrotal, after previous pelvic and lower abdominal surgery, no general anesthesia possible, recurrence, and emergency surgery. Evidence-based guidelines and consensus conferences of experts give recommendations for the best approach in the individual situation of a patient. This review tries to summarize the recommendations of the various guidelines and to transfer them into a practical decision tree for the daily work of surgeons performing inguinal hernia repair. PMID:25593944

  15. A Low Complexity System Based on Multiple Weighted Decision Trees for Indoor Localization.

    PubMed

    Sánchez-Rodríguez, David; Hernández-Morera, Pablo; Quinteiro, José Ma; Alonso-González, Itziar

    2015-01-01

    Indoor position estimation has become an attractive research topic due to growing interest in location-aware services. Nevertheless, satisfactory solutions that account for both accuracy and system complexity have not been found. For lightweight mobile devices, these are extremely important characteristics, because both processor power and energy availability are limited. Hence, an indoor localization system with high computational complexity can cause complete battery drain within a few hours. In our research, we use a data mining technique named boosting to develop a localization system based on multiple weighted decision trees to predict the device location, since it has high accuracy and low computational complexity. The localization system is built using a dataset from sensor fusion, which combines the strength of radio signals from different wireless local area network access points and device orientation information from a digital compass built into the mobile device, so that extra sensors are unnecessary. Experimental results indicate that the proposed system leads to substantial improvements in computational complexity over the widely used traditional fingerprinting methods, and that it is also more accurate. PMID:26110413

  16. Accurate estimation of retinal vessel width using bagged decision trees and an extended multiresolution Hermite model.

    PubMed

    Lupaşcu, Carmen Alina; Tegolo, Domenico; Trucco, Emanuele

    2013-12-01

    We present an algorithm estimating the width of retinal vessels in fundus camera images. The algorithm uses a novel parametric surface model of the cross-sectional intensities of vessels, and ensembles of bagged decision trees to estimate the local width from the parameters of the best-fit surface. We report comparative tests with REVIEW, currently the public database of reference for retinal width estimation, containing 16 images with 193 annotated vessel segments and 5066 profile points annotated manually by three independent experts. Comparative tests are reported also with our own set of 378 vessel widths selected sparsely in 38 images from the Tayside Scotland diabetic retinopathy screening programme and annotated manually by two clinicians. We obtain considerably better accuracies compared to leading methods in REVIEW tests and in Tayside tests. An important advantage of our method is its stability (success rate, i.e., meaningful measurement returned, of 100% on all REVIEW data sets and on the Tayside data set) compared to a variety of methods from the literature. We also find that results depend crucially on testing data and conditions, and discuss criteria for selecting a training set yielding optimal accuracy. PMID:24001930

  17. Effect of training characteristics on object classification: An application using Boosted Decision Trees

    NASA Astrophysics Data System (ADS)

    Sevilla-Noarbe, I.; Etayo-Sotos, P.

    2015-06-01

    We present an application of a particular machine-learning method (Boosted Decision Trees, BDTs using AdaBoost) to separate stars and galaxies in photometric images using their catalog characteristics. BDTs are a well established machine learning technique used for classification purposes. They have been widely used, especially in the field of particle and astroparticle physics, and we use them here in an optical astronomy application. This algorithm is able to improve on simple thresholding cuts on standard separation variables that may be affected by local effects such as blending, badly calculated background levels or which do not include information in other bands. The improvements are shown using the Sloan Digital Sky Survey Data Release 9, with respect to the type photometric classifier. We obtain an improvement in the impurity of the galaxy sample of a factor of 2-4 for this particular dataset, adjusting for the same efficiency of the selection. Another main goal of this study is to verify the effects that different input vectors and training sets have on the classification performance, the results being of wider use to other machine learning techniques.
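
    As a rough illustration of the AdaBoost idea named in this record, the sketch below boosts one-level decision trees (stumps) on a toy one-dimensional problem that no single threshold cut can solve. It is a minimal sketch under stated assumptions, not the authors' pipeline: the data, the function names, and the use of stumps rather than deeper trees are all hypothetical choices made for brevity.

```python
import math

def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, polarity) of the best threshold cut."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for polarity in (1, -1):
                preds = [polarity if row[j] >= t else -polarity for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, polarity)
    return best

def stump_predict(stump, row):
    _, j, t, polarity = stump
    return polarity if row[j] >= t else -polarity

def adaboost(X, y, rounds):
    """Train an AdaBoost ensemble of stumps; labels must be +1 or -1."""
    n = len(X)
    w = [1.0 / n] * n                      # start with uniform sample weights
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)   # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)      # vote weight of this stump
        ensemble.append((alpha, stump))
        # Up-weight misclassified samples, down-weight correct ones, renormalise.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, row):
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1

# Hypothetical toy data: the -1 class sits in the middle of the +1 class,
# so a single threshold cut always misclassifies some samples.
X = [[1], [2], [3], [4], [5], [6]]
y = [1, 1, -1, -1, 1, 1]

single_cut = adaboost(X, y, rounds=1)
boosted = adaboost(X, y, rounds=3)
boosted_predictions = [predict(boosted, row) for row in X]
```

    On this toy set the weighted vote of three stumps separates the classes, whereas one cut cannot, which is the sense in which boosting improves on simple thresholding.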

  18. Object classification in images for Epo doping control based on fuzzy decision trees

    NASA Astrophysics Data System (ADS)

    Bajla, Ivan; Hollander, Igor; Heiss, Dorothea; Granec, Reinhard; Minichmayr, Markus

    2005-02-01

    Erythropoietin (Epo) is a hormone which can be misused as a doping substance. Its detection involves analysis of images containing specific objects (bands), whose position and intensity are critical for doping positivity. Within a research project of the World Anti-Doping Agency (WADA) we are implementing the GASepo software that should serve for Epo testing in doping control laboratories world-wide. For identification of the bands we have developed a segmentation procedure based on a sequence of filters and edge detectors. Whereas all true bands are properly segmented, the procedure generates a relatively high number of false positives (artefacts). To separate these artefacts we suggested a post-segmentation supervised classification using real-valued geometrical measures of objects. The method is based on the ID3 (Ross Quinlan's) rule generation method, where fuzzy representation is used for linking the linguistic terms to quantitative data. The fuzzy modification of the ID3 method provides a framework that generates fuzzy decision trees, as well as fuzzy sets for input data. Using the MLTTM software (Machine Learning Framework) we have generated a set of fuzzy rules explicitly describing bands and artefacts. The method eliminated most of the artefacts. The contribution includes a comparison of the obtained misclassification errors to the errors produced by some other statistical classification methods.
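
    ID3 chooses each split by information gain, the drop in class entropy after partitioning on an attribute. The sketch below shows only that crisp criterion; it does not implement the fuzzy modification described in this record, and the attribute names and values are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(values, labels):
    """Entropy reduction obtained by splitting `labels` on attribute `values`."""
    total = len(labels)
    groups = {}
    for v, label in zip(values, labels):
        groups.setdefault(v, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Hypothetical, already-discretised measurements for four segmented objects.
labels = ["band", "band", "artefact", "artefact"]
elongation = ["high", "high", "low", "low"]    # separates the classes perfectly
brightness = ["high", "low", "high", "low"]    # carries no class information

gain_elongation = information_gain(elongation, labels)
gain_brightness = information_gain(brightness, labels)
```

    ID3 would split on the attribute with the larger gain, here the hypothetical `elongation`.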

  19. Smart on-board diagnostic decision trees for quantitative aviation equipment and safety procedures validation

    NASA Astrophysics Data System (ADS)

    Ali, Ali H.; Markarian, Garik; Tarter, Alex; Kölle, Rainer

    2010-04-01

    The current trend in high-accuracy aircraft navigation systems is towards using data from one or more inertial navigation subsystems and one or more navigational reference subsystems. The enhancement in fault diagnosis and detection is achieved via computing the minimum mean square estimate of the aircraft states using, for instance, the Kalman filter method. However, this enhancement might degrade if the cause of a subsystem fault has some effect on other subsystems that are calculating the same measurement. One instance of such a case is the tragic incident of Air France Flight 447 in June 2009, where message transmissions in the last moment before the crash indicated inconsistencies in measured airspeed, as reported by Airbus. In this research, we propose the use of a mathematical aircraft model to work out the current states of the airplane and, in turn, use these states to validate the readings of the navigation equipment through a smart diagnostic decision tree network. Various simulated equipment failures have been introduced in a controlled environment to prove the concept of operation. The results showed successful detection of the failing equipment in all cases.

  20. Boosting theory towards practice: Recent developments in decision tree induction and the weak learning framework

    SciTech Connect

    Kearns, M.

    1996-12-31

    One of the original goals of computational learning theory was that of formulating models that permit meaningful comparisons between the different machine learning heuristics that are used in practice [Kearns et al., 1987]. Despite the other successes of computational learning theory, this goal has proven elusive. Empirically successful machine learning algorithms such as C4.5 and the backpropagation algorithm for neural networks have not met the criteria of the well-known Probably Approximately Correct (PAC) model [Valiant, 1984] and its variants, and thus such models are of little use in drawing distinctions among the heuristics used in applications. Conversely, the algorithms suggested by computational learning theory are usually too limited in various ways to find wide application.

  1. A hybrid approach of stepwise regression, logistic regression, support vector machine, and decision tree for forecasting fraudulent financial statements.

    PubMed

    Chen, Suduan; Goo, Yeong-Jia James; Shen, Zone-De

    2014-01-01

    As fraudulent financial statements by enterprises become increasingly serious with each passing day, establishing a valid model for forecasting fraudulent financial statements has become an important question for academic research and financial practice. After screening the important variables using stepwise regression, the study applies logistic regression, support vector machine, and decision tree methods to construct classification models for comparison. The study adopts financial and nonfinancial variables to assist in establishing the forecasting model. The research objects are companies with fraudulent and nonfraudulent financial statements between 1998 and 2012. The findings are that financial and nonfinancial information can be used effectively to distinguish fraudulent financial statements, and that the C5.0 decision tree has the best classification performance, 85.71%. PMID:25302338

  2. A Hybrid Approach of Stepwise Regression, Logistic Regression, Support Vector Machine, and Decision Tree for Forecasting Fraudulent Financial Statements

    PubMed Central

    Goo, Yeong-Jia James; Shen, Zone-De

    2014-01-01

    As fraudulent financial statements by enterprises become increasingly serious with each passing day, establishing a valid model for forecasting fraudulent financial statements has become an important question for academic research and financial practice. After screening the important variables using stepwise regression, the study applies logistic regression, support vector machine, and decision tree methods to construct classification models for comparison. The study adopts financial and nonfinancial variables to assist in establishing the forecasting model. The research objects are companies with fraudulent and nonfraudulent financial statements between 1998 and 2012. The findings are that financial and nonfinancial information can be used effectively to distinguish fraudulent financial statements, and that the C5.0 decision tree has the best classification performance, 85.71%. PMID:25302338

  3. Ant colony optimisation of decision tree and contingency table models for the discovery of gene-gene interactions.

    PubMed

    Sapin, Emmanuel; Keedwell, Ed; Frayling, Tim

    2015-12-01

    In this study, ant colony optimisation (ACO) algorithm is used to derive near-optimal interactions between a number of single nucleotide polymorphisms (SNPs). This approach is used to discover small numbers of SNPs that are combined into a decision tree or contingency table model. The ACO algorithm is shown to be very robust as it is proven to be able to find results that are discriminatory from a statistical perspective with logical interactions, decision tree and contingency table models for various numbers of SNPs considered in the interaction. A large number of the SNPs discovered here have been already identified in large genome-wide association studies to be related to type II diabetes in the literature, lending additional confidence to the results. PMID:26577156

  4. Chromosomal damage and EROD induction in tree swallows (Tachycineta bicolor) along the Upper Mississippi River, Minnesota, USA

    USGS Publications Warehouse

    Emilie Bigorgne; Custer, Thomas W.; Dummer, Paul; Erickson, Richard A.; Karouna, Natalie; Schultz, Sandra; Custer, Christine M.; Thogmartin, Wayne E.; Cole W. Matson

    2015-01-01

    The health of tree swallows, Tachycineta bicolor, on the Upper Mississippi River (UMR) was assessed in 2010 and 2011 using biomarkers at six sites downriver of Minneapolis/St. Paul, MN metropolitan area, a tributary into the UMR, and a nearby lake. Chromosomal damage was evaluated in nestling blood by measuring the coefficient of variation of DNA content (DNA CV) using flow cytometry. Cytochrome P450 1A activity in nestling liver was measured using the ethoxyresorufin-O-dealkylase (EROD) assay, and oxidative stress was estimated in nestling livers via determination of thiobarbituric acid reacting substances (TBARS), reduced glutathione (GSH), oxidized glutathione (GSSG), the ratio GSSG/GSH, total sulfhydryl, and protein bound sulfhydryl (PBSH). A multilevel regression model (DNA CV) and simple regressions (EROD and oxidative stress) were used to evaluate biomarker responses for each location. Chromosomal damage was significantly elevated at two sites on the UMR (Pigs Eye and Pool 2) relative to the Green Mountain Lake reference site, while the induction of EROD activity was only observed at Pigs Eye. No measures of oxidative stress differed among sites. Multivariate analysis confirmed an increased DNA CV at Pigs Eye and Pool 2, and elevated EROD activity at Pigs Eye. These results suggest that the health of tree swallows has been altered at the DNA level at Pigs Eye and Pool 2 sites, and at the physiological level at Pigs Eye site only.

  5. Chromosomal damage and EROD induction in tree swallows (Tachycineta bicolor) along the Upper Mississippi River, Minnesota, USA.

    PubMed

    Bigorgne, Emilie; Custer, Thomas W; Dummer, Paul M; Erickson, Richard A; Karouna-Renier, Natalie; Schultz, Sandra; Custer, Christine M; Thogmartin, Wayne E; Matson, Cole W

    2015-07-01

    The health of tree swallows, Tachycineta bicolor, on the Upper Mississippi River (UMR) was assessed in 2010 and 2011 using biomarkers at six sites downriver of Minneapolis/St. Paul, MN metropolitan area, a tributary into the UMR, and a nearby lake. Chromosomal damage was evaluated in nestling blood by measuring the coefficient of variation of DNA content (DNA CV) using flow cytometry. Cytochrome P450 1A activity in nestling liver was measured using the ethoxyresorufin-O-dealkylase (EROD) assay, and oxidative stress was estimated in nestling livers via determination of thiobarbituric acid reacting substances (TBARS), reduced glutathione (GSH), oxidized glutathione (GSSG), the ratio GSSG/GSH, total sulfhydryl, and protein bound sulfhydryl (PBSH). A multilevel regression model (DNA CV) and simple regressions (EROD and oxidative stress) were used to evaluate biomarker responses for each location. Chromosomal damage was significantly elevated at two sites on the UMR (Pigs Eye and Pool 2) relative to the Green Mountain Lake reference site, while the induction of EROD activity was only observed at Pigs Eye. No measures of oxidative stress differed among sites. Multivariate analysis confirmed an increased DNA CV at Pigs Eye and Pool 2, and elevated EROD activity at Pigs Eye. These results suggest that the health of tree swallows has been altered at the DNA level at Pigs Eye and Pool 2 sites, and at the physiological level at Pigs Eye site only. PMID:25777616

  6. Chi-squared Automatic Interaction Detection Decision Tree Analysis of Risk Factors for Infant Anemia in Beijing, China

    PubMed Central

    Ye, Fang; Chen, Zhi-Hua; Chen, Jie; Liu, Fang; Zhang, Yong; Fan, Qin-Ying; Wang, Lin

    2016-01-01

    Background: In the past decades, studies on infant anemia have mainly focused on rural areas of China. With the increasing heterogeneity of population in recent years, available information on infant anemia is inconclusive in large cities of China, especially with comparison between native residents and floating population. This population-based cross-sectional study was implemented to determine the anemic status of infants as well as the risk factors in a representative downtown area of Beijing. Methods: As useful methods to build a predictive model, Chi-squared automatic interaction detection (CHAID) decision tree analysis and logistic regression analysis were introduced to explore risk factors of infant anemia. A total of 1091 infants aged 6–12 months together with their parents/caregivers living at Heping Avenue Subdistrict of Beijing were surveyed from January 1, 2013 to December 31, 2014. Results: The prevalence of anemia was 12.60% with a range of 3.47%–40.00% in different subgroup characteristics. The CHAID decision tree model has demonstrated multilevel interaction among risk factors through stepwise pathways to detect anemia. Besides the three predictors identified by logistic regression model including maternal anemia during pregnancy, exclusive breastfeeding in the first 6 months, and floating population, CHAID decision tree analysis also identified the fourth risk factor, the maternal educational level, with higher overall classification accuracy and larger area below the receiver operating characteristic curve. Conclusions: The infant anemic status in metropolis is complex and should be carefully considered by the basic health care practitioners. CHAID decision tree analysis has demonstrated a better performance in hierarchical analysis of population with great heterogeneity. Risk factors identified by this study might be meaningful in the early detection and prompt treatment of infant anemia in large cities. PMID:27174328

  7. ATLAAS: an automatic decision tree-based learning algorithm for advanced image segmentation in positron emission tomography

    NASA Astrophysics Data System (ADS)

    Berthon, Beatrice; Marshall, Christopher; Evans, Mererid; Spezi, Emiliano

    2016-07-01

    Accurate and reliable tumour delineation on positron emission tomography (PET) is crucial for radiotherapy treatment planning. PET automatic segmentation (PET-AS) eliminates intra- and interobserver variability, but there is currently no consensus on the optimal method to use, as different algorithms appear to perform better for different types of tumours. This work aimed to develop a predictive segmentation model, trained to automatically select and apply the best PET-AS method, according to the tumour characteristics. ATLAAS, the automatic decision tree-based learning algorithm for advanced segmentation, is based on supervised machine learning using decision trees. The model includes nine PET-AS methods and was trained on 100 PET scans with known true contour. A decision tree was built for each PET-AS algorithm to predict its accuracy, quantified using the Dice similarity coefficient (DSC), according to the tumour volume, tumour peak to background SUV ratio and a regional texture metric. The performance of ATLAAS was evaluated for 85 PET scans obtained from fillable and printed subresolution sandwich phantoms. ATLAAS showed excellent accuracy across a wide range of phantom data and predicted the best or near-best segmentation algorithm in 93% of cases. ATLAAS outperformed all single PET-AS methods on fillable phantom data with a DSC of 0.881, while the DSC for H&N phantom data was 0.819. DSCs higher than 0.650 were achieved in all cases. ATLAAS is an advanced automatic image segmentation algorithm based on decision tree predictive modelling, which can be trained on images with known true contour, to predict the best PET-AS method when the true contour is unknown. ATLAAS provides robust and accurate image segmentation with potential applications to radiation oncology.
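
    The Dice similarity coefficient used as the accuracy measure in this record is twice the overlap of two segmentations divided by the sum of their sizes. A minimal sketch on flattened binary masks follows; the masks are made-up values, not data from the study, and treating two empty masks as a perfect match is an assumed convention.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice similarity coefficient between two equal-length binary masks."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must cover the same number of voxels")
    size_a = sum(1 for v in mask_a if v)
    size_b = sum(1 for v in mask_b if v)
    overlap = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    if size_a + size_b == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * overlap / (size_a + size_b)

# Hypothetical flattened masks: 1 marks a voxel inside the contour.
reference = [0, 1, 1, 1, 0, 0]
predicted = [0, 0, 1, 1, 1, 0]

# Overlap is 2 voxels and each mask has 3, so DSC = 2 * 2 / (3 + 3).
score = dice_coefficient(reference, predicted)
```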

  8. ATLAAS: an automatic decision tree-based learning algorithm for advanced image segmentation in positron emission tomography.

    PubMed

    Berthon, Beatrice; Marshall, Christopher; Evans, Mererid; Spezi, Emiliano

    2016-07-01

    Accurate and reliable tumour delineation on positron emission tomography (PET) is crucial for radiotherapy treatment planning. PET automatic segmentation (PET-AS) eliminates intra- and interobserver variability, but there is currently no consensus on the optimal method to use, as different algorithms appear to perform better for different types of tumours. This work aimed to develop a predictive segmentation model, trained to automatically select and apply the best PET-AS method, according to the tumour characteristics. ATLAAS, the automatic decision tree-based learning algorithm for advanced segmentation, is based on supervised machine learning using decision trees. The model includes nine PET-AS methods and was trained on 100 PET scans with known true contour. A decision tree was built for each PET-AS algorithm to predict its accuracy, quantified using the Dice similarity coefficient (DSC), according to the tumour volume, tumour peak to background SUV ratio and a regional texture metric. The performance of ATLAAS was evaluated for 85 PET scans obtained from fillable and printed subresolution sandwich phantoms. ATLAAS showed excellent accuracy across a wide range of phantom data and predicted the best or near-best segmentation algorithm in 93% of cases. ATLAAS outperformed all single PET-AS methods on fillable phantom data with a DSC of 0.881, while the DSC for H&N phantom data was 0.819. DSCs higher than 0.650 were achieved in all cases. ATLAAS is an advanced automatic image segmentation algorithm based on decision tree predictive modelling, which can be trained on images with known true contour, to predict the best PET-AS method when the true contour is unknown. ATLAAS provides robust and accurate image segmentation with potential applications to radiation oncology. PMID:27273293

  9. Evaluation of the potential allergenicity of the enzyme microbial transglutaminase using the 2001 FAO/WHO Decision Tree.

    PubMed

    Pedersen, Mona H; Hansen, Tine K; Sten, Eva; Seguro, Katsuya; Ohtsuka, Tomoko; Morita, Akiko; Bindslev-Jensen, Carsten; Poulsen, Lars K

    2004-11-01

    All novel proteins must be assessed for their potential allergenicity before they are introduced into the food market. One method to achieve this is the 2001 FAO/WHO Decision Tree recommended for evaluation of proteins from genetically modified organisms (GMOs). It was the aim of this study to investigate the allergenicity of microbial transglutaminase (m-TG) from Streptoverticillium mobaraense. Amino acid sequence similarity to known allergens, pepsin resistance, and detection of protein binding to specific serum immunoglobulin E (IgE) (RAST) have been evaluated as recommended by the decision tree. Allergenicity in the source material was thought unlikely, since no IgE-mediated allergy to any bacteria has been reported. m-TG is fully degraded after 5 min of pepsin treatment. A database search showed that the enzyme has no homology with known allergens, down to a match of six contiguous amino acids, which meets the requirements of the decision tree. However, there is a match at the five contiguous amino acid level to the major codfish allergen Gad c1. The potential cross reactivity between m-TG and Gad c1 was investigated in RAST using sera from 25 documented cod-allergic patients and an extract of raw codfish. No binding between patient IgE and m-TG was observed. It can be concluded that no safety concerns with regard to the allergenic potential of m-TG were identified. PMID:15508178
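
    The contiguous amino acid criterion mentioned in this record can be pictured as a search for shared fixed-length substrings between a query protein and a known allergen. The sketch below assumes exact-identity matching over a sliding window; both sequences are invented for illustration and are not m-TG or Gad c1, and the function name is hypothetical.

```python
def shared_kmers(seq_a, seq_b, k):
    """Return the set of length-k substrings that occur in both sequences."""
    kmers_a = {seq_a[i:i + k] for i in range(len(seq_a) - k + 1)}
    kmers_b = {seq_b[i:i + k] for i in range(len(seq_b) - k + 1)}
    return kmers_a & kmers_b

# Hypothetical one-letter amino acid sequences (not real proteins).
query_protein = "MKTAYLAGGDEQ"
known_allergen = "PLSKAYLAGWNV"

six_matches = shared_kmers(query_protein, known_allergen, 6)   # stricter window
five_matches = shared_kmers(query_protein, known_allergen, 5)  # looser window
```

    With these made-up sequences there is no shared window of six residues but one shared window of five, the same kind of situation the abstract reports before the follow-up serum testing.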

  10. Decision tree supported substructure prediction of metabolites from GC-MS profiles.

    PubMed

    Hummel, Jan; Strehmel, Nadine; Selbig, Joachim; Walther, Dirk; Kopka, Joachim

    2010-06-01

    Gas chromatography coupled to mass spectrometry (GC-MS) is one of the most widespread routine technologies applied to the large scale screening and discovery of novel metabolic biomarkers. However, currently the majority of mass spectral tags (MSTs) remains unidentified due to the lack of authenticated pure reference substances required for compound identification by GC-MS. Here, we accessed the information on reference compounds stored in the Golm Metabolome Database (GMD) to apply supervised machine learning approaches to the classification and identification of unidentified MSTs without relying on library searches. Non-annotated MSTs with mass spectral and retention index (RI) information together with data of already identified metabolites and reference substances have been archived in the GMD. Structural feature extraction was applied to sub-divide the metabolite space contained in the GMD and to define the prediction target classes. Decision tree (DT)-based prediction of the most frequent substructures based on mass spectral features and RI information is demonstrated to result in highly sensitive and specific detections of sub-structures contained in the compounds. The underlying set of DTs can be inspected by the user and are made available for batch processing via SOAP (Simple Object Access Protocol)-based web services. The GMD mass spectral library with the integrated DTs is freely accessible for non-commercial use at http://gmd.mpimp-golm.mpg.de/. All matching and structure search functionalities are available as SOAP-based web services. A XML + HTTP interface, which follows Representational State Transfer (REST) principles, facilitates read-only access to data base entities. PMID:20526350

  11. Accurate and interpretable nanoSAR models from genetic programming-based decision tree construction approaches.

    PubMed

    Oksel, Ceyda; Winkler, David A; Ma, Cai Y; Wilkins, Terry; Wang, Xue Z

    2016-09-01

    The number of engineered nanomaterials (ENMs) being exploited commercially is growing rapidly, due to the novel properties they exhibit. Clearly, it is important to understand and minimize any risks to health or the environment posed by the presence of ENMs. Data-driven models that decode the relationships between the biological activities of ENMs and their physicochemical characteristics provide an attractive means of maximizing the value of scarce and expensive experimental data. Although such structure-activity relationship (SAR) methods have become very useful tools for modelling nanotoxicity endpoints (nanoSAR), they have limited robustness and predictivity and, most importantly, interpretation of the models they generate is often very difficult. New computational modelling tools or new ways of using existing tools are required to model the relatively sparse and sometimes lower quality data on the biological effects of ENMs. The most commonly used SAR modelling methods work best with large datasets, are not particularly good at feature selection, can be relatively opaque to interpretation, and may not account for nonlinearity in the structure-property relationships. To overcome these limitations, we describe the application of a novel algorithm, a genetic programming-based decision tree construction tool (GPTree) to nanoSAR modelling. We demonstrate the use of GPTree in the construction of accurate and interpretable nanoSAR models by applying it to four diverse literature datasets. We describe the algorithm and compare model results across the four studies. We show that GPTree generates models with accuracies equivalent to or superior to those of prior modelling studies on the same datasets. GPTree is a robust, automatic method for generation of accurate nanoSAR models with important advantages that it works with small datasets, automatically selects descriptors, and provides significantly improved interpretability of models. PMID:26956430

  12. Use of electromagnetic induction surveys to delimit zones of contrasting tree development in an irrigated olive orchard in Southern Spain.

    NASA Astrophysics Data System (ADS)

    Pedrera, Aura; Vanderlinden, Karl; Jesús Espejo-Pérez, Antonio; Gómez, José Alfonso; Giráldez, Juan Vicente

    2014-05-01

    Olives are historically closely linked to Mediterranean culture and have nowadays important societal and economical implications. Improving yield and preventing infestation by soil-borne pathogens are crucial issues in maintaining olive cropping competitive. In order to assess both issues properly at the farm or field scale, accurate knowledge of the spatial distribution of soil physical properties and associated water dynamics is required. Conventional soil surveying is generally prohibitive at commercial farms, but electromagnetic induction (EMI) sensors, measuring soil apparent electrical conductivity (ECa) provide a suitable alternative. ECa depends strongly on soil texture and water content and has been used exhaustively in precision agriculture to delimit management zones. The aim of this study was to delimit areas with unsatisfactory tree development in an olive orchard using EMI, and to identify the underlying relationships between ECa and the soil properties driving the spatial tree development pattern. An experimental catchment in S. Spain dedicated to irrigated olive cropping was surveyed for ECa under dry and wet soil conditions (0.06 vs. 0.22 g/g, respectively), using a Dualem 21-S EMI sensor. In addition, ECa and gravimetric soil water content (SWC) was measured at 45 locations throughout the catchment during each survey. At each of these locations, soil profile samples were collected to determine textural class including coarse particles content, organic matter (OM), and bulk density. Measurements for dry soil conditions with the perpendicular coil configuration with a separation of 2.1 m (P2.1) were chosen to make a first assessment of the orchard-growth variability. According to the shape of the histogram, the P2.1 ECa values were classified to delimit three areas in the field for which canopy coverage was estimated. Combining the 4 ECa signals for the wet and dry surveys, a principal component (PC) analysis showed that 91% of the total variance

  13. The creation of a digital soil map for Cyprus using decision-tree classification techniques

    NASA Astrophysics Data System (ADS)

    Camera, Corrado; Zomeni, Zomenia; Bruggeman, Adriana; Noller, Joy; Zissimos, Andreas

    2014-05-01

    Considering the increasing threats that soils are experiencing, especially in semi-arid Mediterranean environments like Cyprus (erosion, contamination, sealing and salinisation), producing a high-resolution, reliable soil map is essential for further soil conservation studies. This study aims to create a 1:50,000 soil map covering the area under the direct control of the Republic of Cyprus (5,760 km²). The study consists of two major steps. The first is the creation of a raster database of predictive variables selected according to the scorpan formula (McBratney et al., 2003). Of particular interest is the possibility of using, as soil properties, data from three older island-wide soil maps and the recently published geochemical atlas of Cyprus (Cohen et al., 2011). Ten highly characterizing elements were selected and used as predictors in the present study. For the other factors the usual variables were used: temperature and aridity index for climate; total loss on ignition, vegetation and forestry types maps for organic matter; the DEM and related relief derivatives (slope, aspect, curvature, landscape units); bedrock, surficial geology and geomorphology (Noller, 2009) for parent material and age; and a sub-watershed map to better bound location related to parent material sources. In the second step, the digital soil map is created using the Random Forests package in R. Random Forests is a decision tree classification technique in which many trees, instead of a single one, are developed and compared to increase the stability and reliability of the prediction. The model is trained and verified on areas where a published 1:25,000 soil map obtained from field work is available, and it is then applied for predictive mapping to the other areas. Preliminary results obtained in a small area in the plain around the city of Lefkosia, where eight different soil classes are present, show that the method performs very well. The Random Forest approach leads to reproduce soil
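
    The idea of developing many trees and comparing them can be sketched as follows. This is a minimal illustration in plain Python, not the R Random Forests package used in the study: each "tree" is a one-split stump fitted on a bootstrap sample and one randomly chosen feature, and the forest predicts by majority vote. The function names, the toy predictors and the class labels "A" and "B" are hypothetical.

```python
import random
from collections import Counter


def fit_stump(rows, labels, feature):
    """Fit a one-split tree on one feature: keep the threshold with the fewest errors."""
    best = None
    for t in sorted({r[feature] for r in rows}):
        left = [l for r, l in zip(rows, labels) if r[feature] <= t]
        right = [l for r, l in zip(rows, labels) if r[feature] > t]
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = sum(l != left_label for l in left) + sum(l != right_label for l in right)
        if best is None or errors < best[0]:
            best = (errors, feature, t, left_label, right_label)
    return best[1:]


def predict_stump(stump, row):
    feature, threshold, left_label, right_label = stump
    return left_label if row[feature] <= threshold else right_label


def fit_forest(rows, labels, n_trees=25, seed=0):
    """Grow n_trees stumps, each on a bootstrap sample and a random feature."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        feature = rng.randrange(len(rows[0]))           # random predictor for this tree
        forest.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feature))
    return forest


def predict_forest(forest, row):
    """Majority vote over all trees."""
    votes = Counter(predict_stump(stump, row) for stump in forest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two predictors per raster cell, two soil classes.
toy_rows = [(1, 10), (2, 20), (3, 30), (7, 70), (8, 80), (9, 90)]
toy_labels = ["A", "A", "A", "B", "B", "B"]
toy_forest = fit_forest(toy_rows, toy_labels)
```

    A single stump can be misled by an unlucky bootstrap sample; the vote across many stumps is what gives the more stable prediction described above.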

  14. Analysis of the impact of recreational trail usage for prioritising management decisions: a regression tree approach

    NASA Astrophysics Data System (ADS)

    Tomczyk, Aleksandra; Ewertowski, Marek; White, Piran; Kasprzak, Leszek

    2016-04-01

    The dual role of many Protected Natural Areas in providing benefits for both conservation and recreation poses challenges for management. Although recreation-based damage to ecosystems can occur very quickly, restoration can take many years. The protection of conservation interests at the same time as providing for recreation requires decisions to be made about how to prioritise and direct management actions. Trails are commonly used to divert visitors from the most important areas of a site, but high visitor pressure can lead to increases in trail width and a concomitant increase in soil erosion. Here we use detailed field data on the condition of recreational trails in Gorce National Park, Poland, as the basis for a regression tree analysis to determine the factors influencing trail deterioration, and link specific trail impacts with environmental, use-related and managerial factors. We distinguished 12 types of trails, characterised by four levels of degradation: (1) trails with an acceptable level of degradation; (2) threatened trails; (3) damaged trails; and (4) heavily damaged trails. Damaged trails were the most vulnerable of all trails and should be prioritised for appropriate conservation and restoration. We also proposed five types of monitoring of recreational trail conditions: (1) rapid inventory of negative impacts; (2) monitoring visitor numbers and variation in type of use; (3) change-oriented monitoring focusing on sections of trail which were subjected to changes in type or level of use or subjected to extreme weather events; (4) monitoring of dynamics of trail conditions; and (5) full assessment of trail conditions, to be carried out every 10-15 years. The application of the proposed framework can enhance the ability of Park managers to prioritise their trail management activities, enhancing trail conditions and visitor safety, while minimising adverse impacts on the conservation value of the ecosystem. A.M.T. was supported by the Polish Ministry of

  15. Specificity of extrafloral nectar induction by herbivores differs among native and invasive populations of tallow tree

    PubMed Central

    Wang, Yi; Carrillo, Juli; Siemann, Evan; Wheeler, Gregory S.; Zhu, Lin; Gu, Xue; Ding, Jianqing

    2013-01-01

    Background and Aims Invasive plants can be released from specialist herbivores and encounter novel generalists in their introduced ranges, leading to variation in defence among native and invasive populations. However, few studies have examined how constitutive and induced indirect defences change during plant invasion, especially during the juvenile stage. Methods Constitutive extrafloral nectar (EFN) production of native and invasive populations of juvenile tallow tree (Triadica sebifera) were compared, and leaf clipping, and damage by a native specialist (Noctuid) and two native generalist caterpillars (Noctuid and Limacodid) were used to examine inducible EFN production. Key results Plants from introduced populations had more leaves producing constitutive EFN than did native populations, but the content of soluble solids of EFN did not differ. Herbivores induced EFN production more than simulated herbivory. The specialist (Noctuid) induced more EFN than either generalist for native populations. The content of soluble solids in EFN was higher (2·1 times), with the specialist vs. the generalists causing the stronger response for native populations, but the specialist response was always comparable with the generalist responses for invasive populations. Conclusions These results suggest that constitutive and induced indirect defences are retained in juvenile plants of invasive populations even during plant establishment, perhaps due to generalist herbivory in the introduced range. However, responses specific to a specialist herbivore may be reduced in the introduced range where specialists are absent. This decreased defence may benefit specialist insects that are introduced for classical biological control of invasive plants. PMID:23761685

  16. Genetic program based data mining of fuzzy decision trees and methods of improving convergence and reducing bloat

    NASA Astrophysics Data System (ADS)

    Smith, James F., III; Nguyen, ThanhVu H.

    2007-04-01

    A data mining procedure for automatic determination of fuzzy decision tree structure using a genetic program (GP) is discussed. A GP is an algorithm that evolves other algorithms or mathematical expressions. Innovative methods for accelerating convergence of the data mining procedure and reducing bloat are given. In genetic programming, bloat refers to excessive tree growth. It has been observed that the trees in the evolving GP population will grow by a factor of three every 50 generations. When evolving mathematical expressions much of the bloat is due to the expressions not being in algebraically simplest form. So a bloat reduction method based on automated computer algebra has been introduced. The effectiveness of this procedure is discussed. Also, rules based on fuzzy logic have been introduced into the GP to accelerate convergence, reduce bloat and produce a solution more readily understood by the human user. These rules are discussed as well as other techniques for convergence improvement and bloat control. Comparisons between trees created using a genetic program and those constructed solely by interviewing experts are made. A new co-evolutionary method that improves the control logic evolved by the GP by having a genetic algorithm evolve pathological scenarios is discussed. The effect on the control logic is considered. Finally, additional methods that have been used to validate the data mining algorithm are referenced.
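
    The reported growth rate can be turned into a quick projection. The sketch below assumes the factor-of-three-per-50-generations observation holds as smooth exponential growth between those checkpoints; that interpolation, the function name and the starting size are assumptions made for illustration.

```python
def projected_tree_size(initial_size, generations, factor=3.0, period=50):
    """Projected tree size after `generations`, growing by `factor` every `period` generations."""
    return initial_size * factor ** (generations / period)


# A tree of 10 nodes (hypothetical starting size) after 50 and 100 generations.
size_after_50 = projected_tree_size(10, 50)
size_after_100 = projected_tree_size(10, 100)
```

    Under this assumption an unchecked population grows ninefold in 100 generations, which is why bloat control matters for long runs.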

  17. Decision tree-based method for integrating gene expression, demographic, and clinical data to determine disease endotypes

    PubMed Central

    2013-01-01

    Background Complex diseases are often difficult to diagnose, treat and study due to the multi-factorial nature of the underlying etiology. Large data sets are now widely available that can be used to define novel, mechanistically distinct disease subtypes (endotypes) in a completely data-driven manner. However, significant challenges exist with regard to how to segregate individuals into suitable subtypes of the disease and understand the distinct biological mechanisms of each when the goal is to maximize the discovery potential of these data sets. Results A multi-step decision tree-based method is described for defining endotypes based on gene expression, clinical covariates, and disease indicators using childhood asthma as a case study. We attempted to use alternative approaches such as the Student’s t-test, single data domain clustering and the Modk-prototypes algorithm, which incorporates multiple data domains into a single analysis and none performed as well as the novel multi-step decision tree method. This new method gave the best segregation of asthmatics and non-asthmatics, and it provides easy access to all genes and clinical covariates that distinguish the groups. Conclusions The multi-step decision tree method described here will lead to better understanding of complex disease in general by allowing purely data-driven disease endotypes to facilitate the discovery of new mechanisms underlying these diseases. This application should be considered a complement to ongoing efforts to better define and diagnose known endotypes. When coupled with existing methods developed to determine the genetics of gene expression, these methods provide a mechanism for linking genetics and exposomics data and thereby accounting for both major determinants of disease. PMID:24188919

  18. Predicting Lung Radiotherapy-Induced Pneumonitis Using a Model Combining Parametric Lyman Probit With Nonparametric Decision Trees

    SciTech Connect

    Das, Shiva K. (E-mail: shiva.das@duke.edu); Zhou, Sumin; Zhang, Junan; Yin, F.-F.; Dewhirst, Mark W.; Marks, Lawrence B.

    2007-07-15

    Purpose: To develop and test a model to predict for lung radiation-induced Grade 2+ pneumonitis. Methods and Materials: The model was built from a database of 234 lung cancer patients treated with radiotherapy (RT), of whom 43 were diagnosed with pneumonitis. The model augmented the predictive capability of the parametric dose-based Lyman normal tissue complication probability (LNTCP) metric by combining it with weighted nonparametric decision trees that use dose and nondose inputs. The decision trees were sequentially added to the model using a 'boosting' process that enhances the accuracy of prediction. The model's predictive capability was estimated by 10-fold cross-validation. To facilitate dissemination, the cross-validation result was used to extract a simplified approximation to the complicated model architecture created by boosting. Application of the simplified model is demonstrated in two example cases. Results: The area under the model receiver operating characteristics curve for cross-validation was 0.72, a significant improvement over the LNTCP area of 0.63 (p = 0.005). The simplified model used the following variables to output a measure of injury: LNTCP, gender, histologic type, chemotherapy schedule, and treatment schedule. For a given patient RT plan, injury prediction was highest for the combination of pre-RT chemotherapy, once-daily treatment, female gender and lowest for the combination of no pre-RT chemotherapy and nonsquamous cell histologic type. Application of the simplified model to the example cases revealed that injury prediction for a given treatment plan can range from very low to very high, depending on the settings of the nondose variables. Conclusions: Radiation pneumonitis prediction was significantly enhanced by decision trees that added the influence of nondose factors to the LNTCP formulation.

  19. How Induction Programs Affect the Decision of Alternate Route Urban Teachers to Remain Teaching

    ERIC Educational Resources Information Center

    LoCascio, Steven J.; Smeaton, Patricia S.; Waters, Faith H.

    2016-01-01

    This mixed-methods study analyzes the induction programs for alternate route beginning teachers in low socioeconomic, urban schools. The researcher surveyed 53 teachers at the end of their first year and conducted six in-depth follow-up interviews. The study found that half the teachers did not receive an induction program congruent with state…

  20. Procalcitonin and C-reactive protein-based decision tree model for distinguishing PFAPA flares from acute infections.

    PubMed

    Kraszewska-Głomba, Barbara; Szymańska-Toczek, Zofia; Szenborn, Leszek

    2016-01-01

    As no specific laboratory test has been identified, PFAPA (periodic fever, aphthous stomatitis, pharyngitis and cervical adenitis) remains a diagnosis of exclusion. We searched for a practical use of procalcitonin (PCT) and C-reactive protein (CRP) in distinguishing PFAPA attacks from acute bacterial and viral infections. Levels of PCT and CRP were measured in 38 patients with PFAPA and 81 children diagnosed with an acute bacterial (n=42) or viral (n=39) infection. Statistical analysis with the use of the C4.5 algorithm resulted in the following decision tree: viral infection if CRP≤19.1 mg/L; otherwise, for cases with CRP>19.1 mg/L: bacterial infection if PCT>0.65 ng/mL, PFAPA if PCT≤0.65 ng/mL. The model was tested using 10-fold cross-validation and in an independent test cohort (n=30); the rule's overall accuracy was 76.4% and 90%, respectively. Although limited by a small sample size, the obtained decision tree might present a potential diagnostic tool for distinguishing PFAPA flares from acute infections when interpreted cautiously and with reference to the clinical context. PMID:27131024
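
    The two-level tree reported in this abstract can be written out directly. The sketch below encodes only the thresholds stated above (CRP 19.1 mg/L, PCT 0.65 ng/mL); the function and parameter names are hypothetical, and it is an illustration of the rule's structure rather than the authors' software.

```python
def classify_febrile_episode(crp_mg_per_l, pct_ng_per_ml):
    """Apply the C4.5-derived rule: split on CRP first, then on PCT."""
    if crp_mg_per_l <= 19.1:
        return "viral infection"
    # CRP > 19.1 mg/L: procalcitonin separates bacterial infection from PFAPA.
    if pct_ng_per_ml > 0.65:
        return "bacterial infection"
    return "PFAPA"
```

    Note that PCT is consulted only when CRP exceeds its threshold, so a low CRP yields "viral infection" regardless of the PCT value.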

  1. Unified framework for triaxial accelerometer-based fall event detection and classification using cumulants and hierarchical decision tree classifier.

    PubMed

    Kambhampati, Satya Samyukta; Singh, Vishal; Manikandan, M Sabarimalai; Ramkumar, Barathram

    2015-08-01

    In this Letter, the authors present a unified framework for fall event detection and classification using the cumulants extracted from the acceleration (ACC) signals acquired using a single waist-mounted triaxial accelerometer. The main objective of this Letter is to find suitable representative cumulants and classifiers in effectively detecting and classifying different types of fall and non-fall events. It was discovered that the first level of the proposed hierarchical decision tree algorithm implements fall detection using fifth-order cumulants and support vector machine (SVM) classifier. In the second level, the fall event classification algorithm uses the fifth-order cumulants and SVM. Finally, human activity classification is performed using the second-order cumulants and SVM. The detection and classification results are compared with those of the decision tree, naive Bayes, multilayer perceptron and SVM classifiers with different types of time-domain features including the second-, third-, fourth- and fifth-order cumulants and the signal magnitude vector and signal magnitude area. The experimental results demonstrate that the second- and fifth-order cumulant features and SVM classifier can achieve optimal detection and classification rates of above 95%, as well as the lowest false alarm rate of 1.03%. PMID:26609414
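
    For orientation, the time-domain features named in this abstract can be computed from raw samples as sketched below. The cumulant identities in terms of central moments are standard; the use of population (biased) moments, the particular definition of signal magnitude area as the mean of summed absolute axis values, and all function names are assumptions, since the abstract does not give the authors' exact formulas.

```python
import math


def central_moment(samples, order):
    """Population central moment of the given order."""
    mean = sum(samples) / len(samples)
    return sum((v - mean) ** order for v in samples) / len(samples)


def cumulants_2_to_5(samples):
    """Second- to fifth-order cumulants expressed through central moments."""
    m2, m3, m4, m5 = (central_moment(samples, k) for k in (2, 3, 4, 5))
    return {
        2: m2,                  # variance
        3: m3,
        4: m4 - 3 * m2 ** 2,
        5: m5 - 10 * m3 * m2,
    }


def signal_magnitude_vector(ax, ay, az):
    """Per-sample Euclidean norm of the three acceleration axes."""
    return [math.sqrt(x * x + y * y + z * z) for x, y, z in zip(ax, ay, az)]


def signal_magnitude_area(ax, ay, az):
    """Mean over the window of |x| + |y| + |z| (one common convention)."""
    return sum(abs(x) + abs(y) + abs(z) for x, y, z in zip(ax, ay, az)) / len(ax)
```

    In a pipeline of the kind described, such features would be computed per window of the waist-mounted accelerometer signal and passed to the classifier at each level of the hierarchy.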

  2. Improving Crop Classification Techniques Using Optical Remote Sensing Imagery, High-Resolution Agriculture Resource Inventory Shapefiles and Decision Trees

    NASA Astrophysics Data System (ADS)

    Melnychuk, A. L.; Berg, A. A.; Sweeney, S.

    2010-12-01

    Recognition of anthropogenic effects of land use management practices on bodies of water is important for remediating and preventing eutrophication. In the case of Lake Simcoe, Ontario, the main surrounding land use is agriculture. To better manage the nutrient flow into the lake, knowledge of the management of the agricultural land is important. For this basin, a comprehensive agricultural resource inventory is required for assessment of policy and for input into water quality management and assessment tools. Supervised decision tree classification schemes, used in many previous applications, have yielded reliable classifications in agricultural land-use systems. However, when using these classification techniques the user is confronted with numerous data sources. In this study we use a large inventory of optical satellite image products (Landsat, AWiFS, SPOT and MODIS) and ancillary data sources (temporal MODIS-NDVI product signatures, digital elevation models and soil maps) at various spatial and temporal resolutions in a decision tree classification scheme. The sensitivity of the classification accuracy to various products is assessed to identify optimal data sources for classifying crop systems.

  3. Procalcitonin and C-reactive protein-based decision tree model for distinguishing PFAPA flares from acute infections

    PubMed Central

    Kraszewska-Głomba, Barbara; Szymańska-Toczek, Zofia; Szenborn, Leszek

    2016-01-01

    As no specific laboratory test has been identified, PFAPA (periodic fever, aphthous stomatitis, pharyngitis and cervical adenitis) remains a diagnosis of exclusion. We searched for a practical use of procalcitonin (PCT) and C-reactive protein (CRP) in distinguishing PFAPA attacks from acute bacterial and viral infections. Levels of PCT and CRP were measured in 38 patients with PFAPA and 81 children diagnosed with an acute bacterial (n=42) or viral (n=39) infection. Statistical analysis with the use of the C4.5 algorithm resulted in the following decision tree: viral infection if CRP≤19.1 mg/L; otherwise, for cases with CRP>19.1 mg/L: bacterial infection if PCT>0.65 ng/mL, PFAPA if PCT≤0.65 ng/mL. The model was tested using 10-fold cross-validation and in an independent test cohort (n=30); the rule’s overall accuracy was 76.4% and 90%, respectively. Although limited by a small sample size, the obtained decision tree might present a potential diagnostic tool for distinguishing PFAPA flares from acute infections when interpreted cautiously and with reference to the clinical context. PMID:27131024

  4. Unified framework for triaxial accelerometer-based fall event detection and classification using cumulants and hierarchical decision tree classifier

    PubMed Central

    Kambhampati, Satya Samyukta; Singh, Vishal; Ramkumar, Barathram

    2015-01-01

    In this Letter, the authors present a unified framework for fall event detection and classification using the cumulants extracted from the acceleration (ACC) signals acquired using a single waist-mounted triaxial accelerometer. The main objective of this Letter is to find suitable representative cumulants and classifiers in effectively detecting and classifying different types of fall and non-fall events. It was discovered that the first level of the proposed hierarchical decision tree algorithm implements fall detection using fifth-order cumulants and support vector machine (SVM) classifier. In the second level, the fall event classification algorithm uses the fifth-order cumulants and SVM. Finally, human activity classification is performed using the second-order cumulants and SVM. The detection and classification results are compared with those of the decision tree, naive Bayes, multilayer perceptron and SVM classifiers with different types of time-domain features including the second-, third-, fourth- and fifth-order cumulants and the signal magnitude vector and signal magnitude area. The experimental results demonstrate that the second- and fifth-order cumulant features and SVM classifier can achieve optimal detection and classification rates of above 95%, as well as the lowest false alarm rate of 1.03%. PMID:26609414

  5. Lessons Learned from Applications of a Climate Change Decision Tree to Water System Projects in Kenya and Nepal

    NASA Astrophysics Data System (ADS)

    Ray, P. A.; Bonzanigo, L.; Taner, M. U.; Wi, S.; Yang, Y. C. E.; Brown, C.

    2015-12-01

    The Decision Tree Framework developed for the World Bank's Water Partnership Program provides resource-limited project planners and program managers with a cost-effective and effort-efficient, scientifically defensible, repeatable, and clear method for demonstrating the robustness of a project to climate change. At the conclusion of this process, the project planner is empowered to confidently communicate the method by which the vulnerabilities of the project have been assessed, and how the adjustments that were made (if any were necessary) improved the project's feasibility and profitability. The framework adopts a "bottom-up" approach to risk assessment that aims at a thorough understanding of a project's vulnerabilities to climate change in the context of other nonclimate uncertainties (e.g., economic, environmental, demographic, political). It helps identify projects that perform well across a wide range of potential future climate conditions, as opposed to seeking solutions that are optimal in expected conditions but fragile to conditions deviating from the expected. Lessons learned through application of the Decision Tree to case studies in Kenya and Nepal will be presented, and aspects of the framework requiring further refinement will be described.

  6. Accurate Prediction of Advanced Liver Fibrosis Using the Decision Tree Learning Algorithm in Chronic Hepatitis C Egyptian Patients.

    PubMed

    Hashem, Somaya; Esmat, Gamal; Elakel, Wafaa; Habashy, Shahira; Abdel Raouf, Safaa; Darweesh, Samar; Soliman, Mohamad; Elhefnawi, Mohamed; El-Adawy, Mohamed; ElHefnawi, Mahmoud

    2016-01-01

    Background/Aim. In line with the prevalence of chronic hepatitis C worldwide, the use of noninvasive methods as an alternative for staging chronic liver diseases, avoiding the drawbacks of biopsy, is increasing significantly. The aim of this study is to combine serum biomarkers and clinical information to develop a classification model that can predict advanced liver fibrosis. Methods. 39,567 patients with chronic hepatitis C were included and randomly divided into two separate sets. Liver fibrosis was assessed via METAVIR score; patients were categorized as having mild to moderate (F0-F2) or advanced (F3-F4) fibrosis stages. Two models were developed using the alternating decision tree algorithm. Model 1 uses six parameters, while model 2 uses four, which are similar to the FIB-4 features except that alpha-fetoprotein replaces alanine aminotransferase. Sensitivity and receiver operating characteristic curve analyses were performed to evaluate the performance of the proposed models. Results. The best model achieved 86.2% negative predictive value and 0.78 ROC with 84.8% accuracy, which is better than FIB-4. Conclusions. The risk of advanced liver fibrosis due to chronic hepatitis C could be predicted with high accuracy using a decision tree learning algorithm, which could be used to reduce the need for liver biopsy. PMID:26880886
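
    For readers unfamiliar with the comparison baseline and the reported metrics, the sketch below shows the standard FIB-4 index (age × AST divided by platelet count × the square root of ALT) and how negative predictive value and accuracy follow from confusion-matrix counts. The function names and the example inputs are hypothetical; none of the numbers come from the study.

```python
import math


def fib4(age_years, ast_u_per_l, alt_u_per_l, platelets_1e9_per_l):
    """Standard FIB-4 index from age, AST, ALT and platelet count."""
    return (age_years * ast_u_per_l) / (platelets_1e9_per_l * math.sqrt(alt_u_per_l))


def negative_predictive_value(true_neg, false_neg):
    """Share of negative predictions that are truly negative."""
    return true_neg / (true_neg + false_neg)


def accuracy(true_pos, true_neg, false_pos, false_neg):
    """Share of all predictions that are correct."""
    total = true_pos + true_neg + false_pos + false_neg
    return (true_pos + true_neg) / total


# Hypothetical patient: 50 years, AST 40 U/L, ALT 25 U/L, platelets 200 x 10^9/L.
example_fib4 = fib4(50, 40, 25, 200)
```

    Model 2 in the abstract keeps the other FIB-4 inputs but swaps ALT for alpha-fetoprotein and lets the decision tree learn the thresholds instead of using a fixed formula.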

  7. Accurate Prediction of Advanced Liver Fibrosis Using the Decision Tree Learning Algorithm in Chronic Hepatitis C Egyptian Patients

    PubMed Central

    Hashem, Somaya; Esmat, Gamal; Elakel, Wafaa; Habashy, Shahira; Abdel Raouf, Safaa; Darweesh, Samar; Soliman, Mohamad; Elhefnawi, Mohamed; El-Adawy, Mohamed; ElHefnawi, Mahmoud

    2016-01-01

    Background/Aim. In line with the prevalence of chronic hepatitis C worldwide, the use of noninvasive methods as an alternative for staging chronic liver diseases, avoiding the drawbacks of biopsy, is increasing significantly. The aim of this study is to combine serum biomarkers and clinical information to develop a classification model that can predict advanced liver fibrosis. Methods. 39,567 patients with chronic hepatitis C were included and randomly divided into two separate sets. Liver fibrosis was assessed via METAVIR score; patients were categorized as having mild to moderate (F0-F2) or advanced (F3-F4) fibrosis stages. Two models were developed using the alternating decision tree algorithm. Model 1 uses six parameters, while model 2 uses four, which are similar to the FIB-4 features except that alpha-fetoprotein replaces alanine aminotransferase. Sensitivity and receiver operating characteristic curve analyses were performed to evaluate the performance of the proposed models. Results. The best model achieved 86.2% negative predictive value and 0.78 ROC with 84.8% accuracy, which is better than FIB-4. Conclusions. The risk of advanced liver fibrosis due to chronic hepatitis C could be predicted with high accuracy using a decision tree learning algorithm, which could be used to reduce the need for liver biopsy. PMID:26880886

  8. Refined estimation of solar energy potential on roof areas using decision trees on CityGML-data

    NASA Astrophysics Data System (ADS)

    Baumanns, K.; Löwner, M.-O.

    2009-04-01

    We present a decision tree for a refined estimation of solar energy plant potential on roof areas using the exchange format CityGML. Compared to raster datasets, CityGML data holds geometric and semantic information about buildings and roof areas in more detail. In addition to shadowing effects, ownership structures and the lifetime of roof areas can be incorporated into the valuation. Since the Renewable Energy Sources Act came into force in Germany in 2000, private house owners and municipalities have paid increasing attention to the production of green electricity. Here the return on investment depends on the statutory price per Watt, the initial costs of the solar energy plant, its lifetime, and the actual production of the installation. The latter depends on the radiation received by the solar energy plant and on its size. In this context the exposition and slope of the roof area are as important as building parts like chimneys or dormers that might shadow parts of the roof. Knowing the controlling factors, a decision tree can be created to support a beneficial deployment of a solar energy plant. Sufficient data also has to be available. Airborne raster datasets can only support a coarse estimation of the solar energy potential of roof areas. As they carry no semantic information, even roof installations are hard to identify. CityGML, an Open Geospatial Consortium standard, is an interoperable exchange data format for virtual 3-dimensional cities. Based on international standards, it holds the aforementioned geometric properties as well as semantic information. In Germany many cities, e.g. Berlin, are on the way to providing CityGML datasets. Here we present a decision tree that incorporates geometric as well as semantic demands for a refined estimation of the solar energy potential on roof areas. Based on CityGML's attribute lists we consider geometries of roofs and roof installations as well as global radiation, which can be derived e.g. from the European Solar

  9. Decisions for Others Become Less Impulsive the Further Away They Are on the Family Tree

    PubMed Central

    Ziegler, Fenja V.; Tunney, Richard J.

    2012-01-01

    Background People tend to prefer a smaller immediate reward to a larger but delayed reward. Although this discounting of future rewards is often associated with impulsivity, it is not necessarily irrational. Instead it has been suggested that it reflects the decision maker’s greater interest in the ‘me now’ than the ‘me in 10 years’, such that the concern for our future self is about the same as for someone else who is close to us. Methodology/Principal Findings To investigate this we used a delay-discounting task to compare discount functions for choices that people would make for themselves against decisions that they think that other people should make, e.g. to accept $500 now or $1000 next week. The psychological distance of the hypothetical beneficiaries was manipulated in terms of the genetic coefficient of relatedness ranging from zero (e.g. a stranger, or unrelated close friend), .125 (e.g. a cousin), .25 (e.g. a nephew or niece), to .5 (parent or sibling). Conclusions/Significance The observed discount functions were steeper (i.e. more impulsive) for choices in which the decision-maker was the beneficiary than for all other beneficiaries. Impulsiveness of decisions declined systematically with the distance of the beneficiary from the decision-maker. The data are discussed with reference to the impulsivity and interpersonal empathy gaps in decision-making. PMID:23209580
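
    What a "steeper" discount function means can be made concrete with a small worked example. The sketch assumes the widely used hyperbolic form V = A / (1 + kD); the abstract does not state which functional form the authors fitted, so the model, the function names and the daily time unit are assumptions.

```python
def discounted_value(amount, delay_days, k):
    """Present value of a delayed reward under hyperbolic discounting."""
    return amount / (1.0 + k * delay_days)


def indifference_k(immediate, delayed, delay_days):
    """Discount rate k at which the immediate and delayed rewards are valued equally."""
    return (delayed / immediate - 1.0) / delay_days


# The abstract's example choice: $500 now versus $1000 in one week (7 days).
k_star = indifference_k(500, 1000, 7)
```

    Under this assumed model, a person who takes the $500 discounts at a rate above k_star (1/7 per day), and a larger k corresponds to the steeper, more impulsive discount function described in the findings.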

  10. Construction the model on the breast cancer survival analysis use support vector machine, logistic regression and decision tree.

    PubMed

    Chao, Cheng-Min; Yu, Ya-Wen; Cheng, Bor-Wen; Kuo, Yao-Lung

    2014-10-01

    The aim of the paper is to use data mining technology to establish a classification of breast cancer survival patterns, and offers a treatment decision-making reference for the survival ability of women diagnosed with breast cancer in Taiwan. We studied patients with breast cancer in a specific hospital in Central Taiwan to obtain 1,340 data sets. We employed a support vector machine, logistic regression, and a C5.0 decision tree to construct a classification model of breast cancer patients' survival rates, and used a 10-fold cross-validation approach to identify the model. The results show that the establishment of classification tools for the classification of the models yielded an average accuracy rate of more than 90% for both; the SVM provided the best method for constructing the three categories of the classification system for the survival mode. The results of the experiment show that the three methods used to create the classification system, established a high accuracy rate, predicted a more accurate survival ability of women diagnosed with breast cancer, and could be used as a reference when creating a medical decision-making frame. PMID:25119239
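
    The 10-fold cross-validation used here to assess the classifiers can be sketched as an index splitter. This is a generic illustration in plain Python with a hypothetical function name and a simple interleaved fold assignment; the abstract does not say how the authors assigned records to folds.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train, test) index lists; every sample is tested exactly once."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, test


# With the study's 1,340 records, each fold holds out 134 and trains on 1,206.
splits = list(k_fold_indices(1340, k=10))
```

    Each model is fitted ten times, once per training split, and the reported accuracy is the average over the ten held-out folds.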

  11. Generalization of the Viola-Jones method as a decision tree of strong classifiers for real-time object recognition in video stream

    NASA Astrophysics Data System (ADS)

    Minkina, A.; Nikolaev, D.; Usilin, S.; Kozyrev, V.

    2015-02-01

    In this paper, we present a new modification of Viola-Jones complex classifiers. We describe a complex classifier in the form of a decision tree and provide a method of training for such classifiers. Performance impact of the tree structure is analyzed. Comparison is carried out of precision and performance of the presented method with that of the classical cascade. Various tree architectures are experimentally studied. The task of vehicle wheels detection on images obtained from an automatic vehicle classification system is taken as an example.

  12. Evaluating Psychiatric Hospital Admission Decisions for Children in Foster Care: An Optimal Classification Tree Analysis

    ERIC Educational Resources Information Center

    Snowden, Jessica A.; Leon, Scott C.; Bryant, Fred B.; Lyons, John S.

    2007-01-01

    This study explored clinical and nonclinical predictors of inpatient hospital admission decisions across a sample of children in foster care over 4 years (N = 13,245). Forty-eight percent of participants were female and the mean age was 13.4 (SD = 3.5 years). Optimal data analysis (Yarnold & Soltysik, 2005) was used to construct a nonlinear…

  13. Discovering Decision Trees in the Curriculum Jungle: A Chronicle of Group Groping.

    ERIC Educational Resources Information Center

    Helburn, Nicholas

    Additional insight into the High School Geography Project (HSGP) is provided by this retrospective view of the critical decisions which influenced its nature and scope. A commitment was made to materials at the expense of teacher education and other changes in the educational system. Successive choices focused on a complete but frugal package of…

  14. Novel benzofuroxan derivatives against multidrug-resistant Staphylococcus aureus strains: design using Topliss' decision tree, synthesis and biological assay.

    PubMed

    Jorge, Salomão Dória; Palace-Berl, Fanny; Masunari, Andrea; Cechinel, Cléber André; Ishii, Marina; Pasqualoto, Kerly Fernanda Mesquita; Tavares, Leoberto Costa

    2011-08-15

    The aim of this study was the design of a set of benzofuroxan derivatives as antimicrobial agents exploring the physicochemical properties of the related substituents. Topliss' decision tree approach was applied to select the substituent groups. Hierarchical cluster analysis was also performed to emphasize natural clusters and patterns. The compounds were obtained using two synthetic approaches for reducing the synthetic steps as well as improving the yield. The minimal inhibitory concentration method was employed to evaluate the activity against multidrug-resistant Staphylococcus aureus strains. The most active compound was 4-nitro-3-(trifluoromethyl)[N'-(benzofuroxan-5-yl)methylene]benzhydrazide (MIC range 12.7-11.4 μg/mL), pointing out that the antimicrobial activity was indeed influenced by the hydrophobic and electron-withdrawing property of the substituent groups 3-CF3 and 4-NO2, respectively. PMID:21757359

  15. A decision tree-based on-line preventive control strategy for power system transient instability prevention

    NASA Astrophysics Data System (ADS)

    Xu, Yan; Dong, Zhao Yang; Zhang, Rui; Wong, Kit Po

    2014-02-01

    Maintaining transient stability is a basic requirement for secure power system operations. Preventive control deals with modifying the system operating point to withstand probable contingencies. In this article, a decision tree (DT)-based on-line preventive control strategy is proposed for transient instability prevention of power systems. Given a stability database, a distance-based feature estimation algorithm is first applied to identify the critical generators, which are then used as features to develop a DT. By interpreting the splitting rules of DT, preventive control is realised by formulating the rules in a standard optimal power flow model and solving it. The proposed method is transparent in control mechanism, on-line computation compatible and convenient to deal with multi-contingency. The effectiveness and efficiency of the method has been verified on New England 10-machine 39-bus test system.

  16. Effective Prediction of Errors by Non-native Speakers Using Decision Tree for Speech Recognition-Based CALL System

    NASA Astrophysics Data System (ADS)

    Wang, Hongcui; Kawahara, Tatsuya

    CALL (Computer Assisted Language Learning) systems using ASR (Automatic Speech Recognition) for second language learning have received increasing interest recently. However, it remains a challenge to achieve high speech recognition performance, including accurate detection of erroneous utterances by non-native speakers. Conventionally, possible error patterns, based on linguistic knowledge, are added to the lexicon and language model, or the ASR grammar network. However, this approach easily runs into a trade-off between coverage of errors and increased perplexity. To solve the problem, we propose a method based on a decision tree to learn effective prediction of errors made by non-native speakers. An experimental evaluation with a number of foreign students learning Japanese shows that the proposed method can effectively generate an ASR grammar network, given a target sentence, to achieve both better coverage of errors and smaller perplexity, resulting in significant improvement in ASR accuracy.

  17. A method of building of decision trees based on data from wearable device during a rehabilitation of patients with tibia fractures

    SciTech Connect

    Kupriyanov, M. S.; Shukeilo, E. Y.; Shichkina, J. A.

    2015-11-17

    Modern traumatology relies on a combination of mechanical, electronic, computing, and software tools. Mobile applications that rapidly process data received from medical devices (in particular, wearable devices) and support the formulation of management decisions are becoming increasingly relevant. This article considers the use of a mathematical method of building decision trees to assess a patient's health condition using data from a wearable device.

  18. A method of building of decision trees based on data from wearable device during a rehabilitation of patients with tibia fractures

    NASA Astrophysics Data System (ADS)

    Kupriyanov, M. S.; Shukeilo, E. Y.; Shichkina, J. A.

    2015-11-01

    Modern traumatology relies on a combination of mechanical, electronic, computing, and software tools. Mobile applications that rapidly process data received from medical devices (in particular, wearable devices) and support the formulation of management decisions are becoming increasingly relevant. This article considers the use of a mathematical method of building decision trees to assess a patient's health condition using data from a wearable device.

  19. Control of fire blight (Erwinia amylovora) on apple trees with trunk-injected plant resistance inducers and antibiotics and assessment of induction of pathogenesis-related protein genes.

    PubMed

    Aćimović, Srđan G; Zeng, Quan; McGhee, Gayle C; Sundin, George W; Wise, John C

    2015-01-01

    Management of fire blight is complicated by limitations on use of antibiotics in agriculture, antibiotic resistance development, and limited efficacy of alternative control agents. Even though successful in control, preventive antibiotic sprays also affect non-target bacteria, aiding the selection for resistance which could ultimately be transferred to the pathogen Erwinia amylovora. Trunk injection is a target-precise pesticide delivery method that utilizes tree xylem to distribute injected compounds. Trunk injection could decrease antibiotic usage in the open environment and increase the effectiveness of compounds in fire blight control. In field experiments, after 1-2 apple tree injections of either streptomycin, potassium phosphites (PH), or acibenzolar-S-methyl (ASM), significant reduction of blossom and shoot blight symptoms was observed compared to water injected control trees. Overall disease suppression with streptomycin was lower than typically observed following spray applications to flowers. Trunk injection of oxytetracycline resulted in excellent control of shoot blight severity, suggesting that injection is a superior delivery method for this antibiotic. Injection of both ASM and PH resulted in the significant induction of PR-1, PR-2, and PR-8 protein genes in apple leaves indicating induction of systemic acquired resistance (SAR) under field conditions. The time separating SAR induction and fire blight symptom suppression indicated that various defensive compounds within the SAR response were synthesized and accumulated in the canopy. ASM and PH suppressed fire blight even after cessation of induced gene expression. With the development of injectable formulations and optimization of doses and injection schedules, the injection of protective compounds could serve as an effective option for fire blight control. PMID:25717330

  20. Control of fire blight (Erwinia amylovora) on apple trees with trunk-injected plant resistance inducers and antibiotics and assessment of induction of pathogenesis-related protein genes

    PubMed Central

    Aćimović, Srđan G.; Zeng, Quan; McGhee, Gayle C.; Sundin, George W.; Wise, John C.

    2015-01-01

    Management of fire blight is complicated by limitations on use of antibiotics in agriculture, antibiotic resistance development, and limited efficacy of alternative control agents. Even though successful in control, preventive antibiotic sprays also affect non-target bacteria, aiding the selection for resistance which could ultimately be transferred to the pathogen Erwinia amylovora. Trunk injection is a target-precise pesticide delivery method that utilizes tree xylem to distribute injected compounds. Trunk injection could decrease antibiotic usage in the open environment and increase the effectiveness of compounds in fire blight control. In field experiments, after 1–2 apple tree injections of either streptomycin, potassium phosphites (PH), or acibenzolar-S-methyl (ASM), significant reduction of blossom and shoot blight symptoms was observed compared to water injected control trees. Overall disease suppression with streptomycin was lower than typically observed following spray applications to flowers. Trunk injection of oxytetracycline resulted in excellent control of shoot blight severity, suggesting that injection is a superior delivery method for this antibiotic. Injection of both ASM and PH resulted in the significant induction of PR-1, PR-2, and PR-8 protein genes in apple leaves indicating induction of systemic acquired resistance (SAR) under field conditions. The time separating SAR induction and fire blight symptom suppression indicated that various defensive compounds within the SAR response were synthesized and accumulated in the canopy. ASM and PH suppressed fire blight even after cessation of induced gene expression. With the development of injectable formulations and optimization of doses and injection schedules, the injection of protective compounds could serve as an effective option for fire blight control. PMID:25717330

  1. Forest or the trees: At what scale do elephants make foraging decisions?

    NASA Astrophysics Data System (ADS)

    Shrader, Adrian M.; Bell, Caroline; Bertolli, Liandra; Ward, David

    2012-07-01

    For herbivores, food is distributed spatially in a hierarchical manner ranging from plant parts to regions. Ultimately, utilisation of food is dependent on the scale at which herbivores make foraging decisions. A key factor that influences these decisions is body size, because selection inversely relates to body size. As a result, large animals can be less selective than small herbivores. Savanna elephants (Loxodonta africana) are the largest terrestrial herbivores. Thus, they represent a potential extreme with respect to unselective feeding. However, several studies have indicated that elephants prefer specific habitats and certain woody plant species. Thus, it is unclear at which scale elephants focus their foraging decisions. To determine this, we recorded the seasonal selection of habitats and woody plant species by elephants in the Ithala Game Reserve, South Africa. We expected that during the wet season, when both food quality and availability were high, elephants would select primarily for habitats. This, however, does not mean that they would utilise plant species within these habitats in proportion to availability, but rather would show a stronger selection for habitats compared to plants. In contrast, during the dry season when food quality and availability declined, we expected that elephants would shift and select for the remaining high quality woody species across all habitats. Consistent with our predictions, elephants selected for the larger spatial scale (i.e. habitats) during the wet season. However, elephants did not increase their selection of woody species during the dry season, but rather increased their selection of habitats relative to woody plant selection. Unlike a number of earlier studies, we found that neither palatability (i.e. crude protein, digestibility, and energy) alone nor tannin concentrations had a significant effect on the elephants' selection of woody species. However, the palatability:tannin ratio was

  2. Treatment of envenomation by Echis coloratus (mid-east saw scaled viper): a decision tree.

    PubMed

    Gilon, D; Shalev, O; Benbassat, J

    1989-01-01

    Envenomation by Echis coloratus causes a transient hemostatic failure. Systemic symptoms, hypotension and evident bleeding are rare, with only one reported fatality. In this paper, we examine the decision to treat victims of Echis coloratus by a specific horse antiserum. The decision model considers the mortality of treated and untreated envenomation, and the side effects of antiserum treatment: fatal anaphylaxis, serum sickness and increased risk of death after a possible repeated exposure to horse antiserum in the future. The results of the analysis are not sensitive to variations in the probability of side effects of antiserum treatment. They are sensitive to variations in the risk of bleeding after envenomation, in the degree of reduction of this risk by antiserum treatment and in the risk of dying after an event of bleeding. Prompt administration of antiserum appears to be the treatment of choice if it reduces the risk of bleeding from 23.6% to 20.3% and if 1.6% or more of the bleeding events are fatal. We conclude that presently available data support antiserum treatment of victims of Echis coloratus who present with hemostatic failure, even though the advantage imparted by this treatment appears to be small. PMID:2683230
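
    The threshold reasoning in this abstract can be illustrated with a short expected-value calculation. The sketch below is a deliberately simplified, one-level comparison and not the authors' decision model: it uses only the three percentages quoted above, and the treatment-related mortality is a hypothetical placeholder, because the abstract does not report that figure.

```python
def deaths_averted_per_patient(p_bleed_untreated, p_bleed_treated, p_fatal_given_bleed):
    """Expected reduction in bleeding-related mortality per treated patient."""
    return (p_bleed_untreated - p_bleed_treated) * p_fatal_given_bleed


# Figures quoted in the abstract: bleeding risk 23.6% -> 20.3%, 1.6% of bleeds fatal.
benefit = deaths_averted_per_patient(0.236, 0.203, 0.016)

# HYPOTHETICAL placeholder: mortality attributable to the antiserum itself
# (e.g. fatal anaphylaxis). The abstract gives no value for this.
treatment_mortality = 0.0004

# In this simplified view, treatment is preferred when the expected benefit
# exceeds the expected harm.
prefer_antiserum = benefit > treatment_mortality

print(f"deaths averted per patient: {benefit:.6f}")
print(f"prefer antiserum (under the assumed harm): {prefer_antiserum}")
```

    Varying the three probabilities in such a calculation is one way to see why the conclusion is sensitive to the risk of bleeding and to the fatality of bleeding events.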

  3. Tree Ensembles on the Induced Discrete Space.

    PubMed

    Yildiz, Olcay Taner

    2016-05-01

    Decision trees are widely used predictive models in machine learning. K-tree was recently proposed, where the original discrete feature space is expanded by generating all orderings of values of k discrete attributes and these orderings are used as the new attributes in decision tree induction. Although K-tree performs significantly better than the proper one, its exponential time complexity can prohibit its use. In this brief, we propose K-forest, an extension of random forest, where a subset of features is selected randomly from the induced discrete space. Simulation results on 17 data sets show that the novel ensemble classifier has a significantly lower error rate compared with the random forest based on the original feature space. PMID:26011897

  4. Decision-tree analysis of clinical data to aid diagnostic reasoning for equine laminitis: a cross-sectional study.

    PubMed

    Wylie, C E; Shaw, D J; Verheyen, K L P; Newton, J R

    2016-04-23

    The objective of this cross-sectional study was to compare the prevalence of selected clinical signs in laminitis cases and non-laminitic but lame controls to evaluate their capability to discriminate laminitis from other causes of lameness. Participating veterinary practitioners completed a checklist of laminitis-associated clinical signs identified by literature review. Cases were defined as horses/ponies with veterinary-diagnosed, clinically apparent laminitis; controls were horses/ponies with any lameness other than laminitis. Associations were tested by logistic regression with adjusted odds ratios (ORs) and 95% confidence intervals, with veterinary practice as an a priori fixed effect. Multivariable analysis using graphical classification tree-based statistical models linked laminitis prevalence with specific combinations of clinical signs. Data were collected for 588 cases and 201 controls. Five clinical signs had a difference in prevalence of greater than +50 per cent: 'reluctance to walk' (OR 4.4), 'short, stilted gait at walk' (OR 9.4), 'difficulty turning' (OR 16.9), 'shifting weight' (OR 17.7) and 'increased digital pulse' (OR 13.2) (all P<0.001). 'Bilateral forelimb lameness' was the best discriminator; 92 per cent of animals with this clinical sign had laminitis (OR 40.5, P<0.001). If, in addition, horses/ponies had an 'increased digital pulse', 99 per cent were identified as laminitis. 'Presence of a flat/convex sole' also significantly enhanced clinical diagnosis discrimination (OR 15.5, P<0.001). This is the first epidemiological laminitis study to use decision-tree analysis, providing the first evidence base for evaluating clinical signs to differentially diagnose laminitis from other causes of lameness. Improved evaluation of the clinical signs displayed by laminitic animals examined by first-opinion practitioners will lead to equine welfare improvements. PMID:26969668

  5. Detecting subcanopy invasive plant species in tropical rainforest by integrating optical and microwave (InSAR/PolInSAR) remote sensing data, and a decision tree algorithm

    NASA Astrophysics Data System (ADS)

    Ghulam, Abduwasit; Porton, Ingrid; Freeman, Karen

    2014-02-01

    In this paper, we propose a decision tree algorithm to characterize spatial extent and spectral features of invasive plant species (i.e., guava, Madagascar cardamom, and Molucca raspberry) in tropical rainforests by integrating datasets from passive and active remote sensing sensors. The decision tree algorithm is based on a number of input variables including matching score and infeasibility images from Mixture Tuned Matched Filtering (MTMF), land-cover maps, tree height information derived from high resolution stereo imagery, polarimetric feature images, Radar Forest Degradation Index (RFDI), polarimetric and InSAR coherence and phase difference images. Spatial distributions of the study organisms are mapped using pixel-based Winner-Takes-All (WTA) algorithm, object oriented feature extraction, spectral unmixing, and compared with the newly developed decision tree approach. Our results show that the InSAR phase difference and PolInSAR HH-VV coherence images of L-band PALSAR data are the most important variables following the MTMF outputs in mapping subcanopy invasive plant species in tropical rainforest. We also show that the three types of invasive plants alone occupy about 17.6% of the Betampona Nature Reserve (BNR) while mixed forest, shrubland and grassland areas together account for 11.9% of the reserve. This work presents the first systematic attempt to evaluate forest degradation, habitat quality and invasive plant statistics in the BNR, and provides significant insights as to management strategies for the control of invasive plants and conservation in the reserve.

  6. A decision tree model to estimate the value of information provided by a groundwater quality monitoring network

    NASA Astrophysics Data System (ADS)

    Khader, A.; Rosenberg, D.; McKee, M.

    2012-12-01

    Nitrate pollution poses a health risk for infants whose freshwater drinking source is groundwater. This risk creates a need to design an effective groundwater monitoring network, acquire information on groundwater conditions, and use acquired information to inform management. These actions require time, money, and effort. This paper presents a method to estimate the value of information (VOI) provided by a groundwater quality monitoring network located in an aquifer whose water poses a spatially heterogeneous and uncertain health risk. A decision tree model describes the structure of the decision alternatives facing the decision maker and the expected outcomes from these alternatives. The alternatives include: (i) ignore the health risk of nitrate contaminated water, (ii) switch to alternative water sources such as bottled water, or (iii) implement a previously designed groundwater quality monitoring network that takes into account uncertainties in aquifer properties, pollution transport processes, and climate (Khader and McKee, 2012). The VOI is estimated as the difference between the expected costs of implementing the monitoring network and the lowest-cost uninformed alternative. We illustrate the method for the Eocene Aquifer, West Bank, Palestine where methemoglobinemia is the main health problem associated with the principal pollutant nitrate. The expected cost of each alternative is estimated as the weighted sum of the costs and probabilities (likelihoods) associated with the uncertain outcomes resulting from the alternative. Uncertain outcomes include actual nitrate concentrations in the aquifer, concentrations reported by the monitoring system, whether people abide by manager recommendations to use/not-use aquifer water, and whether people get sick from drinking contaminated water. Outcome costs include healthcare for methemoglobinemia, purchase of bottled water, and installation and maintenance of the groundwater monitoring system. At current

  7. A decision tree model to estimate the value of information provided by a groundwater quality monitoring network

    NASA Astrophysics Data System (ADS)

    Khader, A. I.; Rosenberg, D. E.; McKee, M.

    2013-05-01

    Groundwater contaminated with nitrate poses a serious health risk to infants when this contaminated water is used for culinary purposes. To avoid this health risk, people need to know whether their culinary water is contaminated or not. Therefore, there is a need to design an effective groundwater monitoring network, acquire information on groundwater conditions, and use acquired information to inform management options. These actions require time, money, and effort. This paper presents a method to estimate the value of information (VOI) provided by a groundwater quality monitoring network located in an aquifer whose water poses a spatially heterogeneous and uncertain health risk. A decision tree model describes the structure of the decision alternatives facing the decision-maker and the expected outcomes from these alternatives. The alternatives include (i) ignore the health risk of nitrate-contaminated water, (ii) switch to alternative water sources such as bottled water, or (iii) implement a previously designed groundwater quality monitoring network that takes into account uncertainties in aquifer properties, contaminant transport processes, and climate (Khader, 2012). The VOI is estimated as the difference between the expected costs of implementing the monitoring network and the lowest-cost uninformed alternative. We illustrate the method for the Eocene Aquifer, West Bank, Palestine, where methemoglobinemia (blue baby syndrome) is the main health problem associated with the principal contaminant nitrate. The expected cost of each alternative is estimated as the weighted sum of the costs and probabilities (likelihoods) associated with the uncertain outcomes resulting from the alternative. Uncertain outcomes include actual nitrate concentrations in the aquifer, concentrations reported by the monitoring system, whether people abide by manager recommendations to use/not use aquifer water, and whether people get sick from drinking contaminated water. Outcome costs
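
    The expected-cost and VOI definitions given in this abstract can be sketched in a few lines. Everything numeric below is hypothetical and chosen only to make the arithmetic easy to follow; the alternative names mirror items (i) to (iii) above, and the sign convention (uninformed cost minus monitoring cost, so that a positive VOI favours monitoring) is an assumption, since the abstract only says "difference".

```python
def expected_cost(outcomes):
    """Weighted sum of outcome costs.

    `outcomes` is a list of (probability, cost) pairs for one alternative.
    """
    total_probability = sum(p for p, _ in outcomes)
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("outcome probabilities must sum to 1")
    return sum(p * cost for p, cost in outcomes)


# HYPOTHETICAL probabilities and costs (arbitrary currency units).
alternatives = {
    "ignore_risk": [(0.7, 0.0), (0.3, 1000.0)],          # nobody sick / healthcare costs
    "bottled_water": [(1.0, 400.0)],                     # certain purchase cost
    "monitoring_network": [(0.7, 150.0), (0.3, 350.0)],  # network cost with/without extra action
}

costs = {name: expected_cost(outcomes) for name, outcomes in alternatives.items()}

# Lowest-cost alternative that uses no monitoring information.
uninformed_cost = min(costs["ignore_risk"], costs["bottled_water"])

# Assumed sign convention: positive VOI means monitoring lowers expected cost.
voi = uninformed_cost - costs["monitoring_network"]

print(costs)
print(f"VOI = {voi:.1f}")
```

    With these made-up inputs the uninformed alternatives cost 300 and 400, the monitoring alternative costs 210, and the VOI is 90; real values would come from the outcome probabilities and costs described in the abstract.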

  8. Expression profiling of FLOWERING LOCUS T-like gene in alternate bearing 'Hass' avocado trees suggests a role for PaFT in avocado flower induction.

    PubMed

    Ziv, Dafna; Zviran, Tali; Zezak, Oshrat; Samach, Alon; Irihimovitch, Vered

    2014-01-01

    In many perennials, heavy fruit load on a shoot decreases the ability of the plant to undergo floral induction in the following spring, resulting in a pattern of crop production known as alternate bearing. Here, we studied the effects of fruit load on floral determination in 'Hass' avocado (Persea americana). De-fruiting experiments initially confirmed the negative effects of fruit load on return to flowering. Next, we isolated a FLOWERING LOCUS T-like gene, PaFT, hypothesized to act as a phloem-mobile florigen signal and examined its expression profile in shoot tissues of on (fully loaded) and off (fruit-lacking) trees. Expression analyses revealed a strong peak in PaFT transcript levels in leaves of off trees from the end of October through November, followed by a return to starting levels. Moreover and concomitant with inflorescence development, only off buds displayed up-regulation of the floral identity transcripts PaAP1 and PaLFY, with significant variation being detected from October and November, respectively. Furthermore, a parallel microscopic study of off apical buds revealed the presence of secondary inflorescence axis structures that only appeared towards the end of November. Finally, ectopic expression of PaFT in Arabidopsis resulted in early flowering transition. Together, our data suggests a link between increased PaFT expression observed during late autumn and avocado flower induction. Furthermore, our results also imply that, as in the case of other crop trees, fruit-load might affect flowering by repressing the expression of PaFT in the leaves. Possible mechanism(s) by which fruit crop might repress PaFT expression, are discussed. PMID:25330324

  9. Expression Profiling of FLOWERING LOCUS T-Like Gene in Alternate Bearing ‘Hass' Avocado Trees Suggests a Role for PaFT in Avocado Flower Induction

    PubMed Central

    Ziv, Dafna; Zviran, Tali; Zezak, Oshrat; Samach, Alon; Irihimovitch, Vered

    2014-01-01

    In many perennials, heavy fruit load on a shoot decreases the ability of the plant to undergo floral induction in the following spring, resulting in a pattern of crop production known as alternate bearing. Here, we studied the effects of fruit load on floral determination in ‘Hass' avocado (Persea americana). De-fruiting experiments initially confirmed the negative effects of fruit load on return to flowering. Next, we isolated a FLOWERING LOCUS T-like gene, PaFT, hypothesized to act as a phloem-mobile florigen signal and examined its expression profile in shoot tissues of on (fully loaded) and off (fruit-lacking) trees. Expression analyses revealed a strong peak in PaFT transcript levels in leaves of off trees from the end of October through November, followed by a return to starting levels. Moreover and concomitant with inflorescence development, only off buds displayed up-regulation of the floral identity transcripts PaAP1 and PaLFY, with significant variation being detected from October and November, respectively. Furthermore, a parallel microscopic study of off apical buds revealed the presence of secondary inflorescence axis structures that only appeared towards the end of November. Finally, ectopic expression of PaFT in Arabidopsis resulted in early flowering transition. Together, our data suggests a link between increased PaFT expression observed during late autumn and avocado flower induction. Furthermore, our results also imply that, as in the case of other crop trees, fruit-load might affect flowering by repressing the expression of PaFT in the leaves. Possible mechanism(s) by which fruit crop might repress PaFT expression, are discussed. PMID:25330324

  10. Diagnosis of pulmonary hypertension from magnetic resonance imaging–based computational models and decision tree analysis

    PubMed Central

    Swift, Andrew J.; Capener, David; Kiely, David; Hose, Rod; Wild, Jim M.

    2016-01-01

    Accurately identifying patients with pulmonary hypertension (PH) using noninvasive methods is challenging, and right heart catheterization (RHC) is the gold standard. Magnetic resonance imaging (MRI) has been proposed as an alternative to echocardiography and RHC in the assessment of cardiac function and pulmonary hemodynamics in patients with suspected PH. The aim of this study was to assess whether machine learning using computational modeling techniques and image-based metrics of PH can improve the diagnostic accuracy of MRI in PH. Seventy-two patients with suspected PH attending a referral center underwent RHC and MRI within 48 hours. Fifty-seven patients were diagnosed with PH, and 15 had no PH. A number of functional and structural cardiac and cardiovascular markers derived from 2 mathematical models and also solely from MRI of the main pulmonary artery and heart were integrated into a classification algorithm to investigate the diagnostic utility of the combination of the individual markers. A physiological marker based on the quantification of wave reflection in the pulmonary artery was shown to perform best individually, but optimal diagnostic performance was found by the combination of several image-based markers. Classifier results, validated using leave-one-out cross validation, demonstrated that combining computation-derived metrics reflecting hemodynamic changes in the pulmonary vasculature with measurement of right ventricular morphology and function, in a decision support algorithm, provides a method to noninvasively diagnose PH with high accuracy (92%). The high diagnostic accuracy of these MRI-based model parameters may reduce the need for RHC in patients with suspected PH. PMID:27252844

  11. Diagnosis of pulmonary hypertension from magnetic resonance imaging-based computational models and decision tree analysis.

    PubMed

    Lungu, Angela; Swift, Andrew J; Capener, David; Kiely, David; Hose, Rod; Wild, Jim M

    2016-06-01

    Accurately identifying patients with pulmonary hypertension (PH) using noninvasive methods is challenging, and right heart catheterization (RHC) is the gold standard. Magnetic resonance imaging (MRI) has been proposed as an alternative to echocardiography and RHC in the assessment of cardiac function and pulmonary hemodynamics in patients with suspected PH. The aim of this study was to assess whether machine learning using computational modeling techniques and image-based metrics of PH can improve the diagnostic accuracy of MRI in PH. Seventy-two patients with suspected PH attending a referral center underwent RHC and MRI within 48 hours. Fifty-seven patients were diagnosed with PH, and 15 had no PH. A number of functional and structural cardiac and cardiovascular markers derived from 2 mathematical models and also solely from MRI of the main pulmonary artery and heart were integrated into a classification algorithm to investigate the diagnostic utility of the combination of the individual markers. A physiological marker based on the quantification of wave reflection in the pulmonary artery was shown to perform best individually, but optimal diagnostic performance was found by the combination of several image-based markers. Classifier results, validated using leave-one-out cross validation, demonstrated that combining computation-derived metrics reflecting hemodynamic changes in the pulmonary vasculature with measurement of right ventricular morphology and function, in a decision support algorithm, provides a method to noninvasively diagnose PH with high accuracy (92%). The high diagnostic accuracy of these MRI-based model parameters may reduce the need for RHC in patients with suspected PH. PMID:27252844
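
    Leave-one-out cross validation, the validation scheme named in this abstract, can be sketched as follows. This is a generic illustration, not the study's pipeline: the nearest-mean classifier is a stand-in for the (unspecified) classification algorithm, and the single marker values and labels are invented.

```python
def nearest_mean_predict(train, x):
    """Predict the label whose training-set mean is closest to x.

    Stand-in classifier for illustration only; `train` is a list of
    (marker_value, label) pairs.
    """
    sums, counts = {}, {}
    for value, label in train:
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return min(sums, key=lambda label: abs(x - sums[label] / counts[label]))


def leave_one_out_accuracy(samples):
    """Hold out each sample once, train on the rest, and score the prediction."""
    correct = 0
    for i, (value, label) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]
        if nearest_mean_predict(train, value) == label:
            correct += 1
    return correct / len(samples)


# HYPOTHETICAL marker values; not data from the study.
samples = [
    (0.8, "PH"), (0.9, "PH"), (1.0, "PH"), (1.1, "PH"),
    (0.1, "no PH"), (0.2, "no PH"), (0.3, "no PH"), (0.75, "no PH"),
]

print(leave_one_out_accuracy(samples))  # 0.875: only the 0.75 sample is misclassified
```

    Because every sample is tested on a model that never saw it, the scheme suits small cohorts such as the 72 patients described here, at the cost of fitting one model per sample.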

  12. Assessing and monitoring the risk of desertification in Dobrogea, Romania, using Landsat data and decision tree classifier.

    PubMed

    Vorovencii, Iosif

    2015-04-01

    The risk of the desertification of a part of Romania is increasingly evident, constituting a serious problem for the environment and the society. This article attempts to assess and monitor the risk of desertification in Dobrogea using Landsat Thematic Mapper (TM) satellite images acquired in 1987, 1994, 2000, 2007 and 2011. In order to assess the risk of desertification, we used as indicators the Modified Soil Adjustment Vegetation Index 1 (MSAVI1), the Moving Standard Deviation Index (MSDI) and the albedo, indices relating to the vegetation conditions, the landscape pattern and micrometeorology. The decision tree classifier (DTC) was also used on the basis of pre-established rules, and maps displaying six grades of desertification risk were obtained: non, very low, low, medium, high and severe. Land surface temperature (LST) was also used for the analysis. The results indicate that, according to pre-established rules for the period of 1987-2011, there are two grades of desertification risk that have an ascending trend in Dobrogea, namely very low and medium desertification. An investigation into the causes of the desertification risk revealed that high temperature is the main factor, accompanied by the destruction of forest shelterbelts and of the irrigation system and, to a smaller extent, by the fragmentation of agricultural land and the deforestation in the study area. PMID:25800368

  13. Prediction of healthy blood with data mining classification by using Decision Tree, Naive Baysian and SVM approaches

    NASA Astrophysics Data System (ADS)

    Khalilinezhad, Mahdieh; Minaei, Behrooz; Vernazza, Gianni; Dellepiane, Silvana

    2015-03-01

    Data mining (DM) is the process of discovering knowledge from large databases. Applications of data mining in Blood Transfusion Organizations could be useful for improving the performance of blood donation service. The aim of this research is the prediction of healthiness of blood donors in Blood Transfusion Organization (BTO). For this goal, three well-known algorithms, namely Decision Tree C4.5, Naïve Bayesian classifier, and Support Vector Machine, have been chosen and applied to a real database made of 11006 donors. Seven fields such as sex, age, job, education, marital status, type of donor, results of blood tests (doctors' comments and lab results about healthy or unhealthy blood donors) have been selected as input to these algorithms. The results of the three algorithms have been compared and an error cost analysis has been performed. According to this research and the obtained results, the best algorithm with low error cost and high accuracy is SVM. This research helps BTO to realize a model from blood donors in each area in order to predict the healthy blood or unhealthy blood of donors. This research could be useful if used in parallel with laboratory tests to better separate unhealthy blood.

  14. Clinical elements that predict outcome after traumatic brain injury: a prospective multicenter recursive partitioning (decision-tree) analysis.

    PubMed

    Brown, Allen W; Malec, James F; McClelland, Robyn L; Diehl, Nancy N; Englander, Jeffrey; Cifu, David X

    2005-10-01

    Traumatic brain injury (TBI) often presents clinicians with a complex combination of clinical elements that can confound treatment and make outcome prediction challenging. Predictive models have commonly used acute physiological variables and gross clinical measures to predict mortality and basic outcome endpoints. The primary goal of this study was to consider all clinical elements available concerning a survivor of TBI admitted for inpatient rehabilitation, and identify those factors that predict disability, need for supervision, and productive activity one year after injury. The Traumatic Brain Injury Model Systems (TBIMS) database was used for decision tree analysis using recursive partitioning (n = 3463). Outcome measures included the Functional Independence Measure, the Disability Rating Scale, the Supervision Rating Scale, and a measure of productive activity. Predictor variables included all physical examination elements, measures of injury severity (initial Glasgow Coma Scale score, duration of post-traumatic amnesia [PTA], length of coma, CT scan pathology), gender, age, and years of education. The duration of PTA, age, and most elements of the physical examination were predictive of early disability. The duration of PTA alone was selected to predict late disability and independent living. The duration of PTA, age, sitting balance, and limb strength were selected to predict productive activity at 1 year. The duration of PTA was the best predictor of outcome selected in this model for all endpoints and elements of the physical examination provided additional predictive value. Valid and reliable measures of PTA and physical impairment after TBI are important for accurate outcome prediction. PMID:16238482
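
    Recursive partitioning works by repeatedly choosing the predictor threshold that best separates the outcome classes. The sketch below shows a single such split search using Gini impurity, which is an assumption here: the abstract does not state which splitting criterion was used. The PTA durations and outcome labels are invented for illustration.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 for a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best split `value <= threshold`.

    Candidate thresholds are midpoints between consecutive distinct values.
    Returns (None, parent_gini) if no split improves on the unsplit node.
    """
    best = (None, gini(labels))
    distinct = sorted(set(values))
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (threshold, score)
    return best


# HYPOTHETICAL data: duration of PTA in days and a binary outcome.
pta_days = [2, 5, 9, 14, 30, 45, 60, 90]
outcome = ["productive"] * 4 + ["not productive"] * 4

threshold, impurity = best_split(pta_days, outcome)
print(threshold, impurity)  # 22.0 0.0
```

    A full tree would apply the same search to each resulting subset and across all predictors, stopping when nodes are pure or too small to split.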

  15. Model-Based Design of a Decision Tree for Treating HER2+ Cancers Based on Genetic and Protein Biomarkers

    PubMed Central

    Kirouac, DC; Lahdenranta, J; Du, J; Yarar, D; Onsum, MD; Nielsen, UB; McDonagh, CF

    2015-01-01

    Human cancers are incredibly diverse with regard to molecular aberrations, dependence on oncogenic signaling pathways, and responses to pharmacological intervention. We wished to assess how cellular dependence on the canonical PI3K vs. MAPK pathways within HER2+ cancers affects responses to combinations of targeted therapies, and biomarkers predictive of their activity. Through an integrative analysis of mechanistic model simulations and in vitro cell line profiling, we designed a six-arm decision tree to stratify treatment of HER2+ cancers using combinations of targeted agents. Activating mutations in the PI3K and MAPK pathways (PIK3CA and KRAS), and expression of the HER3 ligand heregulin determined sensitivity to combinations of inhibitors against HER2 (lapatinib), HER3 (MM-111), AKT (MK-2206), and MEK (GSK-1120212; trametinib), in addition to the standard of care trastuzumab (Herceptin). The strategy used to identify effective combinations and predictive biomarkers in HER2-expressing tumors may be more broadly extendable to other human cancers. PMID:26225238

  16. Application of artificial neural network, fuzzy logic and decision tree algorithms for modelling of streamflow at Kasol in India.

    PubMed

    Senthil Kumar, A R; Goyal, Manish Kumar; Ojha, C S P; Singh, R D; Swamee, P K

    2013-01-01

    The prediction of streamflow is required in many activities associated with the planning and operation of the components of a water resources system. Soft computing techniques have proven to be an efficient alternative to traditional methods for modelling qualitative and quantitative water resource variables such as streamflow. The focus of this paper is to present the development of models using multiple linear regression (MLR), artificial neural network (ANN), fuzzy logic and decision tree algorithms such as M5 and REPTree for predicting the streamflow at Kasol located at the upstream of Bhakra reservoir in Sutlej basin in northern India. The input vector to the various models using different algorithms was derived considering statistical properties such as auto-correlation function, partial auto-correlation and cross-correlation function of the time series. It was found that the REPTree model performed well compared to other soft computing techniques such as MLR, ANN, fuzzy logic, and M5P investigated in this study and the results of the REPTree model indicate that the entire range of streamflow values was simulated fairly well. The performance of the naïve persistence model was compared with other models and the requirement of the development of the naïve persistence model was also analysed by persistence index. PMID:24355836
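
    The persistence index mentioned at the end of this abstract compares a model against the naïve forecast "the next value equals the last observed value". The abstract gives no formula; the sketch below assumes the commonly used definition PI = 1 - SSE(model) / SSE(naïve), and the function name and the flow values are hypothetical.

```python
def persistence_index(observed, predicted, lag=1):
    """Persistence index, assuming PI = 1 - SSE_model / SSE_naive.

    The naive forecast for time t is the observation at time t - lag.
    PI = 1 is a perfect model; PI = 0 is no better than naive persistence.
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if len(observed) <= lag:
        raise ValueError("series is too short for the chosen lag")
    sse_model = sum(
        (o - p) ** 2 for o, p in zip(observed[lag:], predicted[lag:])
    )
    sse_naive = sum(
        (observed[t] - observed[t - lag]) ** 2 for t in range(lag, len(observed))
    )
    if sse_naive == 0:
        raise ValueError("observed series is constant; index is undefined")
    return 1.0 - sse_model / sse_naive


# Hypothetical daily flows (arbitrary units), not data from the study.
flows = [10.0, 12.0, 15.0, 11.0]
naive_forecast = [flows[0]] + flows[:-1]  # yesterday's flow as today's forecast

pi_perfect = persistence_index(flows, flows)
pi_naive = persistence_index(flows, naive_forecast)
```

    Under this definition, a negative value means the model does worse than simply carrying the last observation forward.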

  17. Rejecting Non-MIP-Like Tracks using Boosted Decision Trees with the T2K Pi-Zero Subdetector

    NASA Astrophysics Data System (ADS)

    Hogan, Matthew; Schwehr, Jacklyn; Cherdack, Daniel; Wilson, Robert; T2K Collaboration

    2016-03-01

    Tokai-to-Kamioka (T2K) is a long-baseline neutrino experiment with a narrow band energy spectrum peaked at 600 MeV. The Pi-Zero detector (PØD) is a plastic scintillator-based detector located in the off-axis near detector complex 280 meters from the beam origin. It is designed to constrain neutral-current induced π0 production background at the far detector using the water target which is interleaved between scintillator layers. A PØD-based measurement of charged-current (CC) single charged pion (1π+) production on water is being developed which will have expanded phase space coverage as compared to the previous analysis. The signal channel for this analysis, which for T2K is dominated by Δ production, is defined as events that produce a single muon, single charged pion, and any number of nucleons in the final state. The analysis will employ machine learning algorithms to enhance CC1π+ selection by studying topological observables that characterize signal well. Important observables for this analysis are those that discriminate a minimum ionizing particle (MIP) like a muon or pion from a proton at the T2K energies. This work describes the development of a discriminator using Boosted Decision Trees to reject non-MIP-like PØD tracks.

  18. Landsat-derived cropland mask for Tanzania using 2010-2013 time series and decision tree classifier methods

    NASA Astrophysics Data System (ADS)

    Justice, C. J.

    2015-12-01

    80% of Tanzania's population is involved in the agriculture sector. Despite this national dependence, agricultural reporting is minimal and monitoring efforts are in their infancy. The cropland mask developed through this study provides the framework for agricultural monitoring through informing analysis of crop conditions, dispersion, and intensity at a national scale. Tanzania is dominated by smallholder agricultural systems with an average field size of less than one hectare (Sarris et al, 2006). At this field scale, previous classifications of agricultural land in Tanzania using MODIS coarse resolution data are insufficient to inform a working monitoring system. The nation-wide cropland mask in this study was developed using composited Landsat tiles from a 2010-2013 time series. Decision tree classifier methods were used in the study with representative training areas collected for agriculture and no agriculture using appropriate indices to separate these classes (Hansen et al, 2013). Validation was done using a random sample and high resolution satellite images to compare Agriculture and No agriculture samples from the study area. The techniques used in this study were successful and have the potential to be adapted for other countries, allowing targeted monitoring efforts to improve food security, market price, and inform agricultural policy.

  19. An expert system with radial basis function neural network based on decision trees for predicting sediment transport in sewers.

    PubMed

    Ebtehaj, Isa; Bonakdari, Hossein; Zaji, Amir Hossein

    2016-01-01

    In this study, an expert system with a radial basis function neural network (RBF-NN) based on decision trees (DT) is designed to predict sediment transport in sewer pipes at the limit of deposition. First, sensitivity analysis is carried out to investigate the effect of each parameter on predicting the densimetric Froude number (Fr). The results indicate that utilizing the ratio of the median particle diameter to pipe diameter (d/D), ratio of median particle diameter to hydraulic radius (d/R) and volumetric sediment concentration (C_V) as the input combination leads to the best Fr prediction. Subsequently, the new hybrid DT-RBF method is presented. The results of DT-RBF are compared with RBF and RBF-particle swarm optimization (PSO), which uses PSO for RBF training. It appears that DT-RBF is more accurate (R² = 0.934, MARE = 0.103, RMSE = 0.527, SI = 0.13, BIAS = -0.071) than the two other RBF methods. Moreover, the proposed DT-RBF model offers explicit expressions for use by practicing engineers. PMID:27386995
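
    The abstract reports MARE, RMSE, SI and BIAS without defining them. The sketch below uses definitions that are common in the hydraulics literature (mean absolute relative error, root mean square error, scatter index as RMSE divided by the mean observation, and bias as the mean of predicted minus observed); whether the authors used exactly these forms is an assumption. Function names and sample values are hypothetical.

```python
import math


def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mare(obs, pred):
    """Mean absolute relative error (observations must be non-zero)."""
    return sum(abs(o - p) / abs(o) for o, p in zip(obs, pred)) / len(obs)


def bias(obs, pred):
    """Mean signed error, assumed here as predicted minus observed."""
    return sum(p - o for o, p in zip(obs, pred)) / len(obs)


def scatter_index(obs, pred):
    """Scatter index, assumed here as RMSE divided by the mean observation."""
    return rmse(obs, pred) / (sum(obs) / len(obs))


# Hypothetical Froude-number values, not data from the study.
observed = [2.0, 4.0, 5.0]
predicted = [1.0, 4.0, 7.0]

metrics = {
    "RMSE": rmse(observed, predicted),
    "MARE": mare(observed, predicted),
    "BIAS": bias(observed, predicted),
    "SI": scatter_index(observed, predicted),
}
```

    With the sign convention assumed here, a negative BIAS would indicate that the model underpredicts on average.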

  20. Rapid decision support tool based on novel ecosystem service variables for retrofitting of permeable pavement systems in the presence of trees.

    PubMed

    Scholz, Miklas; Uzomah, Vincent C

    2013-08-01

    The retrofitting of sustainable drainage systems (SuDS) such as permeable pavements is currently undertaken ad hoc using expert experience supported by minimal guidance based predominantly on hard engineering variables. There is a lack of practical decision support tools useful for a rapid assessment of the potential of ecosystem services when retrofitting permeable pavements in urban areas that either feature existing trees or should be planted with trees in the near future. Thus the aim of this paper is to develop an innovative rapid decision support tool based on novel ecosystem service variables for retrofitting of permeable pavement systems close to trees. This unique tool proposes the retrofitting of permeable pavements that obtained the highest ecosystem service score for a specific urban site enhanced by the presence of trees. This approach is based on a novel ecosystem service philosophy adapted to permeable pavements rather than on traditional engineering judgement associated with variables based on quick community and environment assessments. For an example case study area such as Greater Manchester, which was dominated by Sycamore and Common Lime, a comparison with the traditional approach of determining community and environment variables indicates that permeable pavements are generally a preferred SuDS option. Permeable pavements combined with urban trees received relatively high scores, because of their great potential impact in terms of water and air quality improvement, and flood control, respectively. The outcomes of this paper are likely to lead to more combined permeable pavement and tree systems in the urban landscape, which are beneficial for humans and the environment. PMID:23697848

  1. Spatial prediction of flood susceptible areas using rule based decision tree (DT) and a novel ensemble bivariate and multivariate statistical models in GIS

    NASA Astrophysics Data System (ADS)

    Tehrany, Mahyat Shafapour; Pradhan, Biswajeet; Jebur, Mustafa Neamah

    2013-11-01

    Decision tree (DT) machine learning algorithm was used to map the flood susceptible areas in Kelantan, Malaysia. We used an ensemble frequency ratio (FR) and logistic regression (LR) model in order to overcome weak points of the LR. Combined method of FR and LR was used to map the susceptible areas in Kelantan, Malaysia. Results of both methods were compared and their efficiency was assessed. Most influencing conditioning factors on flooding were recognized.

  2. Utilizing home health care electronic health records for telehomecare patients with heart failure: a decision tree approach to detect associations with rehospitalizations

    PubMed Central

    Kang, Youjeong; McHugh, Matthew D; Chittams, Jesse; Bowles, Kathryn H.

    2016-01-01

    Heart failure is a complex condition with a significant impact on patients’ lives. A few studies have identified risk factors associated with rehospitalization among telehomecare patients with heart failure using logistic regression or survival analysis models. To date there are no published studies that have used data mining techniques to detect associations with rehospitalizations among telehomecare patients with heart failure. This study is a secondary analysis of the home health care electronic medical record called the Outcome Assessment and Information Set (OASIS)-C for 552 telemonitored heart failure patients. Bivariate analyses using SAS™ and a decision tree technique using Waikato Environment for Knowledge Analysis were used. From the decision tree technique, the presence of skin issue(s) was identified as the top predictor of rehospitalization that could be identified during the start of care assessment, followed by patient’s living situation, patient’s overall health status, severe pain experiences, frequency of activity-limiting pain, and total number of anticipated therapy visits combined. Examining risk factors for rehospitalization from the OASIS-C database using a decision tree approach among a cohort of telehomecare patients provided a broad understanding of the characteristics of patients who are appropriate for the use of telehomecare or who need additional supports. PMID:26848645

  3. A protocol for developing early warning score models from vital signs data in hospitals using ensembles of decision trees

    PubMed Central

    Xu, Michael; Tam, Benjamin; Thabane, Lehana; Fox-Robichaud, Alison

    2015-01-01

    Introduction Multiple early warning scores (EWS) have been developed and implemented to reduce cardiac arrests on hospital wards. Case–control observational studies that generate an area under the receiver operator curve (AUROC) are the usual validation method, but investigators have also generated EWS with algorithms with no prior clinical knowledge. We present a protocol for the validation and comparison of our local Hamilton Early Warning Score (HEWS) with that generated using decision tree (DT) methods. Methods and analysis A database of electronically recorded vital signs from 4 medical and 4 surgical wards will be used to generate DT EWS (DT-HEWS). A third EWS will be generated using ensemble-based methods. Missing data will be multiply imputed. For a relative risk reduction of 50% in our composite outcome (cardiac or respiratory arrest, unanticipated intensive care unit (ICU) admission or hospital death) with a power of 80%, we calculated a sample size of 17 151 patient days based on our cardiac arrest rates in 2012. The performance of the National EWS, DT-HEWS and the ensemble EWS will be compared using AUROC. Ethics and dissemination Ethics approval was received from the Hamilton Integrated Research Ethics Board (#13-724-C). The vital signs and associated outcomes are stored in a database on our secure hospital server. Preliminary dissemination of this protocol was presented in abstract form at an international critical care meeting. Final results of this analysis will be used to improve on the existing HEWS and will be shared through publication and presentation at critical care meetings. PMID:26353873
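
    Since the protocol compares scores by AUROC, a minimal sketch of that quantity may help. It uses the pairwise (Mann-Whitney) interpretation: the fraction of event/non-event pairs in which the event received the higher score, with ties counted as one half. The function name and the example scores are hypothetical and do not come from the protocol.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    `labels` holds 1 for patient-days with the outcome and 0 otherwise.
    This O(n_pos * n_neg) loop is meant for clarity, not for large data sets.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical early warning scores and outcomes.
perfect = auroc([1, 3, 5, 7, 2, 8], [0, 0, 1, 1, 0, 1])
partial = auroc([1, 2, 3, 4], [0, 1, 0, 1])
```

    A value of 0.5 corresponds to a score that ranks events no better than chance.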

  4. Induction of somatic embryogenesis in explants of shoot cultures established from adult Eucalyptus globulus and E. saligna × E. maidenii trees.

    PubMed

    Corredoira, E; Ballester, A; Ibarra, M; Vieitez, A M

    2015-06-01

    A reproducible procedure for induction of somatic embryogenesis (SE) from adult trees of Eucalyptus globulus Labill. and the hybrid E. saligna Smith × E. maidenii has been developed for the first time. Somatic embryos were obtained from both shoot apex and leaf explants of all three genotypes evaluated, although embryogenic frequencies were significantly influenced by the species/genotype, auxin and explant type. Picloram was more efficient for somatic embryo induction than naphthaleneacetic acid (NAA), with the highest frequency of induction being obtained in Murashige and Skoog medium containing 40 µM picloram and 40 mg l⁻¹ gum Arabic, in which 64% of the shoot apex explants and 68.8% of the leaf explants yielded somatic embryos. The embryogenic response of the hybrid was higher than that of the E. globulus, especially when NAA was used. The cultures initiated on picloram-containing medium consisted of nodular embryogenic structures surrounded by a mucilaginous coating layer that emerged from a watery callus developed from the initial explants. Cotyledonary somatic embryos were differentiated after subculture of these nodular embryogenic structures on a medium lacking plant growth regulators. Histological analysis confirmed the bipolar organization of the somatic embryos, with shoot and root meristems and closed procambial tissue that bifurcated into small cotyledons. The root pole was more differentiated than the shoot pole, which appeared to be formed by a few meristematic layers. Maintenance of the embryogenic lines by secondary SE was attained by subculturing individual cotyledonary embryos or small clusters of globular and torpedo embryos on medium with 16.11 µM NAA at 4- to 5-week intervals. Somatic embryos converted into plantlets after being transferred to liquid germination medium although plant regeneration remained poor. PMID:25877768

  5. Assessing the safety of co-exposure to food packaging migrants in food and water using the maximum cumulative ratio and an established decision tree.

    PubMed

    Price, Paul; Zaleski, Rosemary; Hollnagel, Heli; Ketelslegers, Hans; Han, Xianglu

    2014-01-01

    Food contact materials can release low levels of multiple chemicals (migrants) into foods and beverages, to which individuals can be exposed through food consumption. This paper investigates the potential for non-carcinogenic effects from exposure to multiple migrants using the Cefic Mixtures Ad hoc Team (MIAT) decision tree. The purpose of the assessment is to demonstrate how the decision tree can be applied to concurrent exposures to multiple migrants using either hazard or structural data on the specific components, i.e. based on the acceptable daily intake (ADI) or the threshold of toxicological concern. The tree was used to assess risks from co-exposure to migrants reported in a study on non-intentionally added substances (NIAS) eluting from food contact-grade plastic and two studies of water bottles: one on organic compounds and the other on ionic forms of various elements. The MIAT decision tree assigns co-exposures to different risk management groups (I, II, IIIA and IIIB) based on the hazard index, and the maximum cumulative ratio (MCR). The predicted co-exposures for all examples fell into Group II (low toxicological concern) and had MCR values of 1.3 and 2.4 (indicating that one or two components drove the majority of the mixture's toxicity). MCR values from the study of inorganic ions (126 mixtures) ranged from 1.1 to 3.8 for glass and from 1.1 to 5.0 for plastic containers. The MCR values indicated that a single compound drove toxicity in 58% of the mixtures. MCR values also declined with increases in the hazard index for the screening assessments of exposure (suggesting fewer substances contributed as risk potential increased). Overall, it can be concluded that the data on co-exposure to migrants evaluated in these case studies are of low toxicological concern and the safety assessment approach described in this paper was shown to be a helpful screening tool. PMID:24320041
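
    The two quantities that drive this decision tree can be sketched in a few lines. The hazard index (HI) is taken as the sum of per-substance hazard quotients (exposure divided by a reference value such as the ADI), and the maximum cumulative ratio (MCR) as HI divided by the largest single hazard quotient. The group thresholds in `miat_group` are an assumption about how the tree assigns Groups I, II, IIIA and IIIB and should be checked against the published decision tree; the function names and hazard quotients are hypothetical.

```python
def hazard_index(hazard_quotients):
    """Hazard index: sum of the individual hazard quotients."""
    return sum(hazard_quotients)


def max_cumulative_ratio(hazard_quotients):
    """MCR: hazard index divided by the largest single hazard quotient."""
    return hazard_index(hazard_quotients) / max(hazard_quotients)


def miat_group(hazard_quotients):
    """Assign a risk management group.

    ASSUMED thresholds (not stated in the abstract):
      I    - at least one substance has a hazard quotient above 1
      II   - hazard index of the mixture is at most 1
      IIIA - hazard index above 1 and MCR below 2 (one substance dominates)
      IIIB - hazard index above 1 and MCR of 2 or more
    """
    if max(hazard_quotients) > 1:
        return "I"
    if hazard_index(hazard_quotients) <= 1:
        return "II"
    if max_cumulative_ratio(hazard_quotients) < 2:
        return "IIIA"
    return "IIIB"


# Hypothetical hazard quotients for three migrants in one mixture.
hqs = [0.2, 0.05, 0.01]
hi = hazard_index(hqs)
mcr = max_cumulative_ratio(hqs)
group = miat_group(hqs)
```

    An MCR close to 1 means a single substance accounts for nearly all of the hazard index, which is why low MCR values in the abstract are read as one or two components driving the mixture's toxicity.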

  6. Measurement of single top quark production in the tau+jets channel using boosted decision trees at D0

    SciTech Connect

    Liu, Zhiyi

    2009-12-01

    The top quark is the heaviest known matter particle and plays an important role in the Standard Model of particle physics. At hadron colliders, it is possible to produce single top quarks via the weak interaction. This allows a direct measurement of the CKM matrix element $V_{tb}$ and serves as a window to new physics. The first direct measurement of single top quark production with a tau lepton in the final state (the tau+jets channel) is presented in this thesis. The measurement uses 4.8 fb$^{-1}$ of Tevatron Run II data in p$\bar{p}$ collisions at √s = 1.96 TeV acquired by the D0 experiment. After selecting a data sample and building a background model, the data and background model are in good agreement. A multivariate technique, boosted decision trees, is employed in discriminating the small single top quark signal from a large background. The expected sensitivity of the tau+jets channel in the Standard Model is 1.8 standard deviations. Using a Bayesian statistical approach, an upper limit on the cross section of single top quark production in the tau+jets channel is measured as 7.3 pb at 95% confidence level, and the cross section is measured as $3.4^{+2.0}_{-1.8}$ pb. The result of the single top quark production in the tau+jets channel is also combined with those in the electron+jets and muon+jets channels. The expected sensitivity of the electron, muon and tau combined analysis is 4.7 standard deviations, to be compared to 4.5 standard deviations in electron and muon alone. The measured cross section in the three combined final states is σ(p$\bar{p}$ → tb + X, tqb + X) = $3.84^{+0.89}_{-0.83}$ pb. A lower limit on $|V_{tb}|$ is also measured in the three combined final states to be larger than 0.85 at 95% confidence level. These results are consistent with Standard Model expectations.

  7. A decision tree approach for the application of drug metabolism and kinetic studies to in vivo and in vitro toxicological and pharmacological testing.

    PubMed

    Bach, P H; Bridges, J W

    1985-01-01

    The integration of toxicological and other biological findings with information on drug metabolism and pharmacokinetics is often very important for rational decision making in safety evaluation programmes. This goal is unlikely to be achieved by conducting a routine package of inflexibly defined drug metabolism and pharmacokinetic test protocols for each new chemical. Rather, an intelligent selection of experiments based on the known properties of a chemical is required. A series of decision trees are proposed which serve as an aide memoire in the choice of appropriate drug metabolism and pharmacokinetic experiments. These decision trees cover the physicochemical properties of a chemical, data on animal and human pharmacology and toxicology, and environmental information relevant to possible contamination. In many cases, drug metabolism and pharmacokinetic factors are an important prerequisite to the design of in vitro tests that are relevant to the in vivo situation. A scheme is provided to assist the identification of appropriate conditions for the in vitro testing of individual chemicals. PMID:3868347

  8. Segregating the Effects of Seed Traits and Common Ancestry of Hardwood Trees on Eastern Gray Squirrel Foraging Decisions

    PubMed Central

    Sundaram, Mekala; Willoughby, Janna R.; Lichti, Nathanael I.; Steele, Michael A.; Swihart, Robert K.

    2015-01-01

    The evolution of specific seed traits in scatter-hoarded tree species often has been attributed to granivore foraging behavior. However, the degree to which foraging investments and seed traits correlate with phylogenetic relationships among trees remains unexplored. We presented seeds of 23 different hardwood tree species (families Betulaceae, Fagaceae, Juglandaceae) to eastern gray squirrels (Sciurus carolinensis), and measured the time and distance travelled by squirrels that consumed or cached each seed. We estimated 11 physical and chemical seed traits for each species, and the phylogenetic relationships between the 23 hardwood trees. Variance partitioning revealed that considerable variation in foraging investment was attributable to seed traits alone (27–73%), and combined effects of seed traits and phylogeny of hardwood trees (5–55%). A phylogenetic PCA (pPCA) on seed traits and tree phylogeny resulted in 2 “global” axes of traits that were phylogenetically autocorrelated at the family and genus level and a third “local” axis in which traits were not phylogenetically autocorrelated. Collectively, these axes explained 30–76% of the variation in squirrel foraging investments. The first global pPCA axis, which produced large scores for seed species with thin shells, low lipid and high carbohydrate content, was negatively related to time to consume and cache seeds and travel distance to cache. The second global pPCA axis, which produced large scores for seeds with high protein, low tannin and low dormancy levels, was an important predictor of consumption time only. The local pPCA axis primarily reflected kernel mass. Although it explained only 12% of the variation in trait space and was not autocorrelated among phylogenetic clades, the local axis was related to all four squirrel foraging investments. Squirrel foraging behaviors are influenced by a combination of phylogenetically conserved and more evolutionarily labile seed traits that is

  9. Segregating the Effects of Seed Traits and Common Ancestry of Hardwood Trees on Eastern Gray Squirrel Foraging Decisions.

    PubMed

    Sundaram, Mekala; Willoughby, Janna R; Lichti, Nathanael I; Steele, Michael A; Swihart, Robert K

    2015-01-01

    The evolution of specific seed traits in scatter-hoarded tree species often has been attributed to granivore foraging behavior. However, the degree to which foraging investments and seed traits correlate with phylogenetic relationships among trees remains unexplored. We presented seeds of 23 different hardwood tree species (families Betulaceae, Fagaceae, Juglandaceae) to eastern gray squirrels (Sciurus carolinensis), and measured the time and distance travelled by squirrels that consumed or cached each seed. We estimated 11 physical and chemical seed traits for each species, and the phylogenetic relationships between the 23 hardwood trees. Variance partitioning revealed that considerable variation in foraging investment was attributable to seed traits alone (27-73%), and combined effects of seed traits and phylogeny of hardwood trees (5-55%). A phylogenetic PCA (pPCA) on seed traits and tree phylogeny resulted in 2 "global" axes of traits that were phylogenetically autocorrelated at the family and genus level and a third "local" axis in which traits were not phylogenetically autocorrelated. Collectively, these axes explained 30-76% of the variation in squirrel foraging investments. The first global pPCA axis, which produced large scores for seed species with thin shells, low lipid and high carbohydrate content, was negatively related to time to consume and cache seeds and travel distance to cache. The second global pPCA axis, which produced large scores for seeds with high protein, low tannin and low dormancy levels, was an important predictor of consumption time only. The local pPCA axis primarily reflected kernel mass. Although it explained only 12% of the variation in trait space and was not autocorrelated among phylogenetic clades, the local axis was related to all four squirrel foraging investments. Squirrel foraging behaviors are influenced by a combination of phylogenetically conserved and more evolutionarily labile seed traits that is consistent with a weak

  10. Decision-tree-model identification of nitrate pollution activities in groundwater: A combination of a dual isotope approach and chemical ions

    NASA Astrophysics Data System (ADS)

    Xue, Dongmei; Pang, Fengmei; Meng, Fanqiao; Wang, Zhongliang; Wu, Wenliang

    2015-09-01

    To develop management practices for agricultural crops to protect against NO₃⁻ contamination in groundwater, dominant pollution activities require reliable classification. In this study, we (1) classified potential NO₃⁻ pollution activities via an unsupervised learning algorithm based on δ¹⁵N- and δ¹⁸O-NO₃⁻ and physico-chemical properties of groundwater at 55 sampling locations; and (2) determined which water quality parameters could be used to identify the sources of NO₃⁻ contamination via a decision tree model. When a combination of δ¹⁵N-, δ¹⁸O-NO₃⁻ and physico-chemical properties of groundwater was used as an input for the k-means clustering algorithm, it allowed for a reliable clustering of the 55 sampling locations into 4 corresponding agricultural activities: well irrigated agriculture (28 sampling locations), sewage irrigated agriculture (16 sampling locations), a combination of sewage irrigated agriculture, farm and industry (5 sampling locations) and a combination of well irrigated agriculture and farm (6 sampling locations). A decision tree model with 97.5% classification success was developed based on SO₄²⁻ and Cl⁻ variables. The NO₃⁻ and the δ¹⁵N- and δ¹⁸O-NO₃⁻ variables demonstrated limitation in developing a decision tree model as multiple N sources and fractionation processes both resulted in difficulties of discriminating NO₃⁻ concentrations and isotopic values. Although only the SO₄²⁻ and Cl⁻ were selected as important discriminating variables, concentration data alone could not identify the specific NO₃⁻ sources responsible for groundwater contamination. This is a result of comprehensive analysis. To further reduce NO₃⁻ contamination, an integrated approach should be set up by combining N and O isotopes of NO₃⁻ with land-uses and physico-chemical properties, especially in areas with complex agricultural activities.