Science.gov

Sample records for classification tree approach

  1. The decision tree approach to classification

    NASA Technical Reports Server (NTRS)

    Wu, C.; Landgrebe, D. A.; Swain, P. H.

    1975-01-01

    A class of multistage decision tree classifiers is proposed and studied relative to the classification of multispectral remotely sensed data. The decision tree classifiers are shown to have the potential for improving both the classification accuracy and the computation efficiency. Dimensionality in pattern recognition is discussed and two theorems on the lower bound of logic computation for multiclass classification are derived. The automatic or optimization approach is emphasized. Experimental results on real data are reported, which clearly demonstrate the usefulness of decision tree classifiers.

  2. Learning classification trees

    NASA Technical Reports Server (NTRS)

    Buntine, Wray

    1991-01-01

    Algorithms for learning classification trees have had successes in artificial intelligence and statistics over many years. How a tree learning algorithm can be derived from Bayesian decision theory is outlined. This introduces Bayesian techniques for splitting, smoothing, and tree averaging. The splitting rule turns out to be similar to Quinlan's information gain splitting rule, while smoothing and averaging replace pruning. Comparative experiments with reimplementations of a minimum encoding approach, Quinlan's C4, and Breiman et al.'s CART show the full Bayesian algorithm is consistently as good as, or more accurate than, these other approaches, though at a computational price.
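    For readers unfamiliar with the information gain rule that the abstract compares against, the following is a minimal sketch of the textbook criterion (entropy before a split minus the weighted entropy after it). It is not the Bayesian splitting rule from the paper; the function names and the toy labels are illustrative assumptions.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a non-empty list of class labels."""
    total = len(labels)
    counts = Counter(labels).values()
    return -sum((n / total) * math.log2(n / total) for n in counts)


def information_gain(labels, branches):
    """Entropy of `labels` minus the size-weighted entropy of its `branches`.

    `branches` is a hypothetical partition of `labels` produced by some split.
    """
    total = len(labels)
    remainder = sum(len(b) / total * entropy(b) for b in branches)
    return entropy(labels) - remainder


# Toy data (assumed for illustration only).
labels = ["yes", "yes", "no", "no"]
perfect_split = [["yes", "yes"], ["no", "no"]]   # each branch is pure
useless_split = [["yes", "no"], ["yes", "no"]]   # branches mirror the parent

gain_perfect = information_gain(labels, perfect_split)
gain_useless = information_gain(labels, useless_split)
```

    A tree learner using this criterion would prefer the first split, since it removes all uncertainty about the class, while the second removes none.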

  3. Tree Classification Software

    NASA Technical Reports Server (NTRS)

    Buntine, Wray

    1993-01-01

    This paper introduces the IND Tree Package to prospective users. IND does supervised learning using classification trees. This learning task is a basic tool used in the development of diagnosis, monitoring and expert systems. The IND Tree Package was developed as part of a NASA project to semi-automate the development of data analysis and modelling algorithms using artificial intelligence techniques. The IND Tree Package integrates features from CART and C4 with newer Bayesian and minimum encoding methods for growing classification trees and graphs. The IND Tree Package also provides an experimental control suite on top. The newer features give improved probability estimates often required in diagnostic and screening tasks. The package comes with a manual, Unix 'man' entries, and a guide to tree methods and research. The IND Tree Package is implemented in C under Unix and was beta-tested at university and commercial research laboratories in the United States.

  4. A classification tree approach to the development of actuarial violence risk assessment tools.

    PubMed

    Steadman, H J; Silver, E; Monahan, J; Appelbaum, P S; Robbins, P C; Mulvey, E P; Grisso, T; Roth, L H; Banks, S

    2000-02-01

    Since the 1970s, a wide body of research has suggested that the accuracy of clinical risk assessments of violence might be increased if clinicians used actuarial tools. Despite considerable progress in recent years in the development of such tools for violence risk assessment, they remain primarily research instruments, largely ignored in daily clinical practice. We argue that because most existing actuarial tools are based on a main effects regression approach, they do not adequately reflect the contingent nature of the clinical assessment processes. To enhance the use of actuarial violence risk assessment tools, we propose a classification tree rather than a main effects regression approach. In addition, we suggest that by employing two decision thresholds for identifying high- and low-risk cases--instead of the standard single threshold--the use of actuarial tools to make dichotomous risk classification decisions may be further enhanced. These claims are supported with empirical data from the MacArthur Violence Risk Assessment Study.
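    The two-threshold idea in this abstract can be illustrated with a short sketch. The cutoff values and the function name below are hypothetical and are not taken from the MacArthur study; the sketch only shows how two cutoffs leave a middle band of cases unclassified instead of forcing every case into one of two groups.

```python
def classify_risk(score, low_cutoff, high_cutoff):
    """Label a predicted risk score using two decision thresholds.

    Scores at or below `low_cutoff` are "low", scores at or above
    `high_cutoff` are "high", and everything in between is left
    "unclassified". Both cutoffs are illustrative assumptions.
    """
    if low_cutoff >= high_cutoff:
        raise ValueError("low_cutoff must be smaller than high_cutoff")
    if score <= low_cutoff:
        return "low"
    if score >= high_cutoff:
        return "high"
    return "unclassified"


# Hypothetical predicted probabilities and cutoffs.
labels = [classify_risk(s, low_cutoff=0.10, high_cutoff=0.40)
          for s in (0.05, 0.25, 0.60)]
```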

  5. A novel approach to internal crown characterization for coniferous tree species classification

    NASA Astrophysics Data System (ADS)

    Harikumar, A.; Bovolo, F.; Bruzzone, L.

    2016-10-01

    Knowledge about individual trees in a forest is highly beneficial in forest management. High-density small-footprint multi-return airborne Light Detection and Ranging (LiDAR) data can provide very accurate information about the structural properties of individual trees in forests. Every tree species has a unique set of crown structural characteristics that can be used for tree species classification. In this paper, we use both the internal and external crown structural information of a conifer tree crown, derived from a high-density small-footprint multi-return LiDAR data acquisition, for species classification. Considering the fact that branches are the major building blocks of a conifer tree crown, we obtain the internal crown structural information using a branch-level analysis. The structure of each conifer branch is represented using clusters in the LiDAR point cloud. We propose the joint use of k-means clustering and geometric shape fitting, on the LiDAR data projected onto a novel 3-dimensional space, to identify branch clusters. After mapping the identified clusters back to the original space, six internal geometric features are estimated using a branch-level analysis. The external crown characteristics are modeled by using six least correlated features based on cone fitting and convex hull. Species classification is performed using a sparse Support Vector Machine (sparse SVM) classifier.

  6. Chronic subdural hematoma: Surgical management and outcome in 986 cases: A classification and regression tree approach

    PubMed Central

    Rovlias, Aristedis; Theodoropoulos, Spyridon; Papoutsakis, Dimitrios

    2015-01-01

    Background: Chronic subdural hematoma (CSDH) is one of the most common clinical entities in daily neurosurgical practice and carries a most favorable prognosis. However, because of the advanced age and medical problems of patients, surgical therapy is frequently associated with various complications. This study evaluated the clinical features, radiological findings, and neurological outcome in a large series of patients with CSDH. Methods: A classification and regression tree (CART) technique was employed in the analysis of data from 986 patients who were operated on at Asclepeion General Hospital of Athens from January 1986 to December 2011. Burr-hole evacuation with closed-system drainage has been the operative technique of first choice at our institution for 29 consecutive years. A total of 27 prognostic factors were examined to predict the outcome at 3 months postoperatively. Results: Our results indicated that neurological status on admission was the best predictor of outcome. With regard to the other data, age, brain atrophy, thickness and density of hematoma, subdural accumulation of air, and antiplatelet and anticoagulant therapy were found to correlate significantly with prognosis. The overall cross-validated predictive accuracy of the CART model was 85.34%, with a cross-validated relative error of 0.326. Conclusions: Methodologically, the CART technique is quite different from the more commonly used methods, with the primary benefit of illustrating the important prognostic variables as related to outcome. Since the ideal therapy for the treatment of CSDH is still under debate, this technique may prove useful in developing new therapeutic strategies and approaches for patients with CSDH. PMID:26257985

  7. Quantification of chemical peptide reactivity for screening contact allergens: a classification tree model approach.

    PubMed

    Gerberick, G Frank; Vassallo, Jeffrey D; Foertsch, Leslie M; Price, Brad B; Chaney, Joel G; Lepoittevin, Jean-Pierre

    2007-06-01

    In the interest of reducing animal use, in vitro alternatives for skin sensitization testing are under development. One unifying characteristic of chemical allergens is the requirement that they react with proteins for the effective induction of skin sensitization. The majority of chemical allergens are electrophilic and react with nucleophilic amino acids. To determine whether and to what extent reactivity correlates with skin sensitization potential, 82 chemicals comprising allergens of different potencies and nonallergenic chemicals were evaluated for their ability to react with reduced glutathione (GSH) or with two synthetic peptides containing either a single cysteine or lysine. Following a 15-min reaction time with GSH, or a 24-h reaction time with the two synthetic peptides, the samples were analyzed by high-performance liquid chromatography. UV detection was used to monitor the depletion of GSH or the peptides. The peptide reactivity data were compared with existing local lymph node assay data using recursive partitioning methodology to build a classification tree that allowed a ranking of reactivity as minimal, low, moderate, and high. Generally, nonallergens and weak allergens demonstrated minimal to low peptide reactivity, whereas moderate to extremely potent allergens displayed moderate to high peptide reactivity. Classifying minimal reactivity as nonsensitizers and low, moderate, and high reactivity as sensitizers, it was determined that a model based on cysteine and lysine gave a prediction accuracy of 89%. The results of these investigations reveal that measurement of peptide reactivity has considerable potential utility as a screening approach for skin sensitization testing, and thereby for reducing reliance on animal-based test methods.

  8. Applying an Ensemble Classification Tree Approach to the Prediction of Completion of a 12-Step Facilitation Intervention with Stimulant Abusers

    PubMed Central

    Doyle, Suzanne R.; Donovan, Dennis M.

    2014-01-01

    Aims: The purpose of this study was to explore the selection of predictor variables in the evaluation of drug treatment completion using an ensemble approach with classification trees. The basic methodology is reviewed and the subagging procedure of random subsampling is applied. Methods: Among 234 individuals with stimulant use disorders randomized to a 12-Step facilitative intervention shown to increase stimulant use abstinence, 67.52% were classified as treatment completers. A total of 122 baseline variables were used to identify factors associated with completion. Findings: The number of types of self-help activity involvement prior to treatment was the predominant predictor. Other effective predictors included better coping self-efficacy for substance use in high-risk situations, more days of prior meeting attendance, greater acceptance of the Disease model, higher confidence for not resuming use following discharge, lower ASI Drug and Alcohol composite scores, negative urine screens for cocaine or marijuana, and fewer employment problems. Conclusions: The application of an ensemble subsampling regression tree method utilizes the fact that classification trees are unstable but, on average, produce an improved prediction of the completion of drug abuse treatment. The results support the notion there are early indicators of treatment completion that may allow for modification of approaches more tailored to fitting the needs of individuals and potentially provide more successful treatment engagement and improved outcomes. PMID:25134038

  9. Risk Profiles for Weight Gain among Postmenopausal Women: A Classification and Regression Tree Analysis Approach

    PubMed Central

    Jung, Su Yon; Vitolins, Mara Z.; Fenton, Jenifer; Frazier-Wood, Alexis C.; Hursting, Stephen D.; Chang, Shine

    2015-01-01

    Purpose: Risk factors for obesity and weight gain are typically evaluated individually while “adjusting for” the influence of other confounding factors, and few studies, if any, have created risk profiles by clustering risk factors. We identified subgroups of postmenopausal women homogeneous in their clustered modifiable and non-modifiable risk factors for gaining ≥ 3% weight. Methods: This study included 612 postmenopausal women 50–79 years old, enrolled in an ancillary study of the Women's Health Initiative Observational Study between February 1995 and July 1998. Classification and regression tree and stepwise regression models were built and compared. Results: Of 27 selected variables, the factors significantly related to ≥ 3% weight gain were weight change in the past 2 years, age at menopause, dietary fiber, fat, alcohol intake, and smoking. In women younger than 65 years, less than 4 kg weight change in the past 2 years sufficiently reduced risk of ≥ 3% weight gain. Different combinations of risk factors related to weight gain were reported for subgroups of women: women 65 years or older (essential factor: < 9.8 g/day dietary factor), African Americans (essential factor: currently smoking), and white women (essential factor: ≥ 5 kg weight change for the past 2 years). Conclusions: Our findings suggest specific characteristics for particular subgroups of postmenopausal women that may be useful for identifying those at risk for weight gain. The study results may be useful for targeting efforts to promote strategies to reduce the risk of obesity and weight gain in subgroups of postmenopausal women and maximize the effect of weight control by decreasing obesity-relevant adverse health outcomes. PMID:25822239

  10. IHC and the WHO classification of lymphomas: cost effective immunohistochemistry using a deductive reasoning "decision tree" approach.

    PubMed

    Taylor, Clive R

    2009-10-01

    The 2008 World Health Organization Classification of Tumors of the Hematopoietic and Lymphoid Tissues defines current standards of practice for the diagnosis and classification of malignant lymphomas and related entities. More than 50 different types of lymphomas are described, combining fine morphologic criteria with immunohistochemical (IHC), and sometimes molecular, findings. Faced with such a broad range of different lymphomas, some encountered only rarely, and a rapidly growing, ever changing, armamentarium of approximately 80 pertinent IHC "stains", the challenge to the pathologist is to employ IHC in an efficient manner, to arrive at an assured diagnosis as rapidly as possible. This review uses deductive reasoning, after a decision tree or dendrogram model that relies upon recognition of basic morphologic patterns for efficient selection, use and interpretation of IHC markers to classify node-based malignancies by the World Health Organization schema. The review is divided into 2 parts, the first addressing those lymphomas that produce a follicular or nodular pattern of lymph nodal involvement; the second addressing diffuse proliferations in lymph nodes. It is accepted that only specialized centers are able to apply all of the technical resources and experience necessary for definitive diagnosis of unusual cases. Emphasis therefore is given to the more common lymphomas and the more commonly available IHC "stains", for a pragmatic and practical approach that is both broadly feasible and cost effective. By this method an assured diagnosis may be reached in the majority of nodal lymphomas, at the same time developing a sufficiency of data to recognize those rare or atypical cases that require referral to a specialized center.

  11. Mapping trees outside forests using high-resolution aerial imagery: a comparison of pixel- and object-based classification approaches.

    PubMed

    Meneguzzo, Dacia M; Liknes, Greg C; Nelson, Mark D

    2013-08-01

    Discrete trees and small groups of trees in nonforest settings are considered an essential resource around the world and are collectively referred to as trees outside forests (ToF). ToF provide important functions across the landscape, such as protecting soil and water resources, providing wildlife habitat, and improving farmstead energy efficiency and aesthetics. Despite the significance of ToF, forest and other natural resource inventory programs and geospatial land cover datasets that are available at a national scale do not include comprehensive information regarding ToF in the United States. Additional ground-based data collection and acquisition of specialized imagery to inventory these resources are expensive alternatives. As a potential solution, we identified two remote sensing-based approaches that use free high-resolution aerial imagery from the National Agriculture Imagery Program (NAIP) to map all tree cover in an agriculturally dominant landscape. We compared the results obtained using an unsupervised per-pixel classifier (independent component analysis [ICA]) and an object-based image analysis (OBIA) procedure in Steele County, Minnesota, USA. Three types of accuracy assessments were used to evaluate how each method performed in terms of: (1) producing a county-level estimate of total tree-covered area, (2) correctly locating tree cover on the ground, and (3) how tree cover patch metrics computed from the classified outputs compared to those delineated by a human photo interpreter. Both approaches were found to be viable for mapping tree cover over a broad spatial extent and could serve to supplement ground-based inventory data. The ICA approach produced an estimate of total tree cover more similar to the photo-interpreted result, but the output from the OBIA method was more realistic in terms of describing the actual observed spatial pattern of tree cover.

  12. DIF Trees: Using Classification Trees to Detect Differential Item Functioning

    ERIC Educational Resources Information Center

    Vaughn, Brandon K.; Wang, Qiu

    2010-01-01

    A nonparametric tree classification procedure is used to detect differential item functioning for items that are dichotomously scored. Classification trees are shown to be an alternative procedure to detect differential item functioning other than the use of traditional Mantel-Haenszel and logistic regression analysis. A nonparametric…

  13. Type I Error Control for Tree Classification

    PubMed Central

    Jung, Sin-Ho; Chen, Yong; Ahn, Hongshik

    2014-01-01

    Binary tree classification has been useful for classifying the whole population based on the levels of an outcome variable that is associated with chosen predictors. Often we start a classification with a large number of candidate predictors, and each predictor takes a number of different cutoff values. Because of these types of multiplicity, the binary tree classification method is subject to a severely inflated type I error probability. Nonetheless, there have not been many publications addressing this issue. In this paper, we propose a binary tree classification method that controls the probability of accepting a predictor below a certain level, say 5%. PMID:25452689

  14. Detecting hospital-acquired infections: A document classification approach using support vector machines and gradient tree boosting.

    PubMed

    Ehrentraut, Claudia; Ekholm, Markus; Tanushi, Hideyuki; Tiedemann, Jörg; Dalianis, Hercules

    2016-08-04

    Hospital-acquired infections pose a significant risk to patient health, while their surveillance is an additional workload for hospital staff. Our overall aim is to build a surveillance system that reliably detects all patient records that potentially include hospital-acquired infections. This is to reduce the burden of having the hospital staff manually check patient records. This study focuses on the application of text classification using support vector machines and gradient tree boosting to the problem. Support vector machines and gradient tree boosting have never been applied to the problem of detecting hospital-acquired infections in Swedish patient records, and according to our experiments, they lead to encouraging results. The best result is yielded by gradient tree boosting, at 93.7 percent recall, 79.7 percent precision and 85.7 percent F1 score when using stemming. We can show that simple preprocessing techniques and parameter tuning can lead to high recall (which we aim for in screening patient records) with appropriate precision for this task.

  15. Selecting Relevant Descriptors for Classification by Bayesian Estimates: A Comparison with Decision Trees and Support Vector Machines Approaches for Disparate Data Sets.

    PubMed

    Carbon-Mangels, Miriam; Hutter, Michael C

    2011-10-01

    Classification algorithms suffer from the curse of dimensionality, which leads to overfitting, particularly if the problem is over-determined. Therefore it is of particular interest to identify the most relevant descriptors to reduce the complexity. We applied Bayesian estimates to model the probability distribution of descriptor values used for binary classification using n-fold cross-validation. As a measure for the discriminative power of the classifiers, the symmetric form of the Kullback-Leibler divergence of their probability distributions was computed. We found that the most relevant descriptors possess a Gaussian-like distribution of their values, show the largest divergences, and therefore appear most often in the cross-validation scenario. The results were compared to those of the LASSO feature selection method applied to multiple decision trees and support vector machine approaches for data sets of substrates and nonsubstrates of three Cytochrome P450 isoenzymes, which comprise strongly unbalanced compound distributions. In contrast to decision trees and support vector machines, the performance of Bayesian estimates is less affected by unbalanced data sets. This strategy reveals those descriptors that allow a simple linear separation of the classes, whereas the superior accuracy of decision trees and support vector machines can be attributed to nonlinear separation, which is in turn more prone to overfitting.
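    The symmetric Kullback-Leibler divergence mentioned in this abstract can be sketched for the simplest case of two univariate Gaussian distributions, where a closed form exists. This is an assumption made for illustration (the abstract only says the relevant descriptors are Gaussian-like); the paper's actual estimator may differ, and the function names are hypothetical.

```python
import math


def kl_gaussian(mu_p, sigma_p, mu_q, sigma_q):
    """KL(p || q) for univariate Gaussians p = N(mu_p, sigma_p^2), q = N(mu_q, sigma_q^2)."""
    return (math.log(sigma_q / sigma_p)
            + (sigma_p ** 2 + (mu_p - mu_q) ** 2) / (2 * sigma_q ** 2)
            - 0.5)


def symmetric_kl(mu_p, sigma_p, mu_q, sigma_q):
    """Symmetric form: KL(p || q) + KL(q || p)."""
    return (kl_gaussian(mu_p, sigma_p, mu_q, sigma_q)
            + kl_gaussian(mu_q, sigma_q, mu_p, sigma_p))


# Hypothetical descriptor distributions for the two classes.
identical = symmetric_kl(0.0, 1.0, 0.0, 1.0)   # same distribution in both classes
shifted = symmetric_kl(0.0, 1.0, 1.0, 1.0)     # class means one standard deviation apart
```

    Under this reading, a descriptor whose class-conditional distributions coincide has zero divergence and no discriminative power, while a larger divergence indicates better separation of the two classes.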

  16. Geometric tree kernels: classification of COPD from airway tree geometry.

    PubMed

    Feragen, Aasa; Petersen, Jens; Grimm, Dominik; Dirksen, Asger; Pedersen, Jesper Holst; Borgwardt, Karsten; de Bruijne, Marleen

    2013-01-01

    Methodological contributions: This paper introduces a family of kernels for analyzing (anatomical) trees endowed with vector valued measurements made along the tree. While state-of-the-art graph and tree kernels use combinatorial tree/graph structure with discrete node and edge labels, the kernels presented in this paper can include geometric information such as branch shape, branch radius or other vector valued properties. In addition to being flexible in their ability to model different types of attributes, the presented kernels are computationally efficient and some of them can easily be computed for large datasets (N ~ 10,000) of trees with 30-600 branches. Combining the kernels with standard machine learning tools enables us to analyze the relation between disease and anatomical tree structure and geometry. Experimental results: The kernels are used to compare airway trees segmented from low-dose CT, endowed with branch shape descriptors and airway wall area percentage measurements made along the tree. Using kernelized hypothesis testing we show that the geometric airway trees are significantly differently distributed in patients with Chronic Obstructive Pulmonary Disease (COPD) than in healthy individuals. The geometric tree kernels also give a significant increase in the classification accuracy of COPD from geometric tree structure endowed with airway wall thickness measurements in comparison with state-of-the-art methods, giving further insight into the relationship between airway wall thickness and COPD. Software: Software for computing kernels and statistical tests is available at http://image.diku.dk/aasa/software.php.

  17. Classification tree method for bacterial source tracking with antibiotic resistance analysis data.

    PubMed

    Price, Bertram; Venso, Elichia A; Frana, Mark F; Greenberg, Joshua; Ware, Adam; Currey, Lee

    2006-05-01

    Various statistical classification methods, including discriminant analysis, logistic regression, and cluster analysis, have been used with antibiotic resistance analysis (ARA) data to construct models for bacterial source tracking (BST). We applied the statistical method known as classification trees to build a model for BST for the Anacostia Watershed in Maryland. Classification trees have more flexibility than other statistical classification approaches based on standard statistical methods to accommodate complex interactions among ARA variables. This article describes the use of classification trees for BST and includes discussion of its principal parameters and features. Anacostia Watershed ARA data are used to illustrate the application of classification trees, and we report the BST results for the watershed.

  18. The WHO classification of lymphomas: cost-effective immunohistochemistry using a deductive reasoning "decision tree" approach: part II: the decision tree approach: diffuse patterns of proliferation in lymph nodes.

    PubMed

    Taylor, Clive R

    2009-12-01

    The 2008 World Health Organization Classification of Tumors of the Haematopoietic and Lymphoid Tissues defines current standards of practice for the diagnosis and classification of malignant lymphomas and related entities. More than 50 different types of lymphomas are described. Faced with such a broad range of different lymphomas, some encountered only rarely, and a rapidly growing armamentarium of 80 or more pertinent immunohistochemical (IHC) "stains," the challenge to the pathologist is to use IHC in an efficient manner to arrive at an assured and timely diagnosis. This review uses deductive reasoning following a decision tree or dendrogram model, combining basic morphologic patterns and common IHC markers to classify node-based malignancies by the World Health Organization schema. The review is divided into 2 parts. The first, addressing those lymphomas that produce a follicular or nodular pattern of lymph nodal involvement, appeared in the previous issue of AIMM. The second part addresses diffuse proliferations in lymph nodes. Emphasis is given to the more common lymphomas and the more commonly available IHC "stains" for a pragmatic and practical approach that is both broadly feasible and cost-effective. By this method, an assured diagnosis may be reached in the majority of nodal lymphomas, at the same time developing a sufficiency of data to recognize those rare or atypical cases that require referral to a specialized center.

  19. Fast Image Texture Classification Using Decision Trees

    NASA Technical Reports Server (NTRS)

    Thompson, David R.

    2011-01-01

    Texture analysis would permit improved autonomous, onboard science data interpretation for adaptive navigation, sampling, and downlink decisions. These analyses would assist with terrain analysis and instrument placement in both macroscopic and microscopic image data products. Unfortunately, most state-of-the-art texture analysis demands computationally expensive convolutions of filters involving many floating-point operations. This makes them infeasible for radiation-hardened computers and spaceflight hardware. A new method approximates traditional texture classification of each image pixel with a fast decision-tree classifier. The classifier uses image features derived from simple filtering operations involving integer arithmetic. The texture analysis method is therefore amenable to implementation on FPGA (field-programmable gate array) hardware. Image features based on the "integral image" transform produce descriptive and efficient texture descriptors. Training the decision tree on a set of training data yields a classification scheme that produces reasonable approximations of optimal "texton" analysis at a fraction of the computational cost. A decision-tree learning algorithm employing the traditional k-means criterion of inter-cluster variance is used to learn tree structure from training data. The result is an efficient and accurate summary of surface morphology in images. This work is an evolutionary advance that unites several previous algorithms (k-means clustering, integral images, decision trees) and applies them to a new problem domain (morphology analysis for autonomous science during remote exploration). Advantages include order-of-magnitude improvements in runtime, feasibility for FPGA hardware, and significant improvements in texture classification accuracy.
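    The "integral image" transform named in this abstract is a standard technique that lets the sum over any rectangular pixel window be read off with four table lookups, using integer arithmetic only. The following is a minimal sketch of that general technique; it is not the flight software described in the record, and the function names and the 3x3 test image are illustrative assumptions.

```python
def integral_image(image):
    """Build a summed-area table with one extra row and column of zeros.

    table[r][c] holds the sum of image[0..r-1][0..c-1], so only integer
    additions are needed when the pixel values are integers.
    """
    rows, cols = len(image), len(image[0])
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0
        for c in range(cols):
            row_sum += image[r][c]
            table[r + 1][c + 1] = table[r][c + 1] + row_sum
    return table


def box_sum(table, top, left, bottom, right):
    """Sum of pixels in rows top..bottom-1 and columns left..right-1."""
    return (table[bottom][right] - table[top][right]
            - table[bottom][left] + table[top][left])


# Hypothetical 3x3 image of integer pixel values.
image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
table = integral_image(image)
whole = box_sum(table, 0, 0, 3, 3)         # every pixel
lower_right = box_sum(table, 1, 1, 3, 3)   # the 2x2 block 5, 6, 8, 9
```

    Because each window sum costs a constant number of additions regardless of window size, box-filter features of this kind are cheap enough to serve as inputs to a per-pixel decision tree.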

  20. Classification of posture and activities by using decision trees.

    PubMed

    Zhang, Ting; Tang, Wenlong; Sazonov, Edward S

    2012-01-01

    Obesity prevention and treatment as well as healthy lifestyle recommendation require the estimation of everyday physical activity. Monitoring posture allocations and activities with sensor systems is an effective method to achieve the goal. However, at present, most devices available rely on multiple sensors distributed on the body, which might be too obtrusive for everyday use. In this study, data was collected from a wearable shoe sensor system (SmartShoe) and a decision tree algorithm was applied for classification with high computational accuracy. The dataset was collected from 9 individual subjects performing 6 different activities--sitting, standing, walking, cycling, and stairs ascent/descent. Statistical features were calculated and classification with a decision tree classifier was performed, after which an advanced boosting algorithm was applied. The computational accuracy is as high as 98.85% without boosting, and 98.90% after boosting. Additionally, the simple tree structure provides a direct approach to simplify the feature set.

  1. Seasonal Effect on Tree Species Classification in an Urban Environment Using Hyperspectral Data, LiDAR, and an Object-Oriented Approach.

    PubMed

    Voss, Matthew; Sugumaran, Ramanathan

    2008-05-06

    The objective of the current study was to analyze the seasonal effect on differentiating tree species in an urban environment using multi-temporal hyperspectral data, Light Detection And Ranging (LiDAR) data, and a tree species database collected from the field. Two Airborne Imaging Spectrometer for Applications (AISA) hyperspectral images were collected, covering the Summer and Fall seasons. In order to make both datasets spatially and spectrally compatible, several preprocessing steps, including band reduction and a spatial degradation, were performed. An object-oriented classification was performed on both images using training data collected randomly from the tree species database. The seven dominant tree species (Gleditsia triacanthos, Acer saccharum, Tilia americana, Quercus palustris, Pinus strobus and Picea glauca) were used in the classification. The results from this analysis did not show any major difference in overall accuracy between the two seasons. Overall accuracy was approximately 57% for the Summer dataset and 56% for the Fall dataset. However, the Fall dataset provided more consistent results for all tree species while the Summer dataset had a few higher individual class accuracies. Further, adding LiDAR into the classification improved the results by 19% for both Fall and Summer. This is mainly due to the removal of shadow effect and the addition of elevation data to separate low and high vegetation.

  2. Voxel classification based airway tree segmentation

    NASA Astrophysics Data System (ADS)

    Lo, Pechin; de Bruijne, Marleen

    2008-03-01

    This paper presents a voxel classification based method for segmenting the human airway tree in volumetric computed tomography (CT) images. In contrast to standard methods that use only voxel intensities, our method uses a more complex appearance model based on a set of local image appearance features and k-nearest neighbor (KNN) classification. The optimal set of features for classification is selected automatically from a large set of features describing the local image structure at several scales. The use of multiple features enables the appearance model to differentiate between airway tree voxels and other voxels of similar intensities in the lung, thus making the segmentation robust to pathologies such as emphysema. The classifier is trained on imperfect segmentations that can easily be obtained using region growing with a manual threshold selection. Experiments show that the proposed method results in a more robust segmentation that can grow into the smaller airway branches without leaking into emphysematous areas, and is able to segment many branches that are not present in the training set.

  3. Semi-supervised SVM for individual tree crown species classification

    NASA Astrophysics Data System (ADS)

    Dalponte, Michele; Ene, Liviu Theodor; Marconcini, Mattia; Gobakken, Terje; Næsset, Erik

    2015-12-01

    In this paper a novel semi-supervised SVM classifier is presented, specifically developed for tree species classification at individual tree crown (ITC) level. In ITC tree species classification, all the pixels belonging to an ITC should have the same label. This assumption is used in the learning of the proposed semi-supervised SVM classifier (ITC-S3VM). This method exploits the information contained in the unlabeled ITC samples in order to improve the classification accuracy of a standard SVM. The ITC-S3VM method can be easily implemented using freely available software libraries. The datasets used in this study include hyperspectral imagery and laser scanning data acquired over two boreal forest areas characterized by the presence of three information classes (Pine, Spruce, and Broadleaves). The experimental results quantify the effectiveness of the proposed approach, which provides classification accuracies significantly higher (from 2% to above 27%) than those obtained by the standard supervised SVM and by a state-of-the-art semi-supervised SVM (S3VM). Particularly, by reducing the number of training samples (i.e. from 100% to 25%, and from 100% to 5% for the two datasets, respectively) the proposed method still exhibits results comparable to the ones of a supervised SVM trained with the full available training set. This property of the method makes it particularly suitable for practical forest inventory applications in which collection of in situ information can be very expensive both in terms of cost and time.

  4. Prediction of healthy blood with data mining classification by using Decision Tree, Naive Baysian and SVM approaches

    NASA Astrophysics Data System (ADS)

    Khalilinezhad, Mahdieh; Minaei, Behrooz; Vernazza, Gianni; Dellepiane, Silvana

    2015-03-01

    Data mining (DM) is the process of discovering knowledge from large databases. Applications of data mining in Blood Transfusion Organizations could be useful for improving the performance of blood donation services. The aim of this research is to predict the healthiness of blood donors in a Blood Transfusion Organization (BTO). For this goal, three well-known algorithms, Decision Tree C4.5, the Naïve Bayesian classifier, and the Support Vector Machine, were chosen and applied to a real database of 11006 donors. Seven fields such as sex, age, job, education, marital status, type of donor, results of blood tests (doctors' comments and lab results about healthy or unhealthy blood donors) were selected as input to these algorithms. The results of the three algorithms were compared and an error cost analysis was performed. According to the results obtained, the best algorithm, with low error cost and high accuracy, is SVM. This research helps the BTO build a model of blood donors in each area in order to predict whether donors' blood is healthy or unhealthy. It could be useful if used in parallel with laboratory tests to better separate unhealthy blood.

  5. Classification Tree Method for Bacterial Source Tracking with Antibiotic Resistance Analysis Data

    PubMed Central

    Price, Bertram; Venso, Elichia A.; Frana, Mark F.; Greenberg, Joshua; Ware, Adam; Currey, Lee

    2006-01-01

    Various statistical classification methods, including discriminant analysis, logistic regression, and cluster analysis, have been used with antibiotic resistance analysis (ARA) data to construct models for bacterial source tracking (BST). We applied the statistical method known as classification trees to build a model for BST for the Anacostia Watershed in Maryland. Classification trees have more flexibility than other statistical classification approaches based on standard statistical methods to accommodate complex interactions among ARA variables. This article describes the use of classification trees for BST and includes discussion of its principal parameters and features. Anacostia Watershed ARA data are used to illustrate the application of classification trees, and we report the BST results for the watershed. PMID:16672492

  6. Evaluating multimedia chemical persistence: Classification and regression tree analysis

    SciTech Connect

    Bennett, D.H.; McKone, T.E.; Kastenberg, W.E.

    2000-04-01

    For the thousands of chemicals continuously released into the environment, it is desirable to make prospective assessments of those likely to be persistent. Widely distributed persistent chemicals are impossible to remove from the environment and remediation by natural processes may take decades, which is problematic if adverse health or ecological effects are discovered after prolonged release into the environment. A tiered approach using a classification scheme and a multimedia model for determining persistence is presented. Using specific criteria for persistence, a classification tree is developed to classify a chemical as persistent or nonpersistent based on the chemical properties. In this approach, the classification is derived from the results of a standardized unit world multimedia model. Thus, the classifications are more robust for multimedia pollutants than classifications using a single medium half-life. The method can be readily implemented and provides insight without requiring extensive and often unavailable data. This method can be used to classify chemicals when only a few properties are known and can be used to direct further data collection. Case studies are presented to demonstrate the advantages of the approach.

  7. Tree classification with fused mobile laser scanning and hyperspectral data.

    PubMed

    Puttonen, Eetu; Jaakkola, Anttoni; Litkey, Paula; Hyyppä, Juha

    2011-01-01

    Mobile Laser Scanning data were collected simultaneously with hyperspectral data using the Finnish Geodetic Institute Sensei system. The data were tested for tree species classification. The test area was an urban garden in the City of Espoo, Finland. Point clouds representing 168 individual tree specimens of 23 tree species were determined manually. The classification of the trees was done using first only the spatial data from point clouds, then with only the spectral data obtained with a spectrometer, and finally with the combined spatial and hyperspectral data from both sensors. Two classification tests were performed: the separation of coniferous and deciduous trees, and the identification of individual tree species. All determined tree specimens were used in distinguishing coniferous and deciduous trees. A subset of 133 trees and 10 tree species was used in the tree species classification. The best classification results for the fused data were 95.8% for the separation of the coniferous and deciduous classes. The best overall tree species classification succeeded with 83.5% accuracy for the best tested fused data feature combination. The respective results for paired structural features derived from the laser point cloud were 90.5% for the separation of the coniferous and deciduous classes and 65.4% for the species classification. Classification accuracies with paired hyperspectral reflectance value data were 90.5% for the separation of coniferous and deciduous classes and 62.4% for different species. The results are among the first of their kind and they show that mobile collected fused data outperformed single-sensor data in both classification tests and by a significant margin.

  8. Sensitivity of missing values in classification tree for large sample

    NASA Astrophysics Data System (ADS)

    Hasan, Norsida; Adam, Mohd Bakri; Mustapha, Norwati; Abu Bakar, Mohd Rizam

    2012-05-01

    Missing values in either predictor or response variables are a very common problem in statistics and data mining. Cases with missing values are often ignored, which results in loss of information and possible bias. The objective of our research was to investigate the sensitivity of a classification tree model to missing data for large samples. Data were obtained from one of the high-level educational institutions in Malaysia. Students' background data were randomly eliminated and a classification tree was used to predict students' degree classification. The results showed that for large samples, the structure of the classification tree was sensitive to missing values, especially for samples containing more than ten percent missing values.

  9. A Mixtures-of-Trees Framework for Multi-Label Classification

    PubMed Central

    Hong, Charmgil; Batal, Iyad; Hauskrecht, Milos

    2015-01-01

    We propose a new probabilistic approach for multi-label classification that aims to represent the class posterior distribution P(Y|X). Our approach uses a mixture of tree-structured Bayesian networks, which can leverage the computational advantages of conditional tree-structured models and the abilities of mixtures to compensate for tree-structured restrictions. We develop algorithms for learning the model from data and for performing multi-label predictions using the learned model. Experiments on multiple datasets demonstrate that our approach outperforms several state-of-the-art multi-label classification methods. PMID:25927011

  10. Watershed Merge Tree Classification for Electron Microscopy Image Segmentation

    SciTech Connect

    Liu, Ting; Jurrus, Elizabeth R.; Seyedhosseini, Mojtaba; Ellisman, Mark; Tasdizen, Tolga

    2012-11-11

    Automated segmentation of electron microscopy (EM) images is a challenging problem. In this paper, we present a novel method that utilizes a hierarchical structure and boundary classification for 2D neuron segmentation. With a membrane detection probability map, a watershed merge tree is built for the representation of hierarchical region merging from the watershed algorithm. A boundary classifier is learned with non-local image features to predict each potential merge in the tree, upon which merge decisions are made with consistency constraints in the sense of optimization to acquire the final segmentation. Independent of classifiers and decision strategies, our approach proposes a general framework for efficient hierarchical segmentation with statistical learning. We demonstrate that our method leads to a substantial improvement in segmentation accuracy.

  11. Classification of Liss IV Imagery Using Decision Tree Methods

    NASA Astrophysics Data System (ADS)

    Verma, Amit Kumar; Garg, P. K.; Prasad, K. S. Hari; Dadhwal, V. K.

    2016-06-01

    Image classification is a compulsory step in any remote sensing research. Classification uses the spectral information represented by the digital numbers in one or more spectral bands and attempts to classify each individual pixel based on this spectral information. Crop classification is the main concern of remote sensing applications for developing sustainable agriculture systems. Vegetation indices computed from satellite images give a good indication of the presence of vegetation; they describe the greenness, density and health of vegetation. Texture is also an important characteristic used to identify objects or regions of interest in an image. This paper illustrates the use of the decision tree method to classify land into crop land and non-crop land and to classify different crops. We evaluate the possibility of crop classification using an integrated approach based on texture properties with different vegetation indices for single-date LISS IV sensor data at 5.8 meter spatial resolution. In the first approach, eleven vegetation indices (NDVI, DVI, GEMI, GNDVI, MSAVI2, NDWI, NG, NR, NNIR, OSAVI and VI green) were generated using the green, red and NIR bands, and the image was then classified using the decision tree method. The other approach integrates texture features (mean, variance, kurtosis and skewness) with these vegetation indices. The two methods are compared. The results indicate that including textural features with vegetation indices can produce classified maps with 8.33% higher accuracy for Indian satellite IRS-P6, LISS IV sensor images.
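
As an illustration of the band arithmetic behind three of the indices listed above, the sketch below computes NDVI, GNDVI and DVI from their standard definitions for a single pixel. The reflectance values and the 0.3 NDVI cut-off in `classify_pixel` are hypothetical; the abstract does not report the thresholds learned by the decision tree.

```python
def _ratio(a, b):
    """Normalized difference (a - b) / (a + b), with 0.0 when a + b is zero."""
    total = a + b
    return (a - b) / total if total else 0.0

def ndvi(nir, red):
    """Normalized Difference Vegetation Index."""
    return _ratio(nir, red)

def gndvi(nir, green):
    """Green Normalized Difference Vegetation Index."""
    return _ratio(nir, green)

def dvi(nir, red):
    """Difference Vegetation Index."""
    return nir - red

def classify_pixel(green, red, nir, ndvi_threshold=0.3):
    """One-node decision rule; the threshold is an illustrative assumption."""
    return "crop" if ndvi(nir, red) > ndvi_threshold else "non-crop"

# Hypothetical reflectances for a vegetated and a bare-soil pixel.
vegetated = classify_pixel(green=0.10, red=0.10, nir=0.50)
bare_soil = classify_pixel(green=0.20, red=0.30, nir=0.35)
```

A full decision tree would chain several such tests over different indices and, in the second approach, over texture features as well.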

  12. Trees of trees: an approach to comparing multiple alternative phylogenies.

    PubMed

    Nye, Tom M W

    2008-10-01

    Phylogenetic analysis very commonly produces several alternative trees for a given fixed set of taxa. For example, different sets of orthologous genes may be analyzed, or the analysis may sample from a distribution of probable trees. This article describes an approach to comparing and visualizing multiple alternative phylogenies via the idea of a "tree of trees" or "meta-tree." A meta-tree clusters phylogenies with similar topologies together in the same way that a phylogeny clusters species with similar DNA sequences. Leaf nodes on a meta-tree correspond to the original set of phylogenies given by some analysis, whereas interior nodes correspond to certain consensus topologies. The construction of meta-trees is motivated by analogy with construction of a most parsimonious tree for DNA data, but instead of using DNA letters, in a meta-tree the characters are partitions or splits of the set of taxa. An efficient algorithm for meta-tree construction is described that makes use of a known relationship between the majority consensus and parsimony in terms of gain and loss of splits. To illustrate these ideas meta-trees are constructed for two datasets: a set of gene trees for species of yeast and trees from a bootstrap analysis of a set of gene trees in ray-finned fish. A software tool for constructing meta-trees and comparing alternative phylogenies is available online, and the source code can be obtained from the author.

  13. Binary tree of posterior probability support vector machines for hyperspectral image classification

    NASA Astrophysics Data System (ADS)

    Wang, Dongli; Zhou, Yan; Zheng, Jianguo

    2011-01-01

    The problem of hyperspectral remote sensing image classification is revisited using posterior probability support vector machines (PPSVMs). To address the multiclass classification problem, PPSVMs are extended using a binary tree structure and boosting, with the Fisher ratio as the class separability measure. The class pair with the larger Fisher ratio separability measure is separated at upper nodes of the binary tree to optimize the structure of the tree and improve the classification accuracy. Two approaches are proposed to select the class pair and construct the binary tree. One is the so-called some-against-rest binary tree of PPSVMs (SBT), in which some classes are separated from the remaining classes at each node considering the Fisher ratio separability measure. In the other approach, named one-against-rest binary tree of PPSVMs (OBT), only one class is separated from the remaining classes at each node. Both approaches need to train only n - 1 binary PPSVM classifiers (n is the number of classes), while the average convergence performance of SBT and OBT is O(log2 n) and O[(n! - 1)/n], respectively. Experimental results show that both approaches obtain classification accuracy that is, if not higher, at least comparable to that of other multiclass approaches, while using significantly fewer support vectors and reduced testing time.
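
The one-against-rest ordering can be sketched as follows, under simplifying assumptions that are not taken from the paper: a single scalar feature per sample, the two-class Fisher ratio (difference of means squared over the sum of variances), and "rest" taken as the pooled samples of all remaining classes. The class names and values are hypothetical, and no SVM is trained here; only the node order is derived.

```python
from statistics import mean, pvariance

def fisher_ratio(a, b):
    """Two-class Fisher ratio for a scalar feature (assumes nonzero variance)."""
    return (mean(a) - mean(b)) ** 2 / (pvariance(a) + pvariance(b))

def one_against_rest_order(samples):
    """Return the classes split off at successive nodes, plus the last class."""
    remaining = dict(samples)
    order = []
    while len(remaining) > 1:
        def score(name):
            rest = [x for other, xs in remaining.items()
                    if other != name for x in xs]
            return fisher_ratio(remaining[name], rest)
        best = max(remaining, key=score)  # most separable class goes first
        order.append(best)
        del remaining[best]
    return order, next(iter(remaining))

# Hypothetical scalar feature values for three classes.
samples = {
    "water": [0.10, 0.12, 0.09],
    "soil": [0.50, 0.55, 0.52],
    "grass": [0.60, 0.62, 0.58],
}
order, last = one_against_rest_order(samples)
# `order` has n - 1 entries, one per binary classifier in the tree;
# "water" is split off first because it is far from the other two classes.
```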

  14. Classification Based on Tree-Structured Allocation Rules

    ERIC Educational Resources Information Center

    Vaughn, Brandon K.; Wang, Qui

    2008-01-01

    The authors consider the problem of classifying an unknown observation into 1 of several populations by using tree-structured allocation rules. Although many parametric classification procedures are robust to certain assumption violations, there is need for classification procedures that can be used regardless of the group-conditional…

  15. Urban Tree Classification Using Full-Waveform Airborne Laser Scanning

    NASA Astrophysics Data System (ADS)

    Koma, Zs.; Koenig, K.; Höfle, B.

    2016-06-01

    Vegetation mapping in urban environments plays an important role in biological research and urban management. Airborne laser scanning provides detailed 3D geodata, which allows single trees to be classified into different taxa. Until now, research dealing with tree classification has focused on forest environments. This study investigates the object-based classification of urban trees at taxonomic family level, using full-waveform airborne laser scanning data captured in the city centre of Vienna (Austria). The data set is characterised by a variety of taxa, including deciduous trees (beeches, mallows, plane trees and soapberries) and the coniferous pine species. A workflow for tree object classification is presented using geometric and radiometric features. The derived features are related to point density, crown shape and radiometric characteristics. For the derivation of crown features, a prior detection of the crown base is performed. The effects of interfering objects (e.g. fences and cars which are typical in urban areas) on the feature characteristics and the subsequent classification accuracy are investigated. The applicability of the features is evaluated by Random Forest classification and exploratory analysis. The most reliable classification is achieved by using the combination of geometric and radiometric features, resulting in 87.5% overall accuracy. By using radiometric features only, a reliable classification with accuracy of 86.3% can be achieved. The influence of interfering objects on feature characteristics is identified, in particular for the radiometric features. The results indicate the potential of using radiometric features in urban tree classification and show its limitations due to anthropogenic influences at the same time.

  16. Decision tree methods: applications for classification and prediction

    PubMed Central

    SONG, Yan-yan; LU, Ying

    2015-01-01

    Summary Decision tree methodology is a commonly used data mining method for establishing classification systems based on multiple covariates or for developing prediction algorithms for a target variable. This method classifies a population into branch-like segments that construct an inverted tree with a root node, internal nodes, and leaf nodes. The algorithm is non-parametric and can efficiently deal with large, complicated datasets without imposing a complicated parametric structure. When the sample size is large enough, study data can be divided into training and validation datasets: the training dataset is used to build a decision tree model, and the validation dataset is used to decide on the appropriate tree size needed to achieve the optimal final model. This paper introduces frequently used algorithms for developing decision trees (including CART, C4.5, CHAID, and QUEST) and describes the SPSS and SAS programs that can be used to visualize tree structure. PMID:26120265
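
The training/validation procedure described above can be sketched as below. This is a toy tree grower using Gini impurity and threshold splits, written for illustration only; it is not the CART, C4.5, CHAID or QUEST implementation the paper discusses, and the data are hypothetical. Tree size is controlled by a maximum depth chosen on the validation set.

```python
from collections import Counter

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def majority(labels):
    return Counter(labels).most_common(1)[0][0]

def build(rows, labels, depth):
    """Return a leaf label, or (feature, threshold, left, right) for a node."""
    if depth == 0 or len(set(labels)) == 1:
        return majority(labels)
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [i for i, r in enumerate(rows) if r[f] <= t]
            right = [i for i, r in enumerate(rows) if r[f] > t]
            if not left or not right:
                continue
            # Size-weighted Gini impurity of the two child nodes.
            score = (len(left) * gini([labels[i] for i in left]) +
                     len(right) * gini([labels[i] for i in right])) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:
        return majority(labels)
    _, f, t, left, right = best
    return (f, t,
            build([rows[i] for i in left], [labels[i] for i in left], depth - 1),
            build([rows[i] for i in right], [labels[i] for i in right], depth - 1))

def predict(node, row):
    while isinstance(node, tuple):  # descend from the root to a leaf
        f, t, low, high = node
        node = low if row[f] <= t else high
    return node

def accuracy(tree, rows, labels):
    return sum(predict(tree, r) == y for r, y in zip(rows, labels)) / len(rows)

# Hypothetical training and validation data with two covariates.
train_X = [(1, 5), (2, 1), (3, 4), (4, 2), (6, 3), (7, 5), (8, 1), (9, 4)]
train_y = ["no"] * 4 + ["yes"] * 4
val_X = [(2, 3), (3, 1), (7, 2), (8, 4)]
val_y = ["no", "no", "yes", "yes"]

# Pick the smallest depth that reaches the best validation accuracy.
best_depth = max(range(4),
                 key=lambda d: accuracy(build(train_X, train_y, d), val_X, val_y))
final_tree = build(train_X, train_y, best_depth)
```

Because `max` returns the first maximizer, ties between depths are resolved in favor of the smaller tree.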

  17. Comparing Hydrogeomorphic Approaches to Lake Classification

    NASA Astrophysics Data System (ADS)

    Martin, Sherry L.; Soranno, Patricia A.; Bremigan, Mary T.; Cheruvelil, Kendra S.

    2011-11-01

    A classification system is often used to reduce the number of different ecosystem types that governmental agencies are charged with monitoring and managing. We compare the ability of several different hydrogeomorphic (HGM)—based classifications to group lakes for water chemistry/clarity. We ask: (1) Which approach to lake classification is most successful at classifying lakes for similar water chemistry/clarity? (2) Which HGM features are most strongly related to the lake classes? and, (3) Can a single classification successfully classify lakes for all of the water chemistry/clarity variables examined? We use univariate and multivariate classification and regression tree (CART and MvCART) analysis of HGM features to classify alkalinity, water color, Secchi, total nitrogen, total phosphorus, and chlorophyll a from 151 minimally disturbed lakes in Michigan USA. We developed two MvCART models overall and two CART models for each water chemistry/clarity variable, in each case comparing: local HGM characteristics alone and local HGM characteristics combined with regionalizations and landscape position. The combined CART models had the highest strength of evidence (ωi range 0.92-1.00) and maximized within class homogeneity (ICC range 36-66%) for all water chemistry/clarity variables except water color and chlorophyll a. Because the most successful single classification was on average 20% less successful in classifying other water chemistry/clarity variables, we found that no single classification captures variability for all lake responses tested. Therefore, we suggest that the most successful classification (1) is specific to individual response variables, and (2) incorporates information from multiple spatial scales (regionalization and local HGM variables).

  18. Accelerating protein classification using suffix trees.

    PubMed

    Dorohonceanu, B; Nevill-Manning, C G

    2000-01-01

    Position-specific scoring matrices have been used extensively to recognize highly conserved protein regions. We present a method for accelerating these searches using a suffix tree data structure computed from the sequences to be searched. Building on earlier work that allows evaluation of a scoring matrix to be stopped early, the suffix tree-based method excludes many protein segments from consideration at once by pruning entire subtrees. Although suffix trees are usually expensive in space, the fact that scoring matrix evaluation requires an in-order traversal allows nodes to be stored more compactly without loss of speed, and our implementation requires only 17 bytes of primary memory per input symbol. Searches are accelerated by up to a factor of ten.

  19. Automated Method for Identification and Artery-Venous Classification of Vessel Trees in Retinal Vessel Networks

    PubMed Central

    Joshi, Vinayak S.; Reinhardt, Joseph M.; Garvin, Mona K.; Abramoff, Michael D.

    2014-01-01

    The separation of the retinal vessel network into distinct arterial and venous vessel trees is of high interest. We propose an automated method for identification and separation of retinal vessel trees in a retinal color image by converting a vessel segmentation image into a vessel segment map and identifying the individual vessel trees by graph search. Orientation, width, and intensity of each vessel segment are utilized to find the optimal graph of vessel segments. The separated vessel trees are labeled as primary vessel or branches. We utilize the separated vessel trees for arterial-venous (AV) classification, based on the color properties of the vessels in each tree graph. We applied our approach to a dataset of 50 fundus images from 50 subjects. The proposed method resulted in an accuracy of 91.44% correctly classified vessel pixels as either artery or vein. The accuracy of correctly classified major vessel segments was 96.42%. PMID:24533066

  20. Combining QuickBird, LiDAR, and GIS topography indices to identify a single native tree species in a complex landscape using an object-based classification approach

    NASA Astrophysics Data System (ADS)

    Pham, Lien T. H.; Brabyn, Lars; Ashraf, Salman

    2016-08-01

    There is now a wide range of techniques that can be combined for image analysis. These include the use of object-based classifications rather than pixel-based classifiers, the use of LiDAR to determine vegetation height and vertical structure, as well as terrain variables such as topographic wetness index and slope that can be calculated using GIS. This research investigates the benefits of combining these techniques to identify individual tree species. A QuickBird image and low point density LiDAR data for a coastal region in New Zealand were used to examine the possibility of mapping Pohutukawa trees, which are regarded as an iconic tree in New Zealand. The study area included a mix of buildings and vegetation types. After image and LiDAR preparation, single tree objects were identified using a range of techniques including: a threshold of above ground height to eliminate ground based objects; Normalised Difference Vegetation Index and elevation difference between the first and last return of LiDAR data to distinguish vegetation from buildings; geometric information to separate clusters of trees from single trees; and treetop identification and region growing techniques to separate tree clusters into single tree crowns. Important feature variables were identified using Random Forest, and the Support Vector Machine provided the classification. The combined techniques using LiDAR and spectral data produced an overall accuracy of 85.4% (Kappa 80.6%). Classification using just the spectral data produced an overall accuracy of 75.8% (Kappa 67.8%). The research findings demonstrate how combining LiDAR and spectral data improves classification for Pohutukawa trees.

  1. Multiple Spectral-Spatial Classification Approach for Hyperspectral Data

    NASA Technical Reports Server (NTRS)

    Tarabalka, Yuliya; Benediktsson, Jon Atli; Chanussot, Jocelyn; Tilton, James C.

    2010-01-01

    A new multiple classifier approach for spectral-spatial classification of hyperspectral images is proposed. Several classifiers are used independently to classify an image. For every pixel, if all the classifiers have assigned this pixel to the same class, the pixel is kept as a marker, i.e., a seed of the spatial region, with the corresponding class label. We propose to use spectral-spatial classifiers at the preliminary step of the marker selection procedure, each of them combining the results of a pixel-wise classification and a segmentation map. Different segmentation methods based on dissimilar principles lead to different classification results. Furthermore, a minimum spanning forest is built, where each tree is rooted on a classification-driven marker and forms a region in the spectral-spatial classification map. Experimental results are presented for two hyperspectral airborne images. The proposed method significantly improves classification accuracies, when compared to previously proposed classification techniques.
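
The marker selection rule stated above (keep a pixel only when every classifier assigns it the same class) can be sketched as follows. The three small label maps are hypothetical, and the later minimum spanning forest step is not shown.

```python
def select_markers(label_maps):
    """Return a map holding the agreed class label, or None where classifiers differ.

    label_maps: list of equally sized 2D lists of class labels, one per classifier.
    """
    rows, cols = len(label_maps[0]), len(label_maps[0][0])
    markers = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            votes = {m[r][c] for m in label_maps}
            if len(votes) == 1:          # unanimous: keep the pixel as a marker
                markers[r][c] = votes.pop()
    return markers

# Hypothetical outputs of three classifiers on a 2 x 3 image.
map_a = [[1, 1, 2], [3, 3, 2]]
map_b = [[1, 2, 2], [3, 3, 2]]
map_c = [[1, 1, 2], [3, 1, 2]]

markers = select_markers([map_a, map_b, map_c])
# markers == [[1, None, 2], [3, None, 2]]
```

Pixels left as `None` are the ones that would later be assigned to a region grown from one of the markers.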

  2. Identifying fallers among ophthalmic patients using classification tree methodology

    PubMed Central

    Chirico, Franco; Pecchia, Leandro; Rossi, Settimio; Testa, Francesco; Simonelli, Francesca

    2017-01-01

    Purpose To develop and validate a tool aiming to support ophthalmologists in identifying, during routine ophthalmologic visits, patients at higher risk of falling in the following year. Methods A group of 141 subjects (age: 73.2 ± 11.4 years), recruited at our Eye Clinic, underwent a baseline ophthalmic examination and a standardized questionnaire, including lifestyles, general health, social engagement and eyesight problems. Moreover, visual disability was assessed by the Activity of Daily Vision Scale (ADVS). The subjects were followed up for 12 months in order to record prospective falls. A subject who reported at least one fall within one year from the baseline assessment was considered as faller, otherwise as non-faller. Different tree-based algorithms (i.e., C4.5, AdaBoost and Random Forests) were used to develop automatic classifiers and their performances were evaluated by the cross-validation approach. Results Over the follow-up, 25 falls were referred by 13 patients. The logistic regression analysis showed the following variables as significant predictors of prospective falls: pseudophakia and use of prescribed eyeglasses as protective factors, recent worsening of visual acuity as risk factor. Random Forest ranked best corrected visual acuity, number of sleeping hours and job type as the most important features. Finally, AdaBoost enabled the identification of subjects at higher risk of falling in the following 12 months with a sensitivity rate of 69.2% and a specificity rate of 76.6%. Conclusions The current study proposes a novel method, based on classification trees applied to self-reported factors and health information assessed by a standardized questionnaire during ophthalmological visits, to identify ophthalmic patients at higher risk of falling in the following 12 months. 
The findings of the current study pave the way to the validation of the proposed novel tool for fall risk screening on a larger cohort of patients with visual impairment referred

  3. Object-based methods for individual tree identification and tree species classification from high-spatial resolution imagery

    NASA Astrophysics Data System (ADS)

    Wang, Le

    2003-10-01

    Modern forest management poses an increasing need for detailed knowledge of forest information at different spatial scales. At the forest level, the information for tree species assemblage is desired whereas at or below the stand level, individual tree related information is preferred. Remote Sensing provides an effective tool to extract the above information at multiple spatial scales in the continuous time domain. To date, the increasing volume and ready availability of high-spatial-resolution data have led to a much wider application of remotely sensed products. Nevertheless, to make effective use of the improving spatial resolution, conventional pixel-based classification methods are far from satisfactory. Correspondingly, developing object-based methods becomes a central challenge for researchers in the field of Remote Sensing. This thesis focuses on the development of methods for accurate individual tree identification and tree species classification. We develop a method in which individual tree crown boundaries and treetop locations are derived under a unified framework. We apply a two-stage approach with edge detection followed by marker-controlled watershed segmentation. Treetops are modeled from radiometry and geometry aspects. Specifically, treetops are assumed to be represented by local radiation maxima and to be located near the center of the tree-crown. As a result, a marker image was created from the derived treetop to guide a watershed segmentation to further differentiate overlapping trees and to produce a segmented image comprised of individual tree crowns. The image segmentation method developed achieves a promising result for a 256 x 256 CASI image. Then further effort is made to extend our methods to the multiscales which are constructed from a wavelet decomposition. A scale consistency and geometric consistency are designed to examine the gradients along the scale-space for the purpose of separating true crown boundary from unwanted

  4. Flotation classification of ultrafine particles -- A novel classification approach

    SciTech Connect

    Qiu Guanzhou; Luo Lin; Hu Yuehua; Xu Jin; Wang Dianzuo

    1995-12-31

    This paper introduces a novel classification approach named the flotation classification approach which works by controlling interactions between particles. It differs considerably from the conventional classification processes operating on mechanical forces. In the present test, the micro-bubble flotation technology is grafted onto hydro-classification. Selective aggregation and dispersion of ultrafine particles are achieved through governing the interactions in the classification process. A series of laboratory classification tests for -44 µm kaolin have been conducted on a classification column. As a result, about 92% recovery for the minus 2 µm size fraction kaolin in the final product is obtained. In addition, two criteria for the classification are set up. Finally, a principle of classifying and controlling the interactions between particles is discussed in terms of surface thermodynamics and hydrodynamics.

  5. A Representation and Classification Scheme for Tree-Like Structures in Medical Images: Analyzing the Branching Pattern of Ductal Trees in X-ray Galactograms

    PubMed Central

    Megalooikonomou, Vasileios; Barnathan, Michael; Kontos, Despina; Bakic, Predrag R.; Maidment, Andrew D. A.

    2012-01-01

    We propose a multistep approach for representing and classifying tree-like structures in medical images. Tree-like structures are frequently encountered in biomedical contexts; examples are the bronchial system, the vascular topology, and the breast ductal network. We use tree encoding techniques, such as the depth-first string encoding and the Prüfer encoding, to obtain a symbolic string representation of the tree's branching topology; the problem of classifying trees is then reduced to string classification. We use the tf-idf text mining technique to assign a weight of significance to each string term (i.e., tree node label). Similarity searches and k-nearest neighbor classification of the trees are performed using the tf-idf weight vectors and the cosine similarity metric. We applied our approach to characterize the ductal tree-like parenchymal structure in X-ray galactograms, in order to distinguish among different radiological findings. Experimental results demonstrate the effectiveness of the proposed approach with classification accuracy reaching up to 86%, and also indicate that our method can potentially aid in providing insight into the relationship between branching patterns and function or pathology. PMID:19272984
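
    The pipeline in this abstract (tree encoding, tf-idf weighting, cosine similarity, k-nearest neighbor voting) can be sketched in a few lines. This is only an illustrative sketch, not the authors' implementation: the function names, the integer node labels 1..n, the particular tf-idf variant (relative term frequency times log inverse document frequency), and the toy trees are all assumptions made here.

```python
import math
from collections import Counter


def prufer_sequence(edges, n):
    """Prüfer sequence of a labeled tree on nodes 1..n, given as an edge list."""
    adj = {v: set() for v in range(1, n + 1)}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seq = []
    for _ in range(n - 2):
        # Repeatedly remove the smallest-labeled leaf and record its neighbor.
        leaf = min(v for v in adj if len(adj[v]) == 1)
        neighbor = next(iter(adj[leaf]))
        seq.append(neighbor)
        adj[neighbor].discard(leaf)
        del adj[leaf]
    return seq


def tfidf_vectors(docs):
    """Sparse tf-idf vectors (dicts) for a list of label sequences."""
    n_docs = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        length = len(doc) or 1  # guard against empty sequences
        vectors.append(
            {t: (c / length) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is all-zero."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def knn_predict(query, train_vecs, train_labels, k=1):
    """Majority label among the k training vectors most similar to the query."""
    ranked = sorted(
        range(len(train_vecs)),
        key=lambda i: cosine(query, train_vecs[i]),
        reverse=True,
    )
    votes = Counter(train_labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]
```

    In this sketch the tf-idf weights are computed over the training and query sequences together, so a query tree has to be encoded and added to the corpus before its vector is compared against the labeled ones.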

  6. Improved similarity trees and their application to visual data classification.

    PubMed

    Paiva, Jose Gustavo S; Florian-Cruz, Laura; Pedrini, Helio; Telles, Guilherme P; Minghim, Rosane

    2011-12-01

    An alternative form to multidimensional projections for the visual analysis of data represented in multidimensional spaces is the deployment of similarity trees, such as Neighbor Joining trees. They organize data objects on the visual plane emphasizing their levels of similarity with high capability of detecting and separating groups and subgroups of objects. Besides this similarity-based hierarchical data organization, some of their advantages include the ability to decrease point clutter; high precision; and a consistent view of the data set during focusing, offering a very intuitive way to view the general structure of the data set as well as to drill down to groups and subgroups of interest. Disadvantages of similarity trees based on neighbor joining strategies include their computational cost and the presence of virtual nodes that utilize too much of the visual space. This paper presents a highly improved version of the similarity tree technique. The improvements in the technique are given by two procedures. The first is a strategy that replaces virtual nodes by promoting real leaf nodes to their place, saving large portions of space in the display and maintaining the expressiveness and precision of the technique. The second improvement is an implementation that significantly accelerates the algorithm, impacting its use for larger data sets. We also illustrate the applicability of the technique in visual data mining, showing its advantages to support visual classification of data sets, with special attention to the case of image classification. We demonstrate the capabilities of the tree for analysis and iterative manipulation and employ those capabilities to support evolving to a satisfactory data organization and classification.

  7. Probabilistic lung nodule classification with belief decision trees.

    PubMed

    Zinovev, Dmitriy; Feigenbaum, Jonathan; Furst, Jacob; Raicu, Daniela

    2011-01-01

    In reading Computed Tomography (CT) scans with potentially malignant lung nodules, radiologists make use of high level information (semantic characteristics) in their analysis. Computer-Aided Diagnostic Characterization (CADc) systems can assist radiologists by offering a "second opinion"--predicting these semantic characteristics for lung nodules. In this work, we propose a way of predicting the distribution of radiologists' opinions using a multiple-label classification algorithm based on belief decision trees using the National Cancer Institute (NCI) Lung Image Database Consortium (LIDC) dataset, which includes semantic annotations by up to four human radiologists for each one of the 914 nodules. Furthermore, we evaluate our multiple-label results using a novel distance-threshold curve technique--and, measuring the area under this curve, obtain 69% performance on the validation subset. We conclude that multiple-label classification algorithms are an appropriate method of representing the diagnoses of multiple radiologists on lung CT scans when ground truth is unavailable.

  8. Data mining in psychological treatment research: a primer on classification and regression trees.

    PubMed

    King, Matthew W; Resick, Patricia A

    2014-10-01

    Data mining of treatment study results can reveal unforeseen but critical insights, such as who receives the most benefit from treatment and under what circumstances. The usefulness and legitimacy of exploratory data analysis have received relatively little recognition, however, and analytic methods well suited to the task are not widely known in psychology. With roots in computer science and statistics, statistical learning approaches offer a credible option: these methods take a more inductive approach to building a model than is done in traditional regression, allowing the data a greater role in suggesting the correct relationships between variables rather than imposing them a priori. Classification and regression trees are presented as a powerful, flexible exemplar of statistical learning methods. Trees allow researchers to efficiently identify useful predictors of an outcome and discover interactions between predictors without the need to anticipate and specify these in advance, making them ideal for revealing patterns that inform hypotheses about treatment effects. Trees can also provide a predictive model for forecasting outcomes as an aid to clinical decision making. This primer describes how tree models are constructed, how the results are interpreted and evaluated, and how trees overcome some of the complexities of traditional regression. Examples are drawn from randomized clinical trial data and highlight some interpretations of particular interest to treatment researchers. The limitations of tree models are discussed, and suggestions for further reading and choices in software are offered.
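
    As a rough illustration of how a classification tree is grown, the sketch below searches one numeric predictor for the binary split that minimizes weighted Gini impurity, the criterion commonly associated with CART. It is a minimal sketch under assumptions: the function names and the example scores and outcome labels in the usage note are invented here and do not come from the primer.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for a pure or empty node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Best rule of the form `x <= threshold` for a single numeric predictor.

    Returns (threshold, weighted_gini). The threshold is None when no
    candidate split improves on the impurity of the unsplit node.
    """
    pairs = list(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))
    distinct = sorted(set(values))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2
        left = [y for x, y in pairs if x <= threshold]
        right = [y for x, y in pairs if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best
```

    For example, with made-up baseline scores `[1, 2, 3, 10, 11, 12]` and outcomes `["n", "n", "n", "y", "y", "y"]`, `best_split` selects the midpoint 6.5, which separates the two outcome groups completely. A full tree applies the same search recursively to each resulting subset and across all predictors.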

  9. A modified classification tree method for personalized medicine decisions

    PubMed Central

    Tsai, Wan-Min; Zhang, Heping; Buta, Eugenia; O’Malley, Stephanie

    2015-01-01

    The tree-based methodology has been widely applied to identify predictors of health outcomes in medical studies. However, the classical tree-based approaches do not pay particular attention to treatment assignment and thus do not consider prediction in the context of treatment received. In recent years, attention has been shifting from average treatment effects to identifying moderators of treatment response, and tree-based approaches to identify subgroups of subjects with enhanced treatment responses are emerging. In this study, we extend and present modifications to one of these approaches (Zhang et al., 2010 [29]) to efficiently identify subgroups of subjects who respond more favorably to one treatment than another based on their baseline characteristics. We extend the algorithm by incorporating an automatic pruning step and propose a measure for assessment of the predictive performance of the constructed tree. We evaluate the proposed method through a simulation study and illustrate the approach using a data set from a clinical trial of treatments for alcohol dependence. This simple and efficient statistical tool can be used for developing algorithms for clinical decision making and personalized treatment for patients based on their characteristics. PMID:26770292

  10. Superiority of Classification Tree versus Cluster, Fuzzy and Discriminant Models in a Heartbeat Classification System

    PubMed Central

    Krasteva, Vessela; Jekova, Irena; Leber, Remo; Schmid, Ramun; Abächerli, Roger

    2015-01-01

    This study presents a 2-stage heartbeat classifier of supraventricular (SVB) and ventricular (VB) beats. Stage 1 makes a computationally efficient classification of SVB-beats, using a simple correlation threshold criterion for finding a close match with a predominant normal (reference) beat template. The non-matched beats are next subjected to measurement of 20 basic features, tracking the beat and reference template morphology and RR-variability for subsequent refined classification in SVB or VB-class by Stage 2. Four linear classifiers are compared: cluster, fuzzy, linear discriminant analysis (LDA) and classification tree (CT), all subjected to iterative training for selection of the optimal feature space among an extended 210-sized set, embodying interactive second-order effects between 20 independent features. The optimization process minimizes at equal weight the false positives in SVB-class and false negatives in VB-class. The training with European ST-T, AHA, MIT-BIH Supraventricular Arrhythmia databases found the best performance settings of all classification models: Cluster (30 features), Fuzzy (72 features), LDA (142 coefficients), CT (221 decision nodes) with top-3 best scored features: normalized current RR-interval, higher/lower frequency content ratio, beat-to-template correlation. Unbiased test-validation with MIT-BIH Arrhythmia database rates the classifiers in descending order of their specificity for SVB-class: CT (99.9%), LDA (99.6%), Cluster (99.5%), Fuzzy (99.4%); sensitivity for ventricular ectopic beats as part of the VB-class (commonly reported in published beat-classification studies): CT (96.7%), Fuzzy (94.4%), LDA (94.2%), Cluster (92.4%); positive predictivity: CT (99.2%), Cluster (93.6%), LDA (93.0%), Fuzzy (92.4%). CT has superior accuracy by 0.3–6.8 percentage points, with the advantage of easily configurable model complexity through pruning of a tree that consists of easily interpretable ‘if-then’ rules. PMID:26461492
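
    The three figures of merit reported above (sensitivity, specificity, positive predictivity) follow directly from confusion-matrix counts. The helper below is a hedged sketch of the standard definitions; the function name and the counts in the usage note are hypothetical and are not taken from the study.

```python
def beat_metrics(tp, fp, tn, fn):
    """Standard binary-classification rates from confusion-matrix counts.

    tp, fp, tn, fn: true positives, false positives, true negatives,
    false negatives for the class treated as "positive".
    """
    return {
        # Fraction of actual positives that were detected.
        "sensitivity": tp / (tp + fn),
        # Fraction of actual negatives that were correctly rejected.
        "specificity": tn / (tn + fp),
        # Fraction of positive calls that were correct.
        "positive_predictivity": tp / (tp + fp),
    }
```

    With made-up counts `tp=90, fp=50, tn=950, fn=10`, this gives a sensitivity of 0.90, a specificity of 0.95, and a positive predictivity of about 0.64, which shows how a classifier can score well on the first two while still producing many false alarms.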

  11. Support vector machine classification trees based on fuzzy entropy of classification.

    PubMed

    de Boves Harrington, Peter

    2017-02-15

    The support vector machine (SVM) is a powerful classifier that has recently been implemented in a classification tree (SVMTreeG). This classifier partitioned the data by finding gaps in the data space. For large and complex datasets, there may be no gaps in the data space confounding this type of classifier. A novel algorithm was devised that uses fuzzy entropy to find optimal partitions for situations when clusters of data are overlapped in the data space. Also, a kernel version of the fuzzy entropy algorithm was devised. A fast support vector machine implementation is used that has no cost C or slack variables to optimize. Statistical comparisons using bootstrapped Latin partitions among the tree classifiers were made using a synthetic XOR data set and validated with ten prediction sets comprised of 50,000 objects and a data set of NMR spectra obtained from 12 tea sample extracts.

  12. Graduates employment classification using data mining approach

    NASA Astrophysics Data System (ADS)

    Aziz, Mohd Tajul Rizal Ab; Yusof, Yuhanis

    2016-08-01

    Data mining is a platform for extracting hidden knowledge from a collection of data. This study investigates a suitable classification model for classifying graduate employment at one of the MARA Professional Colleges (KPM) in Malaysia. The aim is to classify graduates as employed, unemployed, or pursuing further study. Five data mining algorithms offered in WEKA were used: Naïve Bayes, Logistic regression, Multilayer perceptron, k-nearest neighbor and Decision tree J48. Based on the results obtained, Logistic regression produced the highest classification accuracy, at 92.5%. This result was obtained using 80% of the data for training and 20% for testing. The resulting classification model will benefit the management of the college, as it provides insight into the quality of the graduates it produces and how its curriculum can be improved to meet the needs of industry.

  13. Decision Tree Classifier for Classification of Plant and Animal Micro RNA's

    NASA Astrophysics Data System (ADS)

    Pant, Bhasker; Pant, Kumud; Pardasani, K. R.

    Gene expression is regulated by miRNAs or micro RNAs which can be 21-23 nucleotides in length. They are non-coding RNAs which control gene expression either by translation repression or mRNA degradation. Plants and animals both contain miRNAs which have been classified by wet lab techniques. These techniques are highly expensive, labour intensive and time consuming. Hence faster and more economical computational approaches are needed. In view of the above, a machine learning model has been developed for classification of plant and animal miRNAs using a decision tree classifier. The model has been tested on available data and it gives results with 91% accuracy.

  14. A novel decision-tree method for structured continuous-label classification.

    PubMed

    Hu, Hsiao-Wei; Chen, Yen-Liang; Tang, Kwei

    2013-12-01

    Structured continuous-label classification is a variety of classification in which the label is continuous in the data, but the goal is to classify data into classes that are a set of predefined ranges and can be organized in a hierarchy. In the hierarchy, the ranges at the lower levels are more specific and inherently more difficult to predict, whereas the ranges at the upper levels are less specific and inherently easier to predict. Therefore, both prediction specificity and prediction accuracy must be considered when building a decision tree (DT) from this kind of data. This paper proposes a novel classification algorithm for learning DT classifiers from data with structured continuous labels. This approach considers the distribution of labels throughout the hierarchical structure during the construction of trees without requiring discretization in the preprocessing stage. We compared the results of the proposed method with those of the C4.5 algorithm using eight real data sets. The empirical results indicate that the proposed method outperforms the C4.5 algorithm with regard to prediction accuracy, prediction specificity, and computational complexity.

  15. Support-vector-machine tree-based domain knowledge learning toward automated sports video classification

    NASA Astrophysics Data System (ADS)

    Xiao, Guoqiang; Jiang, Yang; Song, Gang; Jiang, Jianmin

    2010-12-01

    We propose a support-vector-machine (SVM) tree to hierarchically learn from domain knowledge represented by low-level features toward automatic classification of sports videos. The proposed SVM tree adopts a binary tree structure to exploit the nature of SVM's binary classification, where each internal node is a single SVM learning unit, and each external node represents the classified output type. Such a SVM tree presents a number of advantages, which include: 1. low computing cost; 2. integrated learning and classification while preserving individual SVM's learning strength; and 3. flexibility in both structure and learning modules, where different numbers of nodes and features can be added to address specific learning requirements, and various learning models can be added as individual nodes, such as neural networks, AdaBoost, hidden Markov models, dynamic Bayesian networks, etc. Experiments support that the proposed SVM tree achieves good performances in sports video classifications.
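
    The tree structure described above, where each internal node makes one binary decision and each external node names an output class, can be sketched as follows. This shows the routing structure only: the fixed linear decision functions stand in for trained SVMs, and the class names, feature meanings, weights, and helper names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Union


@dataclass
class Leaf:
    """External node: the classified output type."""
    label: str


@dataclass
class Node:
    """Internal node: one binary learner with a subtree for each outcome."""
    decide: Callable[[Sequence[float]], float]
    negative: Union["Node", Leaf]
    positive: Union["Node", Leaf]


def linear(weights, bias):
    """Stand-in for a trained binary SVM: a fixed linear decision function."""
    def decision(x):
        return sum(w * xi for w, xi in zip(weights, x)) + bias
    return decision


def classify(tree, x):
    """Route a feature vector from the root down to a leaf label."""
    node = tree
    while isinstance(node, Node):
        node = node.positive if node.decide(x) >= 0 else node.negative
    return node.label


# Hypothetical three-class tree over two made-up low-level features.
sports_tree = Node(
    decide=linear([1.0, 0.0], -0.5),
    positive=Leaf("football"),
    negative=Node(
        decide=linear([0.0, 1.0], -0.5),
        positive=Leaf("tennis"),
        negative=Leaf("basketball"),
    ),
)
```

    Because every node only exposes a decision function, any binary model with a signed output could be placed at a node, which is one way to read the flexibility the abstract claims.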

  16. The process and utility of classification and regression tree methodology in nursing research

    PubMed Central

    Kuhn, Lisa; Page, Karen; Ward, John; Worrall-Carter, Linda

    2014-01-01

    Aim: This paper presents a discussion of classification and regression tree analysis and its utility in nursing research. Background: Classification and regression tree analysis is an exploratory research method used to illustrate associations between variables not suited to traditional regression analysis. Complex interactions are demonstrated between covariates and variables of interest in inverted tree diagrams. Design: Discussion paper. Data sources: English language literature was sourced from eBooks, Medline Complete and CINAHL Plus databases, Google and Google Scholar, hard copy research texts and retrieved reference lists for terms including classification and regression tree* and derivatives and recursive partitioning from 1984–2013. Discussion: Classification and regression tree analysis is an important method used to identify previously unknown patterns amongst data. Whilst there are several reasons to embrace this method as a means of exploratory quantitative research, issues regarding quality of data as well as the usefulness and validity of the findings should be considered. Implications for Nursing Research: Classification and regression tree analysis is a valuable tool to guide nurses to reduce gaps in the application of evidence to practice. With the ever-expanding availability of data, it is important that nurses understand the utility and limitations of the research method. Conclusion: Classification and regression tree analysis is an easily interpreted method for modelling interactions between health-related variables that would otherwise remain obscured. Knowledge is presented graphically, providing insightful understanding of complex and hierarchical relationships in an accessible and useful way to nursing and other health professions. PMID:24237048

  17. Real-time classification of humans versus animals using profiling sensors and hidden Markov tree model

    NASA Astrophysics Data System (ADS)

    Hossen, Jakir; Jacobs, Eddie L.; Chari, Srikant

    2015-07-01

    Linear pyroelectric array sensors have enabled useful classifications of objects such as humans and animals to be performed with relatively low-cost hardware in border and perimeter security applications. Ongoing research has sought to improve the performance of these sensors through signal processing algorithms. In the research presented here, we introduce the use of hidden Markov tree (HMT) models for object recognition in images generated by linear pyroelectric sensors. HMTs are trained to statistically model the wavelet features of individual objects through an expectation-maximization learning process. Human versus animal classification for a test object is made by evaluating its wavelet features against the trained HMTs using the maximum-likelihood criterion. The classification performance of this approach is compared to two other techniques; a texture, shape, and spectral component features (TSSF) based classifier and a speeded-up robust feature (SURF) classifier. The evaluation indicates that among the three techniques, the wavelet-based HMT model works well, is robust, and has improved classification performance compared to a SURF-based algorithm in equivalent computation time. When compared to the TSSF-based classifier, the HMT model has a slightly degraded performance but almost an order of magnitude improvement in computation time enabling real-time implementation.

  18. Multivariate Approaches to Classification in Extragalactic Astronomy

    NASA Astrophysics Data System (ADS)

    Fraix-Burnet, Didier; Thuillard, Marc; Chattopadhyay, Asis Kumar

    2015-08-01

    Clustering objects into synthetic groups is a natural activity of any science. Astrophysics is not an exception and is now facing a deluge of data. For galaxies, the one-century old Hubble classification and the Hubble tuning fork are still largely in use, together with numerous mono- or bivariate classifications most often made by eye. However, a classification must be driven by the data, and sophisticated multivariate statistical tools are used more and more often. In this paper we review these different approaches in order to situate them in the general context of unsupervised and supervised learning. We insist on the astrophysical outcomes of these studies to show that multivariate analyses provide an obvious path toward a renewal of our classification of galaxies and are invaluable tools to investigate the physics and evolution of galaxies.

  19. Real-Time Speech/Music Classification With a Hierarchical Oblique Decision Tree

    DTIC Science & Technology

    2008-04-01

    REAL-TIME SPEECH/ MUSIC CLASSIFICATION WITH A HIERARCHICAL OBLIQUE DECISION TREE Jun Wang, Qiong Wu, Haojiang Deng, Qin Yan Institute of Acoustics...time speech/ music classification with a hierarchical oblique decision tree. A set of discrimination features in frequency domain are selected...handle signals without discrimination and can not work properly in the existence of multimedia signals. This paper proposes a real-time speech/ music

  20. A classification and regression tree model of controls on dissolved inorganic nitrogen leaching from European forests.

    PubMed

    Rothwell, James J; Futter, Martyn N; Dise, Nancy B

    2008-11-01

    Often, there is a non-linear relationship between atmospheric dissolved inorganic nitrogen (DIN) input and DIN leaching that is poorly captured by existing models. We present the first application of the non-parametric classification and regression tree approach to evaluate the key environmental drivers controlling DIN leaching from European forests. DIN leaching was classified as low (<3), medium (3-15) or high (>15 kg N ha⁻¹ year⁻¹) at 215 sites across Europe. The analysis identified throughfall NO₃⁻ deposition, acid deposition, hydrology, soil type, the carbon content of the soil, and the legacy of historic N deposition as the dominant drivers of DIN leaching for these forests. Ninety-four percent of sites were successfully classified into the appropriate leaching category. This approach shows promise for understanding complex ecosystem responses to a wide range of anthropogenic stressors as well as an improved method for identifying risk and targeting pollution mitigation strategies in forest ecosystems.
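
    The three leaching categories and the reported share of correctly classified sites can be expressed as a small sketch. The thresholds come from the abstract; the function names are hypothetical, and the treatment of the boundary values (exactly 3 and exactly 15 counted as medium) is an assumption based on the "3-15" range as written.

```python
def din_class(leaching):
    """Leaching category for a DIN flux in kg N per hectare per year."""
    if leaching < 3:
        return "low"
    if leaching <= 15:
        return "medium"
    return "high"


def fraction_correct(observed, predicted):
    """Share of sites whose predicted category matches the observed one."""
    matches = sum(o == p for o, p in zip(observed, predicted))
    return matches / len(observed)
```

    Under this reading, a site leaching 2.5 falls in the low class and a site leaching 15.1 in the high class; `fraction_correct` is the quantity reported as 94% in the abstract.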

  1. Tree species classification in subtropical forests using small-footprint full-waveform LiDAR data

    NASA Astrophysics Data System (ADS)

    Cao, Lin; Coops, Nicholas C.; Innes, John L.; Dai, Jinsong; Ruan, Honghua; She, Guanghui

    2016-07-01

    The accurate classification of tree species is critical for the management of forest ecosystems, particularly subtropical forests, which are highly diverse and complex ecosystems. While airborne Light Detection and Ranging (LiDAR) technology offers significant potential to estimate forest structural attributes, the capacity of this new tool to classify species is less well known. In this research, full-waveform metrics were extracted by a voxel-based composite waveform approach and examined with a Random Forests classifier to discriminate six subtropical tree species (i.e., Masson pine (Pinus massoniana Lamb.), Chinese fir (Cunninghamia lanceolata (Lamb.) Hook.), Slash pines (Pinus elliottii Engelm.), Sawtooth oak (Quercus acutissima Carruth.) and Chinese holly (Ilex chinensis Sims.)) at three levels of discrimination. As part of the analysis, the optimal voxel size for modelling the composite waveforms was investigated, the most important predictor metrics for species classification assessed and the effect of scan angle on species discrimination examined. Results demonstrate that all tree species were classified with relatively high accuracy (68.6% for six classes, 75.8% for four main species and 86.2% for conifers and broadleaved trees). Full-waveform metrics (based on height of median energy, waveform distance and number of waveform peaks) demonstrated high classification importance and were stable among various voxel sizes. The results also suggest that the voxel-based approach can alleviate some of the issues associated with large scan angles. In summary, the results indicate that full-waveform LiDAR data have significant potential for tree species classification in the subtropical forests.

  2. An information-based network approach for protein classification

    PubMed Central

    Wan, Xiaogeng; Zhao, Xin; Yau, Stephen S. T.

    2017-01-01

    Protein classification is one of the critical problems in bioinformatics. Early studies used geometric distances and phylogenetic trees to classify proteins. These methods use binary trees to present protein classification. In this paper, we propose a new protein classification method, whereby theories of information and networks are used to classify the multivariate relationships of proteins. In this study, the protein universe is modeled as an undirected network, where proteins are classified according to their connections. Our method is unsupervised, multivariate, and alignment-free. It can be applied to the classification of both protein sequences and structures. Nine examples are used to demonstrate the efficiency of our new method. PMID:28350835

  3. Multiclass Cancer Classification by Using Fuzzy Support Vector Machine and Binary Decision Tree With Gene Selection

    PubMed Central

    2005-01-01

    We investigate the problems of multiclass cancer classification with gene selection from gene expression data. Two different constructed multiclass classifiers with gene selection are proposed, which are fuzzy support vector machine (FSVM) with gene selection and binary classification tree based on SVM with gene selection. Using F test and recursive feature elimination based on SVM as gene selection methods, binary classification tree based on SVM with F test, binary classification tree based on SVM with recursive feature elimination based on SVM, and FSVM with recursive feature elimination based on SVM are tested in our experiments. To accelerate computation, preselecting the strongest genes is also used. The proposed techniques are applied to analyze breast cancer data, small round blue-cell tumors, and acute leukemia data. Compared to existing multiclass cancer classifiers and binary classification tree based on SVM with F test or binary classification tree based on SVM with recursive feature elimination based on SVM mentioned in this paper, FSVM based on recursive feature elimination based on SVM can find most important genes that affect certain types of cancer with high recognition accuracy. PMID:16046822

  4. A statistical approach to root system classification

    PubMed Central

    Bodner, Gernot; Leitner, Daniel; Nakhforoosh, Alireza; Sobotik, Monika; Moder, Karl; Kaul, Hans-Peter

    2013-01-01

    Plant root systems have a key role in ecology and agronomy. Despite the rapid increase in root studies, there is still no classification that allows distinguishing among distinctive characteristics within the diversity of rooting strategies. Our hypothesis is that a multivariate approach for “plant functional type” identification in ecology can be applied to the classification of root systems. The classification method presented is based on a data-defined statistical procedure without a priori decision on the classifiers. The study demonstrates that principal component based rooting types provide efficient and meaningful multi-trait classifiers. The classification method is exemplified with simulated root architectures and morphological field data. Simulated root architectures showed that morphological attributes with spatial distribution parameters capture most distinctive features within root system diversity. While developmental type (tap vs. shoot-borne systems) is a strong, but coarse classifier, topological traits provide the most detailed differentiation among distinctive groups. Adequacy of commonly available morphologic traits for classification is supported by field data. Rooting types emerging from measured data were mainly distinguished by diameter/weight- and density-dominated types. Similarity of root systems within distinctive groups was the joint result of phylogenetic relation and environmental as well as human selection pressure. We concluded that the data-defined classification is appropriate for integration of knowledge obtained with different root measurement methods and at various scales. Currently root morphology is the most promising basis for classification due to widely used common measurement protocols. To capture details of root diversity, efforts in architectural measurement techniques are essential. PMID:23914200

  5. The use of airborne hyperspectral data for tree species classification in a species-rich Central European forest area

    NASA Astrophysics Data System (ADS)

    Richter, Ronny; Reu, Björn; Wirth, Christian; Doktor, Daniel; Vohland, Michael

    2016-10-01

    The success of remote sensing approaches to assess tree species diversity in a heterogeneously mixed forest stand depends on the availability of both appropriate data and suitable classification algorithms. To separate the high number of in total ten broadleaf tree species in a small structured floodplain forest, the Leipzig Riverside Forest, we introduce a majority based classification approach for Discriminant Analysis based on Partial Least Squares (PLS-DA), which was tested against Random Forest (RF) and Support Vector Machines (SVM). The classifier performance was tested on different sets of airborne hyperspectral image data (AISA DUAL) that were acquired on single dates in August and September and also stacked to a composite product. Shadowed gaps and shadowed crown parts were eliminated via spectral mixture analysis (SMA) prior to the pixel-based classification. Training and validation sets were defined spectrally with the conditioned Latin hypercube method as a stratified random sampling procedure. In the validation, PLS-DA consistently outperformed the RF and SVM approaches on all datasets. The additional use of spectral variable selection (CARS, "competitive adaptive reweighted sampling") combined with PLS-DA further improved classification accuracies. Up to 78.4% overall accuracy was achieved for the stacked dataset. The image recorded in August provided slightly higher accuracies than the September image, regardless of the applied classifier.

  6. Parallel K-dimensional tree classification based on semi-matroid structure for remote sensing applications

    NASA Astrophysics Data System (ADS)

    Chang, Yang-Lang; Chen, Zhi-Ming; Liu, Jin-Nan; Chang, Lena; Fang, Jyh Perng

    2010-08-01

    Satellite remote sensing images can be interpreted to provide important information on large-scale natural resources, such as lands, oceans, mountains, rivers, forests and minerals for Earth observations. Recent advances of remote sensing technologies have improved the availability of satellite imagery in a wide range of applications including high dimensional remote sensing data sets (e.g. high spectral and high spatial resolution images). The information of high dimensional remote sensing images obtained by state-of-the-art sensor technologies can be identified more accurately than images acquired by conventional remote sensing techniques. However, due to the large volume of image data, it requires a huge amount of storage and computing time. As a result, the computational complexity of data processing for high dimensional remote sensing data analysis will increase. Consequently, this paper proposes a novel classification algorithm based on semi-matroid structure, known as the parallel k-dimensional tree semi-matroid (PKTSM) classification, which adopts a new hybrid parallel approach to deal with high dimensional data sets. It is implemented by combining the message passing interface (MPI) library, the open multi-processing (OpenMP) application programming interface and the compute unified device architecture (CUDA) of graphics processing units (GPU) in a hybrid mode. The effectiveness of the proposed PKTSM is evaluated by using MODIS/ASTER airborne simulator (MASTER) images and airborne synthetic aperture radar (AIRSAR) images for land cover classification during the Pacrim II campaign. The experimental results demonstrated that the proposed hybrid PKTSM can significantly improve the performance in terms of both computational speed-up and classification accuracy.

  7. Stochastic gradient boosting classification trees for forest fuel types mapping through airborne laser scanning and IRS LISS-III imagery

    NASA Astrophysics Data System (ADS)

    Chirici, G.; Scotti, R.; Montaghi, A.; Barbati, A.; Cartisano, R.; Lopez, G.; Marchetti, M.; McRoberts, R. E.; Olsson, H.; Corona, P.

    2013-12-01

    This paper presents an application of Airborne Laser Scanning (ALS) data in conjunction with an IRS LISS-III image for mapping forest fuel types. For two study areas of 165 km² and 487 km² in Sicily (Italy), 16,761 plots of size 30-m × 30-m were distributed using a tessellation-based stratified sampling scheme. ALS metrics and spectral signatures from IRS extracted for each plot were used as predictors to classify forest fuel types observed and identified by photointerpretation and fieldwork. Following use of traditional parametric methods that produced unsatisfactory results, three non-parametric classification approaches were tested: (i) classification and regression tree (CART), (ii) the CART bagging method called Random Forests, and (iii) the CART bagging/boosting stochastic gradient boosting (SGB) approach. This contribution summarizes previous experiences using ALS data for estimating forest variables useful for fire management in general and for fuel type mapping, in particular. It summarizes characteristics of classification and regression trees, presents the pre-processing operation, the classification algorithms, and the achieved results. The results demonstrated superiority of the SGB method with overall accuracy of 84%. The most relevant ALS metric was canopy cover, defined as the percent of non-ground returns. Other relevant metrics included the spectral information from IRS and several other ALS metrics such as percentiles of the height distribution, the mean height of all returns, and the number of returns.
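    The ALS predictors named in this abstract (canopy cover as the percent of non-ground returns, height percentiles, mean height of all returns, number of returns) can be illustrated with a minimal sketch. The function name, the (height, is_ground) input layout, and the nearest-rank percentile rule are assumptions made for illustration; the record does not describe the authors' actual processing chain.

```python
import math
from statistics import mean


def als_plot_metrics(returns):
    """Compute simple ALS metrics for one plot.

    `returns` is a list of (height_m, is_ground) tuples. This layout is a
    hypothetical input format chosen for the sketch.
    """
    if not returns:
        raise ValueError("plot has no returns")

    heights = sorted(h for h, _ in returns)
    non_ground = sum(1 for _, is_ground in returns if not is_ground)

    def percentile(p):
        # Nearest-rank percentile (an assumed convention).
        rank = max(1, math.ceil(p / 100 * len(heights)))
        return heights[rank - 1]

    return {
        # Canopy cover as defined in the abstract: percent of non-ground returns.
        "canopy_cover_pct": 100.0 * non_ground / len(returns),
        "mean_height": mean(heights),
        "p50_height": percentile(50),
        "p90_height": percentile(90),
        "n_returns": len(returns),
    }


if __name__ == "__main__":
    plot = [(0.0, True), (0.1, True), (5.0, False), (10.0, False)]
    print(als_plot_metrics(plot))
```

    Each plot's metric dictionary would then serve as one row of predictors for a tree-based classifier such as CART, Random Forests, or SGB.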

  8. Tree Species Classification By Multiseasonal High Resolution Satellite Data

    NASA Astrophysics Data System (ADS)

    Elatawneh, Alata; Wallner, Adelheid; Straub, Christoph; Schneider, Thomas; Knoke, Thomas

    2013-12-01

    Accurate forest tree species mapping is a fundamental issue for sustainable forest management and planning. Forest tree species mapping by means of remote sensing data is still a topic to be investigated. The Bavarian State Institute of Forestry is investigating the potential of using digital aerial images for forest management purposes. However, using aerial images is still cost- and time-consuming, in addition to their acquisition restrictions. New spaceborne sensor generations such as RapidEye, which offer multiseasonal data at very high temporal resolution, have the potential to improve forest tree species mapping. In this study, we investigated the potential of multiseasonal RapidEye data for mapping tree species in a Mid European forest in Southern Germany. The RapidEye data of level A3 were collected on ten different dates in the years 2009, 2010 and 2011. For data analysis, a model was developed which combines the Spectral Angle Mapper technique with a 10-fold cross-validation. The analysis succeeded in differentiating four tree species: Norway spruce (Picea abies L.), Silver Fir (Abies alba Mill.), European beech (Fagus sylvatica) and Maple (Acer pseudoplatanus). The model success was evaluated using digital aerial images acquired in the year 2009 and inventory point records from the 2008/09 inventory. Model results of the multiseasonal RapidEye data analysis achieved an overall accuracy of 76%. However, the success of the model was evaluated only for all the identified species together and not for the individual species.
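    The Spectral Angle Mapper technique mentioned in this abstract assigns a pixel to the reference spectrum with which it forms the smallest angle in band space. The sketch below shows only that core rule; the function names and the toy reference spectra are hypothetical, and the study's combination with 10-fold cross-validation is not reproduced.

```python
import math


def spectral_angle(pixel, reference):
    """Angle in radians between two spectra treated as vectors."""
    dot = sum(p * r for p, r in zip(pixel, reference))
    norm_p = math.sqrt(sum(p * p for p in pixel))
    norm_r = math.sqrt(sum(r * r for r in reference))
    if norm_p == 0 or norm_r == 0:
        raise ValueError("spectra must be non-zero vectors")
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cos_angle = max(-1.0, min(1.0, dot / (norm_p * norm_r)))
    return math.acos(cos_angle)


def classify_sam(pixel, references):
    """Return the label of the reference spectrum with the smallest angle."""
    return min(references, key=lambda label: spectral_angle(pixel, references[label]))


if __name__ == "__main__":
    # Toy two-band reference spectra (illustrative values only).
    references = {"class_a": [1.0, 0.0], "class_b": [0.0, 1.0]}
    print(classify_sam([2.0, 0.1], references))
```

    Because the angle ignores vector length, a spectrum and a brightness-scaled copy of it yield an angle of approximately zero, which is one reason the measure is often used when illumination varies between acquisition dates.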

  9. A Modified Decision Tree Algorithm Based on Genetic Algorithm for Mobile User Classification Problem

    PubMed Central

    Liu, Dong-sheng; Fan, Shu-jiang

    2014-01-01

    In order to offer mobile customers better service, mobile users should first be classified. To address the limitations of previous classification methods, this paper puts forward a modified decision tree algorithm for mobile user classification, which introduces a genetic algorithm to optimize the results of the decision tree algorithm. We also take context information as a classification attribute for the mobile user, and we classify the context into public context and private context classes. Then we analyze the processes and operators of the algorithm. Finally, we run an experiment on mobile users with the algorithm; we can classify mobile users into Basic service user, E-service user, Plus service user, and Total service user classes, and we can also obtain some rules about the mobile user. Compared to the C4.5 decision tree algorithm and the SVM algorithm, the algorithm proposed in this paper has higher accuracy and greater simplicity. PMID:24688389

  10. A modified decision tree algorithm based on genetic algorithm for mobile user classification problem.

    PubMed

    Liu, Dong-sheng; Fan, Shu-jiang

    2014-01-01

    In order to offer mobile customers better service, mobile users should first be classified. To address the limitations of previous classification methods, this paper puts forward a modified decision tree algorithm for mobile user classification, which introduces a genetic algorithm to optimize the results of the decision tree algorithm. We also take context information as a classification attribute for the mobile user, and we classify the context into public context and private context classes. Then we analyze the processes and operators of the algorithm. Finally, we run an experiment on mobile users with the algorithm; we can classify mobile users into Basic service user, E-service user, Plus service user, and Total service user classes, and we can also obtain some rules about the mobile user. Compared to the C4.5 decision tree algorithm and the SVM algorithm, the algorithm proposed in this paper has higher accuracy and greater simplicity.

  11. A Systematic Approach to Subgroup Classification in Intellectual Disability

    ERIC Educational Resources Information Center

    Schalock, Robert L.; Luckasson, Ruth

    2015-01-01

    This article describes a systematic approach to subgroup classification based on a classification framework and sequential steps involved in the subgrouping process. The sequential steps are stating the purpose of the classification, identifying the classification elements, using relevant information, and using clearly stated and purposeful…

  12. Automatic Approach to VHR Satellite Image Classification

    NASA Astrophysics Data System (ADS)

    Kupidura, P.; Osińska-Skotak, K.; Pluto-Kossakowska, J.

    2016-06-01

    In this paper, we present a proposal for a fully automatic classification of VHR satellite images. Unlike the most widespread approaches (supervised classification, which requires prior definition of class signatures, and unsupervised classification, which must be followed by an interpretation of its results), the proposed method requires no human intervention except for the setting of the initial parameters. The presented approach is based on both spectral and textural analysis of the image and consists of 3 steps. The first step, the analysis of spectral data, relies on NDVI values. Its purpose is to distinguish between basic classes, such as water, vegetation and non-vegetation, which all differ significantly spectrally and thus can be easily extracted based on spectral analysis. The second step relies on granulometric maps. These are the product of local granulometric analysis of an image and present information on the texture of each pixel neighbourhood, depending on the texture grain. The purpose of texture analysis is to distinguish between classes that are spectrally similar but of different texture, e.g. bare soil from a built-up area, or low vegetation from a wooded area. Because granulometric analysis is based on the mathematical morphology operations of opening and closing, the results are resistant to the border effect (qualifying borders of objects in an image as spaces of high texture), which affects other methods of texture analysis such as GLCM statistics or fractal analysis. Therefore, the effectiveness of the analysis is relatively high. Several indices based on values of different granulometric maps have been developed to simplify the extraction of classes of different texture. The third and final step of the process relies on a vegetation index based on near infrared and blue bands. Its purpose is to correct partially misclassified pixels.
All the indices used in the classification model developed relate to reflectance values, so the preliminary step

  13. IQPC 2015 Track: Tree Separation and Classification in Mobile Mapping LIDAR Data

    NASA Astrophysics Data System (ADS)

    Gorte, B.; Oude Elberink, S.; Sirmacek, B.; Wang, J.

    2015-08-01

    The European FP7 project IQmulus yearly organizes several processing contests, where submissions are requested for novel algorithms for point cloud and other big geodata processing. This paper describes the set-up and execution of a contest having the purpose to evaluate state-of-the-art algorithms for Mobile Mapping System point clouds, in order to detect and identify (individual) trees. By the nature of MMS these are trees in the vicinity of the road network (rather than in forests). Therefore, part of the challenge is distinguishing between trees and other objects, such as buildings, street furniture, cars etc. Three submitted segmentation and classification algorithms are thus evaluated.

  14. Automated morphological analysis of bone marrow cells in microscopic images for diagnosis of leukemia: nucleus-plasma separation and cell classification using a hierarchical tree model of hematopoesis

    NASA Astrophysics Data System (ADS)

    Krappe, Sebastian; Wittenberg, Thomas; Haferlach, Torsten; Münzenmayer, Christian

    2016-03-01

    The morphological differentiation of bone marrow is fundamental for the diagnosis of leukemia. Currently, the counting and classification of the different types of bone marrow cells is done manually under the use of bright field microscopy. This is a time-consuming, subjective, tedious and error-prone process. Furthermore, repeated examinations of a slide may yield intra- and inter-observer variances. For that reason a computer assisted diagnosis system for bone marrow differentiation is pursued. In this work we focus (a) on a new method for the separation of nucleus and plasma parts and (b) on a knowledge-based hierarchical tree classifier for the differentiation of bone marrow cells in 16 different classes. Classification trees are easily interpretable and understandable and provide a classification together with an explanation. Using classification trees, expert knowledge (i.e. knowledge about similar classes and cell lines in the tree model of hematopoiesis) is integrated in the structure of the tree. The proposed segmentation method is evaluated with more than 10,000 manually segmented cells. For the evaluation of the proposed hierarchical classifier more than 140,000 automatically segmented bone marrow cells are used. Future automated solutions for the morphological analysis of bone marrow smears could potentially apply such an approach for the pre-classification of bone marrow cells and thereby shortening the examination time.

  15. Hierarchical description and extensive classification of protein structural changes by Motion Tree.

    PubMed

    Koike, Ryotaro; Ota, Motonori; Kidera, Akinori

    2014-02-06

    The structures of the same protein, determined under different conditions, provide clues toward understanding the role of structural changes in the protein's function. Structural changes are usually identified as rigid-body motions, which are defined using a particular threshold of rigidity, such as domain motions. However, each protein actually undergoes motions with various size and magnitude ranges. In this study, to describe protein structural changes more comprehensively, we propose a method based on hierarchical clustering. This method enables the illustration of a wide range of protein motions in a single tree diagram, named the "Motion Tree". We applied the method to 432 proteins exhibiting large structural changes and classified their Motion Trees in terms of the characteristic indices of the trees. This classification of the Motion Trees revealed clear relationships to their protein functions. Especially, complex structural changes are significantly correlated with multi-step protein functions.

  16. Using the PDD Behavior Inventory as a Level 2 Screener: A Classification and Regression Trees Analysis

    ERIC Educational Resources Information Center

    Cohen, Ira L.; Liu, Xudong; Hudson, Melissa; Gillis, Jennifer; Cavalari, Rachel N. S.; Romanczyk, Raymond G.; Karmel, Bernard Z.; Gardner, Judith M.

    2016-01-01

    In order to improve discrimination accuracy between Autism Spectrum Disorder (ASD) and similar neurodevelopmental disorders, a data mining procedure, Classification and Regression Trees (CART), was used on a large multi-site sample of PDD Behavior Inventory (PDDBI) forms on children with and without ASD. Discrimination accuracy exceeded 80%,…

  17. A Classification of Recent Widespread Tree Mortality in the Western US

    NASA Astrophysics Data System (ADS)

    Hicke, J. A.; Anderegg, W.; Allen, C. D.; Stephenson, N.

    2015-12-01

    Widespread tree mortality has been documented across the western United States in recent decades. Climate change has been implicated in these events, in particular warming and associated effects on tree stress and biotic disturbance agents. Given projected future warming, the capability of accurately predicting future tree mortality is critical. However, sufficient ecological understanding is needed to do so. Here we describe differences in various mortality types associated with spatial characteristics and climate drivers. We loosely classify mortality types into four categories: 1) widespread but low severity background mortality that has been increasing mainly because of greater stress associated with rising climatic water deficit; 2) tree die-offs that are driven by severe, hotter drought in which biotic agents play minor roles, such as sudden aspen decline; 3) tree die-offs in which hotter droughts combined with outbreaks of biotic agents, often less aggressive bark beetles, to cause mortality, such as piñon pine mortality in the Southwest; and 4) tree die-offs that were initiated or facilitated by droughts but which were associated with aggressive biotic agents that can kill healthy trees at high populations, such as mountain pine beetle outbreaks. An important use of this classification is the different pathways by which climate change can cause tree mortality. For some classes (background and primarily drought-driven mortality), predictions may be sufficiently accurate based on climate (drought) metrics. For classes in which biotic agents play a role, the direct warming effect on insects may occur through mechanisms not related to drought, and therefore predictions may need to include mechanisms other than drought. We note that this is a simplistic classification designed to facilitate understanding of tree mortality, and that overlap occurs among categories.

  18. Decision Tree Approach for Soil Liquefaction Assessment

    PubMed Central

    Gandomi, Amir H.; Fridline, Mark M.; Roke, David A.

    2013-01-01

    In the current study, the performances of some decision tree (DT) techniques are evaluated for postearthquake soil liquefaction assessment. A database containing 620 records of seismic parameters and soil properties is used in this study. Three decision tree techniques are used here in two different ways, considering statistical and engineering points of view, to develop decision rules. The DT results are compared to the logistic regression (LR) model. The results of this study indicate that the DTs not only successfully predict liquefaction but they can also outperform the LR model. The best DT models are interpreted and evaluated based on an engineering point of view. PMID:24489498

  19. Decision tree approach for soil liquefaction assessment.

    PubMed

    Gandomi, Amir H; Fridline, Mark M; Roke, David A

    2013-01-01

    In the current study, the performances of some decision tree (DT) techniques are evaluated for postearthquake soil liquefaction assessment. A database containing 620 records of seismic parameters and soil properties is used in this study. Three decision tree techniques are used here in two different ways, considering statistical and engineering points of view, to develop decision rules. The DT results are compared to the logistic regression (LR) model. The results of this study indicate that the DTs not only successfully predict liquefaction but they can also outperform the LR model. The best DT models are interpreted and evaluated based on an engineering point of view.

  20. Understanding tree growth responses after partial cuttings: A new approach

    PubMed Central

    Rossi, Sergio; Lussier, Jean-Martin; Walsh, Denis; Morin, Hubert

    2017-01-01

    Forest ecosystem management heads towards the use of partial cuttings. However, the wide variation in growth response of residual trees remains unexplained, preventing a suitable prediction of forest productivity. The aim of the study was to assess individual growth and identify the driving factors involved in the responses of residual trees. Six study blocks in even-aged black spruce [Picea mariana (Mill.) B.S.P.] stands of the eastern Canadian boreal forest were submitted to experimental shelterwood and seed-tree treatments. Individual-tree models were applied to 1039 trees to analyze their patterns of radial growth during the 10 years after partial cutting by using the nonlinear Schnute function on tree-ring series. The trees exhibited different growth patterns. A sigmoid growth was detected in 32% of trees, mainly in control plots of older stands. Forty-seven percent of trees located in the interior of residual strips showed an S-shape, which was influenced by stand mortality, harvested intensity and dominant height. Individuals showing an exponential pattern produced the greatest radial growth after cutting and were edge trees of younger stands with higher dominant height. A steady growth decline was observed in 4% of trees, represented by the individuals suppressed and insensitive to the treatment. The analyses demonstrated that individual nonlinear models are able to assess the variability in growth within the stand and the factors involved in the occurrence of the different growth patterns, thus improving understanding of the tree responses to partial cutting. This new approach can sustain forest management strategies by defining the best conditions to optimize the growth yield of residual trees. PMID:28222200

  1. A Nonparametric Approach to Estimate Classification Accuracy and Consistency

    ERIC Educational Resources Information Center

    Lathrop, Quinn N.; Cheng, Ying

    2014-01-01

    When cut scores for classifications occur on the total score scale, popular methods for estimating classification accuracy (CA) and classification consistency (CC) require assumptions about a parametric form of the test scores or about a parametric response model, such as item response theory (IRT). This article develops an approach to estimate CA…

  2. Exploring precrash maneuvers using classification trees and random forests.

    PubMed

    Harb, Rami; Yan, Xuedong; Radwan, Essam; Su, Xiaogang

    2009-01-01

    Taking evasive action in critical traffic situations that precede motor vehicle crashes gives drivers an opportunity to avoid the crash or at least diminish its severity. This study explores the driver, vehicle, and environment characteristics associated with crash avoidance maneuvers (i.e., evasive actions or no evasive actions). Rear-end collisions, head-on collisions, and angle collisions are analyzed separately using decision trees, and the significance of the variables on the binary response variable (evasive actions or no evasive actions) is determined. Moreover, the random forests method is employed to rank the importance of the driver/vehicle/environment characteristics on crash avoidance maneuvers. According to the results of the exploratory analyses, drivers' visibility obstruction, drivers' physical impairment, and drivers' distraction are associated with crash avoidance maneuvers in all three types of accidents. Moreover, speed limit is associated with avoidance maneuvers in rear-end collisions, and vehicle type is correlated with avoidance maneuvers in head-on and angle collisions. It is recommended that future research further investigate the explored trends (e.g., physically impaired drivers, visibility obstruction) using driving simulators, which may help in legislative initiatives and in-vehicle technology recommendations.

  3. Effects of sample survey design on the accuracy of classification tree models in species distribution models

    USGS Publications Warehouse

    Edwards, T.C.; Cutler, D.R.; Zimmermann, N.E.; Geiser, L.; Moisen, G.G.

    2006-01-01

    We evaluated the effects of probabilistic (hereafter DESIGN) and non-probabilistic (PURPOSIVE) sample surveys on resultant classification tree models for predicting the presence of four lichen species in the Pacific Northwest, USA. Models derived from both survey forms were assessed using an independent data set (EVALUATION). Measures of accuracy as gauged by resubstitution rates were similar for each lichen species irrespective of the underlying sample survey form. Cross-validation estimates of prediction accuracies were lower than resubstitution accuracies for all species and both design types, and in all cases were closer to the true prediction accuracies based on the EVALUATION data set. We argue that greater emphasis should be placed on calculating and reporting cross-validation accuracy rates rather than simple resubstitution accuracy rates. Evaluation of the DESIGN and PURPOSIVE tree models on the EVALUATION data set shows significantly lower prediction accuracy for the PURPOSIVE tree models relative to the DESIGN models, indicating that non-probabilistic sample surveys may generate models with limited predictive capability. These differences were consistent across all four lichen species, with 11 of the 12 possible species and sample survey type comparisons having significantly lower accuracy rates. Some differences in accuracy were as large as 50%. The classification tree structures also differed considerably both among and within the modelled species, depending on the sample survey form. Overlap in the predictor variables selected by the DESIGN and PURPOSIVE tree models ranged from only 20% to 38%, indicating the classification trees fit the two evaluated survey forms on different sets of predictor variables. The magnitude of these differences in predictor variables throws doubt on ecological interpretation derived from prediction models based on non-probabilistic sample surveys. © 2006 Elsevier B.V. All rights reserved.
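    The gap between resubstitution and cross-validation accuracy described in this abstract can be reproduced on a toy data set. In the sketch below a 1-nearest-neighbour rule is swapped in as a stand-in for a fully grown classification tree (both memorise their training points); the data values, labels, and function names are invented for illustration and are not taken from the study.

```python
def predict_1nn(train, x):
    """Predict the label of x from the closest training feature value."""
    return min(train, key=lambda pair: abs(pair[0] - x))[1]


def resubstitution_accuracy(data):
    """Accuracy when the model is scored on the same data it was fit to."""
    hits = sum(predict_1nn(data, x) == y for x, y in data)
    return hits / len(data)


def loo_cv_accuracy(data):
    """Leave-one-out cross-validation: each point is predicted without itself."""
    hits = 0
    for i, (x, y) in enumerate(data):
        train = data[:i] + data[i + 1:]
        hits += predict_1nn(train, x) == y
    return hits / len(data)


if __name__ == "__main__":
    # Hypothetical (predictor value, presence label) pairs.
    data = [
        (1.0, "absent"), (1.9, "absent"), (3.0, "present"),
        (4.2, "absent"), (5.0, "present"), (5.7, "present"),
    ]
    print("resubstitution:", resubstitution_accuracy(data))
    print("leave-one-out :", loo_cv_accuracy(data))
```

    On these toy values the resubstitution estimate is perfect while the leave-one-out estimate is lower, which mirrors the abstract's argument that resubstitution rates overstate predictive accuracy.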

  4. The minimum distance approach to classification

    NASA Technical Reports Server (NTRS)

    Wacker, A. G.; Landgrebe, D. A.

    1971-01-01

    The work to advance the state-of-the-art of minimum distance classification is reported. This is accomplished through a combination of theoretical and comprehensive experimental investigations based on multispectral scanner data. A survey of the literature for suitable distance measures was conducted and the results of this survey are presented. It is shown that minimum distance classification, using density estimators and Kullback-Leibler numbers as the distance measure, is equivalent to a form of maximum likelihood sample classification. It is also shown that for the parametric case, minimum distance classification is equivalent to nearest neighbor classification in the parameter space.
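    As a rough illustration of the minimum distance idea, the sketch below assigns a measurement vector to the class whose mean vector is nearest. Euclidean distance to class means is an assumption chosen for simplicity; it is not the Kullback-Leibler measure between estimated densities that the report analyses, and the class names and values are hypothetical.

```python
import math


def class_means(training):
    """Compute the mean vector of each class.

    `training` maps a class label to a list of equal-length feature vectors.
    """
    means = {}
    for label, vectors in training.items():
        n = len(vectors)
        means[label] = [sum(column) / n for column in zip(*vectors)]
    return means


def classify_min_distance(x, means):
    """Assign x to the class whose mean is closest in Euclidean distance."""
    return min(means, key=lambda label: math.dist(x, means[label]))


if __name__ == "__main__":
    # Toy two-band training samples (illustrative values only).
    training = {
        "water": [[0.10, 0.05], [0.12, 0.07]],
        "vegetation": [[0.05, 0.50], [0.07, 0.60]],
    }
    means = class_means(training)
    print(classify_min_distance([0.10, 0.10], means))
```

    Classifying by nearest class mean is a nearest neighbor rule in the space of class parameters, which gives an informal picture of the parametric equivalence stated at the end of the abstract.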

  5. Stroke damage detection using classification trees on electrical bioimpedance cerebral spectroscopy measurements.

    PubMed

    Atefi, Seyed Reza; Seoane, Fernando; Thorlin, Thorleif; Lindecrantz, Kaj

    2013-08-07

    After cancer and cardio-vascular disease, stroke is the third greatest cause of death worldwide. Given the limitations of the current imaging technologies used for stroke diagnosis, the need for portable non-invasive and less expensive diagnostic tools is crucial. Previous studies have suggested that electrical bioimpedance (EBI) measurements from the head might contain useful clinical information related to changes produced in the cerebral tissue after the onset of stroke. In this study, we recorded 720 EBI Spectroscopy (EBIS) measurements from two different head regions of 18 hemispheres of nine subjects. Three of these subjects had suffered a unilateral haemorrhagic stroke. A number of features based on structural and intrinsic frequency-dependent properties of the cerebral tissue were extracted. These features were then fed into a classification tree. The results show that a full classification of damaged and undamaged cerebral tissue was achieved after three hierarchical classification steps. Lastly, the performance of the classification tree was assessed using Leave-One-Out Cross Validation (LOO-CV). Despite the fact that the results of this study are limited to a small database, and the observations obtained must be verified further with a larger cohort of patients, these findings confirm that EBI measurements contain useful information for assessing the health of brain tissue after stroke and support the hypothesis that classification features based on Cole parameters, spectral information and the geometry of EBIS measurements are useful to differentiate between healthy and stroke damaged brain tissue.

  6. Computer-aided diagnosis of Alzheimer's disease using support vector machines and classification trees

    NASA Astrophysics Data System (ADS)

    Salas-Gonzalez, D.; Górriz, J. M.; Ramírez, J.; López, M.; Álvarez, I.; Segovia, F.; Chaves, R.; Puntonet, C. G.

    2010-05-01

    This paper presents a computer-aided diagnosis technique for improving the accuracy of early diagnosis of Alzheimer-type dementia. The proposed methodology is based on the selection of voxels which present Welch's t-test between both classes, normal and Alzheimer images, greater than a given threshold. The mean and standard deviation of intensity values are calculated for selected voxels. They are chosen as feature vectors for two different classifiers: support vector machines with linear kernel and classification trees. The proposed methodology reaches greater than 95% accuracy in the classification task.
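    The voxel selection step described in this abstract can be sketched as follows: compute Welch's t statistic per voxel between the two groups, keep voxels above a threshold, then summarise each image by the mean and standard deviation of the kept voxels. Treating images as flat lists, thresholding the absolute value of t, and all names and toy intensities are assumptions made for illustration.

```python
import math
from statistics import mean, stdev, variance


def welch_t(a, b):
    """Welch's t statistic for two samples with possibly unequal variances."""
    denom = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    if denom == 0:
        return 0.0  # Assumed convention: no spread means no usable evidence.
    return (mean(a) - mean(b)) / denom


def select_voxels(normal, alzheimer, threshold):
    """Return indices of voxels whose |t| exceeds the threshold.

    Each group is a list of images; each image is a flat list of intensities.
    """
    selected = []
    for v in range(len(normal[0])):
        a = [image[v] for image in normal]
        b = [image[v] for image in alzheimer]
        if abs(welch_t(a, b)) > threshold:
            selected.append(v)
    return selected


def image_features(image, selected):
    """Mean and standard deviation of intensity over the selected voxels."""
    if len(selected) < 2:
        raise ValueError("need at least two selected voxels")
    values = [image[v] for v in selected]
    return mean(values), stdev(values)


if __name__ == "__main__":
    # Toy three-voxel images (illustrative values only).
    normal = [[10.0, 5.0, 8.0], [11.0, 5.2, 8.1], [9.0, 4.9, 7.9]]
    alzheimer = [[6.0, 5.1, 4.0], [7.0, 5.0, 4.2], [5.0, 4.8, 3.9]]
    chosen = select_voxels(normal, alzheimer, threshold=3.0)
    print(chosen, image_features(normal[0], chosen))
```

    The resulting two-element feature vector per image is what would be passed to a classifier such as a linear SVM or a classification tree.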

  7. Automatic lung nodule classification with radiomics approach

    NASA Astrophysics Data System (ADS)

    Ma, Jingchen; Wang, Qian; Ren, Yacheng; Hu, Haibo; Zhao, Jun

    2016-03-01

    Lung cancer is the leading cause of cancer death. Malignant lung nodules have extremely high mortality, while some benign nodules do not need any treatment. Thus, accurate discrimination between benign and malignant nodules is necessary. Although an additional invasive biopsy or a second CT scan 3 months later may currently help radiologists make judgments, easier diagnostic approaches are urgently needed. In this paper, we propose a novel CAD method to distinguish benign from malignant lung nodules directly from CT images, which can not only improve the efficiency of tumor diagnosis but also greatly decrease the pain and risk to patients in the biopsy collection process. Briefly, following the state-of-the-art radiomics approach, 583 features were used in the first step to measure nodules' intensity, shape, heterogeneity and multi-frequency information. Then, with the Random Forest method, we distinguish benign nodules from malignant nodules by analyzing all these features. Notably, our proposed scheme was tested on all 79 CT scans with diagnosis data available in The Cancer Imaging Archive (TCIA), which contain 127 nodules, each annotated by at least one of four radiologists participating in the project. This method achieved 82.7% accuracy in classifying malignant primary lung nodules and benign nodules. We believe it would bring much value for routine lung cancer diagnosis in CT imaging and provide improved decision support at much lower cost.

  8. The PhyloFacts FAT-CAT web server: ortholog identification and function prediction using fast approximate tree classification.

    PubMed

    Afrasiabi, Cyrus; Samad, Bushra; Dineen, David; Meacham, Christopher; Sjölander, Kimmen

    2013-07-01

    The PhyloFacts 'Fast Approximate Tree Classification' (FAT-CAT) web server provides a novel approach to ortholog identification using subtree hidden Markov model-based placement of protein sequences to phylogenomic orthology groups in the PhyloFacts database. Results on a data set of microbial, plant and animal proteins demonstrate FAT-CAT's high precision at separating orthologs and paralogs and robustness to promiscuous domains. We also present results documenting the precision of ortholog identification based on subtree hidden Markov model scoring. The FAT-CAT phylogenetic placement is used to derive a functional annotation for the query, including confidence scores and drill-down capabilities. PhyloFacts' broad taxonomic and functional coverage, with >7.3 M proteins from across the Tree of Life, enables FAT-CAT to predict orthologs and assign function for most sequence inputs. Four pipeline parameter presets are provided to handle different sequence types, including partial sequences and proteins containing promiscuous domains; users can also modify individual parameters. PhyloFacts trees matching the query can be viewed interactively online using the PhyloScope Javascript tree viewer and are hyperlinked to various external databases. The FAT-CAT web server is available at http://phylogenomics.berkeley.edu/phylofacts/fatcat/.

  9. Classification and regression trees for epidemiologic research: an air pollution example

    PubMed Central

    2014-01-01

    Background: Identifying and characterizing how mixtures of exposures are associated with health endpoints is challenging. We demonstrate how classification and regression trees can be used to generate hypotheses regarding joint effects from exposure mixtures. Methods: We illustrate the approach by investigating the joint effects of CO, NO2, O3, and PM2.5 on emergency department visits for pediatric asthma in Atlanta, Georgia. Pollutant concentrations were categorized as quartiles. Days when all pollutants were in the lowest quartile were held out as the referent group (n = 131) and the remaining 3,879 days were used to estimate the regression tree. Pollutants were parameterized as dichotomous variables representing each ordinal split of the quartiles (e.g. comparing CO quartile 1 vs. CO quartiles 2–4) and considered one at a time in a Poisson case-crossover model with control for confounding. The pollutant-split resulting in the smallest P-value was selected as the first split and the dataset was partitioned accordingly. This process repeated for each subset of the data until the P-values for the remaining splits were not below a given alpha, resulting in the formation of a “terminal node”. We used the case-crossover model to estimate the adjusted risk ratio for each terminal node compared to the referent group, as well as the likelihood ratio test for the inclusion of the terminal nodes in the final model. Results: The largest risk ratio corresponded to days when PM2.5 was in the highest quartile and NO2 was in the lowest two quartiles (RR: 1.10, 95% CI: 1.05, 1.16). A simultaneous Wald test for the inclusion of all terminal nodes in the model was significant, with a chi-square statistic of 34.3 (p = 0.001, with 13 degrees of freedom). Conclusions: Regression trees can be used to hypothesize about joint effects of exposure mixtures and may be particularly useful in the field of air pollution epidemiology for gaining a better understanding of complex

  10. A comparison of non-symmetric entropy-based classification trees and support vector machine for cardiovascular risk stratification.

    PubMed

    Singh, Anima; Guttag, John V

    2011-01-01

    Classification tree-based risk stratification models generate easily interpretable classification rules. This feature makes classification tree-based models appealing for use in a clinical setting, provided that they have comparable accuracy to other methods. In this paper, we present and evaluate the performance of a non-symmetric entropy-based classification tree algorithm. The algorithm is designed to accommodate class imbalance found in many medical datasets. We evaluate the performance of this algorithm, and compare it to that of SVM-based classifiers, when applied to 4219 non-ST elevation acute coronary syndrome patients. We generated SVM-based classifiers using three different strategies for handling class imbalance: cost-sensitive SVM learning, synthetic minority oversampling (SMOTE), and random majority undersampling. We used both linear and radial basis kernel-based SVMs. Our classification tree models outperformed SVM-based classifiers generated using each of the three techniques. On average, the classification tree models yielded a 14% improvement in G-score and a 21% improvement in F-score relative to the linear SVM classifiers with the best performance. Similarly, our classification tree models yielded a 12% improvement in G-score and a 21% improvement in the F-score over the best RBF kernel-based SVM classifiers.

  11. Classification of the cycle of the seminiferous epithelium in the common tree shrew (Tupaia glis).

    PubMed

    Maeda, S; Endo, H; Kimura, J; Rerkamnuaychoke, W; Chungsamarnyart, N; Yamada, J; Kurohmaru, M; Hayashi, Y; Nishida, T

    1996-05-01

    The classification of the cycle of the seminiferous epithelium was carried out in the common tree shrew (Tupaia glis). Tree shrews captured in Thailand were fixed with Bouin's fixative, embedded in paraffin wax, and stained with PAS-hematoxylin. The cycle was classified into twelve stages on the basis of the acrosomal changes of spermatids. Relative frequencies of stages from I to XII were 11.9, 7.2, 8.9, 22.5, 12.9, 9.7, 8.0, 5.9, 4.0, 3.2, 2.9, and 3.6%, respectively. Different stages did not appear in a cross-sectioned tubule as they did in primates. The head of the matured spermatid was discoidal in shape and different from that of primates and rodents. Spermatogenesis of the common tree shrew is different from that of primates and rodents according to its morphological features.

  12. A systematic approach to the classification of diseases.

    PubMed

    Murthy, A R

    1993-01-01

    Ayurvedic texts have adopted multiple approaches to the classification of diseases. Caraka, while choosing a binary classification in Vimana sthana, declares that the classifications may be numerable and innumerable, based on the criteria chosen for such classification. He gives full liberty to the individual to devise newer classifications, provided the criteria are different. Taking a cue from this statement, an attempt has been made to categorize the diseases mentioned in Ayurvedic texts under different systems, in keeping with current practice in the Western medical sciences.

  13. A SYSTEMATIC APPROACH TO THE CLASSIFICATION OF DISEASES

    PubMed Central

    Murthy, A.R.V.

    1993-01-01

    Ayurvedic texts have adopted multiple approaches to the classification of diseases. Caraka, while choosing a binary classification in Vimana sthana, declares that the classifications may be numerable and innumerable, based on the criteria chosen for such classification. He gives full liberty to the individual to devise newer classifications, provided the criteria are different. Taking a cue from this statement, an attempt has been made to categorize the diseases mentioned in Ayurvedic texts under different systems, in keeping with current practice in the Western medical sciences. PMID:22556612

  14. Classification tree and minimum-volume ellipsoid analyses of the distribution of ponderosa pine in the western USA

    USGS Publications Warehouse

    Norris, Jodi R.; Jackson, Stephen T.; Betancourt, Julio L.

    2006-01-01

    Aim: Ponderosa pine (Pinus ponderosa Douglas ex Lawson & C. Lawson) is an economically and ecologically important conifer that has a wide geographic range in the western USA, but is mostly absent from the geographic centre of its distribution - the Great Basin and adjoining mountain ranges. Much of its modern range was achieved by migration of geographically distinct Sierra Nevada (P. ponderosa var. ponderosa) and Rocky Mountain (P. ponderosa var. scopulorum) varieties in the last 10,000 years. Previous research has confirmed genetic differences between the two varieties, and measurable genetic exchange occurs where their ranges now overlap in western Montana. A variety of approaches in bioclimatic modelling is required to explore the ecological differences between these varieties and their implications for historical biogeography and impending changes in western landscapes. Location: Western USA. Methods: We used a classification tree analysis and a minimum-volume ellipsoid as models to explain the broad patterns of distribution of ponderosa pine in modern environments using climatic and edaphic variables. Most biogeographical modelling assumes that the target group represents a single, ecologically uniform taxonomic population. Classification tree analysis does not require this assumption because it allows the creation of pathways that predict multiple positive and negative outcomes. Thus, classification tree analysis can be used to test the ecological uniformity of the species. In addition, a multidimensional ellipsoid was constructed to describe the niche of each variety of ponderosa pine, and distances from the niche were calculated and mapped on a 4-km grid for each ecological variable. Results: The resulting classification tree identified three dominant pathways predicting ponderosa pine presence. Two of these three pathways correspond roughly to the distribution of var. ponderosa, and the third pathway generally corresponds to the distribution of var

  15. Classification of the PALMS single particle mass spectral data from Atlanta by regression tree analysis

    NASA Astrophysics Data System (ADS)

    Middlebrook, A. M.; Murphy, D. M.; Lee, S.; Thomson, D. S.

    2001-12-01

    During the Atlanta Supersites project in August 1999, the PALMS (Particle Analysis by Laser Mass Spectrometry) instrument collected over 500,000 individual particle spectra. The Atlanta data were originally analyzed by examining combinations of peaks and relative peak areas [Lee et al., 2001a,b], and a wide range of particle components such as sulfate, nitrate, mineral species, metals, organic species, and elemental carbon were detected. To further study the dataset, a classification program using regression tree analysis was developed and applied. Spectral data were compressed into a lower resolution spectrum (every 0.25 mass units) of the raw data and a list of peak areas (every mass unit). Each spectrum started as a normalized classification vector by itself. If the dot product of two classification vectors was within a certain threshold, they were combined into a new classification. The new classification vector was a normalized running average of the classifications being combined. In subsequent steps, the threshold for combining classifications was continuously lowered until a reasonable number of classifications remained. After the final iteration, each spectrum was compared individually with the entire set of classification vectors. Classifications were also combined manually. The classification results from the Atlanta data are generally consistent with those determined by peak identification. However, the classification program identified specific patterns in the mass spectra that were not found by peak identification and generated new particle types. Furthermore, rare particle types that may affect human health were studied in more detail. A description of the classification program as well as the results for the Atlanta data will be presented. Lee, S.-H., D. M. Murphy, D. S. Thomson, and A. M. Middlebrook, Chemical components of single particles measured with particle analysis by laser mass spectrometry (PALMS) during the Atlanta Supersites Project
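
    The merging procedure described in this abstract can be sketched roughly as below. This is a minimal illustration, not the authors' program: it assumes that "within a certain threshold" means a dot product at or above the threshold, it merges greedily in input order, and the function names and toy spectra are hypothetical.

```python
import math


def normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def merge_pass(clusters, threshold):
    """Fold each (vector, count) cluster into the first kept cluster whose
    dot product with it is at or above the threshold (assumed criterion)."""
    merged = []
    for vec, count in clusters:
        for i, (mvec, mcount) in enumerate(merged):
            if dot(vec, mvec) >= threshold:
                total = mcount + count
                # Normalized running average of the combined classifications.
                avg = [(m * mcount + v * count) / total for m, v in zip(mvec, vec)]
                merged[i] = (normalize(avg), total)
                break
        else:
            merged.append((vec, count))
    return merged


def classify(spectra, thresholds):
    """Start with one classification per spectrum, then merge with lowering thresholds."""
    clusters = [(normalize(s), 1) for s in spectra]
    for threshold in thresholds:
        clusters = merge_pass(clusters, threshold)
    return clusters


# Toy three-channel "spectra" (hypothetical values).
spectra = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.9, 0.1],
    [0.0, 0.0, 1.0],
]
clusters = classify(spectra, thresholds=[0.99, 0.95, 0.90])
```

    With these toy inputs the five spectra collapse into three classifications, two of which contain two spectra each.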

  16. Parameter optimization of image classification techniques to delineate crowns of coppice trees on UltraCam-D aerial imagery in woodlands

    NASA Astrophysics Data System (ADS)

    Erfanifard, Yousef; Stereńczak, Krzysztof; Behnia, Negin

    2014-01-01

    Estimating the optimal parameters of some classification techniques is a drawback of those techniques, because the parameter choice affects their performance on a given dataset and can reduce classification accuracy. This study aimed to optimize the combination of effective parameters of support vector machine (SVM), artificial neural network (ANN), and object-based image analysis (OBIA) classification techniques using the Taguchi method. The optimized techniques were applied to delineate crowns of Persian oak coppice trees on UltraCam-D very high spatial resolution aerial imagery in Zagros semiarid woodlands, Iran. The imagery was classified and the maps were assessed by the receiver operating characteristic curve and other performance metrics. The results showed that Taguchi is a robust approach to optimize the combination of effective parameters in these image classification techniques. The area under the curve (AUC) showed that the optimized OBIA could discriminate tree crowns well on the imagery (AUC=0.897), while SVM and ANN yielded slightly lower AUC values of 0.819 and 0.850, respectively. The indices of accuracy (0.999) and precision (0.999) and the performance metrics of specificity (0.999) and sensitivity (0.999) in the optimized OBIA were higher than with the other techniques. The optimization of effective parameters of image classification techniques by the Taguchi method thus provided encouraging results for discriminating the crowns of Persian oak coppice trees on UltraCam-D aerial imagery in Zagros semiarid woodlands.

  17. Classification of Tree Species in Overstorey Canopy of Subtropical Forest Using QuickBird Images

    PubMed Central

    Lin, Chinsu; Popescu, Sorin C.; Thomson, Gavin; Tsogt, Khongor; Chang, Chein-I

    2015-01-01

    This paper proposes a supervised classification scheme to identify 40 tree species (2 coniferous, 38 broadleaf) belonging to 22 families and 36 genera in high spatial resolution QuickBird multispectral images (HMS). Overall kappa coefficient (OKC) and species conditional kappa coefficients (SCKC) were used to evaluate classification performance in training samples and estimate accuracy and uncertainty in test samples. Baseline classification performance using HMS images and vegetation index (VI) images was evaluated with OKC values of 0.58 and 0.48, respectively, but performance improved significantly (up to 0.99) when used in combination with an HMS spectral-spatial texture image (SpecTex). One of the 40 species had very high conditional kappa coefficient performance (SCKC ≥ 0.95) using 4-band HMS and 5-band VI images, but only five species had lower performance (0.68 ≤ SCKC ≤ 0.94) using the SpecTex images. When SpecTex images were combined with a Visible Atmospherically Resistant Index (VARI), there was a significant improvement in performance in the training samples. The same level of improvement could not be replicated in the test samples, indicating that a high degree of uncertainty exists in species classification accuracy, which may be due to individual tree crown density, leaf greenness (inter-canopy gaps), and noise in the background environment (intra-canopy gaps). These factors increase uncertainty in the spectral texture features and therefore represent potential problems when using pixel-based classification techniques for multi-species classification. PMID:25978466
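
    For readers unfamiliar with the kappa statistics named in this abstract, the sketch below computes an overall kappa and per-class conditional kappa values from a confusion matrix using the standard textbook formulas. It is an assumption that the paper's OKC and SCKC follow these exact definitions; the matrix layout (rows are classified labels, columns are reference labels) and the toy counts are hypothetical.

```python
def overall_kappa(cm):
    """Cohen's kappa for a square confusion matrix given as a list of rows."""
    k = len(cm)
    n = sum(sum(row) for row in cm)
    row_tot = [sum(cm[i]) for i in range(k)]
    col_tot = [sum(cm[i][j] for i in range(k)) for j in range(k)]
    observed = sum(cm[i][i] for i in range(k)) / n
    expected = sum(row_tot[i] * col_tot[i] for i in range(k)) / (n * n)
    return (observed - expected) / (1 - expected)


def conditional_kappa(cm, i):
    """Conditional kappa for class i, conditioned on row i (assumed: classified labels)."""
    k = len(cm)
    n = sum(sum(row) for row in cm)
    row_i = sum(cm[i])
    col_i = sum(cm[r][i] for r in range(k))
    return (n * cm[i][i] - row_i * col_i) / (n * row_i - row_i * col_i)


# Toy two-class confusion matrix (hypothetical counts).
cm = [[45, 5],
      [10, 40]]
okc = overall_kappa(cm)
sckc = [conditional_kappa(cm, i) for i in range(len(cm))]
```

    For this toy matrix the observed agreement is 0.85 and the chance agreement is 0.5, which gives an overall kappa of 0.7.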

  18. A practicable approach for periodontal classification

    PubMed Central

    Mittal, Vishnu; Bhullar, Raman Preet K.; Bansal, Rachita; Singh, Karanprakash; Bhalodi, Anand; Khinda, Paramjit K.

    2013-01-01

    The diagnosis and classification of periodontal diseases has long remained a dilemma. Two distinct concepts have been used to define diseases: essentialism and nominalism. The essentialistic concept implies the real existence of disease, whereas the nominalistic concept states that the names of diseases are a convenient way of stating concisely the endpoint of a diagnostic process. It generally advances from assessment of symptoms and signs toward knowledge of causation and gives a feasible option to name a disease whose etiology is either unknown or too complex to assess in routine clinical practice. Various classifications were proposed by the American Academy of Periodontology (AAP) in 1986, 1989 and 1999. The AAP 1999 classification is among the most widely used, but it also has demerits that impede its use in day-to-day practice. Hence a classification and diagnostic system is required that can help the clinician assess the patient's needs and provide suitable treatment in harmony with the diagnosis for that particular case. Here we attempt to propose a practicable classification and diagnostic system of periodontal diseases for better treatment outcomes. PMID:24379855

  19. Stratification of the severity of critically ill patients with classification trees

    PubMed Central

    2009-01-01

    Background Development of three classification trees (CT) based on the CART (Classification and Regression Trees), CHAID (Chi-Square Automatic Interaction Detection) and C4.5 methodologies for the calculation of probability of hospital mortality; the comparison of the results with the APACHE II, SAPS II and MPM II-24 scores, and with a model based on multiple logistic regression (LR). Methods Retrospective study of 2864 patients. Random partition (70:30) into a Development Set (DS) n = 1808 and Validation Set (VS) n = 808. Their properties of discrimination are compared with the ROC curve (AUC CI 95%), Percent of correct classification (PCC CI 95%); and the calibration with the Calibration Curve and the Standardized Mortality Ratio (SMR CI 95%). Results CTs are produced with a different selection of variables and decision rules: CART (5 variables and 8 decision rules), CHAID (7 variables and 15 rules) and C4.5 (6 variables and 10 rules). The common variables were: inotropic therapy, Glasgow, age, (A-a)O2 gradient and antecedent of chronic illness. In VS: all the models achieved acceptable discrimination with AUC above 0.7. CT: CART (0.75(0.71-0.81)), CHAID (0.76(0.72-0.79)) and C4.5 (0.76(0.73-0.80)). PCC: CART (72(69-75)), CHAID (72(69-75)) and C4.5 (76(73-79)). Calibration (SMR) better in the CT: CART (1.04(0.95-1.31)), CHAID (1.06(0.97-1.15)) and C4.5 (1.08(0.98-1.16)). Conclusion With different methodologies of CTs, trees are generated with different selection of variables and decision rules. The CTs are easy to interpret, and they stratify the risk of hospital mortality. The CTs should be taken into account for the classification of the prognosis of critically ill patients. PMID:20003229
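
    Two of the evaluation measures named in this abstract, the area under the ROC curve (discrimination) and the standardized mortality ratio (calibration), can be computed from predicted mortality probabilities as sketched below. This is a generic illustration using the usual definitions, not the study's code; the function names and the six toy patients are hypothetical, and confidence intervals are omitted.

```python
def auc(probs, died):
    """Area under the ROC curve as the fraction of (death, survivor) pairs in
    which the death received the higher predicted probability (ties count 0.5)."""
    pos = [p for p, d in zip(probs, died) if d]
    neg = [p for p, d in zip(probs, died) if not d]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def smr(probs, died):
    """Standardized mortality ratio: observed deaths over expected deaths,
    where expected deaths is the sum of predicted probabilities."""
    return sum(died) / sum(probs)


# Toy validation set (hypothetical): predicted probability and outcome (1 = died).
probs = [0.1, 0.2, 0.3, 0.6, 0.8, 0.4]
died = [0, 0, 1, 1, 1, 0]
discrimination = auc(probs, died)
calibration = smr(probs, died)
```

    An SMR close to 1 means the model predicts about as many deaths as were observed; in the toy data three deaths are observed against 2.4 expected.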

  20. Comparing Methodologies for Developing an Early Warning System: Classification and Regression Tree Model versus Logistic Regression. REL 2015-077

    ERIC Educational Resources Information Center

    Koon, Sharon; Petscher, Yaacov

    2015-01-01

    The purpose of this report was to explicate the use of logistic regression and classification and regression tree (CART) analysis in the development of early warning systems. It was motivated by state education leaders' interest in maintaining high classification accuracy while simultaneously improving practitioner understanding of the rules by…

  1. A discrete element modelling approach for block impacts on trees

    NASA Astrophysics Data System (ADS)

    Toe, David; Bourrier, Franck; Olmedo, Ignatio; Berger, Frederic

    2015-04-01

    In the past few years, rockfall models explicitly accounting for block shape, especially those using the Discrete Element Method (DEM), have shown a good ability to predict rockfall trajectories. Integrating forest effects into those models still remains challenging. This study aims at using a DEM approach to model impacts of blocks on trees and identify the key parameters controlling the block kinematics after the impact on a tree. A DEM impact model of a block on a tree was developed and validated using laboratory experiments. Then, key parameters were assessed using a global sensitivity analysis. Modelling the impact of a block on a tree using DEM allows taking into account large displacements, material non-linearities and contacts between the block and the tree. Tree stems are represented by flexible cylinders modelled as plastic beams sustaining normal, shearing, bending, and twisting loading. Root-soil interactions are modelled using a rotation stiffness acting on the bending moment at the bottom of the tree and a limit bending moment to account for tree overturning. The crown is taken into account using an additional mass distributed uniformly on the upper part of the tree. The block is represented by a sphere. The contact model between the block and the stem consists of an elastic frictional model. The DEM model was validated using laboratory impact tests carried out on 41 fresh beech (Fagus sylvatica) stems. Each stem was 1.3 m long with a diameter between 3 and 7 cm. Wood stems were clamped on a rigid structure and impacted by a 149 kg Charpy pendulum. Finally, an intensive simulation campaign of blocks impacting trees was done to identify the input parameters controlling the block kinematics after the impact on a tree. 20 input parameters were considered in the DEM simulation model: 12 parameters were related to the tree and 8 parameters to the block. The results highlight that the impact velocity, the stem diameter, and the block volume are the three input

  2. Automated Diagnosis of Heart Sounds Using Rule-Based Classification Tree.

    PubMed

    Karar, Mohamed Esmail; El-Khafif, Sahar H; El-Brawany, Mohamed A

    2017-04-01

    In order to assist the diagnosis procedure of heart sound signals, this paper presents a new automated method for classifying the heart status using a rule-based classification tree into normal and three abnormal cases, namely aortic valve stenosis, aortic insufficiency, and ventricular septal defect. The developed method includes three main steps as follows. First, one cycle of the heart sound signals is automatically detected and segmented based on time properties of the heart signals. Second, the segmented cycle is preprocessed with the discrete wavelet transform and then the largest Lyapunov exponents are calculated to generate the dynamical features of the heart sound time series. Finally, a rule-based classification tree is fed these Lyapunov exponents to give the final decision on the heart health status. The developed method has been tested successfully on twenty-two datasets of normal heart sounds and murmurs with a success rate of 95.5%. The resulting error can be easily corrected by modifying the classification rules; consequently, the accuracy of automated heart sound diagnosis is further improved.

  3. The Reliability of Classification of Terminal Nodes in GUIDE Decision Tree to Predict the Nonalcoholic Fatty Liver Disease.

    PubMed

    Birjandi, Mehdi; Ayatollahi, Seyyed Mohammad Taghi; Pourahmad, Saeedeh

    2016-01-01

    Tree-structured modeling is a data mining technique used to recursively partition a dataset into relatively homogeneous subgroups in order to make more accurate predictions on generated classes. One of the classification tree induction algorithms, GUIDE, is a nonparametric method with suitable accuracy and low selection bias, which is used for predicting binary classes based on many predictors. In this tree, evaluating the accuracy of predicted classes (terminal nodes) is clinically of special importance. For this purpose, we used the GUIDE classification tree under both equal and unequal misclassification costs to predict nonalcoholic fatty liver disease (NAFLD), considering 30 predictors. Then, to evaluate the accuracy of the predicted classes using the bootstrap method, we considered first the classification reliability, in which individuals are assigned to a unique class, and next the prediction probability reliability as support for it.

  4. The Reliability of Classification of Terminal Nodes in GUIDE Decision Tree to Predict the Nonalcoholic Fatty Liver Disease

    PubMed Central

    Pourahmad, Saeedeh

    2016-01-01

    Tree-structured modeling is a data mining technique used to recursively partition a dataset into relatively homogeneous subgroups in order to make more accurate predictions on generated classes. One of the classification tree induction algorithms, GUIDE, is a nonparametric method with suitable accuracy and low selection bias, which is used for predicting binary classes based on many predictors. In this tree, evaluating the accuracy of predicted classes (terminal nodes) is clinically of special importance. For this purpose, we used the GUIDE classification tree under both equal and unequal misclassification costs to predict nonalcoholic fatty liver disease (NAFLD), considering 30 predictors. Then, to evaluate the accuracy of the predicted classes using the bootstrap method, we considered first the classification reliability, in which individuals are assigned to a unique class, and next the prediction probability reliability as support for it. PMID:28053651

  5. Classification

    ERIC Educational Resources Information Center

    Clary, Renee; Wandersee, James

    2013-01-01

    In this article, Renee Clary and James Wandersee describe the beginnings of "Classification," which lies at the very heart of science and depends upon pattern recognition. Clary and Wandersee approach patterns by first telling the story of the "Linnaean classification system," introduced by Carl Linnaeus (1707-1778), who is…

  6. A phylogenomic approach to bacterial subspecies classification: proof of concept in Mycobacterium abscessus

    PubMed Central

    2013-01-01

    Background Mycobacterium abscessus is a rapidly growing mycobacterium that is often associated with human infections. The taxonomy of this species has undergone several revisions and is still being debated. In this study, we sequenced the genomes of 12 M. abscessus strains and used phylogenomic analysis to perform subspecies classification. Results A data mining approach was used to rank and select informative genes based on the relative entropy metric for the construction of a phylogenetic tree. The resulting tree topology was similar to that generated using the concatenation of five classical housekeeping genes: rpoB, hsp65, secA, recA and sodA. Additional support for the reliability of the subspecies classification came from the analysis of erm41 and ITS gene sequences, single nucleotide polymorphisms (SNPs)-based classification and strain clustering demonstrated by a variable number tandem repeat (VNTR) assay and a multilocus sequence analysis (MLSA). We subsequently found that the concatenation of a minimal set of three median-ranked genes: DNA polymerase III subunit alpha (polC), 4-hydroxy-2-ketovalerate aldolase (Hoa) and cell division protein FtsZ (ftsZ), is sufficient to recover the same tree topology. PCR assays designed specifically for these genes showed that all three genes could be amplified in the reference strain of M. abscessus ATCC 19977T. Conclusion This study provides proof of concept that whole-genome sequence-based data mining approach can provide confirmatory evidence of the phylogenetic informativeness of existing markers, as well as lead to the discovery of a more economical and informative set of markers that produces similar subspecies classification in M. abscessus. The systematic procedure used in this study to choose the informative minimal set of gene markers can potentially be applied to species or subspecies classification of other bacteria. PMID:24330254

  7. A novel modulation classification approach using Gabor filter network.

    PubMed

    Ghauri, Sajjad Ahmed; Qureshi, Ijaz Mansoor; Cheema, Tanveer Ahmed; Malik, Aqdas Naveed

    2014-01-01

    A Gabor filter network based approach is used for feature extraction and classification of digitally modulated signals by adaptively tuning the parameters of the Gabor filter network. Modulation classification of digitally modulated signals is done under the influence of additive white Gaussian noise (AWGN). The modulations considered for the classification purpose are PSK 2 to 64, FSK 2 to 64, and QAM 4 to 64. The Gabor filter network uses a two-layer network structure: the first layer, which is the input layer, constitutes the adaptive feature extraction part, and the second layer constitutes the signal classification part. The Gabor atom parameters are tuned using the delta rule, and the weights of the Gabor filter are updated using the least mean square (LMS) algorithm. The simulation results show that the proposed novel modulation classification algorithm has high classification accuracy at low signal-to-noise ratio (SNR) on an AWGN channel.

  8. A Novel Modulation Classification Approach Using Gabor Filter Network

    PubMed Central

    Ghauri, Sajjad Ahmed; Qureshi, Ijaz Mansoor; Cheema, Tanveer Ahmed; Malik, Aqdas Naveed

    2014-01-01

    A Gabor filter network based approach is used for feature extraction and classification of digitally modulated signals by adaptively tuning the parameters of the Gabor filter network. Modulation classification of digitally modulated signals is done under the influence of additive white Gaussian noise (AWGN). The modulations considered for the classification purpose are PSK 2 to 64, FSK 2 to 64, and QAM 4 to 64. The Gabor filter network uses a two-layer network structure: the first layer, which is the input layer, constitutes the adaptive feature extraction part, and the second layer constitutes the signal classification part. The Gabor atom parameters are tuned using the delta rule, and the weights of the Gabor filter are updated using the least mean square (LMS) algorithm. The simulation results show that the proposed novel modulation classification algorithm has high classification accuracy at low signal-to-noise ratio (SNR) on an AWGN channel. PMID:25126603
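
    The LMS weight update mentioned in this abstract can be illustrated on a plain linear layer as below. This sketch shows only the generic LMS rule (weights move along the input, scaled by the output error); it does not reproduce the Gabor filter network itself, and the function name, step size, and toy data are assumptions made for illustration.

```python
def lms_fit(samples, targets, step=0.1, epochs=200):
    """Fit linear weights with the least mean square rule:
    w <- w + step * (desired - output) * x, applied sample by sample."""
    weights = [0.0] * len(samples[0])
    for _ in range(epochs):
        for x, desired in zip(samples, targets):
            output = sum(w * xi for w, xi in zip(weights, x))
            error = desired - output
            weights = [w + step * error * xi for w, xi in zip(weights, x)]
    return weights


# Toy problem (hypothetical): targets generated by the weights [2, -1].
samples = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
targets = [2.0, -1.0, 1.0]
weights = lms_fit(samples, targets)
```

    Because the toy targets are exactly linear in the inputs, the learned weights approach the generating values of 2 and -1.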

  9. Internal Carbon Recycling in Trees - New Approach, Findings, and Implications

    NASA Astrophysics Data System (ADS)

    Angert, A.; Hilman, B.

    2012-12-01

    The CO2 emitted by respiration in a tree's woody tissue (stem, branch, or root) is usually assumed to diffuse directly out to the atmosphere. Given that the internal concentrations of CO2 are one to two orders of magnitude higher than the atmospheric concentration, a reuse of this respired carbon can be beneficial to plants. We have developed a new method to track the fraction of respired CO2 not emitted from stems and branches, from the ratio of the CO2 efflux to the O2 influx. This ratio, which we defined as the apparent respiratory quotient (ARQ), is expected to equal 1.0 if carbohydrates are the substrate for respiration, and all respired CO2 is directly emitted. Using this approach we have recently shown that ~30% of the CO2 respired by Amazon forest tree stems was not directly emitted. In the current study we have applied this approach to 5 tree species living in a Mediterranean climate, and have performed seasonal and diurnal ARQ measurements at different heights along the stem and branches. We found different seasonal variations in the ARQ of riparian versus drought-resilient trees. In addition, the ARQ diurnal cycle, together with the measurements at different heights, indicates that a considerable fraction of the CO2 not emitted is recycled within the tree.
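
    The arithmetic behind the ARQ can be made concrete with a short sketch. It assumes, as the abstract states for carbohydrate substrates, that respiration produces one CO2 per O2 consumed, so that the fraction of respired CO2 not directly emitted is 1 minus the ARQ; the function names, the `substrate_rq` parameter, and the flux values are hypothetical.

```python
def apparent_respiratory_quotient(co2_efflux, o2_influx):
    """ARQ: ratio of CO2 efflux to O2 influx, both in the same units."""
    if o2_influx <= 0:
        raise ValueError("O2 influx must be positive")
    return co2_efflux / o2_influx


def fraction_not_emitted(arq, substrate_rq=1.0):
    """Share of respired CO2 that did not leave the tissue directly,
    assuming the true respiratory quotient equals substrate_rq (1.0 for carbohydrates)."""
    return 1.0 - arq / substrate_rq


# Toy fluxes (hypothetical, arbitrary units): 0.7 units of CO2 out per unit of O2 in.
arq = apparent_respiratory_quotient(co2_efflux=0.7, o2_influx=1.0)
retained = fraction_not_emitted(arq)
```

    Under that assumption, an ARQ of about 0.7 corresponds to roughly 30% of the respired CO2 not being directly emitted, in line with the Amazon figure quoted in the abstract.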

  10. Eating Disorder Diagnoses: Empirical Approaches to Classification

    ERIC Educational Resources Information Center

    Wonderlich, Stephen A.; Joiner, Thomas E., Jr.; Keel, Pamela K.; Williamson, Donald A.; Crosby, Ross D.

    2007-01-01

    Decisions about the classification of eating disorders have significant scientific and clinical implications. The eating disorder diagnoses in the Diagnostic and Statistical Manual of Mental Disorders (4th ed.; DSM-IV; American Psychiatric Association, 1994) reflect the collective wisdom of experts in the field but are frequently not supported in…

  11. Non-Destructive Classification Approaches for Equilibrated Ordinary Chondrites

    NASA Astrophysics Data System (ADS)

    Righter, K.; Harrington, R.; Schroeder, C.; Morris, R. V.

    2013-09-01

    In order to compare a few non-destructive classification techniques with the standard approaches, we have characterized a group of chondrites from the Larkman Nunatak region using magnetic susceptibility and Mössbauer spectroscopy.

  12. Flow cytometry data analysis: comparing large multivariate data sets using classification trees

    SciTech Connect

    Norman, J.

    1994-12-31

    This paper describes a method to compare flow cytometry data sets, which typically contain 50,000 six-parameter measurements each. By this method, the data points in two such data sets are divided into subpopulations using a binary classification tree generated from the data. The χ² test is then used to establish the homogeneity of the two data sets based on how their data are distributed across these subpopulations. Preliminary results indicate that this comparison method is sufficiently sensitive to detect differences between flow cytometry data sets that are too subtle for human investigators to notice.
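
    The homogeneity test step in this abstract can be sketched as follows, taking as given the number of points from each data set that fall into each tree-defined subpopulation. This is a generic chi-square homogeneity statistic, not the paper's implementation; the function name and the leaf counts are hypothetical, and the construction of the classification tree is not shown.

```python
def chi_square_homogeneity(counts_a, counts_b):
    """Chi-square statistic and degrees of freedom for two data sets whose
    points are counted per subpopulation (one count per tree leaf)."""
    total_a, total_b = sum(counts_a), sum(counts_b)
    grand = total_a + total_b
    stat = 0.0
    used_leaves = 0
    for a, b in zip(counts_a, counts_b):
        leaf_total = a + b
        if leaf_total == 0:
            continue  # an empty leaf carries no information
        used_leaves += 1
        expected_a = leaf_total * total_a / grand
        expected_b = leaf_total * total_b / grand
        stat += (a - expected_a) ** 2 / expected_a
        stat += (b - expected_b) ** 2 / expected_b
    return stat, used_leaves - 1


# Toy leaf counts (hypothetical): 100 points per data set spread over three leaves.
stat, dof = chi_square_homogeneity([30, 20, 50], [20, 30, 50])
```

    The statistic is then compared with a chi-square distribution with the returned degrees of freedom; a large value suggests the two data sets are distributed differently across the subpopulations.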

  13. A cross-cultural investigation of college student alcohol consumption: a classification tree analysis.

    PubMed

    Kitsantas, Panagiota; Kitsantas, Anastasia; Anagnostopoulou, Tanya

    2008-01-01

    In this cross-cultural study, the authors attempted to identify high-risk subgroups for alcohol consumption among college students. American and Greek students (N = 132) answered questions about alcohol consumption, religious beliefs, attitudes toward drinking, advertisement influences, parental monitoring, and drinking consequences. Heavy drinkers in the American group were younger and less religious than were infrequent drinkers. In the Greek group, heavy drinkers tended to deny the negative results of drinking alcohol and use a permissive attitude to justify it, whereas infrequent drinkers were more likely to be monitored by their parents. These results suggest that parental monitoring and an emphasis on informing students about the negative effects of alcohol on their health and social and academic lives may be effective methods of reducing alcohol consumption. Classification tree analysis revealed that student attitudes toward drinking were important in the classification of American and Greek drinkers, indicating that this is a powerful predictor of alcohol consumption regardless of ethnic background.

  14. A statistical approach to set classification by feature selection with applications to classification of histopathology images.

    PubMed

    Jung, Sungkyu; Qiao, Xingye

    2014-09-01

    Set classification problems arise when classification tasks are based on sets of observations as opposed to individual observations. In set classification, a classification rule is trained with N sets of observations, where each set is labeled with class information, and the prediction of a class label is performed also with a set of observations. Data sets for set classification appear, for example, in diagnostics of disease based on multiple cell nucleus images from a single tissue. Relevant statistical models for set classification are introduced, which motivate a set classification framework based on context-free feature extraction. By understanding a set of observations as an empirical distribution, we employ a data-driven method to choose those features which contain information on location and major variation. In particular, the method of principal component analysis is used to extract the features of major variation. Multidimensional scaling is used to represent features as vector-valued points on which conventional classifiers can be applied. The proposed set classification approaches achieve better classification results than competing methods in a number of simulated data examples. The benefits of our method are demonstrated in an analysis of histopathology images of cell nuclei related to liver cancer.

  15. Classification of savanna tree species, in the Greater Kruger National Park region, by integrating hyperspectral and LiDAR data in a Random Forest data mining environment

    NASA Astrophysics Data System (ADS)

    Naidoo, L.; Cho, M. A.; Mathieu, R.; Asner, G.

    2012-04-01

    The accurate classification and mapping of individual trees at species level in the savanna ecosystem can provide numerous benefits for the managerial authorities. Such benefits include the mapping of economically useful tree species, which are a key source of food production and fuel wood for the local communities, and of problematic alien invasive and bush encroaching species, which can threaten the integrity of the environment and livelihoods of the local communities. Species level mapping is particularly challenging in African savannas which are complex, heterogeneous, and open environments with high intra-species spectral variability due to differences in geology, topography, rainfall, herbivory and human impacts within relatively short distances. Savanna vegetation is also highly irregular in canopy and crown shape, height and other structural dimensions with a combination of open grassland patches and dense woody thicket - a stark contrast to the more homogeneous forest vegetation. This study classified eight common savanna tree species in the Greater Kruger National Park region, South Africa, using a combination of hyperspectral and Light Detection and Ranging (LiDAR)-derived structural parameters, in the form of seven predictor datasets, in an automated Random Forest modelling approach. The most important predictors, which were found to play an important role in the different classification models and contributed to the success of the hybrid dataset model when combined, were species tree height; NDVI; the chlorophyll b wavelength (466 nm) and a selection of raw, continuum removed and Spectral Angle Mapper (SAM) bands. It was also concluded that the hybrid predictor dataset Random Forest model yielded the highest classification accuracy and prediction success for the eight savanna tree species with an overall classification accuracy of 87.68% and KHAT value of 0.843.
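
    The abstract above reports an overall accuracy together with a KHAT (kappa) value. As a minimal sketch of how those two figures relate, the following computes both from a confusion matrix. The function name and the 2x2 toy matrix are hypothetical and are not taken from the study; only the standard definition of Cohen's kappa is assumed.

```python
def overall_accuracy_and_kappa(confusion):
    """Return (overall accuracy, kappa) for a square confusion matrix.

    confusion[i][j] is the number of samples whose reference class is i
    and whose predicted class is j.
    """
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    # Observed agreement: share of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(k)) / n
    # Chance agreement: product of row and column marginals, summed over classes.
    row_tot = [sum(row) for row in confusion]
    col_tot = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    expected = sum(row_tot[i] * col_tot[i] for i in range(k)) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical toy matrix for two classes (not data from the study).
toy = [[45, 5],
       [10, 40]]
acc, khat = overall_accuracy_and_kappa(toy)
print(acc, khat)
```

    Kappa discounts the agreement expected by chance, which is why it sits below the overall accuracy whenever chance agreement is non-zero.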

  16. The Tree of Life and a New Classification of Bony Fishes

    PubMed Central

    Betancur-R., Ricardo; Broughton, Richard E.; Wiley, Edward O.; Carpenter, Kent; López, J. Andrés; Li, Chenhong; Holcroft, Nancy I.; Arcila, Dahiana; Sanciangco, Millicent; Cureton II, James C; Zhang, Feifei; Buser, Thaddaeus; Campbell, Matthew A.; Ballesteros, Jesus A; Roa-Varon, Adela; Willis, Stuart; Borden, W. Calvin; Rowley, Thaine; Reneau, Paulette C.; Hough, Daniel J.; Lu, Guoqing; Grande, Terry; Arratia, Gloria; Ortí, Guillermo

    2013-01-01

    The tree of life of fishes is in a state of flux because we still lack a comprehensive phylogeny that includes all major groups. The situation is most critical for a large clade of spiny-finned fishes, traditionally referred to as percomorphs, whose uncertain relationships have plagued ichthyologists for over a century. Most of what we know about the higher-level relationships among fish lineages has been based on morphology, but rapid influx of molecular studies is changing many established systematic concepts. We report a comprehensive molecular phylogeny for bony fishes that includes representatives of all major lineages. DNA sequence data for 21 molecular markers (one mitochondrial and 20 nuclear genes) were collected for 1410 bony fish taxa, plus four tetrapod species and two chondrichthyan outgroups (total 1416 terminals). Bony fish diversity is represented by 1093 genera, 369 families, and all traditionally recognized orders. The maximum likelihood tree provides unprecedented resolution and high bootstrap support for most backbone nodes, defining for the first time a global phylogeny of fishes. The general structure of the tree is in agreement with expectations from previous morphological and molecular studies, but significant new clades arise. Most interestingly, the high degree of uncertainty among percomorphs is now resolved into nine well-supported supraordinal groups. The order Perciformes, considered by many a polyphyletic taxonomic waste basket, is defined for the first time as a monophyletic group in the global phylogeny. A new classification that reflects our phylogenetic hypothesis is proposed to facilitate communication about the newly found structure of the tree of life of fishes. Finally, the molecular phylogeny is calibrated using 60 fossil constraints to produce a comprehensive time tree. The new time-calibrated phylogeny will provide the basis for and stimulate new comparative studies to better understand the evolution of the amazing

  17. Application of classification-tree methods to identify nitrate sources in ground water

    USGS Publications Warehouse

    Spruill, T.B.; Showers, W.J.; Howe, S.S.

    2002-01-01

    A study was conducted to determine if nitrate sources in ground water (fertilizer on crops, fertilizer on golf courses, irrigation spray from hog (Sus scrofa) wastes, and leachate from poultry litter and septic systems) could be classified with 80% or greater success. Two statistical classification-tree models were devised from 48 water samples containing nitrate from five source categories. Model I was constructed by evaluating 32 variables and selecting four primary predictor variables (δ15N, nitrate to ammonia ratio, sodium to potassium ratio, and zinc) to identify nitrate sources. A δ15N value of nitrate plus potassium 18.2 indicated inorganic or soil organic N. A nitrate to ammonia ratio 575 indicated nitrate from golf courses. A sodium to potassium ratio 3.2 indicated spray or poultry wastes. A value for zinc 2.8 indicated poultry wastes. Model 2 was devised by using all variables except δ15N. This model also included four variables (sodium plus potassium, nitrate to ammonia ratio, calcium to magnesium ratio, and sodium to potassium ratio) to distinguish categories. Both models were able to distinguish all five source categories with better than 80% overall success and with 71 to 100% success in individual categories using the learning samples. Seventeen water samples that were not used in model development were tested using Model 2 for three categories, and all were correctly classified. Classification-tree models show great potential in identifying sources of contamination and variables important in the source-identification process.

  18. Collaborative evaluation and management of students' health-related physical fitness: applications of cluster analysis and the classification tree.

    PubMed

    Chen, Jou-An; Shih, Chi-Chuan; Lin, Pay-Fan; Chen, Jin-Jong; Lin, Kuan-Chia

    2012-01-01

    Health-related physical fitness has decreased with age; this is of immense concern to adolescents. School-based health intervention programs can be classified as either population-wide or high-risk approach. Although the population-wide and risk-based approaches adopt different healthcare angles, they all need to focus resources on risk evaluation. In this paper, we describe an exploratory application of cluster analysis and the tree model to collaborative evaluation of students' health-related physical fitness from a high school sample in Taiwan (n=742). Cluster analysis shows that physical fitness can be divided into relatively good, moderate and poor subgroups. There are significant differences in biochemical measurements among these three groups. For the tree model, we used 2004 school-year students as an experimental group and 2005 school-year students as a validation group. The results indicate that if sit-and-reach is shorter than 33 cm, BMI is >25.46 kg/m2, and 1600 m run/walk is >534 s, the predicted probability for the number of metabolic risk factors ≥2 is 100% and the population is 41; both results are the highest. From the risk-based healthcare viewpoint, the cluster analysis can sort out students' physical fitness data in a short time and then narrow down the scope to recognize the subgroups. A classification tree model specifically shows the discrimination paths between the measurements of physical fitness for metabolic risk and would be helpful for self-management or proper healthcare education targeting different groups. Applying both methods to specific adolescents' health issues could provide different angles in planning health promotion projects.
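
    A single root-to-leaf path in a classification tree is just a conjunction of threshold tests. The sketch below encodes the one path quoted in the abstract above (sit-and-reach < 33 cm, BMI > 25.46 kg/m2, 1600 m run/walk > 534 s). The function and argument names are hypothetical, and the remaining branches of the authors' tree are not reproduced because the abstract does not give them.

```python
def on_high_risk_path(sit_and_reach_cm, bmi_kg_m2, run_walk_1600m_s):
    """True if a student falls on the high-risk path quoted in the abstract.

    Thresholds come from the abstract; everything else about the tree
    (other branches, other leaves) is unknown and not modelled here.
    """
    return (
        sit_and_reach_cm < 33
        and bmi_kg_m2 > 25.46
        and run_walk_1600m_s > 534
    )


# Hypothetical students, for illustration only.
print(on_high_risk_path(30.0, 27.0, 600))  # all three tests pass
print(on_high_risk_path(35.0, 27.0, 600))  # fails the sit-and-reach test
```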

  19. Generation of 2D Land Cover Maps for Urban Areas Using Decision Tree Classification

    NASA Astrophysics Data System (ADS)

    Höhle, J.

    2014-09-01

    A 2D land cover map can automatically and efficiently be generated from high-resolution multispectral aerial images. First, a digital surface model is produced and each cell of the elevation model is then supplemented with attributes. A decision tree classification is applied to extract map objects like buildings, roads, grassland, trees, hedges, and walls from such an "intelligent" point cloud. The decision tree is derived from training areas whose borders are digitized on top of a false-colour orthoimage. The produced 2D land cover map with six classes is then refined by using image analysis techniques. The proposed methodology is described step by step. The classification, assessment, and refinement are carried out by the open source software "R"; the generation of the dense and accurate digital surface model by the "Match-T DSM" program of the Trimble Company. A practical example of a 2D land cover map generation is carried out. Images of a multispectral medium-format aerial camera covering an urban area in Switzerland are used. The assessment of the produced land cover map is based on class-wise stratified sampling where reference values of samples are determined by means of stereo-observations of false-colour stereopairs. The stratified statistical assessment of the produced land cover map with six classes and based on 91 points per class reveals a high thematic accuracy for classes "building" (99 %, 95 % CI: 95 %-100 %) and "road and parking lot" (90 %, 95 % CI: 83 %-95 %). Some other accuracy measures (overall accuracy, kappa value) and their 95 % confidence intervals are derived as well. The proposed methodology has a high potential for automation and fast processing and may be applied to other scenes and sensors.

  20. Classification of oxide glasses: A polarizability approach

    SciTech Connect

    Dimitrov, Vesselin; Komatsu, Takayuki (E-mail: komatsu@chem.nagaokaut.ac.jp)

    2005-03-15

    A classification of binary oxide glasses has been proposed taking into account the values obtained on their refractive index-based oxide ion polarizability $\alpha_{\mathrm{O}^{2-}}(n_0)$, optical basicity $\lambda(n_0)$, metallization criterion $M(n_0)$, interaction parameter $A(n_0)$, and ion's effective charges as well as O1s and metal binding energies determined by XPS. Four groups of oxide glasses have been established: glasses formed by two glass-forming acidic oxides; glasses formed by glass-forming acidic oxide and modifier's basic oxide; glasses formed by glass-forming acidic and conditional glass-forming basic oxide; glasses formed by two basic oxides. The role of electronic ion polarizability in chemical bonding of oxide glasses has also been estimated. Good agreement has been found with the previous results concerning classification of simple oxides. The results obtained probably provide a good basis for prediction of the type of bonding in oxide glasses on the basis of refractive index as well as for prediction of new nonlinear optical materials.

  1. The creation of a digital soil map for Cyprus using decision-tree classification techniques

    NASA Astrophysics Data System (ADS)

    Camera, Corrado; Zomeni, Zomenia; Bruggeman, Adriana; Noller, Joy; Zissimos, Andreas

    2014-05-01

    Considering the increasing threats soils are experiencing, especially in semi-arid, Mediterranean environments like Cyprus (erosion, contamination, sealing and salinisation), producing a high resolution, reliable soil map is essential for further soil conservation studies. This study aims to create a 1:50,000 soil map covering the area under the direct control of the Republic of Cyprus (5,760 km²). The study consists of two major steps. The first is the creation of a raster database of predictive variables selected according to the scorpan formula (McBratney et al., 2003). Of particular interest is the possibility of using, as soil properties, data coming from three older island-wide soil maps and the recently published geochemical atlas of Cyprus (Cohen et al., 2011). Ten highly characterizing elements were selected and used as predictors in the present study. For the other factors the usual variables were used: temperature and aridity index for climate; total loss on ignition, vegetation and forestry types maps for organic matter; the DEM and related relief derivatives (slope, aspect, curvature, landscape units); bedrock, surficial geology and geomorphology (Noller, 2009) for parent material and age; and a sub-watershed map to better bound location related to parent material sources. In the second step, the digital soil map is created using the Random Forests package in R. Random Forests is a decision tree classification technique where many trees, instead of a single one, are developed and compared to increase the stability and the reliability of the prediction. The model is trained and verified on areas where published 1:25,000 soil maps obtained from field work are available and then it is applied for predictive mapping to the other areas. Preliminary results obtained in a small area in the plain around the city of Lefkosia, where eight different soil classes are present, show very good capacities of the method. The Random Forest approach leads to reproduce soil

  2. Tree-Level Hydrodynamic Approach for Improved Stomatal Conductance Parameterization

    NASA Astrophysics Data System (ADS)

    Mirfenderesgi, G.; Bohrer, G.; Matheny, A. M.; Ivanov, V. Y.

    2014-12-01

    The land-surface models do not mechanistically resolve hydrodynamic processes within the tree. The Finite-Elements Tree-Crown Hydrodynamics model version 2 (FETCH2) is based on the previous FETCH model approach, but with finite difference numerics, and simplified single-beam conduit system. FETCH2 simulates water flow through the tree as a simplified system of porous media conduits. It explicitly resolves spatiotemporal hydraulic stresses throughout the tree's vertical extent that cannot be easily represented using other stomatal-conductance models. Empirical equations relate water potential at the stem to stomatal conductance at leaves connected to the stem (through unresolved branches) at that height. While highly simplified, this approach brings some realism to the simulation of stomatal conductance because the stomata can respond to stem water potential, rather than an assumed direct relationship with soil moisture, as is currently the case in almost all models. By enabling mechanistic simulation of hydrological traits, such as xylem conductivity, conductive area per DBH, vertical distribution of leaf area and maximal and minimal water content in the xylem, and their effect on the dynamics of water flow in the tree system, the FETCH2 modeling system enhanced our understanding of the role of hydraulic limitations on an experimental forest plot short-term water stresses that lead to tradeoffs between water and light availability for transpiring leaves in forest ecosystems. FETCH2 is particularly suitable to resolve the effects of structural differences between tree and species and size groups, and the consequences of differences in hydraulic strategies of different species. We leverage a large dataset of sap flow from 60 trees of 4 species at our experimental plot at the University of Michigan Biological Station. Comparison of the sap flow and transpiration patterns in this site and an undisturbed control site shows significant difference in hydraulic strategies

  3. Morphological and molecular characteristics do not confirm popular classification of the Brazil nut tree in Acre, Brazil.

    PubMed

    Sujii, P S; Fernandes, E T M B; Azevedo, V C R; Ciampi, A Y; Martins, K; de O Wadt, L H

    2013-09-27

    In the State of Acre, the Brazil nut tree, Bertholletia excelsa (Lecythidaceae), is classified by the local population into two types according to morphological characteristics, including color and quality of wood, shape of the trunk and crown, and fruit production. We examined the reliability of this classification by comparing morphological and molecular data of four populations of Brazil nut trees from Vale do Rio Acre in the Brazilian Amazon. For the morphological analysis, we evaluated qualitative and quantitative information of the trees, fruits, and seeds. The molecular analysis was performed using RAPD and ISSR markers, with cluster analysis. Significant differences were found between the two types of Brazil nut trees for the characters diameter at breast height, fruit yield, fruit size, and number of seeds per fruit. Despite the significant correlation between the morphological characteristics and the popular classification, we observed all possible combinations of morphological characteristics in both types of Brazil nut trees. In some individuals, the classification did not correspond to any of the characteristics. The results obtained with molecular markers showed that the two locally classified types of Brazil nut trees did not differ genetically, indicating that there is no consistent separation between them.

  4. A novel dendrochronological approach reveals drivers of carbon sequestration in tree species of riparian forests across spatiotemporal scales.

    PubMed

    Rieger, Isaak; Kowarik, Ingo; Cherubini, Paolo; Cierjacks, Arne

    2017-01-01

    Aboveground carbon (C) sequestration in trees is important in global C dynamics, but reliable techniques for its modeling in highly productive and heterogeneous ecosystems are limited. We applied an extended dendrochronological approach to disentangle the functioning of drivers from the atmosphere (temperature, precipitation), the lithosphere (sedimentation rate), the hydrosphere (groundwater table, river water level fluctuation), the biosphere (tree characteristics), and the anthroposphere (dike construction). Carbon sequestration in aboveground biomass of riparian Quercus robur L. and Fraxinus excelsior L. was modeled (1) over time using boosted regression tree analysis (BRT) on cross-datable trees characterized by equal annual growth ring patterns and (2) across space using a subsequent classification and regression tree analysis (CART) on cross-datable and not cross-datable trees. While C sequestration of cross-datable Q. robur responded to precipitation and temperature, cross-datable F. excelsior also responded to a low Danube river water level. However, CART revealed that C sequestration over time is governed by tree height and parameters that vary over space (magnitude of fluctuation in the groundwater table, vertical distance to mean river water level, and longitudinal distance to upstream end of the study area). Thus, a uniform response to climatic drivers of aboveground C sequestration in Q. robur was only detectable in trees of an intermediate height class and in taller trees (>21.8m) on sites where the groundwater table fluctuated little (≤0.9m). The detection of climatic drivers and the river water level in F. excelsior depended on sites at lower altitudes above the mean river water level (≤2.7m) and along a less dynamic downstream section of the study area. Our approach indicates unexploited opportunities of understanding the interplay of different environmental drivers in aboveground C sequestration. Results may support species-specific and

  5. An Ensemble Rule Learning Approach for Automated Morphological Classification of Erythrocytes.

    PubMed

    Maity, Maitreya; Mungle, Tushar; Dhane, Dhiraj; Maiti, A K; Chakraborty, Chandan

    2017-04-01

    The analysis of pathophysiological change to erythrocytes is important for early diagnosis of anaemia. The manual assessment of pathology slides is time-consuming and complicated regarding various types of cell identification. This paper proposes an ensemble rule-based decision-making approach for morphological classification of erythrocytes. Firstly, the digital microscopic blood smear images are pre-processed for removal of spurious regions followed by colour normalisation and thresholding. The erythrocytes are segmented from background image using the watershed algorithm. The shape features are then extracted from the segmented image to detect shape abnormality present in microscopic blood smear images. The decision about the abnormality is taken using proposed multiple rule-based expert systems. The deciding factor is majority ensemble voting for abnormally shaped erythrocytes. Here, shape-based features are considered for nine different types of abnormal erythrocytes including normal erythrocytes. Further, the adaptive boosting algorithm is used to generate multiple decision tree models where each model tree generates an individual rule set. The supervised classification method is followed to generate rules using a C4.5 decision tree. The proposed ensemble approach is precise in detecting eight types of abnormal erythrocytes with an overall accuracy of 97.81% and weighted sensitivity of 97.33%, weighted specificity of 99.7%, and weighted precision of 98%. This approach shows the robustness of proposed strategy for erythrocytes classification into abnormal and normal class. The article also clarifies its latent quality to be incorporated in point of care technology solution targeting a rapid clinical assistance.
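
    The abstract above describes majority ensemble voting across several rule sets. A minimal sketch of that voting step follows, assuming each model has already produced one label per cell. The label strings and the tie-breaking behaviour (first label seen among the tied ones wins) are assumptions of this sketch, not details from the paper.

```python
from collections import Counter


def majority_vote(labels):
    """Return the label predicted by the most models.

    labels is a list with one predicted class per model. On a tie,
    Counter.most_common keeps first-seen order, so the earliest of the
    tied labels is returned.
    """
    return Counter(labels).most_common(1)[0][0]


# Hypothetical votes from three rule sets for one cell.
votes = ["abnormal", "normal", "abnormal"]
print(majority_vote(votes))
```

    Boosted ensembles typically weight each model's vote; the unweighted vote here is the simplest form of the idea.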

  6. Classification and regression tree analysis of acute-on-chronic hepatitis B liver failure: Seeing the forest for the trees.

    PubMed

    Shi, K-Q; Zhou, Y-Y; Yan, H-D; Li, H; Wu, F-L; Xie, Y-Y; Braddock, M; Lin, X-Y; Zheng, M-H

    2017-02-01

    At present, there is no ideal model for predicting the short-term outcome of patients with acute-on-chronic hepatitis B liver failure (ACHBLF). This study aimed to establish and validate a prognostic model by using the classification and regression tree (CART) analysis. A total of 1047 patients from two separate medical centres with suspected ACHBLF were screened in the study, which were recognized as derivation cohort and validation cohort, respectively. CART analysis was applied to predict the 3-month mortality of patients with ACHBLF. The accuracy of the CART model was tested using the area under the receiver operating characteristic curve, which was compared with the model for end-stage liver disease (MELD) score and a new logistic regression model. CART analysis identified four variables as prognostic factors of ACHBLF: total bilirubin, age, serum sodium and INR, and three distinct risk groups: low risk (4.2%), intermediate risk (30.2%-53.2%) and high risk (81.4%-96.9%). The new logistic regression model was constructed with four independent factors, including age, total bilirubin, serum sodium and prothrombin activity by multivariate logistic regression analysis. The performances of the CART model (0.896), similar to the logistic regression model (0.914, P=.382), exceeded that of MELD score (0.667, P<.001). The results were confirmed in the validation cohort. We have developed and validated a novel CART model superior to MELD for predicting three-month mortality of patients with ACHBLF. Thus, the CART model could facilitate medical decision-making and provide clinicians with a validated practical bedside tool for ACHBLF risk stratification.
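
    The model comparison above rests on the area under the receiver operating characteristic curve (AUC). As a sketch of what that number measures, the function below computes AUC as the share of (positive, negative) pairs in which the positive case receives the higher score, counting ties as one half. The scores and labels are hypothetical, and the abstract does not say which estimator the authors used.

```python
def auc(scores, labels):
    """AUC as the probability that a random positive outranks a random negative.

    scores: predicted risk per patient; labels: 1 for the event, 0 otherwise.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical predicted risks and observed outcomes.
toy_scores = [0.9, 0.8, 0.3, 0.2]
toy_labels = [1, 0, 1, 0]
print(auc(toy_scores, toy_labels))
```

    A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives a scale for reading figures such as 0.896 and 0.667.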

  7. Inguinal hernia recurrence: Classification and approach

    PubMed Central

    Campanelli, Giampiero; Pettinari, Diego; Cavalli, Marta; Avesani, Ettore Contessini

    2006-01-01

    The authors reviewed the records of 2,468 operations of groin hernia in 2,350 patients, including 277 recurrent hernias updated to January 2005. The data obtained - evaluating technique, results and complications - were used to propose a simple anatomo-clinical classification into three types which could be used to plan the surgical strategy:
    Type R1: first recurrence ‘high,’ oblique external, reducible hernia with small (<2 cm) defect in non-obese patients, after pure tissue or mesh repair.
    Type R2: first recurrence ‘low,’ direct, reducible hernia with small (<2 cm) defect in non-obese patients, after pure tissue or mesh repair.
    Type R3: all the other recurrences - including femoral recurrences; recurrent groin hernia with big defect (inguinal eventration); multirecurrent hernias; nonreducible, linked with a controlateral primitive or recurrent hernia; and situations compromised from aggravating factors (for example obesity) or anyway not easily included in R1 or R2, after pure tissue or mesh repair. PMID:21187986

  8. A Hybrid Sensing Approach for Pure and Adulterated Honey Classification

    PubMed Central

    Subari, Norazian; Saleh, Junita Mohamad; Shakaff, Ali Yeon Md; Zakaria, Ammar

    2012-01-01

    This paper presents a comparison between data from single modality and fusion methods to classify Tualang honey as pure or adulterated using Linear Discriminant Analysis (LDA) and Principal Component Analysis (PCA) statistical classification approaches. Ten different brands of certified pure Tualang honey were obtained throughout peninsular Malaysia and Sumatera, Indonesia. Various concentrations of two types of sugar solution (beet and cane sugar) were used in this investigation to create honey samples of 20%, 40%, 60% and 80% adulteration concentrations. Honey data extracted from an electronic nose (e-nose) and Fourier Transform Infrared Spectroscopy (FTIR) were gathered, analyzed and compared based on fusion methods. Visual observation of classification plots revealed that the PCA approach was able to distinguish pure and adulterated honey samples better than the LDA technique. Overall, the validated classification results based on FTIR data (88.0%) gave higher classification accuracy than e-nose data (76.5%) using the LDA technique. Honey classification based on normalized low-level and intermediate-level FTIR and e-nose fusion data scored classification accuracies of 92.2% and 88.7%, respectively, using the Stepwise LDA method. The results suggested that pure and adulterated honey samples were better classified using FTIR and e-nose fusion data than single modality data. PMID:23202033

  9. rpartOrdinal: An R Package for Deriving a Classification Tree for Predicting an Ordinal Response.

    PubMed

    Archer, Kellie J

    2010-04-01

    This paper describes an R package, rpartOrdinal, that implements alternative splitting functions for fitting a classification tree when interest lies in predicting an ordinal response. This includes the generalized Gini impurity function, which was introduced as a method for predicting an ordinal response by including costs of misclassification into the impurity function, as well as an alternative ordinal impurity function due to Piccarreta (2008) that does not require the assignment of misclassification costs. The ordered twoing splitting method, which is not defined as a decrease in node impurity, is also included in the package. Since, in the ordinal response setting, misclassifying observations to adjacent categories is a less egregious error than misclassifying observations to distant categories, this package also includes a function for estimating an ordinal measure of association, the gamma statistic.
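
    The generalized Gini impurity mentioned above folds misclassification costs into the node impurity. A minimal sketch in Python follows, using the usual form: the sum over pairs of distinct classes of cost times the two class proportions. The absolute-difference cost between ordinal category indices is an assumption chosen for illustration, and the function names are hypothetical; this is not the rpartOrdinal implementation.

```python
def generalized_gini(class_counts, cost):
    """Generalized Gini impurity of a node.

    class_counts[i] is the number of observations in ordinal category i.
    cost(i, j) is the cost of confusing categories i and j.
    """
    n = sum(class_counts)
    p = [c / n for c in class_counts]
    k = len(p)
    return sum(
        cost(i, j) * p[i] * p[j]
        for i in range(k)
        for j in range(k)
        if i != j
    )


def unit_cost(i, j):
    # Every misclassification costs the same: ordinary Gini impurity.
    return 1


def ordinal_cost(i, j):
    # Assumed cost: distance between ordinal category indices.
    return abs(i - j)


adjacent_mix = [5, 5, 0]  # node mixing categories 0 and 1
distant_mix = [5, 0, 5]   # node mixing categories 0 and 2
print(generalized_gini(adjacent_mix, unit_cost), generalized_gini(distant_mix, unit_cost))
print(generalized_gini(adjacent_mix, ordinal_cost), generalized_gini(distant_mix, ordinal_cost))
```

    With unit costs the two nodes look equally impure; with the ordinal cost the node mixing distant categories is penalized more, which matches the abstract's point that adjacent-category errors are less egregious.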

  10. Identifying population groups with low palliative care program enrolment using classification and regression tree analysis.

    PubMed

    Gao, Jun; Johnston, Grace M; Lavergne, M Ruth; McIntyre, Paul

    2011-01-01

    Classification and regression tree (CART) analysis was used to identify subpopulations with lower palliative care program (PCP) enrolment rates. CART analysis uses recursive partitioning to group predictors. The PCP enrolment rate was 72 percent for the 6,892 adults who died of cancer from 2000 to 2005 in two counties in Nova Scotia, Canada. The lowest PCP enrolment rates were for nursing home residents over 82 years (27 percent), a group residing more than 43 kilometres from the PCP (31 percent), and another group living less than two weeks after their cancer diagnosis (37 percent). The highest rate (86 percent) was for the 2,118 persons who received palliative radiation. Findings from multiple logistic regression (MLR) were provided for comparison. CART findings identified low PCP enrolment subpopulations that were defined by interactions among demographic, social, medical, and health system predictors.
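
    Recursive partitioning, as used above, repeatedly searches for the threshold that best separates the outcome and then repeats the search inside each resulting group. The sketch below shows one such search on a single numeric predictor, scoring candidate thresholds by weighted Gini impurity. The function names and the four-person toy data are hypothetical; cut-points such as 82 years come from the authors' full data and are not reproduced here.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_split(values, labels):
    """Return (threshold, weighted impurity) of the best binary split.

    Candidate thresholds are midpoints between consecutive distinct values.
    """
    pairs = list(zip(values, labels))
    distinct = sorted(set(values))
    best_thr, best_imp = None, float("inf")
    for lo, hi in zip(distinct, distinct[1:]):
        thr = (lo + hi) / 2
        left = [y for x, y in pairs if x <= thr]
        right = [y for x, y in pairs if x > thr]
        imp = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if imp < best_imp:
            best_thr, best_imp = thr, imp
    return best_thr, best_imp


# Hypothetical ages and enrolment flags (1 = enrolled).
ages = [60, 70, 85, 90]
enrolled = [1, 1, 0, 0]
print(best_split(ages, enrolled))
```

    A full tree applies the same search to each child node, over all predictors, until a stopping rule is met.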

  11. Using Classification and Regression Trees (CART) and random forests to analyze attrition: Results from two simulations.

    PubMed

    Hayes, Timothy; Usami, Satoshi; Jacobucci, Ross; McArdle, John J

    2015-12-01

    In this article, we describe a recent development in the analysis of attrition: using classification and regression trees (CART) and random forest methods to generate inverse sampling weights. These flexible machine learning techniques have the potential to capture complex nonlinear, interactive selection models, yet to our knowledge, their performance in the missing data analysis context has never been evaluated. To assess the potential benefits of these methods, we compare their performance with commonly employed multiple imputation and complete case techniques in 2 simulations. These initial results suggest that weights computed from pruned CART analyses performed well in terms of both bias and efficiency when compared with other methods. We discuss the implications of these findings for applied researchers.
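
    One way to turn a fitted tree into inverse sampling weights, sketched here under stated assumptions: each participant is assigned to a leaf, the leaf's observed retention rate is taken as that participant's probability of remaining in the study, and retained participants are weighted by the reciprocal of that probability. The function name, leaf identifiers, and data are hypothetical, and the article's exact procedure may differ.

```python
def inverse_probability_weights(leaf_ids, retained):
    """Weight each retained participant by 1 / (retention rate of their leaf).

    leaf_ids: leaf assigned to each participant by an already-fitted tree.
    retained: 1 if the participant stayed in the study, 0 if they dropped out.
    Dropouts get weight 0.0 because they contribute no outcome data.
    """
    totals, stayed = {}, {}
    for leaf, r in zip(leaf_ids, retained):
        totals[leaf] = totals.get(leaf, 0) + 1
        stayed[leaf] = stayed.get(leaf, 0) + (1 if r else 0)
    weights = []
    for leaf, r in zip(leaf_ids, retained):
        if r:
            weights.append(totals[leaf] / stayed[leaf])
        else:
            weights.append(0.0)
    return weights


# Hypothetical leaves: "a" retains 3 of 4 participants, "b" retains 1 of 2.
leaves = ["a", "a", "a", "a", "b", "b"]
stay = [1, 1, 1, 0, 1, 0]
w = inverse_probability_weights(leaves, stay)
print(w)
```

    In this toy case the weights of the retained participants sum to the original sample size, so each retained person also stands in for the dropouts who shared their leaf.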

  12. Genetic Algorithms and Classification Trees in Feature Discovery: Diabetes and the NHANES database

    SciTech Connect

    Heredia-Langner, Alejandro; Jarman, Kristin H.; Amidan, Brett G.; Pounds, Joel G.

    2013-09-01

    This paper presents a feature selection methodology that can be applied to datasets containing a mixture of continuous and categorical variables. Using a Genetic Algorithm (GA), this method explores a dataset and selects a small set of features relevant for the prediction of a binary (1/0) response. Binary classification trees and an objective function based on conditional probabilities are used to measure the fitness of a given subset of features. The method is applied to health data in order to find factors useful for the prediction of diabetes. Results show that our algorithm is capable of narrowing down the set of predictors to around 8 factors that can be validated using reputable medical and public health resources.

  13. Identification of Sexually Abused Female Adolescents at Risk for Suicidal Ideations: A Classification and Regression Tree Analysis

    ERIC Educational Resources Information Center

    Brabant, Marie-Eve; Hebert, Martine; Chagnon, Francois

    2013-01-01

    This study explored the clinical profiles of 77 female teenager survivors of sexual abuse and examined the association of abuse-related and personal variables with suicidal ideations. Analyses revealed that 64% of participants experienced suicidal ideations. Findings from classification and regression tree analysis indicated that depression,…

  14. Ecological land classification: A survey approach

    NASA Astrophysics Data System (ADS)

    Rowe, J. Stan; Sheard, John W.

    1981-09-01

    A landscape approach to ecological land mapping, as illustrated in this article, proceeds by pattern recognition based on ecological theory. The unit areas delineated are hypotheses that arise from a knowledge of what is ecologically important in the land. Units formed by the mapper are likely to be inefficient or irrelevant for ecological purposes unless he possesses a sound rationale as to the interactions and controlling influences of the structural components of ecosystems. Here is the central problem with what have been called “objective” multivariate approaches to mapping based on grid units and the sometimes arbitrary attributes thereof; they tend to conceal the importance of ecological theory and the necessity for theory-based supervision of pattern recognition. Multivariate techniques are best used iteratively to verify and refine map units initially recognized and delineated by theoretical considerations. These ideas are illustrated by an example of a reconnaissance survey in the Northwest Territories of Canada.

  15. A Transform-Based Feature Extraction Approach for Motor Imagery Tasks Classification.

    PubMed

    Baali, Hamza; Khorshidtalab, Aida; Mesbah, Mostefa; Salami, Momoh J E

    2015-01-01

    In this paper, we present a new motor imagery classification method in the context of electroencephalography (EEG)-based brain-computer interface (BCI). This method uses a signal-dependent orthogonal transform, referred to as linear prediction singular value decomposition (LP-SVD), for feature extraction. The transform defines the mapping as the left singular vectors of the LP coefficient filter impulse response matrix. Using a logistic tree-based model classifier, the extracted features are classified into one of four motor imagery movements. The proposed approach was first benchmarked against two related state-of-the-art feature extraction approaches, namely, discrete cosine transform (DCT) and adaptive autoregressive (AAR)-based methods. By achieving an accuracy of 67.35%, the LP-SVD approach outperformed the other approaches by large margins (25% compared with DCT and 6% compared with AAR-based methods). To further improve the discriminatory capability of the extracted features and reduce the computational complexity, we enlarged the extracted feature subset by incorporating two extra features, namely, the Q- and Hotelling's T² statistics of the transformed EEG, and introduced a new EEG channel selection method. The performance of the EEG classification based on the expanded feature set and channel selection method was compared with that of a number of state-of-the-art classification methods previously reported with the BCI IIIa competition data set. Our method came second with an average accuracy of 81.38%.

  16. An approach for quantifying the efficacy of ecological classification schemes as management tools

    NASA Astrophysics Data System (ADS)

    Flanagan, A. M.; Cerrato, R. M.

    2015-10-01

    Rigorous assessments of ecological classification schemes being applied to submerged environments are needed to evaluate their utility as management tools. Verification that a scheme can quantitatively capture habitat and community variation would be of considerable value to individuals responsible for making difficult management decisions relevant to widespread environmental challenges including those in fisheries, preservation or restoration of critical habitats, and climate change. In this paper, an assessment approach that evaluates a scheme by treating it like a quantitative statistical model is presented. It couples two direct gradient, multivariate statistical techniques, multivariate regression trees (MRT) and redundancy analysis (RDA), with a modelling protocol involving model formulation, model selection, parameter estimation, and measurement of precision to produce a very flexible strategy for analyzing structure in ecological data. To illustrate the proposed approach, the assessment focused on benthic infauna and evaluating the Folk grain size classification scheme, along with some alternative grain size models. Analysis of data sets revealed that while it was fairly easy to uncover biotic-environmental relationships that were over-fitted, the community structure inherent in the data tended to be robustly discernible and preserved across all grain size models, but rigidly parameterized models (i.e., a one size fits all approach for grain size characterization with fixed boundaries) were generally ineffective. The proposed approach provided a clear, detailed, and rigorous assessment of Folk and several alternative models and can be used for the quantitative evaluation of existing ecological classification schemes and/or in the development of new schemes.

  17. Effect of training characteristics on object classification: An application using Boosted Decision Trees

    NASA Astrophysics Data System (ADS)

    Sevilla-Noarbe, I.; Etayo-Sotos, P.

    2015-06-01

    We present an application of a particular machine-learning method (Boosted Decision Trees, BDTs using AdaBoost) to separate stars and galaxies in photometric images using their catalog characteristics. BDTs are a well-established machine-learning technique used for classification purposes. They have been widely used, especially in the field of particle and astroparticle physics, and we use them here in an optical astronomy application. This algorithm is able to improve on simple thresholding cuts on standard separation variables that may be affected by local effects such as blending or badly calculated background levels, or that do not include information in other bands. The improvements are shown using the Sloan Digital Sky Survey Data Release 9, with respect to the type photometric classifier. We obtain an improvement in the impurity of the galaxy sample of a factor of 2-4 for this particular dataset, adjusting for the same efficiency of the selection. Another main goal of this study is to verify the effects that different input vectors and training sets have on the classification performance, the results being of wider use to other machine learning techniques.
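
    As a rough illustration of the boosting idea named in this abstract, the sketch below implements discrete AdaBoost over one-level decision trees (stumps) in plain Python. It is not the authors' pipeline: the function names, the toy one-dimensional data and the number of rounds are all assumptions made for the example, and real star/galaxy separation would use many catalog variables and deeper trees.

```python
import math

def stump_predict(row, j, thr, polarity):
    # Predict +1 when polarity * (x_j - thr) >= 0, otherwise -1.
    return 1 if polarity * (row[j] - thr) >= 0 else -1

def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, polarity) of the best stump."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for polarity in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if stump_predict(row, j, thr, polarity) != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, polarity)
    return best

def adaboost(X, y, rounds=3):
    """Discrete AdaBoost; labels in y must be +1 or -1."""
    n = len(X)
    w = [1.0 / n] * n                      # start with uniform sample weights
    ensemble = []
    for _ in range(rounds):
        err, j, thr, pol = train_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)   # avoid log(0) / division by zero
        alpha = 0.5 * math.log((1 - err) / err)  # vote weight of this stump
        ensemble.append((alpha, j, thr, pol))
        # Up-weight misclassified samples, down-weight correct ones.
        w = [wi * math.exp(-alpha * yi * stump_predict(row, j, thr, pol))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, row):
    score = sum(a * stump_predict(row, j, thr, pol) for a, j, thr, pol in ensemble)
    return 1 if score >= 0 else -1

# Hypothetical toy data: one feature, classes arranged so no single cut separates them.
X = [[1], [2], [3], [4], [5], [6]]
y = [1, 1, -1, -1, 1, 1]
ensemble = adaboost(X, y, rounds=3)
predictions = [predict(ensemble, row) for row in X]
```

    No single threshold cut classifies this toy set correctly, while the weighted vote of three stumps does, which is the sense in which boosting can improve on simple thresholding cuts.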

  18. Identifying tree crown delineation shapes and need for remediation on high resolution imagery using an evidence based approach

    NASA Astrophysics Data System (ADS)

    Leckie, Donald G.; Walsworth, Nicholas; Gougeon, François A.

    2016-04-01

    In order to fully realize the benefits of automated individual tree mapping for tree species, health, forest inventory attribution and forest management decision making, the tree delineations should be as good as possible. The concept of identifying poorly delineated tree crowns and suggesting likely types of remediation was investigated. Delineations (isolations or isols) were classified into shape types reflecting whether they were realistic tree shapes and the likely kind of remediation needed. Shape type was classified by an evidence-based rules approach using primitives based on isol size, shape indices, morphology, the presence of local maxima, and matches with template models representing trees of different sizes. A test set containing 50,000 isols based on an automated tree delineation of 40 cm multispectral airborne imagery of a diverse temperate-boreal forest site was used. Isolations representing single trees or several trees were the focus, as opposed to cases where a tree is split into several isols. For eight shape classes from regular through to convolute, shape classification accuracy was in the order of 62%; when simplified to six classes, accuracy was 83%. Shape type did give an indication of the type of remediation, and there were 6% false alarms (i.e., isols classed as needing remediation that did not need it). Conversely, there were 5% omissions (i.e., isols of regular shape, not earmarked for remediation, that did need remediation). The usefulness of the concept of identifying poor delineations in need of remediation was demonstrated, and one suite of methods was developed and shown to be effective.

  19. Annual Crop Type Classification of the U.S. Great Plains for 2000 - 2011: An Application of Classification Tree Modeling using Remote Sensing and Ancillary Environmental Data (Invited)

    NASA Astrophysics Data System (ADS)

    Howard, D. M.; Wylie, B. K.

    2013-12-01

    The purpose of this study was to increase spatial and temporal availability of crop classification data using reliable source data that have the potential of being applied on local, regional, national, and global levels. This study implemented classification tree modeling to map annual crop types throughout the U.S. Great Plains from 2000 - 2011. Classification tree modeling has been shown in numerous studies to be an effective tool for developing classification models. In this study, nearly 18 million crop observation points, derived from annual U.S. Department of Agriculture (USDA) National Agricultural Statistics Service (NASS) Cropland Data Layers (CDLs), were used in the training, development, and validation of a classification tree crop type model (CTM). Each observation point was further defined by weekly Normalized Difference Vegetation Index (NDVI) readings, annual climatic conditions, soil conditions, and a number of other biogeophysical environmental characteristics. The CTM accounted for the most prevalent crop types in the area, including corn, soybeans, winter wheat, spring wheat, cotton, sorghum, and alfalfa. Other crops that did not fit into any of these classes were identified and grouped into a miscellaneous class. An 87% success rate was achieved on the classification of 1.8 million observation points (10% of total observation points) that were withheld from training. The CTM was applied to create annual crop maps of the U.S. Great Plains for 2000 - 2011 at a spatial resolution of 250 meters. Product validation was performed by comparing county acreage derived from the modeled crop maps and county acreage data from the USDA NASS Survey Program for each crop type and each year. More than 15,000 county records from 2001 - 2010 were compared, with a Pearson's correlation coefficient of r = 0.87.
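
    To make the notion of a classification tree split concrete, here is a minimal sketch of how a single node might choose a threshold on one numeric feature by minimizing weighted Gini impurity. The abstract does not say which tree software or splitting criterion the CTM used, so the Gini criterion, the function names and the tiny NDVI-like values below are assumptions for illustration only.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for lab in labels:
        counts[lab] = counts.get(lab, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def best_split(values, labels):
    """Return (threshold, weighted_gini) for the rule `value <= threshold`.

    Candidate thresholds are midpoints between consecutive distinct values.
    If no split improves on the unsplit node, threshold is None.
    """
    n = len(values)
    distinct = sorted(set(values))
    best = (None, gini(labels))
    for lo, hi in zip(distinct, distinct[1:]):
        thr = (lo + hi) / 2
        left = [lab for v, lab in zip(values, labels) if v <= thr]
        right = [lab for v, lab in zip(values, labels) if v > thr]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (thr, score)
    return best

# Hypothetical toy data: made-up NDVI-like readings with made-up crop labels.
ndvi = [0.20, 0.30, 0.35, 0.60, 0.70, 0.80]
crops = ["wheat", "wheat", "wheat", "corn", "corn", "corn"]
threshold, impurity = best_split(ndvi, crops)
```

    A full tree repeats this search over every feature at every node and recurses on the two resulting subsets until a stopping rule is met.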

  20. Simulating California Reservoir Operation Using the Classification and Regression Tree Algorithm Combined with a Shuffled Cross-Validation Scheme

    NASA Astrophysics Data System (ADS)

    Yang, T.; Gao, X.; Sorooshian, S.; Li, X.

    2015-12-01

    The controlled outflows from a reservoir or dam are highly dependent on the decisions made by the reservoir operators, rather than on a natural hydrological process. Differences exist between the natural upstream inflows to reservoirs and the controlled outflows from reservoirs that supply the downstream users. With the decision maker's awareness of changing climate, reservoir management requires adaptable means to incorporate more information into decision making, such as the consideration of policy and regulation, environmental constraints, dry/wet conditions, etc. In this paper, a reservoir outflow simulation model is presented, which incorporates one of the well-developed data-mining models (Classification and Regression Tree) to predict the complicated human-controlled reservoir outflows and extract the reservoir operation patterns. A shuffled cross-validation approach is further implemented to improve the model's predictive performance. An application study of 9 major reservoirs in California is carried out, and the simulated results from different decision tree approaches, including original CART and Random Forest, are compared with observations. The statistical measurements show that CART combined with the shuffled cross-validation scheme gives better predictive performance than the other two methods, especially in simulating the peak flows. The results for simulated controlled outflow, storage changes and storage trajectories also show that the proposed model is able to consistently and reasonably predict the human's reservoir operation decisions. In addition, we found that the operations at Trinity Lake, Oroville Lake and Shasta Lake are greatly influenced by policy and regulation, while low-elevation reservoirs are more sensitive to inflow amount than others.
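
    The abstract does not spell out how its shuffled cross-validation scheme works, so the sketch below shows only the generic idea of shuffling sample indices before cutting them into k folds. The function name, the seed and the fold layout are assumptions for the example, not the authors' procedure.

```python
import random

def shuffled_kfold(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) pairs for k shuffled folds.

    Indices are shuffled once with a fixed seed so the split is repeatable,
    then dealt round-robin into k folds of near-equal size.
    """
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f_i, fold in enumerate(folds) if f_i != i for j in fold]
        yield train, test

# Hypothetical usage: 10 records split into 5 folds.
splits = list(shuffled_kfold(10, 5))
```

    Shuffling removes any ordering in the records (for example, chronological order) from the folds; whether that is desirable for time-dependent data such as reservoir records depends on the study design.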

  1. Using Clustering and Classification Approaches in Interactive Retrieval.

    ERIC Educational Resources Information Center

    Wu, Mingfang; Fuller, Michael; Wilkinson, Ross

    2001-01-01

    Presents an ongoing series of experiments with the TREC (Text Retrieval Conference) Interactive Track that test the feasibility and effectiveness of using clustering and classification as an aid to retrieval and answer construction. Results indicate that the success of the approach depends on assessing the quality of the final answers generated by…

  2. Transition from a botanical to a molecular classification in tree pollen allergy: implications for diagnosis and therapy.

    PubMed

    Mothes, Nadine; Horak, Friedrich; Valenta, Rudolf

    2004-12-01

    Tree pollens are among the most important allergen sources. Allergic cross-reactivity to pollens of trees from various plant orders has so far been classified according to botanical relationships. In this context, cross-reactivities to pollens of trees of the Fagales order (birch, alder, hazel, hornbeam, oak, chestnut), fruits and vegetables, between pollens of the Scrophulariales (olive, ash, plantain, privet, lilac) and pollens of the Coniferales (cedar, cypress, pine) are well established. The application of molecular biology methods for allergen characterization has revealed the molecular nature of many important tree pollen allergens. We review the spectrum of tree pollen allergens and propose a classification of tree pollen and related allergies based on major allergen molecules instead of botanical relationships among the allergenic sources. This molecular classification suggests the major birch pollen allergen, Bet v 1 as a marker for Fagales pollen and related plant food allergies, the major olive pollen allergen, Ole e 1, as a possible marker for Scrophulariales pollen allergy and the cedar allergens, Cry j 1 and Cry j 2, as potential markers for allergy to Coniferales pollens. We exemplify for Fagales pollen allergy and Bet v 1 that major marker allergens are diagnostic tools to determine the disease-eliciting allergen source. Information obtained by diagnostic testing with marker allergens will be important for the appropriate selection of patients for allergen-specific forms of therapy.

  3. Rational approaches to improving the isolation of endophytic actinobacteria from Australian native trees.

    PubMed

    Kaewkla, Onuma; Franco, Christopher M M

    2013-02-01

    In recent years, new actinobacterial species have been isolated as endophytes of plants and shrubs and are sought after both for their role as potential producers of new drug candidates for the pharmaceutical industry and as biocontrol inoculants for sustainable agriculture. Molecular-based approaches to the study of microbial ecology generally reveal a broader microbial diversity than can be obtained by cultivation methods. This study aimed to improve the success of isolating individual members of the actinobacterial population as pure cultures as well as improving the ability to characterise the large numbers obtained in pure culture. To achieve this objective, our study successfully employed rational and holistic approaches including the use of isolation media with low concentrations of nutrients normally available to the microorganism in the plant, plating larger quantities of plant sample, incubating isolation plates for up to 16 weeks, excising colonies when they are visible and choosing Australian endemic trees as the source of the actinobacteria. A hierarchy of polyphasic methods based on culture morphology, amplified 16S rRNA gene restriction analysis and limited sequencing was used to classify all 576 actinobacterial isolates from leaf, stem and root samples of two eucalypts: a Grey Box and Red Gum, a native apricot tree and a native pine tree. The classification revealed that, in addition to 413 Streptomyces spp., isolates belonged to 16 other actinobacterial genera: Actinomadura (two strains), Actinomycetospora (six), Actinopolymorpha (two), Amycolatopsis (six), Gordonia (one), Kribbella (25), Micromonospora (six), Nocardia (ten), Nocardioides (11), Nocardiopsis (one), Nonomuraea (one), Polymorphospora (two), Promicromonospora (51), Pseudonocardia (36), Williamsia (two) and a novel genus Flindersiella (one). In order to prove novelty, 12 strains were characterised fully to the species level based on polyphasic taxonomy. One strain represented a novel

  4. Pattern classification approach to rocket engine diagnostics

    SciTech Connect

    Tulpule, S.

    1989-01-01

    This paper presents a systems-level approach to integrate state-of-the-art rocket engine technology with advanced computational techniques to develop an integrated diagnostic system (IDS) for future rocket propulsion systems. The key feature of this IDS is the use of advanced diagnostic algorithms for failure detection as opposed to the current practice of redline-based failure detection methods. The paper presents a top-down analysis of rocket engine diagnostic requirements, rocket engine operation, applicable diagnostic algorithms, and algorithm design techniques, which serve as a basis for the IDS. The concepts of hierarchical, model-based information processing are described, together with the use of signal processing, pattern recognition, and artificial intelligence techniques which are an integral part of this diagnostic system. 27 refs.

  5. Classification Algorithms for Big Data Analysis, a Map Reduce Approach

    NASA Astrophysics Data System (ADS)

    Ayma, V. A.; Ferreira, R. S.; Happ, P.; Oliveira, D.; Feitosa, R.; Costa, G.; Plaza, A.; Gamba, P.

    2015-03-01

    For many years, the scientific community has been concerned with how to increase the accuracy of different classification methods, and major achievements have been made so far. Besides this issue, the increasing amount of data generated every day by remote sensors raises more challenges to be overcome. In this work, a tool within the scope of InterIMAGE Cloud Platform (ICP), which is an open-source, distributed framework for automatic image interpretation, is presented. The tool, named ICP: Data Mining Package, is able to perform supervised classification procedures on huge amounts of data, usually referred to as big data, on a distributed infrastructure using Hadoop MapReduce. The tool has four classification algorithms implemented, taken from WEKA's machine learning library, namely: Decision Trees, Naïve Bayes, Random Forest and Support Vector Machines (SVM). The results of an experimental analysis using an SVM classifier on data sets of different sizes for different cluster configurations demonstrate the potential of the tool, as well as aspects that affect its performance.

  6. An efficient tree classifier ensemble-based approach for pedestrian detection.

    PubMed

    Xu, Yanwu; Cao, Xianbin; Qiao, Hong

    2011-02-01

    Classification-based pedestrian detection systems (PDSs) are currently a hot research topic in the field of intelligent transportation. A PDS detects pedestrians in real time on moving vehicles. A practical PDS demands not only high detection accuracy but also high detection speed. However, most of the existing classification-based approaches mainly seek high detection accuracy, while the detection speed is not purposely optimized for practical application. At the same time, the performance, particularly the speed, is primarily tuned based on experiments without theoretical foundations, leading to a long training procedure. This paper starts with measuring and optimizing detection speed, and then a practical classification-based pedestrian detection solution with high detection speed and training speed is described. First, an extended classification/detection speed metric, named feature-per-object (fpo), is proposed to measure the detection speed independently from execution. Then, an fpo minimization model with accuracy constraints is formulated based on a tree classifier ensemble, where the minimum fpo can guarantee the highest detection speed. Finally, the minimization problem is solved efficiently by using nonlinear fitting based on radial basis function neural networks. In addition, the optimal solution is directly used to instruct classifier training; thus, the training speed could be accelerated greatly. Therefore, a rapid and accurate classification-based detection technique is proposed for the PDS. Experimental results on urban traffic videos show that the proposed method has a high detection speed with an acceptable detection rate and false-alarm rate for onboard detection; moreover, the training procedure is also very fast.

  7. An overview of the phase-modular fault tree approach to phased mission system analysis

    NASA Technical Reports Server (NTRS)

    Meshkat, L.; Xing, L.; Donohue, S. K.; Ou, Y.

    2003-01-01

    In this paper, we look at how fault tree analysis (FTA), a primary means of performing reliability analysis of phased mission systems (PMS), can meet this challenge by presenting an overview of the modular approach to solving fault trees that represent PMS.

  8. About decomposition approach for solving the classification problem

    NASA Astrophysics Data System (ADS)

    Andrianova, A. A.

    2016-11-01

    This article describes the features of applying an algorithm that uses decomposition methods to solve the binary classification problem of constructing a linear classifier based on the Support Vector Machine method. Applying decomposition reduces the volume of calculations, in particular because it makes it possible to build parallel versions of the algorithm, which is a very important advantage when solving problems with big data. The results of computational experiments conducted using the decomposition approach are analyzed. The experiments use a known data set for the binary classification problem.

  9. Impact of atmospheric correction and image filtering on hyperspectral classification of tree species using support vector machine

    NASA Astrophysics Data System (ADS)

    Shahriari Nia, Morteza; Wang, Daisy Zhe; Bohlman, Stephanie Ann; Gader, Paul; Graves, Sarah J.; Petrovic, Milenko

    2015-01-01

    Hyperspectral images can be used to identify savanna tree species at the landscape scale, which is a key step in measuring biomass and carbon, and tracking changes in species distributions, including invasive species, in these ecosystems. Before automated species mapping can be performed, image processing and atmospheric correction are often performed, which can potentially affect the performance of classification algorithms. We determine how three processing and correction techniques (atmospheric correction, Gaussian filters, and shade/green vegetation filters) affect the prediction accuracy of classification of tree species at pixel level from airborne visible/infrared imaging spectrometer imagery of longleaf pine savanna in Central Florida, United States. Species classification using fast line-of-sight atmospheric analysis of spectral hypercubes (FLAASH) atmospheric correction outperformed ATCOR in the majority of cases. Green vegetation (normalized difference vegetation index) and shade (near-infrared) filters did not increase classification accuracy when applied to large and continuous patches of specific species. Finally, applying a Gaussian filter reduces interband noise and increases species classification accuracy. Using the optimal preprocessing steps, our classification accuracy of six species classes is about 75%.

  10. Knowledge-based approach to video content classification

    NASA Astrophysics Data System (ADS)

    Chen, Yu; Wong, Edward K.

    2001-01-01

    A framework for video content classification using a knowledge-based approach is herein proposed. This approach is motivated by the fact that videos are rich in semantic contents, which can best be interpreted and analyzed by human experts. We demonstrate the concept by implementing a prototype video classification system using the rule-based programming language CLIPS 6.05. Knowledge for video classification is encoded as a set of rules in the rule base. The left-hand sides of rules contain high-level and low-level features, while the right-hand sides of rules contain intermediate results or conclusions. Our current implementation includes features computed from motion, color, and text extracted from video frames. Our current rule set allows us to classify input video into one of five classes: news, weather reporting, commercial, basketball, and football. We use MYCIN's inexact reasoning method for combining evidence and for handling the uncertainties in the features and in the classification results. We obtained good results in a preliminary experiment, which demonstrated the validity of the proposed approach.
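
    For readers unfamiliar with MYCIN-style inexact reasoning, the sketch below shows the commonly cited rule for combining two certainty factors (each in the range -1 to 1) that bear on the same conclusion. The abstract does not state which exact form the authors implemented in CLIPS, so treat the function name and the sample values as assumptions made for illustration.

```python
def combine_cf(cf1, cf2):
    """Combine two certainty factors in [-1, 1] for the same hypothesis.

    Commonly cited MYCIN/EMYCIN-style rule:
      both non-negative: cf1 + cf2 * (1 - cf1)
      both negative:     cf1 + cf2 * (1 + cf1)
      mixed signs:       (cf1 + cf2) / (1 - min(|cf1|, |cf2|))
    The mixed-sign case is undefined when one factor is +1 and the other -1.
    """
    if cf1 >= 0 and cf2 >= 0:
        return cf1 + cf2 * (1 - cf1)
    if cf1 < 0 and cf2 < 0:
        return cf1 + cf2 * (1 + cf1)
    return (cf1 + cf2) / (1 - min(abs(cf1), abs(cf2)))

# Hypothetical example: two rules each lend 0.5 support to the same class,
# then a third rule argues against it with -0.25.
supporting = combine_cf(0.5, 0.5)
mixed = combine_cf(0.5, -0.25)
```

    Two agreeing pieces of evidence reinforce each other without ever exceeding 1, while conflicting evidence pulls the combined factor back toward zero.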

  11. Knowledge-based approach to video content classification

    NASA Astrophysics Data System (ADS)

    Chen, Yu; Wong, Edward K.

    2000-12-01

    A framework for video content classification using a knowledge-based approach is herein proposed. This approach is motivated by the fact that videos are rich in semantic contents, which can best be interpreted and analyzed by human experts. We demonstrate the concept by implementing a prototype video classification system using the rule-based programming language CLIPS 6.05. Knowledge for video classification is encoded as a set of rules in the rule base. The left-hand sides of rules contain high-level and low-level features, while the right-hand sides of rules contain intermediate results or conclusions. Our current implementation includes features computed from motion, color, and text extracted from video frames. Our current rule set allows us to classify input video into one of five classes: news, weather reporting, commercial, basketball, and football. We use MYCIN's inexact reasoning method for combining evidence and for handling the uncertainties in the features and in the classification results. We obtained good results in a preliminary experiment, which demonstrated the validity of the proposed approach.

  12. Control of tree water networks: A geometric programming approach

    NASA Astrophysics Data System (ADS)

    Sela Perelman, L.; Amin, S.

    2015-10-01

    This paper presents a modeling and operation approach for tree water supply systems. The network control problem is approximated as a geometric programming (GP) problem. The original nonlinear nonconvex network control problem is transformed into a convex optimization problem. The optimization model can be efficiently solved to optimality using state-of-the-art solvers. Two control schemes are presented: (1) operation of network actuators (pumps and valves) and (2) controlled demand shedding allocation between network consumers with limited resources. The dual of the network control problem is formulated and is used to perform sensitivity analysis with respect to hydraulic constraints. The approach is demonstrated on a small branched-topology network and later extended to a medium-size irrigation network. The results demonstrate an intrinsic trade-off between energy costs and demand shedding policy, providing an efficient decision support tool for active management of water systems.

  13. A conceptual approach to approximate tree root architecture in infinite slope models

    NASA Astrophysics Data System (ADS)

    Schmaltz, Elmar; Glade, Thomas

    2016-04-01

    paraboloids represent a cordate-root-system with radius r, height h and a constant, species-independent curvature. This procedure simplifies the classification of tree species into the three defined geometric solids. In this study we introduce a conceptual approach to estimate the 2- and 3-dimensional distribution of different tree root systems, and to implement it in a raster environment, as it is used in infinite slope models. To this end, we used the PCRaster extension in a Python framework. The results show that root distribution and root growth are spatially reproducible in a simple raster framework. The outputs exhibit significant effects for a synthetically generated slope at the local scale for equal time-steps. The preliminary results represent an initial step towards developing a vegetation module that can be coupled with hydro-mechanical slope stability models. This approach is expected to yield a valuable contribution to the implementation of vegetation-related properties, in particular effects of root-reinforcement, into physically-based approaches using infinite slope models.

  14. Aerial Images from an UAV System: 3D Modeling and Tree Species Classification in a Park Area

    NASA Astrophysics Data System (ADS)

    Gini, R.; Passoni, D.; Pinto, L.; Sona, G.

    2012-07-01

    The use of aerial imagery acquired by Unmanned Aerial Vehicles (UAVs) is scheduled within the FoGLIE project (Fruition of Goods Landscape in Interactive Environment): it starts from the need to enhance the natural, artistic and cultural heritage, to produce a better usability of it by employing audiovisual movable systems of 3D reconstruction and to improve monitoring procedures, by using new media for integrating the fruition phase with the preservation ones. The pilot project focuses on a test area, Parco Adda Nord, which encloses various types of goods (small buildings, agricultural fields and different tree species and bushes). Multispectral high-resolution images were taken by two digital compact cameras: a Pentax Optio A40 for RGB photos and a Sigma DP1 modified to acquire the NIR band. Then, some tests were performed in order to analyze the quality of the UAV images for both photogrammetric and photo-interpretation purposes, to validate the vector-sensor system and the image block geometry, and to study the feasibility of tree species classification. Many pre-signalized Control Points were surveyed through GPS to allow accuracy analysis. Aerial Triangulations (ATs) were carried out with commercial photogrammetric software, Leica Photogrammetry Suite (LPS) and PhotoModeler, with manual or automatic selection of Tie Points, to pick out the pros and cons of each package in managing non-conventional aerial imagery as well as the differences in the modeling approach. Further analyses were done on the differences between the EO parameters and the corresponding data coming from the on-board UAV navigation system.

  15. Weighing risk factors associated with bee colony collapse disorder by classification and regression tree analysis.

    PubMed

    VanEngelsdorp, Dennis; Speybroeck, Niko; Evans, Jay D; Nguyen, Bach Kim; Mullin, Chris; Frazier, Maryann; Frazier, Jim; Cox-Foster, Diana; Chen, Yanping; Tarpy, David R; Haubruge, Eric; Pettis, Jeffrey S; Saegerman, Claude

    2010-10-01

    Colony collapse disorder (CCD), a syndrome whose defining trait is the rapid loss of adult worker honey bees, Apis mellifera L., is thought to be responsible for a minority of the large overwintering losses experienced by U.S. beekeepers since the winter of 2006-2007. Using the same data set developed to perform a monofactorial analysis (PLoS ONE 4: e6481, 2009), we conducted a classification and regression tree (CART) analysis in an attempt to better understand the relative importance and interrelations among different risk variables in explaining CCD. Fifty-five exploratory variables were used to construct two CART models: one model with and one model without a cost of misclassifying a CCD-diagnosed colony as a non-CCD colony. The resulting model tree that permitted for misclassification had a sensitivity and specificity of 85 and 74%, respectively. Although factors measuring colony stress (e.g., adult bee physiological measures, such as fluctuating asymmetry or mass of head) were important discriminating values, six of the 19 variables having the greatest discriminatory value were pesticide levels in different hive matrices. Notably, coumaphos levels in brood (a miticide commonly used by beekeepers) had the highest discriminatory value and were highest in control (healthy) colonies. Our CART analysis provides evidence that CCD is probably the result of several factors acting in concert, making afflicted colonies more susceptible to disease. This analysis highlights several areas that warrant further attention, including the effect of sublethal pesticide exposure on pathogen prevalence and the role of variability in bee tolerance to pesticides on colony survivorship.
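
    The sensitivity, specificity and misclassification-cost ideas mentioned in this abstract can be sketched as follows. The labels, the cost values and the function names are hypothetical; the abstract does not report the cost the authors assigned or their raw counts.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fn, tn, fp) for a binary problem."""
    tp = fn = tn = fp = 0
    for t, p in zip(y_true, y_pred):
        if t == positive:
            if p == positive:
                tp += 1
            else:
                fn += 1          # positive case missed (e.g. CCD called non-CCD)
        else:
            if p == positive:
                fp += 1
            else:
                tn += 1
    return tp, fn, tn, fp

def sensitivity_specificity(y_true, y_pred, positive=1):
    tp, fn, tn, fp = confusion_counts(y_true, y_pred, positive)
    return tp / (tp + fn), tn / (tn + fp)

def misclassification_cost(y_true, y_pred, fn_cost=1.0, fp_cost=1.0, positive=1):
    """Total cost when false negatives and false positives are priced differently."""
    _, fn, _, fp = confusion_counts(y_true, y_pred, positive)
    return fn_cost * fn + fp_cost * fp

# Hypothetical labels: 1 = CCD colony, 0 = control colony.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
cost = misclassification_cost(y_true, y_pred, fn_cost=5.0)
```

    Raising the cost of a false negative pushes a tree-growing procedure to favor splits that miss fewer positive cases, usually at the price of lower specificity.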

  16. An improved methodology for land-cover classification using artificial neural networks and a decision tree classifier

    NASA Astrophysics Data System (ADS)

    Arellano-Neri, Olimpia

    Mapping is essential for the analysis of the land and land-cover dynamics, which influence many environmental processes and properties. When creating land-cover maps it is important to minimize error, since error will propagate into later analyses based upon these land cover maps. The reliability of land cover maps derived from remotely sensed data depends upon an accurate classification. For decades, traditional statistical methods have been applied in land-cover classification with varying degrees of accuracy. One of the most significant developments in the field of land-cover classification using remotely sensed data has been the introduction of Artificial Neural Networks (ANN) procedures. In this research, Artificial Neural Networks were applied to remotely sensed data of the southwestern Ohio region for land-cover classification. Three variants on traditional ANN-based classifiers were explored here: (1) the use of a customized architecture of the neural network in terms of the input layer for each land-cover class, (2) the use of texture analysis to combine spectral information and spatial information which is essential for urban classes, and (3) the use of decision tree (DT) classification to refine the ANN classification and ultimately to achieve a more reliable land-cover thematic map. The objective of this research was to prove that a classification based on Artificial Neural Networks (ANN) and decision tree (DT) would outperform by far the National Land Cover Data (NLCD). The NLCD is a land-cover classification produced by a cooperative effort between the United States Geological Survey (USGS) and the United States Environmental Protection Agency (USEPA). In order to achieve this objective, an accuracy assessment was conducted for both NLCD classification and ANN/DT classification. Error matrices resulting from the accuracy assessments provided overall accuracy, accuracy of each class, omission errors, and commission errors for each classification. 
The

  17. ADHD classification using bag of words approach on network features

    NASA Astrophysics Data System (ADS)

    Solmaz, Berkan; Dey, Soumyabrata; Rao, A. Ravishankar; Shah, Mubarak

    2012-02-01

    Attention Deficit Hyperactivity Disorder (ADHD) is receiving a lot of attention nowadays, mainly because it is one of the common brain disorders among children and little is known about its cause. In this study, we propose a novel approach for automatic classification of ADHD subjects and control subjects using functional Magnetic Resonance Imaging (fMRI) data of resting state brains. For this purpose, we compute the correlation between every possible voxel pair within a subject over the time frame of the experimental protocol. A network of voxels is constructed by representing a high correlation value between any two voxels as an edge. A Bag-of-Words (BoW) approach is used to represent each subject as a histogram of network features, such as the degree of each voxel. The classification is done using a Support Vector Machine (SVM). We also investigate the use of raw intensity values in the time series for each voxel. Here, every subject is represented as a combined histogram of network and raw intensity features. Experimental results verified that the classification accuracy improves when the combined histogram is used. We tested our approach on a highly challenging dataset released by NITRC for the ADHD-200 competition and obtained promising results. The dataset is not only large but also includes subjects from different demographic and age groups. To the best of our knowledge, this is the first paper to propose a BoW approach for any functional brain disorder classification, and we believe that this approach will be useful in the analysis of many brain-related conditions.
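
    The network-feature step described above (correlate voxel time series, keep high correlations as edges, summarise node degrees as a histogram) can be sketched as follows. The threshold values, the function names and the tiny four-"voxel" input are assumptions for illustration; the abstract does not state the threshold or the histogram binning that was actually used.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def degree_histogram(series, threshold=0.9):
    """Return hist where hist[d] is the number of voxels with degree d.

    An edge joins two voxels whose correlation exceeds `threshold`
    (a hypothetical cut-off chosen for this sketch).
    """
    n = len(series)
    degree = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if pearson(series[i], series[j]) > threshold:
                degree[i] += 1
                degree[j] += 1
    hist = [0] * n  # possible degrees are 0 .. n-1
    for d in degree:
        hist[d] += 1
    return hist
```

    For the toy input `[[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1], [1, 3, 2, 4]]`, only the first two series are correlated above 0.9, so two voxels have degree 1 and two have degree 0. A histogram of this kind would then serve as the feature vector passed to the classifier.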

  18. Classification.

    PubMed

    Tuxhorn, Ingrid; Kotagal, Prakash

    2008-07-01

    In this article, we review the practical approach and diagnostic relevance of current seizure and epilepsy classification concepts and principles as a basic framework for good management of patients with epileptic seizures and epilepsy. Inaccurate generalizations about terminology, diagnosis, and treatment may be the single most important factor, next to an inadequately obtained history, that determines the misdiagnosis and mismanagement of patients with epilepsy. A stepwise signs and symptoms approach for diagnosis, evaluation, and management along the guidelines of the International League Against Epilepsy and definitions of epileptic seizures and epilepsy syndromes offers a state-of-the-art clinical approach to managing patients with epilepsy.

  19. Simulating California reservoir operation using the classification and regression-tree algorithm combined with a shuffled cross-validation scheme

    NASA Astrophysics Data System (ADS)

    Yang, Tiantian; Gao, Xiaogang; Sorooshian, Soroosh; Li, Xin

    2016-03-01

    The controlled outflows from a reservoir or dam depend heavily on the decisions made by the reservoir operators rather than on a natural hydrological process. Differences exist between the natural upstream inflows to reservoirs and the controlled outflows from reservoirs that supply the downstream users. With decision makers' awareness of a changing climate, reservoir management requires adaptable means to incorporate more information into decision making, such as water delivery requirements, environmental constraints, dry/wet conditions, etc. In this paper, a robust reservoir outflow simulation model is presented, which incorporates one of the well-developed data-mining models (Classification and Regression Tree) to predict the complicated human-controlled reservoir outflows and extract the reservoir operation patterns. A shuffled cross-validation approach is further implemented to improve CART's predictive performance. An application study of nine major reservoirs in California is carried out. Results produced by the enhanced CART, original CART, and random forest are compared with observations. The statistical measurements show that the enhanced CART and random forest outperform the CART control run in general, and the enhanced CART algorithm gives better predictive performance than random forest in simulating the peak flows. The results also show that the proposed model is able to consistently and reasonably predict the expert release decisions. Experiments indicate that the release operation in the Oroville Lake is significantly dominated by SWP allocation amount and that reservoirs with low elevation are more sensitive to inflow amount than others.
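
    As a rough illustration of what shuffling adds to cross-validation, the sketch below randomises the sample order before cutting it into k folds, so that each fold mixes records from the whole period instead of one contiguous block. This is a generic shuffled k-fold split under assumed names and defaults (`shuffled_kfold`, `k`, `seed`); the exact scheme used in the paper is not specified in the abstract.

```python
import random


def shuffled_kfold(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) index lists for a shuffled k-fold split."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)        # reproducible shuffle
    folds = [idx[i::k] for i in range(k)]   # k nearly equal folds
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test
```

    A tree would be fitted on each `train` list and scored on the matching `test` list, and the fold scores averaged to choose, for example, a pruning level.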

  20. Non-Destructive Classification Approaches for Equilibrated Ordinary Chondrites

    NASA Technical Reports Server (NTRS)

    Righter, K.; Harrington, R.; Schroeder, C.; Morris, R. V.

    2013-01-01

    Classification of meteorites is most effectively carried out by petrographic and mineralogic studies of thin sections, but a rapid and accurate classification technique for the many samples collected in dense collection areas (hot and cold deserts) is of great interest. Oil immersion techniques have been used to classify a large proportion of the US Antarctic meteorite collections since the mid-1980s [1]. This approach has allowed rapid characterization of thousands of samples over time, but nonetheless utilizes a piece of the sample that has been ground to grains or a powder. In order to compare a few non-destructive techniques with the standard approaches, we have characterized a group of chondrites from the Larkman Nunatak region using magnetic susceptibility and Moessbauer spectroscopy.

  1. Study and ranking of determinants of Taenia solium infections by classification tree models.

    PubMed

    Mwape, Kabemba E; Phiri, Isaac K; Praet, Nicolas; Dorny, Pierre; Muma, John B; Zulu, Gideon; Speybroeck, Niko; Gabriël, Sarah

    2015-01-01

    Taenia solium taeniasis/cysticercosis is an important public health problem occurring mainly in developing countries. This work aimed to study the determinants of human T. solium infections in the Eastern province of Zambia and rank them in order of importance. A household (HH)-level questionnaire was administered to 680 HHs from 53 villages in two rural districts and the taeniasis and cysticercosis status determined. A classification tree model (CART) was used to define the relative importance and interactions between different predictor variables in their effect on taeniasis and cysticercosis. The Katete study area had a significantly higher taeniasis and cysticercosis prevalence than the Petauke area. The CART analysis for Katete showed that the most important determinant for cysticercosis infections was the number of HH inhabitants (6 to 10) and for taeniasis was the number of HH inhabitants > 6. The most important determinant in Petauke for cysticercosis was the age of head of household > 32 years and for taeniasis it was age < 55 years. The CART analysis showed that the most important determinant for both taeniasis and cysticercosis infections was the number of HH inhabitants (6 to 10) in Katete district and age in Petauke. The results suggest that control measures should target HHs with a high number of inhabitants and older individuals.
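
    Cut-offs such as "number of HH inhabitants > 6" come from the way CART searches each predictor for the threshold that best separates the outcome classes. The sketch below shows that search for one numeric predictor using Gini impurity. The function names and the toy household data in the usage note are hypothetical, not values from the study.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best rule `value > threshold`.

    Returns (None, parent_gini) when no split improves on the unsplit node.
    """
    best = (None, gini(labels))
    candidates = sorted(set(values))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(candidates, candidates[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (t, score)
    return best
```

    For made-up household sizes `[2, 3, 4, 5, 7, 8, 9, 10]` with infection labels `[0, 0, 0, 0, 1, 1, 1, 1]`, the search returns a threshold of 6.0 with impurity 0.0. A full tree repeats this search over all predictors and recurses into each child node.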

  2. Differential Diagnosis of Erythmato-Squamous Diseases Using Classification and Regression Tree

    PubMed Central

    Maghooli, Keivan; Langarizadeh, Mostafa; Shahmoradi, Leila; Habibi-koolaee, Mahdi; Jebraeily, Mohamad; Bouraghi, Hamid

    2016-01-01

    Introduction: Differential diagnosis of Erythmato-Squamous Diseases (ESD) is a major challenge in the field of dermatology. The ESD diseases are placed into six different classes. Data mining is the process of detecting hidden patterns. In the case of ESD, data mining helps us to predict the diseases. Different algorithms have been developed for this purpose. Objective: We aimed to use the Classification and Regression Tree (CART) to predict the differential diagnosis of ESD. Methods: We used the Cross Industry Standard Process for Data Mining (CRISP-DM) methodology. For this purpose, the dermatology data set was obtained from the UCI machine learning repository. The Clementine 12.0 software from IBM was used for modelling. To evaluate the model, we calculated its accuracy, sensitivity and specificity. Results: The proposed model had an accuracy of 94.84% (standard deviation: 24.42) in correctly predicting the ESD disease. Conclusions: The results indicated that this classifier could be useful. However, a combination of machine learning methods could be more useful for the prediction of ESD and is strongly recommended. PMID:28077889

  3. Study and Ranking of Determinants of Taenia solium Infections by Classification Tree Models

    PubMed Central

    Mwape, Kabemba E.; Phiri, Isaac K.; Praet, Nicolas; Dorny, Pierre; Muma, John B.; Zulu, Gideon; Speybroeck, Niko; Gabriël, Sarah

    2015-01-01

    Taenia solium taeniasis/cysticercosis is an important public health problem occurring mainly in developing countries. This work aimed to study the determinants of human T. solium infections in the Eastern province of Zambia and rank them in order of importance. A household (HH)-level questionnaire was administered to 680 HHs from 53 villages in two rural districts and the taeniasis and cysticercosis status determined. A classification tree model (CART) was used to define the relative importance and interactions between different predictor variables in their effect on taeniasis and cysticercosis. The Katete study area had a significantly higher taeniasis and cysticercosis prevalence than the Petauke area. The CART analysis for Katete showed that the most important determinant for cysticercosis infections was the number of HH inhabitants (6 to 10) and for taeniasis was the number of HH inhabitants > 6. The most important determinant in Petauke for cysticercosis was the age of head of household > 32 years and for taeniasis it was age < 55 years. The CART analysis showed that the most important determinant for both taeniasis and cysticercosis infections was the number of HH inhabitants (6 to 10) in Katete district and age in Petauke. The results suggest that control measures should target HHs with a high number of inhabitants and older individuals. PMID:25404073

  4. Prediction of cadmium enrichment in reclaimed coastal soils by classification and regression tree

    NASA Astrophysics Data System (ADS)

    Ru, Feng; Yin, Aijing; Jin, Jiaxin; Zhang, Xiuying; Yang, Xiaohui; Zhang, Ming; Gao, Chao

    2016-08-01

    Reclamation of coastal land is one of the most common ways to obtain land resources in China. However, it has long been acknowledged that the artificial interference with coastal land has disadvantageous effects, such as heavy metal contamination. This study aimed to develop a prediction model for cadmium enrichment levels and assess the importance of affecting factors in typical reclaimed land in Eastern China (DFCL: Dafeng Coastal Land). Two hundred and twenty seven surficial soil/sediment samples were collected and analyzed to identify the enrichment levels of cadmium and the possible affecting factors in soils and sediments. The classification and regression tree (CART) model was applied in this study to predict cadmium enrichment levels. The prediction results showed that cadmium enrichment levels assessed by the CART model had an accuracy of 78.0%. The CART model could extract more information on factors affecting the environmental behavior of cadmium than correlation analysis. The integration of correlation analysis and the CART model showed that fertilizer application and organic carbon accumulation were the most important factors affecting soil/sediment cadmium enrichment levels, followed by particle size effects (Al2O3, TFe2O3 and SiO2), contents of Cl and S, surrounding construction areas and reclamation history.

  5. Multicenter study on caries risk assessment in adults using survival Classification and Regression Trees

    PubMed Central

    Arino, Masumi; Ito, Ataru; Fujiki, Shozo; Sugiyama, Seiichi; Hayashi, Mikako

    2016-01-01

    Dental caries is an important public health problem worldwide. This study aims to prove how preventive therapies reduce the onset of caries in adult patients, and to identify patients with high or low risk of caries by using Classification and Regression Trees based survival analysis (survival CART). A clinical data set of 732 patients aged 20 to 64 years in nine Japanese general practices was analyzed with the following parameters: age, DMFT, number of mutans streptococci (SM) and Lactobacilli (LB), secretion rate and buffer capacity of saliva, and compliance with a preventive program. Results showed the incidence of primary carious lesion was affected by SM, LB and compliance with a preventive program; secondary carious lesion was affected by DMFT, SM and LB. Survival CART identified high-risk patients for primary carious lesion according to their poor compliance with a preventive program and SM (≥10^6 CFU/ml) with a hazard ratio of 3.66 (p = 0.0002). In the case of secondary caries, patients with LB (≥10^5 CFU/ml) and DMFT (>15) were identified as high risk with a hazard ratio of 3.50 (p < 0.0001). We conclude that preventive programs can be effective in limiting the incidence of primary carious lesion. PMID:27381750

  6. Use of Binary Partition Tree and energy minimization for object-based classification of urban land cover

    NASA Astrophysics Data System (ADS)

    Li, Mengmeng; Bijker, Wietske; Stein, Alfred

    2015-04-01

    Two main challenges are faced when classifying urban land cover from very high resolution satellite images: obtaining an optimal image segmentation and distinguishing buildings from other man-made objects. For optimal segmentation, this work proposes a hierarchical representation of an image by means of a Binary Partition Tree (BPT) and an unsupervised evaluation of image segmentations by energy minimization. For building extraction, we apply fuzzy sets to create a fuzzy landscape of shadows which in turn involves a two-step procedure. The first step is a preliminary image classification at a fine segmentation level to generate vegetation and shadow information. The second step models the directional relationship between building and shadow objects to extract building information at the optimal segmentation level. We conducted the experiments on two datasets of Pléiades images from Wuhan City, China. To demonstrate its performance, the proposed classification is compared at the optimal segmentation level with Maximum Likelihood Classification and Support Vector Machine classification. The results show that the proposed classification produced the highest overall accuracies and kappa coefficients, and the smallest over-classification and under-classification geometric errors. We conclude first that integrating BPT with energy minimization offers an effective means for image segmentation. Second, we conclude that the directional relationship between building and shadow objects represented by a fuzzy landscape is important for building extraction.
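
    Overall accuracy and the kappa coefficient, the two summary measures cited above, are both computed from an error (confusion) matrix. The sketch below uses the standard definition of Cohen's kappa; the function name and the example matrix in the usage note are illustrative and do not come from the study.

```python
def overall_accuracy_and_kappa(matrix):
    """Return (overall_accuracy, kappa) for a square error matrix.

    matrix[i][j] is the number of samples of reference class i that were
    assigned to class j.
    """
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(k)) / n
    row_tot = [sum(row) for row in matrix]
    col_tot = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the row and column marginals.
    expected = sum(row_tot[i] * col_tot[i] for i in range(k)) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa
```

    For a made-up two-class matrix `[[40, 10], [5, 45]]`, overall accuracy is 0.85 and chance agreement is 0.5, giving a kappa of about 0.7. Kappa is lower than accuracy because it discounts the agreement a random assignment with the same marginals would achieve.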

  7. A nearest neighbour approach by genetic distance to the assignment of individual trees to geographic origin.

    PubMed

    Degen, Bernd; Blanc-Jolivet, Céline; Stierand, Katrin; Gillet, Elizabeth

    2017-03-01

    During the past decade, the use of DNA for forensic applications has been extensively implemented for plant and animal species, as well as in humans. Tracing back the geographical origin of an individual usually requires genetic assignment analysis. These approaches are based on reference samples that are grouped into populations or other aggregates and intend to identify the most likely group of origin. Often this grouping does not have a biological but rather a historical or political justification, such as "country of origin". In this paper, we present a new nearest neighbour approach to individual assignment or classification within a given but potentially imperfect grouping of reference samples. This method, which is based on the genetic distance between individuals, functions better in many cases than commonly used methods. We demonstrate the operation of our assignment method using two data sets. One set is simulated for a large number of trees distributed in a 120km by 120km landscape with individual genotypes at 150 SNPs, and the other set comprises experimental data of 1221 individuals of the African tropical tree species Entandrophragma cylindricum (Sapelli) genotyped at 61 SNPs. Judging by the level of correct self-assignment, our approach outperformed the commonly used frequency and Bayesian approaches by 15% for the simulated data set and by 5-7% for the Sapelli data set. Our new approach is less sensitive to overlapping sources of genetic differentiation, such as genetic differences among closely-related species, phylogeographic lineages and isolation by distance, and thus operates better even for suboptimal grouping of individuals.
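
    The core of a nearest neighbour assignment can be sketched in a few lines: compute a genetic distance from the query individual to every reference individual and take the group of the closest ones. The distance used here (mean absolute difference in allele counts per SNP, scaled to the range 0 to 1), the majority vote over `k` neighbours, and all names are assumptions for illustration; the abstract does not say which distance or voting rule the authors use.

```python
def genetic_distance(g1, g2):
    """Distance between two genotypes coded as allele counts (0, 1 or 2) per SNP."""
    return sum(abs(a - b) for a, b in zip(g1, g2)) / (2 * len(g1))


def assign_origin(query, references, k=1):
    """Assign `query` to the most frequent group among its k nearest references.

    `references` is a list of (genotype, group) pairs.
    """
    ranked = sorted(references, key=lambda ref: genetic_distance(query, ref[0]))
    votes = {}
    for _, group in ranked[:k]:
        votes[group] = votes.get(group, 0) + 1
    return max(votes, key=votes.get)
```

    Because each query is compared with individuals rather than with pooled group allele frequencies, a reference group that mixes several genetic sources does not blur the comparison, which is consistent with the robustness to suboptimal grouping that the abstract reports.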

  8. A wrapper-based approach to image segmentation and classification.

    PubMed

    Farmer, Michael E; Jain, Anil K

    2005-12-01

    The traditional processing flow of segmentation followed by classification in computer vision assumes that the segmentation is able to successfully extract the object of interest from the background image. It is extremely difficult to obtain a reliable segmentation without any prior knowledge about the object that is being extracted from the scene. This is further complicated by the lack of any clearly defined metrics for evaluating the quality of segmentation or for comparing segmentation algorithms. We propose a method of segmentation that addresses both of these issues, by using the object classification subsystem as an integral part of the segmentation. This will provide contextual information regarding the objects to be segmented, as well as allow us to use the probability of correct classification as a metric to determine the quality of the segmentation. We view traditional segmentation as a filter operating on the image that is independent of the classifier, much like the filter methods for feature selection. We propose a new paradigm for segmentation and classification that follows the wrapper methods of feature selection. Our method wraps the segmentation and classification together, and uses the classification accuracy as the metric to determine the best segmentation. By using shape as the classification feature, we are able to develop a segmentation algorithm that relaxes the requirement that the object of interest to be segmented must be homogeneous in some low-level image parameter, such as texture, color, or grayscale. This represents an improvement over other segmentation methods that have used classification information only to modify the segmenter parameters, since these algorithms still require an underlying homogeneity in some parameter space. 
    Rather than considering our method as, yet, another segmentation algorithm, we propose that our wrapper method can be considered as an image segmentation framework, within which existing image segmentation

  9. New Approach for Segmentation and Extraction of Single Tree from Point Clouds Data and Aerial Images

    NASA Astrophysics Data System (ADS)

    Homainejad, A. S.

    2016-06-01

    This paper presents a new approach for reconstructing 3D models of single trees from Airborne Laser Scanner (ALS) data and aerial images. The approach detects and extracts single trees from ALS data and aerial images. Existing approaches can provide bulk segmentation of a group of trees; however, some methods focus on detecting and extracting a particular tree from ALS data and images. Segmenting a single tree within a group of trees is very difficult because detecting the boundary lines between trees is tedious and essentially not feasible. In this approach, an experimental formula based on the height of the trees was developed and applied to define the boundary lines between the trees. As a result, each single tree was segmented and extracted, and a 3D model was then created. Trees extracted with this approach have a unique identification and attributes. The output has applications in various fields of science and engineering, such as forestry, urban planning, and agriculture. In forestry, for example, the result can be used to study ecological diversity, biodiversity, and ecosystems.

  10. A hybrid ensemble learning approach to star-galaxy classification

    NASA Astrophysics Data System (ADS)

    Kim, Edward J.; Brunner, Robert J.; Carrasco Kind, Matias

    2015-10-01

    There exist a variety of star-galaxy classification techniques, each with their own strengths and weaknesses. In this paper, we present a novel meta-classification framework that combines and fully exploits different techniques to produce a more robust star-galaxy classification. To demonstrate this hybrid, ensemble approach, we combine a purely morphological classifier, a supervised machine learning method based on random forest, an unsupervised machine learning method based on self-organizing maps, and a hierarchical Bayesian template-fitting method. Using data from the CFHTLenS survey (Canada-France-Hawaii Telescope Lensing Survey), we consider different scenarios: when a high-quality training set is available with spectroscopic labels from DEEP2 (Deep Extragalactic Evolutionary Probe Phase 2), SDSS (Sloan Digital Sky Survey), VIPERS (VIMOS Public Extragalactic Redshift Survey), and VVDS (VIMOS VLT Deep Survey), and when the demographics of sources in a low-quality training set do not match the demographics of objects in the test data set. We demonstrate that our Bayesian combination technique improves the overall performance over any individual classification method in these scenarios. Thus, strategies that combine the predictions of different classifiers may prove to be optimal in currently ongoing and forthcoming photometric surveys, such as the Dark Energy Survey and the Large Synoptic Survey Telescope.
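
    To give a feel for how the outputs of several classifiers can be merged into one probability, the sketch below uses a simple independent-evidence (naive Bayes) fusion of posterior probabilities. This is a stand-in chosen for brevity, not the Bayesian combination technique developed in the paper, and the function name and the shared `prior` argument are assumptions.

```python
def fuse_posteriors(probs, prior=0.5):
    """Fuse per-classifier posteriors P(star | data_i) into one posterior.

    Assumes every classifier used the same class prior and that their
    evidence is conditionally independent given the true class. Each
    probability (and the prior) must lie strictly between 0 and 1.
    """
    prior_odds = prior / (1 - prior)
    odds = prior_odds
    for p in probs:
        # Each classifier contributes its likelihood ratio: posterior odds
        # divided by the prior odds it started from.
        odds *= (p / (1 - p)) / prior_odds
    return odds / (1 + odds)
```

    Two classifiers that each report 0.8 fuse to about 0.94, while one reporting 0.8 and another reporting 0.2 cancel out to 0.5. Real classifiers are rarely independent, which is one reason more careful combination schemes weight the individual models.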

  11. Trees

    ERIC Educational Resources Information Center

    Al-Khaja, Nawal

    2007-01-01

    This is a thematic lesson plan for young learners about palm trees and the importance of taking care of them. The two part lesson teaches listening, reading and speaking skills. The lesson includes parts of a tree; the modal auxiliary, can; dialogues and a role play activity.

  12. Improved wetland remote sensing in Yellowstone National Park using classification trees to combine TM imagery and ancillary environmental data

    USGS Publications Warehouse

    Wright, C.; Gallant, A.

    2007-01-01

    The U.S. Fish and Wildlife Service uses the term palustrine wetland to describe vegetated wetlands traditionally identified as marsh, bog, fen, swamp, or wet meadow. Landsat TM imagery was combined with image texture and ancillary environmental data to model probabilities of palustrine wetland occurrence in Yellowstone National Park using classification trees. Model training and test locations were identified from National Wetlands Inventory maps, and classification trees were built for seven years spanning a range of annual precipitation. At a coarse level, palustrine wetland was separated from upland. At a finer level, five palustrine wetland types were discriminated: aquatic bed (PAB), emergent (PEM), forested (PFO), scrub-shrub (PSS), and unconsolidated shore (PUS). TM-derived variables alone were relatively accurate at separating wetland from upland, but model error rates dropped incrementally as image texture, DEM-derived terrain variables, and other ancillary GIS layers were added. For classification trees making use of all available predictors, average overall test error rates were 7.8% for palustrine wetland/upland models and 17.0% for palustrine wetland type models, with consistent accuracies across years. However, models were prone to wetland over-prediction. While the predominant PEM class was classified with omission and commission error rates less than 14%, we had difficulty identifying the PAB and PSS classes. Ancillary vegetation information greatly improved PSS classification and moderately improved PFO discrimination. Association with geothermal areas distinguished PUS wetlands. Wetland over-prediction was exacerbated by class imbalance in likely combination with spatial and spectral limitations of the TM sensor. Wetland probability surfaces may be more informative than hard classification, and appear to respond to climate-driven wetland variability. 
    The developed method is portable, relatively easy to implement, and should be applicable in other

  13. Land cover and forest formation distributions for St. Kitts, Nevis, St. Eustatius, Grenada and Barbados from decision tree classification of cloud-cleared satellite imagery

    USGS Publications Warehouse

    Helmer, E.H.; Kennaway, T.A.; Pedreros, D.H.; Clark, M.L.; Marcano-Vega, H.; Tieszen, L.L.; Ruzycki, T.R.; Schill, S.R.; Carrington, C.M.S.

    2008-01-01

    Satellite image-based mapping of tropical forests is vital to conservation planning. Standard methods for automated image classification, however, limit classification detail in complex tropical landscapes. In this study, we test an approach to Landsat image interpretation on four islands of the Lesser Antilles, including Grenada and St. Kitts, Nevis and St. Eustatius, testing a more detailed classification than earlier work in the latter three islands. Secondly, we estimate the extents of land cover and protected forest by formation for five islands and ask how land cover has changed over the second half of the 20th century. The image interpretation approach combines image mosaics and ancillary geographic data, classifying the resulting set of raster data with decision tree software. Cloud-free image mosaics for one or two seasons were created by applying regression tree normalization to scene dates that could fill cloudy areas in a base scene. Such mosaics are also known as cloud-filled, cloud-minimized or cloud-cleared imagery, mosaics, or composites. The approach accurately distinguished several classes that more standard methods would confuse; the seamless mosaics aided reference data collection; and the multiseason imagery allowed us to separate drought deciduous forests and woodlands from semi-deciduous ones. Cultivated land areas declined 60 to 100 percent from about 1945 to 2000 on several islands. Meanwhile, forest cover has increased 50 to 950%. This trend will likely continue where sugar cane cultivation has dominated. Like the island of Puerto Rico, most higher-elevation forest formations are protected in formal or informal reserves. Also similarly, lowland forests, which are drier forest types on these islands, are not well represented in reserves. Former cultivated lands in lowland areas could provide lands for new reserves of drier forest types. 
    The land-use history of these islands may provide insight for planners in countries currently considering

  14. Classification of Parkinsonian Syndromes from FDG-PET Brain Data Using Decision Trees with SSM/PCA Features

    PubMed Central

    Mudali, D.; Teune, L. K.; Renken, R. J.; Leenders, K. L.; Roerdink, J. B. T. M.

    2015-01-01

    Medical imaging techniques like fluorodeoxyglucose positron emission tomography (FDG-PET) have been used to aid in the differential diagnosis of neurodegenerative brain diseases. In this study, the objective is to classify FDG-PET brain scans of subjects with Parkinsonian syndromes (Parkinson's disease, multiple system atrophy, and progressive supranuclear palsy) compared to healthy controls. The scaled subprofile model/principal component analysis (SSM/PCA) method was applied to FDG-PET brain image data to obtain covariance patterns and corresponding subject scores. The latter were used as features for supervised classification by the C4.5 decision tree method. Leave-one-out cross validation was applied to determine classifier performance. We carried out a comparison with other types of classifiers. The big advantage of decision tree classification is that the results are easy to understand by humans. A visual representation of decision trees strongly supports the interpretation process, which is very important in the context of medical diagnosis. Further improvements are suggested based on enlarging the number of the training data, enhancing the decision tree method by bagging, and adding additional features based on (f)MRI data. PMID:25918550
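
    Leave-one-out cross validation, as used above, trains on all subjects but one and tests on the subject left out, repeating this for every subject. The sketch below shows the loop with a one-level decision tree (a "stump") on a single subject score as a simplified stand-in for C4.5. All names and the toy scores in the usage note are hypothetical, and the code assumes each training fold contains at least two distinct score values.

```python
def fit_stump(scores, labels):
    """Fit a one-split tree on a single feature.

    Returns (threshold, label_above), where label_above (0 or 1) is the
    class predicted when score > threshold.
    """
    best = None
    cands = sorted(set(scores))
    for t in [(a + b) / 2 for a, b in zip(cands, cands[1:])]:
        for above in (0, 1):
            errors = sum(
                (above if s > t else 1 - above) != y
                for s, y in zip(scores, labels)
            )
            if best is None or errors < best[0]:
                best = (errors, t, above)
    return best[1], best[2]


def predict_stump(model, score):
    threshold, above = model
    return above if score > threshold else 1 - above


def loocv_accuracy(scores, labels):
    """Fraction of subjects classified correctly when each is held out in turn."""
    correct = 0
    for i in range(len(scores)):
        model = fit_stump(scores[:i] + scores[i + 1:], labels[:i] + labels[i + 1:])
        correct += predict_stump(model, scores[i]) == labels[i]
    return correct / len(scores)
```

    With made-up scores `[-2.0, -1.5, -1.0, 1.0, 1.5, 2.0]` and labels `[0, 0, 0, 1, 1, 1]`, every held-out subject is classified correctly. The fitted rule ("score > 0.0 means class 1") is also readable at a glance, which is the interpretability advantage the abstract emphasises.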

  15. A methodological approach to the classification of dermoscopy images

    PubMed Central

    Celebi, M. Emre; Kingravi, Hassan A.; Uddin, Bakhtiyar; Iyatomi, Hitoshi; Aslandogan, Y. Alp; Stoecker, William V.; Moss, Randy H.

    2011-01-01

    In this paper a methodological approach to the classification of pigmented skin lesions in dermoscopy images is presented. First, automatic border detection is performed to separate the lesion from the background skin. Shape features are then extracted from this border. For the extraction of color and texture related features, the image is divided into various clinically significant regions using the Euclidean distance transform. This feature data is fed into an optimization framework, which ranks the features using various feature selection algorithms and determines the optimal feature subset size according to the area under the ROC curve measure obtained from support vector machine classification. The issue of class imbalance is addressed using various sampling strategies, and the classifier generalization error is estimated using Monte Carlo cross validation. Experiments on a set of 564 images yielded a specificity of 92.34% and a sensitivity of 93.33%. PMID:17387001

  16. "Trees and Things That Live in Trees": Three Children with Special Needs Experience the Project Approach

    ERIC Educational Resources Information Center

    Griebling, Susan; Elgas, Peg; Konerman, Rachel

    2015-01-01

    The authors report on research conducted during a project investigation undertaken with preschool children, ages 3-5. The report focuses on three children with special needs and the positive outcomes for each child as they engaged in the project Trees and Things That Live in Trees. Two of the children were diagnosed with developmental delays, and…

  17. Investigating the Utility of Oblique Tree-Based Ensembles for the Classification of Hyperspectral Data

    PubMed Central

    Poona, Nitesh; van Niekerk, Adriaan; Ismail, Riyad

    2016-01-01

    Ensemble classifiers are being widely used for the classification of spectroscopic data. In this regard, the random forest (RF) ensemble has been successfully applied in an array of applications, and has proven to be robust in handling high dimensional data. More recently, several variants of the traditional RF algorithm including rotation forest (rotF) and oblique random forest (oRF) have been applied to classifying high dimensional data. In this study we compare the traditional RF, rotF, and oRF (using three different splitting rules, i.e., ridge regression, partial least squares, and support vector machine) for the classification of healthy and infected Pinus radiata seedlings using high dimensional spectroscopic data. We further test the robustness of these five ensemble classifiers to reduced spectral resolution by spectral resampling (binning) of the original spectral bands. The results showed that the three oblique random forest ensembles outperformed both the traditional RF and rotF ensembles. Additionally, the rotF ensemble proved to be the least robust of the five ensembles tested. Spectral resampling of the original bands provided mixed results. Nevertheless, the results demonstrate that using spectral resampled bands is a promising approach to classifying asymptomatic stress in Pinus radiata seedlings. PMID:27854290
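
    Spectral resampling by binning can be illustrated with a short sketch. It assumes that binning means averaging groups of adjacent bands, which is one common reading; the study may have used a different resampling scheme, and the reflectance values here are made up.

```python
def bin_spectrum(reflectance, bin_width):
    """Average each run of `bin_width` adjacent bands into one coarser band.

    A final partial bin, if any, is averaged over the bands it contains.
    """
    binned = []
    for start in range(0, len(reflectance), bin_width):
        chunk = reflectance[start:start + bin_width]
        binned.append(sum(chunk) / len(chunk))
    return binned


spectrum = [1, 2, 3, 4, 5, 6, 7]  # made-up band values
print(bin_spectrum(spectrum, 3))
```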

  18. Cluster Stability Estimation Based on a Minimal Spanning Trees Approach

    NASA Astrophysics Data System (ADS)

    Volkovich, Zeev (Vladimir); Barzily, Zeev; Weber, Gerhard-Wilhelm; Toledano-Kitai, Dvora

    2009-08-01

    Among the areas of data and text mining which are employed today in science, economy and technology, clustering theory serves as a preprocessing step in data analysis. However, there are many open questions still waiting for a theoretical and practical treatment, e.g., the problem of determining the true number of clusters has not been satisfactorily solved. In the current paper, this problem is addressed by the cluster stability approach. For several possible numbers of clusters we estimate the stability of partitions obtained from clustering of samples. Partitions are considered consistent if their clusters are stable. Cluster validity is measured as the total number of edges, in the clusters' minimal spanning trees, connecting points from different samples. Actually, we use the Friedman and Rafsky two sample test statistic. The homogeneity hypothesis, of well mingled samples within the clusters, leads to asymptotic normal distribution of the considered statistic. Resting upon this fact, the standard score of the mentioned edges quantity is set, and the partition quality is represented by the worst cluster corresponding to the minimal standard score value. It is natural to expect that the true number of clusters can be characterized by the empirical distribution having the shortest left tail. The proposed methodology sequentially creates the described value distribution and estimates its left-asymmetry. Numerical experiments, presented in the paper, demonstrate the ability of the approach to detect the true number of clusters.
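
    The edge count at the heart of the Friedman and Rafsky statistic can be sketched as follows: build a minimal spanning tree over the pooled points of one cluster and count the edges whose endpoints come from different samples. The sketch assumes Euclidean distance and uses hypothetical function names; the standardization into a standard score described in the record is not shown.

```python
import math


def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph; returns (i, j) index pairs."""
    n = len(points)
    if n < 2:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n
    best_from = [-1] * n
    in_tree[0] = True
    for j in range(1, n):
        best_dist[j] = math.dist(points[0], points[j])
        best_from[j] = 0
    edges = []
    for _ in range(n - 1):
        # Attach the closest point that is not yet in the tree.
        j = min((k for k in range(n) if not in_tree[k]), key=lambda k: best_dist[k])
        in_tree[j] = True
        edges.append((best_from[j], j))
        for k in range(n):
            if not in_tree[k]:
                d = math.dist(points[j], points[k])
                if d < best_dist[k]:
                    best_dist[k] = d
                    best_from[k] = j
    return edges


def cross_sample_edges(points, sample_ids):
    """Count MST edges that join points drawn from different samples."""
    return sum(sample_ids[i] != sample_ids[j] for i, j in mst_edges(points))


pts = [(0, 0), (1, 0), (2, 0), (3, 0)]
print(cross_sample_edges(pts, [0, 1, 0, 1]))  # well mingled samples
print(cross_sample_edges(pts, [0, 0, 1, 1]))  # separated samples
```

    Well mingled samples produce many cross-sample edges, while separated samples produce few, which is why a low count signals an unstable cluster.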

  19. A Transform-Based Feature Extraction Approach for Motor Imagery Tasks Classification

    PubMed Central

    Khorshidtalab, Aida; Mesbah, Mostefa; Salami, Momoh J. E.

    2015-01-01

    In this paper, we present a new motor imagery classification method in the context of electroencephalography (EEG)-based brain–computer interface (BCI). This method uses a signal-dependent orthogonal transform, referred to as linear prediction singular value decomposition (LP-SVD), for feature extraction. The transform defines the mapping as the left singular vectors of the LP coefficient filter impulse response matrix. Using a logistic tree-based model classifier, the extracted features are classified into one of four motor imagery movements. The proposed approach was first benchmarked against two related state-of-the-art feature extraction approaches, namely, discrete cosine transform (DCT) and adaptive autoregressive (AAR)-based methods. By achieving an accuracy of 67.35%, the LP-SVD approach outperformed the other approaches by large margins (25% compared with DCT and 6% compared with AAR-based methods). To further improve the discriminatory capability of the extracted features and reduce the computational complexity, we enlarged the extracted feature subset by incorporating two extra features, namely, Q- and the Hotelling’s $T^{2}$ statistics of the transformed EEG and introduced a new EEG channel selection method. The performance of the EEG classification based on the expanded feature set and channel selection method was compared with that of a number of the state-of-the-art classification methods previously reported with the BCI IIIa competition data set. Our method came second with an average accuracy of 81.38%. PMID:27170898

  20. Use of classification trees to apportion single echo detections to species: Application to the pelagic fish community of Lake Superior

    USGS Publications Warehouse

    Yule, Daniel L.; Adams, Jean V.; Hrabik, Thomas R.; Vinson, Mark R.; Woiak, Zebadiah; Ahrenstroff, Tyler D.

    2013-01-01

    Acoustic methods are used to estimate the density of pelagic fish in large lakes with results of midwater trawling used to assign species composition. Apportionment in lakes having mixed species can be challenging because only a small fraction of the water sampled acoustically is sampled with trawl gear. Here we describe a new method where single echo detections (SEDs) are assigned to species based on classification tree models developed from catch data that separate species based on fish size and the spatial habitats they occupy. During the summer of 2011, we conducted a spatially-balanced lake-wide acoustic and midwater trawl survey of Lake Superior. A total of 51 sites in four bathymetric depth strata (0–30 m, 30–100 m, 100–200 m, and >200 m) were sampled. We developed classification tree models for each stratum and found fish length was the most important variable for separating species. To apply these trees to the acoustic data, we needed to identify a target strength to length (TS-to-L) relationship appropriate for all abundant Lake Superior pelagic species. We tested performance of 7 general (i.e., multi-species) relationships derived from three published studies. The best-performing relationship was identified by comparing predicted and observed catch compositions using a second independent Lake Superior data set. Once identified, the relationship was used to predict lengths of SEDs from the lake-wide survey, and the classification tree models were used to assign each SED to a species. Exotic rainbow smelt (Osmerus mordax) were the most common species at bathymetric depths 100 m (384 million; 6.0 kt). Cisco (Coregonus artedi) were widely distributed over all strata with their population estimated at 182 million (44 kt). The apportionment method we describe should be transferable to other large lakes provided fish are not tightly aggregated, and an appropriate TS-to-L relationship for abundant pelagic fish species can be determined.

  1. AutoClass: A Bayesian Approach to Classification

    NASA Technical Reports Server (NTRS)

    Stutz, John; Cheeseman, Peter; Hanson, Robin; Taylor, Will; Lum, Henry, Jr. (Technical Monitor)

    1994-01-01

    We describe a Bayesian approach to the untutored discovery of classes in a set of cases, sometimes called finite mixture separation or clustering. The main difference between clustering and our approach is that we search for the "best" set of class descriptions rather than grouping the cases themselves. We describe our classes in terms of a probability distribution or density function, and the locally maximal posterior probability valued function parameters. We rate our classifications with an approximate joint probability of the data and functional form, marginalizing over the parameters. Approximation is necessitated by the computational complexity of the joint probability. Thus, we marginalize w.r.t. local maxima in the parameter space. We discuss the rationale behind our approach to classification. We give the mathematical development for the basic mixture model and describe the approximations needed for computational tractability. We instantiate the basic model with the discrete Dirichlet distribution and multivariate Gaussian density likelihoods. Then we show some results for both constructed and actual data.

  2. Prognostic transcriptional association networks: a new supervised approach based on regression trees

    PubMed Central

    Nepomuceno-Chamorro, Isabel; Azuaje, Francisco; Devaux, Yvan; Nazarov, Petr V.; Muller, Arnaud; Aguilar-Ruiz, Jesús S.; Wagner, Daniel R.

    2011-01-01

    Motivation: The application of information encoded in molecular networks for prognostic purposes is a crucial objective of systems biomedicine. This approach has not been widely investigated in the cardiovascular research area. Within this area, the prediction of clinical outcomes after suffering a heart attack would represent a significant step forward. We developed a new quantitative prediction-based method for this prognostic problem based on the discovery of clinically relevant transcriptional association networks. This method integrates regression trees and clinical class-specific networks, and can be applied to other clinical domains. Results: Before analyzing our cardiovascular disease dataset, we tested the usefulness of our approach on a benchmark dataset with control and disease patients. We also compared it to several algorithms to infer transcriptional association networks and classification models. Comparative results provided evidence of the prediction power of our approach. Next, we discovered new models for predicting good and bad outcomes after myocardial infarction. Using blood-derived gene expression data, our models reported areas under the receiver operating characteristic curve above 0.70. Our model could also outperform different techniques based on co-expressed gene modules. We also predicted processes that may represent novel therapeutic targets for heart disease, such as the synthesis of leucine and isoleucine. Availability: The SATuRNo software is freely available at http://www.lsi.us.es/isanepo/toolsSaturno/. Contact: inepomuceno@us.es Supplementary information: Supplementary data are available at Bioinformatics online. PMID:21098433

  3. Full hierarchic versus non-hierarchic classification approaches for mapping sealed surfaces at the rural-urban fringe using high-resolution satellite data.

    PubMed

    De Roeck, Tim; Van de Voorde, Tim; Canters, Frank

    2009-01-01

    Since 2008 more than half of the world population is living in cities and urban sprawl is continuing. Because of these developments, the mapping and monitoring of urban environments and their surroundings is becoming increasingly important. In this study two object-oriented approaches for high-resolution mapping of sealed surfaces are compared: a standard non-hierarchic approach and a full hierarchic approach using both multi-layer perceptrons and decision trees as learning algorithms. Both methods outperform the standard nearest neighbour classifier, which is used as a benchmark scenario. For the multi-layer perceptron approach, applying a hierarchic classification strategy substantially increases the accuracy of the classification. For the decision tree approach a one-against-all hierarchic classification strategy does not lead to an improvement of classification accuracy compared to the standard all-against-all approach. Best results are obtained with the hierarchic multi-layer perceptron classification strategy, producing a kappa value of 0.77. A simple shadow reclassification procedure based on characteristics of neighbouring objects further increases the kappa value to 0.84.
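
    The kappa values reported in this record measure agreement beyond chance between the classification and the reference data. A minimal sketch of Cohen's kappa computed from a confusion matrix is given below; the matrix values are made up and unrelated to the study.

```python
def cohen_kappa(confusion):
    """Cohen's kappa for a square confusion matrix (rows: reference, columns: predicted)."""
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(n_classes)) / total
    # Chance agreement: product of row and column marginals, summed over classes.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(n_classes)
    ) / (total * total)
    return (observed - expected) / (1 - expected)


matrix = [[40, 10],
          [5, 45]]  # made-up counts
print(round(cohen_kappa(matrix), 2))
```

    With these made-up counts the overall accuracy is 0.85, yet kappa is lower because half of that agreement would be expected by chance alone.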

  4. Prediction of Severe Acute Pancreatitis Using a Decision Tree Model Based on the Revised Atlanta Classification of Acute Pancreatitis

    PubMed Central

    Zhang, Yushun; Yang, Chong; Gou, Shanmiao; Li, Yongfeng; Xiong, Jiongxin; Wu, Heshui; Wang, Chunyou

    2015-01-01

    Objective: To develop a model for the early prediction of severe acute pancreatitis based on the revised Atlanta classification of acute pancreatitis. Methods: Clinical data of 1308 patients with acute pancreatitis (AP) were included in the retrospective study. A total of 603 patients who were admitted to the hospital within 36 hours of the onset of the disease were ultimately included according to the inclusion criteria. The clinical data were collected within 12 hours after admission. All the patients were classified as having mild acute pancreatitis (MAP), moderately severe acute pancreatitis (MSAP) and severe acute pancreatitis (SAP) based on the revised Atlanta classification of acute pancreatitis. All 603 patients were randomly divided into a training group (402 cases) and a test group (201 cases). Univariate and multiple regression analyses were used to identify the independent risk factors for the development of SAP in the training group. Then the prediction model was constructed using the decision tree method, and this model was applied to the test group to evaluate its validity. Results: The decision tree model was developed using creatinine, lactate dehydrogenase, and oxygenation index to predict SAP. The diagnostic sensitivity and specificity of SAP in the training group were 80.9% and 90.0%, respectively, and the sensitivity and specificity in the test group were 88.6% and 90.4%, respectively. Conclusions: The decision tree model based on creatinine, lactate dehydrogenase, and oxygenation index is more likely to predict the occurrence of SAP. PMID:26580397
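
    Sensitivity and specificity, the two measures reported in this record, follow directly from the counts of true and false positives and negatives. The sketch below uses made-up labels (1 marks the positive class) and is not derived from the study's data.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)  # share of true positives that were detected
    specificity = tn / (tn + fp)  # share of true negatives that were rejected
    return sensitivity, specificity


truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]  # made-up reference labels
preds = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]  # made-up model output
print(sensitivity_specificity(truth, preds))
```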

  5. A three-way approach for protein function classification

    PubMed Central

    2017-01-01

    The knowledge of protein functions plays an essential role in understanding biological cells and has a significant impact on human life in areas such as personalized medicine, better crops and improved therapeutic interventions. Due to expense and inherent difficulty of biological experiments, intelligent methods are generally relied upon for automatic assignment of functions to proteins. The technological advancements in the field of biology are improving our understanding of biological processes and are regularly resulting in new features and characteristics that better describe the role of proteins. It is inevitable to neglect and overlook these anticipated features in designing more effective classification techniques. A key issue in this context, that is not being sufficiently addressed, is how to build effective classification models and approaches for protein function prediction by incorporating and taking advantage from the ever evolving biological information. In this article, we propose a three-way decision making approach which provides provisions for seeking and incorporating future information. We considered probabilistic rough sets based models such as Game-Theoretic Rough Sets (GTRS) and Information-Theoretic Rough Sets (ITRS) for inducing three-way decisions. An architecture of protein functions classification with probabilistic rough sets based three-way decisions is proposed and explained. Experiments are carried out on Saccharomyces cerevisiae species dataset obtained from Uniprot database with the corresponding functional classes extracted from the Gene Ontology (GO) database. The results indicate that as the level of biological information increases, the number of deferred cases are reduced while maintaining similar level of accuracy. PMID:28234929

  6. Multifaceted approach to the diagnosis and classification of acute leukemias.

    PubMed

    McKenna, R W

    2000-08-01

    Until recently, the diagnosis and classification of acute myeloid (AML) and acute lymphoblastic (ALL) leukemias was based almost exclusively on well-defined morphologic criteria and cytochemical stains. Although most cases can be diagnosed by these methods, there is only modest correlation between morphologic categories and treatment responsiveness and prognosis. The expansion of therapeutic options and improvement in remission induction and disease-free survival for both AML and ALL have stimulated emphasis on defining good and poor treatment response groups. This is most effectively accomplished by a multifaceted approach to diagnosis and classification using immunophenotyping, cytogenetics, and molecular analysis in addition to the traditional methods. Immunophenotyping is important in characterizing morphologically poorly differentiated acute leukemias and in defining prognostic categories of ALL. Cytogenetic and molecular studies provide important prognostic information and are becoming vitally important in determining the appropriate treatment protocol. With optimal application of these techniques in the diagnosis of acute leukemias, treatment strategies can be more specifically directed and new therapeutic approaches can be evaluated more effectively.

  7. Incremental Transductive Learning Approaches to Schistosomiasis Vector Classification

    NASA Astrophysics Data System (ADS)

    Fusco, Terence; Bi, Yaxin; Wang, Haiying; Browne, Fiona

    2016-08-01

    The key issues pertaining to collection of epidemic disease data for our analysis purposes are that it is a labour intensive, time consuming and expensive process resulting in availability of sparse sample data which we use to develop prediction models. To address this sparse data issue, we present the novel Incremental Transductive methods to circumvent the data collection process by applying previously acquired data to provide consistent, confidence-based labelling alternatives to field survey research. We investigated various reasoning approaches for semi-supervised machine learning including Bayesian models for labelling data. The results show that using the proposed methods, we can label instances of data with a class of vector density at a high level of confidence. By applying the Liberal and Strict Training Approaches, we provide a labelling and classification alternative to standalone algorithms. The methods in this paper are components in the process of reducing the proliferation of the Schistosomiasis disease and its effects.

  8. A new approach to modeling tree rainfall interception

    NASA Astrophysics Data System (ADS)

    Xiao, Qingfu; McPherson, E. Gregory; Ustin, Susan L.; Grismer, Mark E.

    2000-12-01

    A three-dimensional physically based stochastic model was developed to describe canopy rainfall interception processes at desired spatial and temporal resolutions. Such model development is important to understand these processes because forest canopy interception may exceed 59% of annual precipitation in old growth trees. The model describes the interception process from a single leaf, to a branch segment, and then up to the individual tree level. It takes into account rainfall, meteorology, and canopy architecture factors as explicit variables. Leaf and stem surface roughness, architecture, and geometric shape control both leaf drip and stemflow. Model predictions were evaluated using actual interception data collected for two mature open grown trees, a 9-year-old broadleaf deciduous pear tree (Pyrus calleryana "Bradford" or Callery pear) and an 8-year-old broadleaf evergreen oak tree (Quercus suber or cork oak). When simulating 18 rainfall events for the oak tree and 16 rainfall events for the pear tree, the model overestimated interception loss by 4.5% and 3.0%, respectively, while stemflow was underestimated by 0.8% and 3.3%, and throughfall was underestimated by 3.7% for the oak tree and overestimated by 0.3% for the pear tree. A model sensitivity analysis indicates that canopy surface storage capacity had the greatest influence on interception, and interception losses were sensitive to leaf and stem surface area indices. Among rainfall factors, interception losses relative to gross precipitation were most sensitive to rainfall amount. Rainfall incident angle had a significant effect on total precipitation intercepting the projected surface area. Stemflow was sensitive to stem segment and leaf zenith angle distributions. Enhanced understanding of interception loss dynamics should lead to improved urban forest ecosystem management.

  9. A comprehensive but efficient framework of proposing and validating feature parameters from airborne LiDAR data for tree species classification

    NASA Astrophysics Data System (ADS)

    Lin, Yi; Hyyppä, Juha

    2016-04-01

    Tree species information is crucial for digital forestry, and efficient techniques for classifying tree species are extensively demanded. To this end, airborne light detection and ranging (LiDAR) has been introduced. However, the literature review suggests that most of the previous airborne LiDAR-based studies were only based on limited kinds of tree signatures. To address this gap, this study proposed developing a novel modular framework for LiDAR-based tree species classification, by deriving feature parameters in a systematic way. Specifically, feature parameters of point-distribution (PD), laser pulse intensity (IN), crown-internal (CI) and tree-external (TE) structures were proposed and derived. With a support-vector-machine (SVM) classifier used, the classifications were conducted in a leave-one-out-for-cross-validation (LOOCV) mode. Based on the samples of four typical boreal tree species, i.e., Picea abies, Pinus sylvestris, Populus tremula and Quercus robur, tests showed that the accuracies of the classifications based on the acquired PD-, IN-, CI- and TE-categorized feature parameters as well as the integration of their individual optimal parameters are 65.00%, 80.00%, 82.50%, 85.00% and 92.50%, respectively. These results indicate that the procedures proposed in this study can be used as a comprehensive but efficient framework of proposing and validating feature parameters from airborne LiDAR data for tree species classification.

  10. Human and tree classification based on a model using 3D ladar in a GPS-denied environment

    NASA Astrophysics Data System (ADS)

    Cho, Kuk; Baeg, Seung-Ho; Park, Sangdeok

    2013-05-01

    This study explained a method to classify humans and trees by extracting their geometric and statistical features in data obtained from 3D LADAR. In a wooded GPS-denied environment, it is difficult to identify the location of unmanned ground vehicles and it is also difficult to properly recognize the environment in which these vehicles move. In this study, using the point cloud data obtained via 3D LADAR, a method to extract the features of humans, trees, and other objects within an environment was implemented and verified through the processes of segmentation, feature extraction, and classification. First, for the segmentation, the radially bounded nearest neighbor method was applied. Second, for the feature extraction, each segmented object was divided into three parts, and then their geometrical and statistical features were extracted. A human was divided into three parts: the head, trunk and legs. A tree was also divided into three parts: the top, middle, and bottom. The geometric features were the variance of the x-y data for the center of each part in an object, using the distance between the two central points for each part, using K-means clustering. The statistical features were the variance of each of the parts. In this study, three, six and six features of data were extracted, respectively, resulting in a total of 15 features. Finally, after training the extracted data via an artificial network, new data were classified. This study showed the results of an experiment that applied an algorithm proposed with a vehicle equipped with 3D LADAR in a thickly forested area, which is a GPS-denied environment. A total of 5,158 segments were obtained and the classification rates for human and trees were 82.9% and 87.4%, respectively.
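
    The segmentation step named in this record, radially bounded nearest neighbor, groups points that can be reached from one another through hops no longer than a fixed radius. The sketch below is a simplified brute-force version under that assumption; the published method may differ in details such as neighbor search or minimum segment size, and the points and radius are made up.

```python
import math


def rbnn_segments(points, radius):
    """Assign a segment label to each point.

    Two points share a label when they are linked by a chain of points
    whose consecutive distances are all within `radius`.
    """
    labels = [-1] * len(points)
    next_label = 0
    for start in range(len(points)):
        if labels[start] != -1:
            continue
        labels[start] = next_label
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(len(points)):
                if labels[j] == -1 and math.dist(points[i], points[j]) <= radius:
                    labels[j] = next_label
                    stack.append(j)
        next_label += 1
    return labels


cloud = [(0, 0, 0), (0.5, 0, 0), (1.0, 0, 0), (5, 5, 0), (5.4, 5, 0)]  # made-up points
print(rbnn_segments(cloud, 0.6))
```

    The pairwise search is quadratic in the number of points; a real point cloud would normally call for a spatial index such as a k-d tree.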

  11. Rule based fuzzy logic approach for classification of fibromyalgia syndrome.

    PubMed

    Arslan, Evren; Yildiz, Sedat; Albayrak, Yalcin; Koklukaya, Etem

    2016-06-01

    Fibromyalgia syndrome (FMS) is a chronic muscle and skeletal system disease observed generally in women, manifesting itself with a widespread pain and impairing the individual's quality of life. FMS diagnosis is made based on the American College of Rheumatology (ACR) criteria. However, recently the employability and sufficiency of ACR criteria are under debate. In this context, several evaluation methods, including clinical evaluation methods were proposed by researchers. Accordingly, ACR had to update their criteria announced back in 1990, 2010 and 2011. Proposed rule based fuzzy logic method aims to evaluate FMS at a different angle as well. This method contains a rule base derived from the 1990 ACR criteria and the individual experiences of specialists. The study was conducted using the data collected from 60 inpatient and 30 healthy volunteers. Several tests and physical examination were administered to the participants. The fuzzy logic rule base was structured using the parameters of tender point count, chronic widespread pain period, pain severity, fatigue severity and sleep disturbance level, which were deemed important in FMS diagnosis. It has been observed that generally fuzzy predictor was 95.56 % consistent with at least of the specialists, who are not a creator of the fuzzy rule base. Thus, in diagnosis classification where the severity of FMS was classified as well, consistent findings were obtained from the comparison of interpretations and experiences of specialists and the fuzzy logic approach. The study proposes a rule base, which could eliminate the shortcomings of 1990 ACR criteria during the FMS evaluation process. Furthermore, the proposed method presents a classification on the severity of the disease, which was not available with the ACR criteria. The study was not limited to only disease classification but at the same time the probability of occurrence and severity was classified. In addition, those who were not suffering from FMS were

  12. Target-classification approach applied to active UXO sites

    NASA Astrophysics Data System (ADS)

    Shubitidze, F.; Fernández, J. P.; Shamatava, Irma; Barrowes, B. E.; O'Neill, K.

    2013-06-01

    This study is designed to illustrate the discrimination performance at two UXO active sites (Oklahoma's Fort Sill and the Massachusetts Military Reservation) of a set of advanced electromagnetic induction (EMI) inversion/discrimination models which include the orthonormalized volume magnetic source (ONVMS), joint diagonalization (JD), and differential evolution (DE) approaches and whose power and flexibility greatly exceed those of the simple dipole model. The Fort Sill site is highly contaminated by a mix of the following types of munitions: 37-mm target practice tracers, 60-mm illumination mortars, 75-mm and 4.5'' projectiles, 3.5'', 2.36'', and LAAW rockets, antitank mine fuzes with and without hex nuts, practice MK2 and M67 grenades, 2.5'' ballistic windshields, M2A1-mines with/without bases, M19-14 time fuzes, and 40-mm practice grenades with/without cartridges. The MMR site contains targets of yet different sizes. In this work we apply our models to EMI data collected using the MetalMapper (MM) and 2 × 2 TEMTADS sensors. The data for each anomaly are inverted to extract estimates of the extrinsic and intrinsic parameters associated with each buried target. (The latter include the total volume magnetic source or NVMS, which relates to size, shape, and material properties; the former includes location, depth, and orientation). The estimated intrinsic parameters are then used for classification performed via library matching and the use of statistical classification algorithms; this process yielded prioritized dig-lists that were submitted to the Institute for Defense Analyses (IDA) for independent scoring. The models' classification performance is illustrated and assessed based on these independent evaluations.

  13. Active Optical Sensors for Tree Stem Detection and Classification in Nurseries

    PubMed Central

    Garrido, Miguel; Perez-Ruiz, Manuel; Valero, Constantino; Gliever, Chris J.; Hanson, Bradley D.; Slaughter, David C.

    2014-01-01

    Active optical sensing (LIDAR and light curtain transmission) devices mounted on a mobile platform can correctly detect, localize, and classify trees. To conduct an evaluation and comparison of the different sensors, an optical encoder wheel was used for vehicle odometry and provided a measurement of the linear displacement of the prototype vehicle along a row of tree seedlings as a reference for each recorded sensor measurement. The field trials were conducted in a juvenile tree nursery with one-year-old grafted almond trees at Sierra Gold Nurseries, Yuba City, CA, United States. Through these tests and subsequent data processing, each sensor was individually evaluated to characterize its reliability, as well as its advantages and disadvantages for the proposed task. Test results indicated that 95.7% and 99.48% of the trees were successfully detected with the LIDAR and light curtain sensors, respectively. LIDAR correctly classified trees as alive or dead at a 93.75% success rate, compared to 94.16% for the light curtain sensor. These results can help system designers select the most reliable sensor for the accurate detection and localization of each tree in a nursery, which might allow labor-intensive tasks, such as weeding, to be automated without damaging crops. PMID:24949638

  14. Active optical sensors for tree stem detection and classification in nurseries.

    PubMed

    Garrido, Miguel; Perez-Ruiz, Manuel; Valero, Constantino; Gliever, Chris J; Hanson, Bradley D; Slaughter, David C

    2014-06-19

    Active optical sensing (LIDAR and light curtain transmission) devices mounted on a mobile platform can correctly detect, localize, and classify trees. To conduct an evaluation and comparison of the different sensors, an optical encoder wheel was used for vehicle odometry and provided a measurement of the linear displacement of the prototype vehicle along a row of tree seedlings as a reference for each recorded sensor measurement. The field trials were conducted in a juvenile tree nursery with one-year-old grafted almond trees at Sierra Gold Nurseries, Yuba City, CA, United States. Through these tests and subsequent data processing, each sensor was individually evaluated to characterize its reliability, as well as its advantages and disadvantages for the proposed task. Test results indicated that 95.7% and 99.48% of the trees were successfully detected with the LIDAR and light curtain sensors, respectively. LIDAR correctly classified trees as alive or dead at a 93.75% success rate, compared to 94.16% for the light curtain sensor. These results can help system designers select the most reliable sensor for the accurate detection and localization of each tree in a nursery, which might allow labor-intensive tasks, such as weeding, to be automated without damaging crops.

  15. A new multi criteria classification approach in a multi agent system applied to SEEG analysis.

    PubMed

    Kinié, A; Ndiaye, M; Montois, J J; Jacquelet, Y

    2007-01-01

    This work is focused on the study of the organization of the SEEG signals during epileptic seizures with a multi-agent system approach. This approach is based on cooperative mechanisms of auto-organization at the micro level and of emergence of a global function at the macro level. In order to evaluate this approach we propose a distributed collaborative approach for the classification of the interesting signals. This new multi-criteria classification method is able to provide a relevant organisation of brain area structures and to bring out elements of epileptogenic networks. The method is compared to another classification approach, fuzzy classification, and gives better results when applied to SEEG signals.

  16. Comparison of four approaches to a rock facies classification problem

    USGS Publications Warehouse

    Dubois, M.K.; Bohling, G.C.; Chakrabarti, S.

    2007-01-01

    In this study, seven classifiers based on four different approaches were tested in a rock facies classification problem: classical parametric methods using Bayes' rule, and non-parametric methods using fuzzy logic, k-nearest neighbor, and feed forward-back propagating artificial neural network. Determining the most effective classifier for geologic facies prediction in wells without cores in the Panoma gas field, in Southwest Kansas, was the objective. Study data include 3600 samples with known rock facies class (from core) with each sample having either four or five measured properties (wire-line log curves), and two derived geologic properties (geologic constraining variables). The sample set was divided into two subsets, one for training and one for testing the ability of the trained classifier to correctly assign classes. Artificial neural networks clearly outperformed all other classifiers and are effective tools for this particular classification problem. Classical parametric models were inadequate due to the nature of the predictor variables (high dimensional and not linearly correlated), and feature space of the classes (overlapping). The other non-parametric methods tested, k-nearest neighbor and fuzzy logic, would need considerable improvement to match the neural network effectiveness, but further work, possibly combining certain aspects of the three non-parametric methods, may be justified. © 2006 Elsevier Ltd. All rights reserved.

  17. Reflectance properties of West African savanna trees from ground radiometer measurements. II - Classification of components

    NASA Technical Reports Server (NTRS)

    Hanan, N. P.; Prince, S. D.; Franklin, J.

    1993-01-01

    A pole-mounted radiometer was used to measure the reflectance properties in the red and near-IR of three Sahelian tree species. These properties are classified depending on their location over the canopy. A geometrical description of the patterns of shadow and sunlight on and beneath a model tree when viewed from above is given, and six components are defined. Tree canopies are found to be dark in the red waveband with respect to the soil, but have little or no effect on the near-IR.

  18. Trees

    NASA Astrophysics Data System (ADS)

    Epstein, Henri

    2016-11-01

    An algebraic formalism, developed with V. Glaser and R. Stora for the study of the generalized retarded functions of quantum field theory, is used to prove a factorization theorem which provides a complete description of the generalized retarded functions associated with any tree graph. Integrating over the variables associated to internal vertices to obtain the perturbative generalized retarded functions for interacting fields arising from such graphs is shown to be possible for a large category of space-times.

  19. Industrial and occupational ergonomics in the petrochemical process industry: a regression trees approach.

    PubMed

    Bevilacqua, M; Ciarapica, F E; Giacchetta, G

    2008-07-01

    This work is an attempt to apply classification tree methods to data regarding accidents in a medium-sized refinery, so as to identify the important relationships between the variables, which can be considered as decision-making rules when adopting any measures for improvement. The results obtained using the CART (Classification And Regression Trees) method proved to be the most precise and, in general, they are encouraging concerning the use of tree diagrams as preliminary explorative techniques for the assessment of the ergonomic, management and operational parameters which influence high accident risk situations. The Occupational Injury analysis carried out in this paper was planned as a dynamic process and can be repeated systematically. The CART technique, which considers a very wide set of objective and predictive variables, shows new cause-effect correlations in occupational safety which had never been previously described, highlighting possible injury risk groups and supporting decision-making in these areas. The use of classification trees must not, however, be seen as an attempt to supplant other techniques, but as a complementary method which can be integrated into traditional types of analysis.

  20. An Approach for Automatic Classification of Radiology Reports in Spanish.

    PubMed

    Cotik, Viviana; Filippo, Darío; Castaño, José

    2015-01-01

    Automatic detection of relevant terms in medical reports is useful for educational purposes and for clinical research. Natural language processing (NLP) techniques can be applied in order to identify them. In this work we present an approach to classify radiology reports written in Spanish into two sets: the ones that indicate pathological findings and the ones that do not. In addition, the entities corresponding to pathological findings are identified in the reports. We use RadLex, a lexicon of English radiology terms, and NLP techniques to identify the occurrence of pathological findings. Reports are classified using a simple algorithm based on the presence of pathological findings, negation and hedge terms. The implemented algorithms were tested with a test set of 248 reports annotated by an expert, obtaining a best result of 0.72 F1 measure. The output of the classification task can be used to look for specific occurrences of pathological findings.

  1. A Visual Analytics Approach for Correlation, Classification, and Regression Analysis

    SciTech Connect

    Steed, Chad A; Swan II, J. Edward; Fitzpatrick, Patrick J.; Jankun-Kelly, T.J.

    2012-02-01

    New approaches that combine the strengths of humans and machines are necessary to equip analysts with the proper tools for exploring today's increasingly complex, multivariate data sets. In this paper, a novel visual data mining framework, called the Multidimensional Data eXplorer (MDX), is described that addresses the challenges of today's data by combining automated statistical analytics with a highly interactive parallel coordinates based canvas. In addition to several intuitive interaction capabilities, this framework offers a rich set of graphical statistical indicators, interactive regression analysis, visual correlation mining, automated axis arrangements and filtering, and data classification techniques. The current work provides a detailed description of the system as well as a discussion of key design aspects and critical feedback from domain experts.

  2. A Visual Analytics Approach for Correlation, Classification, and Regression Analysis

    SciTech Connect

    Steed, Chad A; Swan II, J. Edward; Fitzpatrick, Patrick J.; Jankun-Kelly, T.J.

    2013-01-01

    New approaches that combine the strengths of humans and machines are necessary to equip analysts with the proper tools for exploring today's increasingly complex, multivariate data sets. In this paper, a visual data mining framework, called the Multidimensional Data eXplorer (MDX), is described that addresses the challenges of today's data by combining automated statistical analytics with a highly interactive parallel coordinates based canvas. In addition to several intuitive interaction capabilities, this framework offers a rich set of graphical statistical indicators, interactive regression analysis, visual correlation mining, automated axis arrangements and filtering, and data classification techniques. This chapter provides a detailed description of the system as well as a discussion of key design aspects and critical feedback from domain experts.

  3. GIS-based groundwater potential mapping using boosted regression tree, classification and regression tree, and random forest machine learning models in Iran.

    PubMed

    Naghibi, Seyed Amir; Pourghasemi, Hamid Reza; Dixon, Barnali

    2016-01-01

    Groundwater is considered one of the most valuable fresh water resources. The main objective of this study was to produce groundwater spring potential maps in the Koohrang Watershed, Chaharmahal-e-Bakhtiari Province, Iran, using three machine learning models: boosted regression tree (BRT), classification and regression tree (CART), and random forest (RF). Thirteen hydrological-geological-physiographical (HGP) factors that influence locations of springs were considered in this research. These factors include slope degree, slope aspect, altitude, topographic wetness index (TWI), slope length (LS), plan curvature, profile curvature, distance to rivers, distance to faults, lithology, land use, drainage density, and fault density. Subsequently, groundwater spring potential was modeled and mapped using CART, RF, and BRT algorithms. The predicted results from the three models were validated using the receiver operating characteristics curve (ROC). From 864 springs identified, 605 (≈70 %) locations were used for the spring potential mapping, while the remaining 259 (≈30 %) springs were used for the model validation. The area under the curve (AUC) for the BRT model was calculated as 0.8103 and for CART and RF the AUC were 0.7870 and 0.7119, respectively. Therefore, it was concluded that the BRT model produced the best prediction results while predicting locations of springs followed by CART and RF models, respectively. Geospatially integrated BRT, CART, and RF methods proved to be useful in generating the spring potential map (SPM) with reasonable accuracy.
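
The validation step summarized above (a roughly 70/30 split of the 864 springs, then ranking models by area under the ROC curve) can be illustrated with a minimal sketch. The labels and scores below are hypothetical toy values, not data from the study, and `roc_auc` is an illustrative helper that uses the standard rank-based (Mann-Whitney) identity rather than whatever software the authors used.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity.

    AUC equals the fraction of (positive, negative) pairs in which the
    positive sample receives the higher score; ties count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Split sizes reported in the abstract: 605 training and 259 validation springs.
n_total, n_train, n_valid = 864, 605, 259
train_fraction = n_train / n_total
valid_fraction = n_valid / n_total

# Hypothetical validation labels (1 = spring present) and model scores.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the reported values of 0.8103, 0.7870, and 0.7119 are compared.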

  4. An improved classification tree analysis of high cost modules based upon an axiomatic definition of complexity

    NASA Technical Reports Server (NTRS)

    Tian, Jianhui; Porter, Adam; Zelkowitz, Marvin V.

    1992-01-01

    Identification of high cost modules has been viewed as one mechanism to improve overall system reliability, since such modules tend to produce more than their share of problems. A decision tree model was used to identify such modules. In this current paper, a previously developed axiomatic model of program complexity is merged with the previously developed decision tree process for an improvement in the ability to identify such modules. This improvement was tested using data from the NASA Software Engineering Laboratory.

  5. A comparison of feature selection methods for multitemporal tree species classification

    NASA Astrophysics Data System (ADS)

    Pipkins, Kyle; Förster, Michael; Clasen, Anne; Schmidt, Tobias; Kleinschmit, Birgit

    2014-10-01

    The problem of feature selection is a significant one in classification problems, where the addition of too many features to the classification fails to lead to significant increases in classification accuracy. This problem is especially significant within the context of multitemporal remote sensing classifications, where the costs and efforts associated with the acquisition of additional imagery can be extensive. It would thus be beneficial to identify the most important seasons for acquiring imagery for specific land cover types. This study uses a phenologically-adjusted 21 date RapidEye time-series in order to evaluate two methods of feature selection. The two methods compared in this study are a genetic algorithm (GA) and a semi-exhaustive method (EXH), both of which compare permutations of sequential date and band combinations. These methods are employed using a seven class support vector machine classification on a Normalized Difference Vegetation Index (NDVI)-transformed dataset. Overall accuracy (OAA) is used as the performance metric, and OAA significance is assessed using the McNemar test. The results from the feature selection methods are compared on the basis of phenological seasons selected across all iterations and the ideal number of combinations, based on the ratio of better performing classifications to all other classifications. The results suggest that the GA has a moderate but insignificant correlation when compared with the EXH for identifying ideal phenological seasons (overall Spearman's ρ= 0.60, p = 0.13), but is comparable when considering the number of seasons and image combinations.
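
For readers unfamiliar with the NDVI transformation mentioned above, the standard definition is (NIR - Red) / (NIR + Red). The following is a minimal sketch of that formula only; the function name, the zero-denominator convention, and the sample reflectances are assumptions for illustration and do not come from the study.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel.

    nir and red are reflectances in the near-infrared and red bands.
    Returning 0.0 when both bands are zero is a convention assumed here.
    """
    total = nir + red
    if total == 0:
        return 0.0
    return (nir - red) / total


# Hypothetical reflectances: dense vegetation reflects strongly in the NIR.
vegetated = ndvi(0.5, 0.1)
bare_soil = ndvi(0.2, 0.2)
```

Values close to 1 indicate dense green vegetation, while values near 0 indicate surfaces with similar red and near-infrared reflectance.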

  6. The importance of chemosensory clues in Aguaruna tree classification and identification

    PubMed Central

    Jernigan, Kevin A

    2008-01-01

    Background The ethnobotanical literature still contains few detailed descriptions of the sensory criteria people use for judging membership in taxonomic categories. Olfactory criteria in particular have been explored very little. This paper will describe the importance of odor for woody plant taxonomy and identification among the Aguaruna Jívaro of the northern Peruvian Amazon, focusing on the Aguaruna category númi (trees excluding palms). Aguaruna informants almost always place trees that they consider to have a similar odor together as kumpají – 'companions,' a metaphor they use to describe trees that they consider to be related. Methods The research took place in several Aguaruna communities in the upper Marañón region of the Peruvian Amazon. Structured interview data focus on informant criteria for membership in various folk taxa of trees. Informants were also asked to explain what members of each group of related companions had in common. This paper focuses on odor and taste criteria that came to light during these structured interviews. Botanical voucher specimens were collected, wherever possible. Results Of the 182 tree folk genera recorded in this study, 51 (28%) were widely considered to possess a distinctive odor. Thirty nine of those (76%) were said to have odors similar to some other tree, while the other 24% had unique odors. Aguaruna informants very rarely described tree odors in non-botanical terms. Taste was used mostly to describe trees with edible fruits. Trees judged to be related were nearly always in the same botanical family. Conclusion The results of this study illustrate that odor of bark, sap, flowers, fruit and leaves are important clues that help the Aguaruna to judge the relatedness of trees found in their local environment. In contrast, taste appears to play a more limited role. The results suggest a more general ethnobotanical hypothesis that could be tested in other cultural settings: people tend to consider plants with

  7. Hybrid Classification of Pulmonary Nodules

    NASA Astrophysics Data System (ADS)

    Lee, S. L. A.; Kouzani, A. Z.; Hu, E. J.

    Automated classification of lung nodules is challenging because of the variation in shape and size of lung nodules, as well as their associated differences in their images. Ensemble based learners have demonstrated the potential of good performance. Random forests are employed for pulmonary nodule classification where each tree in the forest produces a classification decision, and an integrated output is calculated. A classification aided by clustering approach is proposed to improve the lung nodule classification performance. Three experiments are performed using the LIDC lung image database of 32 cases. The classification performance and execution times are presented and discussed.

  8. Hierarchical Multinomial Processing Tree Models: A Latent-Trait Approach

    ERIC Educational Resources Information Center

    Klauer, Karl Christoph

    2010-01-01

    Multinomial processing tree models are widely used in many areas of psychology. A hierarchical extension of the model class is proposed, using a multivariate normal distribution of person-level parameters with the mean and covariance matrix to be estimated from the data. The hierarchical model allows one to take variability between persons into…

  9. Deep water X-mas tree standardization -- Interchangeability approach

    SciTech Connect

    Paula, M.T.R.; Paulo, C.A.S.; Moreira, C.C.

    1995-12-31

    Standardization of subsea equipment interfaces is a tool that can play a very important role in rationalizing subsea operations and making the production of oil and gas more economical and reliable. Continuing the program initiated some years ago, Petrobras is now harvesting the results from the first efforts. Diverless guidelineless subsea Christmas trees from four different suppliers have already been manufactured in accordance with the standardized specification. Tests performed this year in Macae (Campos Basin onshore base), in Brazil, confirmed the interchangeability among subsea Christmas trees, tubing hangers, adapter bases and flowline hubs of different manufacturers. This interchangeability, associated with the use of proven techniques, results in operational flexibility, savings in rig time and reduction in production losses during workovers. By now, 33 complete sets of subsea Christmas trees have already been delivered and successfully tested. Another 28 sets are still being manufactured by the four local suppliers. For the next five years, more than a hundred of these trees will be required for the exploration of the new discoveries. This paper describes the standardized equipment, the role of the operator in an integrated way of working with the manufacturers on the standardization activities, the importance of a frank information flow through the involved companies and how a simple manufacturing philosophy, with the use of construction jigs, has proved to work satisfactorily.

  10. A Fault Tree Approach to Analysis of Organizational Communication Systems.

    ERIC Educational Resources Information Center

    Witkin, Belle Ruth; Stephens, Kent G.

    Fault Tree Analysis (FTA) is a method of examining communication in an organization by focusing on: (1) the complex interrelationships in human systems, particularly in communication systems; (2) interactions across subsystems and system boundaries; and (3) the need to select and "prioritize" channels which will eliminate noise in the…

  11. Bacillary dysentery and meteorological factors in northeastern China: a historical review based on classification and regression trees.

    PubMed

    Guan, Peng; Huang, Desheng; Guo, Junqiao; Wang, Ping; Zhou, Baosen

    2008-09-01

    The relationship between the incidence of bacillary dysentery and meteorological factors was investigated. Data on bacillary dysentery incidence in Shenyang from 1990 to 1996 were obtained from Liaoning Provincial Center for Disease Control and Prevention, and meteorological data such as atmospheric pressure, air temperature, precipitation, evaporation, wind speed, and the amount of solar radiation were obtained from Shenyang Meteorological Bureau. Kendall and Spearman correlations were used to analyze the relationship between bacillary dysentery and meteorological factors. The incidence of bacillary dysentery was treated as a response variable, and meteorological factors were treated as predictor variables. Software R 2.3.1 was used to execute the classification and regression trees (CART). The model improved the accuracy of the fitting results. The residual sum of squared errors of the regression tree model was 53.9, while the residual sum of squared errors of the multivariate linear regression model was 107.2. Among all the meteorological indexes, relative humidity, minimum temperature, and pressure one month prior were statistically influential factors in the multivariate regression tree model. CART may be a useful tool for dealing with heterogeneous data, as it can serve as a decision support tool and is notable for its simplicity and ease.
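
The model comparison above rests on the residual sum of squared errors, where a lower value means a closer fit to the observed incidence. A minimal sketch of that metric follows; the observations and the two sets of fitted values are hypothetical toy numbers, not the Shenyang data, and the function name is illustrative.

```python
def residual_sum_of_squares(observed, predicted):
    """Sum of squared differences between observed and fitted values."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))


# Hypothetical observations and fitted values from two competing models.
observed = [3.0, 5.0, 7.0]
fit_model_a = [2.5, 5.5, 7.0]   # closer fit
fit_model_b = [2.0, 6.0, 8.0]   # looser fit

rss_a = residual_sum_of_squares(observed, fit_model_a)
rss_b = residual_sum_of_squares(observed, fit_model_b)
better_model = "A" if rss_a < rss_b else "B"
```

Note that a lower in-sample value alone does not show that a model generalizes better; the abstract reports fitting results only.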

  12. Hydrometeor classification from polarimetric radar measurements: a clustering approach

    NASA Astrophysics Data System (ADS)

    Grazioli, Jacopo; Tuia, Devis; Berne, Alexis

    2015-04-01

    Hydrometeor classification is the process that aims at identifying the dominant type of hydrometeor (e.g. rain, hail, snow aggregates, graupel, ice crystals) in a domain covered by a polarimetric weather radar during precipitation. The techniques documented in the literature are mostly based on numerical simulations and fuzzy logic. This involves the arbitrary selection of a set of hydrometeor classes and the numerical simulation of theoretical radar observations associated with each class. The information derived from the simulation is then applied to actual radar measurements by means of fuzzy logic input-output association. This approach has some limitations: the number and type of the hydrometeor categories undergoing identification is selected arbitrarily and the scattering simulations are based on constraining assumptions, especially in the case of solid hydrometeors. Furthermore, in the presence of noise and uncertainties, it is not guaranteed that the selected hydrometeor classes can be effectively identified in actual observations. In the present work we propose a different starting point for the classification task, which is based on observations instead of numerical simulations. We provide criteria for the selection of the number of hydrometeor classes that can be identified, by looking at how polarimetric observations collected over different precipitation events form clusters in the multi-dimensional space of the polarimetric variables. Two datasets, collected by an X-band weather radar, are employed in the study. The first dataset covers mountainous weather conditions (Swiss Alps), while the second includes Mediterranean orographic precipitation events collected during the special observation period (SOP) 2012 of the HyMeX campaign. We employ an unsupervised hierarchical clustering method to group the observations into clusters and we introduce a spatial smoothness constraint for the groups, assuming that the hydrometeor type changes smoothly in space

  13. Nonlinear feature extraction for MMW image classification: a supervised approach

    NASA Astrophysics Data System (ADS)

    Maskall, Guy T.; Webb, Andrew R.

    2002-07-01

    The specular nature of Radar imagery causes problems for ATR as small changes to the configuration of targets can result in significant changes to the resulting target signature. This adds to the challenge of constructing a classifier that is both robust to changes in target configuration and capable of generalizing to previously unseen targets. Here, we describe the application of a nonlinear Radial Basis Function (RBF) transformation to perform feature extraction on millimeter-wave (MMW) imagery of target vehicles. The features extracted were used as inputs to a nearest-neighbor classifier to obtain measures of classification performance. The training of the feature extraction stage was by way of a loss function that quantified the amount of data structure preserved in the transformation to feature space. In this paper we describe a supervised extension to the loss function and explore the value of using the supervised training process over the unsupervised approach and compare with results obtained using a supervised linear technique (Linear Discriminant Analysis --- LDA). The data used were Inverse Synthetic Aperture Radar (ISAR) images of armored vehicles gathered at 94GHz and were categorized as Armored Personnel Carrier, Main Battle Tank or Air Defense Unit. We find that the form of supervision used in this work is an advantage when the number of features used for classification is low, with the conclusion that the supervision allows information useful for discrimination between classes to be distilled into fewer features. When only one example of each class is used for training purposes, the LDA results are comparable to the RBF results. However, when an additional example is added per class, the RBF results are significantly better than those from LDA. Thus, the RBF technique seems better able to make use of the extra knowledge available to the system about variability between different examples of the same class.

  14. Remote sensing of aquatic vegetation distribution in Taihu Lake using an improved classification tree with modified thresholds.

    PubMed

    Zhao, Dehua; Jiang, Hao; Yang, Tangwu; Cai, Ying; Xu, Delin; An, Shuqing

    2012-03-01

    Classification trees (CT) have been used successfully in the past to classify aquatic vegetation from spectral indices (SI) obtained from remotely-sensed images. However, applying CT models developed for certain image dates to other time periods within the same year or among different years can reduce the classification accuracy. In this study, we developed CT models with modified thresholds using extreme SI values (CT(m)) to improve the stability of the models when applying them to different time periods. A total of 903 ground-truth samples were obtained in September of 2009 and 2010 and classified as emergent, floating-leaf, or submerged vegetation or other cover types. Classification trees were developed for 2009 (Model-09) and 2010 (Model-10) using field samples and a combination of two images from winter and summer. Overall accuracies of these models were 92.8% and 94.9%, respectively, which confirmed the ability of CT analysis to map aquatic vegetation in Taihu Lake. However, Model-10 had only 58.9-71.6% classification accuracy and 31.1-58.3% agreement (i.e., pixels classified the same in the two maps) for aquatic vegetation when it was applied to image pairs from both a different time period in 2010 and a similar time period in 2009. We developed a method to estimate the effects of extrinsic (EF) and intrinsic (IF) factors on model uncertainty using MODIS images. Results indicated that 71.1% of the instability in classification between time periods was due to EF, which might include changes in atmospheric conditions, sun-view angle and water quality. The remainder was due to IF, such as phenological and growth status differences between time periods. The modified version of Model-10 (i.e. CT(m)) performed better than traditional CT with different image dates. When applied to 2009 images, the CT(m) version of Model-10 had very similar thresholds and performance as Model-09, with overall accuracies of 92.8% and 90.5% for Model-09 and the CT(m) version of Model

  15. Multinomial tree models for assessing the status of the reference in studies of the accuracy of tools for binary classification

    PubMed Central

    Botella, Juan; Huang, Huiling; Suero, Manuel

    2013-01-01

    Studies that evaluate the accuracy of binary classification tools are needed. Such studies provide 2 × 2 cross-classifications of test outcomes and the categories according to an unquestionable reference (or gold standard). However, sometimes a suboptimal reliability reference is employed. Several methods have been proposed to deal with studies where the observations are cross-classified with an imperfect reference. These methods require that the status of the reference, as a gold standard or as an imperfect reference, is known. In this paper a procedure for determining whether it is appropriate to maintain the assumption that the reference is a gold standard or an imperfect reference, is proposed. This procedure fits two nested multinomial tree models, and assesses and compares their absolute and incremental fit. Its implementation requires the availability of the results of several independent studies. These should be carried out using similar designs to provide frequencies of cross-classification between a test and the reference under investigation. The procedure is applied in two examples with real data. PMID:24106484
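
The 2 × 2 cross-classification described above is the raw material for the usual accuracy indices of a binary test. As a minimal sketch, assuming the reference really is a gold standard, sensitivity and specificity follow directly from the four cell counts. The counts and the function name below are hypothetical, and the sketch does not implement the multinomial tree models the paper proposes.

```python
def accuracy_indices(tp, fn, fp, tn):
    """Sensitivity and specificity from a 2 x 2 table.

    tp, fn: reference-positive cases that the test calls positive / negative.
    fp, tn: reference-negative cases that the test calls positive / negative.
    """
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("each reference category needs at least one case")
    sensitivity = tp / (tp + fn)
    specificity = tn / (fp + tn)
    return sensitivity, specificity


# Hypothetical study: 50 reference-positive and 50 reference-negative cases.
sens, spec = accuracy_indices(tp=40, fn=10, fp=5, tn=45)
```

If the reference is itself imperfect, these two ratios are biased estimates, which is why the status of the reference matters for the methods the abstract discusses.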

  16. A Novel Approach on Designing Augmented Fuzzy Cognitive Maps Using Fuzzified Decision Trees

    NASA Astrophysics Data System (ADS)

    Papageorgiou, Elpiniki I.

    This paper proposes a new methodology for designing Fuzzy Cognitive Maps using crisp decision trees that have been fuzzified. A fuzzy cognitive map is a knowledge-based technique that works as an artificial cognitive network inheriting the main aspects of cognitive maps and artificial neural networks. Decision trees, on the other hand, are well known intelligent techniques that extract rules from both symbolic and numeric data. Fuzzy theoretical techniques are used to fuzzify crisp decision trees in order to soften decision boundaries at decision nodes inherent in this type of tree. Comparisons between crisp decision trees and the fuzzified decision trees suggest that the latter fuzzy tree is significantly more robust and produces more balanced decision making. The approach proposed in this paper could incorporate any type of fuzzy decision tree. Through this methodology, new linguistic weights were determined in the FCM model, thus producing an augmented FCM tool. The framework consists of a new fuzzy algorithm to generate linguistic weights that describe the cause-effect relationships among the concepts of the FCM model, from induced fuzzy decision trees.

  17. Increased tree establishment in Lithuanian peat bogs--insights from field and remotely sensed approaches.

    PubMed

    Edvardsson, Johannes; Šimanauskienė, Rasa; Taminskas, Julius; Baužienė, Ieva; Stoffel, Markus

    2015-02-01

    Over the past century an ongoing establishment of Scots pine (Pinus sylvestris L.), sometimes at accelerating rates, is noted at three studied Lithuanian peat bogs, namely Kerėplis, Rėkyva and Aukštumala, all representing different degrees of tree coverage and geographic settings. Present establishment rates seem to depend on tree density on the bog surface and are most significant at sparsely covered sites where about three-fourths of the trees have established since the mid-1990s, whereas the initial establishment in general was during the early to mid-19th century. Three methods were used to detect, compare and describe tree establishment: (1) tree counts in small plots, (2) dendrochronological dating of bog pine trees, and (3) interpretation of aerial photographs and historical maps of the study areas. In combination, the different approaches provide complementary information but also weigh up each other's drawbacks. Tree counts in plots provided a reasonable overview of age class distributions and enabled capturing of the most recently established trees with ages less than 50 years. The dendrochronological analysis yielded accurate tree ages and a good temporal resolution of long-term changes. Tree establishment and spread interpreted from aerial photographs and historical maps provided a good overview of tree spread and total affected area. It also helped to verify the results obtained with the other methods and to upscale the findings to the entire peat bogs. The ongoing spread of trees in predominantly undisturbed peat bogs is related to warmer and/or drier climatic conditions, and to a minor degree to land-use changes. Our results therefore provide valuable insights into vegetation changes in peat bogs, also with respect to bog response to ongoing and future climatic changes.

  18. Spectral difference analysis and airborne imaging classification for citrus greening infected trees

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Citrus greening, also called Huanglongbing (HLB), became a devastating disease spread through citrus groves in Florida, since it was first found in 2005. Multispectral (MS) and hyperspectral (HS) airborne images of citrus groves in Florida were acquired to detect citrus greening infected trees in 20...

  19. Chemical classification of cattle. 2. Phylogenetic tree and specific status of the Zebu.

    PubMed

    Manwell, C; Baker, C M

    1980-01-01

    Phylogenetic trees for the ten major breed groups of cattle were constructed by Farris's (1972) maximum parsimony method, or Fitch & Margoliash's (1967) method, which averages out the deviation over the entire assemblage. Both techniques yield essentially identical trees. The phylogenetic tree for the ten major cattle breed groups can be superimposed on a map of Europe and western Asia, the root of the tree being close to the 'fertile crescent' in Asia Minor, believed to be a primary centre of bovine domestication. For some but not all protein variants there is a cline of gene frequencies as one proceeds from the British Isles and northwest Europe towards southeast Europe and Asia Minor, with the most extreme gene frequencies in the Zebu breeds of India. It is not clear to what extent the observed clines are primary or secondary, i.e., consequent to the initial migrations of cattle towards the end of the Pleistocene or consequent to the many migrations of man with his domesticated cattle. Such clines as exist are not in themselves sufficient to prove either selection versus genetic drift or to establish taxonomic ranking. Contrary to some suggestions in the literature, the biochemical evidence supports Linnaeus's original conclusions: Bos taurus and Bos indicus are distinct species.

  20. Identification, classification and differential expression of oleosin genes in tung tree (Vernicia fordii)

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Triacylglycerols (TAG) are the major molecules of energy storage in eukaryotes. TAG are packed in subcellular structures called oil bodies or lipid droplets. Oleosins (OLE) are the major proteins in plant oil bodies. Multiple isoforms of OLE are present in plants such as tung tree (Vernicia fordii),...

  1. Impacts of age-dependent tree sensitivity and dating approaches on dendrogeomorphic time series of landslides

    NASA Astrophysics Data System (ADS)

    Šilhán, Karel; Stoffel, Markus

    2015-05-01

    Different approaches and thresholds have been utilized in the past to date landslides with growth ring series of disturbed trees. Past work was mostly based on conifer species because of their well-defined ring boundaries and the easy identification of compression wood after stem tilting. More recently, work has been expanded to include broad-leaved trees, which are thought to produce less and less evident reactions after landsliding. This contribution reviews recent progress made in dendrogeomorphic landslide analysis and introduces a new approach in which landslides are dated via ring eccentricity formed after tilting. We compare results of this new and the more conventional approaches. In addition, the paper also addresses tree sensitivity to landslide disturbance as a function of tree age and trunk diameter using 119 common beech (Fagus sylvatica L.) and 39 Crimean pine (Pinus nigra ssp. pallasiana) trees growing on two landslide bodies. The landslide events reconstructed with the classical approach (reaction wood) also appear as events in the eccentricity analysis, but the inclusion of eccentricity clearly allowed for more (162%) landslides to be detected in the tree-ring series. With respect to tree sensitivity, conifers and broad-leaved trees show the strongest reactions to landslides at ages comprised between 40 and 60 years, with a second phase of increased sensitivity in P. nigra at ages of ca. 120-130 years. These phases of highest sensitivities correspond with trunk diameters at breast height of 6-8 and 18-22 cm, respectively (P. nigra). This study thus calls for the inclusion of eccentricity analyses in future landslide reconstructions as well as for the selection of trees belonging to different age and diameter classes to allow for a well-balanced and more complete reconstruction of past events.

  2. Corpus Callosum MR Image Classification

    NASA Astrophysics Data System (ADS)

    Elsayed, A.; Coenen, F.; Jiang, C.; García-Fiñana, M.; Sluming, V.

    An approach to classifying Magnetic Resonance (MR) image data is described. The specific application is the classification of MRI scan data according to the nature of the corpus callosum; however, the approach has more general applicability. A variation of the “spectral segmentation with multi-scale graph decomposition” mechanism is introduced. The result of the segmentation is stored in a quad-tree data structure to which a weighted variation (also developed by the authors) of the gSpan algorithm is applied to identify frequent sub-trees. As a result the images are expressed as a set of frequent sub-trees. There may be a great many of these and thus a decision tree based feature reduction technique is applied before classification takes place. The results show that the proposed approach performs both efficiently and effectively, obtaining a classification accuracy of over 95% in the case of the given application.

  3. Pattern Recognition Approaches for Breast Cancer DCE-MRI Classification: A Systematic Review.

    PubMed

    Fusco, Roberta; Sansone, Mario; Filice, Salvatore; Carone, Guglielmo; Amato, Daniela Maria; Sansone, Carlo; Petrillo, Antonella

    We performed a systematic review of several pattern analysis approaches for classifying breast lesions using dynamic, morphological, and textural features in dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI). Several machine learning approaches, namely artificial neural networks (ANN), support vector machines (SVM), linear discriminant analysis (LDA), tree-based classifiers (TC), and Bayesian classifiers (BC), and features used for classification are described. The findings of a systematic review of 26 studies are presented. The sensitivity and specificity are respectively 91 and 83 % for ANN, 85 and 82 % for SVM, 96 and 85 % for LDA, 92 and 87 % for TC, and 82 and 85 % for BC. The sensitivity and specificity are respectively 82 and 74 % for dynamic features, 93 and 60 % for morphological features, 88 and 81 % for textural features, 95 and 86 % for a combination of dynamic and morphological features, and 88 and 84 % for a combination of dynamic, morphological, and other features. LDA and TC have the best performance. A combination of dynamic and morphological features gives the best performance.
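
    The sensitivity and specificity figures quoted above follow from confusion-matrix counts. A minimal sketch of that arithmetic is given below; the function name and the counts are illustrative assumptions and are not taken from any of the reviewed studies.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of malignant lesions correctly flagged
    specificity = tn / (tn + fp)  # share of benign lesions correctly cleared
    return sensitivity, specificity


# Hypothetical counts chosen only to illustrate the calculation.
sens, spec = sensitivity_specificity(tp=92, fn=8, tn=87, fp=13)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```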

  4. A Cladistic Approach for the Classification of Oligotrichid Ciliates (Ciliophora: Spirotricha)

    PubMed Central

    AGATHA, Sabine

    2010-01-01

    Summary Currently, gene sequence genealogies of the Oligotrichea Bütschli, 1889 comprise only few species. Therefore, a cladistic approach, especially to the Oligotrichida, was made, applying Hennig's method and computer programs. Twenty-three characters were selected and discussed, i.e., the morphology of the oral apparatus (five characters), the somatic ciliature (eight characters), special organelles (four characters), and ontogenetic particulars (six characters). Nine of these characters developed convergently twice. Although several new features were included into the analyses, the cladograms match other morphological trees in the monophyly of the Oligotrichea, Halteriia, Oligotrichia, Oligotrichida, and Choreotrichida. The main synapomorphies of the Oligotrichea are the enantiotropic division mode and the de novo-origin of the undulating membranes. Although the sister group relationship of the Halteriia and the Oligotrichia contradicts results obtained by gene sequence analyses, no morphologic, ontogenetic or ultrastructural features were found, which support a branching of Halteria grandinella within the Stichotrichida. The cladistic approaches suggest paraphyly of the family Strombidiidae probably due to the scarce knowledge. A revised classification of the Oligotrichea is suggested, including all sufficiently known families and genera. PMID:20396404

  5. Single-cell approaches for molecular classification of endocrine tumors

    PubMed Central

    Koh, James; Allbritton, Nancy L.; Sosa, Julie A.

    2015-01-01

    Purpose of review In this review, we summarize recent developments in single-cell technologies that can be employed for the functional and molecular classification of endocrine cells in normal and neoplastic tissue. Recent findings The emergence of new platforms for the isolation, analysis, and dynamic assessment of individual cell identity and reactive behavior enables experimental deconstruction of intratumoral heterogeneity and other contexts, where variability in cell signaling and biochemical responsiveness inform biological function and clinical presentation. These tools are particularly appropriate for examining and classifying endocrine neoplasias, as the clinical sequelae of these tumors are often driven by disrupted hormonal responsiveness secondary to compromised cell signaling. Single-cell methods allow for multidimensional experimental designs incorporating both spatial and temporal parameters with the capacity to probe dynamic cell signaling behaviors and kinetic response patterns dependent upon sequential agonist challenge. Summary Intratumoral heterogeneity in the provenance, composition, and biological activity of different forms of endocrine neoplasia presents a significant challenge for prognostic assessment. Single-cell technologies provide an array of powerful new approaches uniquely well suited for dissecting complex endocrine tumors. Studies examining the relationship between clinical behavior and tumor compositional variations in cellular activity are now possible, providing new opportunities to deconstruct the underlying mechanisms of endocrine neoplasia. PMID:26632769

  6. Potential of Full Waveform Airborne Laser Scanning Data for Urban Area Classification - Transfer of Classification Approaches Between Missions

    NASA Astrophysics Data System (ADS)

    Tran, G.; Nguyen, D.; Milenkovic, M.; Pfeifer, N.

    2015-04-01

    Full-waveform (FWF) LiDAR (Light Detection and Ranging) systems have the advantage of recording the entire backscattered signal of each emitted laser pulse compared to conventional airborne discrete-return laser scanner systems. The FWF systems can provide point clouds which contain extra attributes like amplitude and echo width, etc. In this study, an FWF dataset collected in 2010 for Eisenstadt, a city in the eastern part of Austria, was used to classify four main classes: buildings, trees, waterbody and ground by employing a decision tree. Point density, echo ratio, echo width, normalised digital surface model and point cloud roughness are the main inputs for classification. The accuracy of the final results, in terms of correctness and completeness measures, was assessed by comparison of the classified output to a knowledge-based labelling of the points. Completeness and correctness between 90% and 97% were reached, depending on the class. While such results and methods were presented before, we are investigating additionally the transferability of the classification method (features, thresholds ...) to another urban FWF lidar point cloud. Our conclusions are that from the features used, only echo width requires new thresholds. A data-driven adaptation of thresholds is suggested.
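
    A threshold-based decision tree of the kind described above, together with per-class completeness and correctness, might be sketched as follows. Every threshold, the order of the tests, and the sample points are hypothetical placeholders; the abstract does not report the actual rules or values.

```python
def classify_point(f):
    """Toy decision tree over per-point features; thresholds are assumed."""
    if f["ndsm"] < 0.5:  # height above terrain (m): near the ground
        # Assumption: sparse returns near the ground indicate water.
        return "waterbody" if f["point_density"] < 1.0 else "ground"
    if f["echo_width"] > 4.5 or f["roughness"] > 0.2:
        return "tree"  # wide echoes or rough surface suggest vegetation
    return "building"


def completeness_correctness(predicted, reference, cls):
    """Completeness = TP / (TP + FN); correctness = TP / (TP + FP)."""
    pairs = list(zip(predicted, reference))
    tp = sum(p == cls and r == cls for p, r in pairs)
    fn = sum(p != cls and r == cls for p, r in pairs)
    fp = sum(p == cls and r != cls for p, r in pairs)
    completeness = tp / (tp + fn) if tp + fn else 0.0
    correctness = tp / (tp + fp) if tp + fp else 0.0
    return completeness, correctness


# Hypothetical points with hand-assigned reference labels.
points = [
    {"ndsm": 0.1, "point_density": 8.0, "echo_width": 4.0, "roughness": 0.02},
    {"ndsm": 0.0, "point_density": 0.3, "echo_width": 4.0, "roughness": 0.01},
    {"ndsm": 9.0, "point_density": 8.0, "echo_width": 4.1, "roughness": 0.03},
    {"ndsm": 7.0, "point_density": 8.0, "echo_width": 5.2, "roughness": 0.40},
    {"ndsm": 6.0, "point_density": 8.0, "echo_width": 4.2, "roughness": 0.10},
]
reference = ["ground", "waterbody", "building", "tree", "tree"]
predicted = [classify_point(p) for p in points]

for cls in ("building", "tree"):
    print(cls, completeness_correctness(predicted, reference, cls))
```

    Keeping the thresholds in one small function makes the transferability question concrete: moving to another point cloud would mean re-deriving only the constants that fail, which the abstract reports to be the echo width threshold.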

  7. Narrowing historical uncertainty: probabilistic classification of ambiguously identified tree species in historical forest survey data

    USGS Publications Warehouse

    Mladenoff, D.J.; Dahir, S.E.; Nordheim, E.V.; Schulte, L.A.; Guntenspergen, G.R.

    2002-01-01

    Historical data have increasingly become appreciated for insight into the past conditions of ecosystems. Uses of such data include assessing the extent of ecosystem change; deriving ecological baselines for management, restoration, and modeling; and assessing the importance of past conditions on the composition and function of current systems. One historical data set of this type is the Public Land Survey (PLS) of the United States General Land Office, which contains data on multiple tree species, sizes, and distances recorded at each survey point, located at half-mile (0.8 km) intervals on a 1-mi (1.6 km) grid. This survey method was begun in the 1790s on US federal lands extending westward from Ohio. Thus, the data have the potential of providing a view of much of the US landscape from the mid-1800s, and they have been used extensively for this purpose. However, historical data sources, such as those describing the species composition of forests, can often be limited in the detail recorded and the reliability of the data, since the information was often not originally recorded for ecological purposes. Forest trees are sometimes recorded ambiguously, using generic or obscure common names. For the PLS data of northern Wisconsin, USA, we developed a method to classify ambiguously identified tree species using logistic regression analysis, using data on trees that were clearly identified to species and a set of independent predictor variables to build the models. The models were first created on partial data sets for each species and then tested for fit against the remaining data. Validations were conducted using repeated, random subsets of the data. Model prediction accuracy ranged from 81% to 96% in differentiating congeneric species among oak, pine, ash, maple, birch, and elm. Major predictor variables were tree size, associated species, landscape classes indicative of soil type, and spatial location within the study region. Results help to clarify ambiguities

  8. One or Two Dimensions in Spontaneous Classification: A Simplicity Approach

    ERIC Educational Resources Information Center

    Pothos, Emmanuel M.; Close, James

    2008-01-01

    When participants are asked to spontaneously categorize a set of items, they typically produce unidimensional classifications, i.e., categorize the items on the basis of only one of their dimensions of variation. We examine whether it is possible to predict unidimensional vs. two-dimensional classification on the basis of the abstract stimulus…

  9. Automated detection and classification of lunar craters using multiple approaches

    NASA Astrophysics Data System (ADS)

    Sawabe, Y.; Matsunaga, T.; Rokugawa, S.

    Many missions such as Clementine and SELENE (SELenological and Engineering Explorer) take lunar images for examination. A large volume of imagery data has already been archived and much more is on the way. Extracting the necessary information from the already large and ever growing volume of data is the crucial problem that needs to be overcome. Craters are studied extensively since they provide us with the relative age of the surface unit and more information on the lunar surface geology. Manually extracting craters from lunar images is a difficult task because it requires a great deal of man power as well as specific knowledge and skills of extraction. Several automated crater detection algorithms have been developed but none is yet practical or sufficiently tested to be reliable. Our previous algorithm (Sawabe, Y., Matsunaga, T., Rokugawa, S. Automatic crater detection algorithm for the lunar surface using multiple approaches. J. Remote Sens. Soc. Jpn. 25 (2), 157-168, 2005.) was improved to enhance detection of craters in lunar images and automate crater classification. This algorithm was tested using various images for a wide range of applicability. Four approaches were used with the crater detecting algorithm to find (1) “shady and sunny” patterns in images with low sun angle, (2) circular features in edge images, (3) curves and circles in thinned and connected edge lines, and (4) discrete or broken circular edge lines using fuzzy Hough transform. The algorithm was applied to mare and highland images of the moon captured by Clementine and Apollo under different solar angles and spatial resolution. The new algorithm was able to detect 80% more without parameter tuning. In addition, the detected craters were classified by spectral characteristics derived from Clementine UV-Vis multi-spectral images. Finally, the lunar surface GIS was formulated which has the geological and spectral attributes automatically generated by our algorithm. It could be helpful

  10. Simple, novel approaches to investigating biophysical characteristics of individual mid-latitude deciduous trees

    NASA Astrophysics Data System (ADS)

    Kalibo, Humphrey Wafula

    Forests play a critical role in the functioning of the biosphere and support the livelihoods of millions of people. With increasing anthropogenic influences and looming effects associated with climatic variability, it is crucial that the research community and policy makers take advantage of the capabilities afforded by remote sensing technologies to generate reliable and timely data to support management decisions. Set in the species-rich woodland of Prairie Pines in Lincoln, Nebraska, this research addresses three distinct objectives that could contribute towards forest research and management. First, three supervised classification algorithms were applied to two hyperspectral AISA-Eagle images to evaluate their capability for spectrally identifying selected tree species. The findings show that each algorithm had low to moderate overall classification accuracies (46%-62%), probably due to mixed pixels resulting from pronounced heterogeneity in tree diversity; however, the algorithms could be a rapid means to assess species composition. The second objective is an investigation into how twelve individual morphologically different deciduous trees transmit incoming photosynthetically active radiation (PAR) over the course of the growing season. It was found that more diffuse light was transmitted than direct light, dictated by seasonality, vegetation fraction (VF), and leaf size. In the final objective, VF derived from upward-looking hemispherical photographs of twelve deciduous tree canopies and eight spectral vegetation indices (VIs) calculated from in situ single leaf-level reflectance data were used to investigate whether the VIs could mimic and estimate the temporal patterns of measured VF of each tree over the growing season. The findings show that all the indices accurately depicted the temporal patterns of the photo-derived VF. NDVI and SAVI had the highest correlations (R² > 0.7; RMSE 0.7; E > 0.8) and closely mirrored the temporal patterns of VF for nine
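
    For reference, the two indices named above have standard closed forms: NDVI = (NIR - Red) / (NIR + Red) and SAVI = (1 + L)(NIR - Red) / (NIR + Red + L). The sketch below assumes the commonly used soil adjustment factor L = 0.5 and made-up leaf reflectances; the record does not state which bands or which L value the study used.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    return (nir - red) / (nir + red)


def savi(nir, red, soil_factor=0.5):
    """Soil-Adjusted Vegetation Index; soil_factor is the L term (assumed 0.5)."""
    return (1 + soil_factor) * (nir - red) / (nir + red + soil_factor)


# Hypothetical leaf-level reflectances (unitless, in the range 0..1).
nir_reflectance, red_reflectance = 0.5, 0.1
print(round(ndvi(nir_reflectance, red_reflectance), 3))
print(round(savi(nir_reflectance, red_reflectance), 3))
```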

  11. The Iqmulus Urban Showcase: Automatic Tree Classification and Identification in Huge Mobile Mapping Point Clouds

    NASA Astrophysics Data System (ADS)

    Böhm, J.; Bredif, M.; Gierlinger, T.; Krämer, M.; Lindenberg, R.; Liu, K.; Michel, F.; Sirmacek, B.

    2016-06-01

    Current 3D data capturing as implemented on for example airborne or mobile laser scanning systems is able to efficiently sample the surface of a city by billions of unselective points during one working day. What is still difficult is to extract and visualize meaningful information hidden in these point clouds with the same efficiency. This is where the FP7 IQmulus project enters the scene. IQmulus is an interactive facility for processing and visualizing big spatial data. In this study the potential of IQmulus is demonstrated on a laser mobile mapping point cloud of 1 billion points sampling ~ 10 km of street environment in Toulouse, France. After the data is uploaded to the IQmulus Hadoop Distributed File System, a workflow is defined by the user consisting of retiling the data followed by a PCA driven local dimensionality analysis, which runs efficiently on the IQmulus cloud facility using a Spark implementation. Points scattering in 3 directions are clustered in the tree class, and are separated next into individual trees. Five hours of processing at the 12 node computing cluster results in the automatic identification of 4000+ urban trees. Visualization of the results in the IQmulus fat client helps users to appreciate the results, and developers to identify remaining flaws in the processing workflow.
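
    A PCA-driven local dimensionality analysis can be illustrated on a single point neighbourhood. The sketch below is not the IQmulus Spark implementation: it assumes one common formulation in which the square roots of the covariance eigenvalues (s1 >= s2 >= s3) give linear, planar and scatter scores, and it uses a closed-form eigenvalue routine for symmetric 3x3 matrices to stay dependency-free. All function names are hypothetical.

```python
import math


def covariance3(points):
    """Population covariance matrix of a list of (x, y, z) points."""
    n = len(points)
    mean = [sum(p[i] for p in points) / n for i in range(3)]
    return [[sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in points) / n
             for j in range(3)] for i in range(3)]


def eigenvalues_sym3(a):
    """Eigenvalues of a symmetric 3x3 matrix (trigonometric closed form)."""
    p1 = a[0][1] ** 2 + a[0][2] ** 2 + a[1][2] ** 2
    if p1 == 0.0:  # already diagonal
        return [a[0][0], a[1][1], a[2][2]]
    q = (a[0][0] + a[1][1] + a[2][2]) / 3
    p2 = sum((a[i][i] - q) ** 2 for i in range(3)) + 2 * p1
    p = math.sqrt(p2 / 6)
    b = [[(a[i][j] - (q if i == j else 0.0)) / p for j in range(3)]
         for i in range(3)]
    det_b = (b[0][0] * (b[1][1] * b[2][2] - b[1][2] * b[2][1])
             - b[0][1] * (b[1][0] * b[2][2] - b[1][2] * b[2][0])
             + b[0][2] * (b[1][0] * b[2][1] - b[1][1] * b[2][0]))
    r = max(-1.0, min(1.0, det_b / 2))  # clamp against rounding error
    phi = math.acos(r) / 3
    e1 = q + 2 * p * math.cos(phi)
    e3 = q + 2 * p * math.cos(phi + 2 * math.pi / 3)
    return [e1, 3 * q - e1 - e3, e3]


def dimensionality(points):
    """Return (label, scores) with label in {'linear', 'planar', 'scatter'}."""
    eig = sorted((max(e, 0.0) for e in eigenvalues_sym3(covariance3(points))),
                 reverse=True)
    s1, s2, s3 = (math.sqrt(e) for e in eig)
    if s1 == 0.0:
        raise ValueError("all points coincide; dimensionality is undefined")
    scores = {"linear": (s1 - s2) / s1,
              "planar": (s2 - s3) / s1,
              "scatter": s3 / s1}
    return max(scores, key=scores.get), scores


# Synthetic neighbourhoods: a pole-like line, a facade-like plane, a crown-like blob.
line = [(t, t, t) for t in range(5)]
plane = [(x, y, 0) for x in range(3) for y in range(3)]
blob = [(x, y, z) for x in (0, 1) for y in (0, 1) for z in (0, 1)]
for name, pts in (("line", line), ("plane", plane), ("blob", blob)):
    print(name, dimensionality(pts)[0])
```

    Under this formulation, neighbourhoods labelled as scatter correspond to the points "scattering in 3 directions" that the workflow assigns to the tree class.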

  12. An efficient approach to 3D single tree-crown delineation in LiDAR data

    NASA Astrophysics Data System (ADS)

    Mongus, Domen; Žalik, Borut

    2015-10-01

    This paper proposes a new method for 3D delineation of single tree-crowns in LiDAR data by exploiting the complementarities of treetop and tree trunk detections. A unified mathematical framework is provided based on graph theory, allowing for all the segmentations to be achieved using marker-controlled watersheds. Treetops are defined by detecting concave neighbourhoods within the canopy height model using locally fitted surfaces. These serve as markers for watershed segmentation of the canopy layer where possible oversegmentation is reduced by merging the regions based on their heights, areas, and shapes. Additional tree crowns are delineated from mid- and under-storey layers based on tree trunk detection. A new approach for estimating the verticalities of the points' distributions is proposed for this purpose. The watershed segmentation is then applied on a density function within the voxel space, while boundaries of delineated trees from the canopy layer are used to prevent the overspreading of regions. The experiments show an approximately 6% increase in the efficiency of the proposed treetop definition based on locally fitted surfaces in comparison with the traditionally used local maxima of the smoothed canopy height model. In addition, a 4% increase in the efficiency is achieved by the proposed tree trunk detection. Although the tree trunk detection alone is dependent on the data density, when it is supplemented with the treetop detection the proposed approach is efficient even when dealing with low density point-clouds.
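
    The baseline that the paper compares against, local maxima of the canopy height model (CHM), can be sketched in a few lines. This is the traditional technique, not the proposed locally-fitted-surface method; smoothing is omitted, and the grid values and the minimum height cut-off are assumptions made for illustration.

```python
def local_maxima(chm, min_height=2.0):
    """Return (row, col) cells strictly higher than all of their 8 neighbours."""
    rows, cols = len(chm), len(chm[0])
    tops = []
    for r in range(rows):
        for c in range(cols):
            height = chm[r][c]
            if height < min_height:  # ignore ground and low shrubs (assumed cut-off)
                continue
            neighbours = [chm[rr][cc]
                          for rr in range(max(0, r - 1), min(rows, r + 2))
                          for cc in range(max(0, c - 1), min(cols, c + 2))
                          if (rr, cc) != (r, c)]
            if all(height > n for n in neighbours):
                tops.append((r, c))
    return tops


# Hypothetical 5x5 canopy height model in metres.
chm = [
    [0.0, 1.0, 0.5, 0.0, 0.0],
    [1.0, 8.0, 3.0, 1.0, 0.5],
    [0.5, 3.0, 2.0, 4.0, 6.0],
    [0.0, 1.0, 4.0, 12.0, 5.0],
    [0.0, 0.5, 3.0, 5.0, 4.0],
]
treetops = local_maxima(chm)
print(treetops)
```

    In a marker-controlled watershed, cells found this way would act as the seed markers from which crown regions are grown.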

  13. Subgrouping patients with low back pain: evolution of a classification approach to physical therapy.

    PubMed

    Fritz, Julie M; Cleland, Joshua A; Childs, John D

    2007-06-01

    The development of valid classification methods to assist the physical therapy management of patients with low back pain has been recognized as a research priority. There is also growing evidence that the use of a classification approach to physical therapy results in better clinical outcomes than the use of alternative management approaches. In 1995 Delitto and colleagues proposed a classification system intended to inform and direct the physical therapy management of patients with low back pain. The system described 4 classifications of patients with low back pain (manipulation, stabilization, specific exercise, and traction). Each classification could be identified by a unique set of examination criteria, and was associated with an intervention strategy believed to result in the best outcomes for the patient. The system was based on expert opinion and research evidence available at the time. A substantial amount of research has emerged in the years since the introduction of this classification system, including the development of clinical prediction rules, providing new evidence for the examination criteria used to place a patient into a classification and for the optimal intervention strategies for each classification. New evidence should continually be incorporated into existing classification systems. The purpose of this clinical commentary is to review this classification system, its evolution and current status, and to discuss its implications for the classification of patients with low back pain.

  14. A fusion-based approach for uterine cervical cancer histology image classification.

    PubMed

    De, Soumya; Stanley, R Joe; Lu, Cheng; Long, Rodney; Antani, Sameer; Thoma, George; Zuna, Rosemary

    2013-01-01

    Expert pathologists commonly perform visual interpretation of histology slides for cervix tissue abnormality diagnosis. We investigated an automated, localized, fusion-based approach for cervix histology image analysis for squamous epithelium classification into Normal, CIN1, CIN2, and CIN3 grades of cervical intraepithelial neoplasia (CIN). The epithelium image analysis approach includes medial axis determination, vertical segment partitioning as medial axis orthogonal cuts, individual vertical segment feature extraction and classification, and image-based classification using a voting scheme fusing the vertical segment CIN grades. Results using 61 images showed at least 15.5% CIN exact grade classification improvement using the localized vertical segment fusion versus global image features.
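
    The final fusion step, an image-level grade obtained by voting over per-segment grades, might look like the sketch below. The abstract does not specify the voting rule; a plain majority vote with ties broken toward the more severe grade is assumed here, and the segment grades are invented.

```python
from collections import Counter

GRADES = ["Normal", "CIN1", "CIN2", "CIN3"]  # ordered by increasing severity


def fuse_segment_grades(segment_grades):
    """Majority vote over vertical-segment grades (tie-break rule is assumed)."""
    if not segment_grades:
        raise ValueError("at least one segment grade is required")
    counts = Counter(segment_grades)
    # Sort key: vote count first, then severity, so ties go to the worse grade.
    return max(GRADES, key=lambda g: (counts[g], GRADES.index(g)))


# Hypothetical grades for six vertical segments of one epithelium image.
segments = ["CIN1", "CIN2", "CIN2", "Normal", "CIN2", "CIN1"]
print(fuse_segment_grades(segments))
```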

  15. Multi-temporal remote sensing image classification - a multi-view approach

    SciTech Connect

    Chandola, Varun; Vatsavai, Raju

    2010-01-01

    Multispectral remote sensing images have been widely used for automated land use and land cover classification tasks. Often thematic classification is done using a single-date image; however, in many instances a single-date image is not informative enough to distinguish between different land cover types. In this paper we show how one can use multiple images, collected at different times of year (for example, during crop growing season), to learn a better classifier. We propose two approaches, an ensemble of classifiers approach and a co-training based approach, and show how both of these methods outperform a straightforward stacked vector approach often used in multi-temporal image classification. Additionally, the co-training based method addresses the challenge of limited labeled training data in supervised classification, as this classification scheme utilizes a large number of unlabeled samples (which come for free) in conjunction with a small set of labeled training data.

  16. Integrated Analysis of Tropical Trees Growth: A Multivariate Approach

    PubMed Central

    YÁÑEZ-ESPINOSA, LAURA; TERRAZAS, TERESA; LÓPEZ-MATA, LAURO

    2006-01-01

    • Background and Aims One of the problems analysing cause–effect relationships of growth and environmental factors is that a single factor could be correlated with other ones directly influencing growth. One attempt to understand tropical trees' growth cause–effect relationships is integrating research about anatomical, physiological and environmental factors that influence growth in order to develop mathematical models. The relevance is to understand the nature of the process of growth and to model this as a function of the environment. • Methods The relationships of Aphananthe monoica, Pleuranthodendron lindenii and Psychotria costivenia radial growth and phenology with environmental factors (local climate, vertical strata microclimate and physical and chemical soil variables) were evaluated from April 2000 to September 2001. The association among these groups of variables was determined by generalized canonical correlation analysis (GCCA), which considers the probable associations of three or more data groups and the selection of the most important variables for each data group. • Key Results The GCCA allowed determination of a general model of relationships among tree phenology and radial growth with climate, microclimate and soil factors. A strong influence of climate in phenology and radial growth existed. Leaf initiation and cambial activity periods were associated with maximum temperature and day length, and vascular tissue differentiation with soil moisture and rainfall. The analyses of individual species detected different relationships for the three species. • Conclusions The analyses of the individual species suggest that each one takes advantage in a different way of the environment in which they are growing, allowing them to coexist. PMID:16822807

  17. RAVEN. Dynamic Event Tree Approach Level III Milestone

    SciTech Connect

    Alfonsi, Andrea; Rabiti, Cristian; Mandelli, Diego; Cogliati, Joshua; Kinoshita, Robert

    2014-07-01

    Conventional Event-Tree (ET) based methodologies are extensively used as tools to perform reliability and safety assessment of complex and critical engineering systems. One of the disadvantages of these methods is that timing/sequencing of events and system dynamics are not explicitly accounted for in the analysis. In order to overcome these limitations several techniques, also known as Dynamic Probabilistic Risk Assessment (DPRA), have been developed. Monte-Carlo (MC) and Dynamic Event Tree (DET) are two of the most widely used DPRA methodologies to perform safety assessment of Nuclear Power Plants (NPP). In the past two years, the Idaho National Laboratory (INL) has developed its own tool to perform Dynamic PRA: RAVEN (Reactor Analysis and Virtual control ENvironment). RAVEN has been designed to perform two main tasks: 1) control logic driver for the new Thermo-Hydraulic code RELAP-7 and 2) post-processing tool. In the first task, RAVEN acts as a deterministic controller in which the set of control logic laws (user defined) monitors the RELAP-7 simulation and controls the activation of specific systems. Moreover, the control logic infrastructure is used to model stochastic events, such as component failures, and perform uncertainty propagation. Such stochastic modeling is deployed using both MC and DET algorithms. In the second task, RAVEN processes the large amount of data generated by RELAP-7 using data-mining based algorithms. This report focuses on the analysis of dynamic stochastic systems using the newly developed RAVEN DET capability. As an example, a DPRA analysis, using DET, of a simplified pressurized water reactor for a Station Black-Out (SBO) scenario is presented.

  18. RAVEN: Dynamic Event Tree Approach Level III Milestone

    SciTech Connect

    Andrea Alfonsi; Cristian Rabiti; Diego Mandelli; Joshua Cogliati; Robert Kinoshita

    2013-07-01

    Conventional Event-Tree (ET) based methodologies are extensively used as tools to perform reliability and safety assessment of complex and critical engineering systems. One of the disadvantages of these methods is that timing/sequencing of events and system dynamics are not explicitly accounted for in the analysis. In order to overcome these limitations several techniques, also known as Dynamic Probabilistic Risk Assessment (DPRA), have been developed. Monte-Carlo (MC) and Dynamic Event Tree (DET) are two of the most widely used DPRA methodologies to perform safety assessment of Nuclear Power Plants (NPP). In the past two years, the Idaho National Laboratory (INL) has developed its own tool to perform Dynamic PRA: RAVEN (Reactor Analysis and Virtual control ENvironment). RAVEN has been designed to perform two main tasks: 1) control logic driver for the new Thermo-Hydraulic code RELAP-7 and 2) post-processing tool. In the first task, RAVEN acts as a deterministic controller in which the set of control logic laws (user defined) monitors the RELAP-7 simulation and controls the activation of specific systems. Moreover, the control logic infrastructure is used to model stochastic events, such as component failures, and perform uncertainty propagation. Such stochastic modeling is deployed using both MC and DET algorithms. In the second task, RAVEN processes the large amount of data generated by RELAP-7 using data-mining based algorithms. This report focuses on the analysis of dynamic stochastic systems using the newly developed RAVEN DET capability. As an example, a DPRA analysis, using DET, of a simplified pressurized water reactor for a Station Black-Out (SBO) scenario is presented.

  19. Fusion of LiDAR and aerial imagery for the estimation of downed tree volume using Support Vector Machines classification and region based object fitting

    NASA Astrophysics Data System (ADS)

    Selvarajan, Sowmya

    The study classifies 3D small footprint full waveform digitized LiDAR fused with aerial imagery to downed trees using the Support Vector Machines (SVM) algorithm. Using small footprint waveform LiDAR, airborne LiDAR systems can provide better canopy penetration and very high spatial resolution. The small footprint waveform scanner system Riegl LMS-Q680, in addition to an UltraCamX aerial camera, is used to measure and map downed trees in a forest. The various data preprocessing steps helped in the identification of ground points from the dense LiDAR dataset and segmented the LiDAR data to help reduce the complexity of the algorithm. The haze filtering process helped to differentiate the spectral signatures of the various classes within the aerial image. Such processes helped to better select the features from both sensor data. Six features are utilized: LiDAR height, LiDAR intensity, LiDAR echo, and three image intensities. To do so, LiDAR derived, aerial image derived and fused LiDAR-aerial image derived features are used to organize the data for the SVM hypothesis formulation. Several variations of the SVM algorithm with different kernels and soft margin parameter C are tested. The algorithm is implemented to classify downed trees over a pine trees zone. The LiDAR derived features provided an overall accuracy of 98% of downed trees but with no classification error of 86%. The image derived features provided an overall accuracy of 65% and fusion derived features resulted in an overall accuracy of 88%. The results are observed to be stable and robust. The SVM accuracies were accompanied by high false alarm rates, with the LiDAR classification producing 58.45%, image classification producing 95.74% and finally the fused classification producing 93% false alarm rates. The Canny edge correction filter helped control the LiDAR false alarm to 35.99%, image false alarm to 48.56% and fused false alarm to 37.69%. The implemented classifiers provided a powerful tool for

  20. New Approaches to Object Classification in Synoptic Sky Surveys

    SciTech Connect

    Donalek, C.; Mahabal, A.; Djorgovski, S. G.; Marney, S.; Drake, A.; Glikman, E.; Graham, M. J.; Williams, R.

    2008-12-05

    Digital synoptic sky surveys pose several new object classification challenges. In surveys where real-time detection and classification of transient events is a science driver, there is a need for an effective elimination of instrument-related artifacts which can masquerade as transient sources in the detection pipeline, e.g., unremoved large cosmic rays, saturation trails, reflections, crosstalk artifacts, etc. We have implemented such an Artifact Filter, using a supervised neural network, for the real-time processing pipeline in the Palomar-Quest (PQ) survey. After the training phase, for each object it takes as input a set of measured morphological parameters and returns the probability of it being a real object. Despite the relatively low number of training cases for many kinds of artifacts, the overall artifact classification rate is around 90%, with no genuine transients misclassified during our real-time scans. Another question is how to assign an optimal star-galaxy classification in a multi-pass survey, where seeing and other conditions change between different epochs, potentially producing inconsistent classifications for the same object. We have implemented a star/galaxy multipass classifier that makes use of external and a priori knowledge to find the optimal classification from the individually derived ones. Both these techniques can be applied to other, similar surveys and data sets.

  1. Idiopathic interstitial pneumonias and emphysema: detection and classification using a texture-discriminative approach

    NASA Astrophysics Data System (ADS)

    Fetita, C.; Chang-Chien, K. C.; Brillet, P. Y.; Prêteux, F.; Chang, R. F.

    2012-03-01

    Our study aims at developing a computer-aided diagnosis (CAD) system for fully automatic detection and classification of pathological lung parenchyma patterns in idiopathic interstitial pneumonias (IIP) and emphysema using multi-detector computed tomography (MDCT). The proposed CAD system is based on three-dimensional (3-D) mathematical morphology, texture and fuzzy logic analysis, and can be divided into four stages: (1) a multi-resolution decomposition scheme based on a 3-D morphological filter was exploited to discriminate the lung region patterns at different analysis scales. (2) An additional spatial lung partitioning based on the lung tissue texture was introduced to reinforce the spatial separation between patterns extracted at the same resolution level in the decomposition pyramid. Then, (3) a hierarchic tree structure was exploited to describe the relationship between patterns at different resolution levels, and for each pattern, six fuzzy membership functions were established for assigning a probability of association with a normal tissue or a pathological target. Finally, (4) a decision step exploiting the fuzzy-logic assignments selects the target class of each lung pattern among the following categories: normal (N), emphysema (EM), fibrosis/honeycombing (FHC), and ground glass (GDG). According to a preliminary evaluation on an extended database, the proposed method can overcome the drawbacks of a previously developed approach and achieve higher sensitivity and specificity.

  2. A new multi criteria classification approach in a multi agent system applied to SEEG analysis

    PubMed Central

    Kinie, Abel; Ndiaye, Mamadou Lamine L.; Montois, Jean-Jacques; Jacquelet, Yann

    2007-01-01

    This work is focused on the study of the organization of the SEEG signals during epileptic seizures with a multi-agent system approach. This approach is based on cooperative mechanisms of auto-organization at the micro level and of emergence of a global function at the macro level. In order to evaluate this approach we propose a distributed collaborative approach for the classification of the interesting signals. This new multi-criteria classification method is able to provide a relevant organisation of brain area structures and to bring out elements of epileptogenic networks. The method is compared to another classification approach, a fuzzy classification, and gives better results when applied to SEEG signals. PMID:18002381

  3. Text Categorization Based on K-Nearest Neighbor Approach for Web Site Classification.

    ERIC Educational Resources Information Center

    Kwon, Oh-Woog; Lee, Jong-Hyeok

    2003-01-01

    Discusses text categorization and Web site classification and proposes a three-step classification system that includes the use of Web pages linked with the home page. Highlights include the k-nearest neighbor (k-NN) approach; improving performance with a feature selection method and a term weighting scheme using HTML tags; and similarity…
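
    The k-nearest neighbor idea mentioned in this record can be illustrated with a minimal sketch. It is not the authors' system: the tokenizer, the raw term-frequency weighting, the cosine similarity measure, and the names term_vector, cosine and knn_classify are all assumptions chosen for illustration, and the HTML-tag term weighting described in the abstract is not reproduced.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Turn text into a sparse bag-of-words term-frequency vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(u, v):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(weight * v[term] for term, weight in u.items() if term in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def knn_classify(query, labeled_docs, k=3):
    """Label `query` by majority vote among its k most similar labeled documents."""
    q = term_vector(query)
    ranked = sorted(
        labeled_docs,
        key=lambda doc: cosine(q, term_vector(doc[0])),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy training set: (page text, category).
training = [
    ("football league match scores", "sports"),
    ("basketball team wins match", "sports"),
    ("stock market shares fall", "finance"),
    ("bank raises interest rates", "finance"),
]
predicted = knn_classify("league match results for the team", training, k=3)
print(predicted)
```

    In a fuller system the plain term counts would typically be replaced by a weighting scheme and the vocabulary reduced by feature selection, which is where the improvements described in the abstract would enter.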

  4. The Comprehensive AOCMF Classification System: Radiological Issues and Systematic Approach

    PubMed Central

    Buitrago-Téllez, Carlos H.; Cornelius, Carl-Peter; Prein, Joachim; Kunz, Christoph; Ieva, Antonio di; Audigé, Laurent

    2014-01-01

    The AOCMF Classification Group developed a hierarchical three-level craniomaxillofacial (CMF) classification system with increasing level of complexity and details. The basic level 1 system differentiates fracture location in the mandible (code 91), midface (code 92), skull base (code 93), and cranial vault (code 94); the levels 2 and 3 focus on defining fracture location and morphology within more detailed regions and subregions. Correct imaging acquisition, systematic analysis, and interpretation according to the anatomic and surgical relevant structures in the CMF regions are essential for an accurate, reproducible, and comprehensive diagnosis of CMF fractures using that system. Basic principles for radiographic diagnosis are based on conventional plain films, multidetector computed tomography, and magnetic resonance imaging. In this tutorial, the radiological issues according to each level of the classification are described. PMID:25489396

  5. Machine Learning Approaches for High-resolution Urban Land Cover Classification: A Comparative Study

    SciTech Connect

    Vatsavai, Raju; Chandola, Varun; Cheriyadat, Anil M; Bright, Eddie A; Bhaduri, Budhendra L; Graesser, Jordan B

    2011-01-01

    The proliferation of several machine learning approaches makes it difficult to identify a suitable classification technique for analyzing high-resolution remote sensing images. In this study, ten classification techniques were compared from five broad machine learning categories. Surprisingly, the performance of simple statistical classification schemes like maximum likelihood and logistic regression is very close to that of complex and recent techniques. Given that these two classifiers require little input from the user, they should still be considered for most classification tasks. Multiple classifier systems are a good choice if resources permit.

  6. Neural network approach to classification of infrasound signals

    NASA Astrophysics Data System (ADS)

    Lee, Dong-Chang

    As part of the International Monitoring Systems of the Preparatory Commissions for the Comprehensive Nuclear Test-Ban Treaty Organization, the Infrasound Group at the University of Alaska Fairbanks maintains and operates two infrasound stations to monitor global nuclear activity. In addition, the group specializes in detecting and classifying the man-made and naturally produced signals recorded at both stations by computing various characterization parameters (e.g. mean of the cross correlation maxima, trace velocity, direction of arrival, and planarity values) using the in-house developed weighted least-squares algorithm. Classifying commonly observed low-frequency (0.015--0.1 Hz) signals at our stations, namely mountain associated waves and high trace-velocity signals, using a traditional approach (e.g. analysis of power spectral density) presents a problem. Such signals can be separated statistically by setting a window to the trace-velocity estimate for each signal type, and the feasibility of such a technique is demonstrated by displaying and comparing various summary plots (e.g. universal, seasonal and azimuthal variations) produced by analyzing infrasound data (2004--2007) from the Fairbanks and Antarctic arrays. Such plots, together with the availability of magnetic activity information (from the College International Geophysical Observatory located at Fairbanks, Alaska), lead to possible physical sources of the two signal types. Throughout this thesis a newly developed robust algorithm (sum of squares of variance ratios) with improved detection quality (under low signal to noise ratios) over two well-known detection algorithms (mean of the cross correlation maxima and Fisher Statistics) is investigated for its efficacy as a new detector. A neural network is examined for its ability to automatically classify the two signals described above against clutter (spurious signals with common characteristics).
Four identical perceptron networks are trained and validated (with

  7. A Method for Application of Classification Tree Models to Map Aquatic Vegetation Using Remotely Sensed Images from Different Sensors and Dates

    PubMed Central

    Jiang, Hao; Zhao, Dehua; Cai, Ying; An, Shuqing

    2012-01-01

    In previous attempts to identify aquatic vegetation from remotely-sensed images using classification trees (CT), the images used to apply CT models to different times or locations necessarily originated from the same satellite sensor as that from which the original images used in model development came, greatly limiting the application of CT. We have developed an effective normalization method to improve the robustness of CT models when applied to images originating from different sensors and dates. A total of 965 ground-truth samples of aquatic vegetation types were obtained in 2009 and 2010 in Taihu Lake, China. Using relevant spectral indices (SI) as classifiers, we manually developed a stable CT model structure and then applied a standard CT algorithm to obtain quantitative (optimal) thresholds from 2009 ground-truth data and images from Landsat7-ETM+, HJ-1B-CCD, Landsat5-TM and ALOS-AVNIR-2 sensors. Optimal CT thresholds produced average classification accuracies of 78.1%, 84.7% and 74.0% for emergent vegetation, floating-leaf vegetation and submerged vegetation, respectively. However, the optimal CT thresholds for different sensor images differed from each other, with an average relative variation (RV) of 6.40%. We developed and evaluated three new approaches to normalizing the images. The best-performing method (Method of 0.1% index scaling) normalized the SI images using tailored percentages of extreme pixel values. Using the images normalized by Method of 0.1% index scaling, CT models for a particular sensor in which thresholds were replaced by those from the models developed for images originating from other sensors provided average classification accuracies of 76.0%, 82.8% and 68.9% for emergent vegetation, floating-leaf vegetation and submerged vegetation, respectively. Applying the CT models developed for normalized 2009 images to 2010 images resulted in high classification (78.0%–93.3%) and overall (92.0%–93.1%) accuracies. Our results suggest
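
    The abstract says only that the best-performing normalization rescales spectral index images "using tailored percentages of extreme pixel values". One plausible reading, given here as a sketch and not as the authors' procedure, is to clip the lowest and highest 0.1% of pixel values and rescale the rest linearly to [0, 1], so that thresholds learned on one sensor are expressed on a comparable scale for another. The function name percentile_scale, the symmetric tails, and the [0, 1] target range are assumptions.

```python
def percentile_scale(values, tail_percent=0.1):
    """Rescale values to [0, 1] after clipping the lowest and highest tails.

    `tail_percent` is the percentage of pixels treated as extreme at each end
    (0.1 means the lowest 0.1% and the highest 0.1%). This is an assumed
    interpretation of "index scaling", not a documented algorithm.
    """
    ordered = sorted(values)
    n = len(ordered)
    # Number of pixels to discard at each end of the sorted distribution.
    cut = int(n * tail_percent / 100.0)
    low, high = ordered[cut], ordered[n - 1 - cut]
    if high == low:
        return [0.0 for _ in values]
    # Values beyond the clipping bounds are pinned to 0 or 1.
    return [min(1.0, max(0.0, (v - low) / (high - low))) for v in values]


# Hypothetical index image flattened to a list, with one extreme outlier pixel.
pixels = list(range(1000)) + [1000000]
scaled = percentile_scale(pixels)
print(scaled[0], scaled[500], scaled[-1])
```

    Because the bounds come from order statistics and not from the raw minimum and maximum, a single outlier pixel does not compress the rest of the image into a narrow band.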

  8. Tree level hydrodynamic approach for resolving aboveground water storage and stomatal conductance and modeling the effects of tree hydraulic strategy

    NASA Astrophysics Data System (ADS)

    Mirfenderesgi, Golnazalsadat; Bohrer, Gil; Matheny, Ashley M.; Fatichi, Simone; Moraes Frasson, Renato Prata; Schäfer, Karina V. R.

    2016-07-01

    The finite difference ecosystem-scale tree crown hydrodynamics model version 2 (FETCH2) is a tree-scale hydrodynamic model of transpiration. The FETCH2 model employs a finite difference numerical methodology and a simplified single-beam conduit system to explicitly resolve xylem water potentials throughout the vertical extent of a tree. Empirical equations relate water potential within the stem to stomatal conductance of the leaves at each height throughout the crown. While highly simplified, this approach brings additional realism to the simulation of transpiration by linking stomatal responses to stem water potential rather than directly to soil moisture, as is currently the case in the majority of land surface models. FETCH2 accounts for plant hydraulic traits, such as the degree of anisohydric/isohydric response of stomata, maximal xylem conductivity, vertical distribution of leaf area, and maximal and minimal xylem water content. We used FETCH2 along with sap flow and eddy covariance data sets collected from a mixed plot of two genera (oak/pine) in Silas Little Experimental Forest, NJ, USA, to conduct an analysis of the intergeneric variation of hydraulic strategies and their effects on diurnal and seasonal transpiration dynamics. We define these strategies through the parameters that describe the genus level transpiration and xylem conductivity responses to changes in stem water potential. Our evaluation revealed that FETCH2 considerably improved the simulation of ecosystem transpiration and latent heat flux in comparison to more conventional models. A virtual experiment showed that the model was able to capture the effect of hydraulic strategies such as isohydric/anisohydric behavior on stomatal conductance under different soil-water availability conditions.

  9. A heuristic multi-criteria classification approach incorporating data quality information for choropleth mapping

    PubMed Central

    Sun, Min; Wong, David; Kronenfeld, Barry

    2016-01-01

    Despite conceptual and technological advancements in cartography over the decades, choropleth map design and classification fail to address a fundamental issue: estimates that are not statistically different may be assigned to different classes on maps or vice versa. Recently, the class separability concept was introduced as a map classification criterion to evaluate the likelihood that estimates in two classes are statistically different. Unfortunately, choropleth maps created according to the separability criterion usually have highly unbalanced classes. To produce reasonably separable but more balanced classes, we propose a heuristic classification approach to consider not just the class separability criterion but also other classification criteria such as evenness and intra-class variability. A geovisual-analytic package was developed to support the heuristic mapping process to evaluate the trade-off between relevant criteria and to select the most preferable classification. Class break values can be adjusted to improve the performance of a classification. PMID:28286426

  10. Neural network approaches versus statistical methods in classification of multisource remote sensing data

    NASA Technical Reports Server (NTRS)

    Benediktsson, Jon A.; Swain, Philip H.; Ersoy, Okan K.

    1990-01-01

    Neural network learning procedures and statistical classification methods are applied and compared empirically in classification of multisource remote sensing and geographic data. Statistical multisource classification by means of a method based on Bayesian classification theory is also investigated and modified. The modifications permit control of the influence of the data sources involved in the classification process. Reliability measures are introduced to rank the quality of the data sources. The data sources are then weighted according to these rankings in the statistical multisource classification. Four data sources are used in experiments: Landsat MSS data and three forms of topographic data (elevation, slope, and aspect). Experimental results show that the two different approaches have unique advantages and disadvantages in this classification application.

  11. Histopathological image analysis for centroblasts classification through dimensionality reduction approaches.

    PubMed

    Kornaropoulos, Evgenios N; Niazi, M Khalid Khan; Lozanski, Gerard; Gurcan, Metin N

    2014-03-01

    We present two novel automated image analysis methods to differentiate centroblast (CB) cells from noncentroblast (non-CB) cells in digital images of H&E-stained tissues of follicular lymphoma. CB cells are often confused with similar-looking cells within the tissue; therefore, a system to help their classification is necessary. Our methods extract the discriminatory features of cells by approximating the intrinsic dimensionality from the subspace spanned by CB and non-CB cells. In the first method, discriminatory features are approximated with the help of singular value decomposition (SVD), whereas in the second method they are extracted using Laplacian Eigenmaps. Five hundred high-power field images were extracted from 17 slides, which were then used to compose a database of 213 CB and 234 non-CB region of interest images. The recall, precision, and overall accuracy rates of the developed methods were measured and compared with existing classification methods. Moreover, the reproducibility of both classification methods was also examined. The average values of the overall accuracy were 99.22% ± 0.75% and 99.07% ± 1.53% for COB and CLEM, respectively. The experimental results demonstrate that both proposed methods provide better classification accuracy of CB/non-CB in comparison with the state of the art methods.

  12. Oregon Hydrologic Landscapes: An Approach for Broadscale Hydrologic Classification

    EPA Science Inventory

    Gaged streams represent only a small percentage of watershed hydrologic conditions throughout the United States and globe, but there is a growing need for hydrologic classification systems that can serve as the foundation for broad-scale assessments of the hydrologic functions of...

  13. Using hydrogeomorphic criteria to classify wetlands on Mt. Desert Island, Maine - approach, classification system, and examples

    USGS Publications Warehouse

    Nielsen, Martha G.; Guntenspergen, Glenn R.; Neckles, Hilary A.

    2005-01-01

    A wetland classification system was designed for Mt. Desert Island, Maine, to help categorize the large number of wetlands (over 1,200 mapped units) as an aid to understanding their hydrologic functions. The classification system, developed by the U.S. Geological Survey (USGS), in cooperation with the National Park Service, uses a modified hydrogeomorphic (HGM) approach, and assigns categories based on position in the landscape, soils and surficial geologic setting, and source of water. A dichotomous key was developed to determine a preliminary HGM classification of wetlands on the island. This key is designed for use with USGS topographic maps and 1:24,000 geographic information system (GIS) coverages as an aid to the classification, but may also be used with field data. Hydrologic data collected from a wetland monitoring study were used to determine whether the preliminary classification of individual wetlands using the HGM approach yielded classes that were consistent with actual hydroperiod data. Preliminary HGM classifications of the 20 wetlands in the monitoring study were consistent with the field hydroperiod data. The modified HGM classification approach appears robust, although the method apparently works somewhat better with undisturbed wetlands than with disturbed wetlands. This wetland classification system could be applied to other hydrogeologically similar areas of northern New England.

  14. Comparison of Sub-pixel Classification Approaches for Crop-specific Mapping

    EPA Science Inventory

    The Moderate Resolution Imaging Spectroradiometer (MODIS) data has been increasingly used for crop mapping and other agricultural applications. Phenology-based classification approaches using the NDVI (Normalized Difference Vegetation Index) 16-day composite (250 m) data product...

  15. Data-Driven Multimodal Sleep Apnea Events Detection: Synchrosqueezing Transform Processing and Riemannian Geometry Classification Approaches.

    PubMed

    Rutkowski, Tomasz M

    2016-07-01

    A novel multimodal and bio-inspired approach to biomedical signal processing and classification is presented in the paper. This approach allows for an automatic semantic labeling (interpretation) of sleep apnea events based on the proposed data-driven biomedical signal processing and classification. The presented signal processing and classification methods have already been successfully applied to real-time unimodal brainwave (EEG only) decoding in brain-computer interfaces developed by the author. In the current project, very encouraging results are obtained using multimodal biomedical (brainwaves and peripheral physiological) signals in a unified processing approach allowing for the automatic semantic data description. The results thus support the hypothesis that the data-driven and bio-inspired signal processing approach is valid for semantic interpretation of medical data based on machine-learning classification of sleep apnea events.

  16. Predictive mapping of soil organic carbon in wet cultivated lands using classification-tree based models: the case study of Denmark.

    PubMed

    Bou Kheir, Rania; Greve, Mogens H; Bøcher, Peder K; Greve, Mette B; Larsen, René; McCloy, Keith

    2010-05-01

    Soil organic carbon (SOC) is one of the most important carbon stocks globally and has large potential to affect global climate. Distribution patterns of SOC in Denmark constitute a nation-wide baseline for studies on soil carbon changes (with respect to the Kyoto Protocol). This paper predicts and maps the geographic distribution of SOC across Denmark using remote sensing (RS), geographic information systems (GISs) and decision-tree modeling (un-pruned and pruned classification trees). Seventeen parameters, i.e. parent material, soil type, landscape type, elevation, slope gradient, slope aspect, mean curvature, plan curvature, profile curvature, flow accumulation, specific catchment area, tangent slope, tangent curvature, steady-state wetness index, Normalized Difference Vegetation Index (NDVI), Normalized Difference Wetness Index (NDWI) and Soil Color Index (SCI) were generated to statistically explain SOC field measurements in the area of interest (Denmark). A large number of tree-based classification models (588) were developed using (i) all of the parameters, (ii) all Digital Elevation Model (DEM) parameters only, (iii) the primary DEM parameters only, (iv) the remote sensing (RS) indices only, (v) selected pairs of parameters, (vi) soil type, parent material and landscape type only, and (vii) the parameters having a high impact on SOC distribution in built pruned trees. The three best classification tree models, with the lowest misclassification error (ME) and the lowest number of nodes (N), are: (i) the tree (T1) combining all of the parameters (ME=29.5%; N=54); (ii) the tree (T2) based on the parent material, soil type and landscape type (ME=31.5%; N=14); and (iii) the tree (T3) constructed using parent material, soil type, landscape type, elevation, tangent slope and SCI (ME=30%; N=39).
The SOC maps produced at 1:50,000 cartographic scale using these trees match closely, with coincidence values equal to 90.5% (Map T1

  17. Gregarine site-heterogeneous 18S rDNA trees, revision of gregarine higher classification, and the evolutionary diversification of Sporozoa.

    PubMed

    Cavalier-Smith, Thomas

    2014-10-01

    Gregarine 18S ribosomal DNA trees are hard to resolve because they exhibit the most disparate rates of rDNA evolution of any eukaryote group. As site-heterogeneous tree-reconstruction algorithms can give more accurate trees, especially for technically unusually challenging groups, I present the first site-heterogeneous rDNA trees for 122 gregarines and an extensive set of 452 appropriate outgroups. While some features remain poorly resolved, these trees fit morphological diversity better than most previous, evolutionarily less realistic, maximum likelihood trees. Gregarines are probably polyphyletic, with some 'eugregarines' and all 'neogregarines' (both abandoned as taxa) being more closely related to Cryptosporidium and Rhytidocystidae than to archigregarines. I establish a new subclass Orthogregarinia (new orders Vermigregarida, Arthrogregarida) for gregarines most closely related to Cryptosporidium and group Orthogregarinia, Cryptosporidiidae, and Rhytidocystidae as revised class Gregarinomorphea. Archigregarines are excluded from Gregarinomorphea and grouped with new orders Velocida (Urosporoidea superfam. n. and Veloxidium) and Stenophorida as a new sporozoan class Paragregarea. Platyproteum and Filipodium never group with Orthogregarinia or Paragregarea and are sufficiently different morphologically to merit a new order Squirmida. I revise gregarine higher-level classification generally in the light of site-heterogeneous-model trees, discuss their evolution, and also sporozoan cell structure and life-history evolution, correcting widespread misinterpretations.

  18. Frugivores bias seed-adult tree associations through nonrandom seed dispersal: a phylogenetic approach.

    PubMed

    Razafindratsima, Onja H; Dunham, Amy E

    2016-08-01

    Frugivores are the main seed dispersers in many ecosystems, such that behaviorally driven, nonrandom patterns of seed dispersal are a common process; but patterns are poorly understood. Characterizing these patterns may be essential for understanding spatial organization of fruiting trees and drivers of seed-dispersal limitation in biodiverse forests. To address this, we studied resulting spatial associations between dispersed seeds and adult tree neighbors in a diverse rainforest in Madagascar, using a temporal and phylogenetic approach. Data show that by using fruiting trees as seed-dispersal foci, frugivores bias seed dispersal under conspecific adults and under heterospecific trees that share dispersers and fruiting time with the dispersed species. Frugivore-mediated seed dispersal also resulted in nonrandom phylogenetic associations of dispersed seeds with their nearest adult neighbors, in nine out of the 16 months of our study. However, these nonrandom phylogenetic associations fluctuated unpredictably over time, ranging from clustered to overdispersed. The spatial and phylogenetic template of seed dispersal did not translate to similar patterns of association in adult tree neighborhoods, suggesting the importance of post-dispersal processes in structuring plant communities. Results suggest that frugivore-mediated seed dispersal is important for structuring early stages of plant-plant associations, setting the template for post-dispersal processes that influence ultimate patterns of plant recruitment. Importantly, if biased patterns of dispersal are common in other systems, frugivores may promote tree coexistence in biodiverse forests by limiting the frequency and diversity of heterospecific interactions of seeds they disperse.

  19. Multistage classification of multispectral Earth observational data: The design approach

    NASA Technical Reports Server (NTRS)

    Bauer, M. E. (Principal Investigator); Muasher, M. J.; Landgrebe, D. A.

    1981-01-01

    An algorithm is proposed which predicts the optimal features at every node in a binary tree procedure. The algorithm estimates the probability of error by approximating the area under the likelihood ratio function for two classes and taking into account the number of training samples used in estimating each of these two classes. Some results on feature selection techniques, particularly in the presence of a very limited set of training samples, are presented. Results comparing probabilities of error predicted by the proposed algorithm as a function of dimensionality as compared to experimental observations are shown for aircraft and LANDSAT data. Results are obtained for both real and simulated data. Finally, two binary tree examples which use the algorithm are presented to illustrate the usefulness of the procedure.

  20. Factors Influencing Drug Injection History among Prisoners: A Comparison between Classification and Regression Trees and Logistic Regression Analysis

    PubMed Central

    Rastegari, Azam; Haghdoost, Ali Akbar; Baneshi, Mohammad Reza

    2013-01-01

    Background Due to the importance of medical studies, researchers in this field should be familiar with various types of statistical analyses to select the most appropriate method based on the characteristics of their data sets. Classification and regression trees (CARTs) can be complementary to regression models. We compared the performance of a logistic regression model and a CART in predicting drug injection among prisoners. Methods Data from 2720 Iranian prisoners were studied to determine the factors influencing drug injection. The collected data were divided into two groups of training and testing. A logistic regression model and a CART were applied to the training data. The performance of the two models was then evaluated on the testing data. Findings The regression model and the CART had 8 and 4 significant variables, respectively. Overall, heroin use, history of imprisonment, age at first drug use, and marital status were important factors in determining the history of drug injection. Subjects without a history of heroin use or heroin users with short-term imprisonment were at lower risk of drug injection. Among heroin addicts with long-term imprisonment, individuals with higher age at first drug use and married subjects were at lower risk of drug injection. Although the logistic regression model was more sensitive than the CART, the two models had the same levels of specificity and classification accuracy. Conclusion In this study, both sensitivity and specificity were important. While the logistic regression model had better performance, the graphical presentation of the CART simplifies the interpretation of the results. In general, a combination of different analytical methods is recommended to explore the effects of variables. PMID:24494152
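
    The comparison in this record rests on sensitivity, specificity and classification accuracy measured on held-out test data. As a reminder of how those three figures follow from a confusion matrix, here is a minimal sketch; the function name binary_metrics and the toy labels are hypothetical and do not reproduce the study's data or results.

```python
def binary_metrics(y_true, y_pred):
    """Compute sensitivity, specificity and accuracy for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    return {
        # Share of actual positives that the model flags.
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        # Share of actual negatives that the model clears.
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
        # Share of all cases classified correctly.
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
    }


# Hypothetical test labels (1 = history of drug injection) and predictions.
y_test = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_hat = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
metrics = binary_metrics(y_test, y_hat)
print(metrics)
```

    Computing all three on the same test split for each model is what makes a statement such as "more sensitive at equal specificity" checkable.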

  1. Evaluating Two Approaches to Helping College Students Understand Evolutionary Trees through Diagramming Tasks

    ERIC Educational Resources Information Center

    Perry, Judy; Meir, Eli; Herron, Jon C.; Maruca, Susan; Stal, Derek

    2008-01-01

    To understand evolutionary theory, students must be able to understand and use evolutionary trees and their underlying concepts. Active, hands-on curricula relevant to macroevolution can be challenging to implement across large college-level classes where textbook learning is the norm. We evaluated two approaches to helping students learn…

  2. Toward noncooperative iris recognition: a classification approach using multiple signatures.

    PubMed

    Proença, Hugo; Alexandre, Luís A

    2007-04-01

    This paper focuses on noncooperative iris recognition, i.e., the capture of iris images at large distances, under less controlled lighting conditions, and without active participation of the subjects. This increases the probability of capturing very heterogeneous images (regarding focus, contrast, or brightness) and with several noise factors (iris obstructions and reflections). Current iris recognition systems are unable to deal with noisy data and substantially increase their error rates, especially the false rejections, in these conditions. We propose an iris classification method that divides the segmented and normalized iris image into six regions, makes an independent feature extraction and comparison for each region, and combines each of the dissimilarity values through a classification rule. Experiments show a substantial decrease, higher than 40 percent, of the false rejection rates in the recognition of noisy iris images.
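
    The abstract describes comparing six iris regions independently and fusing the six dissimilarity values "through a classification rule" without stating the rule. The sketch below uses a simple vote as a stand-in: a match is declared when at least a given number of regions fall below a dissimilarity threshold. The rule, the threshold of 0.35, the minimum of three agreeing regions, and the function name regions_match are all assumptions for illustration, not the authors' method.

```python
def regions_match(dissimilarities, threshold=0.35, min_agreeing=3):
    """Declare a match when enough regions are individually similar.

    `dissimilarities` holds one value per iris region (lower = more similar).
    The voting rule and default parameters are hypothetical.
    """
    agreeing = sum(1 for d in dissimilarities if d < threshold)
    return agreeing >= min_agreeing


def mean_match(dissimilarities, threshold=0.35):
    """Baseline: a single decision on the average dissimilarity."""
    return sum(dissimilarities) / len(dissimilarities) < threshold


# Hypothetical per-region dissimilarities for three comparisons.
clean_genuine = [0.20, 0.25, 0.22, 0.30, 0.28, 0.24]
noisy_genuine = [0.20, 0.25, 0.22, 0.60, 0.70, 0.65]  # three regions occluded
impostor = [0.45, 0.50, 0.48, 0.47, 0.52, 0.49]

for name, values in [
    ("clean genuine", clean_genuine),
    ("noisy genuine", noisy_genuine),
    ("impostor", impostor),
]:
    print(name, regions_match(values), mean_match(values))
```

    In this toy case the averaged decision rejects the noisy genuine comparison because the occluded regions dominate the mean, while the per-region vote still accepts it, which is the kind of false-rejection reduction the abstract reports.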

  3. An efficient semi-supervised classification approach for hyperspectral imagery

    NASA Astrophysics Data System (ADS)

    Tan, Kun; Li, Erzhu; Du, Qian; Du, Peijun

    2014-11-01

    In this paper, an efficient semi-supervised support vector machine (SVM) with segmentation-based ensemble (S2SVMSE) algorithm is proposed for hyperspectral image classification. The algorithm utilizes spatial information extracted by a segmentation algorithm for unlabeled sample selection. The unlabeled samples that are the most similar to the labeled ones are found and the candidate set of unlabeled samples to be chosen is enlarged to the corresponding image segments. To ensure the finally selected unlabeled samples be spatially widely distributed and less correlated, random selection is conducted with the flexibility of the number of unlabeled samples actually participating in semi-supervised learning. Classification is also refined through a spectral-spatial feature ensemble technique. The proposed method with very limited labeled training samples is evaluated via experiments with two real hyperspectral images, where it outperforms the fully supervised SVM and the semi-supervised version without spectral-spatial ensemble.

  4. Response-Time Approach to Contrasting Models of Perceptual Classification

    DTIC Science & Technology

    2013-02-01

    For example, in Experiment 1 of Nosofsky et al. (2011), the stimuli were a set of 27 Munsell colors varying along dimensions of hue, saturation, and...develop and test models that explain the time course of classification and recognition decision making. The first specific goal involved the...Several empirical studies demonstrated successful applications of the new theory in this domain. The second goal involved the development and testing

  5. A science based approach to topical drug classification system (TCS).

    PubMed

    Shah, Vinod P; Yacobi, Avraham; Rădulescu, Flavian Ştefan; Miron, Dalia Simona; Lane, Majella E

    2015-08-01

    The Biopharmaceutics Classification System (BCS) for oral immediate release solid drug products has been very successful; its implementation in drug industry and regulatory approval has shown significant progress. This has been the case primarily because BCS was developed using sound scientific judgment. Following the success of BCS, we have considered the topical drug products for similar classification system based on sound scientific principles. In USA, most of the generic topical drug products have qualitatively (Q1) and quantitatively (Q2) same excipients as the reference listed drug (RLD). The applications of in vitro release (IVR) and in vitro characterization are considered for a range of dosage forms (suspensions, creams, ointments and gels) of differing strengths. We advance a Topical Drug Classification System (TCS) based on a consideration of Q1, Q2 as well as the arrangement of matter and microstructure of topical formulations (Q3). Four distinct classes are presented for the various scenarios that may arise and depending on whether biowaiver can be granted or not.

  6. A Novel Anti-classification Approach for Knowledge Protection.

    PubMed

    Lin, Chen-Yi; Chen, Tung-Shou; Tsai, Hui-Fang; Lee, Wei-Bin; Hsu, Tien-Yu; Kao, Yuan-Hung

    2015-10-01

    Classification is the problem of identifying a set of categories where new data belong, on the basis of a set of training data whose category membership is known. Its application is wide-spread, such as the medical science domain. The issue of the classification knowledge protection has been paid attention increasingly in recent years because of the popularity of cloud environments. In the paper, we propose a Shaking Sorted-Sampling (triple-S) algorithm for protecting the classification knowledge of a dataset. The triple-S algorithm sorts the data of an original dataset according to the projection results of the principal components analysis so that the features of the adjacent data are similar. Then, we generate noise data with incorrect classes and add those data to the original dataset. In addition, we develop an effective positioning strategy, determining the added positions of noise data in the original dataset, to ensure the restoration of the original dataset after removing those noise data. The experimental results show that the disturbance effect of the triple-S algorithm on the CLC, MySVM, and LibSVM classifiers increases when the noise data ratio increases. In addition, compared with existing methods, the disturbance effect of the triple-S algorithm is more significant on MySVM and LibSVM when a certain amount of the noise data added to the original dataset is reached.

  7. Classification as clustering: a Pareto cooperative-competitive GP approach.

    PubMed

    McIntyre, Andrew R; Heywood, Malcolm I

    2011-01-01

    Intuitively, population-based algorithms such as genetic programming provide a natural environment for supporting solutions that learn to decompose the overall task between multiple individuals, or a team. This work presents a framework for evolving teams without recourse to prespecifying the number of cooperating individuals. To do so, each individual evolves a mapping to a distribution of outcomes that, following clustering, establishes the parameterization of a (Gaussian) local membership function. This gives individuals the opportunity to represent subsets of tasks, where the overall task is that of classification under the supervised learning domain. Thus, rather than each team member representing an entire class, individuals are free to identify unique subsets of the overall classification task. The framework is supported by techniques from evolutionary multiobjective optimization (EMO) and Pareto competitive coevolution. EMO establishes the basis for encouraging individuals to provide accurate yet nonoverlapping behaviors, whereas competitive coevolution provides the mechanism for scaling to potentially large unbalanced datasets. Benchmarking is performed against recent examples of nonlinear SVM classifiers over 12 UCI datasets with between 150 and 200,000 training instances. Solutions from the proposed coevolutionary multiobjective GP framework appear to provide a good balance between classification performance and model complexity, especially as the dataset instance count increases.

  8. Feature selection using Decision Tree and classification through Proximal Support Vector Machine for fault diagnostics of roller bearing

    NASA Astrophysics Data System (ADS)

    Sugumaran, V.; Muralidharan, V.; Ramachandran, K. I.

    2007-02-01

    Roller bearing is one of the most widely used rotary elements in a rotary machine. The roller bearing's nature of vibration reveals its condition and the features that show the nature, are to be extracted through some indirect means. Statistical parameters like kurtosis, standard deviation, maximum value, etc. form a set of features, which are widely used in fault diagnostics. Often the problem is, finding out good features that discriminate the different fault conditions of the bearing. Selection of good features is an important phase in pattern recognition and requires detailed domain knowledge. This paper illustrates the use of a Decision Tree that identifies the best features from a given set of samples for the purpose of classification. It uses Proximal Support Vector Machine (PSVM), which has the capability to efficiently classify the faults using statistical features. The vibration signal from a piezoelectric transducer is captured for the following conditions: good bearing, bearing with inner race fault, bearing with outer race fault, and inner and outer race fault. The statistical features are extracted therefrom and classified successfully using PSVM and SVM. The results of PSVM and SVM are compared.
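
    The abstract above names kurtosis, standard deviation, and maximum value as typical statistical features of a vibration signal. A minimal sketch of how such features could be computed from one signal window follows. The population (divide-by-n) moments and the non-excess kurtosis convention are assumptions, since the abstract does not say which definitions were used, and the sample values are made up.

```python
import math


def statistical_features(signal):
    """Compute a few time-domain features of one vibration window.

    Uses population moments (division by n) and non-excess kurtosis,
    i.e. m4 / m2**2; both conventions are assumptions.
    """
    n = len(signal)
    mean = sum(signal) / n
    m2 = sum((x - mean) ** 2 for x in signal) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in signal) / n  # fourth central moment
    return {
        "std": math.sqrt(m2),
        "kurtosis": m4 / (m2 ** 2) if m2 > 0 else float("nan"),
        "max": max(signal),
    }


# Made-up samples standing in for a transducer reading.
window = [0.1, -0.3, 0.8, -0.2, 0.05, -0.6, 0.4]
print(statistical_features(window))
```

    A feature vector of this kind, computed per window and per bearing condition, is what a decision tree could then rank before a classifier such as PSVM or SVM is trained.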

  9. Biodiversity among Lactobacillus helveticus Strains Isolated from Different Natural Whey Starter Cultures as Revealed by Classification Trees

    PubMed Central

    Gatti, Monica; Trivisano, Carlo; Fabrizi, Enrico; Neviani, Erasmo; Gardini, Fausto

    2004-01-01

    Lactobacillus helveticus is a homofermentative thermophilic lactic acid bacterium used extensively for manufacturing Swiss type and aged Italian cheese. In this study, the phenotypic and genotypic diversity of strains isolated from different natural dairy starter cultures used for Grana Padano, Parmigiano Reggiano, and Provolone cheeses was investigated by a classification tree technique. A data set was used that consists of 119 L. helveticus strains, each of which was studied for its physiological characters, as well as surface protein profiles and hybridization with a species-specific DNA probe. The methodology employed in this work allowed the strains to be grouped into terminal nodes without difficult and subjective interpretation. In particular, good discrimination was obtained between L. helveticus strains isolated, respectively, from Grana Padano and from Provolone natural whey starter cultures. The method used in this work allowed identification of the main characteristics that permit discrimination of biotypes. In order to understand what kind of genes could code for phenotypes of technological relevance, evidence that specific DNA sequences are present only in particular biotypes may be of great interest. PMID:14711641

  10. Unimodal transform of variables selected by interval segmentation purity for classification tree modeling of high-dimensional microarray data.

    PubMed

    Du, Wen; Gu, Ting; Tang, Li-Juan; Jiang, Jian-Hui; Wu, Hai-Long; Shen, Guo-Li; Yu, Ru-Qin

    2011-09-15

    As a greedy search algorithm, classification and regression tree (CART) is easily relapsing into overfitting while modeling microarray gene expression data. A straightforward solution is to filter irrelevant genes via identifying significant ones. Considering some significant genes with multi-modal expression patterns exhibiting systematic difference in within-class samples are difficult to be identified by existing methods, a strategy that unimodal transform of variables selected by interval segmentation purity (UTISP) for CART modeling is proposed. First, significant genes exhibiting varied expression patterns can be properly identified by a variable selection method based on interval segmentation purity. Then, unimodal transform is implemented to offer unimodal featured variables for CART modeling via feature extraction. Because significant genes with complex expression patterns can be properly identified and unimodal feature extracted in advance, this developed strategy potentially improves the performance of CART in combating overfitting or underfitting while modeling microarray data. The developed strategy is demonstrated using two microarray data sets. The results reveal that UTISP-based CART provides superior performance to k-nearest neighbors or CARTs coupled with other gene identifying strategies, indicating UTISP-based CART holds great promise for microarray data analysis.

  11. Derivation of Tree Canopy Cover by Multiscale Remote Sensing Approach

    NASA Astrophysics Data System (ADS)

    Wu, W.

    2011-08-01

    In forestry, tree canopy cover (CC) is an important biophysical indicator for characterizing terrestrial ecosystems and modeling global biogeochemical cycles, e.g., woody biomass estimation, carbon balance analysis (sink/emission). However, currently available CC product cannot fully meet what we need while conducting woody biomass estimation in tropical savannas. It is thus necessary to develop an approach to estimate more reliable CC. Based on the acquisition of multisensor and multiresolution dataset, this study introduces an innovative multiscale method for this purpose taking the multiple savannas country Sudan as an example. The procedure includes: (1) Measurement of CC using Google Earth Pro in which very high resolution images such as QuickBird and GeoEye images are available, and then the measured CC was coupled with atmospherically corrected and reflectance-based 16 frames of Landsat ETM+ vegetation indices (EVI, SARVI and NDVI) dated Nov 1999-2002 to establish the CC-VIs models; it was noted that among these indices NDVI indicates the best correlation with CC (CC = 153.09 NDVI - 10.12, R2 = 0.91); (2) The NDVI of Landsat ETM+ was calibrated against MODIS NDVI of the same time period (Nov 2002) to make sure that model developed from Landsat ETM+ data can be applied to MODIS data for upscaling to regional scale study; (3) Time-series MODIS NDVI data of the period Jan 2002-Dec 2009 (MODIS13Q1, 250m, 186 acquisitions) were acquired and used to decompose the woody component (NDVI) from seasonal change and herbaceous component by time-series analysis; (4) The equation obtained in step 1 was applied to the decomposed MODIS woody NDVI images to derive country scale CC data. The produced CC was checked against the 287 ground measured CC obtained in step 1 and a good agreement (R2 = 0.53-0.71) was found. It is hence concluded that the proposed multiscale approach is effective, operational and can be applied for reliable estimation of regional and even continental scales CC data.
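
    The linear model reported in this abstract, CC = 153.09 NDVI - 10.12, can be applied to woody NDVI values as in the sketch below. The coefficients come from the abstract; clamping the result to the 0-100 range is an added assumption (the abstract does not describe how out-of-range values were handled), and the function name and input values are hypothetical.

```python
def canopy_cover_percent(ndvi, slope=153.09, intercept=-10.12, clamp=True):
    """Estimate tree canopy cover (%) from woody NDVI with the linear model.

    slope and intercept are the values quoted in the abstract.
    clamp=True limits the estimate to [0, 100]; this is an assumption.
    """
    cc = slope * ndvi + intercept
    if clamp:
        cc = max(0.0, min(100.0, cc))
    return cc


# Made-up woody NDVI values for a few pixels.
for ndvi in (0.05, 0.20, 0.50):
    print(ndvi, round(canopy_cover_percent(ndvi), 2))
```

    Without clamping, an NDVI of 0.05 gives a slightly negative cover estimate, which is why some bound on the output is likely to be needed in practice.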

  12. A Phenotypic Approach for IUIS PID Classification and Diagnosis: Guidelines for Clinicians at the Bedside

    PubMed Central

    Jeddane, Leïla; Ailal, Fatima; Al Herz, Waleed; Conley, Mary Ellen; Cunningham-Rundles, Charlotte; Etzioni, Amos; Fischer, Alain; Franco, Jose Luis; Geha, Raif S.; Hammarström, Lennart; Nonoyama, Shigeaki; Ochs, Hans D.; Roifman, Chaim M.; Seger, Reinhard; Tang, Mimi L. K.; Puck, Jennifer M.; Chapel, Helen; Notarangelo, Luigi D.; Casanova, Jean-Laurent

    2014-01-01

    The number of genetically defined Primary Immunodeficiency Diseases (PID) has increased exponentially, especially in the past decade. The biennial classification published by the IUIS PID expert committee is therefore quickly expanding, providing valuable information regarding the disease-causing genotypes, the immunological anomalies, and the associated clinical features of PIDs. These are grouped in eight, somewhat overlapping, categories of immune dysfunction. However, based on this immunological classification, the diagnosis of a specific PID from the clinician’s observation of an individual clinical and/or immunological phenotype remains difficult, especially for non-PID specialists. The purpose of this work is to suggest a phenotypic classification that forms the basis for diagnostic trees, leading the physician to particular groups of PIDs, starting from clinical features and combining routine immunological investigations along the way. We present 8 colored diagnostic figures that correspond to the 8 PID groups in the IUIS Classification, including all the PIDs cited in the 2011 update of the IUIS classification and most of those reported since. PMID:23657403

  13. A statistical approach to material classification using image patch exemplars.

    PubMed

    Varma, Manik; Zisserman, Andrew

    2009-11-01

    In this paper, we investigate material classification from single images obtained under unknown viewpoint and illumination. It is demonstrated that materials can be classified using the joint distribution of intensity values over extremely compact neighborhoods (starting from as small as 3 × 3 pixels square) and that this can outperform classification using filter banks with large support. It is also shown that the performance of filter banks is inferior to that of image patches with equivalent neighborhoods. We develop novel texton-based representations which are suited to modeling this joint neighborhood distribution for Markov random fields. The representations are learned from training images and then used to classify novel images (with unknown viewpoint and lighting) into texture classes. Three such representations are proposed and their performance is assessed and compared to that of filter banks. The power of the method is demonstrated by classifying 2,806 images of all 61 materials present in the Columbia-Utrecht database. The classification performance surpasses that of recent state-of-the-art filter bank-based classifiers such as Leung and Malik (IJCV 01), Cula and Dana (IJCV 04), and Varma and Zisserman (IJCV 05). We also benchmark performance by classifying all of the textures present in the UIUC, Microsoft Textile, and San Francisco outdoor data sets. We conclude with discussions on why features based on compact neighborhoods can correctly discriminate between textures with large global structure and why the performance of filter banks is not superior to that of the source image patches from which they were derived.

  14. Bayesian Evidence Framework for Decision Tree Learning

    NASA Astrophysics Data System (ADS)

    Chatpatanasiri, Ratthachat; Kijsirikul, Boonserm

    2005-11-01

    This work is primarily interested in the problem of, given the observed data, selecting a single decision (or classification) tree. Although a single decision tree has a high risk of overfitting, the induced tree is easily interpreted. Researchers have invented various methods such as tree pruning or tree averaging for preventing the induced tree from overfitting (and from underfitting) the data. In this paper, instead of using those conventional approaches, we apply the Bayesian evidence framework of Gull, Skilling and Mackay to a process of selecting a decision tree. We derive a formal function to measure `the fitness' for each decision tree given a set of observed data. Our method, in fact, is analogous to a well-known Bayesian model selection method for interpolating noisy continuous-value data. As in regression problems, given reasonable assumptions, this derived score function automatically quantifies the principle of Ockham's razor, and hence reasonably deals with the issue of underfitting-overfitting tradeoff.

  15. A bayesian approach to classification criteria for spectacled eiders

    USGS Publications Warehouse

    Taylor, B.L.; Wade, P.R.; Stehn, R.A.; Cochrane, J.F.

    1996-01-01

    To facilitate decisions to classify species according to risk of extinction, we used Bayesian methods to analyze trend data for the Spectacled Eider, an arctic sea duck. Trend data from three independent surveys of the Yukon-Kuskokwim Delta were analyzed individually and in combination to yield posterior distributions for population growth rates. We used classification criteria developed by the recovery team for Spectacled Eiders that seek to equalize errors of under- or overprotecting the species. We conducted both a Bayesian decision analysis and a frequentist (classical statistical inference) decision analysis. Bayesian decision analyses are computationally easier, yield basically the same results, and yield results that are easier to explain to nonscientists. With the exception of the aerial survey analysis of the 10 most recent years, both Bayesian and frequentist methods indicated that an endangered classification is warranted. The discrepancy between surveys warrants further research. Although the trend data are abundance indices, we used a preliminary estimate of absolute abundance to demonstrate how to calculate extinction distributions using the joint probability distributions for population growth rate and variance in growth rate generated by the Bayesian analysis. Recent apparent increases in abundance highlight the need for models that apply to declining and then recovering species.

  16. A Hybrid Approach to Sentiment Sentence Classification in Suicide Notes

    PubMed Central

    Sohn, Sunghwan; Torii, Manabu; Li, Dingcheng; Wagholikar, Kavishwar; Wu, Stephen; Liu, Hongfang

    2012-01-01

    This paper describes the sentiment classification system developed by the Mayo Clinic team for the 2011 I2B2/VA/Cincinnati Natural Language Processing (NLP) Challenge. The sentiment classification task is to assign any pertinent emotion to each sentence in suicide notes. We have implemented three systems that have been trained on suicide notes provided by the I2B2 challenge organizer—a machine learning system, a rule-based system, and a system consisting of a combination of both. Our machine learning system was trained on re-annotated data in which apparently inconsistent emotion assignment was adjusted. Then, the machine learning methods by RIPPER and multinomial Naïve Bayes classifiers, manual pattern matching rules, and the combination of the two systems were tested to determine the emotions within sentences. The combination of the machine learning and rule-based system performed best and produced a micro-average F-score of 0.5640. PMID:22879759
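
    The abstract above reports a micro-average F-score. As a minimal sketch of that measure, the code below pools true positives, false positives, and false negatives over all labels before computing precision and recall. The label names and counts are invented for illustration and are not from the challenge data.

```python
def micro_f_score(counts):
    """Micro-averaged F-score from per-label (tp, fp, fn) counts.

    counts maps a label name to a (tp, fp, fn) tuple. Pooling the counts
    first means frequent labels weigh more than rare ones.
    """
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical per-emotion counts: (true pos, false pos, false neg).
toy_counts = {
    "emotion_a": (8, 2, 4),
    "emotion_b": (2, 2, 4),
}
print(micro_f_score(toy_counts))
```

    Because the counts are pooled, a system that does well only on the most frequent emotion labels can still obtain a moderate micro-average score.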

  17. Comparative Study on the Different Testing Techniques in Tree Classification for Detecting the Learning Motivation

    NASA Astrophysics Data System (ADS)

    Juliane, C.; Arman, A. A.; Sastramihardja, H. S.; Supriana, I.

    2017-03-01

    Having motivation to learn is a requirement for a successful learning process, and it needs to be maintained properly. This study aims to measure learning motivation, especially in the process of electronic learning (e-learning). Here, a data mining approach was chosen as the research method. For the testing process, a comparative study of the accuracy of different testing techniques was conducted, involving Cross Validation and Percentage Split. The best accuracy, 92.19%, was obtained by the J48 algorithm with the percentage split technique. This study provides an overview of how to detect the presence of learning motivation in the context of e-learning. It is expected to be a good contribution to education, and to alert teachers to the students for whom they need to provide motivation.
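
    The two testing techniques compared in this abstract, percentage split and cross-validation, differ only in how instance indices are divided into training and test sets. The sketch below generates both kinds of split. The 66% training fraction, the 10 folds, and the fixed random seed are illustrative assumptions; the abstract does not state the settings used.

```python
import random


def percentage_split(n, train_fraction=0.66, seed=0):
    """Shuffle n indices once and cut them into one train/test pair."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    cut = int(n * train_fraction)
    return indices[:cut], indices[cut:]


def k_fold_splits(n, k=10, seed=0):
    """Yield k (train, test) pairs; each index is tested exactly once."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


train, test = percentage_split(100)
print(len(train), len(test))
for fold_train, fold_test in k_fold_splits(100, k=10):
    print(len(fold_train), len(fold_test))
```

    A percentage split evaluates the model once on a single held-out part, whereas cross-validation averages over k held-out parts, so the two techniques can report different accuracies for the same algorithm.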

  18. Hierarchical Object-based Image Analysis approach for classification of sub-meter multispectral imagery in Tanzania

    NASA Astrophysics Data System (ADS)

    Chung, C.; Nagol, J. R.; Tao, X.; Anand, A.; Dempewolf, J.

    2015-12-01

    Increasing agricultural production while at the same time preserving the environment has become a challenging task. There is a need for new approaches for use of multi-scale and multi-source remote sensing data as well as ground based measurements for mapping and monitoring crop and ecosystem state to support decision making by governmental and non-governmental organizations for sustainable agricultural development. High resolution sub-meter imagery plays an important role in such an integrative framework of landscape monitoring. It helps link the ground based data to more easily available coarser resolution data, facilitating calibration and validation of derived remote sensing products. Here we present a hierarchical Object Based Image Analysis (OBIA) approach to classify sub-meter imagery. The primary reason for choosing OBIA is to accommodate pixel sizes smaller than the object or class of interest. Especially in non-homogeneous savannah regions of Tanzania, this is an important concern and the traditional pixel based spectral signature approach often fails. Ortho-rectified, calibrated, pan sharpened 0.5 meter resolution data acquired from DigitalGlobe's WorldView-2 satellite sensor was used for this purpose. Multi-scale hierarchical segmentation was performed using a multi-resolution segmentation approach to facilitate the use of texture, neighborhood context, and the relationship between super and sub objects for training and classification. eCognition, a commonly used OBIA software program, was used for this purpose. Both decision tree and random forest approaches for classification were tested. The Kappa index of agreement for both algorithms surpassed 85%. The results demonstrate that using hierarchical OBIA can effectively and accurately discriminate classes at even LCCS-3 legend.
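
    The Kappa index of agreement quoted in this abstract corrects observed agreement for the agreement expected by chance. A minimal sketch of Cohen's kappa computed from a confusion matrix follows; the 2 x 2 matrix is a made-up example, not the study's result, and the function name is hypothetical.

```python
def cohen_kappa(matrix):
    """Cohen's kappa from a square confusion matrix.

    matrix[i][j] is the number of samples of reference class i that
    were assigned to class j.
    """
    n = sum(sum(row) for row in matrix)
    size = len(matrix)
    observed = sum(matrix[i][i] for i in range(size)) / n
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(size)) for j in range(size)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    if expected == 1.0:
        return float("nan")  # undefined when chance agreement is total
    return (observed - expected) / (1.0 - expected)


# Made-up two-class confusion matrix (rows: reference, columns: predicted).
toy_matrix = [
    [45, 5],
    [5, 45],
]
print(cohen_kappa(toy_matrix))
```

    In this toy case the overall accuracy is 90% but kappa is lower, because half of the agreement would be expected by chance with two balanced classes.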

  19. Classification

    NASA Technical Reports Server (NTRS)

    Oza, Nikunj C.

    2011-01-01

    A supervised learning task involves constructing a mapping from input data (normally described by several features) to the appropriate outputs. Within supervised learning, one type of task is a classification learning task, in which each output is one or more classes to which the input belongs. In supervised learning, a set of training examples---examples with known output values---is used by a learning algorithm to generate a model. This model is intended to approximate the mapping between the inputs and outputs. This model can be used to generate predicted outputs for inputs that have not been seen before. For example, we may have data consisting of observations of sunspots. In a classification learning task, our goal may be to learn to classify sunspots into one of several types. Each example may correspond to one candidate sunspot with various measurements or just an image. A learning algorithm would use the supplied examples to generate a model that approximates the mapping between each supplied set of measurements and the type of sunspot. This model can then be used to classify previously unseen sunspots based on the candidate's measurements. This chapter discusses methods to perform machine learning, with examples involving astronomy.

  20. A novel approach to malignant-benign classification of pulmonary nodules by using ensemble learning classifiers.

    PubMed

    Tartar, A; Akan, A; Kilic, N

    2014-01-01

    Computer-aided detection systems can help radiologists to detect pulmonary nodules at an early stage. In this paper, a novel Computer-Aided Diagnosis system (CAD) is proposed for the classification of pulmonary nodules as malignant and benign. The proposed CAD system, which uses ensemble learning classifiers, provides important support to radiologists in the diagnosis process of the disease and achieves high classification performance. The proposed approach with bagging classifier results in 94.7%, 90.0% and 77.8% classification sensitivities for benign, malignant and undetermined classes (89.5% accuracy), respectively.

  1. Gene selection approach based on improved swarm intelligent optimisation algorithm for tumour classification.

    PubMed

    Jin, Cong; Jin, Shu-Wei

    2016-06-01

    A number of different gene selection approaches based on gene expression profiles (GEP) have been developed for tumour classification. A gene selection approach selects the most informative genes from the whole gene space, which is an important process for tumour classification using GEP. This study presents an improved swarm intelligent optimisation algorithm to select genes for maintaining the diversity of the population. The most essential characteristic of the proposed approach is that it can automatically determine the number of the selected genes. On the basis of the gene selection, the authors construct a variety of the tumour classifiers, including the ensemble classifiers. Four gene datasets are used to evaluate the performance of the proposed approach. The experimental results confirm that the proposed classifiers for tumour classification are indeed effective.

  2. Automatic Pulmonary Artery-Vein Separation and Classification in Computed Tomography Using Tree Partitioning and Peripheral Vessel Matching.

    PubMed

    Charbonnier, Jean-Paul; Brink, Monique; Ciompi, Francesco; Scholten, Ernst T; Schaefer-Prokop, Cornelia M; van Rikxoort, Eva M

    2016-03-01

    We present a method for automatic separation and classification of pulmonary arteries and veins in computed tomography. Our method takes advantage of local information to separate segmented vessels, and global information to perform the artery-vein classification. Given a vessel segmentation, a geometric graph is constructed that represents both the topology and the spatial distribution of the vessels. All nodes in the geometric graph where arteries and veins are potentially merged are identified based on graph pruning and individual branching patterns. At the identified nodes, the graph is split into subgraphs that each contain only arteries or veins. Based on the anatomical information that arteries and veins approach a common alveolar sag, an arterial subgraph is expected to be intertwined with a venous subgraph in the periphery of the lung. This relationship is quantified using periphery matching and is used to group subgraphs of the same artery-vein class. Artery-vein classification is performed on these grouped subgraphs based on the volumetric difference between arteries and veins. A quantitative evaluation was performed on 55 publicly available non-contrast CT scans. In all scans, two observers manually annotated randomly selected vessels as artery or vein. Our method was able to separate and classify arteries and veins with a median accuracy of 89%, closely approximating the inter-observer agreement. All CT scans used in this study, including all results of our system and all manual annotations, are publicly available at http://arteryvein.grand-challenge.org.

  3. A data driven approach for condition monitoring of wind turbine blade using vibration signals through best-first tree algorithm and functional trees algorithm: A comparative study.

    PubMed

    Joshuva, A; Sugumaran, V

    2017-03-01

    Wind energy is one of the important renewable energy resources available in nature. It is one of the major resources for production of energy because of its dependability due to the development of the technology and relatively low cost. Wind energy is converted into electrical energy using rotating blades. Due to environmental conditions and large structure, the blades are subjected to various vibration forces that may cause damage to the blades. This leads to a liability in energy production and turbine shutdown. The downtime can be reduced when the blades are diagnosed continuously using structural health condition monitoring. These are considered as a pattern recognition problem which consists of three phases namely, feature extraction, feature selection, and feature classification. In this study, statistical features were extracted from vibration signals, feature selection was carried out using a J48 decision tree algorithm and feature classification was performed using best-first tree algorithm and functional trees algorithm. The better algorithm is suggested for fault diagnosis of wind turbine blade.

  4. A New Approach in Teaching the Features and Classifications of Invertebrate Animals in Biology Courses

    ERIC Educational Resources Information Center

    Sezek, Fatih

    2013-01-01

    This study examined the effectiveness of a new learning approach in teaching classification of invertebrate animals in biology courses. In this approach, we used an impersonal style: the subject jigsaw, which differs from the other jigsaws in that both course topics and student groups are divided. Students in Jigsaw group were divided into five…

  5. Developmental Structuralist Approach to the Classification of Adaptive and Pathologic Personality Organizations: Infancy and Early Childhood.

    ERIC Educational Resources Information Center

    Greenspan, Stanley I.; Lourie, Reginald S.

    This paper applies a developmental structuralist approach to the classification of adaptive and pathologic personality organizations and behavior in infancy and early childhood, and it discusses implications of this approach for preventive intervention. In general, as development proceeds, the structural capacity of the developing infant and child…

  6. A multilayered approach for the analysis of perinatal mortality using different classification systems.

    PubMed

    Gordijn, Sanne J; Korteweg, Fleurisca J; Erwich, Jan Jaap H M; Holm, Jozien P; van Diem, Mariet Th; Bergman, Klasien A; Timmer, Albertus

    2009-06-01

    Many classification systems for perinatal mortality are available, all with their own strengths and weaknesses: none of them has been universally accepted. We present a systematic multilayered approach for the analysis of perinatal mortality based on information related to the moment of death, the conditions associated with death and the underlying cause of death, using a combination of representatives of existing classification systems. We compared the existing classification systems regarding their definition of the perinatal period, level of complexity, inclusion of maternal, foetal and/or placental factors and whether they focus at a clinical or pathological viewpoint. Furthermore, we allocated the classification systems to one of three categories: 'when', 'what' or 'why', dependent on whether the allocation of the individual cases of perinatal mortality is based on the moment of death ('when'), the clinical conditions associated with death ('what'), or the underlying cause of death ('why'). A multilayered approach for the analysis and classification of perinatal mortality is possible by using combinations of existing systems; for example the Wigglesworth or Nordic Baltic ('when'), ReCoDe ('what') and Tulip ('why') classification systems. This approach is useful not only for in depth analysis of perinatal mortality in the developed world but also for analysis of perinatal mortality in the developing countries, where resources to investigate death are often limited.

  7. Characterizing Vocal Repertoires—Hard vs. Soft Classification Approaches

    PubMed Central

    Wadewitz, Philip; Hammerschmidt, Kurt; Battaglia, Demian; Witt, Annette; Wolf, Fred; Fischer, Julia

    2015-01-01

    To understand the proximate and ultimate causes that shape acoustic communication in animals, objective characterizations of the vocal repertoire of a given species are critical, as they provide the foundation for comparative analyses among individuals, populations and taxa. However, progress in this field has been hampered by a lack of methodological standards. One problem is that researchers may settle on different variables to characterize the calls, which may affect the classification of calls. More importantly, there is no agreement on how best to characterize the overall structure of the repertoire in terms of the amount of gradation within and between call types. Here, we address these challenges by examining 912 calls recorded from wild chacma baboons (Papio ursinus). We extracted 118 acoustic variables from spectrograms, from which we constructed different sets of acoustic features, containing 9, 38, and 118 variables, as well as 19 factors derived from principal component analysis. We compared and validated the resulting classifications of k-means and hierarchical clustering. Datasets with a higher number of acoustic features led to better clustering results than datasets with only a few features. The use of factors in the cluster analysis resulted in an extremely poor resolution of emerging call types. Another important finding is that none of the applied clustering methods gave strong support to a specific cluster solution. Instead, the cluster analysis revealed that within distinct call types, subtypes may exist. Because hard clustering methods are not well suited to capture such gradation within call types, we applied a fuzzy clustering algorithm. We found that this algorithm provides a detailed and quantitative description of the gradation within and between chacma baboon call types. In conclusion, we suggest that fuzzy clustering should be used in future studies to analyze the graded structure of vocal repertoires. Moreover, the use of factor analyses to…
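
    The contrast between hard and fuzzy (soft) clustering described in this abstract can be illustrated with a minimal sketch. It assumes the standard fuzzy c-means membership formula with fuzzifier m = 2; the abstract does not state which fuzzy algorithm or settings the authors used, and the two-dimensional "calls", the cluster centers, and the function names below are hypothetical.

```python
import math

def fuzzy_memberships(point, centers, m=2.0):
    """Soft membership of `point` in each cluster; the values sum to 1.

    Uses the standard fuzzy c-means formula (an assumption, not taken
    from the paper): u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)).
    """
    dists = [math.dist(point, c) for c in centers]
    # A point lying exactly on a center belongs fully to that cluster.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** exponent for d_k in dists) for d_i in dists]

def hard_assignment(point, centers):
    """Index of the nearest center, as in a k-means assignment step."""
    dists = [math.dist(point, c) for c in centers]
    return dists.index(min(dists))

# Hypothetical call-type centers in a 2-D acoustic feature space.
centers = [(0.0, 0.0), (4.0, 0.0)]
typical_call = (0.5, 0.0)  # close to the first call type
graded_call = (2.0, 0.0)   # halfway between the two call types

typical_u = fuzzy_memberships(typical_call, centers)
graded_u = fuzzy_memberships(graded_call, centers)
print(hard_assignment(graded_call, centers), graded_u)
```

    A hard assignment forces the intermediate call into one cluster, whereas the memberships record that it sits between the two call types, which is the kind of gradation the abstract is concerned with.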

  8. An improved spanning tree approach for the reliability analysis of supply chain collaborative network

    NASA Astrophysics Data System (ADS)

    Lam, C. Y.; Ip, W. H.

    2012-11-01

    A higher degree of reliability in the collaborative network can increase the competitiveness and performance of an entire supply chain. As supply chain networks grow more complex, the consequences of unreliable behaviour become increasingly severe in terms of cost, effort and time. Moreover, it is computationally difficult to calculate the network reliability of a Non-deterministic Polynomial-time hard (NP-hard) all-terminal network using state enumeration, as this may require a huge number of iterations for topology optimisation. Therefore, this paper proposes an alternative approach of an improved spanning tree for reliability analysis to help effectively evaluate and analyse the reliability of collaborative networks in supply chains and reduce the comparative computational complexity of algorithms. Set theory is employed to evaluate and model the all-terminal reliability of the improved spanning tree algorithm and present a case study of a supply chain used in lamp production to illustrate the application of the proposed approach.

  9. A Voronoi interior adjacency-based approach for generating a contour tree

    NASA Astrophysics Data System (ADS)

    Chen, Jun; Qiao, Chaofei; Zhao, Renliang

    2004-05-01

    A contour tree is a good graphical tool for representing the spatial relations of contour lines and has found many applications in map generalization, map annotation, terrain analysis, etc. A new approach for generating contour trees by introducing a Voronoi-based interior adjacency set concept is proposed in this paper. The immediate interior adjacency set is employed to identify all of the children contours of each contour without contour elevations. It has advantages over existing methods such as the point-in-polygon method and the region growing-based method. This new approach can be used for spatial data mining and knowledge discovery, such as the automatic extraction of terrain features and construction of multi-resolution digital elevation models.

  10. Tropical forest structure characterization using airborne lidar data: an individual tree level approach

    NASA Astrophysics Data System (ADS)

    Ferraz, A.; Saatchi, S. S.

    2015-12-01

    Fine scale tropical forest structure characterization has been performed by means of field measurement techniques that record both the species and the diameter at breast height (dbh) for every tree within a given area. Due to dense and complex vegetation, additional important ecological variables (e.g. the tree height and crown size) are usually not measured because they are hard to recognize from the ground. The poor knowledge of the 3D tropical forest structure has been a major limitation for the understanding of different ecological issues such as the spatial distribution of carbon stocks, regeneration and competition dynamics and light penetration gradient assessments. Airborne laser scanning (ALS) is an active remote sensing technique that provides georeferenced distance measurements between the aircraft and the surface. It provides an unstructured 3D point cloud that is a high-resolution model of the forest. This study presents the first approach for tropical forest characterization at a fine scale using remote sensing data. The multi-modal lidar point cloud is decomposed into 3D clusters that correspond to single trees by means of a technique called Adaptive Mean Shift Segmentation (AMS3D). The ability of the corresponding individual tree metrics (tree height, crown area and crown volume) for the estimation of above ground biomass (agb) over the 50 ha CTFS plot in Barro Colorado Island is here assessed. We conclude that our approach is able to map the agb spatial distribution with an error of nearly 12% (RMSE = 28 Mg ha⁻¹) compared with field-based estimates over 1 ha plots.

  11. Neuropsychological Test Selection for Cognitive Impairment Classification: A Machine Learning Approach

    PubMed Central

    Williams, Jennifer A.; Schmitter-Edgecombe, Maureen; Cook, Diane J.

    2016-01-01

    Introduction Reducing the amount of testing required to accurately detect cognitive impairment is clinically relevant. The aim of this research was to determine the fewest number of clinical measures required to accurately classify participants as healthy older adult, mild cognitive impairment (MCI) or dementia using a suite of classification techniques. Methods Two variable selection machine learning models (i.e., naive Bayes, decision tree), a logistic regression, and two participant datasets (i.e., clinical diagnosis, clinical dementia rating; CDR) were explored. Participants classified using clinical diagnosis criteria included 52 individuals with dementia, 97 with MCI, and 161 cognitively healthy older adults. Participants classified using CDR included 154 individuals with CDR = 0, 93 individuals with CDR = 0.5, and 25 individuals with CDR = 1.0+. Twenty-seven demographic, psychological, and neuropsychological variables were available for variable selection. Results No significant difference was observed between naive Bayes, decision tree, and logistic regression models for classification of both clinical diagnosis and CDR datasets. Participant classification (70.0 – 99.1%), geometric mean (60.9 – 98.1%), sensitivity (44.2 – 100%), and specificity (52.7 – 100%) were generally satisfactory. Unsurprisingly, the MCI/CDR = 0.5 participant group was the most challenging to classify. Through variable selection only 2 – 9 variables were required for classification and varied between datasets in a clinically meaningful way. Conclusions The current study results reveal that machine learning techniques can accurately classify cognitive impairment and reduce the number of measures required for diagnosis. PMID:26332171
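
    The measures reported in this abstract (classification accuracy, sensitivity, specificity, and geometric mean) can be computed from a binary confusion matrix as sketched below. This assumes the geometric mean is the square root of sensitivity times specificity, which is the usual definition but is not spelled out in the abstract; the counts and the function name are hypothetical and are not data from the study.

```python
import math

def binary_metrics(tp, fn, tn, fp):
    """Summary measures for one class-versus-rest confusion matrix."""
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    # Assumed definition: geometric mean of sensitivity and specificity.
    g_mean = math.sqrt(sensitivity * specificity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "g_mean": g_mean,
    }

# Hypothetical counts, e.g. impaired versus healthy participants.
metrics = binary_metrics(tp=40, fn=10, tn=90, fp=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

    Unlike accuracy, the geometric mean drops sharply when either sensitivity or specificity is low, which makes it informative when group sizes are unequal, as they are in the datasets described above.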

  12. A distributed coding approach for stereo sequences in the tree structured Haar transform domain

    NASA Astrophysics Data System (ADS)

    Cancellaro, M.; Carli, M.; Neri, A.

    2009-02-01

    In this contribution, a novel method for distributed video coding for stereo sequences is proposed. The system encodes independently the left and right frames of the stereoscopic sequence. The decoder exploits the side information to achieve the best reconstruction of the correlated video streams. In particular, a syndrome coder approach based on a lifted Tree Structured Haar wavelet scheme has been adopted. The experimental results show the effectiveness of the proposed scheme.

  13. Automated classification of histopathology images of prostate cancer using a Bag-of-Words approach

    NASA Astrophysics Data System (ADS)

    Sanghavi, Foram M.; Agaian, Sos S.

    2016-05-01

    The goals of this paper are to (1) test the computer-aided classification of prostate cancer histopathology images based on the Bag-of-Words (BoW) approach, (2) evaluate the performance of the proposed method in classifying grades 3 and 4 against the results of the approach proposed by Khurd et al. in [9], and (3) classify the different grades of cancer, namely grades 0, 3, 4, and 5, using the proposed approach. The system performance is assessed using 132 prostate cancer histopathology images of different grades. The system performance of the SURF features is also analyzed by comparing the results with SIFT features using different cluster sizes. The results show 90.15% accuracy in detection of prostate cancer images using SURF features with 75 clusters for k-means clustering. The results showed higher sensitivity for SURF-based BoW classification compared to SIFT-based BoW.

  14. Audio Signal Processing Using Time-Frequency Approaches: Coding, Classification, Fingerprinting, and Watermarking

    NASA Astrophysics Data System (ADS)

    Umapathy, K.; Ghoraani, B.; Krishnan, S.

    2010-12-01

    Audio signals are information-rich nonstationary signals that play an important role in our day-to-day communication, perception of environment, and entertainment. Due to their nonstationary nature, time- or frequency-only approaches are inadequate in analyzing these signals. A joint time-frequency (TF) approach would be a better choice to efficiently process these signals. In this digital era, compression, intelligent indexing for content-based retrieval, classification, and protection of digital audio content are a few of the areas that encapsulate a majority of the audio signal processing applications. In this paper, we present a comprehensive array of TF methodologies that successfully address applications in all of the above mentioned areas. A TF-based audio coding scheme with novel psychoacoustics model, music classification, audio classification of environmental sounds, audio fingerprinting, and audio watermarking will be presented to demonstrate the advantages of using time-frequency approaches in analyzing and extracting information from audio signals.

  15. Classification of prosthetic heart valve sounds. A parametric approach

    SciTech Connect

    Candy, J.V.; Jones, H.E.

    1995-06-01

    People with heart problems have had their lives extended considerably with the development of the prosthetic heart valve. Great strides have been made in the development of the valves through the use of improved materials as well as efficient mechanical designs. However, since the valves operate continuously over a long period, structural failures can occur, even though they are relatively uncommon. Here the development of techniques to classify the valve either as having intact struts or as having a separated strut, commonly called single leg separation, is discussed. In this paper the signal processing techniques employed to extract the required signals/parameters are briefly reviewed and then it is shown how they can be used to simulate a synthetic heart valve database for eventual Monte Carlo testing. Next, the optimal classifier is developed under assumed conditions and its performance is compared to that of an adaptive-type classifier implemented with a probabilistic neural network. Finally, the adaptive classifier is applied to a data set and its performance is analyzed. Based on synthetic data it is shown that excellent performance of the classifiers can be achieved, implying a potentially robust solution to this classification problem. 21 refs., 11 figs., 1 tab.

  16. Statistical methods and neural network approaches for classification of data from multiple sources

    NASA Technical Reports Server (NTRS)

    Benediktsson, Jon Atli; Swain, Philip H.

    1990-01-01

    Statistical methods for classification of data from multiple data sources are investigated and compared to neural network models. A problem with using conventional multivariate statistical approaches for classification of data of multiple types is in general that a multivariate distribution cannot be assumed for the classes in the data sources. Another common problem with statistical classification methods is that the data sources are not equally reliable. This means that the data sources need to be weighted according to their reliability but most statistical classification methods do not have a mechanism for this. This research focuses on statistical methods which can overcome these problems: a method of statistical multisource analysis and consensus theory. Reliability measures for weighting the data sources in these methods are suggested and investigated. Secondly, this research focuses on neural network models. The neural networks are distribution free since no prior knowledge of the statistical distribution of the data is needed. This is an obvious advantage over most statistical classification methods. The neural networks also automatically take care of the problem involving how much weight each data source should have. On the other hand, their training process is iterative and can take a very long time. Methods to speed up the training procedure are introduced and investigated. Experimental results of classification using both neural network models and statistical methods are given, and the approaches are compared based on these results.

  17. Marker-Based Hierarchical Segmentation and Classification Approach for Hyperspectral Imagery

    NASA Technical Reports Server (NTRS)

    Tarabalka, Yuliya; Tilton, James C.; Benediktsson, Jon Atli; Chanussot, Jocelyn

    2011-01-01

    The Hierarchical SEGmentation (HSEG) algorithm, which is a combination of hierarchical step-wise optimization and spectral clustering, has given good performances for hyperspectral image analysis. This technique produces at its output a hierarchical set of image segmentations. The automated selection of a single segmentation level is often necessary. We propose and investigate the use of automatically selected markers for this purpose. In this paper, a novel Marker-based HSEG (M-HSEG) method for spectral-spatial classification of hyperspectral images is proposed. First, pixelwise classification is performed and the most reliably classified pixels are selected as markers, with the corresponding class labels. Then, a novel constrained marker-based HSEG algorithm is applied, resulting in a spectral-spatial classification map. The experimental results show that the proposed approach yields accurate segmentation and classification maps, and thus is attractive for hyperspectral image analysis.

  18. Voxel-Based Approach for Estimating Urban Tree Volume from Terrestrial Laser Scanning Data

    NASA Astrophysics Data System (ADS)

    Vonderach, C.; Voegtle, T.; Adler, P.

    2012-07-01

    The importance of single trees and the determination of related parameters has been recognized in recent years, e.g. for forest inventories or management. For urban areas an increasing interest in the data acquisition of trees can be observed concerning aspects like urban climate, CO2 balance, and environmental protection. Urban trees differ significantly from natural systems with regard to the site conditions (e.g. technogenic soils, contaminants, lower groundwater level, regular disturbance), climate (increased temperature, reduced humidity) and species composition and arrangement (habitus and health status) and therefore allometric relations cannot be transferred from natural sites to urban areas. To overcome this problem an extended approach was developed for a fast and non-destructive extraction of branch volume, DBH (diameter at breast height) and height of single trees from point clouds of terrestrial laser scanning (TLS). For data acquisition, the trees were scanned with the highest scan resolution from several (up to five) positions located around the tree. The resulting point clouds (20 to 60 million points) are analysed with an algorithm based on voxel (volume elements) structure, leading to an appropriate data reduction. In a first step, two kinds of noise reduction are carried out: the elimination of isolated voxels as well as voxels with marginal point density. To obtain correct volume estimates, the voxels inside the stem and branches (interior voxels), which contain no laser points, must also be taken into account. For this filling process, an easy and robust approach was developed based on a layer-wise (horizontal layers of the voxel structure) intersection of four orthogonal viewing directions. However, this procedure also generates several erroneous "phantom" voxels, which have to be eliminated. For this purpose the previous approach was extended by a special region growing algorithm. In a final step the volume is determined layer-wise based on the extracted…
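
    The first noise-reduction step mentioned in this abstract (discarding voxels with marginal point density and isolated voxels) can be sketched as follows. This is only an illustration under stated assumptions: the voxel size, the density threshold `min_points`, the 26-neighbour definition of "isolated", and the order of the two filters are hypothetical choices, not parameters reported by the authors, and the interior filling and region growing steps are not covered.

```python
from collections import Counter
from itertools import product

def voxelize(points, size):
    """Count laser points per voxel; keys are integer (i, j, k) indices."""
    return Counter(tuple(int(c // size) for c in p) for p in points)

def filter_voxels(counts, min_points=2):
    """Drop sparse voxels, then voxels with no occupied neighbour."""
    # Step 1: remove voxels with marginal point density (assumed threshold).
    dense = {v for v, n in counts.items() if n >= min_points}
    # Step 2: remove isolated voxels, using the 26 surrounding cells.
    offsets = [o for o in product((-1, 0, 1), repeat=3) if o != (0, 0, 0)]

    def has_neighbour(v):
        return any(
            (v[0] + dx, v[1] + dy, v[2] + dz) in dense for dx, dy, dz in offsets
        )

    return {v for v in dense if has_neighbour(v)}

# Hypothetical point cloud (metres) and a 1 m voxel size.
points = [
    (0.1, 0.1, 0.1), (0.2, 0.2, 0.2),  # dense voxel (0, 0, 0)
    (1.1, 0.1, 0.1), (1.2, 0.3, 0.1),  # dense neighbour (1, 0, 0)
    (5.1, 5.1, 5.1), (5.2, 5.2, 5.2),  # dense but isolated voxel (5, 5, 5)
    (0.5, 1.5, 0.5),                   # single point: marginal density
]
size = 1.0
kept = filter_voxels(voxelize(points, size))
volume = len(kept) * size ** 3  # volume of the occupied voxels only
print(sorted(kept), volume)
```

    Only the two adjacent dense voxels survive; the lone point and the isolated voxel are treated as noise.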

  19. A sonographic approach to prenatal classification of congenital spine anomalies

    PubMed Central

    Robertson, Meiri; Sia, Sock Bee

    2015-01-01

    Abstract Objective: To develop a classification system for congenital spine anomalies detected by prenatal ultrasound. Methods: Data were collected from fetuses with spine abnormalities diagnosed in our institution over a five‐year period between June 2005 and June 2010. The ultrasound images were analysed to determine which features were associated with different congenital spine anomalies. Findings of the prenatal ultrasound images were correlated with other prenatal imaging, post mortem findings, post mortem imaging, neonatal imaging, karyotype, and other genetic workup. Data from published case reports of prenatal diagnosis of rare congenital spine anomalies were analysed to provide a comprehensive work. Results: During the study period, eighteen cases of spine abnormalities were diagnosed in 7819 women. The mean gestational age at diagnosis was 18.8w ± 2.2 SD. While most cases represented open NTD, a spectrum of vertebral abnormalities were diagnosed prenatally. These included hemivertebrae, block vertebrae, cleft or butterfly vertebrae, sacral agenesis, and a lipomeningocele. The most sensitive features for diagnosis of a spine abnormality included flaring of the vertebral arch ossification centres, abnormal spine curvature, and short spine length. While reported findings at the time of diagnosis were often conservative, retrospective analysis revealed good correlation with radiographic imaging. 3D imaging was found to be a valuable tool in many settings. Conclusions: Analysis of the study findings showed prenatal ultrasound allowed detection of disruption to the normal appearances of the fetal spine. Using the three features of flaring of the vertebral arch ossification centres, abnormal spine curvature, and short spine length, an algorithm was devised to aid with the diagnosis of spine anomalies for those who perform and report prenatal ultrasound. PMID:28191204

  20. Counting scars on tree stems to assess rockfall hazards: A low effort approach, but how reliable?

    NASA Astrophysics Data System (ADS)

    Trappmann, Daniel; Stoffel, Markus

    2013-01-01

    Rockfall is a widespread and hazardous process in mountain environments, but data on past events are only rarely available. Growth-ring series from trees impacted by rockfall were successfully used in the past to overcome the lack of archival records. Dendrogeomorphic techniques have been demonstrated to allow very accurate dating and reconstruction of spatial and temporal rockfall activity, but the approach has been cited to be labor intensive and time consuming. In this study, we present a simplified method to quantify rockfall processes on forested slopes requiring less time and effort. The approach is based on a counting of visible scars on the stem surface of Common beech (Fagus sylvatica L.). Data are presented from a site in the Inn valley (Austria), where rocks are frequently detached from an ~ 200-m-high, south-facing limestone cliff. We compare results obtained from (i) the "classical" analysis of growth disturbances in the tree-ring series of 33 Norway spruces (Picea abies (L.) Karst.) and (ii) data obtained with a scar count on the stem surface of 50 F. sylvatica trees. A total of 277 rockfall events since A.D. 1819 could be reconstructed from tree-ring records of P. abies, whereas 1140 scars were observed on the stem surface of F. sylvatica. Absolute numbers of rockfalls (and hence return intervals) vary significantly between the approaches, and the mean number of rockfalls observed on the stem surface of F. sylvatica exceeds that of P. abies by a factor of 2.7. On the other hand, both methods yield comparable data on the spatial distribution of relative rockfall activity. Differences may be explained by a great portion of masked scars in P. abies and the conservation of signs of impacts on the stem of F. sylvatica. Besides, data indicate that several scars on the bark of F. sylvatica may stem from the same impact and thus lead to an overestimation of rockfall activity.

  1. Investigating the limitations of tree species classification using the Combined Cluster and Discriminant Analysis method for low density ALS data from a dense forest region in Aggtelek (Hungary)

    NASA Astrophysics Data System (ADS)

    Koma, Zsófia; Deák, Márton; Kovács, József; Székely, Balázs; Kelemen, Kristóf; Standovár, Tibor

    2016-04-01

    Airborne Laser Scanning (ALS) is a widely used technology for forestry classification applications. However, single tree detection and species classification from low density ALS point clouds is limited in a dense forest region. In this study we investigate the division of a forest into homogeneous groups at stand level. The study area is located in the Aggtelek karst region (Northeast Hungary) with a complex relief topography. The ALS dataset contained only 4 discrete echoes (at 2-4 pt/m2 density) from the study area during leaf-on season. Ground-truth measurements about canopy closure and proportion of tree species cover are available every 70 meters in 500 square meter circular plots. In the first step, ALS data were processed and geometrical and intensity based features were calculated into a 5×5 meter raster based grid. The derived features contained: basic statistics of relative height, canopy RMS, echo ratio, openness, pulse penetration ratio, basic statistics of radiometric feature. In the second step the data were investigated using Combined Cluster and Discriminant Analysis (CCDA, Kovács et al., 2014). The CCDA method first determines a basic grouping for the multiple circle shaped sampling locations using hierarchical clustering and then for the arising grouping possibilities a core cycle is executed comparing the goodness of the investigated groupings with random ones. Out of these comparisons difference values arise, yielding information about the optimal grouping out of the investigated ones. If sub-groups are then further investigated, one might even find homogeneous groups. We found that low density ALS data classification into homogeneous groups is highly dependent on canopy closure and the proportion of the dominant tree species. The presented results show high potential using CCDA for determination of homogeneous separable groups in LiDAR based tree species classification. 
Aggtelek Karst/Slovakian Karst Caves" (HUSK/1101/221/0180, Aggtelek NP

  2. Prediction and Diagnosis of Non-Alcoholic Fatty Liver Disease (NAFLD) and Identification of Its Associated Factors Using the Classification Tree Method

    PubMed Central

    Birjandi, Mehdi; Ayatollahi, Seyyed Mohammad Taghi; Pourahmad, Saeedeh; Safarpour, Ali Reza

    2016-01-01

    Background Non-alcoholic fatty liver disease (NAFLD) is the most common form of liver disease in many parts of the world. Objectives The aim of the present study was to identify the most important factors influencing NAFLD using a classification tree (CT) to predict the probability of NAFLD. Patients and Methods This cross-sectional study was conducted in Kavar, a town in the south of Fars province, Iran. A total of 1,600 individuals were selected for the study via the stratified method and multiple-stage cluster random sampling. A total of 30 demographic and clinical variables were measured for each individual. Participants were divided into two datasets: testing and training. We used the training dataset (1,120 individuals) to build the CT and the testing dataset (480 individuals) to assess the CT. The CT was also used to estimate class and to predict fatty liver occurrence. Results NAFLD was diagnosed in 22% of the individuals in the sample. Our findings revealed that the following variables, based on univariate analysis, had a significant association with NAFLD: marital status, history of hepatitis B vaccine, history of surgery, body mass index (BMI), waist-hip ratio (WHR), systolic blood pressure (SBP), diastolic blood pressure (DBP), high-density lipoprotein (HDL), triglycerides (TG), alanine aminotransferase (ALT), cholesterol (CHO), aspartate aminotransferase (AST), glucose (GLU), albumin (AL), and age (P < 0.05). The main affecting variables for predicting NAFLD based on the CT and in order of importance were as follows: BMI, WHR, triglycerides, glucose, SBP, and alanine aminotransferase. The goodness-of-fit measures of the model on the training and testing datasets were as follows: prediction accuracy (80%, 75%), sensitivity (74%, 73%), specificity (83%, 77%), and the area under the receiver operating characteristic (ROC) curve (78%, 75%), respectively. Conclusions The CT is a suitable and easy-to-interpret approach for decision-making and predicting NAFLD. PMID
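
    How a classification tree picks a split on a single variable can be sketched as follows. The sketch assumes a CART-style criterion (weighted Gini impurity), which the abstract does not confirm; the BMI values, the labels, and the function names are invented for illustration and are not data from the study.

```python
def gini(labels):
    """Gini impurity of a list of binary (0/1) class labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_threshold(values, labels):
    """Return (threshold, weighted impurity) of the best binary split."""
    pairs = sorted(zip(values, labels))
    best = (None, float("inf"))
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold fits between equal values
        # Candidate threshold: midpoint between neighbouring values.
        thr = (pairs[i][0] + pairs[i - 1][0]) / 2
        left = [y for x, y in pairs if x <= thr]
        right = [y for x, y in pairs if x > thr]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[1]:
            best = (thr, score)
    return best

# Hypothetical BMI values with 1 = NAFLD and 0 = no NAFLD.
bmi = [21, 23, 24, 29, 31, 33]
nafld = [0, 0, 0, 1, 1, 1]
threshold, impurity = best_threshold(bmi, nafld)
print(threshold, impurity)
```

    A full tree repeats this search over all candidate variables at each node and recurses into the two resulting subsets; the variable chosen nearest the root is typically reported as the most important.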

  3. A novel deep learning approach for classification of EEG motor imagery signals

    NASA Astrophysics Data System (ADS)

    Rezaei Tabar, Yousef; Halici, Ugur

    2017-02-01

    Objective. Signal classification is an important issue in brain computer interface (BCI) systems. Deep learning approaches have been used successfully in many recent studies to learn features and classify different types of data. However, the number of studies that employ these approaches on BCI applications is very limited. In this study we aim to use deep learning methods to improve classification performance of EEG motor imagery signals. Approach. In this study we investigate convolutional neural networks (CNN) and stacked autoencoders (SAE) to classify EEG motor imagery signals. A new form of input is introduced to combine time, frequency and location information extracted from the EEG signal, and it is used in a CNN having one 1D convolutional layer and one max-pooling layer. We also propose a new deep network by combining CNN and SAE. In this network, the features that are extracted in CNN are classified through the deep network SAE. Main results. The classification performance obtained by the proposed method on BCI competition IV dataset 2b in terms of kappa value is 0.547. Our approach yields a 9% improvement over the winning algorithm of the competition. Significance. Our results show that deep learning methods provide better classification performance compared to other state-of-the-art approaches. These methods can be applied successfully to BCI systems where the amount of data is large due to daily recording.

  4. A Consensus Tree Approach for Reconstructing Human Evolutionary History and Detecting Population Substructure

    NASA Astrophysics Data System (ADS)

    Tsai, Ming-Chi; Blelloch, Guy; Ravi, R.; Schwartz, Russell

    The random accumulation of variations in the human genome over time implicitly encodes a history of how human populations have arisen, dispersed, and intermixed since we emerged as a species. Reconstructing that history is a challenging computational and statistical problem but has important applications both to basic research and to the discovery of genotype-phenotype correlations. In this study, we present a novel approach to inferring human evolutionary history from genetic variation data. Our approach uses the idea of consensus trees, a technique generally used to reconcile species trees from divergent gene trees, adapting it to the problem of finding the robust relationships within a set of intraspecies phylogenies derived from local regions of the genome. We assess the quality of the method on two large-scale genetic variation data sets: the HapMap Phase II and the Human Genome Diversity Project. Qualitative comparison to a consensus model of the evolution of modern human population groups shows that our inferences closely match our best current understanding of human evolutionary history. A further comparison with results of a leading method for the simpler problem of population substructure assignment verifies that our method provides comparable accuracy in identifying meaningful population subgroups in addition to inferring the relationships among them.

  5. Comprehensive Decision Tree Models in Bioinformatics

    PubMed Central

    Stiglic, Gregor; Kocbek, Simon; Pernek, Igor; Kokol, Peter

    2012-01-01

    Purpose Classification is an important and widely used machine learning technique in bioinformatics. Researchers and other end-users of machine learning software often prefer to work with comprehensible models where knowledge extraction and explanation of reasoning behind the classification model are possible. Methods This paper presents an extension to an existing machine learning environment and a study on visual tuning of decision tree classifiers. The motivation for this research comes from the need to build effective and easily interpretable decision tree models by the so-called one-button data mining approach where no parameter tuning is needed. To avoid bias in classification, no classification performance measure is used during the tuning of the model that is constrained exclusively by the dimensions of the produced decision tree. Results The proposed visual tuning of decision trees was evaluated on 40 datasets containing classical machine learning problems and 31 datasets from the field of bioinformatics. Although we did not expect significant differences in classification performance, the results demonstrate a significant increase of accuracy in less complex visually tuned decision trees. In contrast to classical machine learning benchmarking datasets, we observe higher accuracy gains in bioinformatics datasets. Additionally, a user study was carried out to confirm the assumption that the tree tuning times are significantly lower for the proposed method in comparison to manual tuning of the decision tree. Conclusions The empirical results demonstrate that by building simple models constrained by predefined visual boundaries, one not only achieves good comprehensibility, but also very good classification performance that does not differ from usually more complex models built using default settings of the classical decision tree algorithm. In addition, our study demonstrates the suitability of visually tuned decision trees for datasets with binary class

  6. Head Pose Estimation on Eyeglasses Using Line Detection and Classification Approach

    NASA Astrophysics Data System (ADS)

    Setthawong, Pisal; Vannija, Vajirasak

    This paper proposes a unique approach for head pose estimation of subjects with eyeglasses by using a combination of line detection and classification approaches. Head pose estimation is considered as an important non-verbal form of communication and could also be used in the area of Human-Computer Interface. A major improvement of the proposed approach is that it allows estimation of head poses at a high yaw/pitch angle when compared with existing geometric approaches, does not require expensive data preparation and training, and is generally fast when compared with other approaches.

  7. Optimization of a Non-traditional Unsupervised Classification Approach for Land Cover Analysis

    NASA Technical Reports Server (NTRS)

    Boyd, R. K.; Brumfield, J. O.; Campbell, W. J.

    1982-01-01

    The conditions under which a hybrid of clustering and canonical analysis for image classification produces optimum results were analyzed. The approach involves generation of classes by clustering for input to canonical analysis. The importance of the number of clusters input and the effect of other parameters of the clustering algorithm (ISOCLS) were examined. The approach derives its final result by clustering the canonically transformed data. Therefore, the importance of the number of clusters requested in this final stage was also examined. The effect of these variables was studied in terms of the average separability (as measured by transformed divergence) of the final clusters, the transformation matrices resulting from different numbers of input classes, and the accuracy of the final classifications. The research was performed with LANDSAT MSS data over the Hazleton/Berwick Pennsylvania area. Final classifications were compared pixel by pixel with an existing geographic information system to provide an indication of their accuracy.

  8. A Voxel-Map Quantitative Analysis Approach for Atherosclerotic Noncalcified Plaques of the Coronary Artery Tree

    PubMed Central

    Li, Ying; Chen, Wei; Chen, Yonglin; Chu, Chun; Fang, Bingji; Tan, Liwen

    2013-01-01

    Noncalcified plaques (NCPs) are associated with the presence of lipid-core plaques that are prone to rupture. Thus, it is important to detect and monitor the development of NCPs. Contrast-enhanced coronary Computed Tomography Angiography (CTA) is a potential imaging technique to identify atherosclerotic plaques in the whole coronary tree, but it fails to provide information about vessel walls. In order to overcome the limitations of coronary CTA and provide more meaningful quantitative information for percutaneous coronary intervention (PCI), we proposed a Voxel-Map based on mathematical morphology to quantitatively analyze the noncalcified plaques on a three-dimensional coronary artery wall model (3D-CAWM). This approach is a combination of Voxel-Map analysis techniques, plaque locating, and anatomical location related labeling, which show more detailed and comprehensive coronary tree wall visualization. PMID:24348749

  9. Identifying Risk and Protective Factors in Recidivist Juvenile Offenders: A Decision Tree Approach.

    PubMed

    Ortega-Campos, Elena; García-García, Juan; Gil-Fenoy, Maria José; Zaldívar-Basurto, Flor

    2016-01-01

    Research on juvenile justice aims to identify profiles of risk and protective factors in juvenile offenders. This paper presents a study of profiles of risk factors that influence young offenders toward committing sanctionable antisocial behavior (S-ASB). Decision tree analysis is used as a multivariate approach to the phenomenon of repeated sanctionable antisocial behavior in juvenile offenders in Spain. The study sample was made up of the set of juveniles who were charged in a court case in the Juvenile Court of Almeria (Spain). The period of study of recidivism was two years from the baseline. The object of study is presented, through the implementation of a decision tree. Two profiles of risk and protective factors are found. Risk factors associated with higher rates of recidivism are antisocial peers, age at baseline S-ASB, problems in school and criminality in family members.

  10. Identifying Risk and Protective Factors in Recidivist Juvenile Offenders: A Decision Tree Approach

    PubMed Central

    Ortega-Campos, Elena; García-García, Juan; Gil-Fenoy, Maria José; Zaldívar-Basurto, Flor

    2016-01-01

    Research on juvenile justice aims to identify profiles of risk and protective factors in juvenile offenders. This paper presents a study of profiles of risk factors that influence young offenders toward committing sanctionable antisocial behavior (S-ASB). Decision tree analysis is used as a multivariate approach to the phenomenon of repeated sanctionable antisocial behavior in juvenile offenders in Spain. The study sample was made up of the set of juveniles who were charged in a court case in the Juvenile Court of Almeria (Spain). The period of study of recidivism was two years from the baseline. The object of study is presented, through the implementation of a decision tree. Two profiles of risk and protective factors are found. Risk factors associated with higher rates of recidivism are antisocial peers, age at baseline S-ASB, problems in school and criminality in family members. PMID:27611313

  11. Dynamics of flexible bodies in tree topology - A computer oriented approach

    NASA Technical Reports Server (NTRS)

    Singh, R. P.; Vandervoort, R. J.; Likins, P. W.

    1984-01-01

    An approach suited for automatic generation of the equations of motion for large mechanical systems (i.e., large space structures, mechanisms, robots, etc.) is presented. The system topology is restricted to a tree configuration. The tree is defined as an arbitrary set of rigid and flexible bodies connected by hinges characterizing relative translations and rotations of two adjoining bodies. The equations of motion are derived via Kane's method. The resulting equation set is of minimum dimension. Dynamical equations are imbedded in a computer program called TREETOPS. Extensive control simulation capability is built in the TREETOPS program. The simulation is driven by an interactive set-up program resulting in an easy to use analysis tool.

  12. Classification of Phylogenetic Profiles for Protein Function Prediction: An SVM Approach

    NASA Astrophysics Data System (ADS)

    Kotaru, Appala Raju; Joshi, Ramesh C.

    Predicting the function of an uncharacterized protein is a major challenge in the post-genomic era because of the problem's complexity and scale. Knowledge of protein function is a crucial link in the development of new drugs, better crops, and even biochemicals such as biofuels. Recently, numerous high-throughput experimental procedures have been invented to investigate the mechanisms leading to the accomplishment of a protein’s function, and the phylogenetic profile is one of them. A phylogenetic profile is a way of representing a protein that encodes its evolutionary history. In this paper we propose a method for classifying phylogenetic profiles using a supervised machine learning method, support vector machine classification with a radial basis function kernel, to identify functionally linked proteins. We experimentally evaluated the performance of the classifier with the linear and polynomial kernels and compared the results with the existing tree kernel. In our study we used proteins of the budding yeast Saccharomyces cerevisiae genome. We generated the phylogenetic profiles of 2465 yeast genes and used the functional annotations available in the MIPS database. Our experiments show that the performance of the radial basis kernel is similar to that of the polynomial kernel in some functional classes, that both are better than the linear and tree kernels, and that overall the radial basis kernel outperformed the polynomial, linear, and tree kernels. In analyzing these results we show that it is feasible to use an SVM classifier with a radial basis function kernel to predict gene functionality from phylogenetic profiles.
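    The kernels compared in this abstract can be illustrated on binary presence/absence profiles. The sketch below is not the authors' implementation: the profile vectors and the parameter values (`gamma`, `degree`, `coef0`) are assumptions chosen for illustration, and the tree kernel is omitted because the abstract does not define it.

```python
import math

def linear_kernel(x, y):
    """Dot product of two profile vectors."""
    return sum(a * b for a, b in zip(x, y))

def polynomial_kernel(x, y, degree=2, coef0=1.0):
    """(x . y + coef0) ** degree; degree and coef0 are illustrative defaults."""
    return (linear_kernel(x, y) + coef0) ** degree

def rbf_kernel(x, y, gamma=0.5):
    """exp(-gamma * ||x - y||^2); gamma is an illustrative default."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

# Hypothetical phylogenetic profiles: 1 = homolog present in a genome, 0 = absent.
profile_a = [1, 0, 1, 1]
profile_b = [1, 1, 0, 1]

print(linear_kernel(profile_a, profile_b))
print(polynomial_kernel(profile_a, profile_b))
print(rbf_kernel(profile_a, profile_b))
```

    In practice such kernel values would be passed to an SVM solver; the functions here only show how the similarity between two profiles is scored under each kernel.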

  13. Updating the US Hydrologic Classification: An Approach to Clustering and Stratifying Ecohydrologic Data

    SciTech Connect

    McManamay, Ryan A; Bevelhimer, Mark S; Kao, Shih-Chieh

    2013-01-01

    Hydrologic classifications unveil the structure of relationships among groups of streams with differing stream flow and provide a foundation for drawing inferences about the principles that govern those relationships. Hydrologic classes provide a template to describe ecological patterns, generalize hydrologic responses to disturbance, and stratify research and management needs applicable to ecohydrology. We developed two updated hydrologic classifications for the continental US using two streamflow datasets of varying reference standards. Using only reference-quality gages, we classified 1715 stream gages into 12 classes across the US. By including more streamflow gages (n=2618) in a separate classification, we increased the dimensionality (i.e. classes) and hydrologic distinctiveness within regions at the expense of decreasing the natural flow standards (i.e. reference quality). Greater numbers of classes and higher regional affiliation within our hydrologic classifications compared to that of the previous US hydrologic classification (Poff, 1996) suggested that the level of hydrologic variation and resolution was not completely represented in smaller sample sizes. Part of the utility of classification systems rests in their ability to classify new objects and stratify analyses. We constructed separate random forests to predict hydrologic class membership based on hydrologic indices or landscape variables. In addition, we provide an approach to assessing potential outliers due to hydrologic alteration based on class assignment. Departures from class membership due to disturbance take into account multiple hydrologic indices simultaneously; thus, classes can be used to determine if disturbed streams are functioning within the realm of natural hydrology.

  14. Risk assessment for enterprise resource planning (ERP) system implementations: a fault tree analysis approach

    NASA Astrophysics Data System (ADS)

    Zeng, Yajun; Skibniewski, Miroslaw J.

    2013-08-01

    Enterprise resource planning (ERP) system implementations are often characterised with large capital outlay, long implementation duration, and high risk of failure. In order to avoid ERP implementation failure and realise the benefits of the system, sound risk management is the key. This paper proposes a probabilistic risk assessment approach for ERP system implementation projects based on fault tree analysis, which models the relationship between ERP system components and specific risk factors. Unlike traditional risk management approaches that have been mostly focused on meeting project budget and schedule objectives, the proposed approach intends to address the risks that may cause ERP system usage failure. The approach can be used to identify the root causes of ERP system implementation usage failure and quantify the impact of critical component failures or critical risk events in the implementation process.
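    The quantification step of a fault tree can be sketched with AND and OR gates over independent basic events. This is a minimal illustration, not the model from the paper: the event names and probabilities are hypothetical, and independence of the basic events is an assumption.

```python
from math import prod

def gate_and(probs):
    """Probability that all independent input events occur."""
    return prod(probs)

def gate_or(probs):
    """Probability that at least one independent input event occurs."""
    return 1.0 - prod(1.0 - p for p in probs)

# Hypothetical basic-event probabilities (illustrative only, not from the paper).
p_poor_training = 0.10
p_weak_support = 0.05
p_data_errors = 0.02
p_no_validation = 0.50

# Intermediate events feed the top event "ERP usage failure".
user_adoption_failure = gate_or([p_poor_training, p_weak_support])
data_quality_failure = gate_and([p_data_errors, p_no_validation])
usage_failure = gate_or([user_adoption_failure, data_quality_failure])

print(round(usage_failure, 5))
```

    Changing one basic-event probability and recomputing the top event gives a simple measure of how much that risk factor contributes to overall failure.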

  15. Mapping raised bogs with an iterative one-class classification approach

    NASA Astrophysics Data System (ADS)

    Mack, Benjamin; Roscher, Ribana; Stenzel, Stefanie; Feilhauer, Hannes; Schmidtlein, Sebastian; Waske, Björn

    2016-10-01

    Land use and land cover maps are one of the most commonly used remote sensing products. In many applications the user only requires a map of one particular class of interest, e.g. a specific vegetation type or an invasive species. One-class classifiers are appealing alternatives to common supervised classifiers because they can be trained with labeled training data of the class of interest only. However, training an accurate one-class classification (OCC) model is challenging, particularly when facing a large image, a small class and few training samples. To tackle these problems we propose an iterative OCC approach. The presented approach uses a biased Support Vector Machine as core classifier. In an iterative pre-classification step a large part of the pixels not belonging to the class of interest is classified. The remaining data is classified by a final classifier with a novel model and threshold selection approach. The specific objective of our study is the classification of raised bogs in a study site in southeast Germany, using multi-seasonal RapidEye data and a small number of training samples. Results demonstrate that the iterative OCC outperforms other state-of-the-art one-class classifiers and approaches for model selection. The study highlights the potential of the proposed approach for an efficient and improved mapping of small classes such as raised bogs. Overall, the proposed approach is feasible and a useful modification of a regular one-class classifier.

  16. Wittgenstein's philosophy and a dimensional approach to the classification of mental disorders -- a preliminary scheme.

    PubMed

    Mackinejad, Kioumars; Sharifi, Vandad

    2006-01-01

    In this paper the importance of Wittgenstein's philosophical ideas for the justification of a dimensional approach to the classification of mental disorders is discussed. Some of his basic concepts in his Philosophical Investigations, such as 'family resemblances', 'grammar' and 'language-game' and their relations to the concept of mental disorder are explored.

  17. DeepSAT: A Deep Learning Approach to Tree-Cover Delineation in 1-m NAIP Imagery for the Continental United States

    NASA Technical Reports Server (NTRS)

    Ganguly, Sangram; Basu, Saikat; Nemani, Ramakrishna R.; Mukhopadhyay, Supratik; Michaelis, Andrew; Votava, Petr

    2016-01-01

    High resolution tree cover classification maps are needed to increase the accuracy of current land ecosystem and climate model outputs. Limited studies are in place that demonstrate the state of the art in deriving very high resolution (VHR) tree cover products. In addition, most methods rely heavily on commercial software that is difficult to scale given the region of study (e.g. continents to globe). Complexities in present approaches relate to (a) scalability of the algorithm, (b) large image data processing (compute and memory intensive), (c) computational cost, (d) massively parallel architecture, and (e) machine learning automation. In addition, VHR satellite datasets are of the order of terabytes and features extracted from these datasets are of the order of petabytes. In our present study, we have acquired the National Agriculture Imagery Program (NAIP) dataset for the Continental United States at a spatial resolution of 1-m. This data comes as image tiles (a total of a quarter million image scenes with 60 million pixels) and has a total size of 65 terabytes for a single acquisition. Features extracted from the entire dataset would amount to 8-10 petabytes. In our proposed approach, we have implemented a novel semi-automated machine learning algorithm rooted in the principles of "deep learning" to delineate the percentage of tree cover. Using the NASA Earth Exchange (NEX) initiative, we have developed an end-to-end architecture by integrating a segmentation module based on Statistical Region Merging, a classification algorithm using Deep Belief Network and a structured prediction algorithm using Conditional Random Fields to integrate the results from the segmentation and classification modules to create per-pixel class labels. The training process is scaled up using the power of GPUs and the prediction is scaled to a quarter million NAIP tiles spanning the whole of Continental United States using the NEX HPC supercomputing cluster. An initial pilot over the

  18. A hybrid approach of stepwise regression, logistic regression, support vector machine, and decision tree for forecasting fraudulent financial statements.

    PubMed

    Chen, Suduan; Goo, Yeong-Jia James; Shen, Zone-De

    2014-01-01

    As the fraudulent financial statement of an enterprise is increasingly serious with each passing day, establishing a valid forecasting fraudulent financial statement model of an enterprise has become an important question for academic research and financial practice. After screening the important variables using the stepwise regression, the study also matches the logistic regression, support vector machine, and decision tree to construct the classification models to make a comparison. The study adopts financial and nonfinancial variables to assist in establishment of the forecasting fraudulent financial statement model. Research objects are the companies to which the fraudulent and nonfraudulent financial statement happened between years 1998 to 2012. The findings are that financial and nonfinancial information are effectively used to distinguish the fraudulent financial statement, and decision tree C5.0 has the best classification effect 85.71%.

  19. A Hybrid Approach of Stepwise Regression, Logistic Regression, Support Vector Machine, and Decision Tree for Forecasting Fraudulent Financial Statements

    PubMed Central

    Goo, Yeong-Jia James; Shen, Zone-De

    2014-01-01

    As the fraudulent financial statement of an enterprise is increasingly serious with each passing day, establishing a valid forecasting fraudulent financial statement model of an enterprise has become an important question for academic research and financial practice. After screening the important variables using the stepwise regression, the study also matches the logistic regression, support vector machine, and decision tree to construct the classification models to make a comparison. The study adopts financial and nonfinancial variables to assist in establishment of the forecasting fraudulent financial statement model. Research objects are the companies to which the fraudulent and nonfraudulent financial statement happened between years 1998 to 2012. The findings are that financial and nonfinancial information are effectively used to distinguish the fraudulent financial statement, and decision tree C5.0 has the best classification effect 85.71%. PMID:25302338

  20. An ensemble classification approach for improved Land use/cover change detection

    NASA Astrophysics Data System (ADS)

    Chellasamy, M.; Ferré, T. P. A.; Humlekrog Greve, M.; Larsen, R.; Chinnasamy, U.

    2014-11-01

    Change Detection (CD) methods based on post-classification comparison approaches are claimed to provide potentially reliable results. They are considered to be the most obvious quantitative method in the analysis of Land Use Land Cover (LULC) changes, providing from-to change information. However, the performance of post-classification comparison approaches depends highly on the accuracy of classification of the individual images used for comparison. Hence, we present a classification approach that produces accurate classified results, which helps to obtain improved change detection results. Machine learning is a part of the broader framework in change detection, where neural networks have drawn much attention. Neural network algorithms adaptively estimate continuous functions from input data without mathematical representation of output dependence on input. A common practice for classification is to use a Multi-Layer-Perceptron (MLP) neural network with the backpropagation learning algorithm for prediction. To increase the ability of learning and prediction, multiple inputs (spectral, texture, topography, and multi-temporal information) are generally stacked to incorporate diversity of information. On the other hand, the literature claims that the backpropagation algorithm exhibits weak and unstable learning when multiple inputs are used with complex datasets characterized by mixed uncertainty levels. To address the problem of learning complex information, we propose an ensemble classification technique that incorporates multiple inputs for classification, unlike traditional stacking of multiple input data. In this paper, we present an Endorsement Theory based ensemble classification that integrates multiple information, in terms of prediction probabilities, to produce final classification results. Three different input datasets are used in this study: spectral, texture and indices, from SPOT-4 multispectral imagery captured in 1998 and 2003. Each SPOT image is classified

  1. A High Performance Computing Approach to Tree Cover Delineation in 1-m NAIP Imagery using a Probabilistic Learning Framework

    NASA Astrophysics Data System (ADS)

    Basu, S.; Ganguly, S.; Michaelis, A.; Votava, P.; Roy, A.; Mukhopadhyay, S.; Nemani, R. R.

    2015-12-01

    Tree cover delineation is a useful instrument in deriving Above Ground Biomass (AGB) density estimates from Very High Resolution (VHR) airborne imagery data. Numerous algorithms have been designed to address this problem, but most of them do not scale to these datasets which are of the order of terabytes. In this paper, we present a semi-automated probabilistic framework for the segmentation and classification of 1-m National Agriculture Imagery Program (NAIP) for tree-cover delineation for the whole of Continental United States, using a High Performance Computing Architecture. Classification is performed using a multi-layer Feedforward Backpropagation Neural Network and segmentation is performed using a Statistical Region Merging algorithm. The results from the classification and segmentation algorithms are then consolidated into a structured prediction framework using a discriminative undirected probabilistic graphical model based on Conditional Random Field, which helps in capturing the higher order contextual dependencies between neighboring pixels. Once the final probability maps are generated, the framework is updated and re-trained by relabeling misclassified image patches. This leads to a significant improvement in the true positive rates and reduction in false positive rates. The tree cover maps were generated for the whole state of California, spanning a total of 11,095 NAIP tiles covering a total geographical area of 163,696 sq. miles. The framework produced true positive rates of around 88% for fragmented forests and 74% for urban tree cover areas, with false positive rates lower than 2% for both landscapes. Comparative studies with the National Land Cover Data (NLCD) algorithm and the LiDAR canopy height model (CHM) showed the effectiveness of our framework for generating accurate high-resolution tree-cover maps.
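    The true positive and false positive rates quoted above follow from a per-pixel confusion count. The sketch below shows the computation on a toy label vector; the data are invented for illustration and are unrelated to the reported 88%, 74%, and 2% figures.

```python
def tpr_fpr(truth, pred):
    """Return (true positive rate, false positive rate) for binary labels."""
    tp = sum(1 for t, p in zip(truth, pred) if t and p)
    fn = sum(1 for t, p in zip(truth, pred) if t and not p)
    fp = sum(1 for t, p in zip(truth, pred) if not t and p)
    tn = sum(1 for t, p in zip(truth, pred) if not t and not p)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr

# Hypothetical per-pixel labels: 1 = tree cover, 0 = no tree cover.
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

tpr, fpr = tpr_fpr(truth, pred)
print(tpr, fpr)
```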

  2. A High Performance Computing Approach to Tree Cover Delineation in 1-m NAIP Imagery Using a Probabilistic Learning Framework

    NASA Technical Reports Server (NTRS)

    Basu, Saikat; Ganguly, Sangram; Michaelis, Andrew; Votava, Petr; Roy, Anshuman; Mukhopadhyay, Supratik; Nemani, Ramakrishna

    2015-01-01

    Tree cover delineation is a useful instrument in deriving Above Ground Biomass (AGB) density estimates from Very High Resolution (VHR) airborne imagery data. Numerous algorithms have been designed to address this problem, but most of them do not scale to these datasets, which are of the order of terabytes. In this paper, we present a semi-automated probabilistic framework for the segmentation and classification of 1-m National Agriculture Imagery Program (NAIP) for tree-cover delineation for the whole of Continental United States, using a High Performance Computing Architecture. Classification is performed using a multi-layer Feedforward Backpropagation Neural Network and segmentation is performed using a Statistical Region Merging algorithm. The results from the classification and segmentation algorithms are then consolidated into a structured prediction framework using a discriminative undirected probabilistic graphical model based on Conditional Random Field, which helps in capturing the higher order contextual dependencies between neighboring pixels. Once the final probability maps are generated, the framework is updated and re-trained by relabeling misclassified image patches. This leads to a significant improvement in the true positive rates and reduction in false positive rates. The tree cover maps were generated for the whole state of California, spanning a total of 11,095 NAIP tiles covering a total geographical area of 163,696 sq. miles. The framework produced true positive rates of around 88% for fragmented forests and 74% for urban tree cover areas, with false positive rates lower than 2% for both landscapes. Comparative studies with the National Land Cover Data (NLCD) algorithm and the LiDAR canopy height model (CHM) showed the effectiveness of our framework for generating accurate high-resolution tree-cover maps.

  3. Breadth-first search approach to enumeration of tree-like chemical compounds.

    PubMed

    Zhao, Yang; Hayashida, Morihiro; Jindalertudomdee, Jira; Nagamochi, Hiroshi; Akutsu, Tatsuya

    2013-12-01

    Molecular enumeration plays a basic role in the design of drugs, which has been studied by mathematicians, computer scientists, and chemists for quite a long time. Although many researchers are involved in developing enumeration algorithms specific to drug design systems, molecular enumeration is still a hard problem to date due to its exponentially increasing large search space with larger number of atoms. To alleviate this defect, we propose efficient algorithms, BfsSimEnum and BfsMulEnum to enumerate tree-like molecules without and with multiple bonds, respectively, where chemical compounds are represented as molecular graphs. In order to reduce the large search space, we adjust some important concepts such as left-heavy, center-rooted, and normal form to molecular tree graphs. Different from many existing approaches, BfsSimEnum and BfsMulEnum firstly enumerate tree-like compounds by breadth-first search order. Computational experiments are performed to compare with several existing methods. The results suggest that our proposed methods are exact and more efficient.

  4. UAV based tree height estimation in apple orchards: potential of multiple approaches

    NASA Astrophysics Data System (ADS)

    Mejia-Aguilar, Abraham; Tomelleri, Enrico; Vilardi, Andrea; Zebisch, Marc

    2015-04-01

    Canopy height, as part of vegetation structure, is important for ecological studies on biomass, matter flows or meteorology. Measuring the growth of canopy can be undertaken by the use of multiple remote sensing techniques. In this study, we first use data generated from an Unmanned Aerial Vehicle (UAV) with simultaneous consumer-grade RGB and modified IR cameras, configured in nadir and multi-angle views, to generate 3D models for the Digital Surface Model (DSM) and Digital Terrain Model (DTM) in order to estimate tree height in apple orchards in South Tyrol, Italy. We evaluate the use of Ground Control Points (GCP) to minimize the error in scale and orientation. Then, we validate and compare the results of our primary data collection with data generated by geolocated field measurements over several selected tree species. Additionally, we compare DSM and DTM obtained from a recent 1-meter resolution LIDAR (Light Detection and Ranging) campaign. The main purpose of this study is to contrast multiple estimation approaches and evaluate their utility for the estimation of canopy height, highlighting the use of UAV systems as a fast, reliable and inexpensive technique, especially for small scale applications. The study is conducted in a homogeneous tree canopy consisting of apple orchards located in Caldaro, South Tyrol, Italy. We end by proposing a potential low-cost application combining models for DSM from the UAV with DTM obtained from LIDAR for applications that should be updated frequently.
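    Combining a DSM with a DTM to estimate height is commonly done by subtracting the terrain elevation from the surface elevation cell by cell. The sketch below assumes two small co-registered grids with made-up elevations in metres; the grids, the zero floor for negative differences, and the use of the grid maximum as the tree height are illustrative assumptions, not the study's workflow.

```python
def canopy_height_model(dsm, dtm):
    """Per-cell canopy height: surface minus terrain elevation, floored at 0."""
    return [
        [max(s - t, 0.0) for s, t in zip(surface_row, terrain_row)]
        for surface_row, terrain_row in zip(dsm, dtm)
    ]

# Hypothetical co-registered elevation grids in metres.
dsm = [[251.2, 253.9, 251.0],
       [251.1, 254.3, 251.3]]
dtm = [[251.0, 250.9, 251.1],
       [251.0, 251.0, 251.2]]

chm = canopy_height_model(dsm, dtm)

# Take the highest cell in the window as the height of the single tree it covers.
tree_height = max(max(row) for row in chm)
print(round(tree_height, 2))
```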

  5. A latent discriminative model-based approach for classification of imaginary motor tasks from EEG data

    NASA Astrophysics Data System (ADS)

    Delgado Saa, Jaime F.; Çetin, Müjdat

    2012-04-01

    We consider the problem of classification of imaginary motor tasks from electroencephalography (EEG) data for brain-computer interfaces (BCIs) and propose a new approach based on hidden conditional random fields (HCRFs). HCRFs are discriminative graphical models that are attractive for this problem because they (1) exploit the temporal structure of EEG; (2) include latent variables that can be used to model different brain states in the signal; and (3) involve learned statistical models matched to the classification task, avoiding some of the limitations of generative models. Our approach involves spatial filtering of the EEG signals and estimation of power spectra based on autoregressive modeling of temporal segments of the EEG signals. Given this time-frequency representation, we select certain frequency bands that are known to be associated with execution of motor tasks. These selected features constitute the data that are fed to the HCRF, parameters of which are learned from training data. Inference algorithms on the HCRFs are used for the classification of motor tasks. We experimentally compare this approach to the best performing methods in BCI competition IV as well as a number of more recent methods and observe that our proposed method yields better classification accuracy.

  6. A stochastic based approach for a new site classification method: application to the Algerian seismic code

    NASA Astrophysics Data System (ADS)

    Beneldjouzi, Mohamed; Laouami, Nasser

    2015-12-01

    Building codes have widely considered the shear wave velocity to make a reliable subsoil seismic classification, based on the knowledge of the mechanical properties of material deposits down to bedrock. This approach has limitations because geophysical data are often very expensive to obtain. Recently, other alternatives have been proposed based on measurements of background noise and estimation of the H/V amplification curve. However, the use of this technique needs a regulatory framework before it can become a realistic site classification procedure. This paper proposes a new formulation for characterizing design sites in accordance with the Algerian seismic building code (RPA99/ver.2003), through transfer functions, by following a stochastic approach combined with a statistical study. For each soil type, the deterministic calculation of the average transfer function is performed over a wide sample of 1-D soil profiles, where the average shear wave (S-W) velocity, Vs, in soil layers is simulated using random field theory. Average transfer functions are also used to calculate average site factors and normalized acceleration response spectra to highlight the amplification potential of each site type, since the frequency content of the transfer function is significantly similar to that of the H/V amplification curve. Comparison is done with the RPA99/ver.2003 and Eurocode8 (EC8) design response spectra, respectively. In the absence of geophysical data, the proposed classification approach together with micro-tremor measurements can be used toward a better soil classification.

  7. Integrative Chemical-Biological Read-Across Approach for Chemical Hazard Classification

    PubMed Central

    Low, Yen; Sedykh, Alexander; Fourches, Denis; Golbraikh, Alexander; Whelan, Maurice; Rusyn, Ivan; Tropsha, Alexander

    2013-01-01

    Traditional read-across approaches typically rely on the chemical similarity principle to predict chemical toxicity; however, the accuracy of such predictions is often inadequate due to the underlying complex mechanisms of toxicity. Here we report on the development of a hazard classification and visualization method that draws upon both chemical structural similarity and comparisons of biological responses to chemicals measured in multiple short-term assays ("biological" similarity). The Chemical-Biological Read-Across (CBRA) approach infers each compound's toxicity from those of both chemical and biological analogs whose similarities are determined by the Tanimoto coefficient. Classification accuracy of CBRA was compared to that of classical RA and other methods using chemical descriptors alone, or in combination with biological data. Different types of adverse effects (hepatotoxicity, hepatocarcinogenicity, mutagenicity, and acute lethality) were classified using several biological data types (gene expression profiling and cytotoxicity screening). CBRA-based hazard classification exhibited consistently high external classification accuracy and applicability to diverse chemicals. Transparency of the CBRA approach is aided by the use of radial plots that show the relative contribution of analogous chemical and biological neighbors. Identification of both chemical and biological features that give rise to the high accuracy of CBRA-based toxicity prediction facilitates mechanistic interpretation of the models. PMID:23848138
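    The Tanimoto coefficient mentioned above can be sketched for binary feature sets, together with a simple similarity-weighted read-across score. The set-based form of the coefficient, the feature sets, the toxicity labels, and the weighting scheme are all assumptions made for illustration; the abstract does not give the exact CBRA formula.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two feature sets: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    union = len(a | b)
    # Convention assumed here: two empty sets have similarity 0.
    return len(a & b) / union if union else 0.0

def read_across_score(query, neighbors):
    """Similarity-weighted mean of neighbor labels (1 = toxic, 0 = non-toxic)."""
    weights = [tanimoto(query, features) for features, _ in neighbors]
    total = sum(weights)
    if total == 0:
        return 0.0
    return sum(w * label for w, (_, label) in zip(weights, neighbors)) / total

# Hypothetical feature sets (e.g. indices of fingerprint bits that are set).
query = {1, 2, 3, 4}
neighbors = [({1, 2, 3}, 1), ({3, 4, 5, 6}, 0)]

score = read_across_score(query, neighbors)
print(round(score, 3))
```

    A score above a chosen cut-off, for example 0.5, would classify the query compound as toxic under this simplified scheme.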

  8. Integrative chemical-biological read-across approach for chemical hazard classification.

    PubMed

    Low, Yen; Sedykh, Alexander; Fourches, Denis; Golbraikh, Alexander; Whelan, Maurice; Rusyn, Ivan; Tropsha, Alexander

    2013-08-19

    Traditional read-across approaches typically rely on the chemical similarity principle to predict chemical toxicity; however, the accuracy of such predictions is often inadequate due to the underlying complex mechanisms of toxicity. Here, we report on the development of a hazard classification and visualization method that draws upon both chemical structural similarity and comparisons of biological responses to chemicals measured in multiple short-term assays ("biological" similarity). The Chemical-Biological Read-Across (CBRA) approach infers each compound's toxicity from both chemical and biological analogues whose similarities are determined by the Tanimoto coefficient. Classification accuracy of CBRA was compared to that of classical RA and other methods using chemical descriptors alone or in combination with biological data. Different types of adverse effects (hepatotoxicity, hepatocarcinogenicity, mutagenicity, and acute lethality) were classified using several biological data types (gene expression profiling and cytotoxicity screening). CBRA-based hazard classification exhibited consistently high external classification accuracy and applicability to diverse chemicals. Transparency of the CBRA approach is aided by the use of radial plots that show the relative contribution of analogous chemical and biological neighbors. Identification of both chemical and biological features that give rise to the high accuracy of CBRA-based toxicity prediction facilitates mechanistic interpretation of the models.

  9. A new approach to the hazard classification of alloys based on transformation/dissolution.

    PubMed

    Skeaff, James M; Hardy, David J; King, Pierrette

    2008-01-01

    Most of the metals produced for commercial application enter into service as alloys which, together with metals and all other chemicals in commerce, are subject to a hazard identification and classification initiative now being implemented in a number of jurisdictions worldwide, including the European Union Registration, Evaluation, Authorisation and Restriction of Chemicals (REACH) initiative, effective 1 June 2007. This initiative has considerable implications for environmental protection and market access. While a method for the hazard identification and classification of metals is available in the recently developed United Nations (UN) guidance document on the Globally Harmonized System of Hazard Classification and Labelling (GHS), an approach for alloys has yet to be formulated. Within the GHS, a transformation/dissolution protocol (T/DP) for metals and sparingly soluble metal compounds is provided as a standard laboratory method for measuring the rate and extent of the release of metals into aqueous media from metal-bearing substances. By comparison with ecotoxicity reference data, T/D data can be used to derive UN GHS classification proposals. In this study we applied the T/DP for the first time to several economically important metals and alloys: iron powder, nickel powder, copper powder, and the alloys Fe-2Cu-0.6C (copper = 2%, carbon = 0.6%), Fe-2Ni-0.6C, Stainless Steel 304, Monel, brass, Inconel, and nickel-silver. The iron and copper powders and the iron and nickel powders had been sintered to produce the Fe-2Me-0.6C (Me = copper or nickel) alloys, which made them essentially resistant to reaction with the aqueous media, so they would not classify under the GHS, although their component copper and nickel metal powders would. Forming a protective passivating film, chromium in the Stainless Steel 304 and Inconel alloys protected them from reaction with the aqueous media, so that their metal releases were minimal and would not result in GHS classification

  10. Stygoregions – a promising approach to a bioregional classification of groundwater systems

    PubMed Central

    Stein, Heide; Griebler, Christian; Berkhoff, Sven; Matzke, Dirk; Fuchs, Andreas; Hahn, Hans Jürgen

    2012-01-01

    Linked to diverse biological processes, groundwater ecosystems deliver essential services to mankind, the most important of which is the provision of drinking water. In contrast to surface waters, ecological aspects of groundwater systems are ignored by the current European Union and national legislation. Groundwater management and protection measures refer exclusively to its good physicochemical and quantitative status. Current initiatives in developing ecologically sound integrative assessment schemes by taking groundwater fauna into account depend on the initial classification of subsurface bioregions. In a large scale survey, the regional and biogeographical distribution patterns of groundwater dwelling invertebrates were examined for many parts of Germany. Following an exploratory approach, our results underline that the distribution patterns of invertebrates in groundwater are not in accordance with any existing bioregional classification system established for surface habitats. In consequence, we propose to develop a new classification scheme for groundwater ecosystems based on stygoregions. PMID:22993698

  11. An approach for classification of hydrogeological systems at the regional scale based on groundwater hydrographs

    NASA Astrophysics Data System (ADS)

    Haaf, Ezra; Barthel, Roland

    2016-04-01

    When assessing hydrogeological conditions at the regional scale, the analyst is often confronted with uncertainty of structures, inputs and processes while having to base inference on scarce and patchy data. Haaf and Barthel (2015) proposed a concept for handling this predicament by developing a groundwater systems classification framework, where information is transferred from similar, but well-explored and better understood to poorly described systems. The concept is based on the central hypothesis that similar systems react similarly to the same inputs and vice versa. It is conceptually related to PUB (Prediction in ungauged basins) where organization of systems and processes by quantitative methods is intended and used to improve understanding and prediction. Furthermore, using the framework it is expected that regional conceptual and numerical models can be checked or enriched by ensemble generated data from neighborhood-based estimators. In a first step, groundwater hydrographs from a large dataset in Southern Germany are compared in an effort to identify structural similarity in groundwater dynamics. A number of approaches for grouping hydrographs, mostly based on a similarity measure and previously used only in local-scale studies, can be found in the literature. These are tested alongside different global feature extraction techniques. The resulting classifications are then compared to a visual "expert assessment"-based classification which serves as a reference. A ranking of the classification methods is carried out and differences shown. Selected groups from the classifications are related to geological descriptors. Here we present the most promising results from a comparison of classifications based on series correlation, different series distances and series features, such as the coefficients of the discrete Fourier transform and the intrinsic mode functions of empirical mode decomposition. Additionally, we show examples of classes

  12. A robust approach for tree segmentation in deciduous forests using small-footprint airborne LiDAR data

    NASA Astrophysics Data System (ADS)

    Hamraz, Hamid; Contreras, Marco A.; Zhang, Jun

    2016-10-01

    This paper presents a non-parametric approach for segmenting trees from airborne LiDAR data in deciduous forests. Based on the LiDAR point cloud, the approach collects crown information such as steepness and height on-the-fly to delineate crown boundaries, and most importantly, does not require a priori assumptions of crown shape and size. The approach segments trees iteratively starting from the tallest within a given area to the smallest until all trees have been segmented. To evaluate its performance, the approach was applied to the University of Kentucky Robinson Forest, a deciduous closed-canopy forest with complex terrain and vegetation conditions. The approach identified 94% of dominant and co-dominant trees with a false detection rate of 13%. About 62% of intermediate, overtopped, and dead trees were also detected with a false detection rate of 15%. The overall segmentation accuracy was 77%. Correlations of the segmentation scores of the proposed approach with local terrain and stand metrics were not significant, which likely indicates the robustness of the approach, as results are not sensitive to differences in terrain and stand structures.

  13. Application Of Decision Tree Approach To Student Selection Model- A Case Study

    NASA Astrophysics Data System (ADS)

    Harwati; Sudiya, Amby

    2016-01-01

    The main purpose of the institution is to provide quality education to its students and to improve the quality of managerial decisions. One way to improve the quality of students is to make the selection of new students more selective. This research takes as its case the selection of new students at the Islamic University of Indonesia, Yogyakarta, Indonesia. One of the university's selection routes is an administrative screening based on the high school records of prospective students, without a written test. Currently, this kind of selection does not have a standard model or criteria. Selection is done only by comparing candidates' application files, so assessment is likely to be subjective because there are no standard criteria that differentiate the quality of one student from another. By applying data mining classification techniques, a selection model for new students can be built that includes standard criteria such as area of origin, school status, average grade, and so on. These criteria are determined using rules derived from classifying the academic achievement (GPA) of students who entered the university through the same route in previous years. The decision tree method with the C4.5 algorithm is used here. The results show that the students given priority for admission are those who meet the following criteria: they come from the island of Java, attended a public school, majored in science, have an average grade above 75, and have at least one achievement during their high school studies.
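
    C4.5 chooses the attribute to split on by its gain ratio, that is, the information gain divided by the entropy of the split itself. The Python sketch below shows that calculation on a toy table. The attribute names (`school_status`, `major`, `gpa_class`) and the four applicant records are hypothetical and are not taken from the study; a full C4.5 implementation would also handle numeric thresholds, missing values and pruning.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def gain_ratio(rows, attribute, target):
    """Gain ratio of splitting `rows` on a categorical `attribute`."""
    labels = [row[target] for row in rows]
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    n = len(rows)
    # Expected entropy of the target after the split.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    # Split information penalises attributes with many small branches.
    split_info = entropy([row[attribute] for row in rows])
    return gain / split_info if split_info > 0 else 0.0


# Hypothetical applicant records with a GPA class as the target.
applicants = [
    {"school_status": "public", "major": "science", "gpa_class": "high"},
    {"school_status": "public", "major": "social", "gpa_class": "high"},
    {"school_status": "private", "major": "science", "gpa_class": "low"},
    {"school_status": "private", "major": "social", "gpa_class": "low"},
]
for attr in ("school_status", "major"):
    print(attr, round(gain_ratio(applicants, attr, "gpa_class"), 3))
```

    In this toy table `school_status` separates the classes perfectly and `major` carries no information, so a tree builder would split on `school_status` first.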

  14. Geographical characterization of Greek virgin olive oils (cv. Koroneiki) using 1H and 31P NMR fingerprinting with canonical discriminant analysis and classification binary trees.

    PubMed

    Petrakis, Panos V; Agiomyrgianaki, Alexia; Christophoridou, Stella; Spyros, Apostolos; Dais, Photis

    2008-05-14

    This work deals with the prediction of the geographical origin of monovarietal virgin olive oil (cv. Koroneiki) samples from three regions of southern Greece, namely, Peloponnesus, Crete, and Zakynthos, and collected in five harvesting years (2001-2006). All samples were chemically analyzed by means of 1H and 31P NMR spectroscopy and characterized according to their content of fatty acids, phenolics, diacylglycerols, total free sterols, free acidity, and iodine number. Biostatistical analysis showed that the fruiting pattern of the olive tree complicates the geographical separation of oil samples and the selection of significant chemical compounds. Consequently, including the harvesting year improved the classification of samples but increased the dimensionality of the data. Discriminant analysis showed that geographical prediction accuracy is very high (87%) at the level of three regions and falls to 74% at the finer level of six sites (Chania, Sitia, and Heraklion in Crete; Lakonia and Messinia in Peloponnesus; Zakynthos). The use of classification and binary trees made possible the construction of a geographical prediction algorithm for unknown samples in a self-improvement fashion, which can be readily extended to other varieties and areas.

  15. Ship classification using nonlinear features of radiated sound: an approach based on empirical mode decomposition.

    PubMed

    Bao, Fei; Li, Chen; Wang, Xinlong; Wang, Qingfu; Du, Shuanping

    2010-07-01

    Classification for ship-radiated underwater sound is one of the most important and challenging subjects in underwater acoustical signal processing. An approach to ship classification is proposed in this work based on analysis of ship-radiated acoustical noise in subspaces of intrinsic mode functions attained via the ensemble empirical mode decomposition. It is shown that detection and acquisition of stable and reliable nonlinear features become practically feasible by nonlinear analysis of the time series of individual decomposed components, each of which is simple enough and well represents an oscillatory mode of ship dynamics. Surrogate and nonlinear predictability analysis are conducted to probe and measure the nonlinearity and regularity. The results of both methods, which verify each other, substantiate that ship-radiated noises contain components with deterministic nonlinear features well serving for efficient classification of ships. The approach perhaps opens an alternative avenue in the direction toward object classification and identification. It may also import a new view of signals as complex as ship-radiated sound.

  16. A Novel Approach to Probabilistic Biomarker-Based Classification Using Functional Near-Infrared Spectroscopy

    PubMed Central

    Hahn, Tim; Marquand, Andre F; Plichta, Michael M; Ehlis, Ann-Christine; Schecklmann, Martin W; Dresler, Thomas; Jarczok, Tomasz A; Eirich, Elisa; Leonhard, Christine; Reif, Andreas; Lesch, Klaus-Peter; Brammer, Michael J; Mourao-Miranda, Janaina; Fallgatter, Andreas J

    2013-01-01

    Pattern recognition approaches to the analysis of neuroimaging data have brought new applications such as the classification of patients and healthy controls within reach. In our view, the reliance on expensive neuroimaging techniques which are not well tolerated by many patient groups and the inability of most current biomarker algorithms to accommodate information about prior class frequencies (such as a disorder's prevalence in the general population) are key factors limiting practical application. To overcome both limitations, we propose a probabilistic pattern recognition approach based on cheap and easy-to-use multi-channel near-infrared spectroscopy (fNIRS) measurements. We show the validity of our method by applying it to data from healthy controls (n = 14) enabling differentiation between the conditions of a visual checkerboard task. Second, we show that high-accuracy single subject classification of patients with schizophrenia (n = 40) and healthy controls (n = 40) is possible based on temporal patterns of fNIRS data measured during a working memory task. For classification, we integrate spatial and temporal information at each channel to estimate overall classification accuracy. This yields an overall accuracy of 76% which is comparable to the highest ever achieved in biomarker-based classification of patients with schizophrenia. In summary, the proposed algorithm in combination with fNIRS measurements enables the analysis of sub-second, multivariate temporal patterns of BOLD responses and high-accuracy predictions based on low-cost, easy-to-use fNIRS patterns. In addition, our approach can easily compensate for variable class priors, which is highly advantageous in making predictions in a wide range of clinical neuroimaging applications. Hum Brain Mapp, 2013. © 2012 Wiley Periodicals, Inc. PMID:22965654
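
    The abstract above stresses that a probabilistic classifier can compensate for class priors such as a disorder's prevalence. A generic way to see the effect is to rescale a posterior by the ratio of the target prior to the training prior and renormalise, which follows from Bayes' rule. The Python sketch below illustrates this; it is not the authors' algorithm, and the posterior value and the 1% prevalence are hypothetical numbers chosen only for the example.

```python
def adjust_for_prior(posterior, train_prior, target_prior):
    """Re-weight class posteriors for a new prior and renormalise.

    posterior:    p(class | x) from a model trained under `train_prior`
    train_prior:  class frequencies in the training sample
    target_prior: class frequencies in the population of interest
    """
    weighted = {
        c: posterior[c] * target_prior[c] / train_prior[c] for c in posterior
    }
    z = sum(weighted.values())
    return {c: w / z for c, w in weighted.items()}


# A balanced training sample (equal numbers of patients and controls).
train_prior = {"patient": 0.5, "control": 0.5}
# Hypothetical population prevalence, for illustration only.
target_prior = {"patient": 0.01, "control": 0.99}
# Hypothetical classifier output for one subject.
posterior = {"patient": 0.8, "control": 0.2}

adjusted = adjust_for_prior(posterior, train_prior, target_prior)
print({c: round(p, 4) for c, p in adjusted.items()})
```

    With these illustrative numbers, a score that looks decisive in a balanced sample corresponds to a much smaller probability once a low prevalence is taken into account.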

  17. Time-dependent approach for single trial classification of covert visuospatial attention

    NASA Astrophysics Data System (ADS)

    Tonin, L.; Leeb, R.; Millán, J. del R.

    2012-08-01

    Recently, several studies have started to explore covert visuospatial attention as a control signal for brain-computer interfaces (BCIs). Covert visuospatial attention represents the ability to change the focus of attention from one point in the space without overt eye movements. Nevertheless, the full potential and possible applications of this paradigm remain relatively unexplored. Voluntary covert visuospatial attention might allow a more natural and intuitive interaction with real environments as neither stimulation nor gazing is required. In order to identify brain correlates of covert visuospatial attention, classical approaches usually rely on the whole α-band over long time intervals. In this work, we propose a more detailed analysis in the frequency and time domains to enhance classification performance. In particular, we investigate the contribution of α sub-bands and the role of time intervals in carrying information about visual attention. Previous neurophysiological studies have already highlighted the role of temporal dynamics in attention mechanisms. However, these important aspects are not yet exploited in BCI. In this work, we studied different methods that explicitly cope with the natural brain dynamics during visuospatial attention tasks in order to enhance BCI robustness and classification performances. Results with ten healthy subjects demonstrate that our approach identifies spectro-temporal patterns that outperform the state-of-the-art classification method. On average, our time-dependent classification reaches 0.74 ± 0.03 of the area under the ROC (receiver operating characteristic) curve (AUC) value with an increase of 12.3% with respect to standard methods (0.65 ± 0.4). In addition, the proposed approach allows faster classification (<1 instead of 3 s), without compromising performances. Finally, our analysis highlights the fact that discriminant patterns are not stable for the whole trial period but are changing over short time
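
    Performance in the abstract above is reported as the area under the ROC curve (AUC). The AUC equals the probability that a randomly chosen positive trial receives a higher classifier score than a randomly chosen negative trial, with ties counted as one half. The Python sketch below computes it directly from that definition; the scores are hypothetical and unrelated to the study's data.

```python
def auc(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical classifier outputs for two classes of trials.
attend_left = [0.9, 0.8, 0.4]
attend_right = [0.7, 0.3, 0.2]
print(round(auc(attend_left, attend_right), 3))
```

    The pairwise loop is quadratic in the number of trials, which is acceptable for small sets; a rank-based formula gives the same value faster for large ones.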

  18. Ecophysiological responses of trees to long-term N deposition: a multi isotopes approach

    NASA Astrophysics Data System (ADS)

    Battipaglia, G.; Lubritto, C.; Altieri, S.; Marzaioli, F.; Cherubini, P.; Cotrufo, M. F.

    2009-04-01

    Anthropogenic emissions of nitrogen compounds, principally derived from the burning of fossil fuels, have led to regional changes in atmospheric and precipitation chemistry. The fate and environmental consequences of these changes on ecosystem functions and on forest growth have attracted considerable research. δ15N measurements have been used successfully for detecting changes in N deposition and incorporation of atmospheric N into leaves (Siegwolf et al., 2001) and tree rings (Poulson et al., 1995; Saurer et al., 2004; Guerrieri et al., 2009). We show the main results arising from a study of mature Pinus pinea individuals exposed to large amounts of traffic exhaust for 20 years. Specifically, we examined the time-related trend in the growth residuals through dendrochronological analysis and C and N isotopes. A consistent decrease in ring width starting from 1980, with a slight increase in δ13C value, was found as a consequence of an environmental stress event. Moreover, the effect of fossil-source 14C dilution on the atmospheric bomb-enriched background has been detected in tree rings over the last decades, as a consequence of the increased uptake of traffic exhaust. The great variability in δ15N values of tree rings with time underlines the difficulties we encountered in using N as an environmental tool and opens new questions and research avenues. Guerrieri M.R., Siegwolf R.T.W., Saurer M., Jäggi M., Cherubini., Ripullone F., Borghetti M., (2009) "Impact of different nitrogen emission sources on tree physiology as assessed by a triple stable isotope approach" Atmospheric Environment 43:410-418. Pearson J., Wellis D.M., Seller K.J., Bennet A., Soares A., Woodall J., Ingroulle M.J. (2000). Traffic exposure increases natural 15N and heavy metal concentrations in mosses. New Phytologist 147: 317-326. Siegwolf R.T.W., Matyssek R., Saurer M., Maurer S., Günthardt-Georg M.S., Schmutz P. and Bucher J.B. "Stable isotope analysis reveals differential effects of

  19. What determines tree mortality in dry environments? A multi-perspective approach.

    PubMed

    Dorman, Michael; Svoray, Tal; Perevolotsky, Avi; Moshe, Yitzhak; Sarris, Dimitrios

    2015-06-01

    dendrochronological and remotely sensed performance indicators, in contrast to potential bias when using a single approach. For example, dendrochronological data suggested highly resilient tree growth, since it was based only on the "surviving" portion of the population, thus failing to identify past demographic changes evident through remote sensing. We therefore suggest that evaluation of forest resilience should be based on several metrics, each suited for detecting transitions at a different level of organization.

  20. Bag-of-features approach for improvement of lung tissue classification in diffuse lung disease

    NASA Astrophysics Data System (ADS)

    Kato, Noriji; Fukui, Motofumi; Isozaki, Takashi

    2009-02-01

    Many automated techniques have been proposed to classify diffuse lung disease patterns. Most of the techniques utilize texture analysis approaches with second and higher order statistics, and show successful classification results across various lung tissue patterns. However, the approaches do not work well for patterns with inhomogeneous texture distribution within a region of interest (ROI), such as reticular and honeycombing patterns, because the statistics can only capture features averaged over the ROI. In this work, we have introduced the bag-of-features approach to overcome this difficulty. In the approach, texture images are represented as histograms or distributions of a few basic primitives, which are obtained by clustering local image features. The intensity descriptor and the Scale Invariant Feature Transform (SIFT) descriptor are utilized to extract the local features, which have significant discriminatory power due to their specificity to a particular image class. In contrast, the drawback of the local features is a lack of invariance under translation and rotation. We improved the invariance by sampling many local regions so that the distribution of the local features is unchanged. We evaluated the performance of our system in the classification task with 5 image classes (ground glass, reticular, honeycombing, emphysema, and normal) using 1109 ROIs from 211 patients. Our system achieved a high classification accuracy of 92.8%, which is superior to that of the conventional system with the gray level co-occurrence matrix (GLCM) feature, especially for inhomogeneous texture patterns.
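
    The core step of the bag-of-features representation described above can be sketched in a few lines of Python: each local descriptor is assigned to its nearest primitive (codeword), and the ROI is summarised by the normalised histogram of those assignments. The sketch assumes the codebook has already been obtained by clustering, and the two-dimensional descriptors and codewords are hypothetical stand-ins for real intensity or SIFT descriptors.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_codeword(codebook, descriptor):
    """Index of the codeword closest to the descriptor."""
    return min(
        range(len(codebook)),
        key=lambda i: squared_distance(codebook[i], descriptor),
    )


def bag_of_features(descriptors, codebook):
    """Normalised histogram of codeword occurrences for one ROI."""
    counts = [0] * len(codebook)
    for d in descriptors:
        counts[nearest_codeword(codebook, d)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(codebook)


# Hypothetical codebook (cluster centres) and local descriptors of one ROI.
codebook = [(0.0, 0.0), (10.0, 10.0)]
roi_descriptors = [(1.0, 0.0), (0.0, 1.0), (9.0, 10.0), (1.0, 1.0)]
histogram = bag_of_features(roi_descriptors, codebook)
print(histogram)
```

    Because the histogram discards where each descriptor was sampled, an ROI that mixes several local textures keeps one bin per primitive instead of a single averaged statistic.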

  1. Risk profiles for weight gain among postmenopausal women: A classification and regression tree analysis approach

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Risk factors for obesity and weight gain are typically evaluated individually while "adjusting for" the influence of other confounding factors, and few studies, if any, have created risk profiles by clustering risk factors. We identified subgroups of postmenopausal women homogeneous in their cluster...

  2. Factors Associated with Caregiver Stability in Permanent Placements: A Classification Tree Approach

    ERIC Educational Resources Information Center

    Proctor, Laura J.; Van Dusen Randazzo, Katherine; Litrownik, Alan J.; Newton, Rae R.; Davis, Inger P.; Villodas, Miguel

    2011-01-01

    Objective: Identify individual and environmental variables associated with caregiver stability and instability for children in diverse permanent placement types (i.e., reunification, adoption, and long-term foster care/guardianship with relatives or non-relatives), following 5 or more months in out-of-home care prior to age 4 due to substantiated…

  3. In silico prediction of toxicity of phenols to Tetrahymena pyriformis by using genetic algorithm and decision tree-based modeling approach.

    PubMed

    Abbasitabar, Fatemeh; Zare-Shahabadi, Vahid

    2017-04-01

    Risk assessment of chemicals is an important issue in environmental protection; however, there is a huge lack of experimental data for a large number of end-points. The experimental determination of the toxicity of chemicals is costly and time-consuming. In silico tools such as quantitative structure-toxicity relationship (QSTR) models, which are constructed on the basis of computational molecular descriptors, can predict missing data for toxic end-points for existing or even not yet synthesized chemicals. Phenol derivatives are known to be aquatic pollutants. With this background, we aimed to develop an accurate and reliable QSTR model for the prediction of toxicity of 206 phenols to Tetrahymena pyriformis. A multiple linear regression (MLR)-based QSTR was obtained using a powerful descriptor selection tool named Memorized_ACO algorithm. The R² values of the model were 0.72 for the training set and 0.68 for the test set. To develop a high-quality QSTR model, classification and regression tree (CART) was employed. Two approaches were considered: (1) phenols were classified into different modes of action using CART and (2) the phenols in the training set were partitioned into several subsets by a tree in such a manner that in each subset, a high-quality MLR could be developed. For the first approach, the R² values of the resultant QSTR model improved to 0.83 (training) and 0.75 (test). Genetic algorithm was employed in the second approach to obtain an optimal tree, and it was shown that the final QSTR model provided excellent prediction accuracy for the training and test sets (R² of 0.91 and 0.93, respectively). The mean absolute error for the test set was computed as 0.1615.

  4. A new computer approach to mixed feature classification for forestry application

    NASA Technical Reports Server (NTRS)

    Kan, E. P.

    1976-01-01

    A computer approach for mapping mixed forest features (i.e., types, classes) from computer classification maps is discussed. Mixed features such as mixed softwood/hardwood stands are treated as admixtures of softwood and hardwood areas. Large-area mixed features are identified and small-area features neglected when the nominal size of a mixed feature can be specified. The computer program merges small isolated areas into surrounding areas by the iterative manipulation of the postprocessing algorithm that eliminates small connected sets. For a forestry application, computer-classified LANDSAT multispectral scanner data of the Sam Houston National Forest were used to demonstrate the proposed approach. The technique was successful in cleaning the salt-and-pepper appearance of multiclass classification maps and in mapping admixtures of softwood areas and hardwood areas. However, the computer-mapped mixed areas matched very poorly with the ground truth because of inadequate resolution and inappropriate definition of mixed features.
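
    The postprocessing step mentioned above, eliminating small connected sets by merging them into their surroundings, can be illustrated with a short Python sketch. It is an assumption-laden illustration rather than the program described in the abstract: it uses 4-connectivity, a single pass over the map, and reassigns each small region to the label that occurs most often along its border. The grid, the labels "S" (softwood) and "H" (hardwood), and `min_size` are hypothetical.

```python
from collections import Counter, deque


def merge_small_regions(grid, min_size):
    """Relabel connected regions smaller than `min_size` cells.

    Each small region takes the most frequent label on its border.
    Regions are found on the input grid with 4-connectivity.
    """
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    seen = [[False] * cols for _ in range(rows)]
    for r0 in range(rows):
        for c0 in range(cols):
            if seen[r0][c0]:
                continue
            label = grid[r0][c0]
            region, border = [], Counter()
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            while queue:  # breadth-first flood fill of one region
                r, c = queue.popleft()
                region.append((r, c))
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if not (0 <= nr < rows and 0 <= nc < cols):
                        continue
                    if grid[nr][nc] == label:
                        if not seen[nr][nc]:
                            seen[nr][nc] = True
                            queue.append((nr, nc))
                    else:
                        border[grid[nr][nc]] += 1
            if len(region) < min_size and border:
                new_label = border.most_common(1)[0][0]
                for r, c in region:
                    out[r][c] = new_label
    return out


# Hypothetical classification map with one isolated hardwood pixel.
classified = [
    ["S", "S", "S", "S"],
    ["S", "H", "S", "S"],
    ["S", "S", "S", "H"],
    ["H", "H", "H", "H"],
]
cleaned = merge_small_regions(classified, min_size=2)
for row in cleaned:
    print("".join(row))
```

    Repeating the pass until the map stops changing would approximate the iterative behaviour the abstract describes, and raising `min_size` corresponds to specifying a larger nominal feature size.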

  5. Image-Based Airborne Sensors: A Combined Approach for Spectral Signatures Classification through Deterministic Simulated Annealing

    PubMed Central

    Guijarro, María; Pajares, Gonzalo; Herrera, P. Javier

    2009-01-01

    The increasing technology of high-resolution image airborne sensors, including those on board Unmanned Aerial Vehicles, demands automatic solutions for processing, either on-line or off-line, the huge amounts of image data sensed during the flights. The classification of natural spectral signatures in images is one potential application. The current tendency in classification is oriented towards the combination of simple classifiers. In this paper we propose a combined strategy based on the Deterministic Simulated Annealing (DSA) framework. The simple classifiers used are the well tested supervised parametric Bayesian estimator and the Fuzzy Clustering. The DSA is an optimization approach, which minimizes an energy function. The main contribution of DSA is its ability to avoid local minima during the optimization process thanks to the annealing scheme. It outperforms simple classifiers used for the combination and some combined strategies, including a scheme based on the fuzzy cognitive maps and an optimization approach based on the Hopfield neural network paradigm. PMID:22399989

  6. Automatic Training Sample Selection for a Multi-Evidence Based Crop Classification Approach

    NASA Astrophysics Data System (ADS)

    Chellasamy, M.; Ferre, P. A. Ty; Humlekrog Greve, M.

    2014-09-01

    An approach to use the available agricultural parcel information to automatically select training samples for crop classification is investigated. Previous research addressed the multi-evidence crop classification approach using an ensemble classifier. This first produced confidence measures using three Multi-Layer Perceptron (MLP) neural networks trained separately with spectral, texture and vegetation indices; classification labels were then assigned based on Endorsement Theory. The present study proposes an approach to feed this ensemble classifier with automatically selected training samples. The available vector data representing crop boundaries with corresponding crop codes are used as a source for training samples. These vector data are created by farmers to support subsidy claims and are, therefore, prone to errors such as mislabeling of crop codes and boundary digitization errors. The proposed approach is named ECRA (Ensemble-based Cluster Refinement Approach). ECRA first automatically removes mislabeled samples and then selects the refined training samples in an iterative training-reclassification scheme. Mislabel removal is based on the expectation that mislabels in each class will be far from the cluster centroid. However, this must be a soft constraint, especially when working with a hypothesis space that does not contain a good approximation of the target classes. Difficulty in finding a good approximation often exists either due to less informative data or a large hypothesis space. Thus this approach uses the spectral, texture and indices domains in an ensemble framework to iteratively remove the mislabeled pixels from the crop clusters declared by the farmers. Once the clusters are refined, the selected border samples are used for final learning and the unknown samples are classified using the multi-evidence approach. The study is implemented with WorldView-2 multispectral imagery acquired for a study area containing 10 crop classes. The proposed

  7. A Comparative Assessment of the Influences of Human Impacts on Soil Cd Concentrations Based on Stepwise Linear Regression, Classification and Regression Tree, and Random Forest Models

    PubMed Central

    Qiu, Lefeng; Wang, Kai; Long, Wenli; Wang, Ke; Hu, Wei; Amable, Gabriel S.

    2016-01-01

    Soil cadmium (Cd) contamination has attracted a great deal of attention because of its detrimental effects on animals and humans. This study aimed to develop and compare the performances of stepwise linear regression (SLR), classification and regression tree (CART) and random forest (RF) models in the prediction and mapping of the spatial distribution of soil Cd and to identify likely sources of Cd accumulation in Fuyang County, eastern China. Soil Cd data from 276 topsoil (0–20 cm) samples were collected and randomly divided into calibration (222 samples) and validation datasets (54 samples). Auxiliary data, including detailed land use information, soil organic matter, soil pH, and topographic data, were incorporated into the models to simulate the soil Cd concentrations and further identify the main factors influencing soil Cd variation. The predictive models for soil Cd concentration exhibited acceptable overall accuracies (72.22% for SLR, 70.37% for CART, and 75.93% for RF). The SLR model exhibited the largest predicted deviation, with a mean error (ME) of 0.074 mg/kg, a mean absolute error (MAE) of 0.160 mg/kg, and a root mean squared error (RMSE) of 0.274 mg/kg, and the RF model produced the results closest to the observed values, with an ME of 0.002 mg/kg, an MAE of 0.132 mg/kg, and an RMSE of 0.198 mg/kg. The RF model also exhibited the greatest R² value (0.772). The CART model predictions closely followed, with ME, MAE, RMSE, and R² values of 0.013 mg/kg, 0.154 mg/kg, 0.230 mg/kg and 0.644, respectively. The three prediction maps generally exhibited similar and realistic spatial patterns of soil Cd contamination. The heavily Cd-affected areas were primarily located in the alluvial valley plain of the Fuchun River and its tributaries because of the dramatic industrialization and urbanization processes that have occurred there. The most important variable for explaining high levels of soil Cd accumulation was the presence of metal smelting industries. The
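
    The validation statistics quoted above (ME, MAE, RMSE and R²) can be computed from paired observed and predicted values as in the Python sketch below. Two conventions here are assumptions, since the abstract does not state them: errors are taken as predicted minus observed, and R² is computed as one minus the ratio of residual to total sum of squares. The concentrations are hypothetical values in mg/kg, not data from the study.

```python
import math


def regression_errors(observed, predicted):
    """ME, MAE, RMSE and R² for paired observations and predictions."""
    n = len(observed)
    residuals = [p - o for o, p in zip(observed, predicted)]
    ss_res = sum(e * e for e in residuals)
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return {
        "ME": sum(residuals) / n,                    # signed bias
        "MAE": sum(abs(e) for e in residuals) / n,   # average magnitude
        "RMSE": math.sqrt(ss_res / n),               # penalises large errors
        "R2": 1.0 - ss_res / ss_tot,                 # explained variance
    }


# Hypothetical soil Cd concentrations (mg/kg) at four validation sites.
observed = [0.10, 0.20, 0.30, 0.40]
predicted = [0.10, 0.25, 0.25, 0.40]
stats = regression_errors(observed, predicted)
print({name: round(value, 4) for name, value in stats.items()})
```

    An ME near zero with a clearly non-zero MAE, as in this toy example, means that over- and under-predictions cancel on average even though individual predictions are off.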

  8. A bag of cells approach for antinuclear antibodies HEp-2 image classification.

    PubMed

    Wiliem, Arnold; Hobson, Peter; Minchin, Rodney F; Lovell, Brian C

    2015-06-01

    The antinuclear antibody (ANA) test via indirect immunofluorescence applied on Human Epithelial type 2 (HEp-2) cells is a pathology test commonly used to identify connective tissue diseases (CTDs). Despite its effectiveness, the test is still considered labor intensive and time consuming. Applying image-based computer aided diagnosis (CAD) systems is one of the possible ways to address these issues. Ideally, a CAD system should be able to classify ANA HEp-2 images taken by a camera fitted to a fluorescence microscope. Unfortunately, most prior works have primarily focused on the HEp-2 cell image classification problem, which is only one of the early essential steps in the system pipeline. In this work we directly tackle the specimen image classification problem. We aim to develop a system that can be easily scaled and has competitive accuracy. ANA HEp-2 images, or ANA images, generally comprise a number of cells. Patterns exhibited in the cells are then used to make inferences about the ANA image pattern. To that end, we adapted a popular approach for general image classification problems, namely the bag of visual words approach. Each specimen is considered as a visual document containing visual vocabularies represented by its cells. A specimen image is then represented by a histogram of visual vocabulary occurrences. We name this the Bag of Cells approach. We studied the performance of the proposed approach on a set of images taken from 262 ANA positive patient sera. The results show the proposed approach has competitive performance compared to recent state-of-the-art approaches. Our proposal can also be extended to other tests that involve examining patterns of human cells to make inferences.
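
    The histogram representation described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the 2-D cell descriptors, the three-word codebook, and the function names are all hypothetical, and a real system would learn the codebook from training cells.

```python
import math
from collections import Counter

def nearest_codeword(feature, codebook):
    """Return the index of the codeword closest to `feature` (Euclidean distance)."""
    return min(range(len(codebook)), key=lambda k: math.dist(feature, codebook[k]))

def bag_of_cells_histogram(cell_features, codebook):
    """Represent one specimen as a normalized histogram of codeword occurrences."""
    if not cell_features:
        raise ValueError("a specimen needs at least one cell")
    counts = Counter(nearest_codeword(f, codebook) for f in cell_features)
    total = len(cell_features)
    return [counts[k] / total for k in range(len(codebook))]

# Hypothetical cell descriptors and codebook (illustrative values only).
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
specimen = [(0.1, 0.0), (0.9, 1.2), (1.1, 0.8), (4.8, 5.1)]
histogram = bag_of_cells_histogram(specimen, codebook)
print(histogram)
```

    The resulting fixed-length vector can be passed to any standard classifier, regardless of how many cells the specimen contains.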

  9. [Proposals for social class classification based on the Spanish National Classification of Occupations 2011 using neo-Weberian and neo-Marxist approaches].

    PubMed

    Domingo-Salvany, Antònia; Bacigalupe, Amaia; Carrasco, José Miguel; Espelt, Albert; Ferrando, Josep; Borrell, Carme

    2013-01-01

    In Spain, the new National Classification of Occupations (Clasificación Nacional de Ocupaciones [CNO-2011]) is substantially different to the 1994 edition, and requires adaptation of occupational social classes for use in studies of health inequalities. This article presents two proposals to measure social class: the new classification of occupational social class (CSO-SEE12), based on the CNO-2011 and a neo-Weberian perspective, and a social class classification based on a neo-Marxist approach. The CSO-SEE12 is the result of a detailed review of the CNO-2011 codes. In contrast, the neo-Marxist classification is derived from variables related to capital and organizational and skill assets. The proposed CSO-SEE12 consists of seven classes that can be grouped into a smaller number of categories according to study needs. The neo-Marxist classification consists of 12 categories in which home owners are divided into three categories based on capital goods and employed persons are grouped into nine categories composed of organizational and skill assets. These proposals are complemented by a proposed classification of educational level that integrates the various curricula in Spain and provides correspondences with the International Standard Classification of Education.

  10. Dynamic frequency feature selection based approach for classification of motor imageries.

    PubMed

    Luo, Jing; Feng, Zuren; Zhang, Jun; Lu, Na

    2016-08-01

    Electroencephalography (EEG) is one of the most popular techniques for recording brain activities such as motor imagery, but its low signal-to-noise ratio can lead to high classification error. Therefore, selecting the most discriminative features can be crucial for improving classification performance. However, the traditional feature selection methods employed in the brain-computer interface (BCI) field (e.g. Mutual Information-based Best Individual Feature (MIBIF), Mutual Information-based Rough Set Reduction (MIRSR) and cross-validation) mainly focus on the overall performance on all the trials in the training set, and thus may perform very poorly on some specific samples, which is not acceptable. To address this problem, a novel sequential forward feature selection approach called Dynamic Frequency Feature Selection (DFFS) is proposed in this paper. The DFFS method emphasizes the importance of misclassified samples rather than pursuing only high overall classification performance. In the DFFS-based classification scheme, the EEG data are first transformed to the frequency domain using Wavelet Packet Decomposition (WPD), and the result is employed as the candidate set for further discriminatory feature selection. The features are selected one by one in a boosting manner. After a feature is selected, the importance of the samples it classifies correctly is decreased, which is equivalent to increasing the importance of the misclassified samples. Therefore, a feature complementary to the current features can be selected in the next run. The selected features are then fed to a classifier trained by the random forest algorithm. Finally, a time series voting-based method is utilized to improve the classification performance. Comparisons between the DFFS-based approach and state-of-the-art methods on BCI competition IV data set 2b have been conducted, and they show the superiority of the proposed algorithm.
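
    The boosting-style selection loop described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the DFFS algorithm itself: each candidate feature is summarized by a hypothetical per-sample correctness vector, and the decay factor of 0.5 and all names are invented for the example.

```python
def select_features(correct, n_select, decay=0.5):
    """Pick features one by one, down-weighting samples already classified correctly.

    `correct[f][i]` is 1 if a classifier using feature `f` gets sample `i` right.
    """
    n_samples = len(next(iter(correct.values())))
    weights = [1.0 / n_samples] * n_samples
    chosen = []
    for _ in range(n_select):
        remaining = [f for f in correct if f not in chosen]
        # Score each remaining feature by the total weight of samples it gets right.
        best = max(
            remaining,
            key=lambda f: sum(w for w, ok in zip(weights, correct[f]) if ok),
        )
        chosen.append(best)
        # Shrink the weight of correctly classified samples, then renormalize,
        # which raises the relative importance of the misclassified ones.
        weights = [w * decay if ok else w for w, ok in zip(weights, correct[best])]
        total = sum(weights)
        weights = [w / total for w in weights]
    return chosen

# Hypothetical correctness vectors for three candidate features over five trials.
correct = {
    "f1": [1, 1, 1, 1, 0],
    "f2": [1, 1, 1, 0, 0],
    "f3": [0, 1, 1, 0, 1],
}
selected = select_features(correct, n_select=2)
print(selected)
```

    In this toy case f2 and f3 have the same unweighted accuracy, but after f1 is chosen the reweighting favors f3, because it covers the trial that f1 misses.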

  11. Detection of fallen trees in ALS point clouds using a Normalized Cut approach trained by simulation

    NASA Astrophysics Data System (ADS)

    Polewski, Przemyslaw; Yao, Wei; Heurich, Marco; Krzystek, Peter; Stilla, Uwe

    2015-07-01

    Downed dead wood is regarded as an important part of forest ecosystems from an ecological perspective, which drives the need for investigating its spatial distribution. Based on several studies, Airborne Laser Scanning (ALS) has proven to be a valuable remote sensing technique for obtaining such information. This paper describes a unified approach to the detection of fallen trees from ALS point clouds based on merging short segments into whole stems using the Normalized Cut algorithm. We introduce a new method of defining the segment similarity function for the clustering procedure, where the attribute weights are learned from labeled data. Based on a relationship between Normalized Cut's similarity function and a class of regression models, we show how to learn the similarity function by training a classifier. Furthermore, we propose using an appearance-based stopping criterion for the graph cut algorithm as an alternative to the standard Normalized Cut threshold approach. We set up a virtual fallen tree generation scheme to simulate complex forest scenarios with multiple overlapping fallen stems. This simulated data is then used as a basis to learn both the similarity function and the stopping criterion for Normalized Cut. We evaluate our approach on 5 plots from the strictly protected mixed mountain forest within the Bavarian Forest National Park using reference data obtained via a manual field inventory. The experimental results show that our method is able to detect up to 90% of fallen stems in plots having 30-40% overstory cover with a correctness exceeding 80%, even in quite complex forest scenes. Moreover, the performance for feature weights trained on simulated data is competitive with the case when the weights are calculated using a grid search on the test data, which indicates that the learned similarity function and stopping criterion can generalize well on new plots.

  12. Feature Selection and Classification of Electroencephalographic Signals: An Artificial Neural Network and Genetic Algorithm Based Approach.

    PubMed

    Erguzel, Turker Tekin; Ozekes, Serhat; Tan, Oguz; Gultekin, Selahattin

    2015-10-01

    Feature selection is an important step in many pattern recognition systems aiming to overcome the so-called curse of dimensionality. In this study, an optimized classification method was tested in 147 patients with major depressive disorder (MDD) treated with repetitive transcranial magnetic stimulation (rTMS). The performance of the combination of a genetic algorithm (GA) and a back-propagation (BP) neural network (BPNN) was evaluated using 6-channel pre-rTMS electroencephalographic (EEG) patterns of theta and delta frequency bands. The GA was first used to eliminate the redundant and less discriminant features to maximize classification performance. The BPNN was then applied to test the performance of the feature subset. Finally, classification performance using the subset was evaluated using 6-fold cross-validation. Although the slow bands of the frontal electrodes are widely used to collect EEG data for patients with MDD and provide quite satisfactory classification results, the outcomes of the proposed approach indicate noticeably increased overall accuracy of 89.12% and an area under the receiver operating characteristic (ROC) curve (AUC) of 0.904 using the reduced feature set.

  13. Computational Classification Approach to Profile Neuron Subtypes from Brain Activity Mapping Data.

    PubMed

    Li, Meng; Zhao, Fang; Lee, Jason; Wang, Dong; Kuang, Hui; Tsien, Joe Z

    2015-07-27

    The analysis of cell type-specific activity patterns during behaviors is important for better understanding of how neural circuits generate cognition, but has not been well explored from in vivo neurophysiological datasets. Here, we describe a computational approach to uncover distinct cell subpopulations from in vivo neural spike datasets. This method, termed "inter-spike-interval classification-analysis" (ISICA), is comprised of four major steps: spike pattern feature-extraction, pre-clustering analysis, clustering classification, and unbiased classification-dimensionality selection. By using two key features of spike dynamic - namely, gamma distribution shape factors and a coefficient of variation of inter-spike interval - we show that this ISICA method provides invariant classification for dopaminergic neurons or CA1 pyramidal cell subtypes regardless of the brain states from which spike data were collected. Moreover, we show that these ISICA-classified neuron subtypes underlie distinct physiological functions. We demonstrate that the uncovered dopaminergic neuron subtypes encoded distinct aspects of fearful experiences such as valence or value, whereas distinct hippocampal CA1 pyramidal cells responded differentially to ketamine-induced anesthesia. This ISICA method should be useful to better data mining of large-scale in vivo neural datasets, leading to novel insights into circuit dynamics associated with cognitions.
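
    The two spike-train features named above can be computed from spike times as in the following sketch. It assumes a method-of-moments estimate of the gamma shape factor (shape = mean^2 / variance), which may differ from the fitting procedure used in the paper; the spike times and function name are hypothetical.

```python
import statistics

def isi_features(spike_times):
    """Return (coefficient of variation, gamma shape estimate) of inter-spike intervals."""
    isis = [t2 - t1 for t1, t2 in zip(spike_times, spike_times[1:])]
    if len(isis) < 2:
        raise ValueError("need at least three spikes")
    mean = statistics.fmean(isis)
    sd = statistics.pstdev(isis)
    cv = sd / mean
    # Method-of-moments gamma shape: k = mean^2 / variance = 1 / CV^2.
    shape = (mean / sd) ** 2 if sd > 0 else float("inf")
    return cv, shape

# Hypothetical spike times in seconds; intervals are 1, 2, 1, 2.
cv, shape = isi_features([0.0, 1.0, 3.0, 4.0, 6.0])
print(cv, shape)
```

    A perfectly regular train gives a coefficient of variation of 0, while a Poisson-like train gives a value near 1 and a shape factor near 1.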

  14. Histogram Curve Matching Approaches for Object-based Image Classification of Land Cover and Land Use

    PubMed Central

    Toure, Sory I.; Stow, Douglas A.; Weeks, John R.; Kumar, Sunil

    2013-01-01

    The classification of image objects is usually done using parametric statistical measures of central tendency and/or dispersion (e.g., mean or standard deviation). The objectives of this study were to analyze digital number histograms of image objects and to evaluate classification measures that exploit characteristic signatures of such histograms. Two histogram matching classifiers were evaluated and compared to the standard nearest neighbor to mean classifier. An ADS40 airborne multispectral image of San Diego, California was used for assessing the utility of curve matching classifiers in a geographic object-based image analysis (GEOBIA) approach. The classifications were performed with data sets having 0.5 m, 2.5 m, and 5 m spatial resolutions. Results show that histograms are reliable features for characterizing classes. Also, both histogram matching classifiers consistently performed better than the one based on the standard nearest neighbor to mean rule. The highest classification accuracies were produced with images having 2.5 m spatial resolution. PMID:24403648
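
    The idea of classifying an image object by its histogram rather than by its mean can be sketched as follows. Histogram intersection is used here as an assumed similarity measure; the abstract does not say which curve matching measures the study evaluated, and the class names, bin count, and pixel values are hypothetical.

```python
def normalized_histogram(values, bins, lo, hi):
    """Bin `values` from the range [lo, hi) into `bins` equal-width bins, as fractions."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return [c / len(values) for c in counts]

def intersection(h1, h2):
    """Histogram intersection: 1.0 for identical normalized histograms, 0.0 for disjoint."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def classify(obj_values, class_hists, bins=4, lo=0, hi=256):
    """Assign the object to the class whose reference histogram matches best."""
    h = normalized_histogram(obj_values, bins, lo, hi)
    return max(class_hists, key=lambda name: intersection(h, class_hists[name]))

# Two hypothetical classes with nearly the same mean but different histogram shapes.
class_hists = {
    "mixed": [0.5, 0.0, 0.0, 0.5],   # dark and bright pixels
    "grey": [0.0, 0.5, 0.5, 0.0],    # mid-range pixels only
}
label = classify([10, 20, 240, 250], class_hists)
print(label)
```

    The object's mean digital number (130) lies close to the mean of both classes, so a nearest neighbor to mean rule could not separate them, whereas the histogram shape does.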

  15. Computational Classification Approach to Profile Neuron Subtypes from Brain Activity Mapping Data

    PubMed Central

    Li, Meng; Zhao, Fang; Lee, Jason; Wang, Dong; Kuang, Hui; Tsien, Joe Z.

    2015-01-01

    The analysis of cell type-specific activity patterns during behaviors is important for better understanding of how neural circuits generate cognition, but has not been well explored from in vivo neurophysiological datasets. Here, we describe a computational approach to uncover distinct cell subpopulations from in vivo neural spike datasets. This method, termed “inter-spike-interval classification-analysis” (ISICA), is comprised of four major steps: spike pattern feature-extraction, pre-clustering analysis, clustering classification, and unbiased classification-dimensionality selection. By using two key features of spike dynamic - namely, gamma distribution shape factors and a coefficient of variation of inter-spike interval - we show that this ISICA method provides invariant classification for dopaminergic neurons or CA1 pyramidal cell subtypes regardless of the brain states from which spike data were collected. Moreover, we show that these ISICA-classified neuron subtypes underlie distinct physiological functions. We demonstrate that the uncovered dopaminergic neuron subtypes encoded distinct aspects of fearful experiences such as valence or value, whereas distinct hippocampal CA1 pyramidal cells responded differentially to ketamine-induced anesthesia. This ISICA method should be useful to better data mining of large-scale in vivo neural datasets, leading to novel insights into circuit dynamics associated with cognitions. PMID:26212360

  16. Uncertain-tree: discriminating among competing approaches to the phylogenetic analysis of phenotype data

    PubMed Central

    Tanner, Alastair R.; Fleming, James F.; Tarver, James E.; Pisani, Davide

    2017-01-01

    Morphological data provide the only means of classifying the majority of life's history, but the choice between competing phylogenetic methods for the analysis of morphology is unclear. Traditionally, parsimony methods have been favoured but recent studies have shown that these approaches are less accurate than the Bayesian implementation of the Mk model. Here we expand on these findings in several ways: we assess the impact of tree shape and maximum-likelihood estimation using the Mk model, as well as analysing data composed of both binary and multistate characters. We find that all methods struggle to correctly resolve deep clades within asymmetric trees, and when analysing small character matrices. The Bayesian Mk model is the most accurate method for estimating topology, but with lower resolution than other methods. Equal weights parsimony is more accurate than implied weights parsimony, and maximum-likelihood estimation using the Mk model is the least accurate method. We conclude that the Bayesian implementation of the Mk model should be the default method for phylogenetic estimation from phenotype datasets, and we explore the implications of our simulations in reanalysing several empirical morphological character matrices. A consequence of our finding is that high levels of resolution or the ability to classify species or groups with much confidence should not be expected when using small datasets. It is now necessary to depart from the traditional parsimony paradigms of constructing character matrices, towards datasets constructed explicitly for Bayesian methods. PMID:28077778

  17. A Splay Tree-Based Approach for Efficient Resource Location in P2P Networks

    PubMed Central

    Zhou, Wei; Tan, Zilong; Yao, Shaowen; Wang, Shipu

    2014-01-01

    Resource location in structured P2P systems has a critical influence on system performance. Existing analytical studies of the Chord protocol have shown some potential improvements in performance. In this paper a new splay tree-based Chord structure called SChord is proposed to improve the efficiency of locating resources. We consider a novel implementation of the Chord finger table (routing table) based on the splay tree. This approach extends the Chord finger table with additional routing entries. An adaptive routing algorithm is proposed for implementation, and it can be shown that hop count is significantly reduced without introducing any other protocol overheads. We analyze the hop count of the adaptive routing algorithm, as compared to Chord variants, and demonstrate sharp upper and lower bounds for both worst-case and average-case settings. In addition, we theoretically analyze the hop reduction in SChord and show that SChord can significantly reduce the routing hops as compared to Chord. Several simulations are presented to evaluate the performance of the algorithm and support our analytical findings. The simulation results show the efficiency of SChord. PMID:24778602
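
    For context, the standard Chord finger table that SChord extends can be sketched as follows. This shows only the baseline rule, where entry i of node n points to the successor of (n + 2^i) mod 2^m; the splay tree extension from the paper is not reproduced, and the node identifiers are illustrative.

```python
from bisect import bisect_left

def successor(ident, nodes):
    """First node id >= ident on the ring, wrapping to the smallest id; `nodes` is sorted."""
    i = bisect_left(nodes, ident)
    return nodes[i % len(nodes)]

def finger_table(n, nodes, m):
    """Standard Chord fingers of node `n` in an m-bit identifier space."""
    return [successor((n + 2 ** i) % (2 ** m), nodes) for i in range(m)]

# Illustrative ring of ten nodes in a 6-bit identifier space (ids 0..63).
nodes = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]
fingers = finger_table(8, nodes, 6)
print(fingers)
```

    Because finger distances double at each entry, a lookup in plain Chord takes O(log N) hops, which is the baseline any added routing entries aim to improve on.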

  18. Uncertain-tree: discriminating among competing approaches to the phylogenetic analysis of phenotype data.

    PubMed

    Puttick, Mark N; O'Reilly, Joseph E; Tanner, Alastair R; Fleming, James F; Clark, James; Holloway, Lucy; Lozano-Fernandez, Jesus; Parry, Luke A; Tarver, James E; Pisani, Davide; Donoghue, Philip C J

    2017-01-11

    Morphological data provide the only means of classifying the majority of life's history, but the choice between competing phylogenetic methods for the analysis of morphology is unclear. Traditionally, parsimony methods have been favoured but recent studies have shown that these approaches are less accurate than the Bayesian implementation of the Mk model. Here we expand on these findings in several ways: we assess the impact of tree shape and maximum-likelihood estimation using the Mk model, as well as analysing data composed of both binary and multistate characters. We find that all methods struggle to correctly resolve deep clades within asymmetric trees, and when analysing small character matrices. The Bayesian Mk model is the most accurate method for estimating topology, but with lower resolution than other methods. Equal weights parsimony is more accurate than implied weights parsimony, and maximum-likelihood estimation using the Mk model is the least accurate method. We conclude that the Bayesian implementation of the Mk model should be the default method for phylogenetic estimation from phenotype datasets, and we explore the implications of our simulations in reanalysing several empirical morphological character matrices. A consequence of our finding is that high levels of resolution or the ability to classify species or groups with much confidence should not be expected when using small datasets. It is now necessary to depart from the traditional parsimony paradigms of constructing character matrices, towards datasets constructed explicitly for Bayesian methods.

  19. Novel consensus approaches to the reliable ranking of features for seabed imagery classification.

    PubMed

    Harrison, Richard; Birchall, Roger; Mann, Dave; Wang, Wenjia

    2012-12-01

    Feature saliency estimation and feature selection are important tasks in machine learning applications. Filters, such as distance measures are commonly used as an efficient means of estimating the saliency of individual features. However, feature rankings derived from different distance measures are frequently inconsistent. This can present reliability issues when the rankings are used for feature selection. Two novel consensus approaches to creating a more robust ranking are presented in this paper. Our experimental results show that the consensus approaches can improve reliability over a range of feature parameterizations and various seabed texture classification tasks in sidescan sonar mosaic imagery.
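
    One generic way to merge inconsistent feature rankings is a Borda-style mean rank, sketched below. This is an assumed stand-in for illustration only; the abstract does not describe the two consensus approaches the paper proposes, and the feature names are hypothetical.

```python
def consensus_ranking(rankings):
    """Combine several rankings of the same features by summed rank position.

    Each ranking lists features from most to least salient. Ties are broken
    alphabetically so that the result is deterministic.
    """
    features = rankings[0]
    total = {f: 0 for f in features}
    for ranking in rankings:
        for position, f in enumerate(ranking):
            total[f] += position
    return sorted(features, key=lambda f: (total[f], f))

# Hypothetical rankings of three features produced by three distance measures.
rankings = [
    ["a", "b", "c"],
    ["b", "a", "c"],
    ["a", "c", "b"],
]
consensus = consensus_ranking(rankings)
print(consensus)
```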

  20. Novel approach for simultaneous sediment classification and concentration determination of water turbidity

    NASA Astrophysics Data System (ADS)

    Duarte, Daniel P.; Prats, Sergio; Keizer, J. J.; Georgieva, Petia; Nogueira, Rogério; Bilro, Lúcia

    2015-09-01

    A new approach for data analysis and classification of datasets obtained by a multiparameter optical turbidity sensor is proposed. This approach is based on a combination of statistical and machine learning methods such as linear regression and clustering analysis. A case study is presented using a 6-dimensional fiber optic sensor to simultaneously classify sediment type and concentration. Results show a 79% success rate on the training data sets used. The proposed methodology is flexible because it can be easily adapted to other physical scenarios.

  1. Comparison of Standard and Novel Signal Analysis Approaches to Obstructive Sleep Apnea Classification

    PubMed Central

    Roebuck, Aoife; Clifford, Gari D.

    2015-01-01

    Obstructive sleep apnea (OSA) is a disorder characterized by repeated pauses in breathing during sleep, which leads to deoxygenation and voiced chokes at the end of each episode. OSA is associated with daytime sleepiness and an increased risk of serious conditions such as cardiovascular disease, diabetes, and stroke. Between 2 and 7% of the adult population globally has OSA, but it is estimated that up to 90% of those are undiagnosed and untreated. Diagnosis of OSA requires expensive and cumbersome screening. Audio offers a potential non-contact alternative, particularly with the ubiquity of excellent signal processing on every phone. Previous studies have focused on the classification of snoring and apneic chokes. However, such approaches require accurate identification of events. This leads to limited accuracy and small study populations. In this work, we propose an alternative approach which uses multiscale entropy (MSE) coefficients presented to a classifier to identify disorder in vocal patterns indicative of sleep apnea. A database of 858 patients was used, the largest reported in this domain. Apneic choke, snore, and noise events encoded with speech analysis features were input into a linear classifier. Coefficients of MSE derived from the first 4 h of each recording were used to train and test a random forest to classify patients as apneic or not. Standard speech analysis approaches for event classification achieved an out-of-sample accuracy (Ac) of 76.9% with a sensitivity (Se) of 29.2% and a specificity (Sp) of 88.7% but high variance. For OSA severity classification, MSE provided an out-of-sample Ac of 79.9%, Se of 66.0%, and Sp of 88.8%. Including demographic information improved the MSE-based classification performance to Ac = 80.5%, Se = 69.2%, and Sp = 87.9%. These results indicate that audio recordings could be used in screening for OSA, but are generally under-sensitive. PMID:26380256
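
    The three reported measures (Ac, Se, Sp) follow from a binary confusion matrix as in the sketch below. The counts are hypothetical and are not taken from the study; they only show why accuracy can look acceptable while sensitivity stays low when negatives dominate.

```python
def screening_metrics(tp, fp, tn, fn):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)   # fraction of true cases that are detected
    specificity = tn / (tn + fp)   # fraction of non-cases correctly cleared
    return accuracy, sensitivity, specificity

# Hypothetical counts: 40 patients with the condition, 60 without.
ac, se, sp = screening_metrics(tp=30, fp=10, tn=50, fn=10)
print(f"Ac={ac:.3f} Se={se:.3f} Sp={sp:.3f}")
```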

  2. A water balance approach for reconstructing streamflow using tree-ring proxy records

    NASA Astrophysics Data System (ADS)

    Saito, Laurel; Biondi, Franco; Devkota, Rajan; Vittori, Jasmine; Salas, Jose D.

    2015-10-01

    Tree-ring data have been used to augment limited instrumental records of climate and provide a longer view of past variability, thus improving assessments of future scenarios. For streamflow reconstructions, traditional regression-based approaches cannot examine factors that may alter streamflow independently of climate, such as changes in land use or land cover. In this study, seasonal water balance models were used as a mechanistic approach to reconstruct streamflow with proxy inputs of precipitation and air temperature. We examined a Thornthwaite water balance model modified to have seasonal components and a simple water balance model with a snow component. These two models were calibrated with a shuffled complex evolution approach using PRISM and proxy seasonal temperature and precipitation to reconstruct streamflow for the upper reaches of the West Walker River basin at Coleville, CA. Overall, the modified Thornthwaite model performed best during calibration, with R2 values of 0.96 and 0.80 using PRISM and proxy inputs, respectively. The modified Thornthwaite model was then used to reconstruct streamflow during AD 1500-1980 for the West Walker River basin. The reconstruction included similar wet and dry episodes as other regression-based records for the Great Basin, and provided estimates of actual evapotranspiration and of April 1 snow water equivalence. Given its limited input requirements, this approach is suitable in areas where sparse instrumental data are available to improve proxy-based streamflow reconstructions and to explore non-climatic reasons for streamflow variability during the reconstruction period.

  3. Metabarcoding of marine nematodes – evaluation of reference datasets used in tree-based taxonomy assignment approach

    PubMed Central

    2016-01-01

    Background: Metabarcoding is becoming a common tool used to assess and compare the diversity of organisms in environmental samples. Identification of OTUs is one of the critical steps in the process, and several taxonomy assignment methods have been proposed to accomplish this task. This publication evaluates the quality of reference datasets, along with several alignment and phylogeny inference methods, used in one of the taxonomy assignment methods, called the tree-based approach. This approach assigns anonymous OTUs to taxonomic categories based on the relative placements of OTUs and reference sequences on the cladogram and the support that these placements receive. New information: In the tree-based taxonomy assignment approach, reliable identification of anonymous OTUs is based on their placement in monophyletic and highly supported clades together with identified reference taxa. Therefore, it requires a high-quality reference dataset. The resolution of phylogenetic trees is strongly affected by the presence of erroneous sequences as well as by the alignment and phylogeny inference methods used in the process. Two preparation steps are essential for the successful application of the tree-based taxonomy assignment approach. First, curated collections of genetic information do include erroneous sequences. These sequences have a detrimental effect on the resolution of cladograms used in the tree-based approach and must be identified and excluded from the reference dataset beforehand. Second, various combinations of multiple sequence alignment and phylogeny inference methods produce cladograms with different topology and bootstrap support. These combinations of methods need to be tested in order to determine the one that gives the highest resolution for the particular reference dataset. Completing these preparation steps is expected to decrease the number of unassigned OTUs and thus improve the results of the tree-based taxonomy assignment approach. PMID:27932919

  4. A Lagrangian Relax-and-Cut Approach for the Bounded Diameter Minimum Spanning Tree Problem

    NASA Astrophysics Data System (ADS)

    Raidl, Günther R.; Gruber, Martin

    2008-09-01

    We consider the problem of finding for a given weighted graph a minimum cost spanning tree whose diameter does not exceed a specified upper bound. This problem is NP-hard and has several applications, e.g. when designing communication networks and quality of service is of concern. We model the problem as an integer linear program (ILP) using so-called jump inequalities. Since the number of these constraints grows exponentially with the problem size, solving this ILP directly is not feasible. Instead, we relax the jump constraints in a Lagrangian fashion and apply a cutting plane algorithm to separate violated inequalities. This relax-and-cut approach yields relatively tight lower bounds especially for larger problem instances on which exact techniques are not applicable. High quality feasible solutions, i.e. upper bounds, are obtained by a repair heuristic in combination with a powerful variable neighborhood descent strategy.
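
    The diameter constraint itself is easy to check for a candidate spanning tree, as the following sketch shows. It covers only the feasibility test (diameter counted in edges, found with two breadth-first searches), not the ILP, the Lagrangian relaxation, or the heuristics from the paper; the example tree and function names are hypothetical.

```python
from collections import deque

def farthest(adj, start):
    """Breadth-first search; return (a farthest node from `start`, its hop distance)."""
    dist = {start: 0}
    queue = deque([start])
    last = start
    while queue:
        u = queue.popleft()
        last = u  # BFS pops nodes in non-decreasing distance order
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return last, dist[last]

def tree_diameter(edges):
    """Number of edges on the longest path of a tree given as a list of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    end, _ = farthest(adj, next(iter(adj)))
    _, diameter = farthest(adj, end)
    return diameter

def satisfies_bound(edges, bound):
    """True if the tree's diameter does not exceed the given upper bound."""
    return tree_diameter(edges) <= bound

# Hypothetical tree: path 0-1-2-3 with an extra leaf 4 attached to node 1.
tree = [(0, 1), (1, 2), (2, 3), (1, 4)]
print(tree_diameter(tree), satisfies_bound(tree, 2))
```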

  5. An Introduction to Recursive Partitioning: Rationale, Application, and Characteristics of Classification and Regression Trees, Bagging, and Random Forests

    ERIC Educational Resources Information Center

    Strobl, Carolin; Malley, James; Tutz, Gerhard

    2009-01-01

    Recursive partitioning methods have become popular and widely used tools for nonparametric regression and classification in many scientific fields. Especially random forests, which can deal with large numbers of predictor variables even in the presence of complex interactions, have been applied successfully in genetics, clinical medicine, and…

  6. Extended Gabor approach applied to classification of emphysematous patterns in computed tomography

    PubMed Central

    Escalante-Ramírez, Boris; Cristóbal, Gabriel; Estépar, Raúl San José

    2014-01-01

    Chronic obstructive pulmonary disease (COPD) is a progressive and irreversible lung condition typically related to emphysema. It hinders air from passing through the airways and causes the alveolar sacs to lose their elasticity. Findings of COPD may be manifested in a variety of computed tomography (CT) studies. Nevertheless, visual assessment of CT images is time-consuming and depends on trained observers. Hence, a reliable computer-aided diagnosis system would be useful to reduce time and inter-evaluator variability. In this paper, we propose a new emphysema classification framework based on complex Gabor filters and local binary patterns. This approach simultaneously encodes global characteristics and local information to describe emphysema morphology in CT images. Kernel Fisher analysis was used to reduce dimensionality and to find the most discriminant nonlinear boundaries among classes. Finally, classification was performed using the k-nearest neighbor classifier. The results show the effectiveness of our approach for quantifying lesions due to emphysema and that the combination of descriptors yields better classification performance. PMID:24496558
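
    The local binary pattern descriptor mentioned above can be illustrated with the basic 3x3 operator sketched below. The exact LBP variant and its combination with Gabor filters in the paper are not specified in the abstract, so the neighbor ordering, the function name, and the pixel values here are assumptions.

```python
def lbp_code(img, r, c):
    """Basic 8-neighbor local binary pattern code of pixel (r, c).

    Neighbors are visited clockwise from the top-left; a neighbor that is
    greater than or equal to the center sets the corresponding bit.
    """
    center = img[r][c]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code

# Hypothetical 3x3 intensity patch; only the center pixel has a full neighborhood.
patch = [
    [5, 9, 1],
    [4, 6, 7],
    [2, 3, 8],
]
code = lbp_code(patch, 1, 1)
print(code)
```

    A histogram of such codes over a region of interest gives the local texture description, which can then be reduced in dimension and passed to a k-nearest neighbor classifier.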

  7. Insights into geomorphic and vegetation spatial patterns within dynamic river floodplains using soft classification approaches

    NASA Astrophysics Data System (ADS)

    Guneralp, I.; Filippi, A. M.; Guneralp, B.; You, M.

    2014-12-01

    Lowland rivers in broad alluvial floodplains create one of the most dynamic landscapes, governed by multiple, and commonly nonlinear, interactions among geomorphic, hydrologic, and ecologic processes. Fluvial landforms and land-cover patches composing the floodplains of lowland rivers vary in their shapes and sizes because of variations in vegetation biomass, topography, and soil composition (e.g., of abandoned meanders versus accreting bars) across space. Such floodplain heterogeneity, in turn, influences future river-channel evolution by creating variability in channel-migration rates. In this study, using Landsat 5 Thematic Mapper data and alternative image-classification approaches, we investigate geomorphic and vegetation spatial patterns in a dynamic large tropical river. Specifically, we examine the spatial relations between river-channel planform and fluvial-landform and land-cover patterns across the floodplain. We classify the images using both hard and soft classification algorithms. We characterize the structure of geomorphic landform and vegetation components of the floodplain by computing a range of class-level landscape metrics based on the classified images. Results indicate that comparable classification accuracies are obtained for the inherently hard and (hardened) soft classification images, ranging from 89.8% to 91.8% overall accuracy. However, soft classification images provide unique information regarding spatially varying similarities and differences in water-column properties of oxbow lakes and the main river channel. Proximity analyses, where buffer zones along the river with distances corresponding to 5, 10, and 20 river-channel widths are constructed, reveal that the average size of forest patches first increases away from the river banks, but the patches become sparse beyond a distance of 10 channel widths from the river.

  8. Selection-Fusion Approach for Classification of Datasets with Missing Values

    PubMed Central

    Ghannad-Rezaie, Mostafa; Soltanian-Zadeh, Hamid; Ying, Hao; Dong, Ming

    2010-01-01

    This paper proposes a new approach based on missing value pattern discovery for classifying incomplete data. This approach is particularly designed for classification of datasets with a small number of samples and a high percentage of missing values where available missing value treatment approaches do not usually work well. Based on the pattern of the missing values, the proposed approach finds subsets of samples for which most of the features are available and trains a classifier for each subset. Then, it combines the outputs of the classifiers. Subset selection is translated into a clustering problem, allowing derivation of a mathematical framework for it. A trade off is established between the computational complexity (number of subsets) and the accuracy of the overall classifier. To deal with this trade off, a numerical criterion is proposed for the prediction of the overall performance. The proposed method is applied to seven datasets from the popular University of California, Irvine data mining archive and an epilepsy dataset from Henry Ford Hospital, Detroit, Michigan (total of eight datasets). Experimental results show that classification accuracy of the proposed method is superior to those of the widely used multiple imputations method and four other methods. They also show that the level of superiority depends on the pattern and percentage of missing values. PMID:20212921
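
    The select-then-fuse idea can be sketched roughly as follows. This is not the authors' algorithm: the feature subsets are supplied by hand rather than derived by clustering missing-value patterns, the per-subset classifier is a nearest-centroid rule, and fusion is a plain majority vote. All function names and data are hypothetical.

```python
from collections import Counter


def centroid_classifier(rows, labels, features):
    """Train a nearest-centroid classifier on the given feature indices."""
    sums, counts = {}, Counter()
    for row, y in zip(rows, labels):
        acc = sums.setdefault(y, [0.0] * len(features))
        for k, f in enumerate(features):
            acc[k] += row[f]
        counts[y] += 1
    centroids = {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

    def predict(row):
        def dist(c):
            return sum((row[f] - c[k]) ** 2 for k, f in enumerate(features))
        return min(centroids, key=lambda y: dist(centroids[y]))

    return predict


def train_selection_fusion(rows, labels, feature_subsets):
    """One classifier per feature subset, trained only on rows complete for it."""
    models = []
    for feats in feature_subsets:
        idx = [i for i, r in enumerate(rows)
               if all(r[f] is not None for f in feats)]
        if idx:
            model = centroid_classifier([rows[i] for i in idx],
                                        [labels[i] for i in idx], feats)
            models.append((feats, model))
    return models


def predict_fused(models, row):
    """Majority vote over the classifiers whose features are all present."""
    votes = Counter(m(row) for feats, m in models
                    if all(row[f] is not None for f in feats))
    return votes.most_common(1)[0][0] if votes else None


# Made-up data: None marks a missing value
rows = [
    [1.0, 1.2, None], [0.9, None, 0.8], [None, 1.0, 1.1],
    [5.0, 5.1, None], [5.2, None, 4.9], [None, 5.0, 5.3],
]
labels = ["a", "a", "a", "b", "b", "b"]
models = train_selection_fusion(rows, labels, [(0, 1), (0, 2), (1, 2)])
```

    A sample missing feature 1 is then classified only by the model trained on features (0, 2), while a complete sample receives a vote from all three models.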

  9. Semi-automatic classification of glaciovolcanic landforms: An object-based mapping approach based on geomorphometry

    NASA Astrophysics Data System (ADS)

    Pedersen, G. B. M.

    2016-02-01

    A new object-oriented approach is developed to classify glaciovolcanic landforms (Procedure A) and their landform element boundaries (Procedure B). It utilizes the principle that glaciovolcanic edifices are geomorphometrically distinct from lava shields and plains (Pedersen and Grosse, 2014), and the approach is tested on data from Reykjanes Peninsula, Iceland. The outlined procedures utilize slope and profile curvature attribute maps (20 m/pixel) and the classified results are evaluated quantitatively through error matrix maps (Procedure A) and visual inspection (Procedure B). In Procedure A, the highest obtained accuracy is 94.1%, but even simple mapping procedures provide good results (> 90% accuracy). Successful classification of glaciovolcanic landform element boundaries (Procedure B) is also achieved and this technique has the potential to delineate the transition from intraglacial to subaerial volcanic activity in orthographic view. This object-oriented approach based on geomorphometry overcomes issues with vegetation cover, which has typically been problematic for classification schemes utilizing spectral data. Furthermore, it handles complex edifice outlines well and is easily incorporated into a GIS environment, where results can be edited or fused with other mapping results. The approach outlined here is designed to map glaciovolcanic edifices within the Icelandic neovolcanic zone but may also be applied to similar subaerial or submarine volcanic settings, where steep volcanic edifices are surrounded by flat plains.

  10. Hydrometeor classification through statistical clustering of polarimetric radar measurements: a semi-supervised approach

    NASA Astrophysics Data System (ADS)

    Besic, Nikola; Ventura, Jordi Figueras i.; Grazioli, Jacopo; Gabella, Marco; Germann, Urs; Berne, Alexis

    2016-09-01

    Polarimetric radar-based hydrometeor classification is the procedure of identifying different types of hydrometeors by exploiting polarimetric radar observations. The main drawback of the existing supervised classification methods, mostly based on fuzzy logic, is a significant dependency on a presumed electromagnetic behaviour of different hydrometeor types. Namely, the results of the classification largely rely upon the quality of scattering simulations. When it comes to the unsupervised approach, it lacks the constraints related to the hydrometeor microphysics. The idea of the proposed method is to compensate for these drawbacks by combining the two approaches in a way that microphysical hypotheses can, to a degree, adjust the content of the classes obtained statistically from the observations. This is done by means of an iterative approach, performed offline, which, in a statistical framework, examines clustered representative polarimetric observations by comparing them to the presumed polarimetric properties of each hydrometeor class. Aside from comparing, a routine alters the content of clusters by encouraging further statistical clustering in case of non-identification. By merging all identified clusters, the multi-dimensional polarimetric signatures of various hydrometeor types are obtained for each of the studied representative datasets, i.e. for each radar system of interest. These are depicted by sets of centroids which are then employed in operational labelling of different hydrometeors. The method has been applied on three C-band datasets, each acquired by different operational radar from the MeteoSwiss Rad4Alp network, as well as on two X-band datasets acquired by two research mobile radars. The results are discussed through a comparative analysis which includes a corresponding supervised and unsupervised approach, emphasising the operational potential of the proposed method.

  11. Operational optimization of irrigation scheduling for citrus trees using an ensemble based data assimilation approach

    NASA Astrophysics Data System (ADS)

    Hendricks Franssen, H.; Han, X.; Martinez, F.; Jimenez, M.; Manzano, J.; Chanzy, A.; Vereecken, H.

    2013-12-01

    Data assimilation (DA) techniques, like the local ensemble transform Kalman filter (LETKF) not only offer the opportunity to update model predictions by assimilating new measurement data in real time, but also provide an improved basis for real-time (DA-based) control. This study focuses on the optimization of real-time irrigation scheduling for fields of citrus trees near Picassent (Spain). For three selected fields the irrigation was optimized with DA-based control, and for other fields irrigation was optimized on the basis of a more traditional approach where reference evapotranspiration for citrus trees was estimated using the FAO-method. The performance of the two methods is compared for the year 2013. The DA-based real-time control approach is based on ensemble predictions of soil moisture profiles, using the Community Land Model (CLM). The uncertainty in the model predictions is introduced by feeding the model with weather predictions from an ensemble prediction system (EPS) and uncertain soil hydraulic parameters. The model predictions are updated daily by assimilating soil moisture data measured by capacitance probes. The measurement data are assimilated with help of LETKF. The irrigation need was calculated for each of the ensemble members, averaged, and logistic constraints (hydraulics, energy costs) were taken into account for the final assigning of irrigation in space and time. For the operational scheduling based on this approach only model states and no model parameters were updated by the model. Other, non-operational simulation experiments for the same period were carried out where (1) neither ensemble weather forecast nor DA were used (open loop), (2) Only ensemble weather forecast was used, (3) Only DA was used, (4) also soil hydraulic parameters were updated in data assimilation and (5) both soil hydraulic and plant specific parameters were updated. The FAO-based and DA-based real-time irrigation control are compared in terms of soil moisture

  12. Ottawa's urban forest: A geospatial approach to data collection for the UFORE/i-Tree Eco ecosystem services valuation model

    NASA Astrophysics Data System (ADS)

    Palmer, Michael D.

    The i-Tree Eco model, developed by the U.S. Forest Service, is commonly used to estimate the value of the urban forest and the ecosystem services trees provide. The model relies on field-based measurements to estimate ecosystem service values. However, the methods for collecting the field data required for the model can be extensive and costly for large areas, and data collection can thus be a barrier to implementing the model for many cities. This study investigated the use of geospatial technologies as a means to collect urban forest structure measurements within the City of Ottawa, Ontario. Results show that geospatial data collection methods can serve as a proxy for urban forest structure parameters required by i-Tree Eco. Valuations using the geospatial approach are shown to be less accurate than those developed from field-based data, but significantly less expensive. Planners must weigh the limitations of either approach when planning assessment projects.

  13. Evolution and Classification of Myosins, a Paneukaryotic Whole-Genome Approach

    PubMed Central

    Sebé-Pedrós, Arnau; Grau-Bové, Xavier; Richards, Thomas A.; Ruiz-Trillo, Iñaki

    2014-01-01

    Myosins are key components of the eukaryotic cytoskeleton, providing motility for a broad diversity of cargoes. Therefore, understanding the origin and evolutionary history of myosin classes is crucial to address the evolution of eukaryote cell biology. Here, we revise the classification of myosins using an updated taxon sampling that includes newly or recently sequenced genomes and transcriptomes from key taxa. We performed a survey of eukaryotic genomes and phylogenetic analyses of the myosin gene family, reconstructing the myosin toolkit at different key nodes in the eukaryotic tree of life. We also identified the phylogenetic distribution of myosin diversity in terms of number of genes, associated protein domains and number of classes in each taxon. Our analyses show that new classes (i.e., paralogs) and domain architectures were continuously generated throughout eukaryote evolution, with a significant expansion of myosin abundance and domain architectural diversity at the stem of Holozoa, predating the origin of animal multicellularity. Indeed, single-celled holozoans have the most complex myosin complement among eukaryotes, with paralogs of most myosins previously considered animal specific. We recover a dynamic evolutionary history, with several lineage-specific expansions (e.g., the myosin III-like gene family diversification in choanoflagellates), convergence in protein domain architectures (e.g., fungal and animal chitin synthase myosins), and important secondary losses. Overall, our evolutionary scheme demonstrates that the ancestral eukaryote likely had a complex myosin repertoire that included six genes with different protein domain architectures. Finally, we provide an integrative and robust classification, useful for future genomic and functional studies on this crucial eukaryotic gene family. PMID:24443438

  14. Semi-Automated Approach for Mapping Urban Trees from Integrated Aerial LiDAR Point Cloud and Digital Imagery Datasets

    NASA Astrophysics Data System (ADS)

    Dogon-Yaro, M. A.; Kumar, P.; Rahman, A. Abdul; Buyuksalih, G.

    2016-09-01

    Mapping of trees plays an important role in modern urban spatial data management, as many benefits and applications derive from such detailed, up-to-date data sources. Timely and accurate acquisition of information on the condition of urban trees serves as a tool for decision makers to better appreciate urban ecosystems and their numerous values, which are critical to building strategies for sustainable development. The conventional techniques used for extracting trees include ground surveying and interpretation of aerial photography. However, these techniques are associated with some constraints, such as labour-intensive field work and high financial cost, which can be overcome by means of integrated LiDAR and digital image datasets. Whereas most studies on tree extraction focus on purely forested areas, this study concentrates on urban areas, which have a high structural complexity with a multitude of different objects. This paper presents a workflow for a semi-automated approach to extracting urban trees through integrated processing of airborne LiDAR point cloud and multispectral digital image datasets over the city of Istanbul, Turkey. The paper shows that the integrated datasets are a suitable and viable source of information for urban tree management. In conclusion, the extracted information provides a snapshot of the location, composition and extent of trees in the study area, which is useful to city planners and other decision makers who need to understand how much canopy cover exists, identify new planting, removal, or reforestation opportunities, and determine which locations have the greatest need or potential to maximize return on investment. It can also help track trends or changes in the urban trees over time and inform future management decisions.

  15. Cloud field classification based upon high spatial resolution textural features. II - Simplified vector approaches

    NASA Technical Reports Server (NTRS)

    Chen, D. W.; Sengupta, S. K.; Welch, R. M.

    1989-01-01

    This paper compares the results of cloud-field classification derived from two simplified vector approaches, the Sum and Difference Histogram (SADH) and the Gray Level Difference Vector (GLDV), with the results produced by the Gray Level Cooccurrence Matrix (GLCM) approach described by Welch et al. (1988). It is shown that the SADH method produces accuracies equivalent to those obtained using the GLCM method, while the GLDV method fails to resolve error clusters. Compared to the GLCM method, the SADH method leads to a 31 percent saving in run time and a 50 percent saving in storage requirements, while the GLDV approach leads to a 40 percent saving in run time and an 87 percent saving in storage requirements.
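
    For readers unfamiliar with the two vector methods, the following is a minimal sketch of how sum and difference histograms, and the gray level difference vector derived from them, can be computed for one pixel displacement. It assumes a small integer-valued image stored as a list of lists; the function names and the toy image are illustrative and are not taken from the paper.

```python
from collections import Counter


def sum_difference_histograms(image, dx=1, dy=0):
    """Normalised sum and difference histograms for displacement (dx, dy)."""
    sums, diffs = Counter(), Counter()
    rows, cols = len(image), len(image[0])
    n = 0
    for y in range(rows):
        for x in range(cols):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < rows and 0 <= x2 < cols:
                a, b = image[y][x], image[y2][x2]
                sums[a + b] += 1
                diffs[a - b] += 1
                n += 1
    p_sum = {k: v / n for k, v in sums.items()}
    p_diff = {k: v / n for k, v in diffs.items()}
    return p_sum, p_diff


def gldv(p_diff):
    """Gray level difference vector: distribution of absolute differences."""
    out = Counter()
    for d, p in p_diff.items():
        out[abs(d)] += p
    return dict(out)


def contrast(p_diff):
    """Contrast texture feature, computed from the difference histogram alone."""
    return sum(d * d * p for d, p in p_diff.items())


# Toy 4x4 image with four gray levels (made-up values)
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 2, 3, 3],
    [2, 2, 3, 3],
]
p_sum, p_diff = sum_difference_histograms(image)
```

    With G gray levels, a co-occurrence matrix needs G x G cells, whereas the sum and difference histograms each need 2G - 1 bins. That gap is the general source of storage savings for vector methods, although the specific percentages reported above depend on the authors' configuration.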

  16. Proposition of novel classification approach and features for improved real-time arrhythmia monitoring.

    PubMed

    Kim, Yoon Jae; Heo, Jeong; Park, Kwang Suk; Kim, Sungwan

    2016-08-01

    Arrhythmia refers to a group of conditions in which the heartbeat is irregular, fast, or slow due to abnormal electrical activity in the heart. Some types of arrhythmia such as ventricular fibrillation may result in cardiac arrest or death. Thus, arrhythmia detection becomes an important issue, and various studies have been conducted. Additionally, an arrhythmia detection algorithm for portable devices such as mobile phones has recently been developed because of increasing interest in e-health care. This paper proposes a novel classification approach and features, which are validated for improved real-time arrhythmia monitoring. The classification approach that was employed for arrhythmia detection is based on the concept of ensemble learning and the Taguchi method and has the advantage of being accurate and computationally efficient. The electrocardiography (ECG) data for arrhythmia detection was obtained from the MIT-BIH Arrhythmia Database (n=48). A novel feature, namely the heart rate variability calculated from 5s segments of ECG, which was not considered previously, was used. The novel classification approach and feature demonstrated arrhythmia detection accuracy of 89.13%. When the same data was classified using the conventional support vector machine (SVM), the obtained accuracy was 91.69%, 88.14%, and 88.74% for Gaussian, linear, and polynomial kernels, respectively. In terms of computation time, the proposed classifier was 5821.7 times faster than conventional SVM. In conclusion, the proposed classifier and feature showed performance comparable to those of previous studies, while the computational complexity and update interval were highly reduced.
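
    A short-window heart rate variability feature of the kind described could be computed along these lines. The abstract does not say which HRV statistic was used, so the standard deviation of RR intervals within each 5 s window is an assumption here, as are the function name and the example R-peak times.

```python
from statistics import pstdev


def hrv_per_segment(r_peak_times, segment_s=5.0):
    """Std of RR intervals (seconds) for each consecutive segment of ECG."""
    if len(r_peak_times) < 2:
        return []
    # Each RR interval is tagged with the time of the beat that ends it
    rr = [(t1, t1 - t0) for t0, t1 in zip(r_peak_times, r_peak_times[1:])]
    n_segments = int(r_peak_times[-1] // segment_s) + 1
    features = []
    for k in range(n_segments):
        lo, hi = k * segment_s, (k + 1) * segment_s
        in_seg = [d for t, d in rr if lo <= t < hi]
        # At least two intervals are needed for a spread estimate
        features.append(pstdev(in_seg) if len(in_seg) >= 2 else None)
    return features


# Made-up R-peak times: a regular rhythm, then alternating long/short beats
peaks = [0.0, 0.8, 1.6, 2.4, 3.2, 4.0, 5.0, 5.6, 6.6, 7.2, 8.2, 8.8]
feats = hrv_per_segment(peaks)
```

    In this toy trace the first window has essentially zero variability and the second has a spread of about 0.2 s, which is the kind of contrast such a feature is meant to expose.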

  17. A multi-label, semi-supervised classification approach applied to personality prediction in social media.

    PubMed

    Lima, Ana Carolina E S; de Castro, Leandro Nunes

    2014-10-01

    Social media allow web users to create and share content pertaining to different subjects, exposing their activities, opinions, feelings and thoughts. In this context, online social media has attracted the interest of data scientists seeking to understand behaviours and trends, whilst collecting statistics for social sites. One potential application for these data is personality prediction, which aims to understand a user's behaviour within social media. Traditional personality prediction relies on users' profiles, their status updates, the messages they post, etc. Here, a personality prediction system for social media data is introduced that differs from most approaches in the literature, in that it works with groups of texts, instead of single texts, and does not take users' profiles into account. Also, the proposed approach extracts meta-attributes from texts and does not work directly with the content of the messages. The set of possible personality traits is taken from the Big Five model and allows the problem to be characterised as a multi-label classification task. The problem is then transformed into a set of five binary classification problems and solved by means of a semi-supervised learning approach, due to the difficulty in annotating the massive amounts of data generated in social media. In our implementation, the proposed system was trained with three well-known machine-learning algorithms, namely a Naïve Bayes classifier, a Support Vector Machine, and a Multilayer Perceptron neural network. The system was applied to predict the personality of Tweets taken from three datasets available in the literature, and resulted in an approximately 83% accurate prediction, with some of the personality traits presenting better individual classification rates than others.
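
    The transformation of a multi-label task into five binary problems can be sketched as below. The trait names follow the standard Big Five model; the function names and sample label sets are hypothetical, and the semi-supervised training step itself is not shown.

```python
BIG_FIVE = ["openness", "conscientiousness", "extraversion",
            "agreeableness", "neuroticism"]


def to_binary_problems(label_sets, traits=BIG_FIVE):
    """Turn multi-label targets into one 0/1 target vector per trait."""
    return {t: [1 if t in labels else 0 for labels in label_sets]
            for t in traits}


def from_binary_predictions(per_trait, traits=BIG_FIVE):
    """Recombine per-trait 0/1 predictions into one label set per sample."""
    n = len(next(iter(per_trait.values())))
    return [{t for t in traits if per_trait[t][i]} for i in range(n)]


# Made-up label sets for three groups of texts
label_sets = [{"openness", "extraversion"}, {"neuroticism"}, set()]
targets = to_binary_problems(label_sets)
```

    Each entry of `targets` can then be handed to an independent binary classifier, and the per-trait predictions merged back with `from_binary_predictions`.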

  18. An improved hyperspectral image classification approach based on ISODATA and SKR method

    NASA Astrophysics Data System (ADS)

    Hong, Pu; Ye, Xiao-feng; Yu, Hui; Zhang, Zhi-jie; Cai, Yu-fei; Tang, Xin; Tang, Wei; Wang, Chensheng

    2016-11-01

    Hyperspectral images provide not only spatial information but also a wealth of spectral information. A short list of applications includes environmental mapping, global change research, geological research, wetlands mapping, assessment of trafficability, plant and mineral identification and abundance estimation, crop analysis, and bathymetry. A crucial aspect of hyperspectral image analysis is the identification of materials present in an object or scene being imaged. Classification of a hyperspectral image sequence amounts to identifying which pixels contain various spectrally distinct materials that have been specified by the user. Several techniques for classification of multi-hyperspectral pixels have been used, ranging from minimum distance and maximum likelihood classifiers to correlation matched filter-based approaches such as spectral signature matching and the spectral angle mapper. In this paper, an improved hyperspectral image classification algorithm is proposed. In the proposed method, an improved similarity measurement method is applied, in which both the spectrum similarity and space similarity are considered. We use two different weighted matrices to estimate the spectrum similarity and space similarity between two pixels, respectively. Whether the two pixels represent the same material can then be determined. To reduce the computational cost, the wavelet transform is also applied prior to extracting the spectral and space features. The proposed method is tested using hyperspectral imagery collected by the National Aeronautics and Space Administration Jet Propulsion Laboratory. Experimental results demonstrate the efficiency of this new method on hyperspectral images associated with space object material identification.

  19. A novel approach to ECG classification based upon two-layered HMMs in body sensor networks.

    PubMed

    Liang, Wei; Zhang, Yinlong; Tan, Jindong; Li, Yang

    2014-03-27

    This paper presents a novel approach to ECG signal filtering and classification. Unlike the traditional techniques which aim at collecting and processing the ECG signals with the patient being still, lying in bed in hospitals, our proposed algorithm is intentionally designed for monitoring and classifying the patient's ECG signals in the free-living environment. The patients are equipped with wearable ambulatory devices the whole day, which facilitates the real-time heart attack detection. In ECG preprocessing, an integral-coefficient-band-stop (ICBS) filter is applied, which omits time-consuming floating-point computations. In addition, two-layered Hidden Markov Models (HMMs) are applied to achieve ECG feature extraction and classification. The periodic ECG waveforms are segmented into ISO intervals, P subwave, QRS complex and T subwave respectively in the first HMM layer where expert-annotation assisted Baum-Welch algorithm is utilized in HMM modeling. Then the corresponding interval features are selected and applied to categorize the ECG into normal type or abnormal type (PVC, APC) in the second HMM layer. For verifying the effectiveness of our algorithm on abnormal signal detection, we have developed an ECG body sensor network (BSN) platform, whereby real-time ECG signals are collected, transmitted, displayed and the corresponding classification outcomes are deduced and shown on the BSN screen.

  20. A tri-fold hybrid classification approach for diagnostics with unexampled faulty states

    NASA Astrophysics Data System (ADS)

    Tamilselvan, Prasanna; Wang, Pingfeng

    2015-01-01

    System health diagnostics provides diversified benefits such as improved safety, improved reliability and reduced costs for the operation and maintenance of engineered systems. Successful health diagnostics requires the knowledge of system failures. However, with an increasing system complexity, it is extraordinarily difficult to have a well-tested system so that all potential faulty states can be realized and studied at the product testing stage. Thus, real time health diagnostics requires automatic detection of unexampled system faulty states based upon sensory data to avoid sudden catastrophic system failures. This paper presents a trifold hybrid classification (THC) approach for structural health diagnosis with unexampled health states (UHS), which comprises preliminary UHS identification using a new thresholded Mahalanobis distance (TMD) classifier, UHS diagnostics using a two-class support vector machine (SVM) classifier, and exampled health states diagnostics using a multi-class SVM classifier. The proposed THC approach, which takes advantage of both TMD and SVM-based classification techniques, is able to identify and isolate the unexampled faulty states through interactively detecting the deviation of sensory data from the exampled health states and forming new ones autonomously. The proposed THC approach is further extended to a generic framework for health diagnostics problems with unexampled faulty states and demonstrated with health diagnostics case studies for power transformers and rolling bearings.
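
    The first stage, flagging samples that lie too far from every known health state, might look roughly like the sketch below. It is an illustration under stated assumptions rather than the paper's TMD classifier: features are restricted to two dimensions so the covariance can be inverted by hand, the threshold value is arbitrary, and all names and data are hypothetical.

```python
import math


def fit_gaussian_2d(points):
    """Mean and inverse sample covariance of a set of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    det = sxx * syy - sxy * sxy
    inv = ((syy / det, -sxy / det), (-sxy / det, sxx / det))
    return (mx, my), inv


def mahalanobis(x, mean, inv):
    """Mahalanobis distance of x from a state with the given mean."""
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    return math.sqrt(dx * (inv[0][0] * dx + inv[0][1] * dy)
                     + dy * (inv[1][0] * dx + inv[1][1] * dy))


def tmd_classify(x, models, threshold):
    """Nearest known state, or None when every state is beyond the threshold."""
    name, dist = min(((k, mahalanobis(x, m, inv))
                      for k, (m, inv) in models.items()),
                     key=lambda kv: kv[1])
    return name if dist <= threshold else None


# Made-up training data for two exampled health states
healthy = [(0, 0), (1, 0), (0, 1), (1, 1), (0.5, 0.5)]
fault_a = [(10, 10), (11, 10), (10, 11), (11, 11), (10.5, 10.5)]
models = {"healthy": fit_gaussian_2d(healthy),
          "fault_a": fit_gaussian_2d(fault_a)}
```

    A return value of `None` marks a candidate unexampled state, which in the approach described above would be passed on to the later SVM-based stages.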

  1. A data mining approach to optimize pellets manufacturing process based on a decision tree algorithm.

    PubMed

    Ronowicz, Joanna; Thommes, Markus; Kleinebudde, Peter; Krysiński, Jerzy

    2015-06-20

    The present study is focused on the thorough analysis of cause-effect relationships between pellet formulation characteristics (pellet composition as well as process parameters) and the selected quality attribute of the final product. Pellet quality was expressed by shape, measured as the aspect ratio. A data matrix for chemometric analysis consisted of 224 pellet formulations prepared with eight different active pharmaceutical ingredients and several excipients, using different extrusion/spheronization process conditions. The data set contained 14 input variables (both formulation and process variables) and one output variable (pellet aspect ratio). A tree regression algorithm consistent with the Quality by Design concept was applied to obtain deeper understanding and knowledge of formulation and process parameters affecting the final pellet sphericity. A clear, interpretable set of decision rules was generated. The spheronization speed, spheronization time, number of holes and water content of extrudate have been recognized as the key factors influencing pellet aspect ratio. The most spherical pellets were achieved by using a large number of holes during extrusion, a high spheronizer speed and longer time of spheronization. The described data mining approach enhances knowledge about the pelletization process and simultaneously facilitates searching for the optimal process conditions which are necessary to achieve ideal spherical pellets, resulting in good flow characteristics. This data mining approach can be taken into consideration by industrial formulation scientists to support rational decision making in the field of pellets technology.
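
    To show how a regression tree arrives at an interpretable rule, here is a minimal sketch of the core step: choosing the threshold on a single input variable that minimises the summed squared error of the two resulting groups. The abstract does not specify the tree algorithm used, and the speeds and aspect ratios below are made-up numbers for illustration only.

```python
def best_split(xs, ys):
    """Best threshold on one variable by minimising summed squared error."""
    def sse(vals):
        if not vals:
            return 0.0
        m = sum(vals) / len(vals)
        return sum((v - m) ** 2 for v in vals)

    best = None
    values = sorted(set(xs))
    # Candidate thresholds are midpoints between adjacent distinct values
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= thr]
        right = [y for x, y in zip(xs, ys) if x > thr]
        score = sse(left) + sse(right)
        if best is None or score < best[1]:
            best = (thr, score)
    return best


# Made-up example: spheronizer speed (rpm) vs. pellet aspect ratio
speed_rpm = [500, 600, 700, 1000, 1100, 1200]
aspect_ratio = [1.40, 1.38, 1.42, 1.10, 1.12, 1.08]
threshold, error = best_split(speed_rpm, aspect_ratio)
```

    The resulting rule reads "speed above the threshold gives a lower mean aspect ratio". A full tree repeats this search over all input variables and recursively within each branch.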

  2. An approach for evaluating the effectiveness of various ozone air quality standards for protecting trees.

    PubMed

    Hogsett, William E; Tingey, David T; Lee, E Henry; Beedlow, Peter A; Andersen, Christian P

    2008-06-01

    We demonstrate an approach for evaluating the level of protection attained using a variety of forms and levels of past, current, and proposed Air Quality Standards (AQSs). The U.S. Clean Air Act requires the establishment of ambient air quality standards to protect health and public welfare. However, determination of attainment of these standards is based on ambient pollutant concentrations rather than prevention of adverse effects. To determine if a given AQS protected against adverse effects on vegetation, hourly ozone concentrations were adjusted to create exposure levels that "just attain" a given standard. These exposures were used in combination with a physiologically-based tree growth model to account for the interactions of climate and ozone. In the evaluation, we used ozone concentrations from two 6-year time periods from the San Bernardino Mountains in California. There were clear differences in the level of vegetation protection achieved with the various AQSs. Based on modeled plant growth, the most effective standards were the California 8-hr average maximum of 70 ppb and a seasonal, cumulative, concentration-weighted index (SUM06), which if attained, resulted in annual growth reductions of 1% or less. Least effective was the 1-hr maximum of 120 ppb which resulted in a 7% annual reduction. We conclude that combining climate, exposure scenarios, and a process-based plant growth simulator was a useful approach for evaluating effectiveness of current or proposed air quality standards, or evaluating the form and/or level of a standard based on preventing adverse growth effects.
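
    The two exposure metrics named above can be computed from an hourly ozone series roughly as follows. SUM06 is taken here in its commonly used sense, the sum of all hourly concentrations at or above 0.06 ppm; seasonal and daytime windowing and the "just attain" adjustment are omitted. The function names and the 24-hour series are illustrative assumptions.

```python
def sum06(hourly_ppb, threshold_ppb=60):
    """Sum of hourly concentrations at or above the threshold, in ppm-hours."""
    return sum(c for c in hourly_ppb if c >= threshold_ppb) / 1000.0


def max_8hr_average(hourly_ppb, window=8):
    """Largest running mean over `window` consecutive hours, in ppb."""
    if len(hourly_ppb) < window:
        raise ValueError("need at least one full window")
    return max(sum(hourly_ppb[i:i + window]) / window
               for i in range(len(hourly_ppb) - window + 1))


# Made-up 24-hour ozone series in ppb
day = [30] * 8 + [50, 65, 80, 90, 85, 70, 60, 55] + [40] * 8

exposure = sum06(day)          # counts only hours at or above 60 ppb
peak_8hr = max_8hr_average(day)
```

    A cumulative index such as SUM06 weights repeated moderately high hours, whereas a maximum-based standard responds only to the single worst window. That is one reason the two forms of standard can give different levels of vegetation protection.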

  3. Early and Mid-Holocene Climate Variability - A Multi-Proxy Approach from Multi-Millennial Tree Ring Records

    NASA Astrophysics Data System (ADS)

    Ziehmer, Malin Michelle; Nicolussi, Kurt; Schlüchter, Christian; Leuenberger, Markus

    2016-04-01

    Most reconstructions of Holocene climate variability in the Alps are based on low-frequency archives such as glacier and tree line fluctuations. However, recent finds of wood remains in glacier forefields in the Alps reveal a unique high-frequency archive allowing climate reconstruction over the entire Holocene. The evolution of Holocene climate can be reconstructed by using a multi-proxy approach combining tree ring width and multiple stable isotope chronologies by establishing highly resolved stable isotope records from calendar-dated wood which covers the past 9000 years b2k. Therefore, we collected samples in the Alps covering a large SW-NE transect, primarily in glacier forefields but also in peat bogs and small lakes. The multiple sample locations allow the analysis of climatic conditions along a climatic gradient characterized by the change from an Atlantic to a more continental climate. Subsequently, tree ring widths are measured and samples are calendrically dated by means of tree ring analysis. Due to the large number of samples for stable isotope analysis (> 8000 samples to cover the entire Holocene by guaranteeing a sample replication of 4 samples per time unit of 5 years), dated wood samples are separated into 5-year tree ring blocks. These blocks are sliced and the cellulose is extracted after a standardized procedure and crushed by ultrasonic homogenization. In order to establish multi-proxy records, the stable isotopes of carbon, oxygen and hydrogen are simultaneously measured. Both the 5-year tree ring width and multiple stable isotope series offer new insights into the Early and Mid-Holocene climate and its variability in the Alps. The stable isotope records reveal interesting low-frequency variability. But they also display expected offsets caused by the measurement of individual trees revealing effects of sampling site, tree species and growth trend. These effects offer an additional insight into the tree growth and stand behavior of single

  4. Probabilistic gait classification in children with cerebral palsy: a Bayesian approach.

    PubMed

    Van Gestel, Leen; De Laet, Tinne; Di Lello, Enrico; Bruyninckx, Herman; Molenaers, Guy; Van Campenhout, Anja; Aertbeliën, Erwin; Schwartz, Mike; Wambacq, Hans; De Cock, Paul; Desloovere, Kaat

    2011-01-01

    Three-dimensional gait analysis (3DGA) generates a wealth of highly variable data. Gait classifications help to reduce, simplify and interpret this vast amount of 3DGA data and thereby assist and facilitate clinical decision making in the treatment of CP. CP gait is often a mix of several clinically accepted distinct gait patterns. Therefore, there is a need for a classification which characterizes each CP gait by different degrees of membership for several gait patterns, which are considered by clinical experts to be highly relevant. In this respect, this paper introduces Bayesian networks (BN) as a new approach for classification of 3DGA data of the ankle and knee in children with CP. A BN is a probabilistic graphical model that represents a set of random variables and their conditional dependencies via a directed acyclic graph. Furthermore, BNs provide an explicit way of introducing clinical expertise as prior knowledge to guide the BN in its analysis of the data and the underlying clinically relevant relationships. BNs also make it possible to classify gait on a continuum of patterns, as their outcome consists of a set of probabilistic membership values for different clinically accepted patterns. A group of 139 patients with CP was recruited and divided into a training dataset (80% of all patients) and a validation dataset (20% of all patients). An average classification accuracy of 88.4% was reached. The BN of this study achieved promising accuracy rates and was found to be successful for classifying ankle and knee joint motion on a continuum of different clinically relevant gait patterns.

  5. Model-based approach to the detection and classification of mines in sidescan sonar.

    PubMed

    Reed, Scott; Petillot, Yvan; Bell, Judith

    2004-01-10

    This paper presents a model-based approach to mine detection and classification by use of sidescan sonar. Advances in autonomous underwater vehicle technology have increased the interest in automatic target recognition systems in an effort to automate a process that is currently carried out by a human operator. Current automated systems generally require training and thus produce poor results when the test data set is different from the training set. This has led to research into unsupervised systems, which are able to cope with the large variability in conditions and terrains seen in sidescan imagery. The system presented in this paper first detects possible minelike objects using a Markov random field model, which operates well on noisy images, such as sidescan, and allows a priori information to be included through the use of priors. The highlight and shadow regions of the object are then extracted with a cooperating statistical snake, which assumes these regions are statistically separate from the background. Finally, a classification decision is made using Dempster-Shafer theory, where the extracted features are compared with synthetic realizations generated with a sidescan sonar simulator model. Results for the entire process are shown on real sidescan sonar data. Similarities between the sidescan sonar and synthetic aperture radar (SAR) imaging processes ensure that the approach outlined here could be applied to SAR image analysis.

  6. Detection and classification of interstitial lung diseases and emphysema using a joint morphological-fuzzy approach

    NASA Astrophysics Data System (ADS)

    Chang Chien, Kuang-Che; Fetita, Catalin; Brillet, Pierre-Yves; Prêteux, Françoise; Chang, Ruey-Feng

    2009-02-01

    Multi-detector computed tomography (MDCT) has high accuracy and specificity in volumetrically capturing serial images of the lung. It increases the capability of computerized classification for lung tissue in medical research. This paper proposes a three-dimensional (3D) automated approach based on mathematical morphology and fuzzy logic for quantifying and classifying interstitial lung diseases (ILDs) and emphysema. The proposed methodology is composed of several stages: (1) an image multi-resolution decomposition scheme based on a 3D morphological filter is used to detect and analyze the different density patterns of the lung texture. Then, (2) for each pattern in the multi-resolution decomposition, six features are computed, for which fuzzy membership functions define a probability of association with a pathology class. Finally, (3) for each pathology class, the probabilities are combined according to the weight assigned to each membership function and two threshold values are used to decide the final class of the pattern. The proposed approach was tested on 10 MDCT cases and the classification accuracy was: emphysema: 95%, fibrosis/honeycombing: 84% and ground glass: 97%.
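
    Stages (2) and (3) above can be pictured as a weighted combination of fuzzy memberships followed by a thresholded decision. The sketch below is only an illustration under stated assumptions: the trapezoidal membership shape, the weights, and the way the two thresholds are used (accept above the higher one, reject below the lower one, undecided in between) are all hypothetical, since the abstract does not specify them.

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal fuzzy membership: 0 outside (a, d), 1 on [b, c]."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)


def class_score(memberships, weights):
    """Weighted average of per-feature memberships for one pathology class."""
    return sum(m * w for m, w in zip(memberships, weights)) / sum(weights)


def decide(score, low, high):
    """Assumed two-threshold rule (not taken from the paper)."""
    if score >= high:
        return "assign"
    if score <= low:
        return "reject"
    return "undecided"


# Hypothetical memberships of three features for one class, with weights.
memberships = [1.0, trapezoid(5, 0, 10, 20, 30), 0.8]
weights = [2.0, 1.0, 1.0]

score = class_score(memberships, weights)
print(round(score, 3), decide(score, low=0.3, high=0.7))
```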

  7. Texture classification of anatomical structures in CT using a context-free machine learning approach

    NASA Astrophysics Data System (ADS)

    Jiménez del Toro, Oscar A.; Foncubierta-Rodríguez, Antonio; Depeursinge, Adrien; Müller, Henning

    2015-03-01

    Medical images contain a large amount of visual information about structures and anomalies in the human body. To make sense of this information, human interpretation is often essential. On the other hand, computer-based approaches can exploit information contained in the images by numerically measuring and quantifying specific visual features. Annotation of organs and other anatomical regions is an important step before computing numerical features on medical images. In this paper, a texture-based organ classification algorithm is presented, which can be used to reduce the time required for annotating medical images. The texture of organs is analyzed using a combination of state-of-the-art techniques: the Riesz transform and a bag of meaningful visual words. The effect of a meaningfulness transformation in the visual word space yields two important advantages that can be seen in the results. The number of descriptors is enormously reduced down to 10% of the original size, whereas classification accuracy is improved by up to 25% with respect to the baseline approach.

  8. Semantic classification of urban buildings combining VHR image and GIS data: An improved random forest approach

    NASA Astrophysics Data System (ADS)

    Du, Shihong; Zhang, Fangli; Zhang, Xiuyuan

    2015-07-01

    While most existing studies have focused on extracting geometric information on buildings, only a few have concentrated on semantic information. Without semantic information, many demands arising from environmental and social issues cannot be met. This study presents an approach to semantically classify buildings into much finer categories than those of existing studies by learning a random forest (RF) classifier from a large number of imbalanced samples with high-dimensional features. First, a two-level segmentation mechanism combining GIS and VHR image produces single image objects at a large scale and intra-object components at a small scale. Second, a semi-supervised method chooses a large number of unbiased samples by considering the spatial proximity and intra-cluster similarity of buildings. Third, two important improvements to the RF classifier are made: a voting-distribution ranked rule for reducing the influences of imbalanced samples on classification accuracy and a feature importance measurement for evaluating each feature's contribution to the recognition of each category. Fourth, the semantic classification of urban buildings is practically conducted in Beijing city, and the results demonstrate that the proposed approach is effective and accurate. The seven categories used in the study are finer than those in existing work and more helpful to studying many environmental and social problems.

  9. A case-comparison study of automatic document classification utilizing both serial and parallel approaches

    NASA Astrophysics Data System (ADS)

    Wilges, B.; Bastos, R. C.; Mateus, G. P.; Dantas, M. A. R.

    2014-10-01

    A well-known problem faced by any organization nowadays is the high volume of data that is available and the required process to transform this volume into differential information. In this study, a case-comparison study of automatic document classification (ADC) approaches is presented, utilizing both serial and parallel paradigms. The serial approach was implemented by adopting the RapidMiner software tool, which is recognized as the world-leading open-source system for data mining. For the parallel approach, the Hadoop software environment was used, following the MapReduce programming model. The main goal of this case-comparison study is to exploit differences between these two paradigms, especially when large volumes of data such as Web text documents are utilized to build a category database. In the literature, many studies point out that distributed processing of unstructured documents has been yielding efficient results with Hadoop. Results from our research indicate a threshold to such efficiency.

  10. Land cover classification of Landsat 8 satellite data based on Fuzzy Logic approach

    NASA Astrophysics Data System (ADS)

    Taufik, Afirah; Sakinah Syed Ahmad, Sharifah

    2016-06-01

    The aim of this paper is to propose a method to classify the land covers of a satellite image based on a fuzzy rule-based system approach. The study uses bands in Landsat 8 and other indices, such as the Normalized Difference Water Index (NDWI), Normalized Difference Built-up Index (NDBI) and Normalized Difference Vegetation Index (NDVI), as input for the fuzzy inference system. The three selected indices represent our three main classes: water, built-up land, and vegetation. The combination of the original multispectral bands and selected indices provides more information about the image. The parameter selection of fuzzy membership is performed by using a supervised method known as ANFIS (adaptive neuro-fuzzy inference system) training. The fuzzy system is tested for classification on a land cover image that covers the Klang Valley area. The results showed that the fuzzy system approach is effective and can be explored and implemented for other areas of Landsat data.
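
    The three indices named above are all normalized differences of two bands. The sketch below uses their common definitions with Landsat 8 OLI bands (green = band 3, red = band 4, NIR = band 5, SWIR1 = band 6). It assumes the McFeeters form of NDWI, which may differ from the variant used in the paper, and the reflectance values are hypothetical. It does not reproduce the fuzzy inference system itself.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or 0.0 when both inputs are zero."""
    denom = a + b
    return 0.0 if denom == 0 else (a - b) / denom


def ndvi(nir, red):
    """Vegetation index: high for healthy vegetation."""
    return normalized_difference(nir, red)


def ndwi(green, nir):
    """Water index (McFeeters form, assumed here): high for open water."""
    return normalized_difference(green, nir)


def ndbi(swir1, nir):
    """Built-up index: high for built-up land."""
    return normalized_difference(swir1, nir)


# Hypothetical surface reflectances for a single vegetated pixel.
pixel = {"green": 0.08, "red": 0.05, "nir": 0.45, "swir1": 0.20}

indices = {
    "vegetation": ndvi(pixel["nir"], pixel["red"]),
    "water": ndwi(pixel["green"], pixel["nir"]),
    "built-up": ndbi(pixel["swir1"], pixel["nir"]),
}
dominant = max(indices, key=indices.get)
print(dominant, {k: round(v, 3) for k, v in indices.items()})
```

    Picking the class with the largest index is only a crisp stand-in for the comparison. In a fuzzy system such as the one described, the index values would instead feed membership functions whose parameters are tuned during training.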

  11. An image-based approach for classification of human micro-doppler radar signatures

    NASA Astrophysics Data System (ADS)

    Tivive, Fok Hing Chi; Phung, Son Lam; Bouzerdoum, Abdesselam

    2013-05-01

    With the advances in radar technology, there is an increasing interest in automatic radar-based human gait identification. This is because radar signals can penetrate through most dielectric materials. In this paper, an image-based approach is proposed for classifying human micro-Doppler radar signatures. The time-varying radar signal is first converted into a time-frequency representation, which is then cast as a two-dimensional image. A descriptor is developed to extract micro-Doppler features from local time-frequency patches centered along the torso Doppler frequency. Experimental results based on real data collected from a 24-GHz Doppler radar showed that the proposed approach achieves promising classification performance.

  12. A multi-neighbor-joining approach for phylogenetic tree reconstruction and visualization.

    PubMed

    Silva, Ana Estela A da; Villanueva, Wilfredo J P; Knidel, Helder; Bonato, Vinícius; Reis, Sérgio F dos; Von Zuben, Fernando J

    2005-09-30

    The computationally challenging problem of reconstructing the phylogeny of a set of contemporary data, such as DNA sequences or morphological attributes, was treated by an extended version of the neighbor-joining (NJ) algorithm. The original NJ algorithm provides a single-tree topology, after a cascade of greedy pairing decisions that tries to simultaneously optimize the minimum evolution and the least squares criteria. Given that some sub-trees are more stable than others, and that the minimum evolution tree may not be achieved by the original NJ algorithm, we propose a multi-neighbor-joining (MNJ) algorithm capable of performing multiple pairing decisions at each level of the tree reconstruction, keeping various partial solutions along the recursive execution of the NJ algorithm. The main advantages of the new reconstruction procedure are: 1) as is the case for the original NJ algorithm, the MNJ algorithm is still a low-cost reconstruction method; 2) a further investigation of the alternative topologies may reveal stable and unstable sub-trees; 3) the chance of achieving the minimum evolution tree is greater; 4) tree topologies with very similar performances will be simultaneously presented at the output. When there are multiple unrooted tree topologies to be compared, a visualization tool is also proposed, using a radial layout to uniformly distribute the branches with the help of well-known metaheuristics used in computer science.
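
    The pairing decision at the heart of NJ can be made concrete with the standard Q criterion, Q(i, j) = (n - 2) d(i, j) - r_i - r_j, where r_i is the sum of row i of the distance matrix. The original algorithm joins only the pair with the smallest Q. The sketch below also returns the next-best pairs as a loose illustration of keeping several partial solutions. The `keep` parameter and this ranking rule are assumptions, not the authors' MNJ selection procedure, and the distance matrix is a small textbook-style example.

```python
from itertools import combinations


def q_values(dist):
    """Neighbor-joining Q criterion for every unordered pair of taxa."""
    n = len(dist)
    row_sums = [sum(row) for row in dist]
    return {
        (i, j): (n - 2) * dist[i][j] - row_sums[i] - row_sums[j]
        for i, j in combinations(range(n), 2)
    }


def candidate_pairs(dist, keep=1):
    """Return the `keep` pairs with the lowest Q (keep=1 is plain NJ)."""
    q = q_values(dist)
    return sorted(q, key=q.get)[:keep]


# Symmetric distance matrix for five taxa (illustrative data).
D = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]

print(candidate_pairs(D, keep=1))
print(candidate_pairs(D, keep=2))
```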

  13. A novel approach of mining strong jumping emerging patterns based on BSC-tree

    NASA Astrophysics Data System (ADS)

    Liu, Quanzhong; Shi, Peng; Hu, Zhengguo; Zhang, Yang

    2014-03-01

    It is a great challenge to discover strong jumping emerging patterns (SJEPs) from a high-dimensional dataset because of the huge pattern space. In this article, we propose a dynamically growing contrast pattern tree (DGCP-tree) structure to store grown patterns and their path codes arrays with 1-bit counts, which are from the constructed bit string compression tree. A method of mining SJEPs based on DGCP-tree is developed. In order to reduce the pattern search space, we introduce a novel pattern pruning method, which dramatically reduces non-minimal jumping emerging patterns (JEPs) during the mining process. Experiments are performed on three real cancer datasets and three datasets from the University of California, Irvine machine-learning repository. Compared with the well-known CP-tree method, the results show that the proposed method is substantially faster, able to handle higher-dimensional datasets and to prune more non-minimal JEPs.

  14. A probabilistic approach to segmentation and classification of neoplasia in uterine cervix images using color and geometric features

    NASA Astrophysics Data System (ADS)

    Srinivasan, Yeshwanth; Hernes, Dana; Tulpule, Bhakti; Yang, Shuyu; Guo, Jiangling; Mitra, Sunanda; Yagneswaran, Sriraja; Nutter, Brian; Jeronimo, Jose; Phillips, Benny; Long, Rodney; Ferris, Daron

    2005-04-01

    Automated segmentation and classification of diagnostic markers in medical imagery are challenging tasks. Numerous algorithms for segmentation and classification based on statistical approaches of varying complexity are found in the literature. However, the design of an efficient and automated algorithm for precise classification of desired diagnostic markers is extremely image-specific. The National Library of Medicine (NLM), in collaboration with the National Cancer Institute (NCI), is creating an archive of 60,000 digitized color images of the uterine cervix. NLM is developing tools for the analysis and dissemination of these images over the Web for the study of visual features correlated with precancerous neoplasia and cancer. To enable indexing of images of the cervix, it is essential to develop algorithms for the segmentation of regions of interest, such as acetowhitened regions, and automatic identification and classification of regions exhibiting mosaicism and punctation. Success of such algorithms depends, primarily, on the selection of relevant features representing the region of interest. We present color and geometric features based statistical classification and segmentation algorithms yielding excellent identification of the regions of interest. The distinct classification of the mosaic regions from the non-mosaic ones has been obtained by clustering multiple geometric and color features of the segmented sections using various morphological and statistical approaches. Such automated classification methodologies will facilitate content-based image retrieval from the digital archive of uterine cervix and have the potential of developing an image based screening tool for cervical cancer.

  15. Evaluating an ensemble classification approach for crop diversity verification in Danish greening subsidy control

    NASA Astrophysics Data System (ADS)

    Chellasamy, Menaka; Ferré, Ty Paul Andrew; Greve, Mogens Humlekrog

    2016-07-01

    Beginning in 2015, Danish farmers are obliged to meet specific crop diversification rules based on total land area and number of crops cultivated to be eligible for new greening subsidies. Hence, there is a need for the Danish government to extend their subsidy control system to verify farmers' declarations to warrant greening payments under the new crop diversification rules. Remote Sensing (RS) technology has been used since 1992 to control farmers' subsidies in Denmark. However, a proper RS-based approach is yet to be finalised to validate new crop diversity requirements designed for assessing compliance under the recent subsidy scheme (2014-2020). This study uses an ensemble classification approach (proposed by the authors in previous studies) for validating the crop diversity requirements of the new rules. The approach uses a neural network ensemble classification system with bi-temporal (spring and early summer) WorldView-2 imagery (WV2) and includes the following steps: (1) automatic computation of pixel-based prediction probabilities using multiple neural networks; (2) quantification of the classification uncertainty using Endorsement Theory (ET); (3) discrimination of crop pixels and validation of the crop diversification rules at farm level; and (4) identification of farmers who are violating the requirements for greening subsidies. The prediction probabilities are computed by a neural network ensemble supplied with training samples selected automatically using farmers' declared parcels (field vectors containing crop information and the field boundary of each crop). Crop discrimination is performed by considering a set of conclusions derived from individual neural networks based on ET. Verification of the diversification rules is performed by incorporating pixel-based classification uncertainty or confidence intervals with the class labels at the farmer level.
The proposed approach was tested with WV2 imagery acquired in 2011 for a study area in Vennebjerg

  16. Selection bias in species distribution models: An econometric approach on forest trees based on structural modeling

    NASA Astrophysics Data System (ADS)

    Martin-StPaul, N. K.; Ay, J. S.; Guillemot, J.; Doyen, L.; Leadley, P.

    2014-12-01

    Species distribution models (SDMs) are widely used to study and predict the outcome of global changes on species. In human-dominated ecosystems the presence of a given species is the result of both its ecological suitability and the human footprint on nature, such as land use choices. Land use choices may thus be responsible for a selection bias in the presence/absence data used in SDM calibration. We present a structural modelling approach (i.e. based on structural equation modelling) that accounts for this selection bias. The new structural species distribution model (SSDM) simultaneously estimates land use choices and species responses to bioclimatic variables. A land use equation based on an econometric model of landowner choices was joined to an equation of species response to bioclimatic variables. SSDM allows the residuals of both equations to be dependent, taking into account the possibility of shared omitted variables and measurement errors. We provide a general description of the statistical theory and a set of applications on forest trees over France using databases of climate and forest inventory at different spatial resolutions (from 2km to 8km). We also compared the outputs of the SSDM with outputs of a classical SDM (i.e. Biomod ensemble modelling) in terms of bioclimatic response curves and potential distributions under current climate and climate change scenarios. The shapes of the bioclimatic response curves and the modelled species distribution maps differed markedly between SSDM and classical SDMs, with contrasted patterns according to species and spatial resolutions. The magnitude and directions of these differences were dependent on the correlations between the errors from both equations and were highest for higher spatial resolutions. A first conclusion is that the use of classical SDMs can potentially lead to strong misestimation of the actual and future probability of presence modelled.
Beyond this selection bias, the SSDM we propose represents

  17. Effects of a Peer Assessment System Based on a Grid-Based Knowledge Classification Approach on Computer Skills Training

    ERIC Educational Resources Information Center

    Hsu, Ting-Chia

    2016-01-01

    In this study, a peer assessment system using the grid-based knowledge classification approach was developed to improve students' performance during computer skills training. To evaluate the effectiveness of the proposed approach, an experiment was conducted in a computer skills certification course. The participants were divided into three…

  18. Analysis of electrical tree ageing in silicone rubber by physicochemical approach

    NASA Astrophysics Data System (ADS)

    Zhou, Y. X.; Nie, Q.; Chen, Z. Z.; Liu, R.

    2009-08-01

    In this paper, the characteristics of electrical tree ageing in silicone rubber (SIR) under AC voltage were studied. The electrical tree initiation rate is 20% after the application of 6 kV AC voltage for 1000 hours. Samples are separated into three kinds according to processes of electrical tree formation: virgin samples without voltage application, non-treed samples without electrical tree formation after 1000 hours and treed samples with electrical tree formation after 1000 hours. Certain physicochemical diagnostic tests were carried out to understand the degradation of material, ascribed to long-time voltage application, using differential scanning calorimetry (DSC) and thermogravimetric-differential thermal analysis (TG-DTA). Physicochemical analyses, especially the DSC results, show that no additional phases are formed in the processes of electrical tree ageing in SIR. Reduction of the melting point and crystallinity of SIR is observed in the sequence of virgin samples, non-treed samples and treed samples. The activation energy values were calculated from the TG-DTA data. Compared to virgin samples, obvious reduction of activation energy value is observed in non-treed samples. Degradation in SIR has already occurred before electrical tree formation and charge injection and extraction by high field electrode under AC voltage is regarded as the reason.

  19. Establishing a cause and effect relationship for ambient ozone exposure and tree growth in the forest: progress and an experimental approach.

    PubMed

    Manning, William J

    2005-10-01

    Much has been written about the effects of ambient ozone on tree growth. Cause and effect has been established with seedlings in chambers. Results from multi-year studies with older tree seedlings, in open-top chambers, have been inconclusive, due to chamber effects. Extrapolation of results from chambers to trees in the forest is not possible. Predictive models for forest tree growth reductions caused by ozone have been developed, but not verified. Dendrochronological methods have been used to establish correlations between radial growth reductions in forest trees and ambient ozone exposure. The protective chemical ethylenediurea (EDU) has been used to protect tree seedlings from ozone injury. An experimental approach is advocated here that utilizes forest trees selected for sensitivity and non-sensitivity to ozone, dendrochronological methods, the protective chemical EDU, and monitoring data for ambient ozone, stomatal conductance, soil moisture potential, air temperature, PAR, etc. in long-term investigations to establish cause and effect relationships.

  20. Diagnostic classification of specific phobia subtypes using structural MRI data: a machine-learning approach.

    PubMed

    Lueken, Ulrike; Hilbert, Kevin; Wittchen, Hans-Ulrich; Reif, Andreas; Hahn, Tim

    2015-01-01

    While neuroimaging research has advanced our knowledge about fear circuitry dysfunctions in anxiety disorders, findings based on diagnostic groups do not translate into diagnostic value for the individual patient. Machine-learning generates predictive information that can be used for single subject classification. We applied Gaussian process classifiers to a sample of patients with specific phobia as a model disorder for pathological forms of anxiety to test for classification based on structural MRI data. Gray (GM) and white matter (WM) volumetric data were analyzed in 33 snake phobics (SP; animal subtype), 26 dental phobics (DP; blood-injection-injury subtype) and 37 healthy controls (HC). Results showed good accuracy rates for GM and WM data in predicting phobia subtypes (GM: 62 % phobics vs. HC, 86 % DP vs. HC, 89 % SP vs. HC, 89 % DP vs. SP; WM: 88 % phobics vs. HC, 89 % DP vs. HC, 79 % SP vs. HC, 79 % DP vs. HC). Regarding GM, classification improved when considering the subtype compared to overall phobia status. The discriminatory brain pattern was not solely based on fear circuitry structures but included widespread cortico-subcortical networks. Results demonstrate that multivariate pattern recognition represents a promising approach for the development of neuroimaging-based diagnostic markers that could support clinical decisions. Regarding the increasing number of fMRI studies on anxiety disorders, researchers are encouraged to use functional and structural data not only for studying phenotype characteristics on a group level, but also to evaluate their incremental value for diagnostic or prognostic purposes.

  1. A modified reachability tree approach to analysis of unbounded Petri nets.

    PubMed

    Wang, Fei-Yue; Gao, Yanqing; Zhou, MengChu

    2004-02-01

    Reachability trees, especially the corresponding Karp-Miller's finite reachability trees generated for Petri nets are fundamental for systematically investigating many characteristics such as boundedness, liveness, and performance of systems modeled by Petri nets. However, too much information is lost in a FRT to render it useful for many applications. In this paper, modified reachability trees (MRT) of Petri nets are introduced that extend the capability of Karp-Miller's FRTs in solving the liveness, deadlock, and reachability problems, and in defining or determining possible firing sequences. The finiteness of MRT is proved and several examples are presented to illustrate the advantages of MRT over FRT.
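
    Reachability trees are built by repeatedly applying the Petri net firing rule to markings. The sketch below shows only that rule on a hypothetical two-place net whose single transition makes the net unbounded. It does not construct a Karp-Miller FRT or the MRT of the paper, and all names are illustrative.

```python
def enabled(marking, pre):
    """A transition is enabled when every input place holds enough tokens."""
    return all(m >= p for m, p in zip(marking, pre))


def fire(marking, pre, post):
    """Consume `pre` tokens and produce `post` tokens, place by place."""
    if not enabled(marking, pre):
        raise ValueError("transition is not enabled in this marking")
    return tuple(m - p + q for m, p, q in zip(marking, pre, post))


# Hypothetical net: t takes one token from p1 and returns it, adding one to p2.
pre_t = (1, 0)
post_t = (1, 1)

marking = (1, 0)
trace = [marking]
for _ in range(3):
    marking = fire(marking, pre_t, post_t)
    trace.append(marking)
print(trace)
```

    Because p2 grows without bound, an explicit reachability tree for this net would be infinite. A Karp-Miller tree stays finite by replacing such growing counts with a special symbol, which is the step at which, according to the abstract, information is lost.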

  2. Classification of non native tree species in Adda Park (Italy) through multispectral and multitemporal surveys from UAV

    NASA Astrophysics Data System (ADS)

    Pinto, Livio; Sona, Giovanna; Biffi, Andrea; Dosso, Paolo; Passoni, Daniele; Baracani, Matteo

    2014-05-01

    July, was realized over a longer period, from 09/07/2013 to 28/08/2013, due to weather conditions and technical reasons. In any case the vegetation characteristics were found to be unchanged. The second set of flights, in autumn, was carried out over a shorter period, on 16-18 October 2013, thus obtaining even better homogeneity of the vegetation conditions. Image and data processing are based on standard classification techniques, both pixel and object based, applied simultaneously to multispectral and multitemporal data, with the aim of producing a thematic map of the species of interest. The classification accuracies will be computed on the basis of ground truth comparison, to study possible misclassification among species.

  3. SFSSClass: an integrated approach for miRNA based tumor classification

    PubMed Central

    2010-01-01

    Background: MicroRNA (miRNA) expression profiling data has recently been found to be particularly important in cancer research and can be used as a diagnostic and prognostic tool. Current approaches of tumor classification using miRNA expression data do not integrate the experimental knowledge available in the literature. A judicious integration of such knowledge with effective miRNA and sample selection through a biclustering approach could be an important step in improving the accuracy of tumor classification. Results: In this article, a novel classification technique called SFSSClass is developed that judiciously integrates a biclustering technique SAMBA for simultaneous feature (miRNA) and sample (tissue) selection (SFSS), a cancer-miRNA network that we have developed by mining the literature of experimentally verified cancer-miRNA relationships and a classifier uncorrelated shrunken centroid (USC). SFSSClass is used for classifying multiple classes of tumors and cancer cell lines. In a part of the investigation, poorly differentiated tumors (PDT) having non-diagnostic histological appearance are classified while training on more differentiated tumor (MDT) samples. The proposed method is found to outperform the best known accuracy in the literature on the experimental data sets. For example, while the best accuracy reported in the literature for classifying PDT samples is ~76.5%, the accuracy of SFSSClass is found to be ~82.3%. The advantage of incorporating biclustering integrated with the cancer-miRNA network is evident from the consistently better performance of SFSSClass (integration of SAMBA, cancer-miRNA network and USC) over USC (e.g., ~70.5% for SFSSClass versus ~58.8% in classifying a set of 17 MDT samples from 9 tumor types, ~91.7% for SFSSClass versus ~75% in classifying 12 cell lines from 6 tumor types and ~82.3% for SFSSClass versus ~41.2% in classifying 17 PDT samples from 11 tumor types). Conclusion: In this article, we develop the SFSSClass

  4. Machine Learning Based Classification of Microsatellite Variation: An Effective Approach for Phylogeographic Characterization of Olive Populations.

    PubMed

    Torkzaban, Bahareh; Kayvanjoo, Amir Hossein; Ardalan, Arman; Mousavi, Soraya; Mariotti, Roberto; Baldoni, Luciana; Ebrahimie, Esmaeil; Ebrahimi, Mansour; Hosseini-Mazinani, Mehdi

    2015-01-01

    Finding efficient analytical techniques is overwhelmingly turning into a bottleneck for the effectiveness of large biological data. Machine learning offers a novel and powerful tool to advance classification and modeling solutions in molecular biology. However, these methods have been less frequently used with empirical population genetics data. In this study, we developed a new combined approach of data analysis using microsatellite marker data from our previous studies of olive populations using machine learning algorithms. Herein, 267 olive accessions of various origins including 21 reference cultivars, 132 local ecotypes, and 37 wild olive specimens from the Iranian plateau, together with 77 of the most represented Mediterranean varieties were investigated using a finely selected panel of 11 microsatellite markers. We organized data in two '4-targeted' and '16-targeted' experiments. A strategy of assaying different machine based analyses (i.e. data cleaning, feature selection, and machine learning classification) was devised to identify the most informative loci and the most diagnostic alleles to represent the population and the geography of each olive accession. These analyses revealed microsatellite markers with the highest differentiating capacity and proved efficiency for our method of clustering olive accessions to reflect upon their regions of origin. A distinguished highlight of this study was the discovery of the best combination of markers for better differentiating of populations via machine learning models, which can be exploited to distinguish among other biological populations.

  5. Machine Learning Based Classification of Microsatellite Variation: An Effective Approach for Phylogeographic Characterization of Olive Populations

    PubMed Central

    Mousavi, Soraya; Mariotti, Roberto; Baldoni, Luciana; Ebrahimie, Esmaeil; Ebrahimi, Mansour; Hosseini-Mazinani, Mehdi

    2015-01-01

    Finding efficient analytical techniques is overwhelmingly turning into a bottleneck for the effectiveness of large biological data. Machine learning offers a novel and powerful tool to advance classification and modeling solutions in molecular biology. However, these methods have been less frequently used with empirical population genetics data. In this study, we developed a new combined approach of data analysis using microsatellite marker data from our previous studies of olive populations using machine learning algorithms. Herein, 267 olive accessions of various origins including 21 reference cultivars, 132 local ecotypes, and 37 wild olive specimens from the Iranian plateau, together with 77 of the most represented Mediterranean varieties were investigated using a finely selected panel of 11 microsatellite markers. We organized data in two ‘4-targeted’ and ‘16-targeted’ experiments. A strategy of assaying different machine based analyses (i.e. data cleaning, feature selection, and machine learning classification) was devised to identify the most informative loci and the most diagnostic alleles to represent the population and the geography of each olive accession. These analyses revealed microsatellite markers with the highest differentiating capacity and proved efficiency for our method of clustering olive accessions to reflect upon their regions of origin. A distinguished highlight of this study was the discovery of the best combination of markers for better differentiating of populations via machine learning models, which can be exploited to distinguish among other biological populations. PMID:26599001

  6. A multimodal temporal panorama approach for moving vehicle detection, reconstruction, and classification

    NASA Astrophysics Data System (ADS)

    Wang, Tao; Zhu, Zhigang

    2012-06-01

    Moving vehicle detection and classification using multimodal data is a challenging task in data collection, audio-visual alignment, data labeling and feature selection under uncontrolled environments with occlusions, motion blurs, varying image resolutions and perspective distortions. In this work, we propose an effective multimodal temporal panorama (MTP) approach for the task using a novel long-range audio-visual sensing system. A new audio-visual vehicle (AVV) dataset for moving vehicle detection and classification is created, which features automatic vehicle detection and audio-visual alignment, accurate vehicle extraction and reconstruction, and efficient data labeling. In particular, vehicles' visual images are reconstructed once detected in order to remove most of the occlusions, motion blurs, and variations of perspective views. Multimodal audio-visual features are extracted, including global geometric features (aspect ratios, profiles), local structure features (HOGs), as well as various audio features (MFCCs, etc.). Using radial-based SVMs, the effectiveness of the integration of these multimodal features is thoroughly and systematically studied. The concept of MTP need not be limited to visual, motion and audio modalities; it could also be applicable to other sensing modalities that can obtain data in the temporal domain.

  7. A machine learning approach for classification of anatomical coverage in CT

    NASA Astrophysics Data System (ADS)

    Wang, Xiaoyong; Lo, Pechin; Ramakrishna, Bharath; Goldin, Johnathan; Brown, Matthew

    2016-03-01

    Automatic classification of anatomical coverage of medical images is critical for big data mining and as a pre-processing step to automatically trigger specific computer aided diagnosis systems. The traditional way to identify scans through DICOM headers has various limitations due to manual entry of series descriptions and non-standardized naming conventions. In this study, we present a machine learning approach where multiple binary classifiers were used to classify different anatomical coverages of CT scans. A one-vs-rest strategy was applied. For a given training set, a template scan was selected from the positive samples and all other scans were registered to it. Each registered scan was then evenly split into k × k × k non-overlapping blocks and for each block the mean intensity was computed. This resulted in a 1 × k³ feature vector for each scan. The feature vectors were then used to train an SVM-based classifier. In this feasibility study, four classifiers were built to identify anatomic coverages of brain, chest, abdomen-pelvis, and chest-abdomen-pelvis CT scans. Each classifier was trained and tested using a set of 300 scans from different subjects, composed of 150 positive samples and 150 negative samples. Area under the ROC curve (AUC) of the testing set was measured to evaluate the performance in a two-fold cross validation setting. Our results showed good classification performance with an average AUC of 0.96.
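
    The block-mean feature extraction described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `block_mean_features` is hypothetical, the volume is assumed to be an already registered scan stored as nested lists, and each dimension is assumed to be divisible by k.

```python
def block_mean_features(volume, k):
    """Return the k*k*k block-mean feature vector of a 3-D volume.

    `volume` is a nested list indexed as volume[z][y][x]. Assumption for
    this sketch: every dimension is an exact multiple of k.
    """
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    if depth % k or height % k or width % k:
        raise ValueError("each dimension must be divisible by k in this sketch")
    bd, bh, bw = depth // k, height // k, width // k  # block size per axis
    features = []
    for i in range(k):              # block index along z
        for j in range(k):          # block index along y
            for m in range(k):      # block index along x
                total = 0.0
                for z in range(i * bd, (i + 1) * bd):
                    for y in range(j * bh, (j + 1) * bh):
                        for x in range(m * bw, (m + 1) * bw):
                            total += volume[z][y][x]
                features.append(total / (bd * bh * bw))
    return features


# Toy 2 x 2 x 2 "scan" whose voxel value encodes its position.
toy = [[[4 * z + 2 * y + x for x in range(2)] for y in range(2)] for z in range(2)]
vector = block_mean_features(toy, 2)  # one mean per block, k**3 entries
```

    The resulting list has length k³ and could be passed to any classifier; the SVM training step itself is not shown here.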

  8. Heterogeneous ensemble approach with discriminative features and modified-SMOTEbagging for pre-miRNA classification

    PubMed Central

    Lertampaiporn, Supatcha; Thammarongtham, Chinae; Nukoolkit, Chakarida; Kaewkamnerdpong, Boonserm; Ruengjitchatchawalya, Marasri

    2013-01-01

    An ensemble classifier approach for microRNA precursor (pre-miRNA) classification was proposed based upon combining a set of heterogeneous algorithms including support vector machine (SVM), k-nearest neighbors (kNN) and random forest (RF), then aggregating their predictions through a voting system. In addition to the proposed algorithm, the classification performance was improved using discriminative features, self-containment and its derivatives, which have shown unique structural robustness characteristics of pre-miRNAs. These are applicable across different species. By applying preprocessing methods (both a correlation-based feature selection (CFS) with genetic algorithm (GA) search method and a modified-Synthetic Minority Oversampling Technique (SMOTE) bagging rebalancing method), improvement in the performance of this ensemble was observed. The overall prediction accuracy obtained via 10 runs of 5-fold cross validation (CV) was 96.54%, with sensitivity of 94.8% and specificity of 98.3%; this is a better trade-off between sensitivity and specificity than those of other state-of-the-art methods. The ensemble model was applied to animal, plant and virus pre-miRNA and achieved high accuracy, >93%. Exploiting the discriminative set of selected features also suggests that pre-miRNAs possess high intrinsic structural robustness as compared with other stem loops. Our heterogeneous ensemble method gave a relatively more reliable prediction than those using single classifiers. Our program is available at http://ncrna-pred.com/premiRNA.html. PMID:23012261
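
    The aggregation step of such a heterogeneous ensemble can be illustrated with a small sketch. The abstract does not specify the voting scheme, so an unweighted majority vote is assumed here; the function name `majority_vote` and the toy label lists are hypothetical and stand in for the outputs of already trained SVM, kNN and RF models.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists by unweighted majority vote.

    `predictions` is a list with one label list per classifier; all lists
    must cover the same samples in the same order.
    """
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all classifiers must predict the same samples")
    voted = []
    for labels in zip(*predictions):  # labels given to one sample
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted


# Hypothetical outputs of three classifiers on four samples
# (1 = pre-miRNA, 0 = other stem loop).
svm_out = [1, 0, 1, 1]
knn_out = [1, 1, 0, 1]
rf_out = [0, 0, 1, 1]
ensemble_out = majority_vote([svm_out, knn_out, rf_out])
```

    With an odd number of binary classifiers no ties can occur; other settings would need an explicit tie-breaking rule.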

  9. A High Throughput Ambient Mass Spectrometric Approach to Species Identification and Classification from Chemical Fingerprint Signatures

    NASA Astrophysics Data System (ADS)

    Musah, Rabi A.; Espinoza, Edgard O.; Cody, Robert B.; Lesiak, Ashton D.; Christensen, Earl D.; Moore, Hannah E.; Maleknia, Simin; Drijfhout, Falko P.

    2015-07-01

    A high throughput method for species identification and classification through chemometric processing of direct analysis in real time (DART) mass spectrometry-derived fingerprint signatures has been developed. The method entails introduction of samples to the open air space between the DART ion source and the mass spectrometer inlet, with the entire observed mass spectral fingerprint subjected to unsupervised hierarchical clustering processing. A range of both polar and non-polar chemotypes are instantaneously detected. The result is identification and species level classification based on the entire DART-MS spectrum. Here, we illustrate how the method can be used to: (1) distinguish between endangered woods regulated by the Convention for the International Trade of Endangered Flora and Fauna (CITES) treaty; (2) assess the origin and by extension the properties of biodiesel feedstocks; (3) determine insect species from analysis of puparial casings; (4) distinguish between psychoactive plants products; and (5) differentiate between Eucalyptus species. An advantage of the hierarchical clustering approach to processing of the DART-MS derived fingerprint is that it shows both similarities and differences between species based on their chemotypes. Furthermore, full knowledge of the identities of the constituents contained within the small molecule profile of analyzed samples is not required.

  10. High Throughput Ambient Mass Spectrometric Approach to Species Identification and Classification from Chemical Fingerprint Signatures

    DOE PAGES

    Musah, Rabi A.; Espinoza, Edgard O.; Cody, Robert B.; ...

    2015-07-09

    A high throughput method for species identification and classification through chemometric processing of direct analysis in real time (DART) mass spectrometry-derived fingerprint signatures has been developed. The method entails introduction of samples to the open air space between the DART ion source and the mass spectrometer inlet, with the entire observed mass spectral fingerprint subjected to unsupervised hierarchical clustering processing. Moreover, a range of both polar and non-polar chemotypes are instantaneously detected. The result is identification and species level classification based on the entire DART-MS spectrum. In this paper, we illustrate how the method can be used to: (1) distinguish between endangered woods regulated by the Convention for the International Trade of Endangered Flora and Fauna (CITES) treaty; (2) assess the origin and by extension the properties of biodiesel feedstocks; (3) determine insect species from analysis of puparial casings; (4) distinguish between psychoactive plants products; and (5) differentiate between Eucalyptus species. An advantage of the hierarchical clustering approach to processing of the DART-MS derived fingerprint is that it shows both similarities and differences between species based on their chemotypes. Furthermore, full knowledge of the identities of the constituents contained within the small molecule profile of analyzed samples is not required.

  11. High Throughput Ambient Mass Spectrometric Approach to Species Identification and Classification from Chemical Fingerprint Signatures

    SciTech Connect

    Musah, Rabi A.; Espinoza, Edgard O.; Cody, Robert B.; Lesiak, Ashton D.; Christensen, Earl D.; Moore, Hannah E.; Maleknia, Simin; Drijfhout, Falko P.

    2015-07-09

    A high throughput method for species identification and classification through chemometric processing of direct analysis in real time (DART) mass spectrometry-derived fingerprint signatures has been developed. The method entails introduction of samples to the open air space between the DART ion source and the mass spectrometer inlet, with the entire observed mass spectral fingerprint subjected to unsupervised hierarchical clustering processing. Moreover, a range of both polar and non-polar chemotypes are instantaneously detected. The result is identification and species level classification based on the entire DART-MS spectrum. In this paper, we illustrate how the method can be used to: (1) distinguish between endangered woods regulated by the Convention for the International Trade of Endangered Flora and Fauna (CITES) treaty; (2) assess the origin and by extension the properties of biodiesel feedstocks; (3) determine insect species from analysis of puparial casings; (4) distinguish between psychoactive plants products; and (5) differentiate between Eucalyptus species. An advantage of the hierarchical clustering approach to processing of the DART-MS derived fingerprint is that it shows both similarities and differences between species based on their chemotypes. Furthermore, full knowledge of the identities of the constituents contained within the small molecule profile of analyzed samples is not required.

  12. A High Throughput Ambient Mass Spectrometric Approach to Species Identification and Classification from Chemical Fingerprint Signatures

    PubMed Central

    Musah, Rabi A.; Espinoza, Edgard O.; Cody, Robert B.; Lesiak, Ashton D.; Christensen, Earl D.; Moore, Hannah E.; Maleknia, Simin; Drijfhout, Falko P.

    2015-01-01

    A high throughput method for species identification and classification through chemometric processing of direct analysis in real time (DART) mass spectrometry-derived fingerprint signatures has been developed. The method entails introduction of samples to the open air space between the DART ion source and the mass spectrometer inlet, with the entire observed mass spectral fingerprint subjected to unsupervised hierarchical clustering processing. A range of both polar and non-polar chemotypes are instantaneously detected. The result is identification and species level classification based on the entire DART-MS spectrum. Here, we illustrate how the method can be used to: (1) distinguish between endangered woods regulated by the Convention for the International Trade of Endangered Flora and Fauna (CITES) treaty; (2) assess the origin and by extension the properties of biodiesel feedstocks; (3) determine insect species from analysis of puparial casings; (4) distinguish between psychoactive plants products; and (5) differentiate between Eucalyptus species. An advantage of the hierarchical clustering approach to processing of the DART-MS derived fingerprint is that it shows both similarities and differences between species based on their chemotypes. Furthermore, full knowledge of the identities of the constituents contained within the small molecule profile of analyzed samples is not required. PMID:26156000

  13. Multi-Objective Particle Swarm Optimization Approach for Cost-Based Feature Selection in Classification.

    PubMed

    Zhang, Yong; Gong, Dun-Wei; Cheng, Jian

    2017-01-01

    Feature selection is an important data-preprocessing technique in classification problems such as bioinformatics and signal processing. Generally, there are some situations where a user is interested in not only maximizing the classification performance but also minimizing the cost that may be associated with features. This kind of problem is called cost-based feature selection. However, most existing feature selection approaches treat this task as a single-objective optimization problem. This paper presents the first study of multi-objective particle swarm optimization (PSO) for cost-based feature selection problems. The task of this paper is to generate a Pareto front of nondominated solutions, that is, feature subsets, to meet different requirements of decision-makers in real-world applications. In order to enhance the search capability of the proposed algorithm, a probability-based encoding technology and an effective hybrid operator, together with the ideas of the crowding distance, the external archive, and the Pareto domination relationship, are applied to PSO. The proposed PSO-based multi-objective feature selection algorithm is compared with several multi-objective feature selection algorithms on five benchmark datasets. Experimental results show that the proposed algorithm can automatically evolve a set of nondominated solutions, and it is a highly competitive feature selection method for solving cost-based feature selection problems.
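
    The notion of a Pareto front of nondominated feature subsets can be made concrete with a short sketch. It only shows the domination test and the filtering step, not the PSO search itself; the function names and the (error, cost) pairs are hypothetical, and both objectives are assumed to be minimized.

```python
def dominates(a, b):
    """True if solution a dominates b (all objectives minimized):
    a is no worse in every objective and strictly better in at least one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(solutions):
    """Keep only the solutions that no other solution dominates."""
    return [s for s in solutions
            if not any(dominates(other, s) for other in solutions)]


# Hypothetical feature subsets scored as (classification error, feature cost).
candidates = [(0.10, 5.0), (0.20, 2.0), (0.15, 6.0), (0.30, 1.0), (0.20, 3.0)]
front = pareto_front(candidates)
```

    Here (0.15, 6.0) is dominated by (0.10, 5.0) and (0.20, 3.0) by (0.20, 2.0), so three trade-off solutions remain for a decision-maker to choose from.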

  14. Flood mapping using VHR satellite imagery: a comparison between different classification approaches

    NASA Astrophysics Data System (ADS)

    Franci, Francesca; Boccardo, Piero; Mandanici, Emanuele; Roveri, Elena; Bitelli, Gabriele

    2016-10-01

    Various regions in Europe have suffered from severe flooding over the last decades. Flood disasters often have a broad extent and a high frequency. They are considered the most devastating natural hazards because of the tremendous fatalities, injuries, property damages, economic and social disruption that they cause. In this context, Earth Observation techniques have become a key tool for flood risk and damage assessment. In particular, remote sensing facilitates flood surveying, providing valuable information, e.g. flood occurrence, intensity and progress of flood inundation, spurs and embankments affected/threatened. The present work aims to investigate the use of Very High Resolution satellite imagery for mapping flood-affected areas. The case study is the November 2013 flood event which occurred in the Sardinia region (Italy), affecting a total of 2,700 people and killing 18. The investigated zone extends for 28 km² along the Posada river, from the Maccheronis dam to the mouth in the Tyrrhenian sea. A post-event SPOT 6 image was processed by means of different classification methods, in order to produce the flood map of the analysed area. The unsupervised classification algorithm ISODATA was tested. A pixel-based supervised technique was applied using the Maximum Likelihood algorithm; moreover, the SPOT 6 image was processed by means of object-oriented approaches. The produced flood maps were compared with each other and with an independent data source, in order to evaluate the performance of each method, also in terms of time demand.

  15. An Introduction to Recursive Partitioning: Rationale, Application and Characteristics of Classification and Regression Trees, Bagging and Random Forests

    PubMed Central

    Strobl, Carolin; Malley, James; Tutz, Gerhard

    2010-01-01

    Recursive partitioning methods have become popular and widely used tools for non-parametric regression and classification in many scientific fields. Especially random forests, that can deal with large numbers of predictor variables even in the presence of complex interactions, have been applied successfully in genetics, clinical medicine and bioinformatics within the past few years. High dimensional problems are common not only in genetics, but also in some areas of psychological research, where only few subjects can be measured due to time or cost constraints, yet a large amount of data is generated for each subject. Random forests have been shown to achieve a high prediction accuracy in such applications, and provide descriptive variable importance measures reflecting the impact of each variable in both main effects and interactions. The aim of this work is to introduce the principles of the standard recursive partitioning methods as well as recent methodological improvements, to illustrate their usage for low and high dimensional data exploration, but also to point out limitations of the methods and potential pitfalls in their practical application. Application of the methods is illustrated using freely available implementations in the R system for statistical computing. PMID:19968396

  16. Classification of emotional states from electrocardiogram signals: a non-linear approach based on hurst

    PubMed Central

    2013-01-01

    Background: Identifying the emotional state is helpful in applications involving patients with autism and other intellectual disabilities, as well as in computer-based training, human-computer interaction, etc. Electrocardiogram (ECG) signals, being an activity of the autonomic nervous system (ANS), reflect the underlying true emotional state of a person. However, the performance of various methods developed so far lacks accuracy, and more robust methods need to be developed to identify the emotional pattern associated with ECG signals. Methods: Emotional ECG data was obtained from sixty participants by inducing the six basic emotional states (happiness, sadness, fear, disgust, surprise and neutral) using audio-visual stimuli. The non-linear feature ‘Hurst’ was computed using Rescaled Range Statistics (RRS) and Finite Variance Scaling (FVS) methods. New Hurst features were proposed by combining the existing RRS and FVS methods with Higher Order Statistics (HOS). The features were then classified using four classifiers – Bayesian Classifier, Regression Tree, K-nearest neighbor and Fuzzy K-nearest neighbor. Seventy percent of the features were used for training and thirty percent for testing the algorithm. Results: Analysis of Variance (ANOVA) indicated that Hurst and the proposed features were statistically significant (p < 0.001). Hurst computed using RRS and FVS methods showed similar classification accuracy. The features obtained by combining FVS and HOS performed better, with a maximum accuracy of 92.87% and 76.45% for classifying the six emotional states using random and subject-independent validation, respectively. Conclusions: The results indicate that the combination of non-linear analysis and HOS tends to capture the finer emotional changes that can be seen in healthy ECG data. This work can be further fine-tuned to develop a real-time system. PMID:23680041
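
    As an illustration of the rescaled range idea behind the ‘Hurst’ feature, the following sketch estimates a Hurst exponent from the slope of log(R/S) against log(window size). It is the classical textbook estimator only and makes no attempt to reproduce the FVS or HOS variants of the study; the function names, window sizes and the synthetic white-noise signal are assumptions made for this example.

```python
import math
import random


def rescaled_range(chunk):
    """R/S statistic of one window: range of the cumulative deviations
    from the mean, divided by the (population) standard deviation."""
    n = len(chunk)
    mean = sum(chunk) / n
    cumulative, running = [], 0.0
    for value in chunk:
        running += value - mean
        cumulative.append(running)
    spread = max(cumulative) - min(cumulative)
    std = math.sqrt(sum((v - mean) ** 2 for v in chunk) / n)
    return spread / std if std > 0 else None  # undefined for flat windows


def hurst_rs(series, window_sizes):
    """Least-squares slope of log(mean R/S) versus log(window size)."""
    xs, ys = [], []
    for n in window_sizes:
        values = []
        for start in range(0, len(series) - n + 1, n):  # non-overlapping windows
            rs = rescaled_range(series[start:start + n])
            if rs is not None and rs > 0:
                values.append(rs)
        if values:
            xs.append(math.log(n))
            ys.append(math.log(sum(values) / len(values)))
    if len(xs) < 2:
        raise ValueError("need at least two usable window sizes")
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))


# Synthetic stand-in for a signal: uncorrelated Gaussian noise.
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(1024)]
h_estimate = hurst_rs(noise, [8, 16, 32, 64, 128])
```

    For uncorrelated noise the estimate is expected to lie near 0.5 (somewhat above it for short windows), while persistent signals give larger values.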

  17. A novel approach to phylogenetic tree construction using stochastic optimization and clustering

    PubMed Central

    Qin, Ling; Chen, Yixin; Pan, Yi; Chen, Ling

    2006-01-01

    Background: The problem of inferring the evolutionary history and constructing the phylogenetic tree with high performance has become one of the major problems in computational biology. Results: A new phylogenetic tree construction method from a given set of objects (proteins, species, etc.) is presented. As an extension of ant colony optimization, this method proposes an adaptive phylogenetic clustering algorithm based on a digraph to find a tree structure that defines the ancestral relationships among the given objects. Conclusion: Our phylogenetic tree construction method is tested to compare its results with those of the genetic algorithm (GA). Experimental results show that our algorithm converges much faster and also achieves higher quality than GA. PMID:17217517

  18. The Application of Classification and Regression Trees for the Triage of Women for Referral to Colposcopy and the Estimation of Risk for Cervical Intraepithelial Neoplasia: A Study Based on 1625 Cases with Incomplete Data from Molecular Tests

    PubMed Central

    Pouliakis, Abraham; Karakitsou, Efrossyni; Chrelias, Charalampos; Pappas, Asimakis; Panayiotides, Ioannis; Valasoulis, George; Kyrgiou, Maria; Paraskevaidis, Evangelos; Karakitsos, Petros

    2015-01-01

    Objective. Numerous ancillary techniques detecting HPV DNA and mRNA now compete with cytology; however, no perfect test exists. In this study we evaluated classification and regression trees (CARTs) for producing triage rules and estimating the risk for cervical intraepithelial neoplasia (CIN) in cases with ASCUS+ in cytology. Study Design. We used 1625 cases. In contrast to other approaches, we included cases with missing data to increase the data volume, obtain more accurate results, and simulate real conditions in the everyday practice of gynecologic clinics and laboratories. The proposed CART was based on the cytological result, HPV DNA typing, HPV mRNA detection based on NASBA and flow cytometry, p16 immunocytochemical expression, and finally age and parous status. Results. Algorithms useful for the triage of women were produced; gynecologists could apply these in conjunction with available examination results to arrive at an estimate of the risk that a woman harbors CIN, expressed as a probability. Conclusions. The most important test was the cytological examination; however, the CART handled cases with inadequate cytological outcome and increased the diagnostic accuracy by exploiting the results of ancillary techniques even when data were inadequate or missing. The CART performance was better than that of any single test involved in this study. PMID:26339651

  19. Schizophrenia detection and classification by advanced analysis of EEG recordings using a single electrode approach.

    PubMed

    Dvey-Aharon, Zack; Fogelson, Noa; Peled, Avi; Intrator, Nathan

    2015-01-01

    Electroencephalographic (EEG) analysis has emerged as a powerful tool for brain state interpretation and diagnosis, but not for the diagnosis of mental disorders; this may be explained by its low spatial resolution or depth sensitivity. This paper concerns the diagnosis of schizophrenia using EEG, which currently suffers from several cardinal problems: it heavily depends on assumptions, conditions and prior knowledge regarding the patient. Additionally, the diagnostic experiments take hours, and the accuracy of the analysis is low or unreliable. This article presents the "TFFO" (Time-Frequency transformation followed by Feature-Optimization), a novel approach for schizophrenia detection showing great success in classification accuracy with no false positives. The methodology is designed for single electrode recording, and it attempts to make the data acquisition process feasible and quick for most patients.

  20. Use of a Novel Grammatical Inference Approach in Classification of Amyloidogenic Hexapeptides.

    PubMed

    Wieczorek, Wojciech; Unold, Olgierd

    2016-01-01

    The present paper is a novel contribution to the field of bioinformatics by using grammatical inference in the analysis of data. We developed an algorithm for generating star-free regular expressions which turned out to be good recommendation tools, as they are characterized by a relatively high correlation coefficient between the observed and predicted binary classifications. The experiments have been performed for three datasets of amyloidogenic hexapeptides, and our results are compared with those obtained using the graph approaches, the current state-of-the-art methods in heuristic automata induction, and the support vector machine. The results showed the superior performance of the new grammatical inference algorithm on fixed-length amyloid datasets.

  1. ISOLATING CONTENT AND METADATA FROM WEBLOGS USING CLASSIFICATION AND RULE-BASED APPROACHES

    SciTech Connect

    Marshall, Eric J.; Bell, Eric B.

    2011-09-04

    The emergence and increasing prevalence of social media, such as internet forums, weblogs (blogs), wikis, etc., has created a new opportunity to measure public opinion, attitude, and social structures. A major challenge in leveraging this information is isolating the content and metadata in weblogs, as there is no standard, universally supported, machine-readable format for presenting this information. We present two algorithms for isolating this information. The first uses web block classification, where each node in the Document Object Model (DOM) for a page is classified according to one of several pre-defined attributes from a common blog schema. The second uses a set of heuristics to select web blocks. These algorithms perform at a level suitable for initial use, validating this approach for isolating content and metadata from blogs. The resultant data serves as a starting point for analytical work on the content and substance of collections of weblog pages.

  2. Sensitivity of Bovine Tuberculosis Surveillance in Wildlife in France: A Scenario Tree Approach.

    PubMed

    Rivière, Julie; Le Strat, Yann; Dufour, Barbara; Hendrikx, Pascal

    2015-01-01

    Bovine tuberculosis (bTB) is a common disease in cattle and wildlife, with an impact on animal and human health, and economic implications. Infected wild animals have been detected in some European countries, and bTB reservoirs in wildlife have been identified, potentially hindering the eradication of bTB from cattle populations. However, the surveillance of bTB in wildlife involves several practical difficulties and is not currently covered by EU legislation. We report here the first assessment of the sensitivity of the bTB surveillance system for free-ranging wildlife launched in France in 2011 (the Sylvatub system), based on scenario tree modelling. Three surveillance system components were identified: (i) passive scanning surveillance for hunted wild boar, red deer and roe deer, based on carcass examination, (ii) passive surveillance on animals found dead, moribund or with abnormal behaviour, for wild boar, red deer, roe deer and badger and (iii) active surveillance for wild boar and badger. The application of these three surveillance system components depends on the geographic risk of bTB infection in wildlife, which in turn depends on the prevalence of bTB in cattle. We estimated the effectiveness of the three components of the Sylvatub surveillance system quantitatively, for each species separately. Active surveillance and passive scanning surveillance by carcass examination were the approaches most likely to detect at least one infected animal in a population with a given design prevalence, regardless of the local risk level and species considered. The awareness of hunters, which depends on their training and the geographic risk, was found to affect surveillance sensitivity. 
    The results obtained are relevant for hunters and veterinary authorities wishing to determine the actual efficacy of wildlife bTB surveillance as a function of geographic area and species, and could provide support for decision-making processes concerning the enhancement of surveillance
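
    The quantity "probability of detecting at least one infected animal at a given design prevalence" can be illustrated with a simplified calculation. The sketch below uses the common binomial approximation and assumes independent animals and independent surveillance components; it is not the Sylvatub scenario tree, which involves more nodes (species, risk level, hunter awareness). All function names and numbers are hypothetical.

```python
def component_sensitivity(n_animals, design_prevalence, unit_sensitivity):
    """Probability that a component detects at least one infected animal.

    Each examined animal is infected with probability `design_prevalence`
    and, if infected, detected with probability `unit_sensitivity`
    (animals assumed independent in this sketch).
    """
    p_detect_one = design_prevalence * unit_sensitivity
    return 1.0 - (1.0 - p_detect_one) ** n_animals


def system_sensitivity(component_sensitivities):
    """Combine components assumed independent: the system misses the
    infection only if every component misses it."""
    p_all_miss = 1.0
    for cse in component_sensitivities:
        p_all_miss *= 1.0 - cse
    return 1.0 - p_all_miss


# Hypothetical figures: 200 carcasses examined, 3% design prevalence,
# 40% chance that an infected carcass is recognised and confirmed.
passive_scanning = component_sensitivity(200, 0.03, 0.40)
# Hypothetical active component: 50 animals sampled, 90% test sensitivity.
active = component_sensitivity(50, 0.03, 0.90)
overall = system_sensitivity([passive_scanning, active])
```

    Because the combined miss probability is a product, adding any component with non-zero sensitivity can only raise the overall value.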

  3. Sensitivity of Bovine Tuberculosis Surveillance in Wildlife in France: A Scenario Tree Approach

    PubMed Central

    Rivière, Julie

    2015-01-01

    Bovine tuberculosis (bTB) is a common disease in cattle and wildlife, with an impact on animal and human health, and economic implications. Infected wild animals have been detected in some European countries, and bTB reservoirs in wildlife have been identified, potentially hindering the eradication of bTB from cattle populations. However, the surveillance of bTB in wildlife involves several practical difficulties and is not currently covered by EU legislation. We report here the first assessment of the sensitivity of the bTB surveillance system for free-ranging wildlife launched in France in 2011 (the Sylvatub system), based on scenario tree modelling. Three surveillance system components were identified: (i) passive scanning surveillance for hunted wild boar, red deer and roe deer, based on carcass examination, (ii) passive surveillance on animals found dead, moribund or with abnormal behaviour, for wild boar, red deer, roe deer and badger and (iii) active surveillance for wild boar and badger. The application of these three surveillance system components depends on the geographic risk of bTB infection in wildlife, which in turn depends on the prevalence of bTB in cattle. We estimated the effectiveness of the three components of the Sylvatub surveillance system quantitatively, for each species separately. Active surveillance and passive scanning surveillance by carcass examination were the approaches most likely to detect at least one infected animal in a population with a given design prevalence, regardless of the local risk level and species considered. The awareness of hunters, which depends on their training and the geographic risk, was found to affect surveillance sensitivity. 
    The results obtained are relevant for hunters and veterinary authorities wishing to determine the actual efficacy of wildlife bTB surveillance as a function of geographic area and species, and could provide support for decision-making processes concerning the enhancement of surveillance

  4. Assessment of the potential enhancement of rural food security in Mexico using decision tree land use classification on medium resolution satellite imagery

    NASA Astrophysics Data System (ADS)

    Bermeo, A.; Couturier, S.

    2017-01-01

    Because of its renewed importance in international agendas, food security in sub-tropical countries has been the object of studies at different scales, although the spatial components of food security are still largely undocumented. Among other aspects, food security can be assessed using a food self-sufficiency index. We propose a spatial representation of this assessment in the densely populated rural area of the Huasteca Poblana, Mexico, where there is a known tendency towards the loss of self-sufficiency of basic grains. The main agricultural systems in this area are the traditional milpa (a multicrop practice with maize as the main basic crop) system, coffee plantations and grazing land for bovine livestock. We estimate a potential additional milpa-based maize production by smallholders by identifying the presence of extensive coffee and pasture systems in the production data of the agricultural census. The surface of extensive coffee plantations and pasture land was estimated using the detailed coffee agricultural census data, and a decision tree combining unsupervised and supervised spectral classification techniques of medium scale (Landsat) satellite imagery. We find that 30% of the territory would benefit from an increase of more than 50% in food security and 13% could theoretically become maize self-sufficient from the conversion of extensive systems to the traditional multicrop milpa system.

  5. Identifying changes in dissolved organic matter content and characteristics by fluorescence spectroscopy coupled with self-organizing map and classification and regression tree analysis during wastewater treatment.

    PubMed

    Yu, Huibin; Song, Yonghui; Liu, Ruixia; Pan, Hongwei; Xiang, Liancheng; Qian, Feng

    2014-10-01

    The stabilization of latent tracers of dissolved organic matter (DOM) of wastewater was analyzed by three-dimensional excitation-emission matrix (EEM) fluorescence spectroscopy coupled with self-organizing map and classification and regression tree analysis (CART) to assess wastewater treatment performance. DOM of water samples collected from primary sedimentation, anaerobic, anoxic, oxic and secondary sedimentation tanks in a large-scale wastewater treatment plant contained four fluorescence components: tryptophan-like (C1), tyrosine-like (C2), microbial humic-like (C3) and fulvic-like (C4) materials extracted by self-organizing map. These components showed good positive linear correlations with dissolved organic carbon of DOM. C1 and C2 were representative components in the wastewater, and they were removed to a higher extent than C3 and C4 in the treatment process. C2 was a latent parameter determined by CART to differentiate water samples of oxic and secondary sedimentation tanks from the successive treatment units, indirectly indicating that most of the tyrosine-like material was degraded by anaerobic microorganisms. C1 was an accurate parameter to comprehensively separate the samples of the five treatment units from each other, indirectly indicating that tryptophan-like material was decomposed by anaerobic and aerobic bacteria. EEM fluorescence spectroscopy in combination with self-organizing map and CART analysis can be an effective nondestructive method for characterizing structural components of DOM fractions and monitoring organic matter removal in the wastewater treatment process.

  6. An iterative approach to optimize change classification in SAR time series data

    NASA Astrophysics Data System (ADS)

    Boldt, Markus; Thiele, Antje; Schulz, Karsten; Hinz, Stefan

    2016-10-01

    The detection of changes using remote sensing imagery has become a broad field of research with many approaches for many different applications. Besides the simple detection of changes between at least two images acquired at different times, analyses which aim at the change type or category are at least equally important. In this study, an approach for a semi-automatic classification of change segments is presented. A sparse dataset is considered to ensure fast and simple applicability to practical issues. The dataset is given by 15 high resolution (HR) TerraSAR-X (TSX) amplitude images acquired over a time period of one year (11/2013 to 11/2014). The scenery contains the airport of Stuttgart (GER) and its surroundings, including urban, rural, and suburban areas. Time series imagery offers the advantage of analyzing the change frequency of selected areas. In this study, the focus is set on the analysis of small-sized, frequently changing regions like parking areas, construction sites and collecting points consisting of high activity (HA) change objects. For each HA change object, suitable features are extracted and a k-means clustering is applied as the categorization step. Resulting clusters are finally compared to a previously introduced knowledge-based class catalogue, which is modified until an optimal class description results. In other words, the subjective understanding of the scenery semantics is optimized by the reality given by the data. In this way, even a sparse dataset containing only amplitude imagery can be evaluated without requiring comprehensive training datasets. Falsely defined classes might be rejected. Furthermore, classes which were defined too coarsely might be divided into sub-classes. Conversely, classes which were initially defined too narrowly might be merged. An optimal classification results when the combination of previously defined key indicators (e.g., number of clusters per class) reaches an optimum.

  7. One Approach to Classification of Users and Automatic Clustering of Documents.

    ERIC Educational Resources Information Center

    Frants, Valery I.; And Others

    1993-01-01

    Shows how to automatically construct a classification of users and a clustering of documents and cross-references among clusters based on users' information needs. Feedback in the construction of this classification and clustering that allows for the classification to be changed to reflect changing needs of users is also described. (22 references)…

  8. An allometry-based approach for understanding forest structure, predicting tree-size distribution and assessing the degree of disturbance

    PubMed Central

    Anfodillo, Tommaso; Carrer, Marco; Simini, Filippo; Popa, Ionel; Banavar, Jayanth R.; Maritan, Amos

    2013-01-01

    Tree-size distribution is one of the most investigated subjects in plant population biology. The forestry literature reports that tree-size distribution trajectories vary across different stands and/or species, whereas the metabolic scaling theory suggests that the tree number scales universally as −2 power of diameter. Here, we propose a simple functional scaling model in which these two opposing results are reconciled. Basic principles related to crown shape, energy optimization and the finite-size scaling approach were used to define a set of relationships based on a single parameter that allows us to predict the slope of the tree-size distributions in a steady-state condition. We tested the model predictions on four temperate mountain forests. Plots (4 ha each, fully mapped) were selected with different degrees of human disturbance (semi-natural stands versus formerly managed). Results showed that the size distribution range successfully fitted by the model is related to the degree of forest disturbance: in semi-natural forests the range is wide, whereas in formerly managed forests, the agreement with the model is confined to a very restricted range. We argue that simple allometric relationships, at an individual level, shape the structure of the whole forest community. PMID:23193128
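
    The scaling statement above (tree number proportional to the −2 power of diameter) corresponds to a straight line of slope −2 on log-log axes. The sketch below estimates that slope by least squares; the diameter classes and stem counts are synthetic and generated to follow the power law exactly, so they are not data from the study.

```python
import numpy as np

# Hypothetical diameter classes (cm) and stem counts following N = c * d**-2 exactly.
diameters = np.array([5.0, 10.0, 20.0, 40.0, 80.0])
counts = 10000.0 * diameters ** -2.0

# A power law N = c * d**a is linear in log-log space: log N = log c + a * log d.
slope, intercept = np.polyfit(np.log(diameters), np.log(counts), 1)
print(f"estimated exponent: {slope:.2f}")
```

    With field data the fit would only be expected to hold over a limited diameter range, which is the quantity the abstract relates to the degree of disturbance.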

  9. An allometry-based approach for understanding forest structure, predicting tree-size distribution and assessing the degree of disturbance.

    PubMed

    Anfodillo, Tommaso; Carrer, Marco; Simini, Filippo; Popa, Ionel; Banavar, Jayanth R; Maritan, Amos

    2013-01-22

    Tree-size distribution is one of the most investigated subjects in plant population biology. The forestry literature reports that tree-size distribution trajectories vary across different stands and/or species, whereas the metabolic scaling theory suggests that the tree number scales universally as -2 power of diameter. Here, we propose a simple functional scaling model in which these two opposing results are reconciled. Basic principles related to crown shape, energy optimization and the finite-size scaling approach were used to define a set of relationships based on a single parameter that allows us to predict the slope of the tree-size distributions in a steady-state condition. We tested the model predictions on four temperate mountain forests. Plots (4 ha each, fully mapped) were selected with different degrees of human disturbance (semi-natural stands versus formerly managed). Results showed that the size distribution range successfully fitted by the model is related to the degree of forest disturbance: in semi-natural forests the range is wide, whereas in formerly managed forests, the agreement with the model is confined to a very restricted range. We argue that simple allometric relationships, at an individual level, shape the structure of the whole forest community.

  10. Classification of boreal forest by satellite and inventory data using neural network approach

    NASA Astrophysics Data System (ADS)

    Romanov, A. A.

    2012-12-01

    The main objective of this research was to develop a methodology for boreal (Siberian taiga) land cover classification at a high level of accuracy. The study area covers several parts of Central Siberia along the Yenisei River (60-62 degrees north latitude): the right bank includes mixed forest and dark taiga, the left bank pine forests; these were taken as highly heterogeneous surfaces that are statistically equal with respect to spectral characteristics. Two main types of data were used: time series of medium spatial resolution satellite images (Landsat 5, 7 and SPOT4) and inventory datasets from field work (used to prepare the training sample sets). The method of collecting field datasets included a short botanical description (type/species of vegetation, density, compactness of the crowns, individual height and max/min diameters representative of each type, surface altitude of the plot); at the same time, the geometric characteristics of each training sample unit corresponded to the spatial resolution of the satellite images and were geo-referenced (datasets were prepared for both preliminary processing and verification). The network of test plots was planned as irregular and determined by a landscape-oriented approach. The main focus of the thematic data processing was the use of neural networks (including fuzzy logic); therefore, the results of the field studies were converted into input parameters describing the type/species of vegetation cover of each unit and its degree of variability. The proposed approach involves processing the time series separately for each image, mainly for verification: acquisition parameters (time, albedo) are taken into consideration and are thus expected to allow the quality of the mapping to be assessed. The input variables for the networks were thus sensor bands, surface altitude, solar angles and land surface temperature (for a few experiments); attention was also given to the formation of the formula class on the basis of statistical pre-processing of results of

  11. Clinical features of organophosphate poisoning: A review of different classification systems and approaches

    PubMed Central

    Peter, John Victor; Sudarsan, Thomas Isiah; Moran, John L.

    2014-01-01

    Purpose: The typical toxidrome in organophosphate (OP) poisoning comprises the Salivation, Lacrimation, Urination, Defecation, Gastric cramps, Emesis (SLUDGE) symptoms. However, several other manifestations are described. We review the spectrum of symptoms and signs in OP poisoning as well as the different approaches to clinical features in these patients. Materials and Methods: Articles were obtained by electronic search of PubMed® between 1966 and April 2014 using the search terms organophosphorus compounds or phosphoric acid esters AND poison or poisoning AND manifestations. Results: Of the 5026 articles on OP poisoning, 2584 articles pertained to human poisoning; 452 articles focusing on clinical manifestations in human OP poisoning were retrieved for detailed evaluation. In addition to the traditional approach of symptoms and signs of OP poisoning as peripheral (muscarinic, nicotinic) and central nervous system receptor stimulation, symptoms were alternatively approached using a time-based classification. In this, symptom onset was categorized as acute (within 24-h), delayed (24-h to 2-week) or late (beyond 2-week). Although most symptoms occur within minutes or hours following acute exposure, delayed onset symptoms occurring after a period of minimal or mild symptoms may impact treatment and timing of the discharge following acute exposure. Symptoms and signs were also viewed in an organ-specific manner, as cardiovascular, respiratory or neurological manifestations. An organ-specific approach enables focused management of individual organ dysfunction that may vary with different OP compounds. Conclusions: Different approaches to the symptoms and signs in OP poisoning may better our understanding of the underlying mechanism that in turn may assist with the management of acutely poisoned patients. PMID:25425841

  12. State of the Art Approach to the Classification of Epileptic Seizures and Epilepsies

    PubMed Central

    BARÇIN, Ebru; AKTEKİN, Berrin

    2014-01-01

    In the light of the latest knowledge acquired from clinical and laboratory research dealing with genetic, molecular biology and neuroimaging, existing classifications were successively revised by the International League Against Epilepsy (ILAE) in 2001, 2006, and 2010. In the latest classification established in 2010, proposals articulated radical changes in terms of concepts and definitions of the previously published classifications and put forward new classifications for epileptic seizures, epilepsies and electroclinical syndromes. This review refers to the changes of the new classification with their reasons and criticisms.

  13. Single event and TREE latchup mitigation for a star tracker sensor: An innovative approach to system level latchup mitigation

    SciTech Connect

    Kimbrough, J.R.; Colella, N.J.; Davis, R.W.; Bruener, D.B.; Coakley, P.G.; Lutjens, S.W.; Mallon, C.E.

    1994-08-01

    Electronic packages designed for spacecraft should be fault-tolerant and operate without ground control intervention through extremes in the space radiation environment. If designed for military use, the electronics must survive and function in a nuclear radiation environment. This paper presents an innovative "blink" approach rather than the typical "operate through" approach to achieve system level latchup mitigation on a prototype star tracker camera. Included are circuit designs, flash x-ray test data, and heavy ion data demonstrating latchup mitigation protecting micro-electronics from current latchup and burnout due to Single Event Latchup (SEL) and Transient Radiation Effects on Electronics (TREE).

  14. Assessment of the classification abilities of the CNS multi-parametric optimization approach by the method of logistic regression.

    PubMed

    Raevsky, O A; Polianczyk, D E; Mukhametov, A; Grigorev, V Y

    2016-08-01

    Assessment of "CNS drugs/CNS candidates" classification abilities of the multi-parametric optimization (CNS MPO) approach was performed by logistic regression. It was found that five of the six separately used physical-chemical properties (topological polar surface area, number of hydrogen-bonded donor atoms, basicity, lipophilicity of compound in neutral form and at pH = 7.4) provided accuracy of recognition below 60%. Only the descriptor of molecular weight (MW) could correctly classify two-thirds of the studied compounds. Aggregation of all six properties in the MPOscore did not improve the classification, which was worse than the classification using only MW. The results of our study demonstrate the imperfection of the CNS MPO approach; in its current form it is not very useful for computer design of new, effective CNS drugs.
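
    As an illustration of how a single descriptor can be assessed by logistic regression, the sketch below fits a one-variable model with plain gradient descent and reports its recognition accuracy. Everything in it is assumed for demonstration: the molecular weights are synthetic, the labelling rule (MW below 400 counts as a CNS drug) is invented so that the example is deterministic, and `fit_logistic` is a hypothetical helper rather than the authors' procedure.

```python
import numpy as np

def fit_logistic(x, y, lr=0.5, n_iter=2000):
    """Fit p(y=1|x) = sigmoid(w*x + b) by gradient descent on the mean log-loss."""
    w, b = 0.0, 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(w * x + b)))
        w -= lr * np.mean((p - y) * x)   # gradient with respect to w
        b -= lr * np.mean(p - y)         # gradient with respect to b
    return w, b

# Synthetic descriptor values and an invented labelling rule (assumption, not study data).
mw = np.linspace(150.0, 600.0, 40)
is_cns = (mw < 400.0).astype(float)

# Standardize the descriptor so that a fixed learning rate behaves well.
x = (mw - mw.mean()) / mw.std()
w, b = fit_logistic(x, is_cns)

pred = (1.0 / (1.0 + np.exp(-(w * x + b)))) >= 0.5
accuracy = float(np.mean(pred == (is_cns == 1.0)))
print(f"w={w:.2f}, b={b:.2f}, accuracy={accuracy:.2f}")
```

    On real compounds the classes overlap, so the accuracy would be well below what this separable toy data gives; the abstract reports roughly two-thirds for MW.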

  15. Tracing carbon uptake from a natural CO2 spring into tree rings: an isotope approach.

    PubMed

    Saurer, Matthias; Cherubini, Paolo; Bonani, Georges; Siegwolf, Rolf

    2003-10-01

    We analyzed 14C, 13C and 18O isotope variations over a 50-year period in tree rings of Quercus ilex L. trees growing at a natural CO2 spring in a Mediterranean ecosystem. We compared trees from two sites, one with high and one with low exposure to CO2 from the spring. The spring CO2 is free of 14C. Thus, this carbon can be traced in the wood, and the amount originating from the spring calculated. The amount decreased over time, from about 40% in 1950 to 15% at present for the site near the spring, indicating a potential difficulty in the use of natural CO2 springs for elevated CO2 research. The reason for the decrease may be decreasing emission from the spring or changes in stand structure, e.g., growth of the canopy into regions with lower concentrations. We used the 14C-calculated CO2 concentration in the canopy to determine the 13C discrimination of the plants growing under elevated CO2 by calculating the effective canopy air 13C/12C isotopic composition. The trees near the spring showed a 2.5 per thousand larger 13C discrimination than the more distant trees at the beginning of the investigated period, i.e., for the young trees, but this difference gradually disappeared. Higher discrimination under elevated CO2 indicated reduced photosynthetic capacity or increased stomatal conductance. The latter assumption is unlikely as inferred from the 18O data, which were insensitive to CO2 concentration. In conclusion, we found evidence for a downward adjustment of photosynthesis under elevated CO2 in Q. ilex in this dry, nutrient-poor environment.
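
    Because the spring CO2 contains no 14C, the spring-derived fraction of wood carbon follows from a two-source mixing balance. The form below is the generic textbook balance, written with assumed symbols: A_sample is the 14C activity measured in a tree ring, A_bg is the activity of background atmospheric CO2 for the same year, and f is the fraction of carbon from the spring. The abstract does not give the authors' exact formulation.

```latex
A_{\text{sample}} = (1 - f)\,A_{\text{bg}} + f \cdot 0
\quad\Longrightarrow\quad
f = 1 - \frac{A_{\text{sample}}}{A_{\text{bg}}}
```

    As a purely arithmetic illustration, a ratio A_sample/A_bg of 0.60 gives f = 0.40 and a ratio of 0.85 gives f = 0.15, which correspond to the 40% and 15% contributions quoted above.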

  16. Kane's equations of flexible multibody systems with tree structure - A computer-oriented modeling approach

    NASA Astrophysics Data System (ADS)

    Jin, Liang; Bauer, Helmut F.

    1991-09-01

    Kane's dynamical model of flexible multibody space systems with tree structure is developed in this paper. The system topology is restricted to a tree configuration which is defined as an arbitrary set of flexible and rigid bodies connected by hinges characterizing relative translations and rotations of two adjoining bodies. The relative translational velocities, angular velocities, and the differential of model coordinates are selected as the generalized velocities. The motion equations of minimum dimension are derived via Kane's method. The resulting equations are suitable for automatic generation and computer simulation.
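
    For orientation, Kane's equations in their standard textbook form for a system of particles are given below. The notation is assumed here and is not taken from the paper: n is the number of generalized speeds, v_r^{P_i} is the partial velocity of particle P_i with respect to the r-th generalized speed, R_i is the resultant active force on P_i, and m_i and a_i are its mass and acceleration. The paper's version for flexible bodies would add further terms that the abstract does not spell out.

```latex
F_r + F_r^{*} = 0, \qquad r = 1, \dots, n,
\qquad
F_r = \sum_{i} \mathbf{v}_r^{P_i} \cdot \mathbf{R}_i,
\qquad
F_r^{*} = -\sum_{i} m_i \, \mathbf{v}_r^{P_i} \cdot \mathbf{a}_i
```

    Each generalized speed contributes one scalar equation, which is why the abstract describes the resulting set as being of minimum dimension.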

  17. An Effective Big Data Supervised Imbalanced Classification Approach for Ortholog Detection in Related Yeast Species

    PubMed Central

    Galpert, Deborah; del Río, Sara; Herrera, Francisco; Ancede-Gallardo, Evys; Antunes, Agostinho; Agüero-Chapin, Guillermin

    2015-01-01

    Orthology detection requires more effective scaling algorithms. In this paper, a set of gene pair features based on similarity measures (alignment scores, sequence length, gene membership to conserved regions, and physicochemical profiles) are combined in a supervised pairwise ortholog detection approach to improve effectiveness considering low ortholog ratios in relation to the possible pairwise comparison between two genomes. In this scenario, big data supervised classifiers managing imbalance between ortholog and nonortholog pair classes allow for an effective scaling solution built from two genomes and extended to other genome pairs. The supervised approach was compared with RBH, RSD, and OMA algorithms by using the following yeast genome pairs: Saccharomyces cerevisiae-Kluyveromyces lactis, Saccharomyces cerevisiae-Candida glabrata, and Saccharomyces cerevisiae-Schizosaccharomyces pombe as benchmark datasets. Because of the large amount of imbalanced data, the building and testing of the supervised model were only possible by using big data supervised classifiers managing imbalance. Evaluation metrics taking low ortholog ratios into account were applied. From the effectiveness perspective, MapReduce Random Oversampling combined with Spark SVM outperformed RBH, RSD, and OMA, probably because of the consideration of gene pair features beyond alignment similarities combined with the advances in big data supervised classification. PMID:26605337

  18. Assessment of Sampling Approaches for Remote Sensing Image Classification in the Iranian Playa Margins

    NASA Astrophysics Data System (ADS)

    Kazem Alavipanah, Seyed

    There are some problems in soil salinity studies based upon remotely sensed data: 1) the spectral world is full of ambiguity and therefore soil reflectance cannot be attributed to a single soil property such as salinity, 2) soil surface conditions as a function of time and space are a complex phenomenon, 3) vegetation with a dynamic biological nature may create some problems in the study of soil salinity. Due to these problems, the first question which may arise is how to overcome or minimise them. In this study we hypothesised that different sources of data, a well-established sampling plan and an optimum approach could be useful. In order to choose representative training sites in the Iranian playa margins, to define the spectral and informational classes and to overcome some problems encountered in the variation within the field, the following attempts were made: 1) Principal Component Analysis (PCA) in order a) to determine the most important variables and b) to understand the Landsat satellite images and the most informative components; 2) the photomorphic unit (PMU) consideration and interpretation; 3) study of salt accumulation and salt distribution in the soil profile; 4) use of several forms of field data, such as geologic, geomorphologic and soil information; 6) confirmation of field data and land cover types with farmers and the members of the team. The results led us to suitable approaches with a high and acceptable image classification accuracy and image interpretation. KEY WORDS: Photo Morphic Unit, Principal Component Analysis, Soil Salinity, Field Work, Remote Sensing

  19. An Abstract Description Approach to the Discovery and Classification of Bioinformatics Web Sources

    SciTech Connect

    Rocco, D; Critchlow, T J

    2003-05-01

    The World Wide Web provides an incredible resource to genomics researchers in the form of dynamic data sources--e.g. BLAST sequence homology search interfaces. The growth rate of these sources outpaces the speed at which they can be manually classified, meaning that the available data is not being utilized to its full potential. Existing research has not addressed the problems of automatically locating, classifying, and integrating classes of bioinformatics data sources. This paper presents an overview of a system for finding classes of bioinformatics data sources and integrating them behind a unified interface. We examine an approach to classifying these sources automatically that relies on an abstract description format: the service class description. This format allows a domain expert to describe the important features of an entire class of services without tying that description to any particular Web source. We present the features of this description format in the context of BLAST sources to show how the service class description relates to Web sources that are being described. We then show how a service class description can be used to classify an arbitrary Web source to determine if that source is an instance of the described service. To validate the effectiveness of this approach, we have constructed a prototype that can correctly classify approximately two-thirds of the BLAST sources we tested. We then examine these results, consider the factors that affect correct automatic classification, and discuss future work.

  20. A self-trained semisupervised SVM approach to the remote sensing land cover classification

    NASA Astrophysics Data System (ADS)

    Liu, Ying; Zhang, Bai; Wang, Li-min; Wang, Nan

    2013-09-01

    Support vector machines (SVM) are nowadays receiving increasing attention in remote sensing applications although this technique is very sensitive to the parameters setting and training set definition. Self-training is an effective semisupervised method, which can reduce the effort needed to prepare the training set by training the model with a small number of labeled examples and an additional set of unlabeled examples. In this study, a novel semisupervised SVM model that uses self-training approach is proposed to address the problem of remote sensing land cover classification. The key characteristics of this approach are that (1) the self-adaptive mutation particle swarm optimization algorithm is introduced to get the optimum parameters that improve the generalization performance of the SVM classifier, and (2) the Gustafson-Kessel fuzzy clustering algorithm is proposed for the selection of unlabeled points to reduce the impact of ineffective labels. The effectiveness of the proposed technique is evaluated firstly with samples from remote sensing data and then by identifying different land cover regions in the remote sensing imagery. Experimental results show that accuracy level is increased by applying this learning scheme, which results in the smallest generalization error compared with the other schemes.
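
    The self-training idea described above can be sketched as a loop that repeatedly labels the unlabeled points the current model is most confident about and retrains on the enlarged set. To keep the sketch dependency-free, a nearest-centroid classifier stands in for the SVM and a simple distance margin stands in for the Gustafson-Kessel-based selection; both are substitutions, and all function names and the synthetic two-class data are hypothetical.

```python
import numpy as np

def fit_centroids(X, y):
    """Stand-in base classifier: one mean vector per class."""
    classes = np.unique(y)
    return classes, np.array([X[y == c].mean(axis=0) for c in classes])

def predict_with_margin(X, classes, centroids):
    """Predict the nearest class; the margin is the gap to the second-nearest centroid."""
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    ordered = np.sort(d, axis=1)
    return classes[d.argmin(axis=1)], ordered[:, 1] - ordered[:, 0]

def self_train(X_lab, y_lab, X_unlab, rounds=5, batch=10):
    """Add the most confidently predicted unlabeled points as pseudo-labels each round."""
    X_lab, y_lab, pool = X_lab.copy(), y_lab.copy(), X_unlab.copy()
    for _ in range(rounds):
        if len(pool) == 0:
            break
        classes, centroids = fit_centroids(X_lab, y_lab)
        pred, margin = predict_with_margin(pool, classes, centroids)
        top = np.argsort(margin)[::-1][:batch]      # largest margins first
        X_lab = np.vstack([X_lab, pool[top]])
        y_lab = np.concatenate([y_lab, pred[top]])
        pool = np.delete(pool, top, axis=0)
    return fit_centroids(X_lab, y_lab)

# Synthetic two-class data: 3 labeled and 40 unlabeled samples per class.
rng = np.random.default_rng(0)
class_a = rng.normal([0.0, 0.0], 1.0, size=(43, 2))
class_b = rng.normal([5.0, 5.0], 1.0, size=(43, 2))
X_lab = np.vstack([class_a[:3], class_b[:3]])
y_lab = np.array([0, 0, 0, 1, 1, 1])
X_unlab = np.vstack([class_a[3:], class_b[3:]])
y_unlab_true = np.array([0] * 40 + [1] * 40)

classes, centroids = self_train(X_lab, y_lab, X_unlab)
pred, _ = predict_with_margin(X_unlab, classes, centroids)
accuracy = float(np.mean(pred == y_unlab_true))
print(f"accuracy on the unlabeled pool: {accuracy:.2f}")
```

    Selecting only high-margin points is one common way to limit the damage done by wrong pseudo-labels, which is the concern the abstract addresses with its fuzzy clustering step.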

  1. A Mechanism-based 3D-QSAR Approach for Classification ...

    EPA Pesticide Factsheets

    Organophosphate (OP) and carbamate esters can inhibit acetylcholinesterase (AChE) by binding covalently to a serine residue in the enzyme active site, and their inhibitory potency depends largely on affinity for the enzyme and the reactivity of the ester. Despite this understanding, there has been no mechanism-based in silico approach for classification and prediction of the inhibitory potency of either OPs or carbamates. This prompted us to develop a three dimensional prediction framework for OPs, carbamates, and their analogs. Inhibitory structures of a compound that can form the covalent bond were identified through analysis of docked conformations of the compound and its metabolites. Inhibitory potencies of the selected structures were then predicted using a previously developed three dimensional quantitative structure-activity relationship. This approach was validated with a large number of structurally diverse OP and carbamate compounds encompassing widely used insecticides and structural analogs including OP flame retardants and thio- and dithiocarbamate pesticides. The modeling revealed that: (1) in addition to classical OP metabolic activation, the toxicity of carbamate compounds can be dependent on biotransformation, (2) OP and carbamate analogs such as OP flame retardants and thiocarbamate herbicides can act as AChEI, (3) hydrogen bonding at the oxyanion hole is critical for AChE inhibition through the covalent bond, and (4) π–π interaction with Trp86

  2. New approach for automatic classification of Alzheimer's disease, mild cognitive impairment and healthy brain magnetic resonance images.

    PubMed

    Lahmiri, Salim; Boukadoum, Mounir

    2014-01-01

    Explored is the utility of modelling brain magnetic resonance images as a fractal object for the classification of healthy brain images against those with Alzheimer's disease (AD) or mild cognitive impairment (MCI). More precisely, fractal multi-scale analysis is used to build feature vectors from the derived Hurst's exponents. These are then classified by support vector machines (SVMs). Three experiments were conducted: in the first the SVM was trained to classify AD against healthy images. In the second experiment, the SVM was trained to classify AD against MCI and, in the third experiment, a multiclass SVM was trained to classify all three types of images. The experimental results, using the 10-fold cross-validation technique, indicate that the SVM achieved 97.08% ± 0.05 correct classification rate, 98.09% ± 0.04 sensitivity and 96.07% ± 0.07 specificity for the classification of healthy against MCI images, thus outperforming recent works found in the literature. For the classification of MCI against AD, the SVM achieved 97.5% ± 0.04 correct classification rate, 100% sensitivity and 94.93% ± 0.08 specificity. The third experiment also showed that the multiclass SVM provided highly accurate classification results. The processing time for a given image was 25 s. These findings suggest that this approach is efficient and may be promising for clinical applications.
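
    The three figures reported above (correct classification rate, sensitivity, specificity) all derive from the counts of a binary confusion matrix. The sketch below computes them for a small made-up set of labels; treating label 1 as the "positive" class is an assumption, since the abstract does not say which class played that role, and `classification_metrics` is a hypothetical helper name.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy, sensitivity and specificity for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),   # share of positives that were detected
        "specificity": tn / (tn + fp),   # share of negatives that were rejected
    }

# Made-up labels for one test fold (not data from the study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

    Under 10-fold cross-validation these quantities would be computed on each held-out fold and then summarized as a mean with a spread, which is the form in which the abstract reports them.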

  3. Automatic approach to solve the morphological galaxy classification problem using the sparse representation technique and dictionary learning

    NASA Astrophysics Data System (ADS)

    Diaz-Hernandez, R.; Ortiz-Esquivel, A.; Peregrina-Barreto, H.; Altamirano-Robles, L.; Gonzalez-Bernal, J.

    2016-06-01

    The observation of celestial objects in the sky is a practice that helps astronomers to understand the way in which the Universe is structured. However, due to the large number of objects observed with modern telescopes, analysing them by hand is a difficult task. An important part of galaxy research is the morphological structure classification based on the Hubble sequence. In this research, we present an approach to solve the morphological galaxy classification problem in an automatic way by using the Sparse Representation technique and dictionary learning with K-SVD. For the tests in this work, we use a database of galaxies extracted from the Principal Galaxy Catalog (PGC) and the APM Equatorial Catalogue of Galaxies, obtaining a total of 2403 useful galaxies. In order to represent each galaxy frame, we propose to calculate a set of 20 features such as Hu's invariant moments, galaxy nucleus eccentricity, Gabor galaxy ratio and some other features commonly used in galaxy classification. A stage of feature relevance analysis was performed using Relief-f in order to determine which are the best parameters for the classification tests using 2, 3, 4, 5, 6 and 7 galaxy classes, making signal vectors of different length values with the most important features. For the classification task, we use a 20-random cross-validation technique to evaluate classification accuracy with all signal sets, achieving a score of 82.27% for 2 galaxy classes and up to 44.27% for 7 galaxy classes.

  4. New approach for automatic classification of Alzheimer's disease, mild cognitive impairment and healthy brain magnetic resonance images

    PubMed Central

    Boukadoum, Mounir

    2014-01-01

    Explored is the utility of modelling brain magnetic resonance images as a fractal object for the classification of healthy brain images against those with Alzheimer's disease (AD) or mild cognitive impairment (MCI). More precisely, fractal multi-scale analysis is used to build feature vectors from the derived Hurst's exponents. These are then classified by support vector machines (SVMs). Three experiments were conducted: in the first the SVM was trained to classify AD against healthy images. In the second experiment, the SVM was trained to classify AD against MCI and, in the third experiment, a multiclass SVM was trained to classify all three types of images. The experimental results, using the 10-fold cross-validation technique, indicate that the SVM achieved 97.08% ± 0.05 correct classification rate, 98.09% ± 0.04 sensitivity and 96.07% ± 0.07 specificity for the classification of healthy against MCI images, thus outperforming recent works found in the literature. For the classification of MCI against AD, the SVM achieved 97.5% ± 0.04 correct classification rate, 100% sensitivity and 94.93% ± 0.08 specificity. The third experiment also showed that the multiclass SVM provided highly accurate classification results. The processing time for a given image was 25 s. These findings suggest that this approach is efficient and may be promising for clinical applications. PMID:26609373

  5. Validation of a novel classification model of psychogenic nonepileptic seizures by video-EEG analysis and a machine learning approach.

    PubMed

    Magaudda, Adriana; Laganà, Angela; Calamuneri, Alessandro; Brizzi, Teresa; Scalera, Cinzia; Beghi, Massimiliano; Cornaggia, Cesare Maria; Di Rosa, Gabriella

    2016-07-01

    The aim of this study was to validate a novel classification for the diagnosis of PNESs. Fifty-five PNES video-EEG recordings were retrospectively analyzed by four epileptologists and one psychiatrist in a blind manner and classified into four distinct groups: Hypermotor (H), Akinetic (A), Focal Motor (FM), and with Subjective Symptoms (SS). Eleven signs and symptoms, which are frequently found in PNESs, were chosen for statistical validation of our classification. An artificial neural network (ANN) analyzed PNES video recordings based on the signs and symptoms mentioned above. By comparing results produced by the ANN with classifications given by examiners, we were able to understand whether such classification was objective and generalizable. Through accordance metrics based on signs and symptoms (range: 0-100%), we found that most of the seizures belonging to class A showed a high degree of accordance (mean±SD=73%±5%); a similar pattern was found for class SS (80%); slightly lower accordance was reported for class H (58%±18%), with a minimum of 30% in some cases. Low agreement arose from the FM group. Seizures were univocally assigned to a given class in 83.6% of cases. The ANN classified PNESs in the same way as visual examination in 86.7%. Agreement between ANN classification and visual classification reached 83.3% (SD=17.8%) accordance for class H, 100% (SD=22%) for class A, 83.3% (SD=21.2%) for class SS, and 50% (SD=19.52%) for class FM. This is the first study in which the validity of a new PNES classification was established and reached in two different ways. Video-EEG evaluation needs to be performed by an experienced clinician, but later on, it may be fed into ANN analysis, whose feedback will provide guidance for differential diagnosis. Our analysis, supported by the ML approach, showed that this model of classification could be objectively performed by video-EEG examination.

  6. Comments on "A modified reachability tree approach to analysis of unbounded Petri nets".

    PubMed

    Ru, Yu; Wu, Weimin; Hadjicostis, Christoforos N

    2006-10-01

    The above paper introduced the construction of a modified reachability tree (MRT) for (unbounded) Petri nets and its application to reachability, liveness, and deadlock analysis. This note shows via a counterexample that some of the MRT properties claimed in the above paper are incorrect.

  7. Buildings classification from airborne LiDAR point clouds through OBIA and ontology driven approach

    NASA Astrophysics Data System (ADS)

    Tomljenovic, Ivan; Belgiu, Mariana; Lampoltshammer, Thomas J.

    2013-04-01

    In recent years, airborne Light Detection and Ranging (LiDAR) data have proved to be a valuable information resource for a vast number of applications ranging from land cover mapping to individual surface feature extraction from complex urban environments. To extract information from LiDAR data, users apply prior knowledge. Unfortunately, there is no consistent initiative for structuring this knowledge into data models that can be shared and reused across different applications and domains. The absence of such models poses great challenges to data interpretation, data fusion and integration as well as information transferability. The intention of this work is to describe the design, development and deployment of an ontology-based system to classify buildings from airborne LiDAR data. The novelty of this approach consists of the development of a domain ontology that specifies explicitly the knowledge used to extract features from airborne LiDAR data. The overall goal of this approach is to investigate the possibility of classifying features of interest from LiDAR data by means of a domain ontology. The proposed workflow is applied to the building extraction process for the region of "Biberach an der Riss" in South Germany. Strip-adjusted and georeferenced airborne LiDAR data is processed based on geometrical and radiometric signatures stored within the point cloud. Region-growing segmentation algorithms are applied and segmented regions are exported to the GeoJSON format. Subsequently, the data is imported into the ontology-based reasoning process used to automatically classify exported features of interest. Based on the ontology it becomes possible to define domain concepts, associated properties and relations. As a consequence, the resulting specific body of knowledge restricts possible interpretation variants. Moreover, ontologies are machine-processable and thus it is possible to run reasoning on top of them. Available reasoners (FACT++, JESS, Pellet) are used to check

  8. A discriminative model-constrained EM approach to 3D MRI brain tissue classification and intensity non-uniformity correction

    NASA Astrophysics Data System (ADS)

    Wels, Michael; Zheng, Yefeng; Huber, Martin; Hornegger, Joachim; Comaniciu, Dorin

    2011-06-01

    We describe a fully automated method for tissue classification, which is the segmentation into cerebral gray matter (GM), cerebral white matter (WM), and cerebrospinal fluid (CSF), and intensity non-uniformity (INU) correction in brain magnetic resonance imaging (MRI) volumes. It combines supervised MRI modality-specific discriminative modeling and unsupervised statistical expectation maximization (EM) segmentation into an integrated Bayesian framework. While both the parametric observation models and the non-parametrically modeled INUs are estimated via EM during segmentation itself, a Markov random field (MRF) prior model regularizes segmentation and parameter estimation. Firstly, the regularization takes into account knowledge about spatial and appearance-related homogeneity of segments in terms of pairwise clique potentials of adjacent voxels. Secondly and more importantly, patient-specific knowledge about the global spatial distribution of brain tissue is incorporated into the segmentation process via unary clique potentials. They are based on a strong discriminative model provided by a probabilistic boosting tree (PBT) for classifying image voxels. It relies on the surrounding context and alignment-based features derived from a probabilistic anatomical atlas. The context considered is encoded by 3D Haar-like features of reduced INU sensitivity. Alignment is carried out fully automatically by means of an affine registration algorithm minimizing cross-correlation. Neither type of feature uses the observed intensities provided by the MRI modality directly; both instead rely on specifically transformed features, which are less sensitive to MRI artifacts. Detailed quantitative evaluations on standard phantom scans and standard real-world data show the accuracy and robustness of the proposed method. They also demonstrate relative superiority in comparison to other state-of-the-art approaches to this kind of computational task: our method achieves average

  9. The suitability of the dual isotope approach (δ13C and δ18O) in tree ring studies

    NASA Astrophysics Data System (ADS)

    Siegwolf, Rolf; Saurer, Matthias

    2016-04-01

    The use of stable isotopes, complementary to tree ring width data, in tree ring research has proven to be a powerful tool for studying the impact of environmental parameters on tree physiology and growth. These three proxies are thus instrumental for climate reconstruction and improve the understanding of the underlying causes of growth changes. In various cases, however, their use suggests implausible interpretations. Often the use of one isotope alone does not allow the detection of such "erroneous isotope responses". A careful analysis of these deviating results shows that either the carbon isotope discrimination concept is no longer valid (Farquhar et al., 1982) or the assumptions of the leaf water enrichment model (Cernusak et al., 2003) are violated, and thus both fractionation models are not applicable. In this presentation we discuss such cases, in which the known fractionation concepts fail and do not allow a correct interpretation of the isotope data. With the help of the dual isotope approach (Scheidegger et al., 2000) we demonstrate how to detect and uncover the causes of such anomalous isotope data. The fractionation concepts and their combinations are briefly explained against the background of CO2 and H2O gas exchange, and the specific use of the dual isotope approach for tree ring data analysis and interpretation is demonstrated. References: Cernusak, L. A., Arthur, D. J., Pate, J. S. and Farquhar, G. D.: Water relations link carbon and oxygen isotope discrimination to phloem sap sugar concentration in Eucalyptus globulus, Plant Physiol., 131, 1544-1554, 2003. Farquhar, G. D., O'Leary, M. H. and Berry, J. A.: On the relationship between carbon isotope discrimination and the intercellular carbon dioxide concentration in leaves, Aust. J. Plant Physiol., 9, 121-137, 1982. Scheidegger, Y., Saurer, M., Bahn, M. and Siegwolf, R.: Linking stable oxygen and carbon isotopes with stomatal conductance and photosynthetic capacity: A conceptual model

  10. Alternative standardization approaches to improving streamflow reconstructions with ring-width indices of riparian trees

    USGS Publications Warehouse

    Meko, David M; Friedman, Jonathan M.; Touchan, Ramzi; Edmondson, Jesse R.; Griffin, Eleanor R.; Scott, Julian A.

    2015-01-01

    Old, multi-aged populations of riparian trees provide an opportunity to improve reconstructions of streamflow. Here, ring widths of 394 plains cottonwood (Populus deltoides ssp. monilifera) trees in the North Unit of Theodore Roosevelt National Park, North Dakota, are used to reconstruct streamflow along the Little Missouri River (LMR), North Dakota, US. Different versions of the cottonwood chronology are developed by (1) age-curve standardization (ACS), using age-stratified samples and a single estimated curve of ring width against estimated ring age, and (2) time-curve standardization (TCS), using a subset of longer ring-width series individually detrended with cubic smoothing splines of width against year. The cottonwood chronologies are combined with the first principal component of four upland conifer chronologies developed by conventional methods to investigate the possible value of riparian tree-ring chronologies for streamflow reconstruction of the LMR. Regression modeling indicates that the statistical signal for flow is stronger in the riparian cottonwood than in the upland chronologies. The flow signal from cottonwood complements rather than repeats the signal from upland conifers and is especially strong in young trees (e.g. 5–35 years). Reconstructions using a combination of cottonwoods and upland conifers are found to explain more than 50% of the variance of LMR flow over a 1935–1990 calibration period and to yield a reconstruction of flow back to 1658. The low-frequency component of reconstructed flow is sensitive to the choice of standardization method for the cottonwood. In contrast to the TCS version, the ACS reconstruction features persistent low flows in the 19th century. Results demonstrate the value of riparian cottonwood for streamflow reconstruction and suggest that more studies are needed to exploit the low-frequency streamflow signal in densely sampled age-stratified stands of riparian trees.

  11. Toward the Improvement of Trail Classification in National Parks Using the Recreation Opportunity Spectrum Approach

    NASA Astrophysics Data System (ADS)

    Oishi, Yoshitaka

    2013-06-01

    Trail settings in national parks are essential management tools for improving both ecological conservation efforts and the quality of visitor experiences. This study proposes a plan for the appropriate maintenance of trails in Chubusangaku National Park, Japan, based on the recreation opportunity spectrum (ROS) approach. First, we distributed 452 questionnaires to determine park visitors' preferences for trail settings (response rate = 68%). Respondents' preferences were then evaluated according to the following seven parameters: access, remoteness, naturalness, facilities and site management, social encounters, visitor impact, and visitor management. Using nonmetric multidimensional scaling and cluster analysis, the visitors were classified into seven groups. Last, we classified the actual trails according to the visitor questionnaire criteria to examine the discrepancy between visitors' preferences and actual trail settings. The actual trail classification indicated that while most developed trails were located in accessible places, primitive trails were located in remote areas. However, interestingly, two visitor groups seemed to prefer a well-conserved natural environment and, simultaneously, easily accessible trails. This finding does not correspond to a premise of the ROS approach, which supposes that primitive trails should be located in remote areas without ready access. Based on this study's results, we propose that creating trails that afford visitors the opportunity to experience a well-conserved natural environment in accessible areas is a useful means of providing visitors with diverse recreation opportunities. The process of data collection and analysis in this study can be one approach to producing ROS maps for providing visitors with recreational opportunities of greater diversity and higher quality.

  12. Toward the improvement of trail classification in national parks using the recreation opportunity spectrum approach.

    PubMed

    Oishi, Yoshitaka

    2013-06-01

    Trail settings in national parks are essential management tools for improving both ecological conservation efforts and the quality of visitor experiences. This study proposes a plan for the appropriate maintenance of trails in Chubusangaku National Park, Japan, based on the recreation opportunity spectrum (ROS) approach. First, we distributed 452 questionnaires to determine park visitors' preferences for trail settings (response rate = 68%). Respondents' preferences were then evaluated according to the following seven parameters: access, remoteness, naturalness, facilities and site management, social encounters, visitor impact, and visitor management. Using nonmetric multidimensional scaling and cluster analysis, the visitors were classified into seven groups. Last, we classified the actual trails according to the visitor questionnaire criteria to examine the discrepancy between visitors' preferences and actual trail settings. The actual trail classification indicated that while most developed trails were located in accessible places, primitive trails were located in remote areas. However, interestingly, two visitor groups seemed to prefer a well-conserved natural environment and, simultaneously, easily accessible trails. This finding does not correspond to a premise of the ROS approach, which supposes that primitive trails should be located in remote areas without ready access. Based on this study's results, we propose that creating trails that afford visitors the opportunity to experience a well-conserved natural environment in accessible areas is a useful means of providing visitors with diverse recreation opportunities. The process of data collection and analysis in this study can be one approach to producing ROS maps for providing visitors with recreational opportunities of greater diversity and higher quality.

  13. Detection of dispersed radio pulses: a machine learning approach to candidate identification and classification

    NASA Astrophysics Data System (ADS)

    Devine, Thomas Ryan; Goseva-Popstojanova, Katerina; McLaughlin, Maura

    2016-06-01

    Searching for extraterrestrial, transient signals in astronomical data sets is an active area of current research. However, machine learning techniques are lacking in the literature concerning single-pulse detection. This paper presents a new, two-stage approach for identifying and classifying dispersed pulse groups (DPGs) in single-pulse search output. The first stage identified DPGs and extracted features to characterize them using a new peak identification algorithm which tracks sloping tendencies around local maxima in plots of signal-to-noise ratio versus dispersion measure. The second stage used supervised machine learning to classify DPGs. We created four benchmark data sets: one unbalanced and three balanced versions using three different imbalance treatments. We empirically evaluated 48 classifiers by training and testing binary and multiclass versions of six machine learning algorithms on each of the four benchmark versions. While each classifier had advantages and disadvantages, all classifiers with imbalance treatments had higher recall values than those with unbalanced data, regardless of the machine learning algorithm used. Based on the benchmarking results, we selected a subset of classifiers to classify the full, unlabelled data set of over 1.5 million DPGs identified in 42 405 observations made by the Green Bank Telescope. Overall, the classifiers using a multiclass ensemble tree learner in combination with two oversampling imbalance treatments were the most efficient; they identified additional known pulsars not in the benchmark data set and provided six potential discoveries, with significantly fewer false positives than the other classifiers.
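
    The abstract does not name its three imbalance treatments, so the following is only a minimal sketch of one common oversampling treatment, random oversampling with replacement, and not necessarily one the authors used. The function name `random_oversample`, the toy feature values and the class labels are all hypothetical.

```python
import random


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until every class
    has as many samples as the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        # Draw the missing samples with replacement from the same class.
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


# Toy, invented data: four "noise" groups and one "pulsar" group.
toy_x = [[1.0], [1.1], [0.9], [1.2], [5.0]]
toy_y = ["noise", "noise", "noise", "noise", "pulsar"]
bal_x, bal_y = random_oversample(toy_x, toy_y)
print(bal_y.count("noise"), bal_y.count("pulsar"))
```

    Oversampling of this kind should be applied to the training split only; duplicating samples before splitting would leak copies of training examples into the test set.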

  14. Contrasting regional and national mechanisms for predicting elevated arsenic in private wells across the United States using classification and regression trees.

    PubMed

    Frederick, Logan; VanDerslice, James; Taddie, Marissa; Malecki, Kristen; Gregg, Josh; Faust, Nicholas; Johnson, William P

    2016-03-15

    Arsenic contamination in groundwater is a public health and environmental concern in the United States (U.S.), particularly where monitoring is not required under the Safe Drinking Water Act. Previous studies suggest the influence of regional mechanisms for arsenic mobilization into groundwater; however, no study has examined how influencing parameters change at a continental scale spanning multiple regions. We herein examine covariates for groundwater in the western, central and eastern U.S. regions representing mechanisms associated with arsenic concentrations exceeding the U.S. Environmental Protection Agency maximum contaminant level (MCL) of 10 parts per billion (ppb). Statistically significant covariates were identified via classification and regression tree (CART) analysis, and included hydrometeorological and groundwater chemical parameters. The CART analyses were performed at two scales, national and regional; for the latter, three physiographic regions located in the western (Payette Section and the Snake River Plain), central (Osage Plains of the Central Lowlands), and eastern (Embayed Section of the Coastal Plains) U.S. were examined. Validity of each of the three regional CART models was indicated by values >85% for the area under the receiver-operating characteristic curve. Aridity (precipitation minus potential evapotranspiration) was identified as the primary covariate associated with elevated arsenic at the national scale. At the regional scale, aridity and pH were the major covariates in the arid to semi-arid (western) region; whereas dissolved iron (taken to represent chemically reducing conditions) and pH were major covariates in the temperate (eastern) region, although additional important covariates emerged, including elevated phosphate. Analysis in the central U.S. region indicated that elevated arsenic concentrations were driven by a mixture of those observed in the western and eastern regions.
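
    To illustrate how a CART-style analysis picks a covariate threshold, here is a minimal sketch of a single-variable split search using Gini impurity. The abstract does not state which splitting criterion was used, so Gini is an assumption, and the pH values and exceedance labels below are invented toy data, not results from the study.

```python
def gini(labels):
    """Gini impurity of a list of binary (0/1) labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def best_split(values, labels):
    """Return (threshold, weighted_impurity) of the best binary split
    of one numeric covariate; threshold is None if no split helps."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))
    for k in range(1, n):
        if pairs[k - 1][0] == pairs[k][0]:
            continue  # no threshold can separate identical values
        left = [y for _, y in pairs[:k]]
        right = [y for _, y in pairs[k:]]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / n
        if impurity < best[1]:
            midpoint = (pairs[k - 1][0] + pairs[k][0]) / 2.0
            best = (midpoint, impurity)
    return best


# Hypothetical toy wells: pH and whether arsenic exceeds 10 ppb (1 = yes).
toy_ph = [6.5, 6.8, 7.0, 7.9, 8.2, 8.5]
toy_exceeds = [0, 0, 0, 1, 1, 1]
threshold, impurity = best_split(toy_ph, toy_exceeds)
print(threshold, impurity)
```

    A full tree repeats this search over every covariate, splits on the best one, and recurses on each side; the primary covariate reported at each scale corresponds to the split chosen at the root.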

  15. Metastatic ovarian carcinoma to the brain: an approach to identification and classification for neuropathologists.

    PubMed

    Nafisi, Houman; Cesari, Matthew; Karamchandani, Jason; Balasubramaniam, Gayathiri; Keith, Julia Lee

    2015-04-01

    Brain metastasis is an uncommon but increasing manifestation of ovarian epithelial carcinoma and neuropathologists' collective experience with these tumors is limited. We present clinicopathological characteristics of 13 cases of brain metastases from ovarian epithelial carcinoma diagnosed at two academic institutions. The mean ages at diagnosis of the ovarian carcinoma and their subsequent brain metastases were 58.7 and 62.8 years, respectively. At the time of initial diagnosis of ovarian carcinoma the majority of patients had an advanced stage and none had brain metastases as their first manifestation of malignancy. Brain metastases tended to be multiple with ring-enhancing features on neuroimaging. Primary tumors and their brain metastases were all high-grade histologically and the histologic subtypes were: nine high-grade serous carcinoma (HGSC) cases, two clear cell carcinoma (CCC) cases and a single case each of carcinosarcoma and high-grade adenocarcinoma. A recommended histo- and immunopathological approach to these tumours is provided to aid neuropathologists in the recognition and classification of metastatic ovarian carcinoma to the brain.

  16. A robust automatic birdsong phrase classification: A template-based approach.

    PubMed

    Kaewtip, Kantapon; Alwan, Abeer; O'Reilly, Colm; Taylor, Charles E

    2016-11-01

    Automatic phrase detection systems for bird sounds are useful in several applications as they reduce the need for manual annotations. However, bird phrase detection is challenging due to limited training data and background noise. Limited data occur because of limited recordings or the existence of rare phrases. Background noise interference occurs because of the intrinsic nature of the recording environment, such as wind or other animals. This paper presents a different approach to birdsong phrase classification using template-based techniques suitable even for limited training data and noisy environments. The algorithm utilizes dynamic time-warping (DTW) and prominent (high-energy) time-frequency regions of training spectrograms to derive templates. The performance of the proposed algorithm is compared with the traditional DTW and hidden Markov model (HMM) methods under several training and test conditions. DTW works well when the data are limited, while HMMs do better when more data are available, yet they both suffer when the background noise is severe. The proposed algorithm outperforms DTW and HMMs in most training and testing conditions, usually by a wide margin when the background noise level is high. The innovation of this work is that the proposed algorithm is robust to both limited training data and background noise.
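
    As background for the DTW baseline mentioned above, the following is a minimal sketch of classic dynamic time warping on 1-D sequences with nearest-template classification. It is the textbook algorithm, not the authors' template-derivation method, which works on spectrogram regions; the template names and toy sequences are hypothetical.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two 1-D sequences,
    using absolute difference as the local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a advances alone
                                 cost[i][j - 1],      # b advances alone
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]


def classify(query, templates):
    """Return the label of the template with the smallest DTW distance."""
    return min(templates, key=lambda label: dtw_distance(query, templates[label]))


# Toy templates standing in for per-phrase feature tracks.
templates = {"rising": [0, 1, 2, 3], "falling": [3, 2, 1, 0]}
query = [0, 0, 1, 2, 2, 3]  # a time-stretched "rising" contour
print(classify(query, templates))
```

    Because DTW lets either sequence pause while the other advances, the stretched query aligns with the "rising" template at zero cost even though the two have different lengths.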

  17. Sonoelastomics for Breast Tumor Classification: A Radiomics Approach with Clustering-Based Feature Selection on Sonoelastography.

    PubMed

    Zhang, Qi; Xiao, Yang; Suo, Jingfeng; Shi, Jun; Yu, Jinhua; Guo, Yi; Wang, Yuanyuan; Zheng, Hairong

    2017-02-20

    A radiomics approach to sonoelastography, called "sonoelastomics," is proposed for classification of benign and malignant breast tumors. From sonoelastograms of breast tumors, a high-throughput 364-dimensional feature set was calculated consisting of shape features, intensity statistics, gray-level co-occurrence matrix texture features and contourlet texture features, which quantified the shape, hardness and hardness heterogeneity of a tumor. The high-throughput features were then reduced by feature selection using hierarchical clustering and three feature selection metrics. For a data set containing 42 malignant and 75 benign tumors from 117 patients, seven selected sonoelastomic features achieved an area under the receiver operating characteristic curve of 0.917, an accuracy of 88.0%, a sensitivity of 85.7% and a specificity of 89.3% in a validation set via leave-one-out cross-validation, revealing superiority over principal component analysis, deep polynomial networks and manually selected features. The sonoelastomic features are valuable in breast tumor differentiation.
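
    The reported sensitivity, specificity and accuracy can be related to confusion-matrix counts with a short calculation. The abstract gives only the class sizes (42 malignant, 75 benign); the counts of 36 and 67 correctly classified tumors used below are an assumption inferred from the reported percentages, not figures stated by the authors.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)            # true positive rate
    specificity = tn / (tn + fp)            # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Assumed counts: 36 of 42 malignant and 67 of 75 benign classified correctly.
sens, spec, acc = binary_metrics(tp=36, fn=6, tn=67, fp=8)
print(f"{sens:.1%} {spec:.1%} {acc:.1%}")  # prints "85.7% 89.3% 88.0%"
```

    Under these assumed counts the three figures are mutually consistent: 36/42, 67/75 and 103/117 round to the sensitivity, specificity and accuracy quoted in the abstract.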

  18. Building and Solving Odd-One-Out Classification Problems: A Systematic Approach

    ERIC Educational Resources Information Center

    Ruiz, Philippe E.

    2011-01-01

    Classification problems ("find the odd-one-out") are frequently used as tests of inductive reasoning to evaluate human or animal intelligence. This paper introduces a systematic method for building the set of all possible classification problems, followed by a simple algorithm for solving the problems of the R-ASCM, a psychometric test derived…

  19. A Philosophical Approach to Describing Science Content: An Example From Geologic Classification.

    ERIC Educational Resources Information Center

    Finley, Fred N.

    1981-01-01

    Examines how research of philosophers of science may be useful to science education researchers and curriculum developers in the development of descriptions of science content related to classification schemes. Provides examples of concept analysis of two igneous rock classification schemes. (DS)

  20. Categorizing ideas about trees: a tree of trees.

    PubMed

    Fisler, Marie; Lecointre, Guillaume

    2013-01-01

    The aim of this study is to explore whether matrices and MP trees used to produce systematic categories of organisms could be useful to produce categories of ideas in the history of science. We study the history of the use of trees in systematics to represent the diversity of life from 1766 to 1991. We apply to those ideas a method inspired by the coding of homologous parts of organisms. We discretize conceptual parts of ideas, writings and drawings about trees contained in 41 main writings; we detect shared parts among authors, code them into a 91-character matrix and use a tree representation to show who shares what with whom. In other words, we propose a hierarchical representation of the shared ideas about trees among authors: this produces a "tree of trees." Then, we categorize schools of tree-representations. Classical schools like "cladists" and "pheneticists" are recovered but others are not: "gradists" are separated into two blocks, one of them being called here "grade theoreticians." We propose new interesting categories like the "buffonian school," the "metaphoricians," and those using "strictly genealogical classifications." We consider that networks are not useful to represent shared ideas at the present step of the study. A cladogram is made to show who is sharing what with whom, but also heterobathmy and homoplasy of characters. The present cladogram does not model processes of transmission of ideas about trees, and here it is mostly used to test for proximity of ideas of the same age and for categorization.